
On the use of register data in educational science research

Pages 106-118 | Received 15 Aug 2016, Accepted 25 Mar 2017, Published online: 24 Apr 2017
 

ABSTRACT

Register data are described, in general and specific terms, with a focus on their informational content from an educational science perspective. Arguments are provided on the ways in which educational scientists can benefit from register data. It is concluded that register data contain a great deal of information relevant to educational science. Furthermore, two specific features of register data are considered: their panel data nature, which implies that register data analyses can, under certain conditions, account for aspects on which the registers are not informative, and the intergenerational links they contain, which facilitate the separation of genetic and environmental influences on learning. It is observed that while register data do not contain direct links between students and teachers, this shortcoming can be overcome by merging register data with survey data on these links. As population data, register data enable analyses that are not feasible with survey data. An illustration is provided of how quantitative and qualitative researchers can benefit from combining register-based statistical analyses with in-depth case studies. The use of register data in evaluations of the causal effects of educational interventions is also described. To help researchers exploit these advantages, a discussion on how to access register data is included.

Acknowledgements

I am grateful to Lena Tibell for encouraging me to teach on this topic; my lecture notes were the main input for the article. The helpful comments from Mary James, Caroline Hall, and Sara Martinson are also much appreciated.

Disclosure statement

No potential conflict of interest was reported by the author.

Notes

1. In some non-Nordic countries, register data exist on specific issues. For example, in England there are data for all students who attend state schools concerning their results in national achievement tests (Crawford, Dearden, & Meghir, 2010). However, unlike in the Nordic countries, this register cannot be linked to other registers. For instance, it cannot be linked to information on the students’ parents to enable analyses of the possible influences of family background on the test results.

2. See http://ki.se/en/research/the-swedish-twin-registry, accessed February 2017. For analyses of the return to education utilising this register, see Isacsson (1999, 2004).

3. For a study that uses this register to analyse the relation between performance in primary school and psychosocial problems in young adulthood among individuals in foster care, see Berlin et al. (2011).

4. Cedefop, the European Centre for the Development of Vocational Training, is a decentralised agency of the European Union (EU). It was founded in 1975 and is based in Thessaloniki, Greece. Cedefop supports the development of European vocational education and training policies and contributes to their implementation.

5. The AES is part of the EU Statistics on lifelong learning and covers people in the age range of 25–64 years. Representative samples of individuals are randomly selected for interviews in each country. The first and second waves of the AES were conducted in 2007 and 2011, respectively. Denmark, Finland, Norway, and Sweden participated on both occasions. In 2011, the samples of the Nordic countries comprised 5000–6000 people, and the response rates were 55–65%. The third wave takes place in 2016 and 2017. See http://ec.europa.eu/eurostat/web/microdata/adult-education-survey, accessed February 2017.

6. The PIAAC is conducted by the Organisation for Economic Co-operation and Development (OECD) and targets individuals aged 16–65. The PIAAC’s primary purpose is to assess skills in literacy, numeracy, and problem solving by means of information and communication technology, but extensive information is also collected on education and training. The first wave of the PIAAC was conducted in 2011–2012 (23 countries) and 2014–2015 (nine countries). Denmark, Finland, Norway, and Sweden participated in 2011–2012; the number of respondents varied between approximately 4500 (for Sweden) and approximately 7300 (for Denmark). For further information, see OECD (2013, 2016).

7. For simplicity, other important aspects, such as family background, are disregarded here.

8. To fully control for genetic factors with this approach, one must consider a very specific group of siblings, namely, identical twins.

9. Survey data may also include information that is classified as ‘sensitive’ in the Danish, Icelandic, Norwegian, and Swedish Personal Data Acts. Besides health information, data on political, religious, and sexual disposition are classified as sensitive. Any inclusion of these types of data requires an ethics review.

10. The country’s central ethics committee can always advise whether an ethics review is required. The web addresses of the central committees are: Denmark: www.cvk.sum.dk; Finland: www.tenk.fi; Iceland: www.vsn.is; Norway: www.etikkom.no; and Sweden: www.epn.se (accessed February 2017).

11. The national statistical agencies are Statistics Denmark, Statistics Finland, Statistics Iceland, Statistics Norway, and Statistics Sweden.

12. Information regarding which institutions to contact can be obtained from the national statistical agencies.

13. Backward identification is illegal; researchers handling pseudonymised data are often required to acknowledge, in writing, that they are aware of this restriction.

14. Presumably, this difference exists because, unlike ethics review committees, data-administering organisations consider only the risks of violating personal integrity, not the project’s social benefits.

15. At the time of the writing of this article, remote access was not used in Norway; instead, the researchers were provided with copies of the data, stored locally.

16. Of course, educational science is not the only discipline forgoing the opportunities provided by multi-methodological approaches.

17. For example, a logistic regression can be applied (Hosmer & Lemeshow, 2000). In such a regression, the dependent variable is binary: 1 if the student chooses a science track in upper secondary school and 0 otherwise. The explanatory variables are binary variables that indicate gender and school, respectively, and the products of these two variables, plus non-binary variables representing family background and scholastic achievements.

18. This measure is given by the estimated parameter for the binary gender variable, coded 1 for girls and 0 for boys.

19. This information is provided through the estimated parameters for the gender × school variables.

20. Another complication arises if the students can choose which school to attend. This results in a selection problem, similar to the selection problem discussed in connection with the qualitative approach. The outlined analysis then needs to be preceded by an analysis of school choice.

21. Since 1997, the starting age has been 6 years in Norway (e.g. Mellander & Fremming Anderssen, 2015).

22. For similar analyses of Norway and England, see Black, Devereux, and Salvanes (2011) and Crawford et al. (2010), respectively.

23. In English: Natural Sciences and Technology for All. For a description of this programme, see Mellander and Svärdh (2015, 2017).

24. Randomised experiments are also denoted randomised control trials (RCTs).
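
The logistic regression outlined in Notes 17 to 19 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the article: the data are simulated, there are only two schools (A as the reference and B), the coefficient values in TRUE_BETA are invented, the function names are hypothetical, and the model is fitted by plain gradient ascent on the log-likelihood instead of a statistical package. Family background and scholastic achievement are assumed to be scaled to the interval from -1 to 1.

```python
import math
import random


def design_row(girl, school_b, family, achievement):
    """Intercept, gender, school, gender x school, then the two controls."""
    return [1.0, float(girl), float(school_b), float(girl * school_b),
            family, achievement]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood of the logit model."""
    total = 0.0
    for row, yi in zip(X, y):
        z = sum(b * x for b, x in zip(beta, row))
        # log(1 + exp(z)) computed without overflow
        total += yi * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total


def fit_logit(X, y, step=0.5, iterations=500):
    """Maximise the average log-likelihood by fixed-step gradient ascent."""
    n, k = len(X), len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        for row, yi in zip(X, y):
            resid = yi - sigmoid(sum(b * x for b, x in zip(beta, row)))
            for j in range(k):
                grad[j] += resid * row[j]
        beta = [b + step * g / n for b, g in zip(beta, grad)]
    return beta


# Simulated students; TRUE_BETA holds invented values, not estimates
# reported in the article.
random.seed(0)
TRUE_BETA = [-0.5, -0.6, 0.3, 0.8, 0.5, 1.0]
X, y = [], []
for _ in range(300):
    row = design_row(random.randint(0, 1), random.randint(0, 1),
                     random.uniform(-1, 1), random.uniform(-1, 1))
    p = sigmoid(sum(b * x for b, x in zip(TRUE_BETA, row)))
    X.append(row)
    y.append(1 if random.random() < p else 0)  # 1 = chooses a science track

beta_hat = fit_logit(X, y)

# Note 18: gender parameter (girls = 1) in the reference school A.
gap_school_a = beta_hat[1]
# Note 19: the gender x school parameter shifts that gap for school B.
gap_school_b = beta_hat[1] + beta_hat[3]

print("gender gap, school A (log-odds):", round(gap_school_a, 3))
print("gender gap, school B (log-odds):", round(gap_school_b, 3))
```

With more than two schools, the sketch would need one binary school variable and one gender × school product per non-reference school. In applied work, a standard routine that also reports standard errors would normally replace the hand-written fitting loop.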