Linking Scottish vital event records using family groups

Abstract

The reconstitution of populations through linkage of historical records is a powerful approach to generate longitudinal historical microdata resources of interest to researchers in various fields. Here we consider automated linking of the vital events recorded in the civil registers of birth, death and marriage compiled in Scotland, to bring together the various records associated with the demographic events in the life course of each individual in the population. From the histories, the genealogical structure of the population can then be built up. Rather than apply standard linkage techniques to link the individuals on the available certificates, we explore an alternative approach, inspired by the family reconstitution techniques adopted by historical demographers, in which the births of siblings are first linked to form family groups, after which intergenerational links between families can be established. We report a small-scale evaluation of this approach, using two district-level data sets from Scotland in the late nineteenth century, for which sibling links have already been created by demographers. We show that quality measures of up to 83% can be achieved on these data sets (using F-Measure, a combination of precision and recall). In the future, we intend to compare the results with a standard linkage approach and to investigate how these various methods may be used in a project which aims to link the entire Scottish population from 1856 to 1973.
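As an illustration of the quality measure mentioned above, the following minimal sketch computes precision, recall and F-measure for a set of predicted sibling links against a reference set. It assumes the balanced form of the F-measure (the harmonic mean of precision and recall); the function name, the record identifiers and the example links are hypothetical and are not taken from the article's data sets.

```python
def link_quality(true_links, predicted_links):
    """Return (precision, recall, f_measure) for two collections of record pairs.

    Each link is a pair of record identifiers. Pairs are treated as
    unordered, so ("b1", "b2") and ("b2", "b1") count as the same link.
    Assumption: the balanced F-measure (harmonic mean) is used.
    """
    truth = {tuple(sorted(pair)) for pair in true_links}
    predicted = {tuple(sorted(pair)) for pair in predicted_links}

    true_positives = len(truth & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical reference sibling links (e.g. created by demographers).
reference = [("b1", "b2"), ("b1", "b3"), ("b2", "b3"), ("b4", "b5")]
# Hypothetical links produced by an automated linker.
automated = [("b2", "b1"), ("b2", "b3"), ("b4", "b6")]

precision, recall, f_measure = link_quality(reference, automated)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```

In this toy example two of the three predicted links are correct and two of the four reference links are found, so the F-measure lies between the precision and the recall, closer to the lower of the two.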

Acknowledgements

We wish to thank Alice Reid of the Department of Geography, University of Cambridge and her colleagues, especially Ros Davies, for the work undertaken on the Kilmarnock and Isle of Skye databases.

Funding

This work was supported by ESRC Grants ES/K00574X/2 “Digitising Scotland” and ES/L007487/1 “Administrative Data Research Centre – Scotland.”

Correction Statement

This article has been republished with minor changes. These changes do not impact the academic content of the article.

Notes

2 The Scottish civil registers are of very high quality, with many more fields than are typically available in historic sources.

3 Forenames are also known as first or given names.

4 Preliminary evaluation of the transcription quality of the Scottish records, after processing of around 13% of the records, indicates a per-character transcription accuracy rate of 99.5% based on a quality assurance sample of up to 3% of the records. The original records may also contain other errors of various types.

5 In some cases multiple spouses may be listed on a death record.

6 In separate experiments we have explored using date-aware distance functions for comparing dates, and phoneticising names before comparison, without observing any significant improvements in linkage quality.

7 There would be around 10^14 record pairs to compare. If a pair could be compared in one microsecond, for example, the overall process would take around three years.
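The estimate in note 7 can be checked with a short calculation, taking the number of record pairs as 10^14 (ten to the power fourteen) and the cost of one comparison as one microsecond. The function and constant names below are illustrative only, and a year is assumed to be 365.25 days.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: average year length


def comparison_time_years(num_pairs, seconds_per_pair):
    """Total time, in years, to compare every record pair one after another."""
    return num_pairs * seconds_per_pair / SECONDS_PER_YEAR


# Figures from note 7: about 10^14 pairs at one microsecond per comparison.
years = comparison_time_years(num_pairs=1e14, seconds_per_pair=1e-6)
print(f"{years:.2f} years")
```

The result is a little over three years of sequential computation, which agrees with the figure given in the note.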
