
Exploring Insights from Millions of Historical Records Presented at FreeUKGenealogy Conference 2019
Delve into the possibilities of analyzing vast historical records, as discussed in Dr. Oliver Duke-Williams' paper at the FreeUKGenealogy Conference 2019. Discover use cases, sample data, and methodologies for constructing new tables from these records, providing valuable insights into multi-generational households and more.
Presentation Transcript
What can we do with millions of records? Paper presented at: FreeReg and FreeCen @ Twenty: FreeUKGenealogy Conference, 2019, King's Manor, University of York, 29/9/19. Dr Oliver Duke-Williams, o.duke-williams@ucl.ac.uk, @oliver_dw, www.ucl.ac.uk/dis/people/oliverdukewilliams
Typical FreeCen use case Source: freecen.org.uk
Finding more information about Minnie Source: freebmd.org.uk
FreeCen: Norfolk sample. Sample of raw data from 1861: c. 40,000 records in sample; 423,000 records in county (100% transcription). Source: freecen.org.uk
County-of-birth of sample members: Generally, county coded as Registration County. Yorkshire as single county. General references to "England", "Wales", "Scotland". Source: freecen.org.uk
Constructing new tables: It is possible to produce tables that were never published at the time, for example on multi-generational households. Link generation coding to relation to head of household; identify the earliest and latest generations per household; aggregate over other constraints.
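The three steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `GENERATION` mapping, the relation abbreviations it covers, and the `SAMPLE` households are hypothetical and not taken from the FreeCEN data or from the author's method. The number of generations is taken as the span between the earliest and latest generation found in a household.

```python
from collections import Counter, defaultdict

# Hypothetical generation offsets relative to the head of household (assumption).
GENERATION = {
    "Head": 0, "Wife": 0, "Husband": 0, "Brother": 0, "Sister": 0,
    "Son": 1, "Dau": 1,
    "Grnson": 2, "Grndau": 2,
    "Father": -1, "Mother": -1,
}

def generations_per_household(records):
    """Map household id -> span from earliest to latest generation (inclusive)."""
    offsets = defaultdict(list)
    for household_id, relation in records:
        offset = GENERATION.get(relation)
        if offset is not None:  # boarders, servants etc. carry no generation code
            offsets[household_id].append(offset)
    return {h: max(o) - min(o) + 1 for h, o in offsets.items()}

def tabulate(records):
    """Aggregate: number of households observed for each generation count."""
    return Counter(generations_per_household(records).values())

# Hypothetical sample: (household id, relation to head).
SAMPLE = [
    (1, "Head"), (1, "Wife"),
    (2, "Head"), (2, "Wife"), (2, "Son"), (2, "Dau"),
    (3, "Head"), (3, "Mother"), (3, "Dau"), (3, "Boardr"),
    (4, "Head"), (4, "Dau"), (4, "Grnson"),
]
table = tabulate(SAMPLE)
```

Because the count is a span, a household holding only a head and a grandchild would be counted as three generations; counting distinct offsets instead would give two. Which definition the original tables used is not stated on the slide.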
Multi-generational households
Number of generations in household    Observed cases
1                                     2,381
2                                     5,114
3                                     740
4                                     17
Source: freecen.org.uk
Cornwall in FreeCEN
Census        1841      1851      1861      1871      1881      1891
Records       340,901   354,744   362,111   358,043   326,187   318,637
% complete    100       100       100       100       100       100
Cornwall is the only county that is 100% coded for the complete FreeCEN run.
Source: freecen.org.uk
Popularity of the name Florence. [Figure: number of Florences in Cornwall, 1891 (0 to 140), by approximate year of birth (1800 to 1900). Annotation: Longfellow (1857), "Santa Filomena", The Atlantic Monthly 1(1): "Lo! in that house of misery / A lady with a lamp I see".] Source: freecen.org.uk
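A chart like this depends on deriving an approximate year of birth from the census year and the recorded age. The sketch below shows one way to do that and to count by decade; the function name, the record layout, and the `SAMPLE_1891` rows are hypothetical and not drawn from FreeCEN.

```python
from collections import Counter

def births_by_decade(records, census_year, forename):
    """Count people with the given first forename by approximate decade of birth."""
    counts = Counter()
    for name, age in records:
        parts = name.split()
        if parts and parts[0].lower() == forename.lower():
            # Approximate only: the census records age in whole years.
            approx_birth_year = census_year - age
            counts[approx_birth_year // 10 * 10] += 1
    return counts

# Hypothetical (forenames, age) rows from an 1891 census transcription.
SAMPLE_1891 = [
    ("Florence", 5), ("Florence M", 12), ("Mary", 30),
    ("florence", 33), ("Florence", 8),
]
florences = births_by_decade(SAMPLE_1891, 1891, "Florence")
```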
Constructing longitudinal data: Longitudinal data sets are constructed from modern censuses (ONS Longitudinal Study; Scottish Longitudinal Study; Northern Ireland Longitudinal Study). These rely on name and date of birth for linkage. Date of birth was first asked in the 1971 Census. Can we attempt to build longitudinal data from FreeCen data? Can we find the same person in two or more censuses?
Matching people by name Source: freecen.org.uk
How unique are names? [Figure: cumulative % of people accounted for (0% to 100%) against number with matching name (0 to 500); series 1841F, 1851F, 1861F, 1871F, 1881F, 1891F. Annotation: one person in group means the name is unique in the data set.] Source: freecen.org.uk
[Figure: cumulative % of people accounted for (0% to 100%) against number with matching name (0 to 500); series 1841M, 1851M, 1861M, 1871M, 1881M, 1891M. Annotation at 90% / 100: 90% of people in the sample are accounted for by names with 100 or fewer persons sharing that name.] Source: freecen.org.uk
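The quantity plotted in these two charts can be computed by grouping people by name and asking what share of all people belong to groups no larger than a given size. A minimal sketch, assuming names are already reduced to a single string per person; `cumulative_share` and the `NAMES` sample are hypothetical.

```python
from collections import Counter

def cumulative_share(names, max_group_size):
    """Percent of people whose name is shared by at most max_group_size people."""
    group_sizes = Counter(names)  # name -> number of people bearing it
    total = sum(group_sizes.values())
    if total == 0:
        return 0.0
    covered = sum(n for n in group_sizes.values() if n <= max_group_size)
    return 100.0 * covered / total

# Hypothetical sample: three people share one name, two share another, one is unique.
NAMES = ["John Smith"] * 3 + ["Mary Rowe"] * 2 + ["Jabez Trezise"]
share_unique = cumulative_share(NAMES, 1)   # people whose name is unique
share_up_to_2 = cumulative_share(NAMES, 2)
share_up_to_3 = cumulative_share(NAMES, 3)
```

Evaluating `cumulative_share` for group sizes 1 to 500 gives one curve of the kind shown, and the value at group size 1 is the share of people with a unique name.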
Increasing the chance of people being unique: We can increase our chances of finding unique people by looking at more characteristics than just name, such as age and place of birth. We might increase the chance of having a match by ignoring initials, second names, etc.
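Ignoring initials and second names amounts to normalising the forename field before comparison. One possible normalisation is sketched below; the helper names and the choice to upper-case surnames are assumptions made for illustration.

```python
def first_forename(forenames):
    """Return the first forename only, dropping initials and second names."""
    parts = forenames.replace(".", " ").split()
    return parts[0].capitalize() if parts else ""

def name_key(surname, forenames):
    """Build a comparison key from a surname and the first forename."""
    return (surname.strip().upper(), first_forename(forenames))

key_a = name_key("Allan", "Mary A")
key_b = name_key("ALLAN ", "mary ann")
```

Both keys come out the same, so "Mary A" and "mary ann" would be treated as one candidate name. The trade-off is that two different people who differ only in a second name are merged as well.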
Uniqueness: surname, first forename, birthplace. [Figure: cumulative % of people accounted for (50% to 100%) against number of people with same characteristics (0 to 50); series 1851, 1861, 1871, 1881, 1891.] Source: freecen.org.uk
How many people are unique?
Characteristics used                             1851    1891
Surname, forename (male)                         25%     45%
Surname, forename (female)                       33%     52%
Surname, forename, birth place (persons)         64%     76%
Surname, forename, age (persons)                 67%     84%
Surname, forename, age, birth place (persons)    98%     99%
Source: freecen.org.uk
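Percentages of this kind can be produced by counting how many people share each combination of characteristics and reporting the share whose combination occurs exactly once. The sketch below assumes records held as dictionaries; the field names and the four `PEOPLE` rows are hypothetical.

```python
from collections import Counter

def percent_unique(records, fields):
    """Percent of records whose combination of the given fields occurs once."""
    keys = [tuple(r[f] for f in fields) for r in records]
    if not keys:
        return 0.0
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return 100.0 * unique / len(keys)

# Hypothetical records.
PEOPLE = [
    {"surname": "Knight", "forename": "William", "age": 16, "birthplace": "Truro"},
    {"surname": "Knight", "forename": "William", "age": 36, "birthplace": "Truro"},
    {"surname": "Knight", "forename": "William", "age": 36, "birthplace": "Bodmin"},
    {"surname": "Allan", "forename": "Mary", "age": 69, "birthplace": "Truro"},
]
by_name = percent_unique(PEOPLE, ["surname", "forename"])
by_name_place = percent_unique(PEOPLE, ["surname", "forename", "birthplace"])
by_name_age = percent_unique(PEOPLE, ["surname", "forename", "age"])
by_all = percent_unique(PEOPLE, ["surname", "forename", "age", "birthplace"])
```

In this toy sample the share of unique people rises as characteristics are added, the same direction of effect as in the table.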
A very high proportion of people are unique on name, age (years) and birth place. This bodes well for the potential to match across years. As we move from Cornwall to the UK, uniqueness would drop.
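Matching across years could then be attempted by looking, in a later census, for exactly one person with the same name and birth place whose age has advanced by the interval between censuses. The sketch below is a deliberately simple exact-match linker, not the method used in the paper; the field names, the one-year age tolerance, and the sample rows are all assumptions.

```python
def link_censuses(earlier, later, years_apart=10, age_tolerance=1):
    """Link each earlier record to a later one only when exactly one candidate fits.

    Returns {index in earlier: index in later}.
    """
    links = {}
    for i, a in enumerate(earlier):
        key = (a["surname"], a["forename"], a["birthplace"])
        expected_age = a["age"] + years_apart
        candidates = [
            j for j, b in enumerate(later)
            if (b["surname"], b["forename"], b["birthplace"]) == key
            and abs(b["age"] - expected_age) <= age_tolerance
        ]
        if len(candidates) == 1:  # ambiguous or missing matches are left unlinked
            links[i] = candidates[0]
    return links

def person(surname, forename, age, birthplace):
    return {"surname": surname, "forename": forename, "age": age, "birthplace": birthplace}

# Hypothetical rows for two censuses ten years apart.
CENSUS_1881 = [
    person("Knight", "William", 36, "Truro"),
    person("Allan", "Mary", 59, "Bodmin"),
    person("Waite", "Elizabeth", 38, "Truro"),
    person("Rowe", "John", 20, "St Ives"),
]
CENSUS_1891 = [
    person("Knight", "William", 46, "Truro"),
    person("Allan", "Mary", 70, "Bodmin"),
    person("Rowe", "John", 30, "St Ives"),
    person("Rowe", "John", 31, "St Ives"),
]
links = link_censuses(CENSUS_1881, CENSUS_1891)
```

The sketch does not check that each later record is claimed by only one earlier record, and it makes no allowance for people who changed surname, for example on marriage.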
But: just because people are unique within a census does not mean they will match between censuses. Reasons include: error in completing the census; variations in spelling or presentation of name; variation in spelling of birth place; errors introduced by transcribers.
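Spelling variation is one place where exact comparison fails and an approximate comparison may help. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure; it is not a technique named in the presentation, and the 0.6 threshold and example spellings are arbitrary assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def best_match(name, candidates, threshold=0.6):
    """Return the most similar candidate, or None if none reaches the threshold."""
    scored = [(similarity(name, c), c) for c in candidates]
    score, candidate = max(scored, default=(0.0, None))
    return candidate if score >= threshold else None

# Hypothetical variant spellings of a surname.
match = best_match("Trezise", ["Tresize", "Knight", "Allan"])
no_match = best_match("Trezise", ["Knight", "Allan"])
```

A looser threshold finds more variants but also more false matches, so in practice it would need tuning against records whose identity is already known.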
Error in completion: failing to follow instructions. [Figure: two population pyramids, Persons (1841) and Persons (1851), showing male and female counts by age in years (0 to 100).] Source: freecen.org.uk
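The slide does not say which instruction is meant. One plausible reading, treated here as an assumption, is the 1841 instruction to round the ages of people aged 15 and over down to a multiple of five, which was not applied consistently. A simple check is the share of adult ages that are multiples of five: close to 100% where rounding was applied, and nearer 20% where exact ages were written. The function name and both age lists are hypothetical.

```python
def share_multiple_of_five(ages, min_age=15):
    """Percent of ages at or above min_age that are exact multiples of five."""
    adults = [a for a in ages if a >= min_age]
    if not adults:
        return 0.0
    rounded = sum(1 for a in adults if a % 5 == 0)
    return 100.0 * rounded / len(adults)

# Hypothetical age lists for two enumeration districts; children are excluded by min_age.
ROUNDED_DISTRICT = [3, 7, 15, 20, 20, 35, 40, 60]
EXACT_DISTRICT = [3, 7, 16, 20, 23, 37, 41, 60, 44, 52, 19, 30]

share_rounded = share_multiple_of_five(ROUNDED_DISTRICT)
share_exact = share_multiple_of_five(EXACT_DISTRICT)
```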
Transcription errors: marital age differences, Cornwall, 1881. [Figure: number of observations (0 to 30,000) against age difference in years, ranging from 80 (wife older) to 80 (husband older).]
Transcription errors: extreme marital age differences
Surname   Forename(s)   -   Relation   Marital status   Sex   Age
TREZISE   William       -   Head       M                M     17
TREZISE   Mary          -   Wife       M                F     73
WAITE     Elizabeth     -   Dau        M                F     38
WAITE     William T     -   Grnson     -                M     4
KNIGHT    William       -   Head       M                M     16
KNIGHT    Abigail       -   Wife       M                F     69
KNIGHT    William       -   Son        M                M     36
ALLAN     Josia         -   Head       M                M     16
ALLAN     Mary A        -   Wife       M                F     69
ALLAN     T K           -   Boardr     -                M     2
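Cases like these can be found by computing the age difference between head and wife in each household and flagging implausibly large values. The sketch below uses the ages shown on this slide; the household ids, the assumption that the head is the husband, and the 40-year threshold are illustrative choices and not part of the source.

```python
def marital_age_differences(rows):
    """Return {household id: head age minus wife age}; positive means husband older.

    Assumes the head of household is the husband (an assumption of this sketch).
    """
    heads, wives = {}, {}
    for household_id, _surname, relation, age in rows:
        if relation == "Head":
            heads[household_id] = age
        elif relation == "Wife":
            wives[household_id] = age
    return {h: heads[h] - wives[h] for h in heads if h in wives}

def flag_extremes(differences, threshold=40):
    """Household ids whose absolute age difference exceeds the threshold."""
    return sorted(h for h, d in differences.items() if abs(d) > threshold)

# Ages as transcribed on the slide; household ids 1 to 3 are hypothetical.
ROWS = [
    (1, "TREZISE", "Head", 17), (1, "TREZISE", "Wife", 73),
    (1, "WAITE", "Dau", 38), (1, "WAITE", "Grnson", 4),
    (2, "KNIGHT", "Head", 16), (2, "KNIGHT", "Wife", 69), (2, "KNIGHT", "Son", 36),
    (3, "ALLAN", "Head", 16), (3, "ALLAN", "Wife", 69), (3, "ALLAN", "Boardr", 2),
]
differences = marital_age_differences(ROWS)
suspects = flag_extremes(differences)
```

All three households are flagged. A head aged 16 with a son aged 36 cannot be correct as recorded, which points to an error in the age field and not to a real marriage.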