
Future Population Estimation Methodologies: A Data-Driven Approach
Explore a comprehensive overview of the PECADO Project's innovative population estimation methods using administrative data sources. Learn about the Statistical Population Dataset (SPD), Signs of Life (SoL), and Dual System Estimation (DSE) techniques to derive accurate population statistics broken down by detailed geography. Discover how these methodologies address the requirement to provide Census-like statistics annually from 2024 onwards, given that conducting an annual Census in the traditional manner is not feasible.
Presentation Transcript
Demography Statistics - looking to the future
John.Dunne@cso.ie
Administrative Data Seminar, 4th December 2018
Motivation
o Requirement: to provide Census-like statistics at detailed geography on an annual basis from 2024 onwards
o It is not feasible to conduct an annual Census in the traditional manner
Overview
o Population Estimates
  - PECADO Project and first Research Outputs
o Broken down by geography (emerging thoughts)
  - Use EIRCODE
o Household Composition (emerging thoughts)
  - A Methodological Framework
o Putting the pieces together
PECADO Project: Population Estimates Compiled from Administrative Data Only
Methodology Outline
o Create a Statistical Population Dataset (SPD) using Signs of Life (SoL)
o Adjust SPD counts using DSE to obtain population estimates
o Validate assumptions
  - no linkage error
  - no erroneous data
  - equal catchability in one list
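The first step of the outline can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a person enters the SPD for a reference year if at least one administrative source records activity for them in that year. The record layout, the source names and the `min_sources` threshold are hypothetical and are not taken from the slides.

```python
from collections import defaultdict


def build_spd(activity_records, reference_year, min_sources=1):
    """Return the set of person IDs showing signs of life in reference_year.

    activity_records: iterable of (person_id, source, year) tuples
                      (hypothetical layout, not the actual data model).
    min_sources:      hypothetical threshold on the number of distinct
                      sources that must show activity for the person.
    """
    sources_by_person = defaultdict(set)
    for person_id, source, year in activity_records:
        if year == reference_year:
            sources_by_person[person_id].add(source)
    return {
        person_id
        for person_id, sources in sources_by_person.items()
        if len(sources) >= min_sources
    }


# Illustrative records only; the source names are made up.
records = [
    ("P1", "tax", 2016),
    ("P1", "welfare", 2016),
    ("P2", "education", 2015),  # no activity in 2016, so excluded
    ("P3", "tax", 2016),
]
spd_2016 = build_spd(records, reference_year=2016)
```

Requiring activity in the reference year is what keeps persons who have left the population (for example P2 above) out of the SPD, at the cost of some undercoverage that the DSE step is then meant to adjust for.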
Data Source Distributions (SPD or PAR)
DSE explained
o Naïve DSE: $\hat{N} = \frac{nx}{m}$
o Ideal DSE
o List A of size x, but r is unknown
o List B of size n
o AB list match of size m
o List B: equal catchability (homogeneous capture)
o Matching assumption: no linkage error between List A and List B
o No erroneous records or over-coverage
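As a worked illustration of the naïve estimator, the sketch below computes n·x/m from the three counts named on the slide (x for list A, n for list B, m for the matched records). The function name and the example counts are made up for illustration.

```python
def naive_dse(x, n, m):
    """Naive dual system estimate of population size.

    x: number of records on list A
    n: number of records on list B
    m: number of records matched between list A and list B
    """
    if m <= 0:
        raise ValueError("at least one matched record is needed")
    return n * x / m


# Illustrative counts only: 900 on list A, 500 on list B, 450 matched.
estimate = naive_dse(x=900, n=500, m=450)
print(estimate)
```

With these counts, 450 of the 500 list B records are found on list A, so list A is taken to cover 90% of the population and the estimate is 900 / 0.9 = 1000.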
First attempt population estimates, 2011 (SPD)
o Blocking by age, gender and nationality group
o Includes a source subsequently found to have erroneous records in the over-65 age group
Trimmed Dual System Estimation (TDSE)
o Extension of DSE methodologies that enables hunting for erroneous records
o Joint work between CSO and University of Southampton
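The slide does not give a formula for TDSE. As a rough illustration only, the Python sketch below assumes one formulation of trimming: remove k suspect records from list A, of which k1 had been matched to list B, and recompute the estimate as n(x - k)/(m - k1). The function name and the counts are hypothetical, and the joint CSO and Southampton work should be consulted for the actual method.

```python
def trimmed_dse(x, n, m, k, k1):
    """Dual system estimate after trimming k records from list A.

    x:  number of records on list A before trimming
    n:  number of records on list B
    m:  number of matched records before trimming
    k:  number of records trimmed from list A
    k1: number of trimmed records that had been matched to list B
    Assumed formulation (not taken from the slides): n * (x - k) / (m - k1).
    """
    if not 0 <= k1 <= k <= x:
        raise ValueError("need 0 <= k1 <= k <= x")
    if m - k1 <= 0:
        raise ValueError("trimming must leave at least one matched record")
    return n * (x - k) / (m - k1)


# Illustrative counts: list A holds 900 genuine and 50 erroneous records.
before = trimmed_dse(x=950, n=500, m=450, k=0, k1=0)   # no trimming
after = trimmed_dse(x=950, n=500, m=450, k=50, k1=0)   # trim 50 unmatched
```

Under this assumption, trimming records that are erroneous (and therefore unmatched) pulls the estimate down, whereas trimming genuine records at random would be expected to leave it roughly unchanged. That contrast is one way trimming could be used to hunt for erroneous records.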
Most recent attempt population estimates, 2016 (SPD)
o Note
  - Different population concepts
  - Possibility of erroneous records remaining
Innovation in Application
o SPD SoL approach reduces statistical problems from 4 to 1
  - Domain misclassification, linkage error, overcoverage, undercoverage
o Only one list requires equal catchability
o Use administrative data (Driver Licence Renewals - DLD) as list B in DSE
o Validate assumptions
  - no linkage error (given)
  - no erroneous data (TDSE extension of DSE methods)
  - equal catchability in list B (swap admin list B for survey list B and compare results)
o Examine robustness of PECADO system (TDSE, Trimmed Dual System Estimation)
Breaking down by Geography - emerging thoughts
Geographical breakdown
o Dependency on EIRCODE in Public Administration Systems
o In the absence of EIRCODE, big challenge to code address strings
Methodology (emerging thinking)
o Business rules to choose geography based on address strings
  - case of multiple addresses for persons
o DSE blocking by geography, age and gender
  - clustering of similar small area geographies
  - constrain by State level population estimates
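A minimal Python sketch of blocked estimation follows. It applies the naïve n·x/m estimator within each geography, age and gender block and then scales the block estimates so that they sum to a State level total. The pro-rata scaling is an assumption made for illustration, since the slides do not say how the constraint is applied. The block keys and counts are made up.

```python
def blocked_dse(blocks):
    """Apply the naive DSE n * x / m separately within each block.

    blocks: dict mapping a block key, e.g. (geography, age group, gender),
            to a tuple (x, n, m) of list A size, list B size and matches.
    """
    return {key: n * x / m for key, (x, n, m) in blocks.items()}


def constrain_to_total(estimates, state_total):
    """Scale block estimates pro rata so they sum to state_total.

    Pro-rata scaling is an illustrative assumption, not the slides' method.
    """
    scale = state_total / sum(estimates.values())
    return {key: value * scale for key, value in estimates.items()}


# Illustrative blocks only.
blocks = {
    ("Area A", "20-29", "F"): (90, 50, 45),
    ("Area B", "20-29", "F"): (180, 100, 60),
}
estimates = blocked_dse(blocks)
constrained = constrain_to_total(estimates, state_total=380.0)
```

Small blocks can have very few matched records, which makes the block estimate unstable. Clustering similar small area geographies, as the slide suggests, is one way to pool counts before estimating.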
Household Composition - emerging thoughts
Data sources
o Geography (EIRCODE)
o Relationships in administrative data sources
o Household surveys
Possible Methodological Framework
1. Classify each person on the SPD by the type of household they belong to (best effort): HH(A)
2. Take the Household Survey as the correct Household Composition: HH(B)
3. Use an extension of DSE methods with missing covariates to estimate for the population
4. Constrain to population size and estimated number of occupied dwellings
5. Collapse over HH(A)

                 In Survey (B)    Not in Survey
In SPD (A)       A and B          A, not B          In A
Not in SPD       B, not A         not A, not B
                 In B                               Population

Each cell on the slide is labelled with the covariates HH(A) and HH(B).
An extension of DSE methods with missing covariates: research work at Utrecht University
Estimation Workflow (diagram)
o Columns: Year t - 2, Year t - 1, Year t
o Elements for each year: Persons (Population for Year t), Geography, Household Composition, Geography + Household Composition
Building out the SPD (diagram)
o Persons on SPD/PAR with known geography/household
o Persons on SPD/PAR allocated to existing and new households
o Persons not on SPD/PAR allocated to existing and new households
o Persons within Households/Geography
o Households/Geography
Concluding comments
o Objective is to make a Virtual Census a reality
  - Enhance quality of data (EIRCODE + Coverage)
  - Create a methodological framework
o Survey will be required (ground truth)
o Requires a maturing NDI
o Requires new methods
o Collaboration/partnership (with NSIs and Academia)
o CSO developing new capabilities
o Hopefully we can meet requirement for reference year 2024
Thank you John.Dunne@cso.ie