
Understanding the Importance of Documentation in Household Survey Sampling
Learn why documenting the household survey sampling process is crucial for researchers, including how it helps in data understanding, quality assessment, and enhancing data credibility. Explore international standards like DDI for comprehensive metadata documentation.
Presentation Transcript
World Bank Documenting the Household Survey Sampling process
Outline: Documenting sampling elements of a survey; Metadata Editor; Microdata library
Elements of a household sample survey
- What is it you want to measure/understand? Policy effectiveness, policy design, budgeting and planning
- Design questionnaire (there is science and method to this)
- Preparation of interviewer manuals/training of field staff
- Design sample (probabilistic, size)
- List and draw sample (n size)
- Interview: telephone, web, face-to-face, paper, tablet (CAPI)
- Collect, validate, clean, anonymize, document, disseminate, analyze
Why Document? http://dilbert.com/strip/2010-05-28
Why? Documentation, or metadata, helps the researcher to:
- Find the data they are interested in.
- Understand what the data are measuring and how the data have been created.
- Assess the quality of the data.
It also increases the credibility of the data: users appreciate transparency in data collection and processing methods. Rich metadata reduces the burden on the data producer, as it reduces the need to provide regular support to users of the data.
Documentation: Provide detailed metadata
- For making data usable: users need to fully understand the data, including why, by whom, when, and how they were collected and processed.
- For making data discoverable in catalogs: how will users know about the availability of your data? By providing searchable data catalogs.
International metadata standards (in particular the DDI) and specialized software are available to help document and catalog microdata.
Documentation: International Standard: DDI
- DDI is an XML metadata standard
- Standard checklist of what you need to know about a survey and its dataset(s)
- Documents the full survey life-cycle
- Developed by academic data centers
- Now used in most countries in the world
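As a rough illustration of what "an XML metadata standard" means for the sampling fields covered later in this deck, the sketch below builds a small DDI-style fragment in Python. The element names (`codeBook`, `stdyDscr`, `method`, `dataColl`, `sampProc`, `deviat`, `weight`, `anlyInfo`, `respRate`) follow the DDI Codebook 2.x layout as best understood here and should be checked against the official schema. Namespaces and required elements are left out, the function name is hypothetical, and the field texts are placeholders, not content from any real survey.

```python
import xml.etree.ElementTree as ET


def build_sampling_metadata(sampling_procedure, deviations, response_rate, weighting):
    """Build a simplified DDI-Codebook-style fragment with the four sampling fields.

    Assumption: element names mirror DDI Codebook 2.x; a real DDI file also needs
    namespaces, identifiers, and many other elements omitted from this sketch.
    """
    code_book = ET.Element("codeBook")
    study = ET.SubElement(code_book, "stdyDscr")
    method = ET.SubElement(study, "method")

    # Sampling procedure, deviations, and weighting sit under data collection.
    data_coll = ET.SubElement(method, "dataColl")
    ET.SubElement(data_coll, "sampProc").text = sampling_procedure
    ET.SubElement(data_coll, "deviat").text = deviations
    ET.SubElement(data_coll, "weight").text = weighting

    # Response rate sits under analysis information.
    anly_info = ET.SubElement(method, "anlyInfo")
    ET.SubElement(anly_info, "respRate").text = response_rate
    return code_book


# Placeholder texts for illustration only.
fragment = build_sampling_metadata(
    sampling_procedure="(describe frame, stages, stratification, sample size)",
    deviations="(describe any departure from the planned sample)",
    response_rate="(report response rates, ideally by sample point)",
    weighting="(list weight variables and how they were computed)",
)
xml_text = ET.tostring(fragment, encoding="unicode")
print(xml_text)
```

In practice a tool such as the Metadata Editor writes this XML for you; the point of the sketch is only that each sampling field ends up as a named, machine-readable element.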
Documentation International Standard: DDI Quick Reference Guide for Data Archivists (Especially from Page 11 onwards)
Documentation: Sampling documentation. In the Metadata Editor, one section covers sampling. It contains four sub-sections: sampling procedure, deviations from the sample design, response rate, and weighting.
Documentation: Sampling Checklist
1. Sampling design report with
   - Allocation of the sample into strata
   - Excluded strata, if any
   - Estimation formulas (selection probabilities and weights)
2. Household listing forms
3. Sample frames
   - For the first sampling stage/s: list of all sampling units (typically in Excel)
   - For the last sampling stage: list of all households in each sample point
4. Non-response rates, by sample point
5. On the survey datasets
   - Sampling weights
   - Identification of the sample points
6. In the survey reports
   - Standard errors, confidence intervals and design effects for key variables
Documentation: Steps
1. Organizing your files
2. Gathering and preparing the data set
3. Gathering and preparing the documentation
4. Importing data and establishing relationships
5. Importing external resources
6. Adding metadata
7. Running diagnostics
8. Generating the standard survey documentation using the PDF generator
9. Quality assessment
10. Producing the output for publication
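To make the "estimation formulas (selection probabilities and weights)" item concrete, here is one common form such formulas take. It assumes a two-stage design that the deck itself does not specify: primary sampling units (PSUs) drawn with probability proportional to size within strata, then a fixed number of households drawn from each PSU's listing. The symbols are illustrative: $a_h$ is the number of PSUs selected in stratum $h$, $M_{hi}$ the size measure of PSU $i$ from the frame, $L_{hi}$ the number of households listed in that PSU, and $m_{hi}$ the number of households selected from the listing.

```latex
\begin{align*}
\pi_{hi} &= \frac{a_h \, M_{hi}}{\sum_{i'} M_{hi'}}
  && \text{(first stage: PSU $i$ in stratum $h$)} \\
\pi_{j \mid hi} &= \frac{m_{hi}}{L_{hi}}
  && \text{(last stage: household $j$ within PSU $i$)} \\
w_{hij} &= \frac{1}{\pi_{hi} \, \pi_{j \mid hi}}
  && \text{(design weight of household $j$)}
\end{align*}
```

A sampling design report would state the formulas actually used, which may differ from this sketch (more stages, different selection methods, or additional adjustments).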
Documentation: Sampling. This item should document the design of the sample and the definition of the sample size, including the sampling frame, the sampling type, the final sample size, sample loss, the estimation method, and how the results were calculated. This item is only applicable to sample surveys. This element provides information on the sampling frame and the methods and procedures used to select respondents. The desired sample size should also be mentioned.
Documentation: Sampling: Sampling Procedure
This field only applies to sample surveys. Information on the sampling procedure is crucial. This section should include summary information that includes, though is not limited to:
- Sample size
- Selection process (e.g., probability proportional to size or oversampling)
- Stratification (implicit and explicit)
- Level of representation
- Strategy for absent respondents/not found/refusals (replacement or not)
- Stages of sample selection
- Design omissions in the sample
- Sample frame used, and listing exercise conducted to update it
It is also useful to indicate here which variables in the data files identify the various levels of stratification and the primary sampling unit. These are crucial to data users who want to properly account for the sampling design in their analyses and calculations of sampling errors.
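As an illustration of one selection process named above, probability proportional to size (PPS), here is a minimal Python sketch of systematic PPS selection of first-stage units. The function name, the enumeration-area labels, and the size measures are all hypothetical; an actual survey would document its own frame, size measure, and random start.

```python
import random


def pps_systematic_sample(sizes, n, seed=None):
    """Select n units with probability proportional to size, systematically.

    sizes: dict mapping unit id -> size measure (e.g. households in the frame).
    A unit whose size exceeds the sampling interval can be selected more than once.
    """
    rng = random.Random(seed)
    units = list(sizes)

    # Cumulative size bounds define each unit's segment on the line [0, total).
    bounds = []
    running = 0
    for unit in units:
        running += sizes[unit]
        bounds.append(running)
    total = running

    interval = total / n                 # sampling interval
    start = rng.random() * interval      # random start in [0, interval)

    selected = []
    idx = 0
    for k in range(n):
        target = start + k * interval
        # Advance to the unit whose segment contains the target point.
        while idx < len(bounds) - 1 and bounds[idx] <= target:
            idx += 1
        selected.append(units[idx])
    return selected


# Hypothetical frame of enumeration areas with household counts.
frame = {"EA01": 120, "EA02": 80, "EA03": 200, "EA04": 50, "EA05": 150}
sample = pps_systematic_sample(frame, n=2, seed=1)
print(sample)
```

Under this scheme a unit no larger than the interval is selected with probability n × size / total, which is the kind of figure the sampling documentation should let a user reconstruct.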
Documentation: Sampling. Sometimes the reality of the field requires a deviation from the sampling design (for example, because zones are difficult to access due to weather problems, political instability, etc.). If, for any reason, the sample departed from the design, this should be reported here. Major deviations from the sample design: this element is used to describe the correspondence between the units that were successfully surveyed and the planned sample. Any significant deviation should be mentioned here.
Documentation: Sampling: Response rate. The response rate is the percentage of households (or other sample units) that participated in the survey, based on the original sample size. Omissions may occur due to refusal to participate, inability to locate the respondent, or other reasons.
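The response rate defined above is simple arithmetic, and the checklist asks for it by sample point. A minimal Python sketch follows, with hypothetical sample-point labels and made-up counts used purely for illustration.

```python
def response_rates(selected, completed):
    """Return (rate per sample point, overall rate), both in percent.

    selected:  dict sample point -> households in the original sample
    completed: dict sample point -> households that participated
    """
    rates = {
        point: 100.0 * completed.get(point, 0) / selected[point]
        for point in selected
    }
    total_completed = sum(completed.get(point, 0) for point in selected)
    overall = 100.0 * total_completed / sum(selected.values())
    return rates, overall


# Made-up counts for three hypothetical sample points.
selected = {"SP01": 20, "SP02": 20, "SP03": 20}
completed = {"SP01": 18, "SP02": 20, "SP03": 15}
rates, overall = response_rates(selected, completed)
print(rates, round(overall, 1))
```

Note that the denominator is the original sample size, as in the definition above; if replacement households were used, that should be stated alongside the rate.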
Documentation: Sampling: Weighting. Provide here the list of variables used as weighting coefficients. If more than one variable is a weighting variable, describe how these variables differ from each other and what the purpose of each one is. Example: Sample weights were calculated for each of the data files. Sample weights for the household data were computed as the inverse of the probability of selection of the household, computed at the sampling domain level (urban/rural within each region). The household weights were adjusted for non-response at the domain level, and were then normalized by a constant factor so that the total weighted number of households equals the total unweighted number of households.
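The three steps in the example (inverse selection probability, non-response adjustment at the domain level, normalization) can be sketched in Python as follows. This is one reading of the example text, not the procedure of any particular survey; the function name, domain labels, probabilities, and counts are hypothetical.

```python
def household_weights(households, selection_prob, selected, responded):
    """Compute normalized household weights.

    households:     list of (household id, domain) for responding households
    selection_prob: dict domain -> probability of selection of a household
    selected:       dict domain -> households selected
    responded:      dict domain -> households that responded
    """
    raw = {}
    for hh_id, domain in households:
        base = 1.0 / selection_prob[domain]                # inverse probability
        nr_adjust = selected[domain] / responded[domain]   # non-response factor
        raw[hh_id] = base * nr_adjust

    # Normalize so that the weighted total equals the unweighted count.
    factor = len(raw) / sum(raw.values())
    return {hh_id: weight * factor for hh_id, weight in raw.items()}


# Hypothetical inputs: two domains, five responding households.
households = [("u1", "urban"), ("u2", "urban"), ("u3", "urban"),
              ("r1", "rural"), ("r2", "rural")]
weights = household_weights(
    households,
    selection_prob={"urban": 0.02, "rural": 0.01},
    selected={"urban": 4, "rural": 2},
    responded={"urban": 3, "rural": 2},
)
print(weights)
```

Normalized weights like these preserve relative representation but no longer sum to the population size, which is exactly the kind of detail the weighting field should spell out for users.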
Documentation: The Metadata Editor and Pacific Data Library
- Nesstar Publisher: live demo
- Pacific Data Library: pdl.spc.int