
Causes and Applications of Causal Inference in Statistics
Explore the life and work of a renowned American statistician who revolutionized causal inference in the social sciences, development economics, and field experiments. Learn about his approach to dealing with missing data and noncompliance, with a focus on transparent notation and real-world examples. Discover the significance of instrumental variables and how the contributions of Popper, Fisher, Bayes, and Neyman combine to advance causal inference methodology.
Presentation Transcript
The Statistician Who Caused a Stir
Fresh from university, he took on a consulting role at the US Educational Testing Service. Unleashed to research what he wanted (within reason), he set about establishing the causal model that would later be named after him. While the police concentrate on looking for missing people, this statistician focuses his attention on dealing with missing data. This world-famous American statistician spent much of his illustrious career hunting out causes, effects, potential outcomes, and data that had gone AWOL. His work on statistics, and in particular on causal inference, has helped bring causality to the heart of social science, revolutionizing development economics and randomized field experiments, not to mention psychology and medicine, by addressing dropout and noncompliance. This important contribution has been widely recognized, earning him numerous awards and positions. Born in Washington, D.C., he was an excellent student and embarked on an accelerated PhD physics program at Princeton University. Along the way he switched to psychology, before being told to swot up on statistics, which he did, earning a PhD in statistics, but not before dabbling in computer science and teaching himself to program in Fortran.
Perspective on Statistics/Data Science
- Seek transparent notation: a critical effort
- Eschew clutter, except for motivating examples
- Work from real examples that are generalizable
- Try to use intuitive but precise language
Critical application of the approach is applied causal inference: Popper / Fisher / Bayes / Neyman, and implementing the combination of their contributions.
- Illustrate the ideas with an example: a treatment/control randomized experiment complicated by one-sided noncompliance (some units assigned treatment don't take the treatment).
- The complete-data analysis is Fisher's great idea: the randomization test of the Popperian sharp null of no treatment effect.
- How to deal with missing compliance status for those assigned control? Ignore compliance status? Medically stupid! Controls got no meds!
- Other inferior methods: toss the non-compliers, or include them with the control group (as treated). Both destroy the theoretical balance created by the randomization and violate Fisher and Neyman.
- Better: use IV (instrumental variables); the resulting answer is the same as Sommer and Zeger (1991). A sketch follows below.
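The two analyses named above fit in a few lines of code. Below is a minimal sketch, not taken from the presentation, of (i) a Fisher randomization test of the sharp null and (ii) the standard IV (Wald) estimator of the complier average causal effect under one-sided noncompliance. The variable names z (assignment), d (treatment received), and y (outcome) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fisher_randomization_test(z, y, n_perm=10_000):
    """Fisher randomization test of the Popperian sharp null of no effect.

    Under the sharp null every unit's outcome is unaffected by assignment,
    so assignment can be re-randomized and the test statistic (difference
    in mean outcomes by assignment) recomputed to build its null distribution.
    """
    z, y = np.asarray(z), np.asarray(y)
    observed = y[z == 1].mean() - y[z == 0].mean()
    perm = np.empty(n_perm)
    for i in range(n_perm):
        z_star = rng.permutation(z)                      # re-randomize assignment
        perm[i] = y[z_star == 1].mean() - y[z_star == 0].mean()
    return np.mean(np.abs(perm) >= np.abs(observed))     # two-sided p-value

def iv_wald_estimate(z, d, y):
    """IV (Wald) estimate of the complier average causal effect.

    With one-sided noncompliance (nobody assigned control gets treatment),
    this is the intention-to-treat effect on the outcome divided by the
    compliance rate among those assigned treatment.
    """
    z, d, y = (np.asarray(a) for a in (z, d, y))
    itt_y = y[z == 1].mean() - y[z == 0].mean()          # ITT effect on outcome
    itt_d = d[z == 1].mean() - d[z == 0].mean()          # ITT effect on treatment uptake
    return itt_y / itt_d
```

Under one-sided noncompliance the denominator of the Wald ratio is simply the compliance rate among those assigned treatment, so the intention-to-treat effect is scaled up to an effect for compliers, in the spirit of the Sommer and Zeger (1991) analysis.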
Same basic idea as the Angrist & Imbens Nobel in Economics.
- IV was explicated statistically in Angrist, Imbens & Rubin (1996), where Neyman's insightful 1923 notation was the key to the resulting transparency and increased use.
- But the IV estimator has poor repeated-sampling properties (Neyman again!).
- Because compliance status is missing for some control units, it is best to multiply impute the compliance status the controls would have had if they had been assigned treatment, thereby using all randomized units and preserving the balance created by randomization, which is lost by discarding them. A simplified imputation sketch follows below.
- Even worse is considering non-compliers to be complying controls: they are not!
- Evaluate how well this approach works in long-run practice using Neyman's repeated-sampling approach.
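As a concrete, deliberately simplified illustration of the multiple-imputation idea (not the full model-based procedure), the sketch below imputes would-be compliance status for control units from a covariate-free Beta-Bernoulli model, estimates the complier effect within each completed data set, and combines the results with the standard multiple-imputation combining rules. The function name impute_and_estimate and the omission of covariate and outcome modelling are assumptions made here for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

def impute_and_estimate(z, d, y, n_imputations=50):
    """Multiply impute compliance status for control units, estimate the
    complier effect within each completed data set, and combine the results.

    Simplification: compliance is imputed from a Beta-Bernoulli model that
    ignores covariates and outcomes; a full model-based analysis would
    impute from a model of compliance given covariates and observed outcomes.
    """
    z, d, y = (np.asarray(a) for a in (z, d, y))
    treated = z == 1
    n_comply = int(d[treated].sum())                 # compliers among assigned treatment
    n_noncomply = int(treated.sum()) - n_comply

    estimates, variances = [], []
    for _ in range(n_imputations):
        # Draw the compliance probability from its Beta(1, 1) posterior so the
        # imputations reflect uncertainty about the compliance rate.
        p = rng.beta(1 + n_comply, 1 + n_noncomply)
        d_imp = d.astype(float).copy()
        d_imp[~treated] = rng.binomial(1, p, size=int((~treated).sum()))

        y_t = y[treated & (d_imp == 1)]              # observed compliers, assigned treatment
        y_c = y[~treated & (d_imp == 1)]             # imputed compliers, assigned control
        estimates.append(y_t.mean() - y_c.mean())
        variances.append(y_t.var(ddof=1) / y_t.size + y_c.var(ddof=1) / y_c.size)

    estimates, variances = np.array(estimates), np.array(variances)
    m = n_imputations
    within, between = variances.mean(), estimates.var(ddof=1)
    total_var = within + (1 + 1 / m) * between       # multiple-imputation total variance
    return estimates.mean(), np.sqrt(total_var)
```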
These four steps, Popper (state a precise null), Fisher (evaluate a p-value), Bayes (multiply impute the missingness), and Neyman (evaluate calibration), work in broad generality, especially if the last step is refined to be conditional calibration rather than simple global calibration; see Rubin (Statistics in Medicine, 2012) on conditional calibration.
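The Neyman calibration step can itself be checked by simulation: generate many randomized experiments from a known data-generating process, run the analysis on each, and see whether nominal 95% intervals cover the true complier effect close to 95% of the time. The sketch below does this for the simplified impute_and_estimate function above; the data-generating process, including the assumption that baseline outcomes do not depend on compliance type (which keeps that simplified imputation unbiased), is invented here for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_trial(n=400, p_comply=0.7, effect=1.0):
    """Simulate one randomized trial with one-sided noncompliance.

    Baseline outcomes do not depend on compliance type here, which is what
    makes the simplified covariate-free imputation above adequate.
    """
    z = rng.binomial(1, 0.5, size=n)                 # randomized assignment
    complier = rng.binomial(1, p_comply, size=n)     # latent compliance type
    d = z * complier                                 # controls never receive treatment
    y = effect * d + rng.normal(size=n)
    return z, d, y

def coverage_check(n_reps=500, effect=1.0):
    """Neyman-style repeated-sampling check: how often do nominal 95%
    intervals from impute_and_estimate cover the true complier effect?"""
    covered = 0
    for _ in range(n_reps):
        z, d, y = simulate_trial(effect=effect)
        est, se = impute_and_estimate(z, d, y)       # defined in the sketch above
        covered += (est - 1.96 * se) <= effect <= (est + 1.96 * se)
    return covered / n_reps

# Example: coverage_check() should return a value near 0.95 if the procedure
# is well calibrated; conditional calibration would repeat the check within
# strata of an observed statistic rather than globally.
```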