Propensity Score Matching Methods in Educational Research

Explore the importance of rigor in educational research, the lack of clarity in causal claims, policy changes driving towards research rigor, and the role of cause and effect in randomized control trials. Discover methods to enhance methodological rigor through propensity score matching.

Uploaded on Mar 19, 2025 | 3 Views

Presentation Transcript

Applying Propensity Score Matching Methods in Institutional Research
Stephen L. DesJardins, Professor, Center for the Study of Higher and Postsecondary Education, School of Education, and Professor, Gerald R. Ford School of Public Policy, University of Michigan. CA AIR Conference Workshop, November 20, 2014.

Organization of the Workshop
Examine the conceptual basis of non-experimental methods; this is a necessary but not sufficient condition for conducting methodologically rigorous research. Survey the conceptual foundations of matching methods, esp. PSM methods. Provide & discuss Stata commands to estimate PSM models. Share references to readings & sources of code to enhance post-workshop learning.

Importance of Rigor in Research
Systematically improving education policies, programs, & practices requires an understanding of what works. Goal: Make causal statements. Without doing so it is difficult to accumulate a knowledge base that has value for practice or future study (Schneider, 2007, p. 2). However, education research has lacked rigor & relevance.

Why the Lack of Rigor?
Often there is a lack of clarity about the designs & methods optimal for making causal claims. Many researchers were not educated in the application of these methods. Many lack time to learn new methods or may feel they are too complicated to learn. It is hard to create & sustain norms & a common discourse about what constitutes rigor.

Policy Changes Driving Push Toward Rigor
NCLB Act (2001): Included a definition of scientifically-based research & set aside funds for studies consistent with that definition. The Education Sciences Reform Act (2002) replaced the Office of Educational Research & Improvement (OERI) with IES. Funding from IES, NSF, & other federal agencies is tied to rigorous designs/methods. Many reports have focused on the need to improve the quality of education research.

Cause and Effect
In randomized control trials (RCTs) the question is: What is the effect of a specific program or intervention? A Summer Bridge program (intervention) may cause an effect (improved college readiness). Shadish, Cook, & Campbell (2002): We rarely know all the causes of effects or how they relate to one another. Hence the need for controls in regression frameworks.

Cause and Effect (cont'd)
Holland (1986) notes that true causes are hard to determine unequivocally; we seek to determine the probability that an effect will occur. This allows the opportunity to est. why some effects occur in some situations but not in others. Example: Completing higher levels of math courses in HS may improve chances of finishing college more for some students than for others. Here we are measuring the likelihood that the cause led to the effect, not the true cause/effect.

Determining Causation
RCTs are the gold standard for determining causal effects. Pros: Reduce bias & spurious findings, thereby improving knowledge of what works. Cons: Ethics, external validity, cost, & errors that are also inherent in observational studies: measurement problems, spillover effects, attrition. Possibilities: Oversubscribed programs (Living Learning Communities, UROP).

The Logic of Causal Inference
Need to distinguish between the inference model specifying the cause/effect relation & the statistical methods determining the strength of the relation. The inference model specifies the parameters we want to estimate or test. The statistical technique describes the mathematical procedure(s) to test hypotheses about whether a treatment produces an effect.

A Common Causal Scenario
[Diagram relating three elements: Cause (e.g., Treatment), Effect (e.g., Educational Outcome), and Observed or Unobserved Confounding Variable(s).]

The Counterfactual Framework
Owing to Rubin (1974, 1977, 1978, 1980). Intuition: What would have happened if an individual exposed to a treatment was NOT exposed or was exposed to a different treatment? Causal effect: Difference between the outcome under treatment & the outcome if the individual were exposed to the control condition (no treatment or other treatment). Formally: $\delta_i = Y_i^t - Y_i^c$
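As a worked illustration of this definition, writing $\delta_i$ for the individual causal effect and using made-up numbers (the GPA values are purely hypothetical, not from the workshop), suppose student i would earn a first-semester GPA of 3.2 under the treatment and 2.9 under the control condition:

```latex
\[
\delta_i = Y_i^t - Y_i^c = 3.2 - 2.9 = 0.3
\]
```

Only one of the two GPAs can ever be observed for a given student, so $\delta_i$ cannot be computed directly from real data.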

The Fundamental Problem
The fundamental problem of causal inference is that if we observe $Y_i^t$ we cannot simultaneously observe $Y_i^c$. Holland (1986) identified two solutions to this problem: one scientific, one statistical. Scientific: Expose i to treatment 1, measure Y; expose i to treatment 2, measure Y. The difference in outcomes is the causal effect. Assumptions: Temporal stability (response constancy) & causal transience (effect of 1st treatment does not affect i's response to 2nd treatment).

Fundamental Problem (cont'd)
Second scientific way: Assume all units are identical; thus, it doesn't matter which unit receives the treatment (unit homogeneity). Give treatment to unit 1 & use unit 2 as control, then compare the difference in Y. These assumptions are rarely plausible when studying individuals. Maybe when studying twins, as in the MN Twin Family Study. And this is not a study of a baseball team!

The Statistical Solution
Rather than focusing on units (i), estimate the average causal effect for a population of units (i's). Formally: $\delta = E(Y^t - Y^c)$, where the Y's are average outcomes for individuals in the treatment & control groups. Assume: i's differ only in terms of treatment group assignment, not on characteristics or prior experiences that could affect Y.

Example
If we study the effects of being in a summer bridge program on GPA in the 1st semester of college, students who select into treatment may be materially different from their peers. If we could randomly assign students to the program (or not), then we could examine the causal impact of the program on GPA. Why? Because group assignment would, on average, be independent of any measured or unmeasured pretreatment characteristics.
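The randomization argument can be sketched with a small Stata simulation. Everything here is an illustrative assumption rather than workshop material: the variable names, the sample size, and the true program effect of 0.2 GPA points. Run it from a do file.

```stata
* Hypothetical simulation: names and parameter values are assumptions
clear
set seed 2014
set obs 10000
gen ability = rnormal()                        // pretreatment trait, unmeasured in practice
gen y_c = 2.8 + 0.4*ability + rnormal(0, 0.3)  // potential GPA without the program
gen y_t = y_c + 0.2                            // potential GPA with the program
gen treat = runiform() < 0.5                   // random assignment, unrelated to ability
gen gpa = cond(treat, y_t, y_c)                // only one potential outcome is observed
ttest ability, by(treat)                       // groups should look alike before treatment
ttest gpa, by(treat)                           // difference in means estimates the effect
```

Because treat is generated independently of ability, the two groups differ systematically only in treatment status, which is what lets the simple difference in mean GPA recover the assumed effect.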

Problems with Idealized Solution
Random assignment is not always possible, so the independence of pretreatment characteristics & treatment group assignment is violated. Even when randomization is used, statistical methods are often used to adjust for confounding variables, by controlling for student, classroom, & school characteristics that predict treatment assignment & outcomes. But this approach is often sub-optimal.

Criteria for Making Causal Statements
Causal relativity: The effect of a cause must be assessed relative to the effect of another cause. Causal manipulation: Units must be potentially exposable to both the treatment & control conditions. Temporal ordering: Exposure to the cause must occur at a specific time or within a specific time period before the effect. Elimination of alternative explanations.

Issues in Employing RCTs
There may be differences between treated/controls even under randomization (small samples): employ regression methods to control for diffs; cross-study comparisons & replication are useful. The avg effect in the population may not be of most interest (ATT; heterogeneous treatment effects): test for sub-group differences in treatment effects. The mechanism for assignment to treatment may not be independent of responses: merit-based programs & responses ("halo").

Issues in Employing RCTs (cont'd)
Responses of the treated should not be affected by the treatment of others ("spillover" effects). E.g., a new retention program is initiated; controls respond by being demoralized (motivated), leading to upward (downward) bias in the treatment effects. Treatment non-compliance & attrition: Students are randomly assigned to programs, but some will leave programs before completion. Options: ITT analysis; remove non-compliers; focus on true compliers.

Quasi/Non-Experimental Designs
Compared to RCTs, there is no randomization. There are many quasi-experimental designs; many are variations of the pre-test/post-test structure without randomization. They apply when non-experimental ("observational") data are used, which is often the case in ed. research. Pros: When properly done, may be more generalizable than RCTs. Main problem: Internal validity. Did the treatment really produce the effect?

Causation with Observational Data
Often difficult to ascertain because of non-random assignment to treatment. Example: Students often self-select into courses, interventions, & programs, which may result in biased estimates when naive methods are employed to ascertain treatment effects. Goal? Mimic the desirable properties of RCTs. Solution? Employ designs/methods that account for non-random assignment; we will demonstrate some today.

Counterfactuals
When using observational data the idea is: Find a group that looks like the treated on as many dimensions as you can measure. Establishing what the counterfactual is & how to create a legitimate control group is difficult. The best counterfactual is one's self! (Adam & Grace time machine example.) This is often why you see repeated measures designs. Twins study in MN.

The Naive Statistical Approach
$Y = \alpha + \beta_1 X + \beta_2 T + \varepsilon$ (1)
where Y is the outcome of interest; X is a set of controls; T is the treatment dummy; $\alpha$ & $\beta$ are parameters to be estimated, with $\beta_2$ being the parameter estimate of interest; & $\varepsilon$ is the error term accounting for unmeasured or unobservable factors affecting Y. Problem: If T & $\varepsilon$ are correlated, then the estimate of $\beta_2$ will be biased. (1) is known as the outcome or structural equation, or sometimes stage 2.
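Why the correlation matters can be sketched in a simplified case with no controls X, an assumption made here only to keep the algebra short. Writing $\beta_2$ for the treatment coefficient and $\varepsilon$ for the error term, the OLS estimate from regressing Y on the treatment dummy T alone converges to the true coefficient plus a term driven by the covariance between T and the error:

```latex
\[
\hat{\beta}_2 \;\xrightarrow{\;p\;}\; \beta_2 + \frac{\mathrm{Cov}(T,\varepsilon)}{\mathrm{Var}(T)}
 \;=\; \beta_2 + \bigl( E[\varepsilon \mid T=1] - E[\varepsilon \mid T=0] \bigr)
\]
```

The second equality uses the fact that T is binary. The bias vanishes only when the unmeasured factors have the same mean in the treated and untreated groups.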

Selection Adjustment Methods
Fixed effects (FE) methods, instrumental variables (IV), propensity score matching (PSM), & regression discontinuity (RD) designs have all been used to approximate randomized controlled experiment results. All are regression-based methods. Each has strengths/weaknesses, & their applicability often depends on knowledge of the data-generating process (DGP) & the richness of the data available.

Matching Methods
Compare outcomes of similar individuals where the only difference is treatment; discard other observations. Example: GEAR UP effects on HS graduation. Low-income students (on avg) have lower achievement & are less likely to graduate from HS. A naive comparison of GEAR UP students to others is likely to give biased results because the untreated tend to have higher HS graduation rates. Use matching methods to develop a similar non-treated group to compare HS grad rates.

One Remedy: Direct Matching
Find control cases with pre-treatment characteristics that are exactly the same as those of the treated group. The strategy breaks down because as the number of X's increases, pr(match) goes to zero. This is known as the curse of dimensionality; e.g., matching on 20 binary variables results in $2^{20}$ or 1,048,576 possible values for the X's! If you add in continuous vars (e.g., GPA, income) the problem becomes even more intractable.
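The arithmetic behind the curse of dimensionality can be made concrete. With k binary covariates there are $2^k$ distinct covariate profiles. The sample of 10,000 students used below is a hypothetical figure chosen only for illustration:

```latex
\[
2^{10} = 1{,}024, \qquad 2^{20} = 1{,}048{,}576, \qquad
\frac{10{,}000}{2^{20}} \approx 0.0095
\]
```

Even if every one of those 10,000 students had a different profile, fewer than 1% of the possible profiles would contain anyone at all, so exact matches for a given treated case would be rare.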

Propensity Score Matching
Solution: Estimate the propensity score (PS) & match treated with control cases based only on this single number. This approach controls for pre-treatment differences by balancing each group's set of observable characteristics on a single number. Goal: Estimate treatment effects for individuals with similar observable characteristics, as indexed by the PS.

Estimating the Propensity Score
Estimate Pr(treatment). This is typically done using logistic regression, but some software uses probit. Use the PS to find control(s) with the same score as the treated observation; this establishes the counterfactual ("control" group). Test for differences in outcomes between treated & counterfactual ("controls"); this is often done using regression methods.
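The steps above can be sketched with the user-written psmatch2 command on simulated data. The data, variable names (x1, x2, treat, y), and the true effect of 2 are assumptions for illustration, and psmatch2 must be installed first (ssc install psmatch2). Option names follow the psmatch2 help file as I understand it; check help psmatch2 for your installed version.

```stata
* Hypothetical data; requires: ssc install psmatch2
clear
set seed 2014
set obs 3000
gen x1 = rnormal()
gen x2 = runiform() < 0.4
gen treat = runiform() < invlogit(-0.5 + 0.8*x1 + 0.6*x2)  // selection on observables
gen y = 1 + 0.5*x1 + 0.3*x2 + 2*treat + rnormal()          // assumed true effect = 2
* Estimate the PS by logit, match each treated case to its nearest
* control on the PS, and report the treated/control outcome difference
psmatch2 treat x1 x2, outcome(y) logit neighbor(1) common
summarize _pscore if treat == 1
summarize _pscore if treat == 0
```

psmatch2 stores the estimated score in _pscore, so the two summarize commands give a quick look at how the score is distributed in each group.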

Goal of PS Matching
When done correctly, the probability that a treated observation has a specific trait (X=x) is the same as the probability that an untreated observation has it. PSM is basically a resampling or even oversampling method, which involves a bias & variance tradeoff; e.g., when matching with replacement, avg. match quality increases & bias decreases, but fewer distinct controls are used, increasing the variance of the estimator.

PSM Assumptions: Conditional Independence Assumption
Conditional on observables, there is no correlation between the treatment & the outcome that occurs absent the treatment. Mathematically: $(Y_1, Y_0) \perp D \mid X$. After controlling for observables, the treatment assignment is as good as random. Upshot: Untreated observations can serve as the counterfactual for the treated.
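A short derivation shows how the assumption turns into an estimable quantity. Here Y denotes the observed outcome, D the treatment indicator, and ATT the average treatment effect on the treated. The derivation also relies on treated and untreated cases existing at each value of X:

```latex
\begin{align*}
E[Y_0 \mid D=1, X] &= E[Y_0 \mid D=0, X] \\
\mathrm{ATT} &= E[Y_1 - Y_0 \mid D=1] \\
 &= E_{X \mid D=1}\bigl\{ E[Y \mid D=1, X] - E[Y \mid D=0, X] \bigr\}
\end{align*}
```

The first line is what conditional independence delivers: the unobserved no-treatment outcome of the treated can be replaced by the observed outcome of untreated cases with the same X.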

Assumption: Common Support
The probability of receiving treatment for each value of X lies strictly between 0 and 1. Mathematically: $0 < P(D = 1 \mid X) < 1$. AKA the overlap condition, because it ensures overlap in the characteristics of treated & untreated to find matches (common support). Upshot: A match can actually be made between the treated and untreated observations.
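One way to inspect overlap, sketched on simulated data, is to compare the range of estimated scores in the two groups and flag cases inside the shared range. The data and the names pscore and on_support are assumptions for illustration; the min/max rule used here is one common convention, not necessarily the one used in the workshop.

```stata
* Hypothetical data illustrating an overlap check
clear
set seed 2014
set obs 2000
gen x1 = rnormal()
gen treat = runiform() < invlogit(-0.5 + 1.2*x1)
logit treat x1
predict pscore, pr                       // estimated Pr(treat = 1 | x1)
tabstat pscore, by(treat) statistics(min max)
* Keep the score range that both groups share
summarize pscore if treat == 1
scalar lo_t = r(min)
scalar hi_t = r(max)
summarize pscore if treat == 0
gen byte on_support = pscore >= max(scalar(lo_t), r(min)) & pscore <= min(scalar(hi_t), r(max))
tabulate treat on_support
```

Cases with on_support equal to 0 have no counterpart with a comparable score in the other group, so matches for them would be extrapolations.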

Assumptions (cont'd)
When CIA & common support are satisfied, treatment assignment is strongly ignorable. Though not an assumption, observed characteristics need to be balanced across the treated & untreated groups. If not, then regardless of whether the assumptions hold there will be bias from selection on observable characteristics. Can check for balancing & how much bias is reduced by matching on observables.
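A balance check can be sketched with pstest, which is distributed with the user-written psmatch2 package. The simulated data and variable names are assumptions, and the both option is taken from the pstest help file as I understand it; older versions of the package may use a different syntax, so consult help pstest.

```stata
* Hypothetical data; requires: ssc install psmatch2
clear
set seed 2014
set obs 3000
gen x1 = rnormal()
gen x2 = runiform() < 0.4
gen treat = runiform() < invlogit(-0.5 + 0.8*x1 + 0.6*x2)
gen y = 1 + 0.5*x1 + 0.3*x2 + 2*treat + rnormal()
psmatch2 treat x1 x2, outcome(y) logit
* Compare covariate means for treated and controls before and after matching
pstest x1 x2, both
```

The output reports standardized bias for each covariate in the raw and matched samples, which speaks to how much of the imbalance on observables the matching removed.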

Plan of Action for This Portion
Discuss a logical folder structure to store do files (programs), data, & output files. Learn how Stata works & some basic commands. Simulate a DGP to examine the consequences of violations of assumptions. Later, examine code to undertake PSM modeling & discuss how these techniques might be used in your research.

Importance of Good Structure
My bet is that IR folks like you know this already, but creating a logical folder structure for each project is an important step in the analysis process. If you use a similar structure all the time you will be able to come back to projects at a later date & understand what was done. It is also very important to provide comments in your do files so you know what you did. Maybe someone else will pick up your work.

Folder Structure
CA AIR 2014 (folder located on C: drive), containing: Articles (contains articles/chapters); Data (contains data files); Do Files (contains do files); Graphs (place to send graphs created by code); Results (place to send output created by code); Powerpoint (contains PowerPoints). Examples of path names:
log using "C:\CA AIR 2014\Log Files\CA AIR Log 1.log", replace
use "C:\CA AIR 2014\Data\CA AIR PSM DataSub.dta", clear
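One way to keep such paths manageable is to store the project root in a global macro at the top of each do file. This is a sketch, not the workshop's own code: the macro name root is an assumption, and the commands will only run if the folders and data file named on the slide actually exist on your machine.

```stata
* Hypothetical setup: assumes the project folders and data file already exist
global root "C:\CA AIR 2014"
capture log close                         // close any log left open by an earlier run
log using "${root}\Log Files\CA AIR Log 1.log", replace
use "${root}\Data\CA AIR PSM DataSub.dta", clear
* ... analysis commands go here ...
log close
```

Moving the project to another drive or computer then means editing a single line, which supports the reproducibility the folder structure is meant to provide.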

How Stata Works
Command or point & click driven software. The software resides in C:\Program Files (x86)\Stata13 (or Stata12). Type adopath on the command line to find the paths to the ado files used. Topics: role of ado files; examine ado & help files; discuss user-written ado & help files.

The Look of Stata
The toolbar contains icons that allow you to Open & Save files, Print results, control Logs, & manipulate windows. Of particular interest: opening the Do-File Editor, the Data Editor, and the Data Browser. Data Editor & Browser: Spreadsheet view of the data. The Do-File Editor allows you to construct a file of Stata commands, save them, & execute all/parts. The Current Working Directory is where any files created in your active Stata session will be saved (by default). Don't save stuff here; direct it to the folders discussed above.

Windows in Stata
Review, Results, Command, & Variables windows. Help: Search for any command/feature. The Help Browser, which opens in the Viewer window, provides hyperlinks to help pages & to pages in the Stata manuals (which are quite good). You may also search for help using the command line. Role of findit & ssc install: Locate commands in the Stata Technical Bulletin & Stata Journal. Demo: loading the psmatch2 command. On the command line type ssc describe psmatch2, then ssc install psmatch2, & then help psmatch2.

Stata Program Files
Called do files; they contain the Stata code/commands we run to produce results. Do file name: CA AIR PSM Violations Simulation.do, in the Do Files sub-folder of the CA AIR 2014 main project folder. Later we will use CA AIR PSM.do in the same place. There are also menu options to run commands in Stata, but we won't do this. Menus may be useful for some on-the-fly analysis, but they are NOT a good way to do most projects. Reasons: Reproducibility & transportability.

Simulating Condition Violations
Before delving into a real application of propensity score matching in education research, we will examine the effects of a few condition/assumption violations on results. To do so, we'll create a fake data set so we know the true parameters & can therefore figure out the bias due to such violations.

Effect of Selection Bias Under Different DGP Scenarios
Examine the effectiveness of different statistical methods to remedy selection bias. Create artificial data using the regression model $y = \alpha + \beta x + \gamma w + \varepsilon$, where x is a control & w is the treatment; data are created for y, x, & w, with the parameters set to known values. The true treatment effect is known; evaluate bias under different scenarios/using alt. methods.

Simulations Conducted
Relax the following conditions: No correlation between x and $\varepsilon$. No correlation between x and w.

Scenario 1: The Ideal Condition
Conditional on observables (x), treatment (w) is independent of the error ($\varepsilon$). The scenario mimics the data that would be generated from a randomized study. x is created as an ordinal variable, taking on the values 1, 2, 3, 4. We regress y on x (controls) and w (treatment indicator).
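A minimal sketch of this scenario in Stata follows. The parameter values (intercept 1, coefficient 0.5 on x, treatment effect 2), the sample size, and the seed are assumptions chosen for illustration, since the workshop's own values are not given in the transcript.

```stata
* Hypothetical Scenario 1: parameter values are assumptions
clear
set seed 2014
set obs 5000
gen x = 1 + floor(4*runiform())     // ordinal control taking values 1, 2, 3, 4
gen w = runiform() < 0.5            // treatment assigned independently of x and e
gen e = rnormal()                   // error term
gen y = 1 + 0.5*x + 2*w + e         // assumed true treatment effect = 2
regress y x w
display "Estimated minus true effect: " _b[w] - 2
```

Because the true effect is built into the data, the last line gives the estimation error directly, which is the quantity the later scenarios are designed to distort.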

Scenario 2: Ignorable Treatment Assignment Assumption Violated
Conditional on observables (x), the treatment (w) is NOT independent of the error ($\varepsilon$). All other conditions hold. This is a classic selection bias condition. Given the correlation between treatment and the error, we'd expect naive regression to result in a biased estimate of the treatment effect.
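This violation can be sketched by letting selection into treatment depend on the same error that drives the outcome. The selection rule and all parameter values below are assumptions for illustration, not the workshop's code.

```stata
* Hypothetical Scenario 2: treatment correlated with the error
clear
set seed 2014
set obs 5000
gen x = 1 + floor(4*runiform())     // ordinal control taking values 1, 2, 3, 4
gen e = rnormal()                   // error term in the outcome equation
gen w = (0.8*e + rnormal()) > 0     // selection into treatment depends on e
gen y = 1 + 0.5*x + 2*w + e         // assumed true treatment effect = 2
correlate w e
regress y x w
display "Estimated minus true effect: " _b[w] - 2
```

In real data e is unobserved, so the correlate step is only possible in a simulation; it is included to confirm that the violation was actually built into the data.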

Scenario 3: Multicollinearity
In this scenario, conditional on observables (x), treatment (w) is independent of the error ($\varepsilon$) (ignorable treatment assignment). But we allow x & w to be correlated (there is multicollinearity). This often happens in social science research. This scenario should not affect the size of the treatment effect, but SEs should be inflated, thus significance tests are less powerful.
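A sketch of this scenario makes the probability of treatment rise with x while keeping treatment unrelated to the error. The selection rule and parameter values are assumptions for illustration.

```stata
* Hypothetical Scenario 3: x and w correlated, w independent of the error
clear
set seed 2014
set obs 5000
gen x = 1 + floor(4*runiform())           // ordinal control taking values 1, 2, 3, 4
gen w = runiform() < invlogit(-2.5 + x)   // treatment probability rises with x
gen e = rnormal()                         // error term, unrelated to x and w
gen y = 1 + 0.5*x + 2*w + e               // assumed true treatment effect = 2
correlate x w
regress y x w
estat vif                                 // variance inflation factors for x and w
```

Comparing the coefficient and standard error on w with those from a run where w is generated independently of x shows how the correlation between regressors shows up in the results.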

Scenario 4
There is correlation between the regressors and non-ignorable treatment assignment: correlation between x and error & t. x is continuous instead of ordinal. All other assumptions from Scenario 1 hold. The pattern in the graph is produced by the correlation between treatment & the error term. This happens when control variables (x's) are omitted; it is known as "selection on unobservables".

Scenario 5
In this scenario t and x are correlated with the error term; w and x are also correlated. This scenario assumes the weakest conditions for data generation. Both the naive regression and the matching methods produce substantial bias in the estimation of the treatment effect.

Does Failure of Parents to Provide Required Support Hinder Student Success?
Some parents provide the support they are required to; others do not. Inferential problem: Students who do not get support ("treated") may be different (on observed & unobserved factors) from those who receive support. Correlation between Pr(no support) & educational outcomes makes parsing causal effects from observed & unobserved differences in students very difficult.

Empirical Example
Examine whether a lack of expected parental financial support causes differences in: loan use; attending part-time; working 20+ hours/week in college; whether the student dropped out in year one; completion of a bachelor's degree within 6 years. Treatment variable: T = 1 if the student did not receive required funds from their parents to pay for college expenses; 0 otherwise.

PSM: Charting the Way, Step 1
Estimate the conditional probability of receiving treatment: the propensity score. Remedy imbalance in treated/controls using variables affecting selection into treatment; choose a functional form (logit or probit), e.g. $\ln\left(\frac{p}{1-p}\right) = \alpha + \beta x + \gamma w + \varepsilon$. Pairs of treated/control cases with similar PS are viewed as comparable even though they may have different covariate values.
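The choice of functional form in Step 1 can be explored by estimating the score both ways and comparing the results. This sketch uses simulated data; the covariates, the selection equation, and the names ps_logit and ps_probit are assumptions for illustration.

```stata
* Hypothetical data: compare logit and probit propensity scores
clear
set seed 2014
set obs 3000
gen x1 = rnormal()
gen x2 = runiform() < 0.4
gen treat = runiform() < invlogit(-0.5 + 0.8*x1 + 0.6*x2)
logit treat x1 x2
predict ps_logit, pr        // Pr(treat = 1 | x1, x2) from the logit model
probit treat x1 x2
predict ps_probit, pr       // same probability from the probit model
correlate ps_logit ps_probit
summarize ps_logit ps_probit
```

If the two sets of scores track each other closely, the choice between logit and probit is unlikely to change which controls are matched to which treated cases.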
