
Risky Business: Correlation and Causation in Longitudinal Studies of Skill Development
Explore the challenges of understanding correlation and causation in longitudinal studies of skill development, as discussed in various research papers. Skepticism over certain types of analyses is highlighted, urging a critical evaluation of skill-building theories in relation to school achievement.
Risky Business: Correlation and Causation in Longitudinal Studies of Skill Development. Drew Bailey, Greg J. Duncan, Tyler Watts, Doug Clements, Julie Sarama. UC Irvine, NYU, University of Denver.
School Readiness and Later Achievement. Developmental Psychology, 2007, Vol. 43, No. 6, 1428-1446. DOI: 10.1037/0012-1649.43.6.1428. 5390 citations.
Authors: Greg J. Duncan (Northwestern University), Chantelle J. Dowsett (University of Texas at Austin), Amy Claessens (Northwestern University), Katherine Magnuson (University of Wisconsin-Madison), Aletha C. Huston (University of Texas at Austin), Pamela Klebanov (Princeton University), Linda S. Pagani (Université de Montréal), Leon Feinstein (University of London), Mimi Engel (Northwestern University), Jeanne Brooks-Gunn (Columbia University), Holly Sexton (University of Michigan), Kathryn Duckworth (University of London), Crista Japel (Université du Québec à Montréal).
Risky Business: Correlation and Causation in Longitudinal Studies of Skill Development Drew H. Bailey, Greg J. Duncan, Tyler Watts, Douglas Clements and Julie Sarama American Psychologist (2018). 73(1), 81-94. DOI: 10.1037/amp0000146
Risky Business paper: Duncan et al. (2007) reached the wrong conclusions because its correlational patterns do not constitute a risky test of skill-building theories. Longitudinal analyses of other cognitive constructs over time are likely wrong for the same reason. Skepticism over many kinds of longitudinal analyses may be warranted for similar reasons.
Skill-building models of learning: Counting -> addition problem solving -> subroutine of multiplication problem solving (Baroody, 1987; Lemaire & Siegler, 1995). Phonemic awareness -> written word recognition -> vocabulary -> reading comprehension (Stanovich, 1986). Cunha & Heckman (2007): models of multiple skills and dynamic complementarities.
Is the product of these links so strong that building skills at time 1 is a strong cause of skills at times 3, 4, etc.? [Path diagram: Math skill at time 1 -> (MS1) -> Math skill at time 2 -> (MS2) -> Math skill at time 3.]
Questions in the 2007 paper: Assess skill-building theories of school achievement by answering: To what extent do school-entry academic, attention and socioemotional skills relate to later child well-being?
Analyses in the 2007 paper: we implement rigorous analytic methods that attempt to isolate the effects of school-entry academic, attention, and socioemotional skills by controlling for an extensive set of prior child, family, and contextual influences that may be related to children's achievement.
Analyses in the 2007 paper: Six non-experimental but longitudinal data sets. Relate school-entry measures of academic, attention and socioemotional skills to later achievement. Controls for family background. Controls for measures of child IQ and behavior/temperament PRIOR TO school entry.
Analyses in the 2007 paper (n=238 correlations and coefficients)
School-entry measure of:   Zero-order correlation with later achievement   Regression coefficient with later achievement (SE)
Reading                    .44                                             .17** (.03)
Math                       .47                                             .34** (.04)
Attention skills           .25                                             .10** (.01)
Externalizing problems     -.14                                            .01 (.01)
Social skills              .21                                             .01 (.01)
Why is this the wrong conclusion? Because the estimates don't hold up to riskier tests. Because an alternative theory (modest skill-building effects + bias from persistent unmeasured factors) explains the data better.
Risky tests: We borrow from Meehl's (1978, 1990) insight that when diverse theories make the same predictions, it is important to conduct risky tests that have the ability to distinguish among them. For us: the probability of observing large longitudinal correlations, in the absence of strong skill-building processes, is high, not low, because other processes might produce them.
Riskier tests: (1) RCT experimental manipulation of early skills; (2) within-domain pattern of correlations; (3) sibling difference models.
Riskier tests: RCT experimental manipulation of early skills. Assess the effect of a one-unit change in baseline math skills on later math achievement.
RCT evidence from the Building Blocks Pre-K math intervention (Doug Clements and Julie Sarama): 15-20 minute daily enhancement to existing pre-K curricula. Play-based, theory-based, includes a computer component. Cluster random-assignment evaluations; in this case, Buffalo and Boston public schools.
At the end of the pre-K year: .63 sd (s.e.=.05) impacts on overall math; .45 sd (s.e.=.06) impacts on counting; .36 sd impacts on patterning; .67 sd impacts on geometry; .20 sd impacts on measurement.
Building Blocks: First, use Building Blocks control children to perform a Duncan et al. (2007)-type analysis. Then look at patterns of experimental impacts.
Regression-adjusted correlations based on TRIAD control children. [Figure: math correlation or impact (y-axis) by age, 5 to 11 (x-axis); series: TRIAD regression-adjusted correlations.] Note: All correlations are p<.05. Vertical lines depict 95% confidence intervals.
Regression-adjusted correlations and experimental impacts in TRIAD. [Figure: math correlation or impact (y-axis) by age, 5 to 11 (x-axis); series: TRIAD regression-adjusted correlations, TRIAD treatment impacts.] Note: All 4th and 5th grade impacts are p>.05. All correlations and other impacts are p<.05. Impacts are rescaled to be 1.0 in the spring of pre-K; the right scale shows non-rescaled impacts. Vertical lines depict 95% confidence intervals.
Riskier tests: (1) RCT experimental manipulation of early skills: FAILED! (2) Within-domain pattern of correlations; (3) sibling difference models.
Correlational patterns: If skill-building models are correct, then early skills within a domain should be predictive of later skills within that domain, but the correlation should be declining.
Bivariate correlations with Fall of Kindergarten measures. [Figure: zero-order correlation (y-axis) by age, 5 to 11 (x-axis); series: reading to reading, math to math, anti-social to anti-social.] Source: ECLS-K 1998-1999 cohort. All correlations are p<.05.
Correlational patterns: If skill-building models are correct, then early skills within a domain should be predictive of later skills within that domain, but the correlation should be declining. Cross-domain correlations should be much weaker.
Bivariate correlations with Fall of Kindergarten measures. [Figure: zero-order correlation (y-axis) by age, 5 to 11 (x-axis); series: reading to reading, math to math, math to reading, anti-social to anti-social, math to anti-social.] Source: ECLS-K 1998-1999 cohort. All correlations are p<.05.
Riskier tests: (1) RCT experimental manipulation of early skills; (2) within-domain pattern of correlations: FAILED? (3) Sibling difference models.
Sibling models: Relating sibling differences in later skills to sibling differences in early skills controls for both observable and unobservable conditions shared by siblings.
Sibling models: Use NLSY-CS to estimate models with and without sibling fixed effects. Regress test scores at ages 7/8, 9/10 and 11/12 on test scores at ages 5/6, control for a large set of child and family factors, and then add family fixed effects.
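The logic of comparing models with and without sibling fixed effects can be sketched with simulated data. This is a minimal illustration, not the NLSY-CS analysis: the data-generating process, the function names `ols_slope` and `simulate_sibling_pairs`, and the parameter values (a true effect of 0.2, unit variances) are all assumptions made for the example. Each simulated family has one unmeasured factor that raises both siblings' early and later scores.

```python
import random

def ols_slope(x, y):
    """Slope from a one-predictor least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_sibling_pairs(n_families=20000, true_effect=0.2, seed=1):
    """Return (pooled slope, sibling-difference slope) for simulated sibling pairs.

    Hypothetical setup: a shared, unobserved family factor feeds into both the
    early and the later score of each sibling.
    """
    rng = random.Random(seed)
    early, later, d_early, d_later = [], [], [], []
    for _ in range(n_families):
        family = rng.gauss(0, 1)  # shared by both siblings, never observed
        pair = []
        for _ in range(2):
            e = family + rng.gauss(0, 1)
            l = true_effect * e + family + rng.gauss(0, 1)
            pair.append((e, l))
            early.append(e)
            later.append(l)
        # Differencing within the pair removes the shared family factor.
        d_early.append(pair[0][0] - pair[1][0])
        d_later.append(pair[0][1] - pair[1][1])
    return ols_slope(early, later), ols_slope(d_early, d_later)

pooled, within = simulate_sibling_pairs()
print(f"pooled slope: {pooled:.2f}, sibling-difference slope: {within:.2f}")
```

Under these assumptions the pooled slope is inflated by the family factor, while the sibling-difference slope should sit close to the assumed true effect. Differencing removes only what siblings share; factors that differ between siblings would still bias the estimate.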
Reminder of the Gold Standard. [Figure: math correlation or impact (y-axis) by age, 5 to 11 (x-axis); series: TRIAD treatment impacts.]
Simple, regression-adjusted and sibling-based math correlations in the NLSY. [Figure: math correlation (y-axis, 0.00 to 1.00) by age, 5 to 12 (x-axis); series: bivariate correlations, simple controls.] Simple controls include child race/ethnicity, gender, PPVT at age 5/6 and maternal years of schooling.
Simple, regression-adjusted and sibling-based math correlations in the NLSY. [Figure: math correlation (y-axis, 0.00 to 1.00) by age, 5 to 12 (x-axis); series: bivariate correlations, simple controls, full controls.] Simple controls include child race/ethnicity, gender, PPVT at age 5/6 and maternal years of schooling. Full controls include minimal controls plus child school-entry reading achievement and behavior problems, the quality of the preschool home environment, maternal cognitive ability and other measures.
Simple, regression-adjusted and sibling-based math correlations in the NLSY. [Figure: math correlation (y-axis, 0.00 to 1.00) by age, 5 to 12 (x-axis); series: bivariate correlations, simple controls, full controls, sibling fixed effects.] Simple controls include child race/ethnicity, gender, PPVT at age 5/6 and maternal years of schooling. Full controls include minimal controls plus child school-entry reading achievement and behavior problems, the quality of the preschool home environment, maternal cognitive ability and other measures.
Riskier tests: (1) RCT experimental manipulation of early skills; (2) within-domain pattern of correlations; (3) sibling difference models: FAILED? Further reduction from sibling FE models suggests that unobservables may still be biasing the estimates.
Why did Duncan et al. (2007) reach the wrong conclusion? Because the estimates don't hold up to riskier tests. Because an alternative theory (modest skill-building effects + bias from persistent unmeasured factors) explains the data better.
[Path diagram: Math skill at time 1 -> (MS1) -> Math skill at time 2 -> (MS2) -> Math skill at time 3; the math skill at each time point is measured by the math test at that time point.]
An alternative math skill-building model with unmeasured persistent influences. [Path diagram: Math skill at time 1 -> (MS1) -> Math skill at time 2 -> (MS2) -> Math skill at time 3; the math skill at each time point is measured by the math test at that time point; an unmeasured persistent factor is added to the model.] Latent state-trait model (Steyer, 1987).
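A small simulation can illustrate why a persistent unmeasured factor keeps longitudinal correlations high even when the skill-building (MS) path is modest. This is a simplified stand-in for a state-trait structure, not the model the authors fit: it omits measurement error, and the names `pearson`, `simulate_correlations` and `MS`, the unit variances, and the value 0.35 for the path are assumptions chosen for illustration.

```python
import random

MS = 0.35  # assumed one-year skill-building path, for illustration only

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simulate_correlations(ms=MS, waves=5, n_children=20000, seed=7):
    """Correlate the wave-1 score with each later wave under a trait-plus-state model."""
    rng = random.Random(seed)
    noise_sd = (1 - ms ** 2) ** 0.5  # keeps the state variance at 1 in every wave
    scores = [[] for _ in range(waves)]
    for _ in range(n_children):
        trait = rng.gauss(0, 1)  # persistent, unmeasured factor
        state = rng.gauss(0, 1)  # occasion-specific skill at wave 1
        for t in range(waves):
            if t > 0:
                state = ms * state + rng.gauss(0, noise_sd)  # the MS path
            scores[t].append(trait + state)
    return [pearson(scores[0], scores[t]) for t in range(1, waves)]

correlations = simulate_correlations()
for lag, r in enumerate(correlations, start=1):
    print(f"wave 1 to wave {lag + 1}: r = {r:.2f} (MS path alone implies {MS ** lag:.2f})")
```

In this setup the expected correlation at a given lag is (1 + MS^lag) / 2, so it levels off near .5 rather than decaying toward zero, even though the contribution of the MS path itself shrinks quickly.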
Estimates of MS (Math Skills) paths from observational and experimental data
Source                                       Sample size   Period              Implied 1-year MS estimate
State-trait estimates
Bailey et al., 2014b: Missouri Math Study    292           Grade 1-Grade 2     0.26
Bailey et al., 2014b: Missouri Math Study    292           Grade 2-Grade 3     0.18
Bailey et al., 2014b: Missouri Math Study    292           Grade 3-Grade 4     0.20
Bailey et al., 2014b: SECCYD                 1124          Grade 1-Grade 3     0.58
Bailey et al., 2014b: SECCYD                 1124          Grade 3-Grade 5     0.30
Watts et al., 2016: TRIAD                    834           PreK-K              0.25
Watts et al., 2016: TRIAD                    834           K-Grade 1           0.04
Watts et al., 2016: TRIAD                    834           Grade 1-Grade 4     0.51
Simple average                                                                 0.29
Unweighted study average                                                       0.31
Weighted study average                                                         0.35
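The three summary rows for the state-trait estimates can be recomputed from the individual rows. The sketch below reads the rows in the order they appear on the slide and assumes that the "weighted study average" weights each study's mean estimate by its sample size; that weighting rule is an inference, not something the slide states, and the function name `summarize` is hypothetical.

```python
# (study, sample size, period, implied 1-year MS estimate) from the state-trait rows
rows = [
    ("Missouri Math Study", 292, "Grade 1-Grade 2", 0.26),
    ("Missouri Math Study", 292, "Grade 2-Grade 3", 0.18),
    ("Missouri Math Study", 292, "Grade 3-Grade 4", 0.20),
    ("SECCYD", 1124, "Grade 1-Grade 3", 0.58),
    ("SECCYD", 1124, "Grade 3-Grade 5", 0.30),
    ("TRIAD", 834, "PreK-K", 0.25),
    ("TRIAD", 834, "K-Grade 1", 0.04),
    ("TRIAD", 834, "Grade 1-Grade 4", 0.51),
]

def summarize(rows):
    """Return (simple, unweighted study, sample-size-weighted study) averages."""
    estimates = [est for _, _, _, est in rows]
    simple = sum(estimates) / len(estimates)

    # Group estimates by study, keeping each study's sample size.
    by_study = {}
    for study, n, _, est in rows:
        by_study.setdefault(study, (n, []))[1].append(est)
    study_means = [(n, sum(ests) / len(ests)) for n, ests in by_study.values()]

    unweighted = sum(m for _, m in study_means) / len(study_means)
    # Assumption: each study mean is weighted by that study's sample size.
    weighted = sum(n * m for n, m in study_means) / sum(n for n, _ in study_means)
    return simple, unweighted, weighted

simple, unweighted, weighted = summarize(rows)
print(f"simple: {simple:.2f}, unweighted study: {unweighted:.2f}, weighted study: {weighted:.2f}")
```

Under that weighting assumption the three values round to the .29, .31 and .35 reported on the slide.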
An alternative math skill-building model with unmeasured persistent influences. [Path diagram: Math skill at time 1 -> (MS1) -> Math skill at time 2 -> (MS2) -> Math skill at time 3; the math skill at each time point is measured by the math test at that time point; an unmeasured persistent factor is added to the model.] Average 1-year MS estimate from 3 datasets analyzed in Bailey et al. (2014), Watts et al. (2016): .35
Regrettably for early skill promoters, these yearly path coefficients multiply:
K to 1st grade: .35 = .35
K to 2nd grade: .35 * .35 = .12
K to 3rd grade: .35 * .35 * .35 = .04
K to 4th grade: .35 * .35 * .35 * .35 = .02
etc.
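The compounding of yearly paths is simple arithmetic: the implied effect after k years is the one-year path raised to the power k. A minimal sketch, using the slide's .35 average; the function name `compounded_paths` is made up for the example.

```python
def compounded_paths(ms, years):
    """Implied effect of kindergarten skill after 1..years one-year steps."""
    return [ms ** k for k in range(1, years + 1)]

for grade, value in zip(["1st", "2nd", "3rd", "4th"], compounded_paths(0.35, 4)):
    print(f"K to {grade} grade: {value:.2f}")
```

This reproduces the .35, .12, .04, .02 sequence shown on the slide.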
Correlations inferred from MS path estimates. [Figure: math correlation or impact (y-axis) by age, 5 to 11 (x-axis); series: TRIAD regression-adjusted correlations, TRIAD treatment impacts, MS paths estimated in state-trait models.]
An alternative math skill-building model with unmeasured persistent influences. [Path diagram: Math skill at time 1 -> (MS1) -> Math skill at time 2 -> (MS2) -> Math skill at time 3; the math skill at each time point is measured by the math test at that time point; an unmeasured persistent factor and a treatment variable are added to the model.] Estimate MS paths from experimental data.
Estimates of MS (Math Skills) paths from observational and experimental data
Source                                       Sample size   Period              Implied 1-year MS estimate
State-trait estimates
Bailey et al., 2014b: Missouri Math Study    292           Grade 1-Grade 2     0.26
Bailey et al., 2014b: Missouri Math Study    292           Grade 2-Grade 3     0.18
Bailey et al., 2014b: Missouri Math Study    292           Grade 3-Grade 4     0.20
Bailey et al., 2014b: SECCYD                 1124          Grade 1-Grade 3     0.58
Bailey et al., 2014b: SECCYD                 1124          Grade 3-Grade 5     0.30
Watts et al., 2016: TRIAD                    834           PreK-K              0.25
Watts et al., 2016: TRIAD                    834           K-Grade 1           0.04
Watts et al., 2016: TRIAD                    834           Grade 1-Grade 4     0.51
Simple average                                                                 0.29
Unweighted study average                                                       0.31
Weighted study average                                                         0.35
Experimental estimates
Current paper: TRIAD                         834           PreK-K              0.46
Current paper: TRIAD                         834           K-Grade 1           0.48
Current paper: TRIAD                         834           Grade 1-Grade 4     0.48
Current paper: TRIAD                         834           Grade 4-Grade 5     N/A
Hofer et al., 2013: TRIAD                    1192          Pre-K-K             0.28
Hofer et al., 2013: TRIAD                    1129          K-Grade 1           0.67
Smith et al., 2013                           320           Grade 1-Grade 2     0.22
Simple average                                                                 0.43
Unweighted study average                                                       0.39
Weighted study average                                                         0.44
Correlations inferred from MS path estimates. [Figure: math correlation or impact (y-axis) by age, 5 to 11 (x-axis); series: TRIAD regression-adjusted correlations, TRIAD treatment impacts, MS paths inferred from experimental impacts, MS paths estimated in state-trait models.]
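We don't know what this is: the unmeasured persistent factor. [Path diagram: Math skill at time 1 -> (MS1) -> Math skill at time 2 -> (MS2) -> Math skill at time 3; the math skill at each time point is measured by the math test at that time point; the unmeasured persistent factor is highlighted.] Steyer's model is an elegant admission of ignorance.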
Some caveats/implications: The existence of unmeasured persistent individual or environmental factors does not imply that they are immutable and does not rule out skill-building processes. Indeed, environmental factors may suppress skill building because early skills lie fallow. Because of this possibility, Building Blocks developed a booster with K/1 teachers.
Some caveats/implications: We focused on math. Would our results generalize to reading? EF? Anti-social behavior? We need contrasting theories and riskier tests. Resist the temptation to estimate longitudinal models with similar constructs measured at various points in time.
Acknowledgments Building Blocks/ TRIAD districts, teachers, & students