Using Secondary Analysis for Researching Individual Behavior in Job Training

This presentation from the ESRC Research Methods Festival 2014 discusses the use of secondary analysis to research individual behaviour, focusing on on-the-job training and accounting for endogeneity using BHPS longitudinal data. It was presented by Genevieve Knight and Michael White on 8 July 2014.



Presentation Transcript


  1. ESRC Research Methods Festival 2014: Using Secondary Analysis to Research Individual Behaviour. On-the-job training and accounting for endogeneity using BHPS longitudinal data. Genevieve Knight and Michael White, 8 July 2014.

  2. Genevieve Knight g.knight@psi.org.uk

  3. The impact of economic conditions on the outcomes of job mobility and the mediating role of training: sectoral differences in the private returns to in-job training in the 1990s UK recession. This working paper has been prepared as part of the outputs of a UK ESRC-funded project [ES/K00476X/1 The 1990s: sectoral rebalancing, mobility and adaptation - the employment, self-employment and training policy lessons for the current UK recession]. Part of the first ESRC Secondary Data Analysis Initiative (SDAI), 2013.

  4. Sectoral differences in the private returns to in-job training in the 1990s. Other outputs: (1) a detailed description of the mobility of public sector employees relative to market employees during the 1990s period of extensive public sector cuts; (2) a comparative sectoral analysis of movements into non-employee destinations (see 'The public sector in the 1990s recession: employee exits to non-employee destinations'); (3) an analysis of the earnings and job-satisfaction outcomes consequent upon employee mobility from public to market sector and within the public sector (see 'The 1990s recession: consequences of mobility from and in the public sector').

  5. Outline 1. The story 2. The data 3. Causality and the analysis methods 4. The results

  6. 1. The story. To inform the large-scale UK public sector job cuts announced within an economic recession, we looked for evidence on what happened last time. UK public sector contraction and forced movement of public service employees to the private sector happened before, under the John Major government of the 1990s, so we can use the experience then to gauge the effects of public-to-private job moves now.

  7. UK public sector employment. Chart 1: Public Sector Employment, United Kingdom, 1991 to 2005; second quarter headcount; not seasonally adjusted. Source: ONS; Hicks et al. (2005), Figure 3.1, p.5, Public Sector Employment Trends.

  8. 1. The story. We examine returns to in-job training over the period 1991-8, a period when the UK public sector underwent a severe contraction and which also experienced widespread turbulence as a result of technological and organizational change, using publicly available longitudinal data from the British Household Panel Survey (BHPS). The effects on earnings are estimated twice (thrice! Why?): through a treatment effects matching method and also by fixed effects panel regression (plus OLS). THE NEW BIT: we get sectoral returns to in-job training.

  9. 1. Why? The methods and data are the most difficult part of the puzzle. How do we achieve causality attribution (endogeneity)? How can we explore timing dynamics in the follow-up period? (We use t and t+1 from panel data.)

  10. 1. The difficulties of predicting the effects of training in an analytical way. Both employers and employees are involved in choices concerning training, and training itself can be regarded as an outcome variable. Selection, including self-selection, into training is an ever-present complication. A further difficulty raised by self-selection is the possibility that training is sought by people of relatively high ability, and that the earnings gains reported in the literature are upwardly biased by ability differences between the trained and non-trained.

  11. 1. How to get causality attribution for training? Use the methods and design of analysis: 1. treatment effect methods (matching); 2. fixed effect/regression methods. Two criteria to infer a causal relationship: covariation (correlation) of the causal (X, explanatory) and outcome (Y, dependent) variables, AND time order (the cause comes before the effect).

  12. 1. How to get causality attribution for training? We focus the analysis by defining a training treatment period time-point; we ensure the X covariates are measured at or before the treatment; we measure outcomes after the treatment; and we include appropriate X (a data-handling sketch follows).
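
A minimal sketch in Python/pandas of this timing discipline when building an analysis file from a long-format panel: covariates and treatment are taken from the treatment wave, the outcome from the following wave. The data frame and column names (bhps_long, pid, wave, log_earn) are hypothetical placeholders, not the actual BHPS variable names.

    import pandas as pd

    # bhps_long: long-format panel, one row per person-wave, with columns
    # pid, wave, log_earn, trained, plus pre-treatment covariates (age, sector, ...)
    def build_analysis_file(bhps_long: pd.DataFrame, treat_wave: int) -> pd.DataFrame:
        """X and treatment at wave t, outcome Y at wave t+1."""
        x = bhps_long[bhps_long["wave"] == treat_wave]            # covariates and training at t
        y = bhps_long.loc[bhps_long["wave"] == treat_wave + 1,
                          ["pid", "log_earn"]]                    # earnings one wave later
        return x.merge(y, on="pid", suffixes=("", "_next"))       # log_earn_next is the outcome

    # e.g. df = build_analysis_file(bhps_long, treat_wave=1994)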

  13. 1. Why Treatment effect methods (matching)? What does it mean when we use these methods? How does a matching analysis differ from regression methods (FE/OLS)? Matching focuses on the outcome(s) of an intervention, or treatment (here, the receipt of in-job training) that takes place at a particular time for some people but not others. Since it is impossible to observe the same individual being both treated and not treated in the same period, the treated individual is instead matched to one or more non-treated individuals whose characteristics and circumstances are so similar that they have virtually the same propensity, or probability, of receiving treatment. Thus the matching method provides between-person comparisons.
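
Since the propensity, or probability, of receiving treatment is central here, a hedged sketch of how such a propensity score might be estimated with a logit model in statsmodels; the covariate names are illustrative only, and df is assumed to be the analysis file from the previous sketch restricted to the treatment year.

    import statsmodels.api as sm

    # 'trained' is the 0/1 in-job training indicator; X_COLS are pre-treatment covariates
    X_COLS = ["age", "female", "degree", "public_sector", "union", "small_firm"]

    X = sm.add_constant(df[X_COLS])
    propensity_model = sm.Logit(df["trained"], X).fit(disp=0)
    df["pscore"] = propensity_model.predict(X)   # estimated probability of receiving training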

  14. 1. Matching - impact evaluation and the counterfactual. Individual A receives training and then earns 280 per week: WE OBSERVE THIS FACT. If she had received no training she would earn 220 per week: the COUNTERFACTUAL. Impact on individual A = 280 - 220 = 60.

  15. 1. Matching - impact evaluation and the counterfactual. Clearly we do not observe the situation where training is not received for those who actually do receive training. Matching impact analysis involves carefully trying to estimate the counterfactual for those who receive training (a sketch follows).
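
A rough sketch of estimating that counterfactual by nearest-neighbour matching on the propensity score and averaging over the trained, reusing the pscore and log_earn_next columns assumed in the earlier sketches. Two matches with replacement are used to mirror the specification described later in the deck; common support, bias adjustment and the Abadie-Imbens variance are omitted, so this is illustrative rather than the authors' implementation.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    treated = df[df["trained"] == 1]
    controls = df[df["trained"] == 0]

    # two nearest non-trained neighbours per trained person, matched with replacement
    nn = NearestNeighbors(n_neighbors=2).fit(controls[["pscore"]])
    _, idx = nn.kneighbors(treated[["pscore"]])

    # counterfactual earnings: mean outcome of the two matched non-trained cases
    counterfactual = controls["log_earn_next"].to_numpy()[idx].mean(axis=1)
    att = (treated["log_earn_next"].to_numpy() - counterfactual).mean()
    print(f"Average treatment effect on the treated (log points): {att:.3f}")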

  16. 1. Why FE regression? What does it mean when we use these methods? FE regression provides estimates that can be interpreted as 'approximately causal', especially in removing bias from unobserved constant individual differences such as ability or personality (see Allison 2009). However, the method is not based on a formal causal model in the sense that the method of matching is. See Wooldridge (2002).

  17. 1. Why FE regression? What does it mean when we use these methods? Fixed effect (FE) panel regression (also known as 'within' regression, since estimates reflect within-person variation around her/his mean values rather than comparisons between people). A chief advantage: it accounts for the influence of unobserved personal factors (such as ability or personality) that are constant over time.
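
A compact illustration of what 'within' estimation does: each person's variables are demeaned over their own waves, so anything constant for that person (ability, personality) drops out before the regression is run. Column names are hypothetical and a real analysis would use a panel package, but the transformation itself is just this.

    import numpy as np
    import pandas as pd

    # panel: long-format DataFrame with pid, wave, log_earn, trained_lastyear and controls
    def within_transform(panel: pd.DataFrame, cols: list) -> pd.DataFrame:
        """Subtract each person's over-time mean, sweeping out fixed individual effects."""
        out = panel.copy()
        out[cols] = panel[cols] - panel.groupby("pid")[cols].transform("mean")
        return out

    cols = ["log_earn", "trained_lastyear", "age", "age_sq"]
    dm = within_transform(panel, cols)

    # FE slope coefficients by least squares on the demeaned data
    X = dm[["trained_lastyear", "age", "age_sq"]].to_numpy()
    y = dm["log_earn"].to_numpy()
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)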

  18. 1. Why FE regression? Causation: X, Y, Z relationships are complicated! Direct causal: (X → Y). Indirect causal: (X → Z → Y). Spurious: both (Z → X and Z → Y). Or a combination of direct/indirect/spurious.

  19. 1. Why FE regression? Why matching? Impact evaluation - the experimental ideal: an example. [Diagram: baseline data are collected on an eligible population, which is then allocated at random either to the programme (outcome = Op) or to a control group, the counterfactual (outcome = Oc).] The two groups are statistically equivalent at allocation/assignment. Random allocation ensures no systematic differences between control and programme groups at assignment: no systematic differences in what we can observe about the two groups and, importantly, in what we can't observe. Impact of programme = difference in means or proportions between the two groups.

  20. 1. Why FE regression? Why matching? Matching tries to replicate the experimental control design. FE uses the panel to replicate the control of what we can't observe for the individual. How well they work in practice with the data is an empirical question.

  21. 1. The teaser. We find positive overall effects for some, but with sectoral differences (no effect in the market sector, 7.5% in the public sector) and phasing (timing), with public sector training providing a more persistent gain in protecting earnings when employees change sector or change employment.

  22. 1. Final teaser There is a smaller effect indicated by FE, relative to matching. We suspect FE does a better job of allowing for individual unobserved ability/financial motivation that no amount of X will get rid of.

  23. 2. The data: BHPS. The British Household Panel Survey (BHPS) interviewed respondents at regular annual intervals. The initial sample was representative of the British population in 1990. We identify people who were employees during the 1990s and, among these, the individuals who received in-job training within the period 1991-97 (the training question was discontinued in 1998).

  24. 2. The data: Y = earnings. Y = log (natural logarithm of) usual monthly earnings, either in the current year of the present job where training is received or in the year following training. Earnings are used, rather than the wage, since such a measure reflects paid hours worked as well as the wage, and maintaining hours, hence earnings capability, is an important objective for most job-movers (the scant availability of full-time jobs and the enforced shortening of hours are frequently noted issues in the adverse British conditions post-2008). Usual earnings rather than most recent earnings are used so as to reduce variation for reasons such as absence or exceptional overtime working.
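
For concreteness, the outcome construction is just a log transform of the reported pay; the column name below is a placeholder rather than the actual BHPS variable name.

    import numpy as np

    # usual monthly earnings as reported by the respondent
    df["log_earn"] = np.log(df["usual_monthly_pay"])
    # the year-after outcome (log_earn_next) comes from the following wave's report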

  25. 2. The data: mobility We use information about employment between two consecutive waves to classify labour mobility. 1) change in sector, i.e. the mobility between public and market sectors;

  26. 2. The data: on-the-job training. Somewhat formal training provision rather than informal provision. The survey asked whether, in the past year, the individual had taken part in 'any education or training schemes or courses as part of your present employment'. For our analyses, training is represented as a dummy variable taking value 1 when training has taken place during a one-year period of employment. This variable is defined (1) on the year 1994-5 for the matching analyses, and (2) on the year prior to the outcome year in the panel FE.
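
A hedged sketch of constructing that dummy in the two set-ups; the answer variable and column names are illustrative stand-ins for the real BHPS codes, and panel is the long-format frame carrying the same dummy.

    # 'took_training': the yes/no answer to the question about education or training
    # schemes or courses as part of the present employment in the past year
    df["trained"] = (df["took_training"] == "yes").astype(int)

    # matching analyses: training reported for the 1994-5 year
    matching_sample = df[df["wave_year"] == 1995].copy()

    # panel FE: training in the year prior to the outcome year
    panel["trained_lastyear"] = (
        panel.sort_values("wave")
             .groupby("pid")["trained"]
             .shift(1)
    )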

  27. 2. The data: X controls. Educational and professional qualifications; broad occupational level as an indicator of acquired skill (Tahlin 2007); age as a proxy for experience; family structure variables (marital status, employment status of spouse, and age of youngest child) separately specified for men and women (for the FE modeling of interactions involving fixed variables, see Allison 2009); employer variables known to affect wages or training: sector, living in the prosperous London or South-East region, working in a small (less than 50 employees) workplace, and presence of a recognized union. FE: age in quadratic form. Matching: age as a set of five decade dummies representing the 20s through to the 60s, with 16-19 as the omitted category.
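
A small sketch of the two age specifications (quadratic for FE, decade bands for matching), again with illustrative column names.

    import pandas as pd

    # FE specification: age and its square
    panel["age_sq"] = panel["age"] ** 2

    # matching specification: decade dummies for the 20s to 60s, 16-19 omitted
    bins = [16, 20, 30, 40, 50, 60, 70]
    labels = ["16_19", "20s", "30s", "40s", "50s", "60s"]
    age_band = pd.cut(df["age"], bins=bins, labels=labels, right=False)
    age_dummies = pd.get_dummies(age_band, prefix="age").drop(columns=["age_16_19"])
    df = pd.concat([df, age_dummies], axis=1)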

  28. 2. The data: X controls. Matching: 16 Card-Sullivan variables which summarize profiles of individual advancement over the period 1991-94. Card and Sullivan (1988) originally devised the method as a form of exact matching, but we follow the application of Dolton et al. (2006), who use similar derived variables as matching regressors. It is hoped that this removes bias from unmeasured ability, on the assumption that underlying ability tends to be recognized in patterns of earnings prior to the focal training episode. Matching: a variable for the number of waves observed in employee status, plus its square, over 1991-5; this is intended to correct for variations in recent experience resulting from years out of employment. FE: year dummies to control for movements in economic conditions that are likely to affect all employment. In addition, to compensate for the impossibility of weighting, we incorporated variables that were used in the original construction of the strata and weights for the survey sample; these included variables representing non-labour income and wealth assets, notably car ownership and home ownership.

  29. 3. The methods We use a combination of (1) treatment effect estimation by the method of matching, and (2) panel data analysis by fixed effect (FE) regression. We use matching for large samples (all, market, public) and FE for mobility groups as we can pool across waves

  30. 3. Comparison of the methods? (1) how to contrast matching with the FE analysis? (2) how does a matching analysis differ from regression methods?

  31. 3. Comparison of the methods? What does it mean when we use these methods? How does a matching analysis differ from regression methods? We want to test whether T → Y (along with correlates X). Some other variables Z that have not been measured could have led to the change in Y (unless we can find a way to measure Z, or the holy grail of an instrument for Z). Matching gives a different weighting to the cases than (OLS) regression would, but it still doesn't solve the problem of Z. Both control for observed X and rely on the conditional independence assumption (CIA); both suffer omitted variable bias. See Heckman et al. (1998) on matching as an econometric estimator and characterising selection bias using experimental data, and Imai and Kim (2011) on the use of linear fixed effects regression estimators for causal inference.

  32. 3. Comparison of the methods? What does it mean when we use these methods? Does FE offer much more than OLS/matching for training analysis? A little: we can account for individual ability, which is important in the training context. There appear to be heterogeneous treatment (training) effects by sector, important in our context. But we still can't account for time-varying confounders (Z).

  33. That's not all, folks. 3. Other methods/data problems. Observational data (sigh!): violation of ignorable treatment assignment [there are unobserved variables related to both treatment assignment (who gets training) and the outcome (wages)] - self-selection. The only true solution: get better data! In practice, we have to implement solutions to try to fix up these issues.

  34. 3. Other problems to solve in (panel) data - analytical problems (biases):
  - Attrition: survey drop-outs can unbalance the information, leading to selection bias, and should be accounted for (a form of unit missing data).
  - Missing data (Y or covariate values): the method of accounting for missing data will affect the results (another bias). For X we use simple mean imputation, with missing-dummy indicators in the propensity model and missing dummies in the FE; for Y, see above.
  - Choose the variance estimates for matching: there is still debate on the best way to account for the uncertainty of the propensity estimate not being the true propensity - it is just an estimate (it leads to a more conservative, wider confidence interval on the impact than necessary!). After bias and variance reduction trade-off decisions, we use Abadie and Imbens.
  - You have to choose a matching method (more bias and variance reduction trade-off decisions): we specify two-match nearest neighbour matching with replacement.
  - You have to choose how much covariate balance is enough.
  - For FE, with repeated observations on the same individuals: use a robust variance estimator.
  - For FE, an unbalanced panel approach (Wansbeek and Kapteyn 1989), since restriction to the balanced panel would lose too much data.
  - Inclusion of numerous controls for time-varying variables helps to strengthen the causal interpretation of results as well as substituting for sample weights; we compensate for the absence of weighting by including a wide range of control variables that were used by the survey originators in structuring the survey (see Taylor et al. 2011).
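
On the FE side, one way the unbalanced-panel regression with a robust (clustered by person) variance estimator could be run in Python is with the linearmodels package; this is a sketch under the assumed column names from the earlier snippets, not the authors' actual estimation code.

    from linearmodels.panel import PanelOLS

    # panel indexed by (person, wave); unbalanced panels are handled automatically
    data = panel.set_index(["pid", "wave"])
    exog = data[["trained_lastyear", "age", "age_sq"]]  # plus the other time-varying controls

    fe_model = PanelOLS(data["log_earn"], exog, entity_effects=True)
    fe_results = fe_model.fit(cov_type="clustered", cluster_entity=True)
    print(fe_results.summary)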

  35. Finally - 3. What can we say about causality: panel fixed effects. Intuitively, the removal of fixed effects is likely to be very important in removing selection bias in a model of training effects. However, there is also the possibility of unobserved time-varying bias. We strive to minimize such bias by the inclusion of an extensive set of time-varying control variables, as described above. But some such bias is likely to remain, especially because of a rather limited set of information about employers. Accordingly, we do not argue that all sources of bias have been removed, nor that a definitive causal effect has been identified.

  36. 4. The results: matching.
Table 2. Estimates of average treatment effects of training on the trained, for matching models of log usual monthly earnings. The treatment is in-job training in the year 1994-5, as reported by employees. Matching method: two nearest neighbours with replacement, non-trained matched to trained. The outcome is log usual monthly earnings as reported by employees, either for 1995 or 1996.

analysis # | sample               | earnings(b) in year | matching set | ATET(a) | |t|    | N    | mean bias (%) | bias reduction(c) | sig. bias(d)
Full sample analyses:
1          | 1995 - all           | 1994-5              | (1)          | 0.059   | 2.73** | 3576 | 1.96          | 78.4              | 0
2          | 1995 - all           | 1994-5              | (2)          | 0.062   | 2.71** | 3576 | 2.29          | 81.2              | 0
3          | 1995 - all           | 1995-6              | (1)          | 0.035   | 1.68+  | 3217 | 2.66          | 69.2              | 1
Sectoral analyses:
4          | 1995 - public sector | 1994-5              | (3)          | 0.054   | 1.59   | 1157 | 3.14          | 59.2              | 0
5          | 1995 - public sector | 1995-6              | (3)          | 0.104   | 3.09** | 1066 | 3.49          | 53.4              | 0
6          | 1995 - market sector | 1994-5              | (1)          | 0.080   | 3.09** | 2440 | 3.13          | 66.9              | 1
7          | 1995 - market sector | 1995-6              | (1)          | 0.066   | 2.60** | 2168 | 2.44          | 73.6              | 0

Notes. a: average treatment effect on the treated. b: log of usual monthly earnings. c: 100*[(bias(before match) - bias(after)) / bias(before)]. d: number of t-tests (treated v. control) significantly different from zero (|t| >= 2.00). Matching sets: (1) full set of matching variables including Card-Sullivan prior earnings dummies; (2) as (1) but omitting the Card-Sullivan dummies; (3) as (1) except that the union representation dummy is omitted.

  37. 4. The FE results, model 1. FE mean marginal predictions: the difference associated with having received training, for each of the mobility conditions: (1) staying in the public sector after previous-year training improved earnings by 3.2 per cent by comparison with staying in the public sector without such training; (2) moving public to market sector: 7.5 per cent better off than those who lack recent training when they change sector, or, put differently, an earnings loss of 7.5 per cent unless protected by prior training (significant at the 10 per cent level); (3) no difference to earnings from training for those staying in the market sector; (4) no difference (a small, not statistically significant, negative effect) of training when moving from market to public.
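
A side note on the arithmetic behind such percentage figures, since the model is in log earnings: a coefficient b on the training indicator corresponds to a difference of roughly 100*(exp(b) - 1) per cent, which is close to 100*b for small b. The coefficient below is purely illustrative, not one of the estimated values.

    import math

    def percent_effect(b: float) -> float:
        """Percentage earnings difference implied by a log-earnings coefficient."""
        return 100 * (math.exp(b) - 1)

    print(round(percent_effect(0.03), 1))  # about 3.0 per cent, for illustration only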

  38. Final, final conclusion We suspect FE does a better job of allowing for individual unobserved ability/financial motivation that no amount of X will get rid of.

  39. Follow PSI on Twitter: @PSI_London
