Linking Feasibility Study: STATS19 and TARN Data Analysis

Explore the potential of linking STATS19 police-reported data with TARN hospital datasets to enhance post-crash care insights and improve road safety strategies. This study assesses the feasibility and implications of merging these datasets at a national level, offering new perspectives on the impact of road traffic incidents on healthcare systems.

  • Feasibility Study
  • Road Safety
  • Data Analysis
  • Traffic Incidents
  • Healthcare

Presentation Transcript


  1. STATS19 and TARN: Data linking feasibility study. Matthew Tranter, Road Safety Statistics, Department for Transport. 1 March 2022. OFFICIAL

  2. Introduction and motivation

  3. Introduction
     • Police-reported STATS19 data are the basis of DfT's published road casualty statistics: detailed accident circumstances but limited clinical outcomes (historically just fatal/serious/slight)
     • The safe systems approach is central to DfT's future road safety strategic framework, with evidence gaps around the post-crash care pillar
     • Hospital datasets, such as TARN, provide better detail on post-crash outcomes for more serious casualties
     • Linking can address limitations of STATS19 and potentially add insights, e.g. the impact of RTCs (road traffic collisions) on the trauma caseload of the NHS
     • This work:
        • Phase 1: assess feasibility of linking STATS19 and TARN (this presentation)
        • Phase 2: analysis of linked data (further work by September, and beyond)
     STATS19 and TARN: Data linking feasibility study

  4. Related work
     • DfT has already linked STATS19 to Hospital Episode Statistics (HES, inpatient admissions)
        • Used to understand trends in reporting levels, and to validate injury-based recording for relevant forces (27 of 43 as it stands)
        • However, HES lacks clinical detail and the data are increasingly difficult to obtain via NHS Digital
     • Other studies have linked TARN and STATS19, e.g.
        • TRIP (Targeting Road Injury Prevention): data for Cambridgeshire, using home postcode data for linkage
        • DocBike: focus on motorcyclist casualties
     • What is new here?
        • Attempting to link at national level
        • Assessing how far it is possible to create links with only STATS19 open data
        • TARN has more clinical detail available than HES

  5. TARN data
     • Coverage
        • Patients of all ages reaching hospital alive after injury and who subsequently die or require critical care, inter-hospital transfer and/or acute inpatient care for >72 hours
        • Extract for vehicle collision/incident
        • Data for 2018 to 2020: ~35,800 records
        • Further extract for validation based on East of England region (with additional first part of home postcode): ~2,800 records
     • Variables
        • For linkage: age, sex, date and time of incident, incident location (postcode), incident description, position in vehicle
        • For initial analysis: length of stay (overall and critical care), outcome, injuries sustained
        • Further variables available by request to TARN (with their permission/agreement)

  6. TARN data - vehicle collisions
     • Incident location [pie chart]: Road 81%; Public area and Other (including home, industrial, mountain, other, not known) make up the remaining 12% and 7%
     • Average per year, 2018-2020 [number of patients/casualties]:
        • TARN road incident, dead: 481
        • STATS19, fatal: 1,509
        • HES traffic, MAIS3+: 5,496
        • TARN road incident, alive: 8,779
        • STATS19, serious: 24,425
        • HES all traffic admissions: 31,457
     • Trend 2018-2020 [line chart, 2018 = 100]: TARN road incidents, HES traffic admissions and STATS19 KSIs
     • STATS19 figures are England and Wales; HES = Hospital Episode Statistics for England

  7. Data linking approach

  8. Outline
     • The datasets share no common unique identifier, so a fuzzy matching approach is needed, based on agreement on common variables (with some tolerance allowed)
     • Probabilistic linkage using the well-established Fellegi-Sunter method
     • Variables for linking and allowed tolerance [variable: tolerance allowed; notes]:
        • Age: within +/- 1 year; near complete in both sources
        • Sex: exact match; near complete in both sources
        • Date: exact match; must be present for linkage
        • Time: within 60 minutes; TARN data 75% complete
        • Location: within 2 miles in any direction (if present); incident postcode available for 45% of TARN records
        • Casualty type/class: using lookup (shown on next slide); TARN variable is position in vehicle
        • STATS19 severity: fatal/serious assumed as correct
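
The tolerances listed on this slide can be expressed as a per-variable agreement check. The following is a minimal sketch under stated assumptions, not the study's code: the field names (`age`, `sex`, `minutes`, `easting`, `northing`) are hypothetical, incident location is assumed to have already been converted to grid coordinates in metres, and "within 2 miles" is interpreted here as straight-line distance.

```python
from math import hypot

MILES_TO_METRES = 1609.344


def agreement(tarn, stats19):
    """Compare one TARN-STATS19 candidate pair, variable by variable.

    Returns {variable: True / False / None}. None means the value is
    missing on either side, so the variable is treated as unavailable
    for matching rather than as a disagreement.
    All field names are hypothetical.
    """
    def both(key):
        return tarn.get(key) is not None and stats19.get(key) is not None

    result = {}
    # Age: within +/- 1 year
    result["age"] = abs(tarn["age"] - stats19["age"]) <= 1 if both("age") else None
    # Sex: exact match
    result["sex"] = tarn["sex"] == stats19["sex"] if both("sex") else None
    # Time: within 60 minutes (minutes since midnight; the date already matches)
    result["time"] = (abs(tarn["minutes"] - stats19["minutes"]) <= 60
                      if both("minutes") else None)
    # Location: within 2 miles (assumed straight-line distance in metres)
    if both("easting") and both("northing"):
        dist = hypot(tarn["easting"] - stats19["easting"],
                     tarn["northing"] - stats19["northing"])
        result["location"] = dist <= 2 * MILES_TO_METRES
    else:
        result["location"] = None
    return result
```

Returning `None` for missing values keeps "no information" separate from "disagreement", which matters for TARN records without an incident postcode.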

  9. Mapping TARN position in vehicle to STATS19 casualty type and class [In STATS19, c5 is the casualty class variable and c16sum is the casualty road user type. Not all values can be linked, and in these cases the variable is considered unavailable for matching.]

  10. Fellegi-Sunter method
     • Inputs: TARN data as supplied, and full STATS19 dataset (all severities)
     • Link on date only to generate candidate linked TARN-STATS19 pairs (same date)
     • Weights for each variable are based on:
        • probability of agreement, if a true match
        • probability that the variables agree by chance
        • whether the STATS19 and TARN records agree
     • Add weights for each linked TARN-STATS19 pair based on variable agreement
     • Select candidate link(s) with highest weight and classify as likely non-match, possible match or likely match
     • Use test dataset to establish threshold for determining whether to classify as a match
     • Further reference:
        • https://www.robinlinacre.com/maths_of_fellegi_sunter/
        • https://www.gov.uk/government/statistics/linking-stats19-and-tarn-an-initial-feasibility-study/linking-stats19-and-tarn-an-initial-feasibility-study
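
The candidate-generation and selection steps on this slide can be sketched as follows. This is an illustrative outline only: the record layout (dicts with hypothetical `id` and `date` keys) and the `score` callback are assumptions, and the study's own implementation may differ.

```python
from collections import defaultdict


def candidate_pairs(tarn_records, stats19_records):
    """Block on incident date: pair each TARN record with every
    STATS19 record that has the same date."""
    by_date = defaultdict(list)
    for s in stats19_records:
        by_date[s["date"]].append(s)
    for t in tarn_records:
        for s in by_date.get(t["date"], []):
            yield t, s


def best_links(tarn_records, stats19_records, score):
    """For each TARN record, keep the same-date candidate with the
    highest total weight.

    `score(t, s)` is any function returning the summed agreement weight
    for a pair. TARN records with no same-date candidate are omitted.
    Returns {tarn_id: (stats19_id, weight)}.
    """
    best = {}
    for t, s in candidate_pairs(tarn_records, stats19_records):
        w = score(t, s)
        if t["id"] not in best or w > best[t["id"]][1]:
            best[t["id"]] = (s["id"], w)
    return best
```

Blocking on date keeps the number of comparisons manageable: only same-day pairs are scored rather than every TARN record against every STATS19 record.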

  11. Probabilities for weights
     • Approach
        • TARN data for East of England supplied with additional first part of home postcode
        • Used with date and incident description to compile a dataset of ~900 correct matches, which allows assessment of the likelihood of variable agreement for matched records
        • Likelihood of agreement for non-matched records based on STATS19 frequencies
     • Examples [variable: P(agree if match); P(agree if non-match)]:
        • Sex: 0.994; 0.6 (male), 0.4 (female). Most likely to agree by chance = least useful for deciding between candidate matches
        • Location: 0.966; ~0.00003 on average. If present, most power to discriminate = most useful variable
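
To show how these probabilities translate into weights, here is a minimal sketch using the standard Fellegi-Sunter log-ratio form. The slides do not state the exact formula or log base used in the study, so base 2 (as in the robinlinacre.com reference) is an assumption, as are the function names. The male value of 0.6 is used for the chance agreement on sex.

```python
from math import log2


def fs_weights(m, u):
    """Agreement and disagreement weights for one variable.

    m = P(agree | true match), u = P(agree | non-match).
    Base-2 logs are an assumption; the slides do not state the base.
    """
    agree = log2(m / u)
    disagree = log2((1 - m) / (1 - u))
    return agree, disagree


def total_weight(agreements, probs):
    """Sum the weights over all variables available for a candidate pair.

    `agreements` maps variable -> True / False / None (None = unavailable,
    contributes nothing). `probs` maps variable -> (m, u).
    """
    total = 0.0
    for var, agrees in agreements.items():
        if agrees is None:
            continue
        agree_w, disagree_w = fs_weights(*probs[var])
        total += agree_w if agrees else disagree_w
    return total


# Example probabilities quoted on the slide (sex uses the male value of u).
PROBS = {"sex": (0.994, 0.6), "location": (0.966, 0.00003)}
```

Under these assumptions, agreement on location contributes a weight of roughly 15 while agreement on sex contributes less than 1, which illustrates why location is described as the most useful variable.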

  12. Thresholds for matches
     [Charts: number of correct and incorrect linkages by overall linkage weight, shown separately for TARN records with and without a location postcode]
     • Location postcode in TARN: overall linkage weight >14 classified as matched
     • No location postcode in TARN: overall linkage weight >8 classified as matched
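
The two thresholds on this slide amount to a simple rule applied to each record's best candidate link. A minimal sketch, with hypothetical names and labels:

```python
# Thresholds quoted on the slide; names and return labels are hypothetical.
THRESHOLD_WITH_POSTCODE = 14
THRESHOLD_WITHOUT_POSTCODE = 8


def classify(weight, has_location_postcode):
    """Classify a best candidate link by its overall linkage weight."""
    if has_location_postcode:
        threshold = THRESHOLD_WITH_POSTCODE
    else:
        threshold = THRESHOLD_WITHOUT_POSTCODE
    return "matched" if weight > threshold else "not matched"
```

The lower threshold for records without a location postcode reflects that the most discriminating variable cannot contribute to their total weight.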

  13. Initial [provisional] results

  14. Linkage quality: test dataset
     • Links made using first part of home postcode and incident description
     • TARN records with known STATS19 match: 945
     • Location type = 'road': 826 (excludes 'public area', 'home' and others)
     • Location postcode: 475 [58%]
        • Linked: 445 [94%], of which correct 443 [99.6%] and incorrect 2 [0.4%]
        • Not linked: 30
        • 93% records correctly linked, 0.4% bad matches, 6% missed matches
     • No location postcode: 351 [42%]
        • Linked: 137 [39%], of which correct 133 [97.1%] and incorrect 4 [2.9%]
        • Not linked: 214
        • 38% records correctly linked, 1.1% bad matches, 61% missed matches
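
The summary percentages on this slide follow directly from the counts. The short sketch below (function and key names are illustrative) shows the arithmetic, treating "missed matches" as records that were not linked at all.

```python
def linkage_quality(total, linked, correct):
    """Quality measures for a test set where every record has a known match.

    total   = records with a known STATS19 match
    linked  = records the method linked to some STATS19 record
    correct = linked records where the link is the known match
    """
    incorrect = linked - correct
    not_linked = total - linked
    return {
        "linked_rate": linked / total,
        "correct_among_linked": correct / linked,
        "correctly_linked": correct / total,
        "bad_matches": incorrect / total,
        "missed_matches": not_linked / total,
    }


with_postcode = linkage_quality(total=475, linked=445, correct=443)
without_postcode = linkage_quality(total=351, linked=137, correct=133)
```

For example, `with_postcode["correctly_linked"]` is 443 / 475, which rounds to the 93% quoted above.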

  15. Linkage results: full dataset
     • TARN records for 'vehicle incident or collision': 35,817
     • Excluding:
        • invalid year
        • cases where year and month of incident and admission differ
        • cases where the description indicates the incident is out of scope of STATS19
        • 'public area', 'home' etc
     • Location type = 'road': 27,780
     • Location postcode: 13,029 [47%]
        • Linked: 8,821 [68%] (split between correct and incorrect not known)
        • Not linked: 4,208
     • No location postcode: 14,751
        • Linked: 4,517 [31%] (split between correct and incorrect not known)
        • Not linked: 10,234
     • 13,338 linked records overall
     • Non-links due to:
        • data issues/errors (especially lack of postcode in TARN)
        • TARN records out of scope of STATS19 (medical, suicide, abroad etc)
        • casualty not in STATS19 (i.e. not known to police)

  16. Linkage by year
     TARN records with postcode location of incident recorded, % linked to STATS19, overall and excluding pedal cyclists:
     • 2018: 69% overall, 74% excluding pedal cyclists
     • 2019: 69% overall, 73% excluding pedal cyclists
     • 2020: 65% overall, 73% excluding pedal cyclists
     Lower overall rate for 2020 largely due to more pedal cyclist casualties (impact of Covid/lockdown)

  17. Linkage by TARN outcome
     TARN records with postcode location of incident recorded, % linked to STATS19:
     • Dead: 84%
     • Alive, by number of days in hospital:
        • 30+ days: 76%
        • 21-30 days: 74%
        • 11-20 days: 71%
        • 1-10 days: 62%

  18. Linkage by road user type
     TARN records with postcode location of incident recorded, % linked to STATS19:
     • Pedestrian: 79%
     • Motorcyclist: 72%
     • [Car/van/HGV] driver: 72%
     • [Car/van/HGV] passenger: 71%
     • Bus passenger: 55%
     • Mobility scooter: 43%
     • Pedal cyclist: 42%
     • Powered transporter: 25% (only 8 powered transporter records in the TARN dataset)

  19. Linkage by road user type
     Split by whether the incident description mentions 'fall', 'fell' or 'came off', or otherwise [% linked: no such mention / fall mentioned]:
     • Pedestrian: 79% / 76%
     • Motorcyclist: 76% / 61%
     • [Car/van/HGV] driver: 73% / 55%
     • [Car/van/HGV] passenger: 71% / 53%
     • Bus passenger: 77% / 32%
     • Mobility scooter: 74% / 15%
     • Pedal cyclist: 63% / 19%
     • Powered transporter: 40% / 0% (again, very small numbers)

  20. Linked records: STATS19 severity
     Linked records where TARN location information exists, by TARN outcome and STATS19 severity:

                       STATS19 severity
     TARN outcome    Killed   Serious    Slight     Total
     Dead               574        26         4       604
     Alive               27     7,024     1,166     8,217
     Total              601     7,050     1,170

     Around 13% of linked records coded as slight in STATS19

  21. Linked records: STATS19 severity
     Linked records where TARN location information exists, where linked to a STATS19 record for a force using injury-based reporting (with sub-division of serious category):

     Measure                                          Very serious   Moderately serious   Less serious   Slight    All
     Average length of stay (days)                            24.8                 13.9           11.6     10.0   15.7
     % with period in critical care                            56%                  26%            15%      13%    29%
     Average length of stay in critical care (days)           12.9                  5.2            5.1      4.5    9.1

  22. Conclusions and next steps

  23. Conclusions
     • Feasibility of linkage
        • Where location information (postcode) is available in TARN, a good quality linkage can be made without patient home postcode
        • When location is missing, match rate and quality are lower; patient postcode, or hospital location, is needed for better results
     • Initial [provisional] results
        • It is likely that most of the more serious road casualties within scope are captured in STATS19, with the exception of falls
        • Some degree of misclassification of severity in STATS19, with some trauma cases likely being coded as slightly injured (consistent with other work)
        • However, there is a correlation between the more detailed STATS19 severity and length of stay in critical care

  24. Next steps [all subject to TARN approval]
     1. Development and validation of linkage
        • Review tolerances to assess if marginal improvements can be made
        • Approach TARN networks for permission to use home postcode (where location does not exist in TARN)?
     2. Further years' data
        • In particular, 2021 data may provide insight into the completeness of e-scooter casualties in STATS19
     3. More in-depth analysis
        • Explore what the linked data can add to STATS19 and help to quantify the burdens of road collisions on the NHS
     4. Sharing
        • 'Link once, use many times' via making code or linkages available (to TARN), subject to interest/permission

  25. Future work: DfT statistics
     • Post-crash care
        • More focus on TARN data
        • No longer linking to HES
     • Injury-based reporting
        • Publication of more detailed severity reporting and most severe injury for CRASH forces (initial factsheet)
     • Future data strategy
        • Linking STATS19 to other data, e.g. Home Office fire incidents

  26. Thank you
     • Acknowledgements. We are grateful to:
        • The TARN team, University of Manchester (Antoinette Edwards, Rachel Bentley and Husna Ghafoor) for project approval, data supply and engagement
        • Dr Simon Lewis, TARN network lead for East of England, for permission to obtain and use partial home postcode to develop and validate the methodology
        • RAC Foundation (Ivo Wengraf) for advice and assistance in developing the linkage methodology and suggestions for future improvements
     • Any feedback appreciated: Matthew.Tranter@dft.gov.uk or roadacc.stats@dft.gov.uk
     • Methodology paper: https://www.gov.uk/government/statistics/linking-stats19-and-tarn-an-initial-feasibility-study/
