Efficient Error Detection in VAT Data
Investigating methods for detecting errors in VAT data: selective editing, outlier detection, and a combination of the two. Background on reducing respondent burden through the introduction of administrative data. Details of the initial automatic editing and the micro-level selective editing approach, including the score function, ratio of means imputation, and performance evaluation.
Presentation Transcript
Investigating methods of efficient detection of errors in VAT data
Katie Davies, Office for National Statistics
Overview
Background
Approach 1: Selective Editing
Approach 2: Outlier Detection
Comparisons
Approach 3: Combine
Conclusions & Recommendations
Background
Reduce respondent burden: introduction of administrative data
Distributive Trade Transformation
Magnitude of data
Terminology:
o RU: Reporting Unit
o Cell level: RUs aggregated based on industry and employment size band
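To illustrate the cell-level terminology, here is a minimal Python sketch of aggregating RU data into cells. The record layout (`industry`, `size_band`, `turnover`) and the sample values are hypothetical and do not come from the presentation.

```python
from collections import defaultdict


def aggregate_to_cells(records):
    """Sum RU turnover within each (industry, size_band) cell.

    `records` is a list of dicts; the field names are illustrative
    assumptions, not the layout used in the presentation.
    """
    totals = defaultdict(float)
    for rec in records:
        totals[(rec["industry"], rec["size_band"])] += rec["turnover"]
    return dict(totals)


# Made-up example data: three RUs falling into two cells.
rus = [
    {"ru": "A", "industry": "47", "size_band": "0-9", "turnover": 120.0},
    {"ru": "B", "industry": "47", "size_band": "0-9", "turnover": 80.0},
    {"ru": "C", "industry": "47", "size_band": "10-49", "turnover": 500.0},
]
cell_totals = aggregate_to_cells(rus)
```

Keying the totals on the (industry, size band) pair keeps the cell definition in one place, so a different cell definition would only change the key.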
Background
Initial Automatic Editing
Thousand Pound rule
Quarterly pattern
o [x, x, x, x] except x = 0
o [x, x, x, y]
o [0, 0, 0, y]
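The two automatic checks named on this slide might be sketched as follows. This is an illustration under stated assumptions, not the rules as implemented for the presentation: the thousand pound check is assumed to compare a reported value with an expected value (for example the previous period), the bounds 650 and 1350 are placeholder values, and the quarterly pattern is assumed to be read over four consecutive returns.

```python
def looks_like_thousand_pound_error(reported, expected, low=650.0, high=1350.0):
    """Flag a value that is roughly 1000 times the expected value.

    `low` and `high` are illustrative bounds (assumptions), chosen only
    to bracket a ratio of about 1000.
    """
    if expected <= 0:
        return False  # no usable comparison value
    return low <= reported / expected <= high


def quarterly_pattern(values):
    """Classify four consecutive values into one of the slide's patterns.

    Returns the pattern label, or None if no pattern applies.
    """
    a, b, c, d = values
    if not (a == b == c):
        return None
    if a == d:
        # Four identical values; the all-zero case is excluded.
        return "[x, x, x, x]" if a != 0 else None
    if a == 0:
        return "[0, 0, 0, y]"
    return "[x, x, x, y]"
```

For example, `quarterly_pattern([250, 250, 250, 900])` is classified as `[x, x, x, y]`, and a series of four zeros is not flagged.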
Approach 1: Selective Editing
Micro level
Targeted
Score function
Ratio of means imputation
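Ratio of means imputation is commonly defined as scaling a unit's previous value by the ratio of the current-period mean to the previous-period mean among comparable clean units. The sketch below follows that common definition. The choice of donors (clean RUs in the same cell) and the function name are assumptions, and the presentation may define the imputation class differently.

```python
def ratio_of_means_impute(previous_value, donors):
    """Impute a current value from a unit's previous value.

    `donors` is a list of (previous, current) pairs from units assumed
    to be clean and comparable (e.g. the same cell). The ratio of means
    equals sum(current) / sum(previous) because both means share the
    same donor count.
    """
    sum_prev = sum(prev for prev, _ in donors)
    sum_curr = sum(curr for _, curr in donors)
    if sum_prev == 0:
        raise ValueError("donor previous-period total is zero")
    return previous_value * (sum_curr / sum_prev)


# Made-up donors whose turnover grew by 10% overall.
donors = [(100.0, 110.0), (200.0, 220.0), (50.0, 55.0)]
imputed = ratio_of_means_impute(80.0, donors)  # roughly 88
```

Using the ratio of sums rather than the mean of individual ratios gives larger donors more influence, which limits the effect of volatile small units.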
Approach 1: Selective Editing
Score Function
[Score function formula and symbol definitions: not recoverable from the transcript]
Approach 1: Selective Editing good performance
Approach 1: Selective Editing further consideration
Approach 2: Outlier Detection
Macro level
Seas function to identify
Automatic treatment = use factors from seas
RU treatment = Ratio of means imputation on RU with highest score
Approach 2: Outlier Detection good performance
Comparisons Selective Editing Outlier Detection
Approach 3: Combine
[Flowchart; node labels: Raw RU data; Selective editing; Original vs Treated plots; Edited RU data; Aggregation; Edited RU data based on decisions; Cell level data; Outliers detected; RU treatment; Treated RU data]
Approach 3: Combine good performance
Approach 3: Combine further investigation
Conclusions & Recommendations
[Flowchart; node labels: Raw RU data; Selective editing; Edited RU data; Original vs Treated plots; Aggregation; Edited RU data based on decisions; Cell level data; Outliers detected; RU treatment; Aggregation; Treated RU data; Treated cell level data; Further investigation & revisions; Final cell level data]