Building Models in a Tax Authority: Current Challenges & Opportunities

Explore the technical, ethical, and software engineering challenges faced in implementing models in a tax authority. Learn about the unique complexities of modeling taxpayer behavior and the impact of various factors on data accuracy.




Presentation Transcript


  1. Data Science in Practice: Current Challenges, Future Opportunities. Building and implementing models in a tax authority. Emma Gottesman, Lead Data Scientist, Data & Analytics, Belastingdienst, ek.gottesman@belastingdienst.nl. LCDS meeting, 26th October 2016

  2. Contents Who am I? Technical challenges Ethical and legal challenges Software engineering for data science New work and opportunities Training new data scientists Summary 2 D&A, Belastingdienst

  3. Who Am I? and how did I end up here. Lead data scientist at Data & Analytics in Belastingdienst; this is a technical role, not a managerial one. Previously worked for Accenture as a specialist in analytics for tax authorities, as part of the Accenture fraud analytics team: worked directly with 3 agencies (Ireland's Revenue, the UK's HMRC, the Netherlands' Belastingdienst) and collaborated remotely with Accenture teams working in several other countries. Before that, worked as a hardware engineer and statistician for a computer hardware company (Sun Microsystems, and briefly Oracle) looking at failure and reliability analysis, IoT (before we called it that), computer performance analysis, software quality analysis... Earlier still, worked in digital signal processing research in the bio-medical field and then video engineering (including implementation in real-time embedded software and firmware). So basically I have spent my career working at the interface between mathematics and computing. The work has always been very similar, it's just that the name for it changes every 5 years or so... 3 D&A, Belastingdienst

  4. Much of the work we are currently doing is fairly standard predictive modelling, but we have some very particular challenges... Technical Challenges they never mentioned this in college... Confounded data: what we are trying to model is taxpayer behaviour, but the measurement of that is confused by: business processes (these change over time, and are frequently different at different office locations); human decision making at multiple points (decisions are made based on business need, current capacity and expert knowledge, frequently not codified, consistent, tracked or measured); human variability (auditors and other clerical workers have varying levels of training and experience); economic changes (these impact on taxpayer behaviour and on political decisions about what we should be focussing on); changes in tax law (not just shifting thresholds). And it's very, very difficult to separate out the effect of all these factors. The first 3 items can be thought of as noise, the last 2 as changes in the state space in which we operate. 4 D&A, Belastingdienst

  5. Technical Challenges they never mentioned this in college... Feedback times. Like most data scientists we are building systems that learn and adapt. For this we need instances (taxpayer accounts, tax returns, individuals, companies etc.) that have been checked, scored or labelled. But this checking can take a very long time: for income tax returns for individuals it takes 2 to 3 years to get a year's batch audited*; for OB (omzetbelasting, the Dutch VAT) most audits will be finished in 8 months; similarly for many other processes. So robustness over time is very important to us, particularly as the space we work within (the tax system) is also changing over time. 5 D&A, Belastingdienst *and we have to wait until all, or almost all, of a year's batch are worked: the easier a case is, the faster it's finished, so we need to wait to see all the issues

  6. Technical Challenges they never mentioned this in college... Biased data, large state space, small random samples. We need scored data to build our (supervised) models. Data from previous years was selected by business rules (and currently by a combination of rules and models), so is highly biased. We do have random samples for some processes, but they are not big enough to use as the sole input. Example: the income tax form is 900 items long, and covers a taxpayer's entire fiscal situation, including data such as salaries, tax credits, benefits, allowances, savings, investments, debts, insurance policies, property, financial dependents and on and on. The random sample is about 20,000 returns from the approximately 10,000,000 population (0.2%)*. It is simply not possible for this sample to cover enough of the possible variations and ways of making mistakes to allow us to build a model on it alone, particularly given the amount of noise in the system. There will be many important patterns leading to incorrect tax that won't occur frequently enough to appear in the random sample. So we have to use the biased data. The samples help mitigate the bias. 6 D&A, Belastingdienst *simple random selection
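The sparsity argument can be made concrete with a quick back-of-the-envelope calculation. The sample and population sizes are from the slide; the 1-in-10,000 pattern rate is a hypothetical figure chosen purely for illustration:

```python
from math import comb

def prob_at_most(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# A pattern affecting 1 in 10,000 returns (hypothetical rate): how often
# would it show up in a 20,000-return simple random sample?
n_sample = 20_000
rate = 1 / 10_000
expected = n_sample * rate  # about 2 occurrences on average

# Probability the sample contains 2 or fewer examples of the pattern --
# far too few to learn anything reliable from.
p_sparse = prob_at_most(2, n_sample, rate)
print(expected, round(p_sparse, 3))
```

With an expected count of only about 2, roughly two-thirds of the time the sample would hold two or fewer examples of such a pattern, which is why the biased operational data cannot simply be discarded.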

  7. Technical Challenges they never mentioned this in college... Choosing the correct metric, and hit-rate vs value. There are many ways to measure how good a model is; choosing which one depends on the business process being solved. Typically we have a capacity issue: more cases are selected than we have capacity to work. So most of our models optimise hit-rate: the number of true positives in the sample identified to be worked. But we also have a remit to identify (and hence collect) unpaid tax. Statistical models tend to be more certain about patterns they see more frequently. For tax corrections these tend to be smaller in value, as large errors are less frequent. So it is possible to build a model that raises the hit-rate while the value of tax brought in drops. This is not good... The balance between these two metrics is something we're working on. And the selection of the appropriate metric to optimise for each process we work with is an ongoing discussion. 7 D&A, Belastingdienst
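The tension between hit-rate and value can be shown with a toy comparison of two candidate case selections. All outcomes and euro amounts below are invented for illustration:

```python
# Hypothetical audit outcomes for two candidate selections of 5 cases each:
# (is_hit, correction_value_eur). Illustrative numbers only.
selection_a = [(1, 300), (1, 250), (1, 400), (1, 150), (0, 0)]   # many small hits
selection_b = [(1, 12_000), (0, 0), (1, 9_500), (0, 0), (0, 0)]  # fewer, larger hits

def hit_rate(cases):
    return sum(hit for hit, _ in cases) / len(cases)

def total_value(cases):
    return sum(value for _, value in cases)

# Selection A wins on hit-rate (0.8 vs 0.4), but B brings in far more tax
# (21,500 vs 1,100) -- optimising hit-rate alone can reduce yield.
print(hit_rate(selection_a), total_value(selection_a))
print(hit_rate(selection_b), total_value(selection_b))
```

A model tuned purely on hit-rate would prefer selection A; a value-weighted metric would prefer B, which is exactly the trade-off the slide describes.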

  8. Technical Challenges they never mentioned this in college... So the data is: biased; noisy; confounded; and old, hence collected from a system that is similar to, but not the same as, the one where you need to apply the model. And of course there are the issues you get everywhere when you use real data: the data is dirty; the data is incomplete (there is always missing data); and you are building on data that was collected for reasons other than statistical analysis or building models. 8 D&A, Belastingdienst

  9. Legal and Ethical Challenges just because you can, doesn't mean you should. Oversight: we have a legal need to treat citizens equally, no matter their income or wealth, so can we ethically do A/B testing? Small errors happen more often than big ones, so models have a tendency to select taxpayer accounts with low income or small errors. AnotherTaxAuthority had to turn one of its models aimed at businesses off, because not only was it selecting more small businesses than large businesses, but it was repeatedly selecting the same small businesses. We are not allowed to discriminate on the basis of age, gender, race..., but it's surprisingly easy to do this by accident: for instance, using location data can end up being a proxy for race, as many ethnic minorities live in clusters. And sometimes factors that are discriminatory in the legal sense are also discriminatory in the statistical sense: recent immigrants tend to be more likely than the general population to have incorrectly declared their tax, because getting to know a new tax system, in a language you may not understand, is difficult, so you are likely to make an error. 9 D&A, Belastingdienst
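The location-as-proxy effect mentioned on this slide can be demonstrated on synthetic data. Everything here is invented (the population, the 20% minority share, the clustering probabilities); the point is only that a selection rule which never sees the protected attribute can still select on it via a correlated feature:

```python
import random
import statistics

random.seed(0)

# Synthetic illustration: a postcode cluster acts as a proxy for a protected
# attribute, even though the attribute itself is never used as a feature.
population = []
for _ in range(10_000):
    minority = random.random() < 0.2
    # Residential clustering: minorities more likely to live in cluster 1.
    cluster = 1 if random.random() < (0.7 if minority else 0.1) else 0
    population.append((minority, cluster))

# A "model" that selects purely on the postcode cluster...
selected = [minority for minority, cluster in population if cluster == 1]
base_rate = statistics.mean(m for m, _ in population)
selected_rate = statistics.mean(selected)

# ...ends up selecting minorities at far above their population share.
print(round(base_rate, 2), round(selected_rate, 2))
```

Here minorities are about 20% of the population but well over half of the selected cases, despite the "model" using only location, which is why discrimination audits need to test model outputs, not just model inputs.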

  10. Legal and Ethical Challenges just because you can, doesn't mean you should. Fraud detection: we can use statistical modelling to see who is at risk of committing fraud, but when can we act? What if the subject hasn't done anything illegal yet? This is not Minority Report... How do we determine what fraud actually is, since we can't read minds? Waiting for the result of legal action is usually much too late. Also, by their nature, fraudsters are always looking for new ways of committing fraud. How does the need to provide fair and equal treatment apply when we're looking for fraud, which tends to be anomalous? 10 D&A, Belastingdienst

  11. S/W Engineering & Data Science: development: more than just the model build. [Pipeline diagram: data source 1 and data source 2 feed an ETL step; a wide feature build (x1, x2, x3, ..., x100) is reduced to a handful of features (x1, x2, x3); this feeds repeated model builds and rule builds, interleaved with consultation with subject matter experts, EDA, imputation, transforms, re-coding etc. Example tables show the wide feature matrix for ids id1-id10 and the reduced matrix with a target column.] 11 D&A, Belastingdienst
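The development pipeline on this slide can be sketched as a toy program. All source names, fields and feature definitions below are hypothetical stand-ins for the slide's ETL, feature build and feature reduction steps:

```python
# A minimal sketch of the development pipeline: ETL joins the sources, a wide
# feature set is built, then reduced to the few features the model keeps.

def etl(source1, source2):
    """Join two keyed sources into one record per ID."""
    return {k: {**source1.get(k, {}), **source2.get(k, {})} for k in source1}

def build_features(record):
    """Derive a feature vector (stand-in for the wide x1..x100 build)."""
    return {
        "x1": record.get("income", 0) // 1000,
        "x2": len(record.get("deductions", [])),
        "x3": 1 if record.get("self_employed") else 0,
    }

def reduce_features(features, keep=("x1", "x3")):
    """Feature reduction: keep only the columns the final model uses."""
    return {k: features[k] for k in keep}

# Hypothetical data sources keyed by taxpayer ID.
src1 = {"id1": {"income": 42_000, "self_employed": True}}
src2 = {"id1": {"deductions": ["mortgage", "gift"]}}

row = reduce_features(build_features(etl(src1, src2)["id1"]))
print(row)
```

In production this lives in ETL tooling rather than plain dictionaries, but the shape is the same: each stage is a separate, testable transformation, which is what makes the later "extension" slide (new sources, new features, new rules) tractable.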

  12. S/W Engineering & Data Science: production: getting it to run in live. [Pipeline diagram for model v1: data source 1 and data source 2 feed ETL; a feature build produces x1, x2, x3; rules (R1: IF ... THEN ..., R2: IF ... THEN ...) and the model score each case; the output table holds, per id, the rule flags, the model score and the resulting decision.] 12 D&A, Belastingdienst

  13. S/W Engineering & Data Science: extension: it's never just the one time. [Pipeline diagram for model v2: a new data source 3 joins the ETL; the feature build is extended with x104 and x205; a new rule R103: IF ... THEN ... is added alongside R1 and R2; the output table now carries the extra rule flag, the model score and the decision for each id.] 13 D&A, Belastingdienst

  14. Practical Data Science: model building is a cycle. Explore, Investigate, Scope: What are we trying to model? How would we use the results? What is in or out of scope? What is the current state-of-play? How do we define success? What data is available? What date range should we look at? Do we have the right resources? Do we understand enough to start work? Build a Model: consult with subject matter experts; bring all the data together; process raw data and create features; define the target; build a model; report theoretical results. Test / Deploy on Live Cases: test the model on real cases, with actual audits; do the results match what we predicted?; report practical results. What did we learn? Use that to help build a better model. 14 D&A, Belastingdienst

  15. New Work and Opportunities. New methods: unsupervised methods, because we want to discover new patterns, though it can be hard to justify the cost of developing these, as it's difficult to put a monetary value on what we don't know; data-driven segmentation (we have lots of different business segmentations, but many of the boundaries are old and some are somewhat arbitrary); micro segmentation; anomaly detection; next best action; unstructured data (text mining, sentiment analysis); failure analysis (can we predict when a change in behaviour is going to happen?); ideas? 15 D&A, Belastingdienst
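One of the unsupervised ideas on this slide, anomaly detection, can be sketched in a few lines. The declared amounts and the 2-sigma threshold below are illustrative assumptions, not a production rule:

```python
import statistics

# A minimal anomaly-detection sketch: flag declared amounts that sit far
# from the segment's typical value. Data and threshold are illustrative.
declared = [1020, 980, 1005, 995, 1010, 990, 1000, 5400, 1015, 985]

mu = statistics.mean(declared)
sigma = statistics.stdev(declared)

# Flag anything more than 2 standard deviations from the mean.
anomalies = [x for x in declared if abs(x - mu) > 2 * sigma]
print(anomalies)
```

Note that the outlier itself inflates both the mean and the standard deviation, which is one reason real systems prefer robust statistics (medians, isolation-based methods) over this naive z-score approach.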

  16. New Work and Opportunities. New data: at the moment all the data we use is internal to Belastingdienst (your tax return); or disclosed by law (data from your employer, your bank etc.); or public and directly related to tax (Chamber of Commerce: registering a new business). But what about new sources: criminal records, social media, scraping websites... Could we use these? Should we use these? 16 D&A, Belastingdienst

  17. The Education and Training of Data Scientists the well-known three-domain model. Programming: not just languages (R, SAS, Python...) but also computer science & software engineering principles; discipline around creating, documenting and using code makes everyone's life easier. Maths & Statistics: including non-parametric tests*, transformations, rules of thumb, why big data can make everything significant (& what to do about it); O() notation and why some operations do not scale; basic information theory. Domain Knowledge: this you learn on the job, like an apprenticeship; understanding what the data represents, how it's created and modified, and how your results will be used is essential to building a useful model. Plus, critical thinking: spurious correlations and underlying variables abound... 17 D&A, Belastingdienst *It's a general rule that we never have a normal distribution...

  18. The Education and Training of Data Scientists the skill that's often forgotten. [Diagram: Communication overlaps the three domains Maths & Statistics, Programming and Domain Knowledge, with labels: visualisation, story-telling, clear explanations, engagement, asking the right questions, listening.] This is often missed, and it's vital. Communication not only with your peers and technical leadership, but also with non-technical colleagues, managers and stakeholders. And the communication needs to go both ways: in order to understand your data you need to be able to understand what the people interacting with your data are telling you, and what they are doing. It's a skill, and it can be taught. 18 D&A, Belastingdienst

  19. Summary Data science in practice is a craft as much as a science. Many of the issues we face are not just statistical. We are always working in non-ideal situations. And in many ways that makes it more interesting. There is lots of interesting stuff we could be doing, but since we are a public body we need to justify our actions both fiscally and ethically. 19 D&A, Belastingdienst
