Learning and Interpreting Complex Distributions in Empirical Data

Explore the methods and examples of fitting complex distributions in empirical data for decision-making, growth dynamics, network science, survival analysis, and computer science. Learn how to interpret generative dynamics from parametric distributions to understand uncertainties and risks in various fields.

  • Complex Distributions
  • Empirical Data
  • Decision Making
  • Growth Dynamics
  • Network Science

Presentation Transcript


  1. Learning and Interpreting Complex Distributions in Empirical Data. [Author names not recoverable from the transcript], Tsinghua University

  2. Outline: Introduction, Method, Experiments, Conclusion. Introduction: Distribution Fitting for Decision Making. Cross-sectional empirical data → distribution fitting (for uncertainty) → decision making. Example: quality of apples, selling price, expected income, etc. [Figure: histogram of apple weight in grams.]

  3. Introduction: Growth Dynamics for Decision Making. Cross-sectional empirical data → distribution fitting (for uncertainty) and growth dynamics → time-related decision making. Distribution fitting: quality of apples, selling price, expected income. Growth dynamics (how the apples grow): future income, early marketing. [Figures: histogram of apple weight in grams; growth curve over time.]

  4. The Problem: (1) to fit empirical data with parametric distributions; (2) to interpret their generative dynamics from the distributions. Empirical data → distribution fitting → generative dynamics.

  5. Example 1: Network Science. Network data → distribution fitting → growth dynamics: rich-get-richer, preferential attachment, the Matthew effect.

  6. Example 1, continued. Citations: how do Einstein's citations grow? Market values: how does the WeChat social network grow? Tumor size: how does a tumor grow?

  7. Example 2: Survival Analysis. Time-to-event data → distribution fitting → growth: risks of diseases.

  8. Example 3: Computer Science. Data from an arbitrary distribution → distribution fitting by a deep model (GAN) → generative dynamics of the deep model.

  9. The Problem Again: (1) to fit empirical data with parametric distributions; (2) to interpret their generative dynamics from these distributions. Empirical data → distribution fitting → generative dynamics.

  10. Challenges. (1) Distributions of empirical data are always complex. (2) Existing distribution-fitting models cannot capture complex empirical data accurately. (3) How to infer generative dynamics from cross-sectional distributions is largely unknown.

  11. Challenge 1: Distributions of empirical data are complex: heavy-tailed, mixture models, multiscale complexities, even in one dimension. Examples: inter-event times of adding friends in WeChat by an active user; terrorist attacks worldwide from 1968 to 2006.

  12. Challenge 2: Existing distribution-fitting models cannot capture complex empirical data well. Narrow-tailed: Gaussian, deep models (GAN). Heavy-tailed: power law, Weibull (or stretched exponential distribution).

  13. Challenge 3: How to infer generative dynamics from distributions is largely unknown. How can generative dynamics over time be inferred from cross-sectional data, that is, how can one go from cross-sectional to longitudinal? Complex systems (food, ecological, social, biological, technological): input → system (time) → output?

  14. Method Framework. (1) Build complex distributions from hazard rates in survival analysis. (2) Generate the distributions by a dynamic system. (3) Learn the distributions and the dynamic system from empirical data. [Diagram: Data, Distribution, Survival Analysis, and Dynamic System, linked by steps 1 to 3.]

  15. Method Details, Step 1: build complex distributions from hazard rates in survival analysis. Let $X$ be a random variable with $n$ observations $\{x_1, x_2, \ldots, x_n\}$. Tools for distributions: PDF $f(x) = \lim_{\Delta x \to 0} \Pr(x \le X < x + \Delta x)/\Delta x$; CDF $F(x) = \int_0^x f(u)\,du$. Tools for survival analysis: survival function (CCDF) $S(x) = 1 - F(x)$; hazard function $h(x) = \lim_{\Delta x \to 0} \Pr(x \le X < x + \Delta x \mid X \ge x)/\Delta x = f(x)/S(x)$. Distributions from the hazard function: $f(x) = h(x)\,e^{-\int_0^x h(u)\,du}$.
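
The identity that recovers a distribution from its hazard function follows in a few steps from the definitions. A short derivation, writing $f$, $S$ and $h$ for the density, the survival function and the hazard function (symbol names assumed here, since the transcript does not preserve the slide's own glyphs) and assuming $S(0) = 1$:

```latex
\begin{align*}
h(x) &= \frac{f(x)}{S(x)} = -\frac{S'(x)}{S(x)} = -\frac{d}{dx}\ln S(x)
  && \text{since } f(x) = F'(x) = -S'(x), \\
S(x) &= \exp\!\Big(-\int_0^x h(u)\,du\Big)
  && \text{integrating from } 0 \text{ with } S(0) = 1, \\
f(x) &= h(x)\,S(x) = h(x)\,\exp\!\Big(-\int_0^x h(u)\,du\Big).
\end{align*}
```

Any non-negative hazard whose integral diverges therefore defines a proper distribution, which is what makes the hazard a convenient place to build new parametric families.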

  16. Step 1, Survival Analysis. $f(x) = h(x)\,e^{-\int_0^x h(u)\,du}$. Examples: exponential distribution, $f(x) = \lambda e^{-\lambda x}$ with hazard $h(x) = \lambda$; power-law distribution, $f(x) = \frac{\alpha}{x_0}\left(\frac{x}{x_0}\right)^{-(\alpha+1)}$ for $x \ge x_0$ with hazard $h(x) = \alpha/x$.
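
As a numerical sanity check of the exponential and power-law examples, the sketch below rebuilds a density from a hazard function by integrating the hazard with the trapezoid rule and compares it with the closed-form density. The function name `pdf_from_hazard`, the symbols, and the parameter values are illustrative assumptions, not taken from the slides.

```python
import math


def pdf_from_hazard(hazard, x, x_min, steps=20_000):
    """Return h(x) * exp(-integral of h from x_min to x), using the trapezoid rule."""
    width = (x - x_min) / steps
    total = 0.5 * (hazard(x_min) + hazard(x))
    for k in range(1, steps):
        total += hazard(x_min + k * width)
    cumulative_hazard = total * width
    return hazard(x) * math.exp(-cumulative_hazard)


# Illustrative parameter values (assumptions, not from the slides).
lam, alpha, x0, x = 0.5, 2.0, 1.0, 3.0

# Constant hazard: should reproduce the exponential density.
exp_numeric = pdf_from_hazard(lambda u: lam, x, 0.0)
exp_closed = lam * math.exp(-lam * x)

# Hazard alpha / u on [x0, inf): should reproduce the power-law (Pareto) density.
pareto_numeric = pdf_from_hazard(lambda u: alpha / u, x, x0)
pareto_closed = (alpha / x0) * (x / x0) ** (-(alpha + 1.0))

print(exp_numeric, exp_closed)
print(pareto_numeric, pareto_closed)
```

The same helper accepts any other hazard function, so it can be reused to inspect the density implied by a more complex hazard once its formula is known.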

  17. Step 1, Survival Analysis. $f(x) = h(x)\,e^{-\int_0^x h(u)\,du}$. Examples: exponential distribution, $f(x) = \lambda e^{-\lambda x}$ with hazard $h(x) = \lambda$; power-law distribution, $f(x) = \frac{\alpha}{x_0}\left(\frac{x}{x_0}\right)^{-(\alpha+1)}$ for $x \ge x_0$ with hazard $h(x) = \alpha/x$. We propose a new hazard function [formula lost in extraction]. What distribution(s) does it give?

  18. Step 1, Survival Analysis: Complex Distribution. The same construction, exponential and power-law examples, and proposed hazard function as on slide 17.

  20. Method Details, Step 2: generate the above distributions by a dynamic system. Sketch of the system: simple input → dynamic system → complex output, namely the distributions given by the proposed hazard function [formula lost in extraction].

  21. Method Details, Step 2: generate the above distributions by a dynamic system. Sketch of the system. Input: a new agent $i$ arrives in the system at time $t_i$, following a Poisson process. Dynamics: the state of agent $i$, denoted $x_i(t)$, grows over time according to a differential equation for $dx_i(t)/dt$ with a given initial state [the right-hand side of the equation and the initial state are garbled in extraction]. Output: a cross-sectional observation of the system at time $t$, namely $\{x_1(t), x_2(t), \ldots, x_n(t)\}$, follows the distributions specified by the proposed hazard function.
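
To make the input, dynamics, and output roles concrete, here is a minimal simulation sketch. It does not use the slide's growth equation, which is not legible in this transcript. Instead it assumes a hypothetical rule, $dx/dt = x^{\alpha+1}/(\alpha T)$ with $x = 1$ at arrival, chosen only because its cross-section at time $T$ is known in closed form: given the number of arrivals, Poisson arrival times are uniform on $[0, T]$, so the observed states follow a power law with survival function $x^{-\alpha}$. The function name and all parameter values are likewise assumptions.

```python
import random


def simulate_cross_section(rate, horizon, alpha, seed=0):
    """Agents arrive by a Poisson process; return their states observed at time `horizon`."""
    rng = random.Random(seed)
    states = []
    t = rng.expovariate(rate)  # first arrival time
    while t < horizon:
        age = horizon - t  # how long this agent has been growing
        # Closed-form solution of the hypothetical rule
        # dx/dt = x**(alpha + 1) / (alpha * horizon), with x = 1 at arrival.
        states.append((1.0 - age / horizon) ** (-1.0 / alpha))
        t += rng.expovariate(rate)  # exponential gap to the next arrival
    return states


states = simulate_cross_section(rate=50.0, horizon=1000.0, alpha=2.0, seed=1)

# Empirical survival at x = 2; under the assumed rule it should be near 2**-2.
tail_fraction = sum(1 for x in states if x > 2.0) / len(states)
print(len(states), tail_fraction)
```

The design point is that a simple input (memoryless arrivals) plus a deterministic growth law is enough to produce a heavy-tailed cross-section; a different growth law would yield a different distribution.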

  22. Method Details, Step 2: generate the above distributions by a dynamic system. Sketch of the system: the growth equation for $dx_i(t)/dt$ shown alongside the proposed hazard function it generates [both formulas garbled in extraction].

  23. Method Details, Step 3: learn the distributions and the dynamic system from empirical data. The distribution, the survival analysis, and the dynamic system share parameters, e.g. MLE in survival-analysis / hazard-rate space. Construction and proof, optimization, and generator code: www.calvinzang.com
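
One way to read "MLE in hazard rate space" is to write the log-likelihood directly in terms of the hazard $h$ and cumulative hazard $H$: $\ell = \sum_i [\ln h(x_i) - H(x_i)]$. The sketch below applies this to the power-law hazard $h(x) = \alpha/x$ on $x \ge x_0$, not to the proposed hazard function, whose formula is not legible here. The optimizer (a golden-section search), the function names, and the synthetic data are assumptions made for illustration.

```python
import math
import random


def hazard_log_likelihood(alpha, data, x0):
    """Sum of ln h(x) - H(x) with h(x) = alpha / x and H(x) = alpha * ln(x / x0)."""
    return sum(math.log(alpha / x) - alpha * math.log(x / x0) for x in data)


def fit_alpha(data, x0, lo=1e-3, hi=50.0, tol=1e-8):
    """Maximize the log-likelihood over alpha by golden-section search (it is concave)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if hazard_log_likelihood(c, data, x0) > hazard_log_likelihood(d, data, x0):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return 0.5 * (a + b)


# Synthetic power-law sample by inverse-transform sampling (illustrative values).
rng = random.Random(42)
true_alpha, x0 = 2.5, 1.0
sample = [x0 * (1.0 - rng.random()) ** (-1.0 / true_alpha) for _ in range(5000)]

alpha_hat = fit_alpha(sample, x0)
# Known closed-form MLE for this hazard, used as a cross-check.
alpha_closed = len(sample) / sum(math.log(x / x0) for x in sample)
print(alpha_hat, alpha_closed)
```

For a hazard with several parameters the same log-likelihood applies, but the one-dimensional search would need to be replaced by a multivariate optimizer.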

  24. Experiments: Data
      Cross-sectional observations:
        1. #words in the novel Moby Dick
        2. #deaths in terrorist attacks
        3. #mammals on Earth
        4. #people in the blackouts in the U.S.
        5. #population of the U.S. cities
        6. Acre sizes of wildfires in the U.S.
        7. Earthquake intensities in California
        8. Degree of the movie-actor network
      Dynamics records of social behaviors:
        1. Time intervals of adding friends in the WeChat social network
        2. Time intervals of SMS by mobile users
        3. Response time of Einstein's correspondence
        4. Response time of Freud's correspondence
        5. Time intervals of emails
        6. Time intervals of information cascades in Tencent Weibo
        7. Time intervals of group chatting in Tencent QQ
        8. Time intervals of consecutive revision of one Wikipedia

  25. Experiments: Distribution Fitting. Most widely used method: systematic bias, vs. good fitting.

  26. Experiments: Inferring Growth Dynamics. Recap: generating distributions by a dynamic system. Sketch of the system: the growth rate $dx_i(t)/dt$, shown against the empirical dynamics.

  27. Conclusions. Tools for fitting complex empirical distributions, leveraging survival analysis. A generative interpretation in a dynamic-system view: from cross-sectional to longitudinal. One framework connecting these dots. Future work: more complex parametric distributions; dynamics from non-parametric distributions; generative dynamics, physical meanings of existing distributions, validations. Code & data: www.calvinzang.com

  28. Thanks & QA. Learning and Interpreting Complex Distributions in Empirical Data. www.calvinzang.com
