Predictive Analytics for Economists: Building Regression Models with Data Mining Techniques

Explore the process of building predictive regression models using data mining techniques in the context of economics. Learn about selecting the best subset regression methods, evaluating predictive performance, and adjusting p-values in linear regressions. Dive into real-world examples with the Boston Housing Data Set and understand the importance of avoiding over-fitting for accurate predictions.

  • Predictive Analytics
  • Regression Models
  • Data Mining
  • Economics
  • Boston Housing Data

Presentation Transcript


  1. Eco 6380: Predictive Analytics for Economists. Spring 2016. Professor Tom Fomby, Department of Economics, SMU.

  2. Presentation 5: Building a Predictive Regression Model Using Data Mining Techniques (Backward, Forward, Stepwise, All Regressions, Cp Statistic). Chapter 6 in SPB.

  3. Building a Data Mining Model: Example with Linear Regression. The Boston Housing Data Set. Selecting a best subset regression. Selection methods: Forward, Backward, Stepwise, and All Subsets. See the file Multiple Linear Regression and Subset Selection.pdf on the class website. The critical p-value acts as a tuning parameter. Thus the best architecture of the regression equation is determined by the choice of best subset selection method as well as the choice of the p-value tuning parameter of that method.
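To illustrate how the critical p-value works as a tuning parameter, here is a minimal Python sketch of forward selection. It is not the course's SAS code: the function names `ols_pvalues` and `forward_select`, the parameter `p_enter`, and the synthetic data are all assumptions made for illustration, and p-values use a large-sample normal approximation rather than the exact t distribution.

```python
import math
import numpy as np

def ols_pvalues(design, y):
    """Two-sided p-values for OLS coefficients (normal approximation).
    `design` must already contain an intercept column."""
    n, k = design.shape
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - k)
    cov = sigma2 * np.linalg.inv(design.T @ design)
    t_stats = beta / np.sqrt(np.diag(cov))
    return np.array([math.erfc(abs(t) / math.sqrt(2.0)) for t in t_stats])

def forward_select(X, y, p_enter=0.05):
    """Greedy forward selection: at each step add the candidate with the
    smallest p-value; stop when that p-value is not below p_enter."""
    n, m = X.shape
    selected, remaining = [], list(range(m))
    while remaining:
        best_p, best_j = None, None
        for j in remaining:
            design = np.column_stack([np.ones(n), X[:, selected + [j]]])
            p = ols_pvalues(design, y)[-1]  # p-value of the candidate column
            if best_p is None or p < best_p:
                best_p, best_j = p, j
        if best_p >= p_enter:
            break
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Synthetic data (hypothetical): only columns 0 and 3 truly matter.
rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 8))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(size=n)

strict = forward_select(X, y, p_enter=0.001)  # tight threshold, smaller model
loose = forward_select(X, y, p_enter=0.50)    # loose threshold, larger model
print("p_enter=0.001 selects columns:", strict)
print("p_enter=0.50  selects columns:", loose)
```

Because the greedy path is the same for every threshold, a looser `p_enter` can only extend the model chosen under a stricter one. In practice the threshold would be tuned by comparing predictive accuracy on the validation data set.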

  4. Building a Model: Example with Linear Regression. The Boston Housing Data Set, continued. Demonstrate the danger of over-fitting a multiple regression equation by running the SAS simulation program Cross Validation.sas. The validation data set reveals the truly bogus nature of spurious regressions.
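The over-fitting danger can be sketched in Python as well. This is not a port of Cross Validation.sas, whose contents are not reproduced here; it is a small simulation under assumed settings (60 training rows, 60 validation rows, 40 pure-noise regressors) in which the target is unrelated to every regressor by construction.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination; can be negative out of sample."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

rng = np.random.default_rng(42)
n_train, n_valid, n_noise = 60, 60, 40  # illustrative sizes (assumed)

# Regressors and target are independent noise: any fit is spurious.
X = rng.normal(size=(n_train + n_valid, n_noise))
y = rng.normal(size=n_train + n_valid)

design = np.column_stack([np.ones(len(y)), X])
X_tr, X_va = design[:n_train], design[n_train:]
y_tr, y_va = y[:n_train], y[n_train:]

# Estimate on the training partition only, then score both partitions.
beta, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
r2_train = r_squared(y_tr, X_tr @ beta)
r2_valid = r_squared(y_va, X_va @ beta)

print(f"training R^2:   {r2_train:.3f}")
print(f"validation R^2: {r2_valid:.3f}")
```

With many regressors relative to the sample size, the training fit looks respectable even though nothing is being explained, while the validation score collapses.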

  5. Evaluation of Predictive Performance. I. Various measures of predictive accuracy. II. Comparisons of predictive accuracies across competing prediction models in the validation data set. III. Is the predictive accuracy of one method statistically better than the predictive accuracy of a competing method?
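As a rough illustration of these three steps, the sketch below computes two common accuracy measures (RMSE and MAE) and a simple paired test on squared-error differences. The specific measures and tests used in the course materials are not reproduced here; the function names, the choice of test, and the synthetic predictions are assumptions. The test treats validation observations as independent and uses a normal approximation.

```python
import math
import numpy as np

def rmse(y, y_hat):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2)))

def mae(y, y_hat):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y) - np.asarray(y_hat))))

def paired_loss_test(y, pred_a, pred_b):
    """Test H0: models A and B have equal mean squared error.
    Returns (t statistic, two-sided p-value by normal approximation).
    A negative t statistic favors model A."""
    d = (y - pred_a) ** 2 - (y - pred_b) ** 2  # per-observation loss difference
    t_stat = d.mean() / (d.std(ddof=1) / math.sqrt(len(d)))
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return float(t_stat), float(p_value)

# Hypothetical validation-set predictions: model A is less noisy than model B.
rng = np.random.default_rng(1)
y_valid = rng.normal(size=500)
pred_a = y_valid + rng.normal(scale=0.5, size=500)
pred_b = y_valid + rng.normal(scale=1.0, size=500)

t_stat, p_value = paired_loss_test(y_valid, pred_a, pred_b)
print(f"RMSE A = {rmse(y_valid, pred_a):.3f}, RMSE B = {rmse(y_valid, pred_b):.3f}")
print(f"MAE  A = {mae(y_valid, pred_a):.3f}, MAE  B = {mae(y_valid, pred_b):.3f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```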

  6. Now go directly to the file Scoring Measures for Prediction Problems.pdf on the class website

  7. Building a Model: Example with Linear Regression. The Boston Housing Data Set, continued. Rule-of-thumb adjustment of p-values in linear regressions, arising from multiple testing vis-à-vis subset selection methods. See the paper on data mining by Michael C. Lovell that I have posted on the class website in the file Data Mining_Lovell.pdf and, in particular, equation (3).
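A short Python sketch of the adjustment follows. It uses the commonly cited form of Lovell's rule of thumb: when the best k regressors are chosen from c candidates, a coefficient that looks significant at nominal level a should be regarded as significant only at level 1 - (1 - a)^(c/k), which is roughly (c/k) * a for small a. Whether this matches equation (3) of the posted paper exactly is an assumption and should be checked against the paper; the function names and the values c = 20, k = 5 are hypothetical.

```python
def lovell_true_level(nominal_alpha, n_candidates, n_selected):
    """Effective significance level after searching for the best
    n_selected regressors among n_candidates (rule-of-thumb form)."""
    ratio = n_candidates / n_selected
    return 1.0 - (1.0 - nominal_alpha) ** ratio

def lovell_required_nominal(target_alpha, n_candidates, n_selected):
    """Nominal level to demand of the reported p-value so that the
    effective level equals target_alpha (inverse of the rule above)."""
    ratio = n_selected / n_candidates
    return 1.0 - (1.0 - target_alpha) ** ratio

# Illustrative numbers only: 5 regressors kept out of 20 candidates.
c, k = 20, 5
effective = lovell_true_level(0.05, c, k)
required = lovell_required_nominal(0.05, c, k)
print(f"nominal 0.05 corresponds to an effective level of about {effective:.3f}")
print(f"for an effective 0.05, require a reported p-value below {required:.4f}")
```

In this illustration a nominal 5% test behaves like a test at roughly the 18.5% level, so reported p-values from a subset search need a much stricter cutoff.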

  8. Building a Model: Example with Linear Regression. The Boston Housing Data Set, continued. Demonstrate the best subset selection methods on the Boston Housing Data Set using the SAS program Boston Best Subset Partitioned.sas, which can be found on the course website. Demonstrate the best subset selection procedure in SPSS Modeler.
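For readers without SAS or SPSS Modeler, the all-subsets idea with Mallows' Cp can be sketched in Python. This is not a translation of Boston Best Subset Partitioned.sas and it does not use the Boston data; the function names and the synthetic data are assumptions. Cp is computed as SSE_p / s^2 - n + 2p, where s^2 is the error variance estimated from the full model and p counts the intercept.

```python
from itertools import combinations
import numpy as np

def sse(design, y):
    """Sum of squared OLS residuals for the given design matrix."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)

def all_subsets_cp(X, y):
    """Fit every subset of columns (intercept always included) and
    return a list of (Cp, columns) sorted from smallest Cp upward."""
    n, m = X.shape
    ones = np.ones((n, 1))
    sigma2_full = sse(np.hstack([ones, X]), y) / (n - m - 1)
    results = []
    for size in range(m + 1):
        for cols in combinations(range(m), size):
            design = np.hstack([ones, X[:, list(cols)]])
            p = size + 1  # parameters, including the intercept
            cp = sse(design, y) / sigma2_full - n + 2 * p
            results.append((cp, cols))
    return sorted(results)

# Synthetic data (hypothetical): only columns 1 and 4 drive the response.
rng = np.random.default_rng(7)
n, m = 200, 6
X = rng.normal(size=(n, m))
y = 1.0 + 3.0 * X[:, 1] - 2.0 * X[:, 4] + rng.normal(size=n)

ranked = all_subsets_cp(X, y)
for cp, cols in ranked[:3]:
    print(f"Cp = {cp:8.3f}  columns = {cols}")
```

The search fits 2^m models, so it is practical only for a modest number of candidate regressors. A subset with Cp close to its parameter count p is usually taken as a sign of little bias.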

  9. Classroom Exercise: Exercise 3
