Goodness of Fit in Statistics: Calibration and Refinement

Learn about the concepts of calibration and refinement in goodness of fit, as proposed by Cox in 1958. Explore external validation methods for model testing, fitting logistic models, and evaluating model performance across different datasets.


Presentation Transcript


  1. Goodness of fit, another view. Cox (1958) suggests that there are two aspects to goodness of fit: calibration (roughly, the average predicted probability is correct) and refinement (roughly, the discrimination, or spread, of the probabilities is correct).

  2. External validation. Suppose we have a model based on an old dataset and want to know whether it is adequate for use on a new dataset. Cox suggests that we are interested in testing $H_0: p_i = P(y_i = 1) = \hat{p}_i$, where $p_i$ is the probability in the new data and $\hat{p}_i$ is the probability in the new data estimated from the model based on the external (old) data.

  3. Equivalently, on the log-odds scale: $H_0: p_i = \hat{p}_i \iff H_0: \log\frac{p_i}{1 - p_i} = \log\frac{\hat{p}_i}{1 - \hat{p}_i}$. So, fit a univariate logistic model on the new dataset, where $y_i$ = outcomes on the new dataset and $x_i$ = estimated logits based on the model from the old dataset.

  4. The model: $\log\frac{p_i}{1 - p_i} = \beta_0 + \beta_1 \log\frac{\hat{p}_i}{1 - \hat{p}_i}$. If $\beta_0 = 0$ and $\beta_1 = 1$, then $\log\frac{p_i}{1 - p_i} = \log\frac{\hat{p}_i}{1 - \hat{p}_i}$.
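To see why the two coefficient values matter, the recalibration model can be solved for the probability in the new data. This is only a sketch: the symbol names ($p_i$ for the probability in the new data, $\hat{p}_i$ for the probability predicted by the old model, $x_i$ for its logit) are assumed here, since the slide's own symbols did not survive extraction.

```latex
\begin{aligned}
x_i &= \log\frac{\hat{p}_i}{1-\hat{p}_i}, \\
p_i &= \frac{1}{1+\exp\{-(\beta_0+\beta_1 x_i)\}}, \\
\beta_0 = 0,\ \beta_1 = 1 &\;\Longrightarrow\; p_i = \frac{1}{1+\exp(-x_i)} = \hat{p}_i .
\end{aligned}
```

Under the usual reading of such a model, a slope below 1 suggests the old model's predictions are too extreme for the new data and a slope above 1 suggests they are too conservative. With the slope at 1, a nonzero intercept shifts every predicted log odds up or down by the same amount.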

  5. An example: can we use a model based on males to predict CHD in females?

     /* males: the data used to fit the model */
     data males;
       set a.chd2018_a;
       where male;
     run;
     /* females: the new data used for external validation */
     data females;
       set a.chd2018_a;
       where not male;
     run;

  6. Fit the model on males, score females.

     %let target=chd;
     %let continuous_1=age chol fvcht sbp bmi;
     /* note: male is no longer necessary (or possible) */
     %let categorical_1=diab currsmok;
     proc logistic data=males descending;
       model chd=&continuous_1 &categorical_1;
       score data=females out=scored;
     run;
     proc print data=scored(obs=25);
     run;
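The scoring step is also a natural place to look at refinement (discrimination) in the new data. The sketch below repeats the fit from the slide and adds the FITSTAT option to the SCORE statement, which asks PROC LOGISTIC to print fit statistics for the scored dataset. Exactly which statistics are printed depends on the SAS/STAT release, so treat the output as something to check rather than a fixed list. The variable list is written out in full so the block does not depend on the macro variables.

```sas
/* Sketch: same model as on the slide, with fit statistics requested
   for the scored (female) data. FITSTAT needs the response variable
   (chd) to be present in the dataset being scored. */
proc logistic data=males descending;
  model chd=age chol fvcht sbp bmi diab currsmok;
  score data=females out=scored fitstat;
run;
```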

  7. Calculate logits on the scored dataset, using the probabilities calculated from the model fit on males.

     data scored;
       set scored;
       /* p_1 is the predicted probability of chd=1 from the SCORE statement */
       est_logit=log(p_1/(1-p_1));
     run;
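If any predicted probability is numerically 0 or 1, the logit is undefined and est_logit becomes missing, which silently drops that observation from the next model. A minimal guard, assuming a clipping tolerance of 1e-8 (an arbitrary choice made for this sketch, not something taken from the slides), could look like this; p_clip is a hypothetical helper variable.

```sas
/* Sketch: keep probabilities strictly inside (0, 1) before taking logits.
   The tolerance 1e-8 and the name p_clip are assumptions. */
data scored;
  set scored;
  p_clip=min(max(p_1, 1e-8), 1-1e-8);
  est_logit=log(p_clip/(1-p_clip));
  drop p_clip;
run;
```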

  8. Test beta=1.

     proc logistic data=scored descending;
       model chd=est_logit;
       /* test beta=1; these two statements are equivalent */
       test est_logit=1;
       test est_logit-1;
     run;
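The model on slide 4 matches the old model's predictions only when the intercept is 0 and the slope is 1, whereas the TEST statements above check the slope alone. One possible extension, not shown on the slides, is a joint Wald test of both conditions. The label calib is a hypothetical name chosen for this sketch.

```sas
/* Sketch: joint 2-df Wald test of intercept=0 and slope=1.
   INTERCEPT is the keyword PROC LOGISTIC uses for the intercept
   in TEST statements. */
proc logistic data=scored descending;
  model chd=est_logit;
  calib: test intercept=0, est_logit=1;
run;
```

A small p-value here would suggest that the predictions from the old model are miscalibrated for the new data in level, in spread, or in both.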

  9. Data Splitting (diagram). Training; Validation: comparison, selection, tuning; Test: final assessment.

  10. Data Splitting: 75% training, 25% test.

  11. PROC SURVEYSELECT

      %let target=chd;
      %let continuous_1=age chol fvcht sbp bmi;
      %let categorical_1=diab male currsmok;
      /* draw a 25% simple random sample; outall keeps every row and flags the sample with selected=1 */
      proc surveyselect data=a.chd2018_a out=chd2018sampled method=srs samprate=.25 outall seed=73321;
      run;
      proc freq data=chd2018sampled;
        tables selected;
      run;
      /* the selected 25% becomes the test set, the remaining 75% the training set */
      data train test;
        set chd2018sampled;
        if selected then output test;
        else output train;
      run;
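A simple random split can leave the training and test sets with somewhat different event rates, especially when the outcome is uncommon. As an alternative sketch, not part of the original slides, the sample can be stratified on the outcome so that both parts keep roughly the same proportion of chd events. The dataset names chd_sorted and chd2018strat are hypothetical.

```sas
/* Sketch: stratified 75/25 split on the outcome.
   The STRATA statement requires the input to be sorted by the strata variable. */
proc sort data=a.chd2018_a out=chd_sorted;
  by chd;
run;
proc surveyselect data=chd_sorted out=chd2018strat method=srs samprate=.25 outall seed=73321;
  strata chd;
run;
/* check that the event rate is similar in the selected and unselected rows */
proc freq data=chd2018strat;
  tables chd*selected;
run;
```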

  12. Fit the model on the training data, score the test data, and test beta=1.

      %let target=chd;
      %let continuous_1=age chol fvcht sbp bmi;
      %let categorical_1=diab male currsmok;
      proc logistic data=train descending;
        model chd=&continuous_1 &categorical_1;
        score data=test out=scored;
      run;
      /* logits of the predicted probabilities on the test set */
      data scored;
        set scored;
        est_logit=log(p_1/(1-p_1));
      run;
      proc logistic data=scored descending;
        model chd=est_logit;
        test est_logit=1;
      run;
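The slope test gives a single summary of calibration. A complementary, informal check is to compare observed and predicted event rates within groups of predicted risk. The sketch below assumes the scored dataset from the steps above, with chd coded 0/1 and p_1 holding the predicted probability. The names ranked and decile are hypothetical.

```sas
/* Sketch: observed vs. predicted event rate by decile of predicted risk. */
proc rank data=scored out=ranked groups=10;
  var p_1;
  ranks decile;   /* decile takes the values 0 (lowest risk) to 9 (highest) */
run;
proc means data=ranked n mean;
  class decile;
  var chd p_1;    /* mean of chd = observed rate, mean of p_1 = predicted rate */
run;
```

If the model is well calibrated on the test data, the two means should be close within each decile. Large or systematic gaps point to the kind of miscalibration the slope test is designed to detect.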
