ROC Curves and Operating Points in Machine Learning
Explore the concepts of ROC curves, changing thresholds in logistic regression, comparing models using ROC curves, precision-recall curves, AUC, and finding optimal operating points for better classification performance. Learn how to balance mistakes for your application and make informed decisions in model evaluation.
Presentation Transcript
ROC Curves and Operating Points (Geoff Hulten)
Classifications and Probability Estimates
- Logistic regression produces a score (a probability estimate).
- A threshold on that score produces the classification.
- What happens if you vary the threshold?
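A minimal sketch of this idea, assuming a scikit-learn LogisticRegression and numpy arrays X_train, y_train, and X (none of which come from the slides):

import numpy as np
from sklearn.linear_model import LogisticRegression

# Assumed example data; X_train, y_train, X are not from the slides.
model = LogisticRegression().fit(X_train, y_train)

# Score: the model's probability estimate for the positive class.
scores = model.predict_proba(X)[:, 1]

# Classification: compare the score to a threshold.
threshold = 0.5
predictions = (scores >= threshold).astype(int)

# Raising the threshold classifies fewer examples as positive,
# trading false positives for false negatives.
stricter = (scores >= 0.7).astype(int)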
Example of Changing Thresholds

Score   True label (Y)
.25     0
.45     0
.55     1
.67     0
.82     1
.95     1

Threshold = .5: False Positive Rate 33%, False Negative Rate 0%
Threshold = .6: False Positive Rate 33%, False Negative Rate 33%
Threshold = .7: False Positive Rate 0%, False Negative Rate 33%
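The rates in the table can be reproduced with a short sketch; the scores and labels come from the slide, while the helper function is just illustrative:

import numpy as np

scores = np.array([0.25, 0.45, 0.55, 0.67, 0.82, 0.95])
y      = np.array([0,    0,    1,    0,    1,    1])

def fp_fn_rates(scores, y, threshold):
    predictions = (scores >= threshold).astype(int)
    fp = np.sum((predictions == 1) & (y == 0))   # negatives incorrectly flagged
    fn = np.sum((predictions == 0) & (y == 1))   # positives that were missed
    return fp / np.sum(y == 0), fn / np.sum(y == 1)

for threshold in (0.5, 0.6, 0.7):
    print(threshold, fp_fn_rates(scores, y, threshold))
# 0.5 -> (0.33, 0.00), 0.6 -> (0.33, 0.33), 0.7 -> (0.00, 0.33)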
Precision Recall Curves
[PR curve figure: annotations mark the point where everything is classified as 1, the first set of mistakes, regions where incremental classifications are more or less accurate, and the point where everything is classified correctly.]
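scikit-learn's precision_recall_curve will do the threshold sweep for you; a sketch, assuming y_true and scores arrays like those above:

import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve

# Assumed: y_true holds the labels, scores holds the model's probability estimates.
precision, recall, thresholds = precision_recall_curve(y_true, scores)

plt.plot(recall, precision)
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("Precision-Recall Curve")
plt.show()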
Area Under Curve (AUC)
- Integrate the area under the curve; a perfect score is 1.
- Higher scores generally allow for better tradeoffs (e.g. the curve with AUC ~ .97 versus the one with AUC ~ .89).
- A score of 0.5 indicates a random model.
- A score below 0.5 indicates you're doing something wrong.
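In practice the integration is usually delegated to a library; a sketch with scikit-learn, again assuming y_true and scores:

from sklearn.metrics import roc_auc_score, roc_curve

# Area under the ROC curve: 1.0 is perfect, 0.5 is a random model.
auc = roc_auc_score(y_true, scores)

# The curve itself, useful for inspecting individual operating points.
fpr, tpr, thresholds = roc_curve(y_true, scores)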
Operating Points
- Balance mistakes for your application: spam filtering needs a low false positive rate.
- Use separate holdout data to find the threshold.
Pattern for using operating points

from math import isclose

# Train the model and tune parameters on training and validation data.

# Hold out extra data, reserved only for threshold setting.
xThreshold, yThreshold = ReservedData()

# Find the threshold that achieves the operating point on this extra holdout data.
potentialThresholds = {}
for t in [i / 100.0 for i in range(1, 101)]:   # sweep thresholds from 1% to 100%
    potentialThresholds[t] = FindFPRate(model.Predict(xThreshold, t), yThreshold)
bestThreshold = FindClosestThreshold(targetFPRate, potentialThresholds)  # targetFPRate: desired operating point

# Evaluate on test data with the selected threshold to estimate generalization performance.
performanceAtOperatingPoint = FindFNRate(model.Predict(xTest, bestThreshold), yTest)

# Make sure nothing went crazy: the FP rate on test data should be close
# to the FP rate measured on the threshold-setting data.
testFPRate = FindFPRate(model.Predict(xTest, bestThreshold), yTest)
if not isclose(testFPRate, potentialThresholds[bestThreshold], abs_tol=0.01):
    # Problem?
    pass
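One concrete way to implement the threshold sweep (a sketch using scikit-learn, not the author's exact code; the variable names are illustrative):

import numpy as np
from sklearn.metrics import roc_curve

# Assumed: model is a fitted classifier with predict_proba, and
# x_threshold, y_threshold are the reserved threshold-setting holdout split.
holdout_scores = model.predict_proba(x_threshold)[:, 1]
fpr, tpr, thresholds = roc_curve(y_threshold, holdout_scores)

# Pick the threshold whose false positive rate is closest to the target
# (e.g. a very low FP rate for a spam filter).
target_fp_rate = 0.01
best_threshold = thresholds[np.argmin(np.abs(fpr - target_fp_rate))]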