Evaluating Models Part 1


In this presentation, Geoff Hulten discusses how to evaluate models, treating evaluation as part of the process of creating them. The slides cover splitting data into training, validation, and test sets, the risks that lead to overly optimistic performance estimates, and basic metrics derived from the confusion matrix.



Presentation Transcript


  1. Evaluating Models Part 1: Evaluation is Creation. Geoff Hulten

  2. Training and Testing Data
     1) Training set: to build the model
     2) Validation set: to tune the parameters of the model
     3) Test set: to estimate how well the model works

     Common Pattern:
         for p in parametersToTry:
             model.fit(trainX, trainY, p)
             accuracies[p] = evaluate(validationY, model.predict(validationX))
         bestPFound = bestParametersFound(accuracies)
         finalModel.fit(trainX + validationX, trainY + validationY, bestPFound)
         estimateOfGeneralizationPerformance = evaluate(testY, finalModel.predict(testX))
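For reference, here is a minimal, runnable version of that pattern using scikit-learn. The dataset (make_classification), the LogisticRegression model, and the parameter grid are illustrative assumptions, not part of the slides; only the train/validation/test workflow itself comes from the slide above.

    # Sketch of the train / validation / test pattern (assumed dataset and model).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)

    # Split once into train / validation / test (60% / 20% / 20%).
    trainX, tempX, trainY, tempY = train_test_split(X, y, test_size=0.4, random_state=0)
    validationX, testX, validationY, testY = train_test_split(tempX, tempY, test_size=0.5, random_state=0)

    # Tune a parameter on the validation set only.
    parametersToTry = [0.01, 0.1, 1.0, 10.0]  # illustrative regularization strengths
    accuracies = {}
    for p in parametersToTry:
        model = LogisticRegression(C=p, max_iter=1000).fit(trainX, trainY)
        accuracies[p] = accuracy_score(validationY, model.predict(validationX))
    bestPFound = max(accuracies, key=accuracies.get)

    # Retrain on train + validation with the best parameter, then estimate
    # generalization performance once on the held-out test set.
    finalModel = LogisticRegression(C=bestPFound, max_iter=1000).fit(
        np.concatenate([trainX, validationX]), np.concatenate([trainY, validationY]))
    estimateOfGeneralizationPerformance = accuracy_score(testY, finalModel.predict(testX))
    print(bestPFound, estimateOfGeneralizationPerformance)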

  3. Risks with Evaluation
     Failure to generalize:
     1) If you test on the same data you train on, you'll be too optimistic.
     2) If you evaluate on the test data a lot as you're debugging, you'll be too optimistic.
     Failure to learn the best model you can:
     3) If you reserve too much data for testing, you might not learn as good a model.

  4. Types of Mistakes: Confusion Matrix
     Actual:     0 0 0 0 0 1 1 1 1 1
     Prediction: 1 1 1 0 0 0 0 1 1 1
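To make the four counts concrete, here is a small sketch in plain Python that tallies them from the slide's two label vectors:

    # Confusion-matrix counts for the slide's example.
    actual     = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    prediction = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]

    tp = sum(1 for a, p in zip(actual, prediction) if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in zip(actual, prediction) if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in zip(actual, prediction) if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in zip(actual, prediction) if a == 1 and p == 0)  # false negatives

    print(tp, tn, fp, fn)  # 3, 2, 3, 2

These counts (3 true positives, 2 true negatives, 3 false positives, 2 false negatives) are the inputs for the metrics on the next slide.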

  5. Basic Evaluation Metrics
     Actual:     0 0 0 0 0 1 1 1 1 1
     Prediction: 1 1 1 0 0 0 0 1 1 1
     Accuracy: what fraction does it get right?               (# TP + # TN) / # Total
     Precision: when it says 1, how often is it right?        # TP / (# TP + # FP)
     Recall: what fraction of 1s does it get right?           # TP / (# TP + # FN)
     False Positive Rate: what fraction of 0s are called 1s?  # FP / (# FP + # TN)
     False Negative Rate: what fraction of 1s are called 0s?  # FN / (# TP + # FN)
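Applying those formulas to the counts from the previous example (tp = 3, tn = 2, fp = 3, fn = 2), a quick sketch:

    # Metrics from the slide's formulas, using the counts tallied above.
    tp, tn, fp, fn = 3, 2, 3, 2
    total = tp + tn + fp + fn

    accuracy            = (tp + tn) / total  # 0.5
    precision           = tp / (tp + fp)     # 0.5
    recall              = tp / (tp + fn)     # 0.6
    false_positive_rate = fp / (fp + tn)     # 0.6
    false_negative_rate = fn / (tp + fn)     # 0.4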

  6. Example of Evaluation Metrics
     Counts: # TP = 90, # FN = 0, # FP = 9, # TN = 1  (100 examples total)
     Accuracy: 91%             (# TP + # TN) / # Total = (90 + 1) / 100
     False Negative Rate: 0%   # FN / (# TP + # FN) = 0 / (90 + 0)
     False Positive Rate: 90%  # FP / (# FP + # TN) = 9 / (9 + 1)
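The individual counts are not spelled out on the slide; tp = 90, fn = 0, fp = 9, tn = 1 is an assumption inferred from the 90 / 0 / 9 / 1 figures shown, but it reproduces all three reported percentages:

    # Recomputing the slide's example metrics from the reconstructed counts
    # (tp, fn, fp, tn inferred from the slide's numbers, not stated explicitly).
    tp, fn, fp, tn = 90, 0, 9, 1

    accuracy            = (tp + tn) / (tp + tn + fp + fn)  # 0.91 -> 91%
    false_negative_rate = fn / (tp + fn)                   # 0.0  -> 0%
    false_positive_rate = fp / (fp + tn)                   # 0.9  -> 90%

With these counts, the model scores 91% accuracy while calling 9 of the 10 actual 0s a 1, which is the kind of mistake an accuracy number alone can hide.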
