
Precision and Recall in Information Retrieval Techniques
Learn about the concepts of precision and recall in information retrieval as presented by Dr. Adnan Abid. The lecture covers computing recall/precision points, interpolating recall/precision curves, averaging curves over a set of queries, and related measures such as R-Precision, Precision@K, the F-measure, and the E measure.
Presentation Transcript
INFORMATION RETRIEVAL TECHNIQUES BY DR. ADNAN ABID
Lecture # 26: Precision and Recall
ACKNOWLEDGEMENTS
The presentation of this lecture has been taken from the following sources:
1. Introduction to Information Retrieval by Prabhakar Raghavan, Christopher D. Manning, and Hinrich Schütze
2. Managing Gigabytes by Ian H. Witten, Alistair Moffat, and Timothy C. Bell
3. Modern Information Retrieval by Ricardo Baeza-Yates
4. Web Information Retrieval by Stefano Ceri, Alessandro Bozzon, and Marco Brambilla
Outline
- Computing Recall/Precision
- Interpolating a Recall/Precision Curve
- Average Recall/Precision Curve
- R-Precision
- Precision@K
- F-Measure
- E Measure
Computing Recall/Precision Points: Example 1
Let total # of relevant docs = 6 and check precision at each new recall point:
 n   doc #   relevant
 1    588       x      R = 1/6 = 0.167; P = 1/1 = 1.0
 2    589       x      R = 2/6 = 0.333; P = 2/2 = 1.0
 3    576
 4    590       x      R = 3/6 = 0.5;   P = 3/4 = 0.75
 5    986
 6    592       x      R = 4/6 = 0.667; P = 4/6 = 0.667
 7    984
 8    988
 9    578
10    985
11    103
12    591
13    772       x      R = 5/6 = 0.833; P = 5/13 = 0.38
14    990
One relevant document is missing from the ranking, so 100% recall is never reached.
Computing Recall/Precision Points: Example 2
Let total # of relevant docs = 6 and check precision at each new recall point:
 n   doc #   relevant
 1    588       x      R = 1/6 = 0.167; P = 1/1 = 1.0
 2    576
 3    589       x      R = 2/6 = 0.333; P = 2/3 = 0.667
 4    342
 5    590       x      R = 3/6 = 0.5;   P = 3/5 = 0.6
 6    717
 7    984
 8    772       x      R = 4/6 = 0.667; P = 4/8 = 0.5
 9    321       x      R = 5/6 = 0.833; P = 5/9 = 0.556
10    498
11    113
12    628
13    772
14    592       x      R = 6/6 = 1.0;   P = 6/14 = 0.429
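The recall/precision points in these examples can be reproduced with a few lines of code. The sketch below is not part of the lecture; the function name is arbitrary, and because the sixth relevant document of Example 1 is never retrieved, a made-up ID (999) stands in for it.

```python
def recall_precision_points(ranking, relevant):
    """Return (recall, precision) at every rank where a relevant document appears."""
    points = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    return points

# Example 1: relevant documents appear at ranks 1, 2, 4, 6, and 13.
# The sixth relevant document is never retrieved; 999 is a made-up ID for it.
ranking = [588, 589, 576, 590, 986, 592, 984, 988, 578, 985, 103, 591, 772, 990]
relevant = {588, 589, 590, 592, 772, 999}
for r, p in recall_precision_points(ranking, relevant):
    print(f"R = {r:.3f}; P = {p:.3f}")
```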
Interpolating a Recall/Precision Curve
Interpolate a precision value for each standard recall level:
r_j ∈ {0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}, i.e. r_0 = 0.0, r_1 = 0.1, ..., r_10 = 1.0
The interpolated precision at the j-th standard recall level is the maximum known precision at any recall level between the j-th and (j+1)-th level:
P(r_j) = max over r_j ≤ r ≤ r_{j+1} of P(r)
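As a rough illustration (not taken from the lecture), the sketch below computes interpolated precision at the 11 standard recall levels. It follows the widely used convention of taking the maximum precision at any observed recall greater than or equal to r_j, which differs slightly from the between-adjacent-levels wording above but is how this interpolation is usually implemented.

```python
def interpolate_11_point(rp_points):
    """Interpolated precision at the standard recall levels 0.0, 0.1, ..., 1.0.

    rp_points: (recall, precision) pairs for one query.
    Convention: interpolated P(r_j) is the maximum precision over all
    observed points whose recall is >= r_j (0.0 if no such point exists).
    """
    levels = [j / 10 for j in range(11)]
    interpolated = []
    for r_j in levels:
        candidates = [p for r, p in rp_points if r >= r_j]
        interpolated.append(max(candidates) if candidates else 0.0)
    return levels, interpolated

# Recall/precision points from Example 2 above
points = [(0.167, 1.0), (0.333, 0.667), (0.5, 0.6),
          (0.667, 0.5), (0.833, 0.556), (1.0, 0.429)]
for r_j, p in zip(*interpolate_11_point(points)):
    print(f"recall {r_j:.1f}: interpolated precision {p:.3f}")
```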
[Figures: Interpolating a Recall/Precision Curve, Examples 1 and 2. Several slides plot the interpolated curves for the two examples, with precision on the vertical axis and recall on the horizontal axis, both ranging from 0.0 to 1.0.]
Average Recall/Precision Curve
Typically average performance over a large set of queries. Compute average precision at each standard recall level across all queries, then plot the average precision/recall curve to evaluate overall system performance on a document/query corpus.
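A minimal sketch of this averaging step, assuming each query has already been reduced to its 11 interpolated precision values (the two example curves are the ones the earlier interpolation sketch produces for Examples 1 and 2, rounded to three decimals):

```python
def average_11_point(per_query_precisions):
    """Average interpolated precision at each of the 11 standard recall levels.

    per_query_precisions: one 11-element list of precisions per query.
    """
    n_queries = len(per_query_precisions)
    return [sum(query[j] for query in per_query_precisions) / n_queries
            for j in range(11)]

# Interpolated 11-point curves for Examples 1 and 2 (rounded)
q1 = [1.0, 1.0, 1.0, 1.0, 0.75, 0.75, 0.667, 0.385, 0.385, 0.0, 0.0]
q2 = [1.0, 1.0, 0.667, 0.667, 0.6, 0.6, 0.556, 0.556, 0.556, 0.429, 0.429]
print(average_11_point([q1, q2]))
```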
Compare Two or More Systems
The curve closest to the upper right-hand corner of the graph indicates the best performance.
[Figure: Average precision/recall curves for two systems, "Stem" and "NoStem"; precision (0 to 1) plotted against recall (0.1 to 1).]
R-Precision
Precision at the R-th position in the ranking of results for a query that has R relevant documents.
Example: using the ranking of Example 1 (relevant documents at ranks 1, 2, 4, 6, and 13), R = # of relevant docs = 6. Four of the top R = 6 documents are relevant, so R-Precision = 4/6 = 0.67.
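A short illustrative sketch of R-Precision on Example 1's ranking (again with the made-up ID 999 standing in for the relevant document that is never retrieved):

```python
def r_precision(ranking, relevant):
    """Precision at rank R, where R is the total number of relevant documents."""
    r = len(relevant)
    hits = sum(1 for doc in ranking[:r] if doc in relevant)
    return hits / r

# Same ranking as Example 1; 999 stands in for the relevant doc never retrieved.
ranking = [588, 589, 576, 590, 986, 592, 984, 988, 578, 985, 103, 591, 772, 990]
relevant = {588, 589, 590, 592, 772, 999}
print(r_precision(ranking, relevant))  # 4/6 = 0.67
```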
Precision@K
- Set a rank threshold K.
- Compute the fraction of relevant documents in the top K.
- Documents ranked lower than K are ignored.
- Example: Prec@3 = 2/3, Prec@4 = 2/4, Prec@5 = 3/5 (e.g., for a ranking whose relevant documents sit at ranks 1, 3, and 5).
- In a similar fashion we have Recall@K.
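The sketch below (illustrative only, not from the lecture) computes Precision@K and Recall@K; the five-document ranking is hypothetical, chosen so that its Prec@3, Prec@4, and Prec@5 match the numbers quoted above.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k

def recall_at_k(ranking, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / len(relevant)

# Hypothetical ranking with relevant documents at ranks 1, 3, and 5
ranking = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d5"}
print(precision_at_k(ranking, relevant, 3))  # 2/3
print(precision_at_k(ranking, relevant, 4))  # 2/4
print(precision_at_k(ranking, relevant, 5))  # 3/5
```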
F-Measure
One measure of performance that takes into account both recall and precision. It is the harmonic mean of recall and precision:
F = 2PR / (P + R) = 2 / (1/R + 1/P)
Compared to the arithmetic mean, both P and R need to be high for the harmonic mean to be high.
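A one-function sketch of the F-measure with arbitrary example values (0.75 and 0.5), including a comparison against the arithmetic mean to show that the harmonic mean stays low unless both precision and recall are high:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Both values must be high for F to be high: compare with the arithmetic mean.
print(f_measure(0.75, 0.5))   # 0.6
print((0.75 + 0.5) / 2)       # 0.625
```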
E Measure (parameterized F Measure)
A variant of the F measure that allows weighting emphasis on precision over recall:
E = (1 + β²)PR / (β²P + R) = (1 + β²) / (β²/R + 1/P)
The value of β controls the trade-off:
- β = 1: equally weight precision and recall (E = F).
- β > 1: weight recall more.
- β < 1: weight precision more.
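A sketch of the parameterized measure as defined on this slide (the example values are arbitrary):

```python
def e_measure(precision, recall, beta=1.0):
    """Parameterized F as on this slide: beta > 1 favours recall, beta < 1 favours precision."""
    b2 = beta ** 2
    if b2 * precision + recall == 0:
        return 0.0
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(e_measure(0.75, 0.5, beta=1))  # equals F = 0.6
print(e_measure(0.75, 0.5, beta=2))  # about 0.536: pulled towards the lower recall
```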
Combined Measures
[Figure: Several ways of combining precision and recall (minimum, maximum, arithmetic mean, geometric mean, harmonic mean), plotted from 0 to 100 as a function of precision, with recall fixed at 70%.]