Ontology Annotation in Translational Science: Interannotator Agreement Study

Explore the challenges of ontology annotation in the Florida Annotated Corpus for Translational Science (FACTS), focusing on interannotator agreement. The study evaluates the effectiveness of ontologies in extracting patient-level data from unstructured text and compares annotation tasks across multiple domains. The slides cover the annotation process, primary goals, and outcomes of the FACTS project, and show where the annotation task becomes difficult.

  • Ontology Annotation
  • Translational Science
  • Interannotator Agreement
  • Corpus Annotation
  • Data Extraction

Presentation Transcript


  1. Measuring Interannotator Agreement in the Florida Annotated Corpus for Translational Science - The difficult ontological task
     Amanda Hicks, University of Florida, aehicks@ufl.edu
     6th Annual CTSOG Workshop, Ann Arbor, MI

  2. Overview
     • Overview of FACTS
     • Interannotator agreement scores
     • The difficult task
     • Can ontologists and philosophers do it better?

  3. The Big Goal
     Comparatively evaluate the adequacy of ontologies for extracting patient-level data from unstructured text.
     1. Create a gold standard, ontologically annotated corpus for clinical and translational science.
     2. Annotate with multiple and, as far as possible, competing ontologies.

  4. The Florida Annotated Corpus for Translational Science (FACTS)
     • Currently consists of 20 annotated case reports on hypertension
     • Full text
     • Freely available through PubMed
     • English
     • Within the last 6 years
     • Stratified by race, ethnicity, gender, and age (<18 or 18+)
     • Annotated with the Vital Sign Ontology (VSO); annotation with the Disease Ontology (DOID) is in progress
     • Can be extended to other domains or document types

  5. The Annotation Tasks
     1. Identify the assertions about a person mentioned in the corpus
     2. Annotate the entities referred to in those assertions with ontology classes
     3. Annotate the relations between individuals, thereby representing the full assertion
     Tasks 1 and 2 are the current tasks.

  6. The Annotation Process
     • Follows the annotation procedure for the CRAFT corpus
     • Two primary annotators (one medical student and one public health specialist with training in nursing) annotated case reports with classes from the Vital Sign Ontology using BRAT
     • Primary annotations were sent to the lead annotator, who reviewed discrepancies
     • Diffs were discussed at weekly meetings and consensus was achieved, producing the gold standard
     • Annotation guidelines were used and revised during the annotation process
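Comparing two annotators' work starts with reading their BRAT output. The following is a minimal sketch, assuming the annotations are stored in BRAT's standoff `.ann` format, where text-bound annotations have IDs starting with `T` and tab-separated fields. The `Span` type, the function name, and the class labels in the sample are hypothetical illustrations, not actual VSO identifiers or FACTS data.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # character offset where the annotation begins
    end: int    # character offset where the annotation ends (exclusive)
    label: str  # annotation type, e.g. an ontology class label


def parse_brat_ann(ann_text: str) -> list[Span]:
    """Collect text-bound annotations (IDs starting with 'T') from standoff text."""
    spans = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # skip relations, attributes, notes, and blank lines
        _ann_id, type_and_offsets, _covered_text = line.split("\t", 2)
        label, offsets = type_and_offsets.split(" ", 1)
        # Discontinuous annotations list fragments separated by ';'.
        # Assumption of this sketch: keep only the outer extent.
        fragments = [fragment.split() for fragment in offsets.split(";")]
        spans.append(Span(int(fragments[0][0]), int(fragments[-1][1]), label))
    return spans


# Hypothetical sample: two text-bound annotations and one relation line.
sample = (
    "T1\tblood_pressure 10 24\tblood pressure\n"
    "T2\tpulse_rate 40 50\tpulse rate\n"
    "R1\trelated Arg1:T1 Arg2:T2\n"
)
parsed = parse_brat_ann(sample)
print(parsed)
```

Each annotator's file would be parsed this way before spans and classes are compared.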

  7. Interannotator Agreement Was Low
     F-measure by case report set:
     Case report set                               Exact matches only   Exact and partial matches
     Hypertension 1 (1st set of 10 case reports)   0.50                 0.54
     Hypertension 2 (2nd set of 10 case reports)   0.60                 0.69
     Full Corpus                                   0.57                 0.60
     The CRAFT corpus achieves ~.90 f-score consistently.
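The two F-measure variants above differ only in what counts as a match between two annotators' spans. The slides do not say how partial matches were scored, so the sketch below assumes that a span is partially matched when it overlaps any span from the other annotator. The function names and offsets are hypothetical.

```python
from typing import Callable

Span = tuple[int, int]  # (start, end) character offsets, end exclusive


def is_exact(a: Span, b: Span) -> bool:
    """Spans match only if both boundaries are identical."""
    return a == b


def is_overlap(a: Span, b: Span) -> bool:
    """Spans match if they share at least one character (covers exact matches too)."""
    return a[0] < b[1] and b[0] < a[1]


def f_measure(
    reference: list[Span],
    candidate: list[Span],
    match: Callable[[Span, Span], bool] = is_exact,
) -> float:
    """Harmonic mean of precision and recall, treating one annotator as reference."""
    if not reference or not candidate:
        return 0.0
    matched_candidate = sum(any(match(r, c) for r in reference) for c in candidate)
    matched_reference = sum(any(match(r, c) for c in candidate) for r in reference)
    precision = matched_candidate / len(candidate)
    recall = matched_reference / len(reference)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical spans: one identical pair, one overlapping pair, the rest unmatched.
annotator_a = [(0, 5), (10, 24), (40, 50)]
annotator_b = [(0, 5), (12, 24), (60, 70), (80, 85)]

exact_f = f_measure(annotator_a, annotator_b)
partial_f = f_measure(annotator_a, annotator_b, is_overlap)
print(f"exact matches only: {exact_f:.2f}")
print(f"exact and partial matches: {partial_f:.2f}")
```

Swapping which annotator is the reference exchanges precision and recall, so the F-measure stays the same. This is why it is usable as an agreement score when neither annotator is the gold standard.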

  8. What happened?
     • One annotator had more training and experience than the other.
     • However, IAA on Hypertension 2 is still quite low (0.60-0.69).
     • Our annotators performed two tasks, unlike CRAFT annotators:
       1. Identify the instance-level assertions about an individual person mentioned in the corpus
       2. Annotate the entities referred to in those assertions with ontology classes

  9. What is the major source of disagreement?
     F-measure by case report set:
     Case report set   Exact matches only   Exact and partial matches   Class agreement on matched spans only
     Hypertension 1    0.50                 0.54                        0.93
     Hypertension 2    0.60                 0.69                        0.87
     Full Corpus       0.57                 0.60                        0.90
     When the primary annotators agree on the span, they tend to agree on the class. This suggests that the difficult task is determining whether a token expresses an instance-level assertion.
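The third measure above can be sketched as follows, assuming that "matched spans" means spans with identical offsets in both annotators' output. Once the comparison is restricted to shared spans, precision and recall have the same denominator, so the F-measure reduces to the fraction of shared spans that carry the same class. The function name, offsets, and class labels are hypothetical.

```python
from typing import Optional

Span = tuple[int, int]  # (start, end) character offsets


def class_agreement_on_matched_spans(
    ann_a: dict[Span, str], ann_b: dict[Span, str]
) -> Optional[float]:
    """Fraction of spans marked by both annotators that received the same class."""
    shared = ann_a.keys() & ann_b.keys()
    if not shared:
        return None  # no matched spans, so class agreement is undefined
    agreeing = sum(ann_a[span] == ann_b[span] for span in shared)
    return agreeing / len(shared)


# Hypothetical annotations: three shared spans, one with a different class,
# plus one span that only annotator B marked (ignored by this measure).
annotator_a = {
    (0, 5): "blood_pressure",
    (10, 24): "blood_pressure",
    (40, 50): "pulse_rate",
}
annotator_b = {
    (0, 5): "blood_pressure",
    (10, 24): "systolic_blood_pressure",
    (40, 50): "pulse_rate",
    (60, 70): "temperature",
}

agreement = class_agreement_on_matched_spans(annotator_a, annotator_b)
print(f"class agreement on matched spans: {agreement:.2f}")
```

Separating span agreement from class agreement in this way is what lets the two sources of disagreement be told apart.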

  10. Easy Cases
      • "A 56-year-old man suddenly developed dyspnea after resection of choroidal melanoma"
      • "Previous studies noted a significant association between melanoma and endothelin (ET)-1."
      Sato K, Saji T, Kaneko T, Takahashi K, Sugi K. Unexpected pulmonary hypertensive crisis after surgery for ocular malignant melanoma. Life Sci. 2014;118(2):420-3. Epub 2014/03/19. doi: 10.1016/j.lfs.2014.03.004. PubMed PMID: 24632478.

  11. Difficult cases
      Which terms denote individuals and which do not? The subordinate clauses make this an interesting case.
      • "First, a massive amount of ET-1, which is a proliferation factor in malignant melanoma, was released due to mechanical stimulation from endoresection, a procedure in which the tumor is cut into very small fragments and aspirated (Fig. 6)."
      • "As this is the first report of pulmonary hypertension after endoresection, it might be useful to determine the differences between our patient and other patients treated with endoresection."
      We decided that 'pulmonary hypertension' denotes a particular, but it is not clear to me that this is correct.
      Sato K, Saji T, Kaneko T, Takahashi K, Sugi K. Unexpected pulmonary hypertensive crisis after surgery for ocular malignant melanoma. Life Sci. 2014;118(2):420-3. Epub 2014/03/19. doi: 10.1016/j.lfs.2014.03.004. PubMed PMID: 24632478.

  12. Next steps
      Can ontologists and philosophers agree more than the specialist annotators on the hard task?
      • We will have working ontologists annotate a sample set of case reports for instance-level statements.
      • We will have philosophy graduate students annotate a sample set of case reports for instance-level statements.

  13. Acknowledgments
      • Selja Seppälä, University of Cork
      • Bill Hogan, University of Florida
      • Carl Pepine, University of Florida
      • Nathan Boire, University of Florida
      • Chloe Herring, University of Florida
      This work was supported in part by the NIH/NCATS Clinical and Translational Science Award to the University of Florida UL1 TR000064. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the NCTE.
