Exploring Distant Reading Techniques: Algorithms, Topic Models, and Issues


Discover distant reading with a focus on algorithms, topic models, and the positive and negative issues they raise. Learn how computers change traditional reading processes and explore the complexities of computer-assisted text analysis.

  • Distant Reading
  • Algorithms
  • Topic Models
  • Text Analysis
  • Technology


Presentation Transcript


  1. Distant Reading with HathiTrust (analytics.hathitrust.org). Presented by: Grant Glass

  2. TABLE OF CONTENTS: Distant Reading Pros/Cons; Word Frequency; Algorithms; Named Entity Recognizer; Topic Models; Reflection Activity

  3. DISTANT READING ISSUES (POSITIVE): How can computers help us understand traditional reading processes in new ways? How can we find new ways of reading through technology? How can we use computers to understand complicated categories like emotions and themes?

  4. DISTANT READING ISSUES (NEGATIVE): How does computer-assisted interpretation undermine the very point of reading? Do these techniques show us anything new, or are they all fancy ways to describe what we already know? How does reading with technology exacerbate racial, social, and economic inequalities?

  5. Algorithms: Making a collection gathers the corpus of material you want to use in one place. It lets you constrain the algorithms' analysis to particular texts. Algorithms: Topic Models, Named Entity Recognizer, Word Frequency.

  6. Algorithm Questions: Word Frequency: What words occurred most commonly in the corpus? If there are titles that include Queen Anne, what were the most frequently occurring terms? Named Entity: What people or places exist in the corpus? Topic Models: What words are closely associated with one another in the corpus?

  7. queensofantiquity.web.unc.edu

  8. WHAT IS A TOPIC MODEL? Topic modeling can be described as a method for finding a group of words (i.e., a topic) from a collection of documents that best represents the information in the collection. It can also be thought of as a form of text mining: a way to obtain recurring patterns of words in textual material.

  9. WHAT IS IT USED FOR? Topic modeling provides us with methods to organize, understand, and summarize large collections of textual information. We can discover hidden topical patterns that are present across the collection. We can annotate documents according to these topics and use these annotations to organize, search, and summarize texts. Example: keywords for journal searching.

  10. INPHO TOPIC MODEL EXPLORER: Creates a new topic model for each number of topics specified; here the models have 20, 40, 60, and 80 topics. Displays a visualization of how topics across models cluster together. This enables a user to see the granularity of the different models and how terms may be grouped together into "larger" topics. Creates an interactive visualization.

  11. NAMED ENTITY RECOGNIZER: Generates a list of all of the names of people and places, as well as dates, times, percentages, and monetary terms, found in a workset. Result of job: a table of the named entities found in the workset.

  12. WORD COUNT: Identifies the tokens (words) that occur most often in a workset and the number of times they occur. Creates a tag cloud visualization of the most frequently occurring words in the workset, where the size of each word is proportional to the number of times it occurred. Removes stop words as specified by the user. Result of job: a tag cloud showing the most frequently occurring words, and a file listing those words and the number of times they occur.

  13. QUEENS OF ANTIQUITY CRITICAL VISUALIZATION: What visualization is the most useful? Why? What does the visualization help you understand about the corpus? What does it obscure? What research questions can you generate from the visualization? Generate one research question based on your observations of the visualization. Answer: How can close reading help answer one of your research questions? What texts will you use to better contextualize the visualization? Why?

  14. Next time: we discuss your research questions and the direction you will take for your response papers.
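
To make the topic-model idea from slides 8 and 10 more concrete, here is a minimal sketch of one common topic-model algorithm, latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling. The slides do not say which algorithm the InPhO Topic Model Explorer uses, so the choice of LDA is an assumption, and the function names (`fit_lda`, `top_words`), the parameter values, and the toy corpus are all hypothetical. Fitting one model per topic count mirrors the "new topic model for each number of topics" step, with 2 and 3 topics standing in for 20, 40, 60, and 80.

```python
import random
from collections import Counter

def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit a toy LDA model by collapsed Gibbs sampling.

    docs is a list of token lists. Returns one Counter per topic holding
    how many times each word is currently assigned to that topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per topic, per document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # total tokens per topic
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Weight each topic by how much this document uses it and
                # how strongly the topic is associated with this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word

def top_words(topic_word, n=4):
    """List the n most frequent words of each topic, skipping zero counts."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]

# Hypothetical, already tokenized workset.
docs = [
    "queen crown throne reign queen".split(),
    "throne crown queen reign court".split(),
    "ship sea voyage storm ship".split(),
    "sea storm voyage ship harbor".split(),
]

# One model per requested number of topics.
models = {k: fit_lda(docs, num_topics=k) for k in (2, 3)}
for k, model in models.items():
    print(k, top_words(model))
```

Each printed row is one "topic": a group of words that tend to occur together. Comparing the rows for different topic counts gives a rough sense of the granularity question raised on slide 10.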
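
The kind of table described on slide 11 can be illustrated with a very small rule-based recognizer. This is only a sketch under stated assumptions: it is not how the HathiTrust Named Entity Recognizer works (the slides do not describe its method, and recognizers of this kind typically rely on trained models). People and places come from a hypothetical lookup list (`GAZETTEER`), and dates, percentages, and monetary terms come from simple patterns. All names and the sample sentence are invented for illustration.

```python
import re

# Hypothetical lookup list of known people and places.
GAZETTEER = {
    "Queen Anne": "PERSON",
    "Zenobia": "PERSON",
    "London": "LOCATION",
    "Palmyra": "LOCATION",
}

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Simple patterns for a few entity types named on the slide.
PATTERNS = {
    "DATE": re.compile(rf"\b(?:{MONTHS})\s+\d{{1,2}},\s+\d{{4}}\b"),
    "PERCENT": re.compile(r"\b\d+(?:\.\d+)?\s?(?:%|percent\b)"),
    "MONEY": re.compile(r"[$£€]\s?\d+(?:,\d{3})*(?:\.\d+)?"),
}

def find_entities(text):
    """Return (entity text, label) pairs in order of appearance."""
    found = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(rf"\b{re.escape(name)}\b", text):
            found.append((match.start(), match.group(), label))
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(), label))
    # Sort by position in the text, then drop the position.
    return [(entity, label) for _, entity, label in sorted(found)]

sample = (
    "Queen Anne died in London on August 1, 1714, "
    "leaving debts of £500, about 12% of the estate."
)
for entity, label in find_entities(sample):
    print(f"{label:10} {entity}")
```

A lookup list only finds names it already contains, which is one reason to treat any entity table as a starting point for close reading and not as a complete inventory.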
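
The Word Count job from slide 12 can be sketched in a few lines: tokenize the workset, drop user-specified stop words, count what remains, and scale each word's display size in proportion to its count. This is a minimal illustration and not the HathiTrust implementation. The stop-word list, the function names, the 48-point maximum size, and the three-sentence workset are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; in the real job the user specifies it.
STOP_WORDS = {"the", "of", "and", "a", "to", "in"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def word_counts(documents, stop_words=STOP_WORDS):
    """Count every token in the workset that is not a stop word."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in stop_words)
    return counts

def tag_cloud_sizes(counts, top_n=10, max_size=48):
    """Give the most frequent word max_size and scale the rest in proportion."""
    top = counts.most_common(top_n)
    if not top:
        return {}
    peak = top[0][1]
    return {word: max_size * n / peak for word, n in top}

# Hypothetical workset of three very short texts.
workset = [
    "The queen of Palmyra ruled the east.",
    "Zenobia, queen of Palmyra, defied Rome.",
    "Rome feared the queen.",
]

counts = word_counts(workset)
sizes = tag_cloud_sizes(counts)
for word, n in counts.most_common(3):
    print(word, n, sizes[word])
```

In this toy workset "queen" occurs three times and would be drawn at the full size, while "palmyra" and "rome" occur twice and would be drawn at two thirds of it. Changing the stop-word list changes which words appear at all, which is worth keeping in mind when asking what a visualization obscures.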
