Understand Market Structure Through Consumer-Generated Content Mining


Explore how mining consumer-generated content online can provide valuable insights into market structure, the competitive landscape, and consumer preferences. Learn about the challenges and opportunities in analyzing vast amounts of unstructured data to uncover consumers' top-of-mind associative network of products. Discover the methodologies and empirical applications in text mining used to extract meaningful information from forums, blogs, and product reviews.

  • Market Structure
  • Text Mining
  • Consumer Content
  • Competitive Landscape
  • Data Analysis

Presentation Transcript


  1. CS548 Spring 2016: Showcasing work by Netzer, Feldman, Goldenberg, and Fresko on "Mine Your Own Business: Market-Structure Surveillance Through Text Mining". Huayi Zhagn and Haiyan Liang

  2. References: Netzer O, Feldman R, Goldenberg J, Fresko M. Mine Your Own Business: Market-Structure Surveillance Through Text Mining. Marketing Science, 31(3), 521-543. 2012. Conditional random field. Wikipedia: The Free Encyclopedia. Wikimedia Foundation, Inc., 19 Mar 2016. Web. 02 April 2016. <https://en.wikipedia.org/wiki/Conditional_random_field>

  3. Agenda: opportunities and challenges of mining consumer content online; objective of the research; datasets used in the research; text mining methodology; two empirical applications (Sedans Forum and Diabetes Drug Forum)

  4. Mining Consumer-Generated Content: Consumers post abundant information on online media (forums, blogs, product reviews). Firms can use it to gain a better understanding of marketing opportunities, market structure, the competitive landscape, and competitors' products.

  5. Consumer-Generated Content Is Both a Blessing and a Curse: The significant increase in data scale makes information difficult to track and quantify. Consumer data is unstructured and primarily qualitative. Noise can make it impractical to quantify the data and convert it into usable information.

  6. Objective: In the authors' words, utilize large-scale, consumer-generated data on the web to allow firms to understand consumers' top-of-mind associative network of products and the implied market structure insights.

  7. Data Set Used for Sedans Forum: Sedans Forum on Edmunds.com on 02/13/07. Look for co-occurrences between car brands, between car models, and between a car brand or model and a term used to describe it.
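
The co-occurrence counting described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes a co-occurrence means two terms appearing in the same forum message, and the brand list, the messages, and the function name `count_cooccurrences` are all made up for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical brand vocabulary and forum messages, for illustration only.
BRANDS = {"honda", "toyota", "cadillac", "bmw"}

def count_cooccurrences(messages, terms):
    """Count messages mentioning each term and each unordered pair of terms.

    Assumption: the unit of co-occurrence is a single message.
    """
    term_counts = Counter()
    pair_counts = Counter()
    for message in messages:
        tokens = set(message.lower().split())
        present = sorted(tokens & terms)  # sorted so each pair has one key
        term_counts.update(present)
        pair_counts.update(combinations(present, 2))
    return term_counts, pair_counts

messages = [
    "I cross-shopped the Honda and the Toyota",
    "Toyota reliability beats Cadillac",
    "Honda or Toyota for a commute",
    "BMW handling is great",
]
term_counts, pair_counts = count_cooccurrences(messages, BRANDS)
print(term_counts["toyota"], pair_counts[("honda", "toyota")])
```

Real forum text would need proper tokenization and entity extraction first; whitespace splitting is used here only to keep the sketch short.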

  8. Data Set Used for Diabetes Drug Forums: Forums were used to assess consumers' discussions about adverse drug reactions (ADRs).

  9. Authors' Text Mining Methodology: web page downloading; HTML cleaning; information extraction; chunking; identification of semantic relationships
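
As one illustration of the HTML cleaning step, the sketch below strips tags and drops script and style contents using Python's standard `html.parser` module. The slides do not say how the authors cleaned their pages, so the class `TextExtractor`, the helper `clean_html`, and the sample markup are assumptions made for this example.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())

def clean_html(html):
    """Return the visible text of an HTML string as one space-joined line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

# Hypothetical forum page fragment.
sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><p>The <b>Accord</b> rides well.</p>"
    "<script>track()</script></body></html>"
)
cleaned = clean_html(sample)
print(cleaned)
```

A production cleaner would also need to remove navigation, signatures, and quoted replies, which depend on the structure of the particular forum.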

  10. Text Mining Methodology (cont.): Information extraction through a conditional random field (CRF) approach trained on a small, manually tagged training set; a rule-based approach to fine-tune the terms; high recall and precision achieved.

  11. Conditional Random Field in Wikipedia: CRFs are a class of statistical modeling methods often applied in pattern recognition and machine learning, where they are used for structured prediction. Intuitively, a CRF predicts each label in context: it models the probability of a whole label sequence given the observations, so similar contexts lead to similar labels. CRFs can encode known relationships between observations and construct consistent interpretations. They are often used for labeling or parsing sequential data, such as natural language text or biological sequences, and in computer vision.
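
To make the "labeling in context" idea concrete, here is a minimal sketch of Viterbi decoding for a linear-chain model, the inference step typically used with a linear-chain CRF. It covers decoding only, not training, and it is not the authors' model: the label set, the hand-set scores in `EMIT`, and the functions `emission` and `transition` are hypothetical stand-ins for learned weights.

```python
def viterbi(tokens, labels, emission, transition):
    """Return the highest-scoring label sequence for a linear-chain model.

    emission(token, label) and transition(prev_label, label) return additive
    scores; here they are hand-set assumptions, not learned parameters.
    """
    # best[i][y]: best score of any labeling of tokens[:i + 1] ending in y
    best = [{y: emission(tokens[0], y) for y in labels}]
    back = []
    for token in tokens[1:]:
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transition(p, y))
            scores[y] = best[-1][prev] + transition(prev, y) + emission(token, y)
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

LABELS = ["BRAND", "MODEL", "O"]
# Hypothetical scores: "accord" alone is ambiguous between MODEL and O.
EMIT = {("honda", "BRAND"): 2.0, ("accord", "MODEL"): 1.0}

def emission(token, label):
    if (token, label) in EMIT:
        return EMIT[(token, label)]
    return 1.0 if label == "O" else 0.0

def transition(prev_label, label):
    # Reward a model name that directly follows a brand name.
    return 1.0 if (prev_label, label) == ("BRAND", "MODEL") else 0.0

tags = viterbi("my honda accord rides well".split(), LABELS, emission, transition)
print(tags)
```

With these toy scores, "accord" is tagged as a model because it follows a brand, even though its own emission scores for MODEL and O are tied. That dependence on neighboring labels is what separates sequence models from token-by-token classifiers.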

  12. Measures of Co-Occurrence: Lift is the ratio of the actual co-occurrence of two terms to the co-occurrence we would expect if the two terms appeared independently of each other: $\mathrm{Lift}(A, B) = \frac{P(A, B)}{P(A)\,P(B)}$
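
A small sketch of the lift computation, assuming the probabilities are estimated as fractions of messages. The function name `lift` and the counts in the example are made up for illustration.

```python
def lift(n_a, n_b, n_ab, n_total):
    """Lift(A, B) = P(A, B) / (P(A) * P(B)), estimated from message counts.

    n_a, n_b: messages mentioning A and B; n_ab: messages mentioning both;
    n_total: all messages. Algebraically this is n_ab * n_total / (n_a * n_b).
    """
    if n_a <= 0 or n_b <= 0 or n_total <= 0:
        raise ValueError("counts for A, B, and the total must be positive")
    return (n_ab * n_total) / (n_a * n_b)

# Hypothetical counts: 1000 messages, A in 100, B in 50, both in 20.
together = lift(100, 50, 20, 1000)
# Same marginals, but the pair appears only as often as chance predicts.
independent = lift(100, 50, 5, 1000)
print(together, independent)
```

A lift above 1 means the two terms are mentioned together more often than independence would predict, a lift near 1 means no association, and a lift below 1 means they appear together less often than chance.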

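  13. Alternative Measures of Similarity: Jaccard index, $\mathrm{Jaccard}_{ij} = \frac{x_{ij}}{x_i + x_j - x_{ij}}$; Salton cosine, $\mathrm{Cosine}_{ij} = \frac{x_{ij}}{\sqrt{x_i\,x_j}}$; Pearson correlation, $r_{ij} = \mathrm{corr}(X_i, X_j)$; TF-IDF. Here $x_{ij}$ is the number of co-occurrences of terms $i$ and $j$, and $x_i$ and $x_j$ are the numbers of occurrences of each term.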
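
The first two measures on this slide can be sketched as below. The sketch assumes `x_ij` is the number of co-occurrences of terms i and j, `x_i` and `x_j` are their individual occurrence counts, and the Salton cosine takes its usual form `x_ij / sqrt(x_i * x_j)`. Pearson correlation and TF-IDF are left out, and the function names and counts are illustrative.

```python
import math

def jaccard(x_i, x_j, x_ij):
    """Jaccard index: co-occurrences over occurrences of either term."""
    union = x_i + x_j - x_ij
    return x_ij / union if union else 0.0

def salton_cosine(x_i, x_j, x_ij):
    """Salton cosine: co-occurrences over the geometric mean of the counts."""
    denom = math.sqrt(x_i * x_j)
    return x_ij / denom if denom else 0.0

# Hypothetical counts: term i in 4 messages, term j in 9, both in 3.
j_score = jaccard(4, 9, 3)
c_score = salton_cosine(4, 9, 3)
print(j_score, c_score)
```

Both measures are 0 when the terms never co-occur and 1 when they always appear together. Unlike lift, neither depends on the total number of messages.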

  14. Correlations Among Different Measures

  15. Empirical Applications-Sedans Forum

  16. Clustering from Sedans Forum

  17. Validated from Actual Marketing Survey Data

  18. Cadillac Case

  19. Cadillac Case

  20. Commonly Discussed Terms in Sedan Forum

  21. Commonly Discussed Problems in Sedan Forum

  22. Empirical Applications-Diabetes Drug Forums

  23. Conclusions: Use text mining to overcome the difficulties involved in extracting and quantifying online consumer-generated data. Use network analysis tools to convert the mined relationships into co-occurrences among brands or between brands and terms. The proposed approach was validated against actual marketing survey data and formal media data.
