Exploring Link Semantics for Topic Distributions on the Web

This presentation examines why web users create links with different intentions, and how understanding the category and influence of each link can benefit applications such as expert finding and friend recommendation. It covers topic distribution analysis over citations, the link semantic analysis problem, previous work on link influence and social network analysis, the Pairwise Restricted Boltzmann Machine (PRBM) approach, experimental results, and future work.

  • Web Semantics
  • Topic Distribution
  • Link Analysis
  • Research Field
  • Pairwise Restricted Boltzmann Machines

Presentation Transcript


  1. Topic Distributions over Links on Web. Jie Tang (1), Jing Zhang (1), Jeffrey Xu Yu (2), Zi Yang (1), Keke Cai (3), Rui Ma (3), Li Zhang (3), and Zhong Su (3). Affiliations: (1) Tsinghua University; (2) Chinese University of Hong Kong; (3) IBM, China Research Lab. Dec. 7th, 2009.

  2. Motivation. Web users create links with significantly different intentions. Understanding the category and the influence of each link can benefit many applications, e.g., expert finding, collaborator finding, and new-friend recommendation.

  3. Examples. Topic distribution analysis over citations: how can Researcher A gain an in-depth understanding of the research field? [Figure: an original citation network vs. a semantic citation network. Papers shown: Introduction of Modern Information Retrieval; An Inverted Index Implementation; Document Retrieval with Frequency-Sorted Indexes; Parameterised Compression for Sparse Bitmaps; Memory Efficient Ranking; Signature files: An access Method for Documents and its Analytical Performance Evaluation; Self-Indexing Inverted Files for Fast Text Retrieval; A Document-centric Approach to Static Index Pruning in Text Retrieval Systems; Vector-space Ranking with Effective Early Termination; Efficient Document Retrieval in Main Memory; Static Index Pruning for Information Retrieval Systems. Topics: Topic 1: Theory; Topic 21: Framework; Topic 22: Compression; Topic 23: Index method; Topic 27: Information retrieval; Topic 31: Ranking and Inverted Index; Topic 34: Parallel computing; Other. Citation relationship types: Basic theory; Comparable work; Other.]

  4. Problem: Link Semantic Analysis. Topic modeling over links; citation context words; link semantics.

  5. Outline. Previous Work; Our Approach: Pairwise Restricted Boltzmann Machines (PRBMs); Experimental Results; Conclusion & Future Work.

  6. Previous Work. Link influence analysis: citation influence topic [Dietz, 07]; social influence analysis [Crandall, 08; Tang, 09]. Social network analysis: social network analysis [Wasserman, 94]; web community discovery [Newman, 04]; small world networks [Watts, 98]. Graphical models: Probabilistic LSI [Hofmann, 99]; Latent Dirichlet Allocation [Blei, 03]; Restricted Boltzmann machines [Welling, 01].

  7. Outline. Previous Work; Our Approach: Pairwise Restricted Boltzmann Machines (PRBMs); Experimental Results; Conclusion & Future Work.

  8. Pairwise Restricted Boltzmann Machines (PRBMs). [Figure: example PRBM. Labels: link category; latent variables defined over the link to bridge the two pages; topic distribution; link context words.]

  9. Formalization of PRBMs. Formalization and objective function with PRBMs [equations not preserved in the transcript].
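
As background only: the slide's own equations are not available here, so the block below shows the textbook definition of a standard (non-pairwise) binary RBM. The symbols v, h, a, b, W, and Z are the conventional ones and are assumptions, not necessarily the authors' notation; the PRBM objective itself is not reproduced.

```latex
% Standard binary RBM (background; not the PRBM formulation from the slides)
% v: visible units, h: hidden units, a/b: visible/hidden biases, W: weights
\begin{align}
  E(\mathbf{v}, \mathbf{h}) &= -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j \\
  P(\mathbf{v}, \mathbf{h}) &= \frac{1}{Z} \exp\bigl(-E(\mathbf{v}, \mathbf{h})\bigr),
  \qquad Z = \sum_{\mathbf{v}, \mathbf{h}} \exp\bigl(-E(\mathbf{v}, \mathbf{h})\bigr)
\end{align}
```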

  10. Model Learning. Generative learning: involves an expectation w.r.t. the data distribution and an expectation w.r.t. the distribution defined by the model; we use Contrastive Divergence to learn the model distribution PM. Discriminative learning: objective function [equation not preserved in the transcript]. Hybrid learning.
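
To make the contrastive divergence step concrete, here is a minimal sketch of one-step contrastive divergence (CD-1) for a plain binary RBM. It is not the PRBM from the slides: the function names, learning rate, and toy data are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b_h):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b_h))]


def visible_probs(h, W, b_v):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b_v[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b_v))]


def cd1_update(v0, W, b_v, b_h, lr, rng):
    """One CD-1 step on a single training vector; updates parameters in place.

    Returns the squared reconstruction error for monitoring.
    """
    # Positive phase: hidden statistics driven by the data vector.
    ph0 = hidden_probs(v0, W, b_h)
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]
    # Negative phase: one Gibbs step approximates the model expectation.
    pv1 = visible_probs(h0, W, b_v)
    ph1 = hidden_probs(pv1, W, b_h)
    # Update = data expectation minus (approximate) model expectation.
    for i in range(len(b_v)):
        for j in range(len(b_h)):
            W[i][j] += lr * (v0[i] * ph0[j] - pv1[i] * ph1[j])
        b_v[i] += lr * (v0[i] - pv1[i])
    for j in range(len(b_h)):
        b_h[j] += lr * (ph0[j] - ph1[j])
    return sum((a - b) ** 2 for a, b in zip(v0, pv1))


def train_rbm(data, n_hidden, epochs=200, lr=0.1, seed=0):
    """Train on a list of binary vectors; returns (W, b_v, b_h, errors)."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
         for _ in range(n_visible)]
    b_v = [0.0] * n_visible
    b_h = [0.0] * n_hidden
    errors = []
    for _ in range(epochs):
        total = sum(cd1_update(v, W, b_v, b_h, lr, rng) for v in data)
        errors.append(total / len(data))
    return W, b_v, b_h, errors


if __name__ == "__main__":
    # Hypothetical toy data: a single binary pattern.
    W, b_v, b_h, errors = train_rbm([[1.0, 1.0, 0.0, 0.0]], n_hidden=2)
    print("first/last reconstruction error:", errors[0], errors[-1])
```

The negative phase here uses reconstruction probabilities rather than sampled visible states, a common variance-reduction choice; sampling them instead would also be a valid CD-1 variant.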

  11. Link Semantic Analysis. Link category annotation: first we calculate [equation not preserved in the transcript]; then we estimate the probability p(c|e) by a mean field algorithm. Link influence estimation: estimate influence by KL divergence; an alternative way is to generate the influence score by a Gaussian distribution [equation not preserved in the transcript].
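
The KL divergence mentioned above can be sketched as follows. The slide does not say which two distributions are compared, so this example assumes, purely for illustration, that each argument is a discrete topic distribution (for instance one per linked page); the function name and the sample values are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length.

    Terms with p_i == 0 contribute 0 by convention; q_i is floored at eps
    (an assumption of this sketch) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0.0)


if __name__ == "__main__":
    # Hypothetical topic distributions for two linked pages.
    source_topics = [0.5, 0.5]
    target_topics = [0.25, 0.75]
    print(kl_divergence(source_topics, target_topics))
```

KL divergence is asymmetric, so KL(p || q) and KL(q || p) generally differ; which direction (or a symmetrized form) suits an influence score is a modeling decision the transcript does not specify.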

  12. Outline. Previous Work; Our Approach: Pairwise Restricted Boltzmann Machines (PRBMs); Experimental Results; Conclusion & Future Work.

  13. Experimental Setting. Data sets: Arnetminer data (978,504 papers, 14M citations); Wikipedia (14K article pages and 25K links). Evaluation measures: link categorization accuracy; topical analysis. Baselines: SVM+LDA; SVM+RBM.

  14. Accuracy of Link Categorization. gPRBM: our approach with generative learning; dPRBM: our approach with discriminative learning; hPRBM: our approach with hybrid learning.

  15. Category-Topic Mixture

  16. Example Analysis

  17. Outline. Previous Work; Our Approach: Pairwise Restricted Boltzmann Machines (PRBMs); Experimental Results; Conclusion & Future Work.

  18. Conclusion & Future Work. Concluding remarks: we investigate the problem of quantifying link semantics on the Web and propose Pairwise Restricted Boltzmann Machines (PRBMs) to solve it. Future work: semantic analysis over social relationships; correlation between link semantics and information propagation.

  19. Thanks! Q&A. HP: http://keg.cs.tsinghua.edu.cn/persons/tj/
