Learning Social Knowledge Graphs with Multi-Modal Bayesian Embeddings


This presentation introduces GenVector, a multi-modal Bayesian embedding model for learning social knowledge graphs: given a social network, social text, and a knowledge base, it infers a ranked list of concepts (e.g., research interests) for each user, addressing the challenge of connecting the user and concept modalities.

  • Social Knowledge Graphs
  • Bayesian Embeddings
  • Deep Learning
  • NLP
  • Multi-Modal Approach


Presentation Transcript


  1. Multi-Modal Bayesian Embeddings for Learning Social Knowledge Graphs. Zhilin Yang¹,², Jie Tang¹, William W. Cohen² (¹Tsinghua University, ²Carnegie Mellon University)

  2. AMiner: an academic social network. [Screenshot: a researcher profile showing research interests]

  3. Text-Based Approach: infer research interests from a researcher's list of publications.

  4. Text-Based Approach: ranking by term frequency surfaces generic phrases such as "challenging problem", while TF-IDF surfaces rare but irrelevant phrases such as "line drawing"; neither reliably yields research interests.
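As an illustration, here is a minimal sketch of the two failure modes, using scikit-learn on invented toy abstracts (not the paper's pipeline or data):

```python
# A minimal sketch (invented toy abstracts, not the paper's pipeline) of why
# both rankings fail: raw frequency favors generic phrases, while TF-IDF
# favors rare but off-topic phrases.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

abstracts = [
    "named entity recognition is a challenging problem in natural language processing",
    "deep learning for natural language processing is a challenging problem",
    "a challenging problem: parsing noisy text with recurrent networks",
    "we also discuss line drawing as a side application",
]

for Vec, name in [(CountVectorizer, "TF"), (TfidfVectorizer, "TF-IDF")]:
    vec = Vec(ngram_range=(2, 2))                      # score bigram phrases
    scores = vec.fit_transform(abstracts).sum(axis=0).A1
    top = sorted(zip(vec.get_feature_names_out(), scores), key=lambda x: -x[1])
    print(name, "top phrases:", [p for p, _ in top[:3]])
```

Neither ranking has any notion of which phrases are actual concepts, which motivates the knowledge-driven approach on the next slide.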

  5. Knowledge-Driven Approach: infer research interests from the list of publications by grounding terms in a knowledge base of concepts (e.g., Artificial Intelligence, Machine Learning, Data Mining, Association Rules, Clustering).

  6. Problem: Learning Social Knowledge Graphs. [Example: a social network of users Mike, Kevin, Jane, and Jing; social text such as "Deep Learning for NLP" and "Recurrent networks for NER"; knowledge-base concepts Deep Learning and Natural Language Processing]

  7. Problem: Learning Social Knowledge Graphs. The input combines three sources: a knowledge base, social text, and the social network structure. [Same example figure]

  8. Problem: Learning Social Knowledge Graphs. The goal is to infer a ranked list of concepts for each user, e.g., Kevin: Deep Learning, Natural Language Processing; Jing: Recurrent Networks, Named Entity Recognition.

  9. Challenges: the data spans two modalities, users and concepts. How can we leverage information from both modalities, and how can we connect them?

  10. Approach: learn concept embeddings and user embeddings, then feed both into the model to produce the social knowledge graph (see the sketch below).
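The slides do not specify how the embeddings are pre-trained; the following is a hedged sketch assuming skip-gram (gensim's Word2Vec) over per-document concept sequences for the concept embeddings, with user embeddings formed by averaging concept vectors, purely for illustration:

```python
# A plausible pre-training sketch, NOT the paper's exact recipe:
# skip-gram embeddings for concepts, and user embeddings formed by
# averaging the concept vectors appearing in each user's documents.
import numpy as np
from gensim.models import Word2Vec

# Toy corpus: each "sentence" is the concept sequence of one document.
docs = {
    "kevin": [["deep_learning", "natural_language_processing"]],
    "jing":  [["recurrent_networks", "named_entity_recognition"]],
}
sentences = [s for user_docs in docs.values() for s in user_docs]

w2v = Word2Vec(sentences, vector_size=50, window=5, min_count=1, sg=1, epochs=50)

def user_embedding(user):
    # Average of concept vectors over all of the user's documents.
    vecs = [w2v.wv[c] for s in docs[user] for c in s]
    return np.mean(vecs, axis=0)

f_r = {u: user_embedding(u) for u in docs}          # user embeddings
f_k = {c: w2v.wv[c] for c in w2v.wv.index_to_key}   # concept embeddings
```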

  11. Model. [Plate diagram: topic distribution θ with Dirichlet prior α; per-topic Gaussian parameters (μr, λr) for the user-embedding space and (μk, λk) for the concept-embedding space; topic assignments y (users) and z (concepts); embeddings fr (user) and fk (concept); plates over D documents, T topics, and M items per document]

  12. Model: each topic defines a Gaussian distribution over user embeddings and a Gaussian distribution over concept embeddings; sharing topics across the two embedding spaces aligns users and concepts. [Same plate diagram]
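In symbols (notation reconstructed from the plate diagram; the indexing conventions are assumptions):

```latex
% Per-topic Gaussians over the two embedding spaces (reconstructed notation):
% a user with topic assignment y and a concept with topic assignment z satisfy
f_r \mid y \sim \mathcal{N}\!\left(\mu^{(r)}_{y},\, (\lambda^{(r)}_{y})^{-1}\right),
\qquad
f_k \mid z \sim \mathcal{N}\!\left(\mu^{(k)}_{z},\, (\lambda^{(k)}_{z})^{-1}\right)
% with Normal-Gamma priors on each (mu, lambda) pair, so users and concepts
% assigned to the same topic are pulled toward that topic's Gaussians.
```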

  13. Inference and Learning: collapsed Gibbs sampling. Iterate between: 1. sample latent variables.

  14. Inference and Learning: collapsed Gibbs sampling. Iterate between: 1. sample latent variables; 2. update parameters.

  15. Inference and Learning: collapsed Gibbs sampling. Iterate between: 1. sample latent variables; 2. update parameters; 3. update embeddings. A runnable toy version of the loop follows below.
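Here is that toy version, heavily simplified (one embedding space, unit variance, symmetric Dirichlet prior); it is a sketch of the alternating structure, not the paper's derivation:

```python
# Toy three-step training loop: alternate Gibbs sampling of topic assignments
# with point updates of topic parameters and of the embeddings themselves.
import numpy as np

rng = np.random.default_rng(0)
T, dim, alpha, lr = 3, 5, 1.0, 0.1

docs = [rng.normal(size=(6, dim)) for _ in range(4)]   # concept embeddings
z = [rng.integers(T, size=6) for _ in docs]            # topic assignments
mu = rng.normal(size=(T, dim))                         # per-topic means

def log_gauss(x, m):
    # Log-density of an isotropic unit-variance Gaussian, up to a constant.
    return -0.5 * np.sum((x - m) ** 2)

for it in range(50):
    # 1. Sample latent variables: the collapsed conditional combines the
    #    document's topic counts with the Gaussian likelihood of the embedding.
    for fk, zd in zip(docs, z):
        for i in range(len(fk)):
            counts = np.bincount(np.delete(zd, i), minlength=T)
            logp = np.log(counts + alpha) + np.array(
                [log_gauss(fk[i], mu[t]) for t in range(T)])
            p = np.exp(logp - logp.max())
            zd[i] = rng.choice(T, p=p / p.sum())
    # 2. Update parameters: re-estimate each topic mean from its embeddings.
    allf, allz = np.vstack(docs), np.concatenate(z)
    for t in range(T):
        if np.any(allz == t):
            mu[t] = allf[allz == t].mean(axis=0)
    # 3. Update embeddings: move each embedding toward its topic's mean
    #    (unlike LDA, the observed variables are continuous and trainable).
    for fk, zd in zip(docs, z):
        fk += lr * (mu[zd] - fk)
```

Step 3 is what distinguishes this model from standard LDA inference: the observations themselves move during learning.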

  16. AMiner Research Interest Dataset: 644,985 researchers; concept terms drawn from these researchers' publications and filtered with Wikipedia. Evaluation: homepage matching (1,874 researchers, using homepages as ground truth) and LinkedIn matching (113 researchers, using LinkedIn skills as ground truth). Code and data available: https://github.com/kimiyoung/genvector

  17. Homepage Matching. Using homepages as ground truth.

      Method        Precision@5   Description
      GenVector     78.1003%      Our model
      GenVector-E   77.8548%      Our model w/o embedding update
      Sys-Base      73.8189%      AMiner baseline: key term extraction
      Author-Topic  74.4397%      Classic topic models
      NTN           65.8911%      Neural tensor networks
      CountKG       54.4823%      Rank by frequency
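For reference, precision@5 as commonly defined (the paper's exact matching protocol may differ; the example lists are invented):

```python
def precision_at_k(predicted, ground_truth, k=5):
    """Fraction of the top-k predicted concepts found in the ground truth."""
    return sum(c in ground_truth for c in predicted[:k]) / k

# Example with the slides' running example for Kevin:
print(precision_at_k(
    ["deep learning", "natural language processing", "line drawing",
     "speech recognition", "novel approach"],
    {"deep learning", "natural language processing", "speech recognition"}))
# 0.6
```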

  18. LinkedIn Matching. Using LinkedIn skills as ground truth.

      Method        Precision@5   Description
      GenVector     50.4424%      Our model
      GenVector-E   49.9145%      Our model w/o embedding update
      Author-Topic  47.6106%      Classic topic models
      NTN           42.0512%      Neural tensor networks
      CountKG       46.8376%      Rank by frequency

  19. Error Rate of Irrelevant Cases. Manually label terms that are clearly NOT research interests, e.g., "challenging problem".

      Method        Error rate   Description
      GenVector     1.2%         Our model
      Sys-Base      18.8%        AMiner baseline: key term extraction
      Author-Topic  1.6%         Classic topic models
      NTN           7.2%         Neural tensor networks

  20. Qualitative Study: Top Concepts within Topics (* marks clearly irrelevant concepts).

      GenVector: Query expansion, Concept mining, Language modeling, Information extraction, Knowledge extraction, Entity linking, Language models, Named entity recognition, Document clustering, Latent semantic indexing
      Author-Topic: Speech recognition, Natural language, *Integrated circuits, Document retrieval, Language models, Language model, *Microphone array, Computational linguistics, *Semidefinite programming, Active learning

  21. Qualitative Study: Top Concepts within Topics (* marks clearly irrelevant concepts).

      GenVector: Image processing, Face recognition, Feature extraction, Computer vision, Image segmentation, Image analysis, Feature detection, Digital image processing, Machine learning algorithms, Machine vision
      Author-Topic: Face recognition, *Food intake, Face detection, Image recognition, *Atmospheric chemistry, Feature extraction, Statistical learning, Discriminant analysis, Object tracking, *Human factors

  22. Qualitative Study: Research Interests (* marks clearly irrelevant terms).

      GenVector: Feature extraction, Image segmentation, Image matching, Image classification, Face recognition
      Sys-Base: Face recognition, Face image, *Novel approach, *Line drawing, Discriminant analysis

  23. Qualitative Study: Research Interests (* marks clearly irrelevant terms).

      GenVector: Unsupervised learning, Feature learning, Bayesian networks, Reinforcement learning, Dimensionality reduction
      Sys-Base: *Challenging problem, Reinforcement learning, *Autonomous helicopter, *Autonomous helicopter flight, Near-optimal planning

  24. Online Test: A/B test with live users, mixing our results with Sys-Base results.

      Method      Error rate
      GenVector   3.33%
      Sys-Base    10.00%

  25. Other Social Networks? The same three inputs apply beyond AMiner: a knowledge base, social text, and social network structure. [Same example figure as slide 7]

  26. Conclusion: we study a novel problem (learning social knowledge graphs); propose a model (multi-modal Bayesian embedding) that integrates embeddings into graphical models; build the AMiner research interest dataset (644,985 researchers, with homepage and LinkedIn matching as ground truth); and deploy the model online on AMiner.

  27. Thanks! Code and data: https://github.com/kimiyoung/genvector

  28. Social Networks: AMiner, Facebook, Twitter contain huge amounts of information. [Example figure: users Mike, Kevin, Jane, and Jing]

  29. Knowledge Bases: Wikipedia, Freebase, Yago, NELL contain huge amounts of knowledge. [Example concept hierarchy: Computer Science, with subfields Artificial Intelligence and System, and concepts Deep Learning and Natural Language Processing]

  30. Bridge the Gap: link users in the social network to concepts in the knowledge base for better user understanding, e.g., mining research interests on AMiner. [Figure: Mike, Kevin, Jane, and Jing connected to the concept hierarchy]

  31. Approach: from the social network, knowledge base, and social text, learn user embeddings and concept embeddings, then feed them into the model to produce the social knowledge graph.

  32. Model. [Plate diagram: M concepts for the user, D documents (one per user), T sets of topic parameters]

  33. Model: generate a topic distribution for each document (from a Dirichlet).

  34. Model: generate, for each topic, a Gaussian distribution over each embedding space (from a Normal-Gamma).

  35. Model: generate the topic for each concept (from a Multinomial).

  36. Model: generate the topic for each user (from a Uniform).

  37. Model: generate embeddings for users and concepts (from the corresponding topic's Gaussian). A toy end-to-end sketch of the generative process follows below.
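Assembling slides 33-37, here is a toy numpy sketch of the full generative process. Dimensions and hyperparameters are invented, and reading slide 36 as "uniform over the topics of the document's concepts" is an assumption:

```python
# Toy generative process assembled from slides 33-37 (hyperparameters and
# dimensions are invented for illustration).
import numpy as np

rng = np.random.default_rng(0)
T, dim, D, M = 4, 8, 3, 5      # topics, embedding dim, documents, concepts/doc
alpha = np.ones(T)             # symmetric Dirichlet prior

def normal_gamma(a=2.0, b=1.0, mu0=0.0, kappa=1.0):
    # Per-topic Gaussian parameters for one embedding space (slide 34).
    lam = rng.gamma(a, 1.0 / b, size=dim)             # precision
    mu = rng.normal(mu0, 1.0 / np.sqrt(kappa * lam))  # mean given precision
    return mu, lam

concept_params = [normal_gamma() for _ in range(T)]
user_params = [normal_gamma() for _ in range(T)]

for d in range(D):
    theta = rng.dirichlet(alpha)            # topic distribution (slide 33)
    z = rng.choice(T, size=M, p=theta)      # topic per concept (slide 35)
    y = rng.choice(z)                       # user topic, uniform (slide 36)
    # Embeddings drawn from the assigned topics' Gaussians (slide 37):
    f_k = np.array([rng.normal(concept_params[t][0],
                               1.0 / np.sqrt(concept_params[t][1])) for t in z])
    f_r = rng.normal(user_params[y][0], 1.0 / np.sqrt(user_params[y][1]))
    print(f"doc {d}: concept topics {z}, user topic {y}")
```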

  38. Model. [Complete plate diagram of the generative process]

  39. Inference and Learning: collapsed Gibbs sampling for inference; the embeddings are updated during learning, which differs from LDA, whose observed variables are discrete. Iterate between sampling latent variables, updating parameters, and updating embeddings (the sampling conditional is sketched below).
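A sketch of the collapsed conditional for a concept's topic assignment, in reconstructed notation; a fully collapsed sampler would integrate the Normal-Gamma posterior into a Student-t term, which is simplified to a Gaussian plug-in here:

```latex
% Collapsed Gibbs conditional for concept i in document d (sketch):
P(z_{d,i} = t \mid \mathbf{z}_{-di}, \mathbf{f}) \;\propto\;
\left(n_{d,t}^{-di} + \alpha\right) \cdot
\mathcal{N}\!\left(f_{k_{d,i}} \,\middle|\, \mu^{(k)}_{t},\, (\lambda^{(k)}_{t})^{-1}\right)
% n_{d,t}^{-di}: number of concepts in document d assigned to topic t,
% excluding position i. The Gaussian term replaces LDA's word-topic count
% ratio, which is why the embeddings themselves can be updated each round.
```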

  40. Methods for Comparison.

      Method        Description
      GenVector     Our model
      GenVector-E   Our model w/o embedding update
      Sys-Base      AMiner baseline: key term extraction
      CountKG       Rank by frequency
      Author-Topic  Classic topic models
      NTN           Neural tensor networks
