Enhancing Language Representation with K-BERT
K-BERT is a language representation model that incorporates knowledge graphs (KGs) into BERT. It addresses two obstacles to knowledge injection, Heterogeneous Embedding Space (HES) and Knowledge Noise (KN), so that injected triples enrich a sentence's meaning instead of distorting it. The methodology combines a knowledge layer, an embedding layer, a seeing layer, and a mask-transformer to incorporate knowledge effectively. The Experiments section covers the pre-training corpora used and the open-domain and specific-domain tasks evaluated. Overall, K-BERT offers a novel approach to enhancing language understanding by leveraging knowledge graphs.
Presentation Transcript
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu, Peng Zhou, Zhe Zhao, Zhiruo Wang, Qi Ju, Haotang Deng and Ping Wang
Peking University, Beijing, China; Tencent Research, Beijing, China; Beijing Normal University, Beijing, China
Introduction
Heterogeneous Embedding Space (HES): the embedding vectors of words in text and of entities in a KG are generally obtained in separate ways, making their vector spaces inconsistent.
Knowledge Noise (KN): too much knowledge incorporation may divert a sentence from its correct meaning.
Methodology: Notation
Sentence s = {w_0, w_1, w_2, ..., w_n}. Each token w_i is included in the vocabulary V, w_i ∈ V.
The KG, denoted K, is a collection of triples ε = (w_i, r_j, w_k), where w_i and w_k are entity names and r_j is the relation between them.
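To make the notation concrete, here is a minimal Python sketch (illustrative only, not the authors' code; the triples echo the paper's running example rather than a real KG dump) of a sentence as a token list and the KG K as a set of triples indexed by head entity:

```python
from collections import defaultdict

# K: collection of triples epsilon = (head entity, relation, tail entity)
triples = [
    ("Tim Cook", "CEO", "Apple"),
    ("Beijing", "capital", "China"),
    ("Beijing", "is_a", "City"),
]

kg = defaultdict(list)          # head entity -> [(relation, tail), ...]
for head, rel, tail in triples:
    kg[head].append((rel, tail))

sentence = ["Tim Cook", "is", "visiting", "Beijing", "now"]
for token in sentence:
    if token in kg:
        print(token, "->", kg[token])
# Tim Cook -> [('CEO', 'Apple')]
# Beijing -> [('capital', 'China'), ('is_a', 'City')]
```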
Methodology: Model architecture
(Architecture figure: the input sentence passes through the knowledge layer to become a sentence tree; the embedding layer and seeing layer convert the tree into embeddings and a visible matrix, which are fed to the mask-transformer encoder.)
Methodology: Knowledge layer
Given an input sentence s = {w_0, w_1, w_2, ..., w_n} and a KG, the knowledge layer (KL) outputs a sentence tree t = {w_0, w_1, ..., w_i{(r_i0, w_i0), ..., (r_ik, w_ik)}, ..., w_n}.
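Continuing the sketch above, the knowledge layer can be approximated in two steps, K_Query (look up triples for each token) and K_Inject (attach them as branches). This is a simplification of the paper's procedure, not a faithful implementation:

```python
def knowledge_layer(sentence, kg, max_triples=2):
    """Toy knowledge layer: attach up to max_triples KG branches to each
    token, yielding a sentence tree t = {w_0, ..., w_i{(r_i0, w_i0), ...}, ..., w_n}."""
    tree = []
    for token in sentence:
        branches = kg.get(token, [])[:max_triples]  # K_Query: fetch matching triples
        tree.append((token, branches))              # K_Inject: attach them as a branch
    return tree

tree = knowledge_layer(sentence, kg)
# [('Tim Cook', [('CEO', 'Apple')]), ('is', []), ('visiting', []),
#  ('Beijing', [('capital', 'China'), ('is_a', 'City')]), ('now', [])]
```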
Methodology: Embedding layer
The embedding layer converts the sentence tree into an embedding representation by summing token, soft-position, and segment embeddings, as in BERT. (Figure: embeddings for the example sentence "Tim Cook is visiting Beijing now" with injected triples.)
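A sketch of the soft-position indexing on the toy sentence tree from above; it assumes single-token entity and relation names (multi-token names would take one index per sub-token), so it is illustrative rather than the paper's exact tokenization:

```python
def soft_positions(tree):
    """Flatten the sentence tree and assign soft-position indices:
    trunk tokens are numbered in sentence order, and each branch restarts
    from its anchor's index + 1, so e.g. 'CEO' after 'Tim Cook' and the
    next trunk token 'is' share the same position index."""
    flat, positions = [], []
    pos = 0
    for token, branches in tree:
        pos += 1
        flat.append(token)
        positions.append(pos)
        for rel, tail in branches:          # each branch restarts at pos + 1
            flat.append(rel);  positions.append(pos + 1)
            flat.append(tail); positions.append(pos + 2)
    return flat, positions

tokens, pos = soft_positions(tree)
# tokens: ['Tim Cook', 'CEO', 'Apple', 'is', 'visiting',
#          'Beijing', 'capital', 'China', 'is_a', 'City', 'now']
# pos:    [1, 2, 3, 2, 3, 4, 5, 6, 5, 6, 5]
```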
Methodology: Seeing layer
The input to K-BERT is a sentence tree whose branches are knowledge gained from the KG. The risk that comes with this knowledge is that it can change the meaning of the original sentence, i.e., the KN issue. The seeing layer therefore builds a visible matrix that restricts each injected token's influence to its own branch.
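A sketch of how the seeing layer's visible matrix can be built from the toy sentence tree; the branch-membership sets are my own bookkeeping device for the paper's "same branch" relation, not its formulation:

```python
import numpy as np

def seeing_layer(tree):
    """Toy visible matrix: M[i, j] = 0 if tokens i and j are in the same
    branch (mutually visible), -inf otherwise. Trunk tokens all share
    branch 0; an anchor token also belongs to each branch hanging off it,
    so KG tokens see their anchor but not the rest of the sentence."""
    flat, members = [], []
    next_branch = 1                       # branch 0 is the sentence trunk
    for token, branches in tree:
        anchor = {0}
        branch_ids = []
        for _ in branches:
            branch_ids.append(next_branch)
            next_branch += 1
        anchor.update(branch_ids)         # anchor belongs to its own branches too
        flat.append(token)
        members.append(anchor)
        for (rel, tail), b in zip(branches, branch_ids):
            for t in (rel, tail):
                flat.append(t)
                members.append({b})
    n = len(flat)
    M = np.full((n, n), float("-inf"))
    for i in range(n):
        for j in range(n):
            if members[i] & members[j]:   # share at least one branch -> visible
                M[i, j] = 0.0
    return flat, M

flat, M = seeing_layer(tree)
# 'CEO' and 'Apple' see 'Tim Cook' and each other, but not 'is' or 'Beijing'.
```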
Methodology: Mask-Transformer
Mask-Self-Attention: the mask-transformer is a stack of mask-self-attention blocks in which the visible matrix M from the seeing layer is added to the attention scores before the softmax, as sketched below.
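A minimal single-head numpy sketch of mask-self-attention; the weight matrices W_q, W_k, W_v are assumed inputs, and biases and multi-head splitting are omitted:

```python
import numpy as np

def mask_self_attention(h, W_q, W_k, W_v, M):
    """Single-head mask-self-attention, following the paper's formulation:
        Q, K, V = h W_q, h W_k, h W_v
        S       = softmax((Q K^T + M) / sqrt(d_k))
        h'      = S V
    Because M holds 0 for visible pairs and -inf for invisible ones,
    exp(-inf) = 0 and invisible tokens receive zero attention weight."""
    Q, K, V = h @ W_q, h @ W_k, h @ W_v
    d_k = K.shape[-1]
    scores = (Q @ K.T + M) / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)   # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

When M is all zeros this reduces to standard self-attention, which is why K-BERT can be initialized directly from vanilla BERT checkpoints.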
Experiments
Pre-training corpora: WikiZh, WebtextZh
Knowledge graphs: CN-DBpedia, HowNet, MedicalKG
Open-domain tasks: Book_review, Chnsenticorp, Shopping, Weibo, XNLI, LCQMC, NLPCC-DBQA, MSRA-NER
Specific-domain tasks: Finance_Q&A, Law_Q&A, Finance_NER, Medicine_NER
Experiments (results slides: tables and figures not captured in the transcript)