
Effective Multi-Token Completion from MLMs - Boost Your Language Model Performance
A presentation on multi-token completion from masked language models (MLMs) by Oren Kalinsky, Guy Kushilevitz, Alex Libov, and Yoav Goldberg at Amazon. It covers the motivating example of completing "European countries such as [MASK] and others", the practical benefits of MLMs over seq2seq models, and an experiment showing that MLMs learn multi-token phrases: generating quadruples, collecting sentences, and comparing MLM completions.
Presentation Transcript
Simple and Effective Multi Token Completion from MLMs
Oren Kalinsky, Guy Kushilevitz, Alex Libov, Yoav Goldberg
guyk@amazon.com

Motivation
European countries such as [MASK] and others
Possible completions:
- Italy
- France
- the United Kingdom
- North Macedonia
Amazon.com

Motivation
MLMs vs seq2seq:
- (relatively) small
- (relatively) easy to run
- Available for many domains
- Available for many languages

MLMs learn Multi-Token Phrases
- Our work is based on the assumption that MLMs learn multi-token phrases
- This is an essential property for the success of MLMs
- We run an experiment to show this explicitly

MLMs learn Multi-Token Phrases
Generate quadruples:
- Multi-token phrase, e.g. new york city
- Single-token synonym, e.g. nyc
- Similar phrase, e.g. chicago
- Random token, e.g. disco

MLMs learn Multi-Token Phrases
- Collect sentences for each multi-token phrase:
  The park is in new york city
- Mask out another NP chunk in the sentence:
  The [MASK] is in new york city
- Generate similar sentences for the other quadruple phrases:
  The [MASK] is in nyc
  The [MASK] is in chicago
  The [MASK] is in disco

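The probe construction above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Quadruple` class and the `build_probe_sentences` helper are hypothetical names, and plain string replacement stands in for real NP chunking.

```python
from dataclasses import dataclass

MASK = "[MASK]"


@dataclass(frozen=True)
class Quadruple:
    """One probe quadruple, following the slide's four roles."""
    multi_token: str   # e.g. "new york city"
    synonym: str       # single-token synonym, e.g. "nyc"
    similar: str       # similar phrase, e.g. "chicago"
    random_token: str  # unrelated token, e.g. "disco"


def build_probe_sentences(sentence, quad, chunk_to_mask):
    """Mask one other chunk, then swap the multi-token phrase for each variant."""
    if quad.multi_token not in sentence or chunk_to_mask not in sentence:
        raise ValueError("sentence must contain both the phrase and the chunk to mask")
    # Assumption: the chunk to mask is given; a real pipeline would find it with an NP chunker.
    masked = sentence.replace(chunk_to_mask, MASK, 1)
    variants = {
        "multi_token": quad.multi_token,
        "synonym": quad.synonym,
        "similar": quad.similar,
        "random": quad.random_token,
    }
    return {name: masked.replace(quad.multi_token, phrase, 1)
            for name, phrase in variants.items()}


quad = Quadruple("new york city", "nyc", "chicago", "disco")
probes = build_probe_sentences("The park is in new york city", quad, "park")
for name, text in probes.items():
    print(f"{name:12s} {text}")
```

Each of the four resulting sentences is then sent to the MLM, so that the completions for the masked chunk can be compared across the quadruple.
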
MLMs learn Multi-Token Phrases
- Collect MLM completions for each sentence:
  The [MASK] is in new york city -> bank, museum, bronx, ...
  The [MASK] is in nyc -> bronx, park, bank, ...
  The [MASK] is in chicago -> stadium, bank, university, ...
  The [MASK] is in disco -> it, she, he, ...
- Compare the completions using a similarity measure
Results (similarity between completion lists):
- Multi-token vs. synonym: 0.76
- Multi-token vs. similar: 0.71
- Multi-token vs. random: 0.63

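The slide does not say which similarity measure is used. As a rough illustration of the comparison step only, the sketch below assumes a Jaccard overlap between completion sets; that choice is a stand-in, and the toy scores it produces on the truncated lists are not the reported 0.76 / 0.71 / 0.63.

```python
def jaccard(a, b):
    """Jaccard overlap of two completion lists (an assumed, illustrative measure)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


# Top completions as listed on the slide (the slide's lists are truncated).
completions = {
    "multi_token": ["bank", "museum", "bronx"],
    "synonym": ["bronx", "park", "bank"],
    "similar": ["stadium", "bank", "university"],
    "random": ["it", "she", "he"],
}

reference = completions["multi_token"]
scores = {
    name: jaccard(reference, words)
    for name, words in completions.items()
    if name != "multi_token"
}
print(scores)
```

Whatever measure is used, the expected ordering is the same as in the reported results: the synonym should be closest to the multi-token phrase, the similar phrase next, and the random token furthest.
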
Straightforward approach
The president traveled to the city of [MASK] [MASK].
Limitations:
- Need to predetermine the number of tokens
- Completions are unconditioned (each [MASK] is filled without conditioning on the others)
- Does not work well

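The first limitation can be made concrete with a small sketch. The helper names `expand_masks` and `candidate_inputs` are hypothetical; the point is only that the baseline must build, and query the MLM with, one input per candidate phrase length.

```python
MASK = "[MASK]"


def expand_masks(template, num_tokens):
    """Replace the single [MASK] slot with num_tokens consecutive masks."""
    if num_tokens < 1:
        raise ValueError("num_tokens must be at least 1")
    if template.count(MASK) != 1:
        raise ValueError("template must contain exactly one [MASK]")
    return template.replace(MASK, " ".join([MASK] * num_tokens))


def candidate_inputs(template, max_tokens):
    """One MLM input per guessed phrase length, from 1 up to max_tokens."""
    return [expand_masks(template, k) for k in range(1, max_tokens + 1)]


inputs = candidate_inputs("The president traveled to the city of [MASK].", 3)
for text in inputs:
    print(text)
```

The answer length is not known in advance, so every guess costs a separate forward pass, and the per-length results still have to be ranked against each other.
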
Dataset
Corpus:
- Wikipedia
- Books corpus
Multi-token vocabulary:
- NP chunks and entities
- ~93K phrases appearing more than 500 times
- ~10% single token, ~53% 2 tokens

Dataset
- 50 sentences per vocabulary phrase:
  New York -> "Donald Trump was born in New York"
- ~4.5M sentences
- Mask out the phrase:
  Donald Trump was born in [MASK]
  Label: New York
- Split into train (90%), dev (5%) and test (5%)

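The masking and splitting steps can be sketched as follows. This is a simplified illustration under assumptions: `make_example` and `split_dataset` are hypothetical helpers, and the split is a random shuffle over sentences, since the slides do not say how the split was drawn.

```python
import random

MASK = "[MASK]"


def make_example(sentence, phrase):
    """Mask the first occurrence of the phrase and keep it as the label."""
    if phrase not in sentence:
        raise ValueError("phrase does not occur in sentence")
    return {"text": sentence.replace(phrase, MASK, 1), "label": phrase}


def split_dataset(examples, seed=0, train_pct=90, dev_pct=5):
    """Shuffle and split into train/dev/test; test receives the remainder."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_dev = len(shuffled) * dev_pct // 100
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test


example = make_example("Donald Trump was born in New York", "New York")
print(example)

# Synthetic sentences, only to exercise the 90/5/5 split.
examples = [make_example(f"Sentence {i} mentions New York", "New York") for i in range(100)]
train, dev, test = split_dataset(examples)
print(len(train), len(dev), len(test))
```

With roughly 93K phrases and 50 sentences each, the same procedure gives a dataset on the order of the ~4.5M sentences quoted on the slide.
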
MLM
Input: Large European countries such as [MASK] and others
Diagram: input -> MLM -> contextual embedding -> MLM decoder -> Italy
Information encoded in the contextual embedding: country, in europe, large

EMAT: Extended Decoder Matrix
- Multi-token phrases get their own embedding
- Only in the decoder matrix
- Model vocabulary stays the same
- Grows with the completion vocabulary
Diagram: contextual embedding -> enlarged MLM decoder (containing the original MLM decoder) -> United Kingdom

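The idea of an extended decoder matrix can be sketched in plain Python: every single token and every multi-token phrase owns one row, and a completion is scored by the dot product between its row and the contextual embedding of [MASK]. All vectors below are made-up toy values and `rank_completions` is a hypothetical helper; in the actual method the phrase rows would be learned.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_completions(contextual_embedding, decoder_rows, top_k=3):
    """Score every decoder row against the [MASK] embedding and return the best names."""
    scored = [(dot(contextual_embedding, row), name) for name, row in decoder_rows.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Rows of the original MLM decoder: one per single-token vocabulary entry (toy values).
token_rows = {
    "italy": [0.9, 0.1, 0.0],
    "france": [0.8, 0.2, 0.0],
    "disco": [0.0, 0.1, 0.9],
}
# Extra rows added only in the decoder: one per multi-token phrase (toy values).
phrase_rows = {
    "united kingdom": [0.85, 0.3, 0.0],
    "north macedonia": [0.7, 0.2, 0.1],
}
extended_rows = {**token_rows, **phrase_rows}

context = [1.0, 0.5, 0.0]  # stand-in for the contextual embedding of [MASK]
baseline = rank_completions(context, token_rows)
ranking = rank_completions(context, extended_rows)
print(baseline)
print(ranking)
```

The sketch also shows the trade-off listed on the slide: the input vocabulary is untouched, but the decoder gains one row per phrase, so it grows with the completion vocabulary.
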
RNN
- Small and simple generation model
- Completes until EOS
- Does not grow
Diagram components: contextual embedding, GRU, FF, MLM decoder
Completion steps: United, Kingdom, EOS

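The "completes until EOS" behaviour amounts to a simple generation loop. The sketch below is a schematic under assumptions: `complete` and `toy_step` are hypothetical, and the lookup table in `toy_step` merely stands in for one pass through the GRU, feed-forward layer, and MLM decoder.

```python
EOS = "<eos>"


def complete(initial_state, step, max_steps=10):
    """Generate tokens one completion step at a time until EOS or max_steps."""
    tokens = []
    state, previous = initial_state, None
    for _ in range(max_steps):
        state, token = step(state, previous)
        if token == EOS:
            break
        tokens.append(token)
        previous = token
    return tokens


def toy_step(state, previous):
    """Stand-in for GRU + FF + MLM decoder: a fixed lookup keyed on the previous token."""
    table = {None: "united", "united": "kingdom", "kingdom": EOS}
    return state, table.get(previous, EOS)


phrase = complete(initial_state="contextual-embedding-of-[MASK]", step=toy_step)
print(" ".join(phrase))
```

Because tokens come from the existing MLM vocabulary one step at a time, the model size stays fixed regardless of how many phrases it can produce, and the number of tokens does not need to be chosen in advance.
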
Architectures
Input to each: I love [MASK] city (MLM -> contextual embedding)
- MLM decoder: NY
- Extended MLM decoder: New York
- GRU with MLM decoder, one completion step per token: New, York, EOS

MTC Results
[Results table not captured in the transcript]

Domain Specific Results
- PubMed-based data
- Shows that the pretrained model is indeed utilized!

Human Evaluation
- Valid: grammatically correct and makes sense
- Specific: not a generic completion (e.g. not "he" or "it")
- Correct: factually correct completion

Thanks!
Please reach out with any questions or suggestions
guyk@amazon.com