
Question Answering with Memory Networks
Explore how Memory Networks revolutionize question answering in AI/NLP tasks, surpassing traditional approaches like keyword queries. Discover the bAbI dataset, Facebook's collection of 20 tasks to gauge reasoning system skills. Dive into single and multiple supporting facts scenarios, enhancing comprehension and inference abilities.
Presentation Transcript
Question Answering Return the correct answer to a question expressed as a natural language question. This differs from IR (search), which returns the documents most relevant to a query expressed with keywords. In principle, most AI/NLP tasks can be reduced to QA.
Examples
I: Mary walked to the bathroom. I: Sandra went to the garden. I: Daniel went back to the garden. I: Sandra took the milk there. Q: Where is the milk? A: garden
I: Everybody is happy. Q: What is the sentiment? A: positive
I: Jane has a baby in London. Q: What are the named entities? A: Jane - person, London - location
I: Jane has a baby in London. Q: What are the POS tags? A: NNP VBZ DT NN IN NNP
I: I think this book is fun. Q: What is the French translation? A: Je crois que ce livre est amusant.
Difficulty No single model architecture for NLP is consistently best on all tasks.
Task: QA | Corpus: bAbI | SoA: Memory Neural Network (Weston et al. 2015)
Task: Sentiment Analysis | Corpus: SST | SoA: Bi-LSTM (Moschitti et al. 2016)
Task: POS | Corpus: PTB-WSJ | SoA: Bidirectional LSTM (Huang et al. 2015)
Tasks
Inference QA: given a story T and a question q, produce an answer a by performing inference over T.
Reading Comprehension: given a story T and a cloze sentence s (a sentence in which key words have been deleted), fill the gaps in s using information from T.
The Facebook bAbI dataset A collection of 20 tasks. Each task checks one skill that a reasoning system should have. Aims at systems able to solve all tasks: no task-specific engineering.
(T1) Single supporting fact: questions where a single supporting fact, previously given, provides the answer. Simplest case of this: asking for the location of a person.
John is in the playground. Bob is in the office. Where is John? A: playground
(T2) Two supporting facts: harder task: questions where two supporting statements have to be chained to answer the question.
John is in the playground. Bob is in the office. John picked up the football. Bob went to the kitchen. Where is the football? A: playground Where was Bob before the kitchen? A: office
(T3) Three supporting facts: similarly, one can make a task with three supporting facts; the first three statements are all required to answer the question.
John picked up the apple. John went to the office. John went to the kitchen. John dropped the apple. Where was the apple before the kitchen? A: office
(T4) Two argument relations: to answer questions, the ability to differentiate and recognize subjects and objects is crucial. We consider the extreme case: sentences feature re-ordered words.
The office is north of the bedroom. The bedroom is north of the bathroom. What is north of the bedroom? A: office What is the bedroom north of? A: bathroom
(T6) Yes/No questions: this task tests, in the simplest case possible (with a single supporting fact), the ability of a model to answer true/false questions.
John is in the playground. Daniel picks up the milk. Is John in the classroom? A: no Does Daniel have the milk? A: yes
(T7) Counting: this task tests the ability of the QA system to perform simple counting operations, by asking about the number of objects with a certain property.
Daniel picked up the football. Daniel dropped the football. Daniel got the milk. Daniel took the apple. How many objects is Daniel holding? A: two
(T17) Positional reasoning: this task tests spatial reasoning, one of many components of the classical blocks world. Tasks 3 (three supporting facts) and 6 (Yes/No) are prerequisites.
The triangle is to the right of the blue square. The red square is on top of the blue square. The red sphere is to the right of the blue square. Is the red sphere to the right of the blue square? A: yes Is the red square to the left of the triangle? A: yes
(T18) Reasoning about size: this task requires reasoning about the relative size of objects. The Yes/No task (6) is a prerequisite.
The football fits in the suitcase. The suitcase fits in the cupboard. The box of chocolates is smaller than the football. Will the box of chocolates fit in the suitcase? A: yes
(T19) Path finding: the goal is to find the path between locations. This task is difficult because it effectively involves search.
The kitchen is north of the hallway. The den is east of the hallway. How do you go from den to kitchen? A: west, north
(T20) Agent's motivation: why did an agent perform an action, and what will it do next?
John is hungry. John goes to the kitchen. John grabs the apple there. Daniel is hungry. Why did John go to the kitchen? A: hungry Where does Daniel go? A: kitchen
Difficulty
Type | Task number | Difficulty
Single Supporting Fact | 1 | Easy
Two or Three Supporting Facts | 2-3 | Hard
Two or Three Argument Relations | 4-5 | Medium
Yes/No Questions | 6 | Easy
Counting and Lists/Sets | 7-8 | Medium
Simple Negation and Indefinite Knowledge | 9-10 | Hard
Basic Coreference, Conjunctions and Compound Coreference | 11-13 | Medium
Time Reasoning | 14 | Medium
Basic Deduction and Induction | 15-16 | Medium
Positional and Size Reasoning | 17-18 | Hard
Path Finding | 19 | Hard
Agent's Motivations | 20 | Easy
Intuition Imagine having to read an article, memorize it, then get asked questions about it: hard! You can't store everything in working memory. Optimal: receive the input data and the question, and be allowed as many glances back at the input as needed.
Dynamic Memory Networks Semantic Memory Module (Embeddings). Use RNNs, specifically GRUs, for every module. Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, Kumar et al., ICML 2016
DMN: The Input Module A GRU runs over the word embeddings of the input; the final GRU output for each sentence is kept as that sentence's representation.
DMN: Question Module
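The input and question modules can be sketched with a toy GRU in NumPy. This is a minimal illustration, not the trained model: the weights are random placeholders, and the `encode` helper is a hypothetical name for "run a GRU and keep hidden states at chosen positions" (end-of-sentence tokens for the story, the last word for the question).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding / hidden size (illustrative)

class GRUCell:
    """Toy GRU cell; weights are random placeholders, not trained parameters."""
    def __init__(self, d):
        self.Wz, self.Uz = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))
        self.Wr, self.Ur = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))
        self.Wh, self.Uh = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))

    def step(self, x, h):
        sig = lambda a: 1.0 / (1.0 + np.exp(-a))
        z = sig(self.Wz @ x + self.Uz @ h)                # update gate
        r = sig(self.Wr @ x + self.Ur @ h)                # reset gate
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h)) # candidate state
        return (1 - z) * h + z * h_cand

def encode(word_vectors, keep_at, gru):
    """Run the GRU over a word sequence; keep the hidden state at the
    given positions (end-of-sentence tokens, or the last question word)."""
    h, kept = np.zeros(d), []
    for t, x in enumerate(word_vectors):
        h = gru.step(x, h)
        if t in keep_at:
            kept.append(h.copy())
    return kept

gru = GRUCell(d)
story = [rng.normal(size=d) for _ in range(9)]       # 9 toy word embeddings
facts = encode(story, keep_at={2, 5, 8}, gru=gru)    # 3 sentence vectors
question = [rng.normal(size=d) for _ in range(4)]
q = encode(question, keep_at={3}, gru=gru)[0]        # final state = question vector
```

The same GRU machinery serves both modules, which is the point of the slide: one sequence model for input and question alike.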
DMN: Episodic Memory, 1st hop (t = 1)
DMN: Episodic Memory, 2nd hop (t = 2)
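A minimal sketch of the two hops, assuming a simple dot-product gate over the sentence vectors. The paper's attention uses a richer feature vector over (fact, memory, question) and a gated GRU; this keeps only the core idea of question- and memory-conditioned attention repeated over several passes:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def episodic_pass(facts, q, m):
    """One hop: gate each fact c by its similarity to the question q and
    the current memory m, then summarize the gated facts into an episode."""
    gates = softmax(np.array([c @ q + c @ m for c in facts]))
    episode = sum(g * c for g, c in zip(gates, facts))
    return episode, gates

facts = [rng.normal(size=d) for _ in range(4)]  # sentence vectors from the input module
q = rng.normal(size=d)
m = q.copy()                                    # memory starts from the question
for hop in range(2):                            # two hops, as on the slides
    episode, gates = episodic_pass(facts, q, m)
    m = m + episode                             # simple update (a GRU in the paper)
```

The second hop is where transitive facts get picked up: once the memory contains the first supporting fact, the gates can shift to a sentence that was irrelevant given the question alone.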
Inspiration from Neuroscience Episodic memory is the memory of autobiographical events (times, places, etc.): a collection of past personal experiences that occurred at a particular time and place. The hippocampus, the seat of episodic memory in humans, is active during transitive inference. In the DMN, repeated passes over the input are needed for transitive inference. Slide from Richard Socher
DMN When the end of the input is reached, the relevant facts are summarized in another GRU or a simple NNet.
DMN: Answer Module
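For intuition, the answer module can be reduced to a single softmax readout from the final memory and the question. This is an illustrative simplification with random placeholder weights (`Wa` is a hypothetical name); the paper uses a GRU decoder when the answer spans several words:

```python
import numpy as np

rng = np.random.default_rng(2)
d, V = 8, 12  # hidden size and toy vocabulary size

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Single-word readout: score every vocabulary word from the final memory m
# and the question q (random placeholders stand in for learned values).
Wa = rng.normal(0, 0.1, (V, 2 * d))
m, q = rng.normal(size=d), rng.normal(size=d)
probs = softmax(Wa @ np.concatenate([m, q]))
answer_id = int(np.argmax(probs))  # index of the predicted answer word
```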
DMN How many GRUs were used with 2 hops?
Experiments: QA on bAbI (1k), accuracy (%)
Task | MemNN | DMN
1: Single Supporting Fact | 100 | 100
2: Two Supporting Facts | 100 | 98.2
3: Three Supporting Facts | 100 | 95.2
4: Two Argument Relations | 100 | 100
5: Three Argument Relations | 98 | 99.3
6: Yes/No Questions | 100 | 100
7: Counting | 85 | 96.9
8: Lists/Sets | 91 | 96.5
9: Simple Negation | 100 | 100
10: Indefinite Knowledge | 98 | 97.5
11: Basic Coreference | 100 | 99.9
12: Conjunction | 100 | 100
13: Compound Coreference | 100 | 99.8
14: Time Reasoning | 99 | 100
15: Basic Deduction | 100 | 100
16: Basic Induction | 100 | 99.4
17: Positional Reasoning | 65 | 59.6
18: Size Reasoning | 95 | 95.3
19: Path Finding | 36 | 34.5
20: Agent's Motivations | 100 | 100
Mean Accuracy (%) | 93.3 | 93.36
Focus during processing a query
Comparison: MemNets vs DMN
Similarities: both MemNets and DMNs have input, scoring, attention and response mechanisms.
Differences: for input representations, MemNets use bag-of-words, nonlinear or linear embeddings that explicitly encode position, and iteratively run functions for attention and response. DMNs show that neural sequence models can be used for input representation, attention and response; this naturally captures position and temporality, and enables a broader range of applications.
Question Dependent Recurrent Entity Network for Question Answering Andrea Madotto, Giuseppe Attardi
Examples
RQA: bAbI. Story: John picked up the apple. John went to the office. John went to the kitchen. John dropped the apple. Question: Where was the apple before the kitchen? Answer: office
RC: CNN news articles. Story: Robert Downey Jr. may be Iron Man in the popular Marvel superhero films, but he recently dealt in some advanced bionic ... Question: "@placeholder" star Robert Downey Jr presents a young child with a bionic arm. Answer: Iron Man
Related Work Many models have been proposed to solve the RQA and RC tasks: Pointer Networks, Memory Networks, Attentive Sum Reader, Recurrent Entity Network, Attention Over Attention, Dynamic Memory Network, EpiReader, Neural Turing Machine, Stanford Attentive Reader, End-To-End Memory Network (MemN2N), Dynamic Entity Representation.
Question Dependent Recurrent Entity Network Based on the Memory Network framework, as a variant of the Recurrent Entity Network (Henaff et al., 2017). Input Encoder: creates an internal representation. Dynamic Memory: stores relevant information about entities (Person, Location, etc.). Output Module: generates the output. Idea: include the question (q) in the memorization process.
Input Encoder
Input: a set of sentences {s_1, …, s_t} and a question q.
E ∈ R^{|V|×d}: embedding matrix, E(w) = e ∈ R^d.
{e_1^(s), …, e_m^(s)}: vectors for the words in s_t; {e_1^(q), …, e_k^(q)}: vectors for the words in q.
f^(s) = {f_1^(s), …, f_m^(s)} and f^(q) = {f_1^(q), …, f_k^(q)}: multiplicative masks, with each f_i ∈ R^d.
s_t = Σ_{r=1}^{m} f_r^(s) ⊙ e_r^(s)
q = Σ_{r=1}^{k} f_r^(q) ⊙ e_r^(q)
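The mask-weighted sum that produces a sentence vector is a one-liner in NumPy. A minimal sketch, with random placeholders standing in for the learned word vectors and masks:

```python
import numpy as np

rng = np.random.default_rng(3)
d, m_words = 8, 5  # embedding size, words in the sentence (illustrative)

# e_1..e_m: word vectors of one sentence; f_1..f_m: multiplicative masks.
# Both are random placeholders here; in the model the embeddings and the
# masks are learned parameters.
e = rng.normal(size=(m_words, d))
f = rng.normal(size=(m_words, d))

s_t = (f * e).sum(axis=0)  # s_t = sum_r f_r ⊙ e_r  (⊙ = elementwise product)
```

The question vector q is obtained with the same operation over the question's words, using its own masks.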
Dynamic Memory
Consists of a set of blocks meant to represent entities in the story. A block is made of a key k_i that identifies the entity and a hidden state h_i that stores information about it.
g_i^(t) = σ(s_t^T h_i^(t−1) + s_t^T k_i^(t−1) + s_t^T q)   (gating function, with the question q included)
ĥ_i^(t) = φ(U h_i^(t−1) + V k_i^(t−1) + W s_t)   (candidate hidden state)
h_i^(t) = h_i^(t−1) + g_i^(t) ⊙ ĥ_i^(t)   (update)
h_i^(t) = h_i^(t) / ‖h_i^(t)‖   (normalization)
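One memory-update step for all blocks can be sketched directly from these equations. The parameters U, V, W, the keys, and the states are random placeholders here (and `Vm` is just a variable name avoiding the vocabulary-size `V`); this is an illustration of the update rule, not the trained model:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n_blocks = 8, 3
sig = lambda a: 1.0 / (1.0 + np.exp(-a))

# Random placeholders for the learned parameters U, V, W and the keys k_i.
U, Vm, W = (rng.normal(0, 0.1, (d, d)) for _ in range(3))
k = rng.normal(size=(n_blocks, d))  # one key per entity block
h = rng.normal(size=(n_blocks, d))  # block hidden states
s_t, q = rng.normal(size=d), rng.normal(size=d)

for i in range(n_blocks):
    g = sig(s_t @ h[i] + s_t @ k[i] + s_t @ q)        # QDREN gate: q is included
    h_cand = np.tanh(U @ h[i] + Vm @ k[i] + W @ s_t)  # candidate state, phi = tanh
    h[i] = h[i] + g * h_cand                          # gated update
    h[i] = h[i] / np.linalg.norm(h[i])                # forget via normalization
```

The normalization step is what lets a block "forget": old information shrinks relative to freshly gated-in content.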
Output Module
Scores the memories h_i using q and the embedding matrix E ∈ R^{|V|×d} to generate the output y:
p_i = softmax(q^T h_i)
u = Σ_{i=1}^{n} p_i h_i
y = R φ(q + H u)
where y ∈ R^{|V|} represents the model answer.
Training: given {(x_i, y_i)}_{i=1}^{n}, we used the cross-entropy loss H(y, ŷ), with end-to-end training via the standard Backpropagation Through Time (BPTT) algorithm.
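The three output equations map to three lines of NumPy. A minimal sketch with random placeholders for the memories, question, and the learned matrices R and H:

```python
import numpy as np

rng = np.random.default_rng(5)
d, n_blocks, V = 8, 3, 12

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

h = rng.normal(size=(n_blocks, d))  # final memory blocks
q = rng.normal(size=d)              # question vector
R = rng.normal(0, 0.1, (V, d))      # output matrix (placeholder)
H = rng.normal(0, 0.1, (d, d))

p = softmax(h @ q)          # p_i = softmax(q^T h_i)
u = p @ h                   # u = sum_i p_i h_i
y = R @ np.tanh(q + H @ u)  # y = R phi(q + H u), with phi = tanh
answer = int(np.argmax(y))  # predicted word index in the vocabulary
```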
Overall Architecture
Experiments and Results QDREN implementation with TensorFlow (available at https://github.com/andreamad8/QDREN). Tests on the RQA and RC tasks using:
RQA - bAbI 1K: 20 tasks, each with 900 training, 100 validation and 1000 test examples.
RC - CNN news articles: split into 380298 training, 3924 validation and 3198 test examples.
RQA - bAbI 1K
Error rates (%) on the test set:
Task | n-gram | LSTM | MemN2N | REN | QDREN
1 | 64.0 | 50.0 | 0 | 0.7 | 0
2 | 98.0 | 80.0 | 8.3 | 56.4 | 67.6
3 | 93.0 | 80.0 | 40.3 | 69.7 | 60.8
4 | 50.0 | 39.0 | 2.8 | 1.4 | 0
5 | 80.0 | 30.0 | 13.1 | 4.6 | 2
6 | 51.0 | 52.0 | 7.6 | 30 | 29
7 | 48.0 | 51.0 | 17.3 | 22.3 | 0.7
8 | 60.0 | 55.0 | 10 | 19.2 | 2.5
9 | 38.0 | 36.0 | 13.2 | 31.5 | 4.8
10 | 55.0 | 56.0 | 16.1 | 15.6 | 3.8
11 | 71.0 | 28.0 | 0.9 | 8 | 0.2
12 | 91.0 | 26.0 | 0.2 | 0.6 | 0.8
13 | 74.0 | 6.0 | 0.4 | 0 | 9
14 | 81.0 | 73.0 | 1.7 | 0 | 62.9
15 | 80.0 | 79.0 | 0 | 15.8 | 57.8
16 | 57.0 | 77.0 | 1.3 | 0.3 | 53.2
17 | 54.0 | 49.0 | 51 | 52 | 46.4
18 | 48.0 | 48.0 | 11.1 | 37.4 | 8.8
19 | 10.0 | 92.0 | 82.8 | 10.1 | 90.4
20 | 24.0 | 9.0 | 0 | 85 | 2.6
Failed Tasks (>5%) | 20 | 20 | 11 | 15 | 8
Mean Error | 65.9 | 50.8 | 13.9 | 29.6 | 18.6
RC - CNN news articles
Four variants, using either plain text or windows of text around the entity mentions as input (e.g., the window "the popular @entity4 superhero films"): REN, REN + WIND, QDREN, QDREN + WIND.
Accuracy (%):
Model | Validation | Test
REN | 42.0 | 42.0
REN + WIND | 38.0 | 40.1
QDREN | 39.9 | 39.7
QDREN + WIND | 59.1 | 62.8
LSTM | 55.0 | 57.0
Att. Reader | 61.6 | 63.0
MemN2N | 63.4 | 66.8
AoA | 73.1 | 74.4
Analysis: REN gating activation Heatmap of the gate activations over the 20 story sentences for the question "Where is Mary?" (figure omitted).
Analysis: QDREN gating activation Heatmap of the gate activations over the 20 story sentences for the question "Where is Mary?" (figure omitted).
References
Kumar et al. Ask Me Anything: Dynamic Memory Networks for Natural Language Processing. 2015.
Xiong et al. Dynamic Memory Networks for Visual and Textual Question Answering. 2016.
Sutskever et al. Sequence to Sequence Learning. 2014.
Graves et al. Neural Turing Machines. 2014.
Hermann et al. Teaching Machines to Read and Comprehend. 2015.
Grefenstette et al. Learning to Transduce with Unbounded Memory. 2015.
Wei Zhang. Structured Memory for Neural Turing Machines. 2015.
Sukhbaatar et al. End-to-End Memory Networks. 2015.
A. Moschitti, A. Severyn. UNITN: Training Deep Convolutional Network for Twitter Sentiment Classification. SemEval 2015.
M. Henaff, J. Weston, A. Szlam, A. Bordes, Y. LeCun. Tracking the World State with Recurrent Entity Networks. arXiv preprint arXiv:1612.03969, 2017.
J. Weston, S. Chopra, A. Bordes. Memory Networks. arXiv preprint arXiv:1410.3916, 2014.
A. Madotto, G. Attardi. Question Dependent Recurrent Entity Network for Question Answering. 2017.