Deep Learning for Text Analysis
This content summarizes a talk on the state of deep learning for text analysis, presented at the SwissText Conference in June 2016. It covers three tasks: multilingual sentiment analysis with weakly supervised data, gender and language-variety identification, and community question answering. The slides include illustrations of language models and convolutional neural networks, charts of the training data, and results per language.
Presentation Transcript
Deep Learning for Text Analysis: Where do we stand? Jan Deriu, SwissText Conference, 9th June 2016
Intro Swiss Text, June 2016 Jan Deriu 2
Language Model. Illustration: http://sebastianruder.com/word-embeddings-1/
Properties. Image credits: https://www.tensorflow.org/tutorials/word2vec
Deep Learning: Convolutional Neural Networks. Illustration: Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification.
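The slide only shows an illustration of the architecture. As a rough sketch of the core operation in this family of models (a convolution over word vectors followed by max-over-time pooling), the following may help. The function name, the toy embeddings, and the filter weights are all hypothetical, and this is not the code used in the talk.

```python
# Minimal sketch of one convolution filter with max-over-time pooling,
# in the spirit of Kim (2014). All numbers are toy values.

def conv1d_max_pool(embeddings, conv_filter, bias=0.0):
    """Slide a filter over windows of consecutive word vectors,
    apply ReLU, and keep the largest activation.

    embeddings:  list of word vectors, one per token
    conv_filter: list of `width` weight vectors, same dimension as the word vectors
    """
    width = len(conv_filter)
    activations = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        total = bias
        for word_vec, weight_vec in zip(window, conv_filter):
            total += sum(x * w for x, w in zip(word_vec, weight_vec))
        activations.append(max(0.0, total))  # ReLU
    # A sentence shorter than the filter yields no window; return 0.0 then.
    return max(activations) if activations else 0.0


# Hypothetical 2-dimensional embeddings for a 4-token sentence.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# One filter spanning 2 tokens.
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]

feature = conv1d_max_pool(sentence, bigram_filter)
print(feature)
```

A full model would apply many such filters of several widths and feed the pooled features to a classifier layer.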
Task 1: Sentiment Analysis - Multilingual
3-Phase Learning. Illustration: Deriu, Jan, et al. "Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification."
Data - Distant Phase. [Bar chart: negative and positive examples per language (English, French, German, Italian); y-axis from 0 to 70; bar values not recoverable from the transcript.]
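The slide does not say how the weak positive and negative labels were obtained. One common way to build such data, assumed here purely for illustration, is to treat emoticons as noisy labels and strip them from the text. The emoticon sets and the function name below are hypothetical.

```python
# Sketch of emoticon-based weak labeling (an assumed technique, not
# confirmed by the slides). Emoticon sets are illustrative only.

POSITIVE = {":)", ":-)", ":D", "=)"}
NEGATIVE = {":(", ":-(", "=("}


def distant_label(text):
    """Return (cleaned_text, label), or None if the text carries
    no emoticon or carries conflicting ones."""
    tokens = text.split()
    has_pos = any(t in POSITIVE for t in tokens)
    has_neg = any(t in NEGATIVE for t in tokens)
    if has_pos == has_neg:
        # Neither or both polarities present: no usable weak label.
        return None
    # Remove the emoticons so a model cannot simply memorize them.
    cleaned = " ".join(t for t in tokens if t not in POSITIVE and t not in NEGATIVE)
    return cleaned, ("positive" if has_pos else "negative")


print(distant_label("great talk :)"))
print(distant_label("train delayed again :("))
print(distant_label("no emoticon here"))
```

Labels produced this way are noisy, which is why a smaller, manually annotated set is typically used afterwards.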
Data - Supervised Phase - SemEval 2016. [Stacked bar chart: negative, neutral, and positive examples per language (English, French, German, Italian); y-axis from 0 to 20000. Segment values as extracted, without their assignment to language and class: 7544, 4095, 7752, 5319, 2942, 2892, 1434, 2193, 2748, 2293, 2120, 1443.]
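Results. Methods: SL-CNN, SL-CNN (no dist.), SVM, RF. Languages: English, French, German, Italian. Scores in slide order: 63.49, 60.46, 60.61, 48.60 | 64.79, 63.25, -, 53.86 | 65.09, 62.10, -, 52.40 | 67.79, 64.08, -, 52.71. [Table flattened in the transcript; the assignment of scores to methods and languages is not preserved.]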
Competition Winner: EvalIta 2016, SemEval 2016
Summary - Sentiment Analysis: Easy to adapt to multiple languages. Data-intensive.
Task 2: Gender and Variety
Data - PAN 2017. [Bar chart: number of profiles per language (English, Spanish, Portuguese, Arabic); the numbers 4800, 3600, 2400, and 1200 appear on the slide.]
Data: Variety (by language)
English: Australian, Canadian, British, Irish, New Zealand, USA
Spanish: Argentina, Chile, Colombia, Mexico, Peru, Spain, Venezuela
Portuguese: Brazil, Portugal
Arabic: Egypt, Gulf, Maghrebi, Levantine
F1 scores: GRU - PAN 2017. [Bar chart: F1 for Gender and Variety per language (English, Spanish, Portuguese, Arabic); y-axis from 0 to 100. Bar values as extracted, without their assignment to task and language: 98.67, 92.04, 79.47, 79.04, 79.03, 78.85, 72.45, 71.46.]
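For readers unfamiliar with the metric, a small sketch of how an F1 score can be computed from gold and predicted labels follows. The slide does not state which averaging was used; macro-averaging over classes is an assumption here, and the labels are toy data.

```python
# Sketch of per-class and macro-averaged F1 on toy labels.
# Macro-averaging is an assumption; the slide does not specify it.

def f1_per_class(gold, predicted, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F1 over all gold classes."""
    labels = sorted(set(gold))
    return sum(f1_per_class(gold, predicted, lab) for lab in labels) / len(labels)


gold = ["female", "male", "female", "male"]
pred = ["female", "female", "female", "male"]

print(f1_per_class(gold, pred, "female"))
print(f1_per_class(gold, pred, "male"))
print(macro_f1(gold, pred))
```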
Results: Architectures (English only). [Bar chart: scores for Gender and Variety with CNN, LSTM-Att, and GRU-Att; y-axis from 0 to 100. Bar values as extracted, without their assignment to task and architecture: 79.03, 79.03, 78.72, 78.7, 73.23, 70.9.]
Summary: Good data yields good results. In deep learning, the focus lies in finding and tuning the right architecture.
Task 3: Community Question Answering (cQA)
cQA - Data - SemEval 2017. [Bar chart: dataset sizes for Training, Validation, and Test; y-axis from 0 to 25000. Bar values as extracted, without their assignment to the splits: 21960, 12600, 3270.]
cQA - Approach - Siamese CNN
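The slide names the approach without detail. The defining idea of a Siamese network is that both inputs pass through the same encoder with shared weights, and the two encodings are then compared. The sketch below illustrates this with a tiny convolutional encoder and cosine similarity; the filters, embeddings, and function names are hypothetical, and the comparison function used in the talk is not stated.

```python
import math

# Sketch of a Siamese setup: one shared convolutional encoder applied to
# both the question and the comment, compared by cosine similarity.
# All weights and embeddings are toy values.

def encode(vectors, filters):
    """For each filter, convolve over token windows and max-pool over time.
    Returns one feature per filter."""
    features = []
    for conv_filter in filters:
        width = len(conv_filter)
        best = 0.0  # starting the maximum at 0.0 also acts as the ReLU
        for start in range(len(vectors) - width + 1):
            total = sum(
                x * w
                for vec, weights in zip(vectors[start:start + width], conv_filter)
                for x, w in zip(vec, weights)
            )
            best = max(best, total)
        features.append(best)
    return features


def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def siamese_score(question, comment, filters):
    # The same filters (shared weights) encode both inputs.
    return cosine(encode(question, filters), encode(comment, filters))


filters = [
    [[1.0, 0.0], [1.0, 0.0]],  # responds to the first embedding dimension
    [[0.0, 1.0], [0.0, 1.0]],  # responds to the second embedding dimension
]
question = [[1.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
similar = [[0.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
different = [[0.0, 1.0], [0.0, 1.0], [0.0, 0.0]]

print(siamese_score(question, similar, filters))
print(siamese_score(question, different, filters))
```

Because of the max-over-time pooling, the matching pattern is detected regardless of where it occurs in the text.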
cQA - Results - SemEval 2017
cQA - Summary: Deep learning supports a large variety of architectures. The feature-based approach works well.
Conclusion: Deep learning is very data-intensive. It is not always better than feature-based approaches. The work shifts from feature engineering to architecture engineering.
Code: https://github.com/spinningbytes/deep-mlsa