
Visually Attacking and Shielding NLP Systems for Human-like Text Processing
An overview of techniques for visually attacking and shielding NLP systems to improve their robustness. The talk covers making NLP more human-like and resilient against spamming and toxic comments in the open domain, presenting experiments and results from the VIPER project conducted by the UKP Lab at Technische Universität Darmstadt.
Presentation Transcript
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems. Steffen Eger, Gözde Gül Şahin, Andreas Rücklé, Ji-Ung Lee, Claudia Schulz, Mohsen Mesgar, Krishnkant Swarnkar, Edwin Simpson, Iryna Gurevych. Ubiquitous Knowledge Processing Lab, AIPHES Research Training Group, Technische Universität Darmstadt. https://www.ukp.tu-darmstadt.de/ https://www.aiphes.tu-darmstadt.de
Outline: Motivation | Human vs Machine Text Processing | Visual Perturber (VIPER) | Making NLP Systems more robust | Experiments & Results | Recap & Conclusion. 03.06.2019 | Computer Science Department | UKP Lab | Ji-Ung Lee
Motivation
NLP out in the open
NLP systems deployed in the open face spamming, domain name spoofing, and toxic comments such as "You idiot! You have no idea! Go to hell!" Visually perturbed variants of the same content, e.g. a spoofed "http://wikipedia.org" with look-alike characters, or the same toxic comment with key letters swapped for visual look-alikes, can slip past automatic detection.
Human vs Machine Text Processing
Visually Attacking NLP Systems
Human: shown "You idiot!" with some letters replaced by visual look-alikes, a reader still recognizes the word "idiot" and judges the comment toxic.
Machine: each character is first mapped to a vocabulary index (e.g. 105, 100, 111 for known characters), while the perturbed characters map to rare or unseen indices such as 248 and 427. The corresponding embedding vectors (e.g. 0.1 ... 0.2, 0.6 ... 0.1) carry no useful signal for those characters, so the model cannot recover "idiot" and its prediction becomes unreliable.
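The machine-side failure can be made concrete in a few lines: characters are mapped to vocabulary indices before any embedding lookup, and a visually perturbed character falls outside the vocabulary. The vocabulary below is a made-up toy for illustration, not the one used in the talk.

```python
# Sketch of the lookup step the attack exploits: characters become
# vocabulary indices before any embedding is applied. The vocabulary
# below is a made-up toy, not the one used in the talk.
VOCAB = {"y": 0, "o": 1, "u": 2, " ": 3, "i": 4, "d": 5, "t": 6, "!": 7}
UNK = len(VOCAB)  # shared index for characters never seen in training

def char_indices(text):
    """Map text to character indices; unknown characters collapse to UNK."""
    return [VOCAB.get(ch, UNK) for ch in text.lower()]
```

Here "idiot" maps to distinct, informative indices, while a perturbed "ıdıot" (with dotless ı) maps both perturbed letters to the same uninformative UNK index, which is why the model's prediction degrades.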
VIsual PERturber (VIPER)
Creating Perturbations with VIPER
Character embeddings use the pixel values of each character's glyph as initialization; cosine similarity in this space is then used to obtain visually similar characters.
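The neighbor-selection step can be sketched as follows: represent each character by a flattened pixel bitmap of its glyph and rank other characters by cosine similarity. The bitmaps below are tiny hand-made vectors so the sketch stays self-contained; in VIPER they come from actual font renderings.

```python
import numpy as np

# Toy pixel "embeddings": in VIPER these are flattened glyph bitmaps
# rendered from a font; here we use tiny hand-made vectors instead.
# (Illustrative values, not real glyph pixels.)
PIXELS = {
    "i": np.array([0.0, 1.0, 0.0, 1.0]),
    "ı": np.array([0.0, 1.0, 0.0, 1.1]),  # visually close to "i"
    "x": np.array([1.0, 0.0, 1.0, 0.0]),  # visually distant
}

def cosine(u, v):
    """Cosine similarity between two pixel vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def nearest_neighbors(char, k=1):
    """Return the k characters whose pixel vectors are most similar."""
    sims = [(c, cosine(PIXELS[char], p))
            for c, p in PIXELS.items() if c != char]
    sims.sort(key=lambda t: t[1], reverse=True)
    return [c for c, _ in sims[:k]]
```

With this table, the dotless "ı" comes out as the closest visual neighbor of "i", which is exactly the kind of substitute the perturber picks.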
Attacking NLP Systems
Use VIPER to perturb characters of the test data, where p is the probability of perturbing each character; at p = 0.9, almost every character of a word like "fucking" is replaced by a visual look-alike. How do state-of-the-art models perform on such perturbed test data?
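The attack itself is a simple randomized substitution, sketched below. The neighbor table is hypothetical; VIPER derives such substitutes from visual character embeddings.

```python
import random

# Hypothetical neighbor table: VIPER derives such substitutes from
# visual (pixel-based) character embeddings; these entries are made up.
NEIGHBORS = {"i": ["ı", "í"], "o": ["ö", "ø"], "e": ["é", "è"], "u": ["ü", "û"]}

def viper_perturb(text, p, seed=0):
    """Replace each character with a visual neighbor with probability p."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch in NEIGHBORS and rng.random() < p:
            out.append(rng.choice(NEIGHBORS[ch]))
        else:
            out.append(ch)
    return "".join(out)
```

At p = 0 the text passes through unchanged; as p approaches 1, nearly every character with a known look-alike is swapped, matching the perturbation levels shown on the following slides.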
Human vs Machine Performance
The phrase "game over man !" is shown with 0%, 20%, 40%, and 80% of its characters visually perturbed; even at 80% it remains largely readable for humans.
At p = 0.8: human performance ~94%, machine performance ~20-70%.
Making NLP Systems more robust
How to shield against such attacks? Two approaches: visually informed character embeddings and data augmentation.
Visually Informed Character Embeddings
Visually uninformed: character embeddings (e.g. 0.1 ... 0.2, 0.7 ... 0.2) are randomly initialized, so nothing ties a perturbed character to the letter it resembles.
Visually informed: character embeddings (e.g. 0.5 ... 0.3, 0.8 ... 0.1) are initialized from pixel images of the characters.
Both variants classify the clean input "The cat" as non-toxic, but only the visually informed one carries shape information that can survive perturbation.
Visually Informed Character Embeddings
No random initialization of character embeddings; use concatenated pixel values instead. Visual similarities can then be learned during training, and the model is more likely to generalize to new characters during testing.
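A minimal sketch of pixel-based initialization, using toy glyph bitmaps in place of real font renderings: the embedding matrix is built from the flattened pixel values of each character's image rather than from random numbers.

```python
import numpy as np

# Toy glyph bitmaps standing in for rendered character images; in the
# talk, embeddings are initialized from concatenated pixel values of
# each character's glyph. (Illustrative bitmaps, not real glyphs.)
GLYPHS = {
    "a": np.array([[0, 1, 0], [1, 1, 1]], dtype=float),
    "à": np.array([[1, 1, 0], [1, 1, 1]], dtype=float),  # look-alike of "a"
    "x": np.array([[1, 0, 1], [0, 1, 0]], dtype=float),
}
vocab = list(GLYPHS)
emb = np.stack([GLYPHS[c].ravel() for c in vocab])  # one row per character

def embed(char):
    """Look up the pixel-initialized embedding of a character."""
    return emb[vocab.index(char)]
```

Because "à" differs from "a" in a single pixel, their embeddings start close together, so the model sees a perturbed character as a near-neighbor of a known one rather than as an arbitrary vector.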
How to shield against such attacks? Visually Informed Character Embeddings; Data Augmentation
Adversarial Training
Enrich the training data with perturbations [Goodfellow et al., 2015]: each training example (e.g. "fucking", labeled toxic) is additionally perturbed with a small probability such as p = 0.2 and added with its original label. Seeing perturbed data during training allows the model to recognize it during testing.
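The augmentation step can be sketched as follows; the look-alike table is made up here, and in the talk the perturbed copies come from VIPER itself.

```python
import random

# Made-up look-alike table; in the talk, perturbations come from VIPER.
NEIGHBORS = {"i": ["ı", "í"], "o": ["ö", "ø"], "u": ["ü", "û"]}

def perturb(text, p, rng):
    """Replace each character with a look-alike with probability p."""
    return "".join(
        rng.choice(NEIGHBORS[c]) if c in NEIGHBORS and rng.random() < p else c
        for c in text
    )

def augment(dataset, p=0.2, copies=1, seed=0):
    """Return the original examples plus perturbed copies, labels unchanged."""
    rng = random.Random(seed)
    out = list(dataset)
    for text, label in dataset:
        for _ in range(copies):
            out.append((perturb(text, p, rng), label))
    return out
```

Training on the union of clean and perturbed examples is what lets the model map a perturbed "you ıdıot" to the same toxic label as the clean original.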
Experiments & Results
General Setup
ELMo embeddings [Peters et al., 2018]. Evaluation on four downstream tasks: grapheme-to-phoneme conversion (character level); PoS tagging and chunking (word level); toxic comment classification (sentence level). Different embedding spaces are used for generating the perturbed test data.