Visually Attacking and Shielding NLP Systems for Human-like Text Processing

This presentation covers techniques for visually attacking and shielding NLP systems to make them more robust. It is motivated by spamming, domain name spoofing, and toxic comments in the open domain, and it reports the experiments and results obtained with VIPER (VIsual PERturber) at the UKP Lab, Technische Universität Darmstadt.

  • Text Processing
  • NLP Systems
  • VIPER Project
  • Human-like Processing
  • UKP Lab

Presentation Transcript


  1. Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems. Steffen Eger, Gözde Gül Şahin, Andreas Rücklé, Ji-Ung Lee, Claudia Schulz, Mohsen Mesgar, Krishnkant Swarnkar, Edwin Simpson, Iryna Gurevych. Ubiquitous Knowledge Processing Lab, AIPHES Research Training Group, Technische Universität Darmstadt. https://www.ukp.tu-darmstadt.de/ https://www.aiphes.tu-darmstadt.de

  2. Motivation; Human vs Machine Text Processing; Visual Perturber (VIPER); Making NLP Systems more robust; Experiments & Results; Recap & Conclusion. 03.06.2019 | Computer Science Department | UKP Lab | Ji-Ung Lee

  3. Motivation

  4. NLP out in the open

  5. NLP out in the open: Spamming; Domain name spoofing; Toxic Comments

  6. NLP out in the open: Spamming; Domain name spoofing; Toxic Comments ("You idiot! You have no idea! Go to hell!")

  7. NLP out in the open: Spamming; Domain name spoofing; Toxic Comments ("You idiot! You have no idea! Go to hell!")

  8. NLP out in the open: Spamming; Domain name spoofing (http://wikipedia.org, displayed with a visually perturbed character); Toxic Comments ("You idiot! You have no idea! Go to hell!", displayed with visually perturbed characters)

  9. NLP out in the open: Spamming; Domain name spoofing (http://wikipedia.org, displayed with a visually perturbed character); Toxic Comments ("You idiot! You have no idea! Go to hell!", displayed with visually perturbed characters)

  10. Human vs Machine Text Processing

  11. Visually Attacking NLP Systems. [Figure: Human vs Machine. Input for both: "You idiot!", displayed with visually perturbed characters]

  12. Visually Attacking NLP Systems. [Figure: Human vs Machine. Input for both: "You idiot!", displayed with visually perturbed characters]

  13. Visually Attacking NLP Systems. [Figure: Human vs Machine. Input for both: "You idiot!", visually perturbed. Human: reads the characters i d i o t]

  14. Visually Attacking NLP Systems. [Figure: Human vs Machine. Input for both: "You idiot!", visually perturbed. Human: i d i o t, judged toxic]

  15. Visually Attacking NLP Systems. [Figure: Human vs Machine. Input for both: "You idiot!", visually perturbed. Human: i d i o t, judged toxic. Machine: characters mapped to code points 105 100 111 248 427]

  16. Visually Attacking NLP Systems. [Figure: Human vs Machine. Input for both: "You idiot!", visually perturbed. Human: i d i o t, judged toxic. Machine: code points 105 100 111 248 427, then embedding vector [0.1, ..., 0.2, 0.6, ..., 0.1, ...]]

  17. Visually Attacking NLP Systems. [Figure: Human vs Machine. Input for both: "You idiot!", visually perturbed. Human: i d i o t, judged toxic. Machine: code points 105 100 111 248 427, then embedding vector [0.1, ..., 0.2, 0.6, ..., 0.1, ...], prediction: ???]

  18. Visually Attacking NLP Systems. [Figure: Human vs Machine. Input for both: "You idiot!", visually perturbed. Human: i d i o t, judged toxic. Machine: code points 105 100 111 248 427, then embedding vector [0.1, ..., 0.2, 0.6, ..., 0.1, ...], prediction: ???]
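
The gap between the two readers comes down to representation: a model that indexes characters by code point treats the look-alike characters 248 and 427 as unrelated symbols. A minimal Python sketch of that effect follows. The perturbed string, the lowercase-ASCII vocabulary, and the helper names `code_points` and `to_ids` are illustrative assumptions, not taken from the slides.

```python
import string


def code_points(text):
    """Return the Unicode code point of every character in text."""
    return [ord(ch) for ch in text]


# Hypothetical character vocabulary: lowercase ASCII only, id 0 for unknowns.
UNK_ID = 0
VOCAB = {ch: i + 1 for i, ch in enumerate(string.ascii_lowercase)}


def to_ids(text):
    """Map characters to vocabulary ids; unseen characters collapse to UNK_ID."""
    return [VOCAB.get(ch, UNK_ID) for ch in text]


clean = "idiot"
# Assumed perturbation: 'o' -> chr(248) and 't' -> chr(427), the two
# non-ASCII code points that appear on the slide.
perturbed = "idi" + chr(248) + chr(427)

print(code_points(clean))      # [105, 100, 105, 111, 116]
print(code_points(perturbed))  # [105, 100, 105, 248, 427]
print(to_ids(clean))           # [9, 4, 9, 15, 20]
print(to_ids(perturbed))       # [9, 4, 9, 0, 0]
```

To a human the two strings look almost identical; to the lookup table the last two characters of the perturbed string carry no information at all.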

  19. VIsual PERturber (VIPER)

  20. Creating Perturbations with VIPER: Character embeddings; Use pixel values as initialization; Cosine similarity to obtain similar characters
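
The idea of ranking characters by cosine similarity over pixel values can be sketched in a few lines of Python. The 5x5 bitmaps below are hand-drawn toy glyphs and the function names are hypothetical; the slides do not state the glyph resolution or font, and any embedding step applied on top of the raw pixels is left out here.

```python
import math

# Hypothetical 5x5 bitmaps; '#' is an inked pixel, '.' is background.
GLYPHS = {
    "o": [".###.", "#...#", "#...#", "#...#", ".###."],
    "ø": [".####", "#..##", "#.#.#", "##..#", "####."],
    "l": ["..#..", "..#..", "..#..", "..#..", "..#.."],
    "i": ["..#..", ".....", "..#..", "..#..", "..#.."],
}


def pixels(rows):
    """Flatten a bitmap into a vector of 0.0/1.0 pixel values."""
    return [1.0 if px == "#" else 0.0 for row in rows for px in row]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbours(ch, k=2):
    """Return the k most visually similar characters as (char, similarity)."""
    target = pixels(GLYPHS[ch])
    scored = [
        (other, cosine(target, pixels(rows)))
        for other, rows in GLYPHS.items()
        if other != ch
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


print(nearest_neighbours("o"))  # 'ø' ranks first
print(nearest_neighbours("l"))  # 'i' ranks first
```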

  21. Attacking NLP Systems: Use VIPER to perturb characters. [Figure: test data "f u c k i n g" perturbed with p = 0.9]

  22. Attacking NLP Systems: Use VIPER to perturb characters; p: probability of perturbing a character. [Figure: test data "f u c k i n g" perturbed with p = 0.9]

  23. Attacking NLP Systems: Use VIPER to perturb characters; p: probability of perturbing a character; How do state-of-the-art models perform on perturbed test data? [Figure: model predictions (???) on clean test data "f u c k i n g" vs the same data perturbed with p = 0.9]
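
A minimal sketch of perturbing each character with probability p, written in Python. The `NEIGHBOURS` table is a hand-written stand-in for the visually similar characters VIPER obtains from its embedding space, and choosing uniformly among the candidates is an assumption; the slides only specify that p is the probability of perturbing a character.

```python
import random

# Hypothetical look-alike table; characters without an entry are left unchanged.
NEIGHBOURS = {
    "g": "ĝğ", "a": "àáâ", "e": "èéê",
    "o": "òóø", "r": "ŕř", "n": "ñń",
}


def perturb(text, p, rng):
    """Replace each character that has look-alikes with probability p."""
    out = []
    for ch in text:
        candidates = NEIGHBOURS.get(ch)
        if candidates and rng.random() < p:
            out.append(rng.choice(candidates))  # uniform choice is an assumption
        else:
            out.append(ch)
    return "".join(out)


rng = random.Random(0)
sentence = "game over man !"
for p in (0.0, 0.2, 0.4, 0.8):
    print(p, perturb(sentence, p, rng))
```

With p = 0.0 the sentence is returned unchanged, and with p = 1.0 every character that has an entry in the table is replaced.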

  24. Human vs Machine Performance. 0% Perturbance: game over man !

  25. Human vs Machine Performance. 20% Perturbance: "game over man !" displayed with 20% of its characters visually perturbed

  26. Human vs Machine Performance. 40% Perturbance: "game over man !" displayed with 40% of its characters visually perturbed

  27. Human vs Machine Performance. 80% Perturbance: "game over man !" displayed with 80% of its characters visually perturbed

  28. Human vs Machine Performance. 80% Perturbance: "game over man !" displayed with 80% of its characters visually perturbed. At p = 0.8: human performance ~94%; machine performance ~20-70%

  29. Making NLP Systems more robust

  30. How to shield against such attacks? Visually Informed Character Embeddings; Data Augmentation

  31. How to shield against such attacks? Visually Informed Character Embeddings; Data Augmentation

  32. Visually Informed Character Embeddings. [Figure: visually uninformed vs visually informed model, each with input "T h e c a t"]

  33. Visually Informed Character Embeddings. [Figure: visually uninformed vs visually informed model, each with input "T h e c a t". Uninformed: random init, embedding vector [0.1, ..., 0.2, 0.7, ..., 0.2, ...]]

  34. Visually Informed Character Embeddings. [Figure: visually uninformed vs visually informed model, each with input "T h e c a t". Uninformed: random init, embedding vector [0.1, ..., 0.2, 0.7, ..., 0.2, ...], prediction: non-toxic]

  35. Visually Informed Character Embeddings. [Figure: visually uninformed vs visually informed model, each with input "T h e c a t". Uninformed: random init, embedding vector [0.1, ..., 0.2, 0.7, ..., 0.2, ...], prediction: non-toxic. Informed: pixel images]

  36. Visually Informed Character Embeddings. [Figure: visually uninformed vs visually informed model, each with input "T h e c a t". Uninformed: random init, embedding vector [0.1, ..., 0.2, 0.7, ..., 0.2, ...], prediction: non-toxic. Informed: pixel images, embedding vector [0.5, ..., 0.3, 0.8, ..., 0.1, ...]]

  37. Visually Informed Character Embeddings. [Figure: visually uninformed vs visually informed model, each with input "T h e c a t". Uninformed: random init, embedding vector [0.1, ..., 0.2, 0.7, ..., 0.2, ...], prediction: non-toxic. Informed: pixel images, embedding vector [0.5, ..., 0.3, 0.8, ..., 0.1, ...], prediction: non-toxic]

  38. Visually Informed Character Embeddings: No random initialization of character embeddings; Use concatenated pixel values. [Figure: input "T h e c a t", pixel images, embedding vector [0.5, ..., 0.3, 0.8, ..., 0.1, ...], prediction: non-toxic]

  39. Visually Informed Character Embeddings: No random initialization of character embeddings; Use concatenated pixel values; Visual similarities can now be learned during training; More likely to generalize better to new characters during testing. [Figure: input "T h e c a t", pixel images, embedding vector [0.5, ..., 0.3, 0.8, ..., 0.1, ...], prediction: non-toxic]
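
The contrast between the two initializations can be illustrated with a small Python sketch. It covers only how a character is turned into a vector, not the training that follows. The 5x5 bitmaps, the `<unk>` fallback, and all function names are assumptions made for illustration; how the original system handled unseen characters in the uninformed setting is not stated on the slides.

```python
import random

# Hypothetical 5x5 bitmaps; a real pipeline would rasterise glyphs from a font.
BITMAPS = {
    "o": [".###.", "#...#", "#...#", "#...#", ".###."],
    "ø": [".####", "#..##", "#.#.#", "##..#", "####."],
    "l": ["..#..", "..#..", "..#..", "..#..", "..#.."],
}


def pixel_embedding(ch):
    """Visually informed: concatenate the glyph's pixel rows into one vector."""
    return [1.0 if px == "#" else 0.0 for row in BITMAPS[ch] for px in row]


def random_table(train_chars, dim, seed=0):
    """Visually uninformed: one random vector per training character plus UNK."""
    rng = random.Random(seed)
    table = {
        ch: [rng.uniform(-1, 1) for _ in range(dim)] for ch in sorted(train_chars)
    }
    table["<unk>"] = [rng.uniform(-1, 1) for _ in range(dim)]
    return table


def lookup(table, ch):
    """Return the stored vector, or the shared UNK vector for unseen characters."""
    return table.get(ch, table["<unk>"])


def shared_pixels(u, v):
    """Count positions where both pixel vectors are inked."""
    return sum(1 for a, b in zip(u, v) if a == 1.0 and b == 1.0)


table = random_table({"o", "l"}, dim=25)
# 'ø' never occurred in training: the table can only return the UNK vector,
# while its pixel vector still overlaps heavily with that of 'o'.
print(lookup(table, "ø") == table["<unk>"])                        # True
print(shared_pixels(pixel_embedding("ø"), pixel_embedding("o")))  # 12
print(shared_pixels(pixel_embedding("ø"), pixel_embedding("l")))  # 3
```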

  40. How to shield against such attacks? Visually Informed Character Embeddings; Data Augmentation

  41. Adversarial Training. [Figure: training data "f u c k i n g"]

  42. Adversarial Training: Enrich training data with perturbations [Goodfellow et al., 2015]. [Figure: training data "f u c k i n g" perturbed with p = 0.2]

  43. Adversarial Training: Enrich training data with perturbations [Goodfellow et al., 2015]. [Figure: training data "f u c k i n g" perturbed with p = 0.2, embedding vector [0.5, ..., 0.3, 0.8, ..., 0.1, ...]]

  44. Adversarial Training: Enrich training data with perturbations [Goodfellow et al., 2015]. [Figure: training data "f u c k i n g" perturbed with p = 0.2, embedding vector [0.5, ..., 0.3, 0.8, ..., 0.1, ...], prediction: toxic]

  45. Adversarial Training: Enrich training data with perturbations [Goodfellow et al., 2015]; Seeing perturbed data during training allows the model to recognize it during testing. [Figure: training data "f u c k i n g" perturbed with p = 0.2, embedding vector [0.5, ..., 0.3, 0.8, ..., 0.1, ...], prediction: toxic]
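
Enriching the training data with perturbed examples can be sketched as follows in Python. The look-alike table, the example sentences, and the choice to append exactly one perturbed copy per training example are assumptions; the slides only state that training data is enriched with perturbations at p = 0.2 and that labels such as "toxic" are kept.

```python
import random

# Hypothetical look-alike table standing in for VIPER's similar characters.
LOOKALIKES = {"a": "àá", "e": "èé", "i": "ìí", "o": "òó", "u": "ùú"}


def perturb(text, p, rng):
    """Swap each character that has look-alikes with probability p."""
    return "".join(
        rng.choice(LOOKALIKES[ch]) if ch in LOOKALIKES and rng.random() < p else ch
        for ch in text
    )


def augment(dataset, p=0.2, seed=0):
    """Return the original examples followed by one perturbed copy of each."""
    rng = random.Random(seed)
    perturbed = [(perturb(text, p, rng), label) for text, label in dataset]
    return list(dataset) + perturbed


train = [("you idiot!", "toxic"), ("the cat", "non-toxic")]
augmented = augment(train)
for text, label in augmented:
    print(label, text)
```

The label of each perturbed copy is inherited from its source example, so the model sees the same supervision signal for both surface forms.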

  46. Experiments & Results

  47. General Setup: ELMo embeddings [Peters et al., 2018]

  48. General Setup: ELMo embeddings [Peters et al., 2018]; Evaluation on four downstream tasks: Character level: Grapheme to phoneme conversion; Word level: PoS Tagging, Chunking; Sentence level: Toxic comment classification

  49. General Setup: ELMo embeddings [Peters et al., 2018]; Evaluation on four downstream tasks: Character level: Grapheme to phoneme conversion; Word level: PoS Tagging, Chunking; Sentence level: Toxic comment classification; Different embedding spaces for generating perturbed test data

  50. General Setup: ELMo embeddings [Peters et al., 2018]; Evaluation on four downstream tasks: Character level: Grapheme to phoneme conversion; Word level: PoS Tagging, Chunking; Sentence level: Toxic comment classification; Different embedding spaces for generating perturbed test data
