Community-Based Valence-Arousal Prediction Method

Sentiment analysis in the valence-arousal (VA) dimensions predicts affective ratings of words. Learn about the VA space representation and related sentiment lexicons for text analysis.

  • Sentiment analysis
  • Affective words
  • Valence-arousal
  • Text analysis
  • Community-based method





  1. A Community-based Method for Valence-Arousal Prediction of Affective Words. Liang-Chih Yu, Department of Information Management, Yuan Ze University, Taiwan, R.O.C.

  2. Outline
     • Introduction: categorical and dimensional sentiment analysis; the Valence-Arousal (VA) space
     • Related Work: VA prediction for affective words and long/short texts
     • The Proposed Method: a community-based weighted graph model
     • Experimental Results
     • Conclusions and Future Work
     • Demo

  3. Introduction
     • Sentiment analysis: identify and extract opinion/sentiment/subjective information from texts
     • Categorical representation (discrete classes): positive or negative; six basic emotions (anger, happiness, fear, sadness, disgust, and surprise) (Ekman, 1992)
     • Dimensional representation (continuous values): Valence-Arousal (VA) (Russell, 1980); Pleasure-Arousal-Dominance (PAD) (Mehrabian, 1996)

  4. Valence-Arousal (VA) Space
     • Valence: the degree of pleasant vs. unpleasant (or positive vs. negative) feeling
     • Arousal: the degree of excitement vs. calm
     • A point in the VA space represents the affect state of a word/sentence/document
     [Figure: the four quadrants of the VA space. Quadrant I, high-positive (excited, delighted, happy); quadrant II, high-negative (tense, angry, frustrated); quadrant III, low-negative (depressed, bored, tired); quadrant IV, low-positive (content, relaxed, calm)]

  5. Categorical Sentiment Analysis
     • Classify given texts into a set of predefined categories (example site: http://dailyview.tw/)

  6. Dimensional Sentiment Analysis
     • Determine the degrees of valence and arousal of given texts
     • Provides a more fine-grained sentiment analysis
     [Figure: VA plots of opinions about "volkswagen" in the latest month vs. two months ago]

  7. Related Work: Sentiment Lexicons
     • Categorical (polarity lexicons): General Inquirer; Liu's Opinion Lexicon; MPQA Subjectivity Lexicon; NTU Sentiment Dictionary (NTUSD); SentiWordNet; Linguistic Inquiry and Word Count (LIWC); Chinese Linguistic Inquiry and Word Count (C-LIWC)
     • Dimensional (VA lexicons): Affective Norms for English Words (ANEW); Warriner's extended ANEW; Chinese Valence-Arousal Words (CVAW) (Yu et al., submitted to LREC 2016)

  8. Related Work: Corpora
     • Categorical: Chinese Opinion Treebank; IMDB; MPQA Opinion Corpus; Sentiment140; SemEval
     • Dimensional: Affective Norms for English Text (ANET); Chinese Valence-Arousal Text (CVAT) (Yu et al., submitted to LREC 2016); Stanford Sentiment Treebank

  9. Related Work: Word Level
     • VA lexicon construction (semi-supervised): predicting the VA ratings of unseen words from those of their similar seed words
     • Cross-lingual (unseen and seed words in different languages): linear regression + ontology (Wei et al., 2011); locally-weighted linear regression + similarity (Wang et al., 2015)
     • Mono-lingual (unseen and seed words in the same language): linear regression + kernel function (Malandrakis et al., 2013); weighted graph model (Esuli and Sebastiani, 2007; Yu et al., 2015)
     • SemEval 2015 Task 10 Subtask E: determining the strength of Twitter terms (a single dimension)

  10. Related Work: Sentence/Document Level
     • Predicting VA ratings of short or long texts
     • Lexicon-based approaches: averaging the VA ratings of all affective words in sentences/documents (Gökçay et al., 2012; Paltoglou et al., 2013), using a weighted arithmetic mean, a weighted geometric mean, or a Gaussian mixture model
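The lexicon-based averaging above can be sketched in a few lines. The word ratings and uniform weights below are illustrative, not taken from any real lexicon:

```python
import math

def weighted_arithmetic_mean(ratings, weights):
    """Weighted arithmetic mean of the VA ratings of a text's affective words."""
    total = sum(weights)
    return sum(r * w for r, w in zip(ratings, weights)) / total

def weighted_geometric_mean(ratings, weights):
    """Weighted geometric mean: exp of the weighted mean of log-ratings."""
    total = sum(weights)
    return math.exp(sum(w * math.log(r) for r, w in zip(ratings, weights)) / total)

# Valence ratings of the affective words found in one sentence (uniform weights).
vals = [7.30, 6.95, 2.24]
ws = [1.0, 1.0, 1.0]
print(round(weighted_arithmetic_mean(vals, ws), 2))  # 5.5
```

The geometric mean is pulled more strongly toward low ratings than the arithmetic mean, which is why both variants appear in the lexicon-based literature.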

  11. Related Work: Methods
     • Linear regression (Wei et al., 2011; Wang et al., 2015) captures the relationship between the similarities and VA ratings among a set of seed words:
       val(w_i) = a · Sim(w_i, w_j) + b
       where val is the valence, Sim(w_i, w_j) the similarity between words, and a, b the regression coefficients.
     • Linear regression + kernel function (Malandrakis et al., 2013):
       val(w_i) = b + Σ_{j=1..N} a_j · val(w_j) · f(Sim(w_i, w_j))
       where N is the number of seeds, a_j the weight of seed j, and f a kernel function.
     • Graph model (PageRank) (Esuli and Sebastiani, 2007):
       val^t(w_i) = α · Σ_{w_j ∈ Nei(w_i)} val^{t-1}(w_j) / |Nei(w_i)| + (1 − α) · e
       where Nei(w_i) is the set of neighbor nodes of w_i, α the decay factor, and e a constant.

  12. The Problem
     • Traditional methods considered all similar seeds for VA prediction, which may include seeds whose valence/arousal ratings differ greatly from (or have an inverse polarity to) those of the given word.
     • Example: unseen word "paradise" (valence 8.72); seeds ranked by similarity: 1. heaven (7.30), 2. bliss (6.95), 3. beautiful (7.60), 4. hell (2.24), 5. dream (6.73), 6. swamp (5.14), 7. lonely (2.17), 8. carefree (7.54), 9. nightmare (1.91)
     [Figure: "paradise" linked to positive neighbors heaven, bliss, dream, beautiful and to the negative neighbor hell]

  13. Other Examples
     • In ANEW (1,034 words), the ratio of same- to inverse-polarity neighbors is 7:3 for valence and 6:4 for arousal. Including such noisy words may reduce the prediction performance.

     Valence (word: actual / predicted / error; top 10 most similar neighbors)
     • paradise: 8.72 / 6.73 / 1.99; heaven (7.30), bliss (6.95), beautiful (7.60), hell (2.24), dream (6.73), swamp (5.14), lonely (2.17), carefree (7.54), nightmare (1.91), glory (7.55)
     • wealthy: 7.70 / 5.74 / 1.96; millionaire (8.03), luxury (7.88), handsome (7.93), lavish (6.21), greed (3.51), riches (7.70), famous (6.98), money (7.59), modest (5.76), selfish (2.42)

     Arousal (word: actual / predicted / error; top 10 most similar neighbors)
     • enraged: 7.97 / 6.07 / 1.90; angry (7.17), disgusted (5.42), frustrated (5.61), displeased (5.64), unhappy (4.18), resent (4.47), startled (6.93), terrified (7.83), upset (5.86), astonished (6.58)
     • peace: 2.95 / 4.66 / 1.71; justice (5.47), freedom (5.52), liberty (5.60), war (7.49), life (6.02), bless (4.05), dignified (4.12), disturb (5.80), hope (5.44), mind (5.00)

  14. Possible Solutions (1/2)
     • An ideal prediction method should account for seeds with the same polarity as an unseen word and exclude those with an inverse polarity (noisy words).
     • k-NN: select the top k most similar words as nearest neighbors.
     • ε-NN: select nearest neighbors by introducing a similarity threshold ε.
     • Limitation: highly similar words with an inverse polarity cannot be excluded.
     [Figure: a similarity threshold around "paradise" still admits the inverse-polarity neighbor hell alongside heaven, bliss, dream, beautiful]
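The two selection rules can be sketched as follows. The similarity scores are made up for illustration; note that neither rule filters out the high-similarity inverse-polarity seed "hell":

```python
def knn_select(similarities, k=10):
    """k-NN: keep the k seeds most similar to the unseen word.
    `similarities` maps seed word -> similarity score."""
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    return [w for w, _ in ranked[:k]]

def eps_nn_select(similarities, eps=0.4):
    """eps-NN: keep every seed whose similarity reaches the threshold."""
    return [w for w, s in similarities.items() if s >= eps]

sims = {"heaven": 0.71, "bliss": 0.66, "hell": 0.62, "swamp": 0.31}
print(knn_select(sims, k=2))      # ['heaven', 'bliss']
print(eps_nn_select(sims, 0.4))   # ['heaven', 'bliss', 'hell']
```

With k=3 or a threshold of 0.4, "hell" survives both rules despite its inverse polarity, which is exactly the weakness the slide points out.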

  15. Possible Solutions (2/2)
     • Graph partition methods (mincut / max-flow mincut): the edges with a lower degree of similarity to the unseen word are cut off.
     • The idea is similar to k-NN and ε-NN; all are similarity-based methods.
     [Figure: a mincut around "paradise" that still keeps the inverse-polarity neighbor hell]

  16. The Proposed Method
     • Community-based weighted graph model.
     • A community detection method selects seeds that are both similar to an unseen word and have similar ratings (or the same polarity) as the unseen word.
     • A weighted graph model then predicts the VA ratings of words from such high-quality seeds.
     • Observation: a word usually has more similar neighbors with the same polarity than with an inverse polarity.
     [Figure: the neighborhood of "paradise" split into a positive community (heaven, bliss, dream, beautiful) and a negative community (hell)]

  17. Community-based Weighted Graph Model
     Given an unseen word and a set of seed words:
     1. Calculate the similarities between the unseen word and the seed words.
     2. Construct a weighted graph where each node represents a word and each edge weight is the similarity between two nodes.
     3. Use a community detection method to select similar neighbors with the same polarity into the same community.
     4. Estimate the VA ratings of the unseen word from its community members using the weighted graph model (weight = similarity score).

  18. Similarity Calculation
     • Continuous vector representations for words: word vectors are trained on a large corpus (e.g., Wikipedia) using word2vec (Mikolov et al., 2013a; 2013b).
     • The cosine similarity between word vectors is adopted to measure word similarity.
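As a sketch, cosine similarity over word vectors; the 3-dimensional vectors below are toy stand-ins for real word2vec embeddings:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors standing in for trained word2vec embeddings.
paradise = [0.9, 0.1, 0.3]
heaven = [0.8, 0.2, 0.4]
print(round(cosine_similarity(paradise, heaven), 3))  # 0.984
```

In practice the vectors would come from a trained word2vec model (e.g., gensim's KeyedVectors), but the similarity computation is the same.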

  19. Weighted Graph Model
     • Seeds more similar to the unseen word should contribute more to the estimation process.
     • PageRank (Esuli and Sebastiani, 2007) weights all neighbors uniformly:
       val^t(w_i) = α · Σ_{w_j ∈ Nei(w_i)} val^{t-1}(w_j) / |Nei(w_i)| + (1 − α) · e
     • The weighted graph model (Yu et al., 2015) replaces the uniform weight with normalized similarities:
       val^t(w_i) = α · Σ_{w_j ∈ Nei(w_i)} [Sim(w_i, w_j) / Σ_{w_k ∈ Nei(w_i)} Sim(w_i, w_k)] · val^{t-1}(w_j) + (1 − α) · val^{t-1}(w_i)
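A minimal sketch of the similarity-weighted update for a single unseen word whose neighbors are fixed seeds. The neighbor set, similarity scores, decay factor, and neutral starting rating are all illustrative choices, not values from the paper:

```python
def predict_va(seed_ratings, sims, alpha=0.85, init=5.0, iterations=20):
    """One-word sketch of the weighted graph update: the unseen word's rating is
    repeatedly re-estimated as a similarity-weighted average of its neighbors'
    (fixed) seed ratings, damped by (1 - alpha) toward its previous value."""
    total = sum(sims.values())
    weighted = sum(sims[w] * seed_ratings[w] for w in sims) / total
    val = init
    for _ in range(iterations):
        val = (1 - alpha) * val + alpha * weighted
    return val

# Hypothetical positive neighbors of "paradise" with ANEW-style valence ratings.
seeds = {"heaven": 7.30, "bliss": 6.95, "beautiful": 7.60}
sims = {"heaven": 0.71, "bliss": 0.66, "beautiful": 0.60}
print(round(predict_va(seeds, sims), 2))  # 7.27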

  20. Community Detection Method
     • The community detection method divides a graph into several communities (sub-graphs).
     • Each community tends to consist of a set of similar words with the same polarity: densely connected internally, sparsely connected to other communities.
     [Figure: the neighborhood of "paradise" partitioned into a positive community (heaven, bliss, dream, beautiful) and a negative community (hell)]

  21. Modularity
     • A modularity value measures the associations within and between communities over a graph (Newman, 2006; Blondel et al., 2008):
       M = Σ_C [ Sim_{within,C} / (2m) − (Sim_{between,C} / (2m))² ]
       Sim_{within,C} = Σ_{w_i ∈ C, w_j ∈ C} Sim(w_i, w_j)
       Sim_{between,C} = Σ_{w_i ∈ C, w_j ∈ G} Sim(w_i, w_j)
       2m = Σ_{w_i, w_j ∈ G} Sim(w_i, w_j)
     • The goal is to search for the partition that maximizes the modularity over the graph. This is accomplished by iteratively repeating a modularity optimization step and a community merge step.

  22. Modularity Optimization Step (1/2)
     • Initially, each word in the graph is assigned to its own community.
     • Each word is then sequentially moved from its original community to each of its neighbor communities. A movement leads to a change of modularity:
       ΔM = ΔM_move-in + ΔM_move-out
       ΔM_move-in = M_after-move-in − M_before-move-in
       ΔM_move-out = M_after-move-out − M_before-move-out
       ΔM_move-in = [ (Sim_{within,C_j} + k_{w_i,C_j}) / (2m) − ((Sim_{between,C_j} + k_{w_i}) / (2m))² ] − [ Sim_{within,C_j} / (2m) − (Sim_{between,C_j} / (2m))² − (k_{w_i} / (2m))² ]
       where k_{w_i,C_j} = Σ_{w_j ∈ C_j} Sim(w_i, w_j) and k_{w_i} = Σ_{w_j ∈ G} Sim(w_i, w_j).

  23. Modularity Optimization Step (2/2)
     • After trying movements to all neighbor communities, the movement yielding the highest ΔM is taken, and only if that ΔM is positive; otherwise the word stays in its original community.
     • The movement procedure is performed sequentially and repeatedly for all words in the graph until no movement yields a positive ΔM.

  24. Community Merge Step
     • The communities found in the previous step are treated as new nodes to build a new weighted graph.
     • The weight of each edge between two nodes (communities) is the sum of the weights between all words in the two communities.
     • Two communities are considered neighbor nodes if they have at least one edge between them.
     • The new graph is then passed back to the previous step; the two steps are iterated until no new communities are found.

  25. VA Prediction
     • In testing, an unseen word is tentatively moved into each community to calculate the change of modularity ΔM, and is assigned to the community with the highest ΔM.
     • Only the neighbors in that community are included in the prediction process; those in other communities are ignored, excluding noisy neighbors.
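A sketch of the test-time assignment using the standard Louvain-style gain for inserting an isolated node into a community, ΔM = k_in/(2m) − Σ_tot·k_i/(2m²). All similarity scores, the per-community degrees (sigma_tot), and the graph weight 2m below are made up for illustration:

```python
def assign_unseen_word(sims, communities, sigma_tot, two_m):
    """Assign the unseen word to the community with the highest modularity gain.
    sims: seed -> similarity to the unseen word; communities: name -> member set;
    sigma_tot: name -> total weighted degree of that community (precomputed)."""
    k_i = sum(sims.values())  # total similarity of the unseen word to the graph
    best, best_gain = None, float("-inf")
    for name, members in communities.items():
        k_in = sum(s for w, s in sims.items() if w in members)
        # dM = k_in/(2m) - sigma_tot*k_i/(2*m^2), written in terms of 2m.
        gain = k_in / two_m - 2.0 * sigma_tot[name] * k_i / (two_m ** 2)
        if gain > best_gain:
            best, best_gain = name, gain
    return best

sims = {"heaven": 0.71, "bliss": 0.66, "beautiful": 0.60,
        "hell": 0.62, "nightmare": 0.45}
communities = {"positive": {"heaven", "bliss", "beautiful"},
               "negative": {"hell", "nightmare"}}
print(assign_unseen_word(sims, communities,
                         {"positive": 6.0, "negative": 4.0}, two_m=20.0))  # positive
```

Here "paradise" lands in the positive community even though "hell" is also highly similar, so the inverse-polarity seeds are excluded from the subsequent weighted-graph estimation.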

  26. Experiment Settings (1/2)
     • Datasets: ANEW (1,034 English words) and CVAW (1,653 Chinese words). A development set (20%) is used for optimal parameter selection; the test set (80%) is evaluated with 5-fold cross-validation.
     • Evaluation metrics: root mean square error (RMSE), mean absolute error (MAE), and the Pearson correlation coefficient (r).
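The three metrics can be sketched in a few lines; the example reuses the gold/predicted valence pairs from the error-analysis examples:

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Gold vs. predicted valence ratings for paradise, wealthy, funeral, sad.
actual = [8.72, 7.70, 1.39, 1.61]
predicted = [6.73, 5.74, 3.25, 3.44]
print(round(rmse(actual, predicted), 2), round(mae(actual, predicted), 2))  # 1.91 1.91
```

Lower RMSE/MAE and higher r are better; RMSE penalizes large individual errors more heavily than MAE.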

  27. Experiment Settings (2/2)
     • Compare the weighted graph model to other prediction models: linear regression, kernel function, and PageRank.
     • Compare the community detection method to other neighbor selection methods based on the weighted graph model: k-NN / ε-NN and mincut / max-flow mincut.

  28. Evaluation on Weighted Graph Model (1/2)
     [Figure: iterative results of the graph-based methods, PageRank vs. the weighted graph model]

  29. Evaluation on Weighted Graph Model (2/2)

     Valence              ANEW (English)          CVAW (Chinese)
                          RMSE   MAE    r         RMSE   MAE    r
     Kernel               1.871  1.381  0.612     1.834  1.367  0.632
     Linear Regression    1.813  1.322  0.624     1.786  1.298  0.645
     PageRank             1.508  1.079  0.753     1.524  1.142  0.718
     Weighted Graph       1.152  0.807  0.805     1.148  0.884  0.786

     Arousal              ANEW (English)          CVAW (Chinese)
                          RMSE   MAE    r         RMSE   MAE    r
     Kernel               1.854  1.365  0.417     1.842  1.363  0.403
     Linear Regression    1.804  1.328  0.428     1.807  1.325  0.416
     PageRank             1.606  1.152  0.469     1.588  1.136  0.466
     Weighted Graph       1.223  0.909  0.544     1.218  0.902  0.542

  30. Error Analysis
     • In ANEW (1,034 words), the ratio of same- to inverse-polarity neighbors is 7:3 for valence and 6:4 for arousal.

     Valence (word: actual / predicted / error; top 10 most similar neighbors)
     • paradise: 8.72 / 6.73 / 1.99; heaven (7.30), bliss (6.95), beautiful (7.60), hell (2.24), dream (6.73), swamp (5.14), lonely (2.17), carefree (7.54), nightmare (1.91), glory (7.55)
     • wealthy: 7.70 / 5.74 / 1.96; millionaire (8.03), luxury (7.88), handsome (7.93), lavish (6.21), greed (3.51), riches (7.70), famous (6.98), money (7.59), modest (5.76), selfish (2.42)
     • funeral: 1.39 / 3.25 / 1.86; burial (2.05), cemetery (2.63), coffin (2.56), wedding (7.82), morgue (1.92), grief (1.69), church (6.28), family (7.65), tomb (2.94), bereavement (4.57)
     • sad: 1.61 / 3.44 / 1.83; regretful (2.82), terrible (1.93), happy (8.21), pity (3.37), disgusted (2.45), thankful (6.89), lonely (2.17), grateful (7.37), cruel (1.97), stupid (2.31)

     Arousal (word: actual / predicted / error; top 10 most similar neighbors)
     • enraged: 7.97 / 6.07 / 1.90; angry (7.17), disgusted (5.42), frustrated (5.61), displeased (5.64), unhappy (4.18), resent (4.47), startled (6.93), terrified (7.83), upset (5.86), astonished (6.58)
     • ambulance: 7.33 / 5.35 / 1.98; hospital (5.98), taxi (3.41), bus (3.55), nurse (4.84), truck (4.84), trauma (6.33), doctor (5.86), morgue (4.84), accident (6.26), vehicle (4.63)
     • bored: 2.83 / 4.62 / 1.79; frustrated (5.61), lazy (2.65), addicted (4.81), fatigued (2.64), confused (6.03), mad (6.76), lonely (4.51), seasick (5.80), scared (6.82), discouraged (4.53)
     • peace: 2.95 / 4.66 / 1.71; justice (5.47), freedom (5.52), liberty (5.60), war (7.49), life (6.02), bless (4.05), dignified (4.12), disturb (5.80), hope (5.44), mind (5.00)

  31. Evaluation on Community-based Method (1/3)
     [Figure: optimal parameter selection on the development set]

  32. Evaluation on Community-based Method (2/3)

     Valence                 ANEW (English)                       CVAW (Chinese)
                             RMSE   MAE    r      Inverse pol.    RMSE   MAE    r      Inverse pol.
     Weighted graph model    1.152  0.807  0.805  29.30%          1.148  0.884  0.786  28.66%
     with k-NN (k=10)        1.025  0.756  0.824  21.56%          1.053  0.875  0.818  20.86%
     with ε-NN (ε=0.4)       1.018  0.750  0.822  20.83%          0.971  0.826  0.832  20.46%
     with mincuts            0.967  0.735  0.828  20.35%          0.977  0.828  0.834  19.86%
     with max-flow mincuts   0.915  0.728  0.835  19.78%          1.004  0.859  0.822  20.31%
     with community          0.812  0.645  0.915  10.95%          0.890  0.770  0.897  10.33%

  33. Evaluation on Community-based Method (3/3)

     Arousal                 ANEW (English)                       CVAW (Chinese)
                             RMSE   MAE    r      Inverse pol.    RMSE   MAE    r      Inverse pol.
     Weighted graph model    1.223  0.909  0.544  39.96%          1.158  0.902  0.542  35.18%
     with k-NN (k=10)        1.044  0.786  0.560  28.66%          1.060  0.840  0.549  30.81%
     with ε-NN (ε=0.4)       0.948  0.745  0.571  28.49%          1.008  0.819  0.554  29.72%
     with mincuts            0.934  0.739  0.576  28.23%          0.945  0.739  0.592  28.95%
     with max-flow mincuts   0.923  0.716  0.583  27.96%          0.935  0.726  0.596  28.83%
     with community          0.791  0.628  0.685  21.93%          0.806  0.613  0.694  20.79%

  34. Conclusions
     • This study presents a community-based weighted graph model for word-level valence-arousal prediction.
     • The proposed method selects useful neighbors for each unseen word by considering the overall associations between words in the graph.
     • Experiments on both English and Chinese affective lexicons show that the weighted graph model yields better performance than previously proposed methods.
     • Community-based neighbor selection further improves the performance of the weighted graph model.

  35. Future Work
     • Sentence-level valence-arousal prediction.
     • Sentence embeddings based on word vectors. Issue: two sentences may contain semantically similar words yet have different VA ratings or polarities. Example: "sad" and "happy" may have similar word vectors, so two sentences containing these words may have similar sentence vectors.
     • Sentence embeddings based on paragraph vectors.

  36. References
     V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, "Fast unfolding of communities in large networks," Journal of Statistical Mechanics: Theory and Experiment, no. 10, p. P10008, 2008.
     P. Ekman, "An argument for basic emotions," Cognition and Emotion, vol. 6, no. 3-4, pp. 169-200, 1992.
     A. Esuli and F. Sebastiani, "PageRanking WordNet synsets: An application to opinion mining," in Proc. ACL, 2007, pp. 424-431.
     D. Gökçay, E. İşbilir, and G. Yıldırım, "Predicting the sentiment in sentences based on words: An exploratory study on ANEW and ANET," in Proc. CogInfoCom, 2012, pp. 715-718.
     N. Malandrakis, A. Potamianos, E. Iosif, and S. Narayanan, "Distributional semantic models for affective text analysis," IEEE Trans. Audio, Speech, and Language Processing, vol. 21, no. 11, pp. 2379-2392, 2013.
     A. Mehrabian, "Pleasure-arousal-dominance: A general framework for describing and measuring individual differences in temperament," Current Psychology, vol. 15, no. 4, pp. 505-525, 1996.
     T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," in Proc. ICLR, 2013a.
     T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Proc. NIPS, 2013b, pp. 3111-3119.

  37. References (continued)
     M. E. J. Newman, "Modularity and community structure in networks," Proc. National Academy of Sciences, vol. 103, no. 23, pp. 8577-8582, 2006.
     G. Paltoglou, M. Theunis, A. Kappas, and M. Thelwall, "Predicting emotional responses to long informal text," IEEE Trans. Affective Computing, vol. 4, no. 1, pp. 106-115, 2013.
     D. Rao and D. Ravichandran, "Semi-supervised polarity lexicon induction," in Proc. EACL, 2009, pp. 675-682.
     J. A. Russell, "A circumplex model of affect," Journal of Personality and Social Psychology, vol. 39, no. 6, pp. 1161-1178, 1980.
     J. Wang, L. C. Yu, K. R. Lai, and X. Zhang, "Predicting valence-arousal ratings of words using a weighted graph method," in Proc. ACII, 2015, pp. 415-420.
     W. L. Wei, C. H. Wu, and J. C. Lin, "A regression approach to affective rating of Chinese words from ANEW," in Proc. ACII, 2011, pp. 121-131.
     L. C. Yu, J. Wang, K. R. Lai, and X. Zhang, "Predicting valence-arousal ratings of words using a weighted graph method," in Proc. ACL, 2015, pp. 788-793.
     L. C. Yu et al., "Building Chinese affective resources in valence-arousal dimensions," submitted to LREC 2016.
