Sentiment analyzer: extracting sentiments about a given topic using natural language processing techniques
ICDM, pp. 427–434, 2003
Abstract
We present Sentiment Analyzer (SA) that extracts sentiment (or opinion) about a subject from online text documents. Instead of classifying the sentiment of an entire document about a subject, SA detects all references to the given subject, and determines sentiment in each of the references using natural language processing (NLP) techniques.
Introduction
- A huge amount of information is available in online documents such as web pages, newsgroup postings, and on-line news databases.
- There has been extensive research on automatic text analysis for sentiment, such as sentiment classifiers [13, 6, 16, 2, 19], affect analysis [17, 21], automatic survey analysis [8, 16], opinion extraction [12], and recommender systems [18]
- These methods typically try to extract the overall sentiment revealed in a document, either positive or negative, or somewhere in between.
- For each occurrence (spot) of a given topic, SA determines the sentiment expressed about that topic
- For the paper's sample sentences, with Sony PDA, NR70, and T series CLIEs as the specified topics, it produces per-topic output such as: Sony PDA - positive; NR70 - positive (a minimal sketch of this per-mention output follows this list)
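To make the per-mention output above concrete, here is a minimal sketch of topic spotting with local sentiment assignment. It is not the paper's algorithm (SA uses a sentiment lexicon, a sentiment pattern database, and relationship analysis); the tiny word lists, the context window, and the sample sentence are illustrative assumptions only.

```python
import re

# Toy polarity word lists for illustration; SA instead uses a full sentiment
# lexicon and a sentiment pattern database.
POSITIVE = {"excellent", "impressive", "good", "love", "great"}
NEGATIVE = {"poor", "disappointing", "bad", "hate", "weak"}

def spot_sentiments(text, topics, window=6):
    """Return one (topic, sentiment) pair for every mention of a topic.

    For each occurrence of a topic term, the `window` words on either side
    are checked against the toy polarity lists and the majority wins.
    """
    tokens = re.findall(r"\w+", text.lower())
    results = []
    for topic in topics:
        topic_tokens = re.findall(r"\w+", topic.lower())
        n = len(topic_tokens)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == topic_tokens:
                context = tokens[max(0, i - window):i] + tokens[i + n:i + n + window]
                score = sum(w in POSITIVE for w in context) - sum(w in NEGATIVE for w in context)
                polarity = "positive" if score > 0 else "negative" if score < 0 else "neutral"
                results.append((topic, polarity))
    return results

if __name__ == "__main__":
    sample = ("I love the new Sony PDA. The NR70 has an excellent screen, "
              "but the battery life of the T series CLIEs is disappointing.")
    for topic, polarity in spot_sentiments(sample, ["Sony PDA", "NR70", "T series CLIEs"]):
        print(f"{topic} - {polarity}")
```

Unlike a document-level classifier, this produces one verdict per mention, which is the output format the bullets above describe.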
Highlights
- Today, a huge amount of information is available in online documents such as web pages, newsgroup postings, and on-line news databases
- Base Noun Phrases (BNP), Definite Base Noun Phrases (dBNP), and Beginning Definite Base Noun Phrases (bBNP) were extracted from the review pages, and the Mixture Model and Likelihood Test were applied to the respective BNPs (a sketch of the likelihood test follows this list)
- The precision scores are summarized in Table 3. bBNP-L performed impressively well
- Its performance continued to improve as the restrictions on the candidate feature terms increased, perhaps because, with further restriction, the selected candidates are more likely to be true feature terms
- The results on review articles are comparable with state-of-the-art sentiment classifiers, and the results on general web pages are better than those of state-of-the-art algorithms by a wide margin (38% vs. 91–93%)
- Sentiment Analyzer (SA) achieves high precision (86–91%) and even higher accuracy (90–93%) on general Web documents and news articles
- More advanced sentiment patterns currently require a fair amount of manual validation
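The "Likelihood Test" mentioned above is a likelihood-ratio test over term counts in the spirit of Dunning [4], comparing how often a candidate BNP occurs in topic documents versus a general corpus (Table 1 shows the count layout). The sketch below is a generic implementation of that statistic; the example counts and variable names are assumptions, not figures from the paper.

```python
import math

def log_likelihood_ratio(k_topic, n_topic, k_general, n_general):
    """Dunning-style log-likelihood ratio (G^2) for one candidate term.

    k_topic   : occurrences of the term among topic-document BNPs
    n_topic   : total BNP occurrences in the topic documents
    k_general : occurrences of the term in the general corpus
    n_general : total BNP occurrences in the general corpus
    Larger values indicate a stronger association with the topic.
    """
    def xlogx(x):
        return x * math.log(x) if x > 0 else 0.0

    # 2x2 contingency table: (term, other) x (topic, general).
    a, b = k_topic, n_topic - k_topic
    c, d = k_general, n_general - k_general
    n = a + b + c + d
    return 2 * (xlogx(a) + xlogx(b) + xlogx(c) + xlogx(d) + xlogx(n)
                - xlogx(a + b) - xlogx(c + d)
                - xlogx(a + c) - xlogx(b + d))

if __name__ == "__main__":
    # Hypothetical counts: "battery life" appears 40 times among 2,000 topic
    # BNPs but only 25 times among 50,000 general-corpus BNPs.
    print(round(log_likelihood_ratio(40, 2000, 25, 50000), 1))
```

Ranking candidates by such a statistic over the restricted bBNP candidate set is, roughly, what the bBNP-L algorithm reported in Table 3 denotes.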
Tables
- Table 1: Counts for a BNP [9]
- Table 2: The product review datasets
- Table 3: Precision of feature term extraction algorithms
- Table 4: Top 20 feature terms extracted by bBNP-L in the order of their rank. (Terms ranked higher in the list are more likely to be feature terms. We used the Ratnaparkhi POS tagger [14] to extract BNPs; a sketch of this extraction step follows this list. α = 0.3 was used for the computation of the Mixture Model; other values of α were tried but did not produce better results than those reported here. The extracted feature terms were manually examined by two human subjects, and only the terms that both subjects labeled as feature terms were counted when computing precision.)
- Table 5: Performance comparison of sentiment extraction algorithms on the product review datasets
- Table 6: The performance of SA and ReviewSeer on general web documents and news articles
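Table 4's note says BNPs were extracted with a POS tagger. As a rough illustration of collecting BNP, dBNP, and bBNP candidates, here is a sketch using NLTK's stock tokenizer and tagger in place of the Ratnaparkhi tagger; the simple adjective/noun patterns and the NLTK resources are assumptions, not the paper's exact grammar.

```python
# Requires: pip install nltk, plus downloading the tokenizer and tagger models
# (e.g. nltk.download("punkt"), nltk.download("averaged_perceptron_tagger");
# exact resource names vary by NLTK version).
import nltk

def extract_bnps(text):
    """Collect candidate feature terms from raw text.

    BNP  : a run of adjectives/nouns that ends in a noun (JJ NN, NN NN, ...)
    dBNP : a BNP immediately preceded by the definite article "the"
    bBNP : a dBNP whose "the" is the first word of its sentence
    """
    bnp, dbnp, bbnp = [], [], []
    for sent in nltk.sent_tokenize(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sent))
        i = 0
        while i < len(tagged):
            j = i
            while j < len(tagged) and tagged[j][1] in ("JJ", "NN", "NNS"):
                j += 1  # grow a run of adjectives and nouns
            if j > i and tagged[j - 1][1] in ("NN", "NNS"):  # must end in a noun
                phrase = " ".join(word.lower() for word, _ in tagged[i:j])
                bnp.append(phrase)
                if i > 0 and tagged[i - 1][0].lower() == "the":
                    dbnp.append(phrase)
                    if i == 1:  # the "the" opened the sentence
                        bbnp.append(phrase)
            i = max(j, i + 1)
    return bnp, dbnp, bbnp

if __name__ == "__main__":
    print(extract_bnps("The picture quality is excellent. I also like the zoom lens."))
```

Each narrowing step (BNP → dBNP → bBNP) shrinks the candidate set, which matches the observation above that precision improves as the restrictions tighten.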
Results
- SA achieves high precision (86–91%) and even higher accuracy (90–93%) on general Web documents and news articles
- In contrast, ReviewSeer struggled with sentences from general web documents: its accuracy was only 38% (down from 88.4%). (The accuracy is computed from the figures in Table 14 of [3] by averaging the accuracies of the three equal-size groups of the test set, 21%, 42%, and 50%; see the worked computation after this list.) The accuracy improved to 68% after removing difficult cases and using only clearly positive or negative sentences about the given subject
- The challenge is that these difficult cases make up the majority of the sentences any sentiment classifier has to deal with: 60% (356 out of 600) of the test cases in the ReviewSeer experiment, and even more (over 90% in some domains) in our experiments
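For reference, the 38% figure follows directly from averaging the three group accuracies taken from Table 14 of [3]:

$$\frac{21\% + 42\% + 50\%}{3} = \frac{113\%}{3} \approx 37.7\% \approx 38\%$$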
References
- [1] M. Berland and E. Charniak. Finding parts in very large corpora. In Proc. of the 37th ACL Conf., pages 57–64, 1999.
- [2] S. Das and M. Chen. Yahoo! for Amazon: Extracting market sentiment from stock message boards. In Proc. of the 8th APFA, 2001.
- [3] K. Dave, S. Lawrence, and D. M. Pennock. Mining the peanut gallery: Opinion extraction and semantic classification of product reviews. In Proc. of the 12th Int. WWW Conf., 2003.
- [4] T. E. Dunning. Accurate methods for the statistics of surprise and coincidence. Computational Linguistics, 19(1), 1993.
- [5] V. Hatzivassiloglou and K. R. McKeown. Predicting the semantic orientation of adjectives. In Proc. of the 35th ACL Conf., pages 174–181, 1997.
- [6] M. Hearst. Direction-based text interpretation as an information access refinement. Text-Based Intelligent Systems, 1992.
- [7] B. Katz. From sentence processing to information access on the world wide web. In Proc. of AAAI Spring Symp. on NLP, 1997.
- [8] H. Li and K. Yamanishi. Mining from open answers in questionnaire data. In Proc. of the 7th ACM SIGKDD Conf., 2001.
- [9] C. Manning and H. Schutze. Foundations of Statistical Natural Language Processing. MIT Press, 1999.
- [10] M. P. Marcus, B. Santorini, and M. A. Marcinkiewicz. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19, 1993.
- [11] Int. J. of Lexicography, 2(4):245–264, 1990.
- [12] S. Morinaga, K. Yamanishi, K. Tateishi, and T. Fukushima. Mining product reputations on the Web. In Proc. of the 8th ACM SIGKDD Conf., 2002.
- [13] B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment classification using machine learning techniques. In Proc. of the 2002 ACL EMNLP Conf., pages 79–86, 2002.
- [14] A. Ratnaparkhi. A maximum entropy model for part-of-speech tagging. In Proc. of the EMNLP Conf., 1996.
- [15] L. Rovinelli and C. Whissell. Emotion and style in 30-second television advertisements targeted at men, women, boys, and girls. Perceptual and Motor Skills, 86:1048–1050, 1998.
- [16] W. Sack. On the computation of point of view. In Proc. of the 12th AAAI Conf., 1994.
- [17] P. Subasic and A. Huettner. Affect analysis of text using fuzzy semantic typing. IEEE Trans. on Fuzzy Systems, Special Issue, Aug., 2001.
- [18] L. Terveen, W. Hill, B. Amento, D. McDonald, and J. Creter. PHOAKS: A system for sharing recommendations. CACM, 40(3):59–62, 1997.
- [19] R. M. Tong. An operational system for detecting and tracking opinions in on-line discussions. In Working Notes of the SIGIR Workshop on Operational Text Classification, 2001.
- [20] P. D. Turney. Thumbs up or thumbs down? semantic orientation applied to unsupervised classification of reviews. In Proc. of the 40th ACL Conf., pages 417–424, 2002.
- [22] J. M. Wiebe. Learning subjective adjectives from corpora. In Proc. of the 17th AAAI Conf., 2000.
- [23] C. Zhai and J. Lafferty. Model-based feedback in the language modeling approach to information retrieval. In Proc. of the 10th Information and Knowledge Management Conf., 2001.
- [24] Y. Zhang, W. Xu, and J. Callan. Exact maximum likelihood estimation for word mixtures. In ICML Workshop on Text Learning, 2002.