
Sentiment analyzer: extracting sentiments about a given topic using natural language processing techniques

ICDM, pp.427-434, (2003)


Abstract

We present sentiment analyzer (SA) that extracts sentiment (or opinion) about a subject from online text documents. Instead of classifying the sentiment of an entire document about a subject, SA detects all references to the given subject, and determines sentiment in each of the references using natural language processing (NLP) technique...

Introduction
  • A huge amount of information is available in online documents such as web pages, newsgroup postings, and on-line news databases.
  • There has been extensive research on automatic text analysis for sentiment, such as sentiment classifiers [13, 6, 16, 2, 19], affect analysis [17, 21], automatic survey analysis [8, 16], opinion extraction [12], or recommender systems [18]
  • These methods typically try to extract the overall sentiment revealed in a document, either positive or negative, or somewhere in between.
  • SA detects, for each occurrence of a topic spot, the sentiment about the topic
  • It produces the following output for the above sample sentences, provided that Sony PDA, NR70, and T series CLIEs are the specified topics: Sony PDA - positive; NR70 - positive
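The per-occurrence behavior described above can be sketched as a toy pipeline. This is an illustration only, not the paper's actual NLP machinery: the word lists and the `topic_sentiments` helper are invented for this example, and real SA uses POS tagging and sentiment patterns rather than keyword matching.

```python
import re

# Invented mini-lexicon for illustration only.
POSITIVE = {"love", "impressive", "excellent", "good"}
NEGATIVE = {"hate", "disappointing", "poor", "bad"}

def topic_sentiments(sentences, topics):
    """Return one (topic, polarity) pair per topic occurrence,
    instead of one label for the whole document."""
    results = []
    for sentence in sentences:
        # Tokenize once per sentence; punctuation is stripped.
        words = set(re.findall(r"[a-z0-9']+", sentence.lower()))
        for topic in topics:
            if topic.lower() in sentence.lower():
                if words & POSITIVE:
                    results.append((topic, "positive"))
                elif words & NEGATIVE:
                    results.append((topic, "negative"))
                else:
                    results.append((topic, "neutral"))
    return results

print(topic_sentiments(
    ["I love the Sony PDA.", "The NR70 is impressive."],
    ["Sony PDA", "NR70"]))
# [('Sony PDA', 'positive'), ('NR70', 'positive')]
```

The point of the sketch is the output shape: a sentiment per topic occurrence, which is what distinguishes SA from whole-document classifiers.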
Highlights
  • Today, a huge amount of information is available in online documents such as web pages, newsgroup postings, and on-line news databases
  • Base Noun Phrases (BNP), definite Base Noun Phrases (dBNP), and Beginning Definite Base Noun Phrases (bBNP) were extracted from the review pages, and the Mixture Model and Likelihood Test were applied to the respective BNPs
  • The precision scores are summarized in Table 3. bBNP-L performed impressively well
  • Its performance continued improving with increasing level of restrictions in the candidate feature terms, perhaps because, with further restriction, the selected candidate terms are more probable feature terms
  • The results on review articles are comparable with the state of the art sentiment classifiers, and the results on general web pages are better than those of the state of the art algorithms by a wide margin (38% vs. 91 ∼ 93%)
  • Sentiment Analyzer (SA) achieves high precision (86% ∼ 91%) and even higher accuracy (90% ∼ 93%) on general Web documents and news articles
  • More advanced sentiment patterns currently require a fair amount of manual validation
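One way to read the Likelihood Test step above: a candidate term qualifies as a feature term when its frequency in the review corpus is improbably high relative to a general corpus. Below is a hedged sketch in the spirit of the Dunning log-likelihood ratio the paper cites [4]; the counts in the usage lines are invented, and this is not the paper's exact statistic.

```python
import math

def llr(k1, n1, k2, n2):
    """Dunning-style log-likelihood ratio comparing a term's count k1
    out of n1 words in the review corpus against k2 out of n2 words
    in a general corpus. Large values mean the term is much more
    characteristic of the reviews."""
    def ll(k, n, p):
        # Binomial log-likelihood, with the 0*log(0) = 0 convention.
        if p <= 0.0 or p >= 1.0:
            return 0.0
        return k * math.log(p) + (n - k) * math.log(1.0 - p)
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)
    return 2.0 * (ll(k1, n1, p1) + ll(k2, n2, p2)
                  - ll(k1, n1, p) - ll(k2, n2, p))

# A term far more frequent in reviews than in general text scores high;
# a term with equal relative frequency in both corpora scores ~0.
print(llr(50, 1000, 10, 100000))   # large
print(llr(1, 1000, 100, 100000))   # ~0
```

Restricting candidates to bBNPs before scoring, as the highlights describe, shrinks the candidate set to terms that are more probably genuine feature terms.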
Tables
  • Table 1: Counts for a BNP [9]
  • Table 2: The product review datasets
  • Table 3: Precision of feature term extraction algorithms
  • Table 4: Top 20 feature terms extracted by bBNP-L in the order of their rank (terms at the beginning of the list are more likely to be feature terms). The Ratnaparkhi POS tagger [14] was used to extract BNPs. α = 0.3 was used for the computation of the Mixture Model; other values of α did not produce better results than those reported here. The extracted feature terms were manually examined by two human subjects, and only the terms that both subjects labeled as feature terms were counted in the computation of precision
  • Table 5: Performance comparison of sentiment extraction algorithms on the product review datasets
  • Table 6: The performance of SA and ReviewSeer on general web documents and news articles
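The α = 0.3 in the Table 4 note is the mixing weight of a two-component word mixture. A minimal sketch of what such a mixture computes, under assumed illustrative probabilities and an invented function name (not the paper's exact estimator):

```python
import math

def mixture_log_likelihood(words, p_topic, p_general, alpha=0.3):
    """Log-likelihood of a word sequence under the mixture
    alpha * p_topic(w) + (1 - alpha) * p_general(w).
    Unseen words get a tiny floor probability to avoid log(0)."""
    floor = 1e-9
    total = 0.0
    for w in words:
        pw = (alpha * p_topic.get(w, floor)
              + (1.0 - alpha) * p_general.get(w, floor))
        total += math.log(pw)
    return total
```

A word like "battery" that the topic model explains well raises the likelihood far more than a word only the general-language model accounts for; candidate feature terms can be ranked by that gap. Per the Table 4 note, other values of α did not beat 0.3 in the paper's experiments.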
Funding
  • SA achieves high precision (86% ∼ 91%) and even higher accuracy (90% ∼ 93%) on general Web documents and news articles
  • By contrast, ReviewSeer suffered on sentences from general web documents: its accuracy is only 38% (down from 88.4%). The accuracy is computed from the figures in Table 14 of [3]: we averaged the accuracies of the three equal-size groups of a test set (21%, 42%, and 50%, respectively). The accuracy improved to 68% after removing difficult cases and using only clearly positive or negative sentences about the given subject
  • The challenge here is that these difficult cases are the majority of the sentences that any sentiment classifier has to deal with: 60% (356 out of 600) of the test cases for the ReviewSeer experiment and even more (as high as over 90% on some domain) in our experiments
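The 38% figure quoted above is simply the unweighted mean of the three group accuracies taken from Table 14 of [3], as a one-line check confirms:

```python
# Average of the three equal-size test-group accuracies (21%, 42%, 50%)
# reported in Table 14 of [3]; the text rounds the mean to 38%.
group_accuracies = [21, 42, 50]
mean = sum(group_accuracies) / len(group_accuracies)
print(round(mean))  # 38
```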
References
  • [1] M. Berland and E. Charniak. Finding parts in very large corpora. In Proc. of the 37th ACL Conf., pages 57–64, 1999.
  • [2] S. Das and M. Chen. Yahoo! for Amazon: Extracting market sentiment from stock message boards. In Proc. of the 8th APFA, 2001.
  • [3] K. Dave, S. Lawrence, and D. M. Pennock. Mining the peanut gallery: Opinion extraction and semantic classification of product reviews. In Proc. of the 12th Int. WWW Conf., 2003.
  • [4] T. E. Dunning. Accurate methods for the statistics of surprise and coincidence. Computational Linguistics, 19(1), 1993.
  • [5] V. Hatzivassiloglou and K. R. McKeown. Predicting the semantic orientation of adjectives. In Proc. of the 35th ACL Conf., pages 174–181, 1997.
  • [6] M. Hearst. Direction-based text interpretation as an information access refinement. Text-Based Intelligent Systems, 1992.
  • [7] B. Katz. From sentence processing to information access on the World Wide Web. In Proc. of AAAI Spring Symp. on NLP, 1997.
  • [8] H. Li and K. Yamanishi. Mining from open answers in questionnaire data. In Proc. of the 7th ACM SIGKDD Conf., 2001.
  • [9] C. Manning and H. Schütze. Foundations of Statistical Natural Language Processing. MIT Press, 1999.
  • [10] M. P. Marcus, B. Santorini, and M. A. Marcinkiewicz. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19, 1993.
  • [11] Int. J. of Lexicography, 2(4):245–264, 1990.
  • [12] S. Morinaga, K. Yamanishi, K. Tateishi, and T. Fukushima. ACM SIGKDD Conf., 2002.
  • [13] B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment classification using machine learning techniques. In Proc. of the 2002 ACL EMNLP Conf., pages 79–86, 2002.
  • [15] L. Rovinelli and C. Whissell. Emotion and style in 30-second television advertisements targeted at men, women, boys, and girls. Perceptual and Motor Skills, 86:1048–1050, 1998.
  • [16] W. Sack. On the computation of point of view. In Proc. of the 12th AAAI Conf., 1994.
  • [17] P. Subasic and A. Huettner. Affect analysis of text using fuzzy semantic typing. IEEE Trans. on Fuzzy Systems, Special Issue, Aug. 2001.
  • [18] L. Terveen, W. Hill, B. Amento, D. McDonald, and J. Creter. PHOAKS: A system for sharing recommendations. CACM, 40(3):59–62, 1997.
  • [19] Operational Text Classification, 2001.
  • [20] P. D. Turney. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proc. of the 40th ACL Conf., pages 417–424, 2002.
  • [22] J. M. Wiebe. Learning subjective adjectives from corpora. In Proc. of the 17th AAAI Conf., 2000.
  • [23] C. Zhai and J. Lafferty. Model-based feedback in the language modeling approach to information retrieval. In Proc. of the 10th Information and Knowledge Management Conf., 2001.
  • [24] Y. Zhang, W. Xu, and J. Callan. Exact maximum likelihood estimation for word mixtures. In ICML Workshop on Text