A holistic lexicon-based approach to opinion mining

WSDM, pp. 231-240 (2008)

Abstract

One of the important types of information on the Web is the opinions expressed in user-generated content, e.g., customer reviews of products, forum posts, and blogs. In this paper, we focus on customer reviews of products. In particular, we study the problem of determining the semantic orientations (positive, negative or neutral) of o...

Introduction
  • With the rapid expansion of e-commerce over the past 10 years, more and more products are sold on the Web, and more and more people are buying products online.
  • With more and more users becoming comfortable with the Web, an increasing number of people are writing reviews.
  • Many reviews are long, which makes it hard for a potential customer to read them to make an informed decision on whether to purchase the product.
  • The large number of reviews makes it hard for product manufacturers or businesses to keep track of customer opinions and sentiments on their products and services.
  • It is therefore highly desirable to produce a summary of reviews [13, 21].
Highlights
  • We propose a new method to aggregate the orientations of multiple opinion words in a sentence by considering the distance between each opinion word and the product feature
  • This paper proposed an effective method for identifying semantic orientations of opinions expressed by reviewers on product features
  • It is able to deal with two major problems with the existing methods, (1) opinion words whose semantic orientations are context dependent, and (2) aggregating multiple opinion words in the same sentence
  • For (1), a holistic approach is proposed that can accurately infer the semantic orientation of an opinion word based on the review context
  • Experimental results show that the proposed technique performs markedly better than the state-of-the-art existing methods
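The distance-based aggregation mentioned in the highlights can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary only says that distance is considered, so the inverse-distance weighting used here, the toy `LEXICON`, and the names `feature_score` and `orientation_label` are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical toy lexicon: +1 for positive opinion words, -1 for negative ones.
LEXICON: Dict[str, int] = {"great": 1, "amazing": 1, "good": 1, "short": -1, "poor": -1}


def feature_score(tokens: List[str], feature_index: int) -> float:
    """Sum opinion-word orientations, each divided by its token distance to the feature.

    Assumed weighting: an opinion word close to the feature counts more
    than one far away (orientation / distance).
    """
    score = 0.0
    for i, token in enumerate(tokens):
        orientation = LEXICON.get(token.lower(), 0)
        if orientation == 0 or i == feature_index:
            continue
        score += orientation / abs(i - feature_index)
    return score


def orientation_label(score: float) -> str:
    """Map a numeric score to positive, negative, or neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


tokens = "the battery life is short but the screen is great".split()
print(orientation_label(feature_score(tokens, tokens.index("battery"))))  # negative
print(orientation_label(feature_score(tokens, tokens.index("screen"))))   # positive
```

Both opinion words contribute to both features, but the nearer word dominates, which is how one sentence can yield different orientations for different features.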
Methods
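  • Step 2: if the word is a negation word, its orientation is obtained by applying the negation rules, and the words in the sentence used by those rules are marked.
  • Step 5: otherwise, if the word is a TOO word, its orientation is obtained by applying the TOO rules, and the words in the sentence used by those rules are marked.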
  • For a sentence containing “but”, the orientation is first sought in the clause after “but”. If the authors cannot get an orientation there, they look at the clause before “but” and negate its orientation.
  • Non-but clauses containing but-like words: Similar to negations and opinion words, a sentence containing “but” does not necessarily change the opinion orientation.
  • For example, “but” in “I not only like the picture quality of this camera, but also its size” does not change the opinion after “but” because of the phrase “but also”
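The negation and “but” handling described above can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's rule set: the toy `LEXICON`, the `NEGATIONS` list, and the function names are hypothetical, the negation rule is reduced to flipping an opinion word directly preceded by a negation word, and the clause after “but” is assumed to be the place where an orientation is sought first.

```python
from typing import Dict, List

# Hypothetical toy lexicon and negation list, for illustration only.
LEXICON: Dict[str, int] = {"great": 1, "like": 1, "good": 1, "bad": -1, "poor": -1}
NEGATIONS = {"not", "no", "never"}


def clause_orientation(tokens: List[str]) -> int:
    """Return +1, -1, or 0 for a clause.

    Simplified negation rule (an assumption): an opinion word directly
    preceded by a negation word has its orientation flipped.
    """
    total = 0
    for i, token in enumerate(tokens):
        orientation = LEXICON.get(token, 0)
        if orientation and i > 0 and tokens[i - 1] in NEGATIONS:
            orientation = -orientation
        total += orientation
    return (total > 0) - (total < 0)


def but_sentence_orientation(sentence: str) -> int:
    """Orientation for the part of a sentence after "but".

    Use the clause after "but" if it yields an orientation; otherwise
    negate the orientation of the clause before "but".
    """
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    if "but" not in tokens:
        return clause_orientation(tokens)
    split = tokens.index("but")
    before, after = tokens[:split], tokens[split + 1:]
    orientation = clause_orientation(after)
    if orientation == 0:
        orientation = -clause_orientation(before)
    return orientation


print(but_sentence_orientation("The picture quality is great, but the battery is another story."))  # -1
print(but_sentence_orientation("The zoom is not good but the screen is great"))  # 1
```

In the first sentence the clause after “but” contains no opinion word, so the positive clause before it is negated. A fuller version would also skip phrases in which “but” does not signal a contrast.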
Conclusion
  • This paper proposed an effective method for identifying semantic orientations of opinions expressed by reviewers on product features.
  • It is able to deal with two major problems with the existing methods, (1) opinion words whose semantic orientations are context dependent, and (2) aggregating multiple opinion words in the same sentence.
  • Previous research only considers explicit opinions expressed by adjectives and adverbs
  • In this work, both explicit and implicit opinions are considered.
  • The authors' method handles implicit features represented by feature indicators.
  • These make the proposed technique more complete.
  • Experimental results show that the proposed technique performs markedly better than the state-of-the-art existing methods
Tables
  • Table 1: Characteristics of the review data
  • Table 2: Results of opinion sentence extraction and sentence orientation prediction
  • Table 3: Comparison of FBS, OPINE and Opinion Observer based on the benchmark data set in [13], which consists of all reviews of the first 5 products in Table 2
Related work
  • Opinion analysis has been studied by many researchers in recent years. Two main research directions are sentiment classification and feature-based opinion mining. Sentiment classification investigates ways to classify each review document as positive, negative, or neutral. Representative works on classification at the document level include [4, 5, 9, 12, 26, 27, 29, 32]. These works are different from ours as we are interested in opinions expressed on each product feature rather than the whole review.

    Sentence-level subjectivity classification is studied in [10], which determines whether a sentence is subjective (though it may not express a positive or negative opinion) or factual. Sentence-level sentiment or opinion classification is studied in [10, 13, 17, 23, 28, 33, etc.]. Our work differs from sentence-level analysis in that we identify opinions on each feature. A review sentence can contain multiple features, and the orientations of the opinions expressed on those features can differ, e.g., “the voice quality of this phone is great and so is the reception, but the battery life is short.” Here “voice quality”, “reception” and “battery life” are features. The opinions on “voice quality” and “reception” are positive, and the opinion on “battery life” is negative. Other related works at both the document and sentence levels include those in [2, 10, 15, 16, 36].
References
  • [1]. A. Andreevskaia and S. Bergler. Mining WordNet for Fuzzy Sentiment: Sentiment Tag Extraction from WordNet Glosses. EACL’06, pp. 209–216, 2006.
  • [2]. P. Beineke, T. Hastie, C. Manning, and S. Vaithyanathan. An Exploration of Sentiment Summarization. In Proc. of the AAAI Spring Symposium on Exploring Attitude and Affect in Text: Theories and Applications, 2003.
  • [3]. G. Carenini, R. Ng, and A. Pauls. Interactive Multimedia Summaries of Evaluative Text. IUI’06, 2006.
  • [4]. S. Das and M. Chen. Yahoo! for Amazon: Extracting market sentiment from stock message boards. APFA’01, 2001.
  • [5]. K. Dave, S. Lawrence, and D. Pennock. Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews. WWW’03, 2003.
  • [7]. A. Esuli and F. Sebastiani. Determining Term Subjectivity and Term Orientation for Opinion Mining. EACL-06, 2006.
  • [8]. C. Fellbaum. WordNet: an Electronic Lexical Database. MIT Press, 1998.
  • [10]. V. Hatzivassiloglou and J. Wiebe. Effects of adjective orientation and gradability on sentence subjectivity. COLING’00, 2000.
  • [11]. V. Hatzivassiloglou and K. McKeown. Predicting the Semantic Orientation of Adjectives. ACL-EACL’97, 1997.
  • [12]. M. Hearst. Direction-based Text Interpretation as an Information Access Refinement. In P. Jacobs, editor, Text-Based Intelligent Systems. Lawrence Erlbaum Associates, 1992.
  • [13]. M. Hu and B. Liu. Mining and summarizing customer reviews. KDD’04, 2004.
  • [14]. N. Jindal and B. Liu. Mining Comparative Sentences and Relations. AAAI’06, 2006.
  • [15]. N. Kaji and M. Kitsuregawa. Automatic Construction of Polarity-Tagged Corpus from HTML Documents. COLING/ACL’06, 2006.
  • [16]. H. Kanayama and T. Nasukawa. Fully Automatic Lexicon Expansion for Domain-Oriented Sentiment Analysis. EMNLP’06, 2006.
  • [17]. S. Kim and E. Hovy. Determining the Sentiment of Opinions. COLING’04, 2004.
  • [18]. S. Kim and E. Hovy. Automatic Identification of Pro and Con Reasons in Online Reviews. COLING/ACL’06, 2006.
  • [19]. N. Kobayashi, R. Iida, K. Inui, and Y. Matsumoto. Opinion Mining on the Web by Extracting Subject-Attribute-Value Relations. In Proc. of AAAI-CAAW’06, 2006.
  • [20]. L.-W. Ku, Y.-T. Liang, and H.-H. Chen. Opinion Extraction, Summarization and Tracking in News and Blog Corpora. In Proc. of AAAI-CAAW’06, 2006.
  • [21]. B. Liu, M. Hu, and J. Cheng. Opinion Observer: Analyzing and comparing opinions on the Web. WWW-05, 2005.
  • [22]. S. Morinaga, K. Yamanishi, K. Tateishi, and T. Fukushima. Mining Product Reputations on the Web. KDD’02, 2002.
  • [24]. V. Ng, S. Dasgupta, and S. M. Niaz Arifin. Examining the Role of Linguistic Knowledge Sources in the Automatic Identification and Classification of Reviews. ACL’06, 2006.
  • [25]. NLProcessor – Text Analysis Toolkit. 2000. http://www.infogistics.com/textanalysis.html.
  • [26]. B. Pang and L. Lee. Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales. ACL’05, 2005.
  • [27]. B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment Classification Using Machine Learning Techniques. EMNLP’2002, 2002.
  • [28]. A.-M. Popescu and O. Etzioni. Extracting Product Features and Opinions from Reviews. EMNLP-05, 2005.
  • [29]. E. Riloff and J. Wiebe. Learning extraction patterns for subjective expressions. EMNLP’2003, 2003.
  • [30]. V. Stoyanov and C. Cardie. Toward opinion summarization: Linking the sources. In Proc. of the Workshop on Sentiment and Subjectivity in Text, 2006.
  • [31]. R. Tong. An Operational System for Detecting and Tracking Opinions in on-line discussion. SIGIR 2001 Workshop on Operational Text Classification, 2001.
  • [32]. P. Turney. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. ACL’02, 2002.
  • [33]. T. Wilson, J. Wiebe, and R. Hwa. Just how mad are you? Finding strong and weak opinion clauses. AAAI’04, 2004.
  • [34]. J. Wiebe and R. Mihalcea. Word Sense and Subjectivity. ACL’06, 2006.
  • [35]. J. Wiebe and E. Riloff. Creating Subjective and Objective sentence classifiers from unannotated texts. CICLing, 2005.
  • [37]. L. Zhuang, F. Jing, X.-Y. Zhu, and L. Zhang. Movie Review Mining and Summarization. CIKM-06, 2006.