Learning extraction patterns for subjective expressions

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing, pp.105-112, (2003)


Abstract

This paper presents a bootstrapping process that learns linguistically rich extraction patterns for subjective (opinionated) expressions. High-precision classifiers label unannotated data to automatically create a large training set, which is then given to an extraction pattern learning algorithm. The learned patterns are then used to ide...

Introduction
  • Many natural language processing applications could benefit from being able to distinguish between factual and subjective information.
  • Question answering systems should distinguish between factual and speculative answers.
  • Document-level classification can distinguish between “subjective texts”, such as editorials and reviews, and “objective texts,” such as newspaper articles.
  • Editorial articles frequently contain factual information to back up the arguments being made, and movie reviews often mention the actors and plot of a movie as well as the theatres where it’s currently playing.
  • Newspaper articles are generally considered to be relatively objective documents, but in a recent study (Wiebe et al., 2001) 44% of sentences in a news collection were found to be subjective.
Highlights
  • We have developed a bootstrapping process for subjectivity classification that explores three ideas: (1) high-precision classifiers can be used to automatically identify subjective and objective sentences from unannotated texts, (2) this data can be used as a training set to automatically learn extraction patterns associated with subjectivity, and (3) the learned patterns can be used to grow the training set, allowing this entire process to be bootstrapped
  • The scheme was inspired by work in linguistics and literary theory on subjectivity, which focuses on how opinions, emotions, etc. are expressed linguistically in context (Banfield, 1982)
  • We evaluated whether the learned patterns can improve the coverage of the high-precision subjectivity classifier (HP-Subj), to complete the bootstrapping loop depicted in the top-most dashed line of Figure 1
  • We showed that an extraction pattern learning technique can learn subjective expressions that are linguistically richer than individual words or fixed phrases
  • We augmented our original high-precision subjective classifier with these newly learned extraction patterns. This bootstrapping process resulted in substantially higher recall with a minimal loss in precision
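The three-step loop described in the highlights can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the seed lexicon, the thresholds (`min_subj`, `min_freq`, `min_prob`), and every function name are hypothetical, and plain word bigrams stand in for the linguistically richer extraction patterns that the paper learns.

```python
import re
from collections import Counter

# Hypothetical seed lexicon; the paper's actual clue lists are not given here.
SEED_CLUES = {"hate", "terrible", "outrageous", "love", "wonderful"}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def bigrams(sentence):
    """Adjacent word pairs; a simple stand-in for real extraction patterns."""
    words = tokenize(sentence)
    return {" ".join(pair) for pair in zip(words, words[1:])}


def count_clues(sentence, clue_words, patterns):
    """Count distinct seed words plus learned patterns found in the sentence."""
    return len(set(tokenize(sentence)) & clue_words) + len(bigrams(sentence) & patterns)


def label(sentences, clue_words, patterns, min_subj=2):
    """High-precision labeling (assumed rule): 'subj' with at least min_subj
    clues, 'obj' with none, and no label for anything in between."""
    labeled = {}
    for s in sentences:
        n = count_clues(s, clue_words, patterns)
        if n >= min_subj:
            labeled[s] = "subj"
        elif n == 0:
            labeled[s] = "obj"
    return labeled


def learn_patterns(labeled, min_freq=2, min_prob=0.9):
    """Keep patterns that are frequent and occur almost only in 'subj' sentences."""
    total, subj = Counter(), Counter()
    for sentence, tag in labeled.items():
        for p in bigrams(sentence):
            total[p] += 1
            if tag == "subj":
                subj[p] += 1
    return {p for p in total if total[p] >= min_freq and subj[p] / total[p] >= min_prob}


def bootstrap(sentences, rounds=2):
    """Alternate labeling and pattern learning, feeding patterns back as clues."""
    patterns = set()
    for _ in range(rounds):
        labeled = label(sentences, SEED_CLUES, patterns)
        patterns |= learn_patterns(labeled)
    return label(sentences, SEED_CLUES, patterns), patterns


CORPUS = [
    "I hate this terrible plan and it is a shame",
    "They love the wonderful idea but it is a shame",
    "It is a shame that the terrible talks failed",
    "The meeting was held on Monday",
]
```

Calling `bootstrap(CORPUS)` returns the final labels and the learned pattern set. In this toy corpus the third sentence carries only one seed clue, so the seed-only classifier leaves it unlabeled; once patterns learned from the first two sentences are added, it crosses the threshold. The bigram stand-in also picks up uninformative patterns such as "it is", which is one reason richer syntactic patterns are preferable.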
Results
  • Subjectivity data: the text collection that the authors used consists of English-language versions of foreign news documents from FBIS, the U.S. Foreign Broadcast Information Service.
  • The authors' system takes unannotated data as input, but the authors needed annotated data to evaluate its performance.
  • The scheme was inspired by work in linguistics and literary theory on subjectivity, which focuses on how opinions, emotions, etc. are expressed linguistically in context (Banfield, 1982).
  • The goal is to identify and characterize expressions of private states in a sentence.
  • Private state is a general covering term for opinions, evaluations, emotions, and speculations (Quirk et al., 1985).
  • In sentence (1) the writer is expressing a negative evaluation
Conclusion
  • This research explored several avenues for improving the state-of-the-art in subjectivity analysis.
  • The authors augmented the original high-precision subjective classifier with these newly learned extraction patterns.
  • This bootstrapping process resulted in substantially higher recall with a minimal loss in precision.
  • The authors plan to experiment with different configurations of these classifiers, add new subjective language learners in the bootstrapping process, and address the problem of how to identify new objective sentences during bootstrapping.
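The recall and precision trade-off mentioned in the conclusion can be made concrete with a small helper. This is a generic sketch of the standard definitions, not the authors' evaluation code; the function name and the convention of using `None` for sentences the classifier leaves unlabeled are assumptions, and the labels in the usage note are invented toy values rather than results from the paper.

```python
def precision_recall(gold, predicted, target="subj"):
    """Precision and recall for one class.

    `predicted` may contain None for sentences the classifier left unlabeled;
    those count against recall but not against precision.
    """
    true_pos = sum(g == target and p == target for g, p in zip(gold, predicted))
    predicted_pos = sum(p == target for p in predicted)
    gold_pos = sum(g == target for g in gold)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    return precision, recall
```

For example, with gold labels `["subj", "subj", "subj", "obj"]`, a cautious classifier that outputs `["subj", None, None, None]` is fully precise but recovers only one of the three subjective sentences. A classifier with broader coverage that outputs `["subj", "subj", None, "subj"]` recovers two of them at the cost of one false positive.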
Tables
  • Table 1: Bootstrapping the Learned Patterns into the High-Precision Sentence Classifier
  • Table 2: Examples of Learned Patterns Used by HP-Subj and Sample Matching Sentences
Funding
  • This work was supported by the National Science Foundation under grants IIS-0208798, IIS-0208985, and IRI-9704240.
Study subjects and analysis
A private state may have low, medium, high or extreme strength. To allow us to measure interannotator agreement, three annotators (who are not authors of this paper) independently annotated the same 13 documents with a total of 210 sentences. We begin with a strict measure of agreement at the sentence level by first considering whether the annotator marked any private-state expression, of any strength, anywhere in the sentence.
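A sentence-level agreement check of this kind can be sketched as below. The excerpt does not say which agreement statistic was used, so pairwise Cohen's kappa is an assumption here, as are the function names and the encoding (1 if the annotator marked any private-state expression in the sentence, 0 otherwise).

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of binary (0/1) labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    # Chance agreement: both say 1, or both say 0, independently.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both annotators used the same single label throughout
    return (observed - expected) / (1 - expected)


def pairwise_kappas(annotations):
    """Map each annotator pair to its kappa; `annotations` is name -> labels."""
    return {
        (i, j): cohen_kappa(annotations[i], annotations[j])
        for i, j in combinations(sorted(annotations), 2)
    }
```

With three annotators, `pairwise_kappas` returns three values, one per pair, which can be reported individually or averaged.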

References
  • C. Baker, C. Fillmore, and J. Lowe. 1998. The Berkeley FrameNet Project. In Proceedings of the COLING-ACL-98.
  • T. Ballmer and W. Brennenstuhl. 1981. Speech Act Classification: A Study in the Lexical Analysis of English Speech Activity Verbs. Springer-Verlag.
  • A. Banfield. 1982. Unspeakable Sentences. Routledge and Kegan Paul, Boston.