
The Effect of Negation on Sentiment Analysis and

Retrieval Effectiveness
Lifeng Jia Clement Yu Weiyi Meng
Department of Computer Science Department of Computer Science Department of Computer Science
University of Illinois at Chicago University of Illinois at Chicago SUNY at Binghamton
Chicago, IL 60607, USA Chicago, IL 60607, USA Binghamton, NY 13902, USA
[email protected] [email protected] [email protected]

ABSTRACT
We investigate the problem of determining the polarity of sentiments when one or more occurrences of a negation term such as "not" appear in a sentence. The concept of the scope of a negation term is introduced. By using a parse tree and typed dependencies generated by a parser, together with special rules proposed by us, we provide a procedure to identify the scope of each negation term. Experimental results show that the identification of the scope of negation improves both the accuracy of sentiment analysis and the retrieval effectiveness of opinion retrieval.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - retrieval models, selection process; I.2.7 [Artificial Intelligence]: Natural Language Processing - text analysis

General Terms: Algorithms, Experimentation, Languages

Keywords: Scope of Negation, Candidate Scope of Negation, Sentiment Analysis, Retrieval Effectiveness, Opinion Retrieval

1. INTRODUCTION
In opinion retrieval, an opinionated document satisfies two conditions: it is relevant to the query and it has an opinion about the query [10]. The TREC 2007 blog track [11] introduced a new polarity classification task: to provide sentiment analysis on opinionated documents, i.e., to determine whether a given opinionated document carries positive, negative or mixed (both positive and negative) opinions. To determine the polarity of an opinionated document, we first classify the polarities of individual sentences and then aggregate the sentence-level results into a document-level polarity. The polarity of a sentence is very often recognized by certain sentimental words or phrases within it. However, their contextual polarities depend on the scope of each negation word or phrase preceding them, because their polarities might be flipped by negation words or phrases.

Existing research [12, 3, 4, 17, 6, 2, 1] has been conducted on determining the impact of negation words or phrases on the sentimental polarity of a sentence. In [12], the scope of a negation word or phrase is assumed to be those words between that negation and the first punctuation mark following it. [3, 4] suggest that the scope of a negation term is its next 5 words. In [17], the polarity of a sentimental term is flipped within the vicinity of negation, which implies that the scope of negation is several words to its right. Besides these heuristics for identifying the scope of a negation word or phrase, some research evaluated the impact of negation differently. [6] introduces the concept of the contextual valence shifter, which covers negation, intensifiers and diminishers. Contextual valence shifters can flip the polarity of a sentimental term, or increase or decrease the degree to which it is positive or negative. [2] categorizes negations into function negations, such as "not", and contextual negations, such as "eliminate". Both kinds of negations can flip the polarity of sentimental terms. The same problem, but in the medical/health domain, was investigated in [1]. However, the precision involving the negative word "not" is very low, at 63%. In this paper, we assume that sentimental terms are either individual words or multi-word phrases whose polarities have been pre-determined by methods such as [15, 16, 18], and that negation terms are either individual negation words or negation phrases. We concentrate on the impact of negation terms on sentiment analysis. Negation terms are not restricted to "not". The most common negation words are: "no", "not" (or its contraction "n't"), "never", "less", "without", "barely", "hardly" and "rarely"; the most common negation phrases are: "no longer", "no more", "no way", "no where", "by no means", "at no time" and "not ... anymore". We restrict our analysis to this set of negation terms in this paper.

The objectives of the paper are to determine the polarities of the parts of a sentence that may be affected by each occurrence of a negation term, and then to utilize the polarity information to improve retrieval effectiveness. Our study has the following contributions. (a) We introduce the concept of the scope of a negation term and provide a methodology to determine it. To our knowledge, this study is the first in which the scope of a negation term is defined and a non-trivial procedure is provided for its computation. (b) We study different methods of determining the polarity of the candidate scope acted on by each occurrence of a negation term in a sentence, where a candidate scope represents a logical unit of a sentence containing the scope. Experiments are performed to compare the effectiveness of these methods. (c) We incorporate our technique of determining polarity into an opinion retrieval system [18, 19] and compare it against other existing techniques. Experimental results show that our technique outperforms the other techniques in retrieval effectiveness.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CIKM'09, November 2-6, 2009, Hong Kong, China.
Copyright 2009 ACM 978-1-60558-512-3/09/11...$10.00.
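As a concrete illustration of the negation lexicon listed in the introduction, the sketch below spots negation words and phrases in a tokenized sentence. The whitespace tokenizer, the longest-match-first strategy, and the omission of the discontiguous phrase "not ... anymore" are our assumptions for illustration, not part of the paper's method.

```python
# Sketch: spotting the negation words/phrases listed in the introduction.
# Assumes a tokenizer that splits contractions so "n't" is its own token;
# the discontiguous phrase "not ... anymore" is omitted for simplicity.

NEGATION_PHRASES = [
    ("no", "longer"), ("no", "more"), ("no", "way"), ("no", "where"),
    ("by", "no", "means"), ("at", "no", "time"),
]
NEGATION_WORDS = {"no", "not", "n't", "never", "less", "without",
                  "barely", "hardly", "rarely"}

def find_negation_terms(tokens):
    """Return (start_index, matched_term) pairs; phrases are matched
    longest-first so 'no longer' is not miscounted as the word 'no'."""
    hits, i = [], 0
    while i < len(tokens):
        for phrase in sorted(NEGATION_PHRASES, key=len, reverse=True):
            if tuple(tokens[i:i + len(phrase)]) == phrase:
                hits.append((i, " ".join(phrase)))
                i += len(phrase)
                break
        else:
            if tokens[i] in NEGATION_WORDS:
                hits.append((i, tokens[i]))
            i += 1
    return hits

tokens = "this camera is no longer reliable and not cheap".split()
print(find_negation_terms(tokens))  # [(3, 'no longer'), (7, 'not')]
```

A real system would run this over parser tokens so that term positions line up with parse-tree leaves, as required by the scope computation in Section 2.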
2. PRELIMINARY DEFINITIONS
To identify the scope of a negation term, t, our strategy is to first compute a candidate scope, which is a minimal logical unit of the sentence containing the scope. Then, we prune those words that are within the candidate scope but not within the scope. Clearly, the candidate scope of t is a subset of the words appearing after t in the sentence. A logical unit is the set of descendant terminal (leaf) nodes of a non-terminal node in a parse tree of the sentence. To ensure that the candidate scope is minimal, we restrict the candidate scope not to extend to another independent clause of the sentence. We give the following computational procedure to approximate the candidate scope.

Procedure ComputeCandidateScope: Suppose a negation term t occurs in a sentence S. Obtain a parse tree for S, say using Stanford's parser [7]. Find the least common ancestor, LCA, of the node representing t and the node representing the word, say t', immediately after t. Then, all descendant leaf nodes of LCA starting from t' and extending to its right-hand side form the candidate scope of t. Intuitively, it is the set of words whose polarities may be flipped by t. The logical unit of the sentence, which consists of the descendant leaf nodes of LCA with the exception of t and its preceding words, is the computed candidate scope.

In order to precisely locate the scope of a negation term in a sentence, we need to analyze the sentence syntactically. The parse tree and typed dependencies [8] of a sentence provide helpful assistance in syntax analysis. A parse tree is an ordered and rooted tree that represents the syntactic structure of the sentence according to some formal grammar. The definitions of typed dependency and five concrete typed dependencies, namely "conjunction", "copula", "open clausal complement", "direct object" and "indirect object", are given in [8, 9] and are omitted due to limitation of space. They will be utilized in determining the scope.

3. IDENTIFY THE SCOPE OF NEGATION
In this section, we discuss how to recognize the scope of a negation term from the candidate scope, utilizing the concepts defined in the last section.

3.1 Delimiters
The candidate scope of a negation term is not always its exact scope. Therefore, after obtaining the candidate scope, we need to identify the actual scope. To do so, the concepts of a delimiter and a conditional word delimiter are introduced. A delimiter d, when encountered in the candidate scope CS of a negation term, eliminates certain words, including d and the words after it, from CS, whereas a conditional delimiter behaves as a delimiter only if certain conditions are satisfied.

Definition 3.1 Delimiter: A delimiter has the capability to eliminate some words from the candidate scope of a negation term. The set of words eliminated by the delimiter is given by either one of the two following rules:
(1) All words including the delimiter and those after it are eliminated, if there is no conjunction satisfying rule (2).
(2) Only a portion of the succeeding words is eliminated by the delimiter. Let w be a delimiter. If w', a succeeding word of w, forms a typed dependency of "conjunction" by either AND or OR with some word w'' that precedes w, and w'' is not eliminated from the candidate scope, then the words from w to w' (including w but excluding w') are eliminated.
Examples of delimiters are "when", "whenever", "whether", "because", "unless", "until", "since" and "hence".

Definition 3.2 Conditional Word Delimiter: A conditional word delimiter may or may not serve as a delimiter. It serves as a delimiter and eliminates a subset of words from the candidate scope if it satisfies some specific conditions. Examples of conditional word delimiters include "so", "as", "which", "who", "why", "where", "for", "like" and quotation marks. If a conditional delimiter serves as a delimiter, whichever of rules (1) and (2) above is applicable is applied to eliminate some words from the candidate scope. The types of conditions that make a conditional delimiter a delimiter include: (a) the part of speech of the conditional delimiter; (b) the location of the negation term relative to the conditional delimiter; and (c) the word leading to an adjective clause. Due to space limitations, only a small number of conditions are listed above.

3.2 Heuristic Rules for Scope Detection
Besides the specific words that serve as word delimiters and conditional word delimiters, we also propose rules involving sentimental verbs, sentimental adjectives and sentimental nouns, such that the word immediately after one of these sentimental terms acts as a delimiter. Furthermore, a heuristic rule concerning double objects is proposed.

Sentimental Verb Rule: Whenever a negation term in a sentence negates a sentimental verb, the word immediately after the verb serves as a delimiter.

Sentimental Adjective Rule: Whenever a sentimental adjective forms a "cop" or "xcomp" typed dependency with the closest preceding copula or verb that is negated by a negation term, the term immediately after this adjective serves as a delimiter.

Sentimental Noun Rule: Whenever a sentimental noun acts as the object of a verb that is negated by a negation term, the term immediately after this noun is a delimiter.

Double Object Rule: Whenever a negation term negates a verb taking double objects, only the direct object should be in the scope and the indirect object should be excluded.

3.3 Exceptions to the Scope of Negation
For a sentence with a negation term, we have introduced various methods for identifying the scope of that negation term. However, sometimes a negation term in a sentence does not have any scope. In this section, we summarize several situations in which a negation term does not have a scope.

Exception Situation 1: Whenever a negation term is part of some special phrase without any negation sense, there is no scope for this negation term. Examples of these special phrases include "not only", "not just", "not to mention" and "no wonder".

Exception Situation 2: A negation term does not have a scope when it occurs in a negative rhetorical question. A negative rhetorical question is identified by the following heuristic: (1) it is a question; and (2) it has a negation term within the first three words of the question.

Exception Situation 3: A negation term does not have a scope when the sentence itself is a restricted comparative sentence. Such a sentence is approximated by the pattern: a modal word (such as "can") immediately followed by a negation term, immediately followed by a copular verb (or "get") or another tense of the verb, and followed by a comparative word.
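To make Procedure ComputeCandidateScope and delimiter rule (1) concrete, the sketch below computes a candidate scope from a toy parse tree and then prunes it at the first delimiter. The nested-tuple tree encoding and the example sentence are our own illustrative assumptions; the paper uses the Stanford parser's output, and conjunction rule (2) and conditional delimiters are omitted here.

```python
# Sketch of ComputeCandidateScope plus delimiter rule (1).
# A tree node is (label, children); a leaf is a plain string.

DELIMITERS = {"when", "whenever", "whether", "because",
              "unless", "until", "since", "hence"}

def leaves_with_paths(tree, path=()):
    """Yield (path, word) for every leaf, left to right."""
    label, children = tree
    for i, child in enumerate(children):
        if isinstance(child, str):
            yield path + (i,), child
        else:
            yield from leaves_with_paths(child, path + (i,))

def candidate_scope(tree, neg_index):
    """Leaves under the LCA of the negation term and the word right
    after it, restricted to the words to the right of the negation."""
    leaves = list(leaves_with_paths(tree))
    p1, p2 = leaves[neg_index][0], leaves[neg_index + 1][0]
    depth = 0  # longest common prefix of the two paths = path to LCA
    while depth < min(len(p1), len(p2)) and p1[depth] == p2[depth]:
        depth += 1
    lca = p1[:depth]
    return [w for i, (p, w) in enumerate(leaves)
            if i > neg_index and p[:depth] == lca]

def prune_at_delimiter(scope):
    """Rule (1): drop the delimiter and everything after it."""
    for i, w in enumerate(scope):
        if w in DELIMITERS:
            return scope[:i]
    return scope

# Toy parse of "I do not like the ending because it is rushed"
tree = ("S",
        [("NP", ["I"]),
         ("VP", ["do", "not",
                 ("VP", ["like",
                         ("NP", ["the", "ending"]),
                         ("SBAR", ["because",
                                   ("S", [("NP", ["it"]),
                                          ("VP", ["is", "rushed"])])])])])])
words = [w for _, w in leaves_with_paths(tree)]
cs = candidate_scope(tree, words.index("not"))
print(cs)                      # ['like', 'the', 'ending', 'because', 'it', 'is', 'rushed']
print(prune_at_delimiter(cs))  # ['like', 'the', 'ending']
```

Here the LCA of "not" and "like" is the inner verb phrase, so the candidate scope is everything after "not" under that node, and "because" then cuts the dependent clause out of the scope, as Definition 3.1 prescribes.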
3.4 Scope Identification Procedure
Having introduced various techniques for identifying the scope of a negation term, we now present a procedure that identifies the scope of each occurrence of a negation term. This procedure takes a sentence with one or more occurrences of negation terms as input and outputs their scopes within the sentence. If a sentiment term is within the scopes of i negation terms, then its polarity is flipped i times.

For each occurrence of a negation term t within a sentence S:

Case 1: If the occurrence of t in S satisfies Exception Situation 1, 2 or 3, then the entire candidate scope of t is discarded.

Case 2: If all three conditions in Case 1 fail, obtain the candidate scope, CS, of t according to the parse tree of the sentence S and then identify the scope from CS according to the following cases.

Case 2.1: There are no word delimiters within CS and none of the sentimental verb, adjective and noun rules or the double object rule is satisfied. In this case, the scope is the candidate scope.

Case 2.2: When a word delimiter, or a conditional word delimiter satisfying the conditions to serve as a delimiter, is encountered, the scope is obtained from CS by applying delimiter rule (1) or (2) in Definition 3.1.

Case 2.3: When the conditions of any of the three sentimental rules are satisfied, the scope is obtained from CS by applying delimiter rule (1) or (2) in Definition 3.1.

Case 2.4: When the condition of the double object rule is satisfied, the candidate scope is modified by discarding the indirect object.

It is possible that different parts of a candidate scope of a negation term satisfy the conditions of Cases 2.2, 2.3 or 2.4. For each satisfied part, the candidate scope is modified accordingly.

4. SENTIMENT ANALYSIS
4.1 Sentiment Analysis on Candidate Scope
To analyze the polarity of a candidate scope CS: if CS contains some sentimental terms, the contextual polarities of the sentimental terms are first determined by taking into consideration their predetermined polarities and the scopes of any negation terms preceding them; CS is then classified as positive or negative if all contextual polarities of its sentimental terms are of the same polarity; otherwise, CS is classified as mixed. If no sentimental terms occur in CS, the polarity of CS is neutral. Let this simple combination method be denoted by SM.

Besides SM, we also employ a decision tree to determine the polarity of CS. The decision tree has 8 independent features and requires training examples. These features indicate the syntactic roles of the sentimental terms within CS and are explained below. Each of these features can take on one of four values: {positive, negative, mixed, neutral}. The 8 features are: (1) SI: the sentimental polarity of the subject of an independent clause. (2) PI: the sentimental polarity of the predicate of an independent clause. (3) OI: the sentimental polarity of the object of an independent clause. (4) MI: the sentimental polarity of the modifier of an independent clause. (5) SD: the sentimental polarity of the subject of a dependent clause. (6) PD: the sentimental polarity of the predicate of a dependent clause. (7) OD: the sentimental polarity of the object of a dependent clause. (8) MD: the sentimental polarity of the modifier of a dependent clause. For a sentence, all feature values are assigned accordingly. In general, for each candidate scope, CS, a vector of 8 features is computed and fed into Quinlan's C4.5 decision tree program [13] to generate the classifier. Let the decision tree method be denoted by DT.

4.2 Retrieval Effectiveness on TREC Collection
The scope identification technique described in Section 3 is incorporated into an opinion retrieval system [18, 19] to improve the retrieval effectiveness of polarity classification of TREC documents for given queries. A brief description is as follows. Given a query, opinionated relevant documents are first retrieved by an opinion retrieval system [18, 19], and their opinionative sentences are then classified as positively or negatively opinionative by a polarity classifier, which uses features consisting of positive and negative sentimental terms. If a feature is within the scopes of an odd number of negation terms, its negated feature is used instead. All features present in the remaining part of the sentence are not modified. The classifier used to classify each sentence is SVM-Light [5], which produces either a positive score or a negative score for each sentence. Each opinionative document is assigned a positive score and a negative score by summing the positive/negative scores of its opinionative sentences. Depending on the proportion of its positive score to its negative score, a document is classified as positive, negative or mixed. Two ranked lists of opinionated documents are produced: one for the positively ranked documents and the other for the negatively ranked documents.

5. EXPERIMENTS
To evaluate the effectiveness of our method on polarity determination, we conduct two sets of experiments. In each set of experiments, we compare our method with other methods, which handle the negation terms in different ways. The first set of experiments involves the accuracy of computing the polarity of a sentence. The second set of experiments involves the ranking of positively and negatively opinionated documents retrieved from 3.2 million TREC documents with respect to 150 TREC queries.

5.1 Experimental Results
5.1.1 Sentiment Analysis on Candidate Scope
We now evaluate the accuracy of our method, SCT, in identifying the scope of negation on a dataset of 1000 sentences. These sentences are randomly sampled from a review corpus crawled from Rateitall.com. Instead of computing the scope of negation, T, using our proposed method, three heuristics for identifying T are as follows: (a) T is within K words to the right of the negation term [3, 4, 17]. For (a), we test values of K = 3, 4 and 5, and K = 4 gives the best results; thus, we report results for K = 4 only in this section. This method (a) is denoted by SC4. (b) T is the set of words containing the first sentimental term to the right of the negation word. This method (b) is denoted by SC1st. (c) T is the set of all the words within CS, and this method is denoted by SCCS. The polarity of the candidate scope can be determined automatically using either SM or DT. For example, SCT+DT denotes that the scope of negation is first identified by SCT and then the polarity of the candidate scope is determined by DT.
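The simple combination method SM of Section 4.1, together with the odd-number-of-negations flipping rule, can be sketched as follows. The miniature polarity lexicon and the (start, end) token-range encoding of negation scopes are illustrative assumptions, not the paper's actual resources.

```python
# Sketch of the combination method SM (Section 4.1): flip a sentimental
# term's predetermined polarity once per enclosing negation scope, then
# combine the contextual polarities. Lexicon and scope encoding are
# assumptions for illustration.

PRIOR = {"great": "positive", "love": "positive", "boring": "negative"}

def contextual_polarity(index, word, scopes):
    """A term inside an odd number of negation scopes has its
    predetermined polarity flipped; an even number leaves it intact."""
    flips = sum(1 for start, end in scopes if start <= index < end)
    polarity = PRIOR[word]
    if flips % 2 == 1:
        polarity = "negative" if polarity == "positive" else "positive"
    return polarity

def classify_candidate_scope(tokens, scopes):
    """SM: positive/negative if all contextual polarities agree,
    'mixed' if they disagree, 'neutral' if no sentimental term occurs."""
    polarities = {contextual_polarity(i, w, scopes)
                  for i, w in enumerate(tokens) if w in PRIOR}
    if not polarities:
        return "neutral"
    return polarities.pop() if len(polarities) == 1 else "mixed"

# The scope of "not" covers tokens 3..5 ("love this plot"),
# so "love" flips from positive to negative.
print(classify_candidate_scope("i do not love this plot".split(), [(3, 6)]))
print(classify_candidate_scope("a great story".split(), []))
```

The same flipping rule is what Section 4.2 applies to classifier features: a feature inside an odd number of negation scopes is replaced by its negated counterpart before sentence classification.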
Table 1: The Accuracies of Various Methods.

Method     Accuracy   Method      Accuracy
SCT+DT     88.4%      SC1st+DT    82.8%
SCT+SM     85.8%      SC1st+SM    79.8%
SC4+DT     83.6%      SCCS+DT     82.1%
SC4+SM     80.7%      SCCS+SM     79.2%

5.1.2 Retrieval Effectiveness on TREC Collection
In the second set of experiments, we rank the positively and negatively opinionative documents in the TREC blogosphere collection over all 150 queries released from 2006 to 2008. In the blog track of TREC 2008, the key measure of retrieval effectiveness is the Mean Average Precision (MAP); we utilize the same measure here. Our method for ranking these two sets of documents for each query has been described in Section 4.2. It is denoted by SCT and is compared against the other methods listed below. (1) The method utilized in [19] is denoted by SCBL; it only flips the polarity of the closest sentimental term. (2) The scope of each negation term is within K words to the right of the negation term; two methods with K = 4 and K = 5 are proposed in [3, 4, 17] and are denoted SC4 and SC5. (3) Two methods [2] have been proposed to determine the polarity of an expression within a sentence. An author of [2] suggested that we utilize the method denoted by SCNegEx [2], which achieved the best accuracy of sentiment analysis at the sentence level on the SemEval-07 corpus [14]; we follow this suggestion. The gold standard provided by TREC is utilized. Table 2 shows the effectiveness of the various methods in ranking positive and negative documents.

Table 2: MAP scores of 5 methods on all 150 TREC queries (851-950 and 1001-1050).

Method     Positive   Improvement by SCT   Negative   Improvement by SCT
SCBL       0.1596     2.9%                 0.0779     11.3%
SC4        0.1634     0.5%                 0.0805     9.8%
SC5        0.1630     0.5%                 0.0812     8.9%
SCNegEx    0.1487     10.4%                0.0823     7.4%
SCT        0.1642     -                    0.0884     -

6. CONCLUSION
We study the impact of each occurrence of a negation term in a sentence on the sentence's polarity. We introduce the concept of the scope of a negation term t, which is precisely the sequence of words after t that is affected by t, and we provide techniques to compute it. Two sets of experiments are performed to compare our method against other existing methods. Experimental results show that our method outperforms the other methods in both the accuracy of sentiment analysis and the retrieval effectiveness of polarity classification in opinion retrieval.

7. ACKNOWLEDGEMENTS
We acknowledge the support of NSF via grants IIS-0842546 and IIS-0842608.

8. REFERENCES
[1] W. W. Chapman, W. Bridewell, P. Hanbury, G. F. Cooper and B. G. Buchanan. A simple algorithm for identifying negated findings and diseases in discharge summaries. J. Biomed. Inform., 34(5):301-310, Oct. 2001.
[2] Yejin Choi and Claire Cardie. Learning with Compositional Semantics as Structural Inference for Subsentential Sentiment Analysis. In Proc. of EMNLP 2008.
[3] G. Grefenstette, Y. Qu, J. Shanahan and D. Evans. Coupling Niche Browsers and Affect Analysis for an Opinion Mining Application. In Proc. of RIAO 2004.
[4] Minqing Hu and Bing Liu. Mining and summarizing customer reviews. In Proc. of SIGKDD 2004.
[5] T. Joachims. Making large-scale SVM learning practical. In Advances in Kernel Methods: Support Vector Learning, 1999.
[6] Alistair Kennedy and Diana Inkpen. Sentiment Classification of Movie Reviews Using Contextual Valence Shifters. Computational Intelligence, Vol. 22 (2006), pp. 110-125.
[7] Dan Klein and Christopher D. Manning. Accurate Unlexicalized Parsing. In Proc. of ACL 2003, pp. 423-430.
[8] Marie-Catherine de Marneffe, Bill MacCartney and Christopher D. Manning. Generating Typed Dependency Parses from Phrase Structure Parses. In Proc. of LREC 2006.
[9] Marie-Catherine de Marneffe and Christopher D. Manning. Stanford typed dependencies manual, September 2008.
[10] I. Ounis, M. de Rijke, C. Macdonald, G. Mishne and I. Soboroff. Overview of the TREC-2006 Blog Track. In TREC 2006.
[11] I. Ounis, C. Macdonald and I. Soboroff. Overview of the TREC-2007 Blog Track. In TREC 2007.
[12] Bo Pang, Lillian Lee and Shivakumar Vaithyanathan. Thumbs up? Sentiment classification using machine learning techniques. In Proc. of EMNLP 2002.
[13] J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
[14] Carlo Strapparava and Rada Mihalcea. SemEval-2007 task 14: Affective text. In Proc. of SemEval 2007.
[15] M. Taboada and J. Grieve. Analyzing appraisal automatically. In AAAI Spring Symposium on Exploring Attitude and Affect in Text: Theories and Applications, 2004, pp. 158-161.
[16] T. Wilson, J. Wiebe and P. Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In Proc. of HLT/EMNLP 2005.
[17] K. Yang. WIDIT in TREC 2008 Blog Track: Leveraging Multiple Sources of Opinion Evidence. In TREC 2008.
[18] W. Zhang, C. Yu and W. Meng. Opinion Retrieval from Blogs. In Proc. of CIKM 2007.
[19] W. Zhang, L. Jia, C. Yu and W. Meng. Improve the Effectiveness of the Opinion Retrieval and Opinion Polarity Classification. In Proc. of CIKM 2008.
