SentA Russir Day2

Sentiment analysis involves automatically detecting sentiment in text. There are several types of sentiment analysis tasks such as subjectivity detection, polarity detection, sentiment strength detection, and multiple sentiment detection. Standard machine learning methods and linguistic algorithms are commonly used for sentiment analysis. Machine learning methods use training data to learn how to classify sentiment while linguistic algorithms incorporate additional language knowledge but require no training data.

Uploaded by

Miu Cornel Gabriel


Sentiment analysis tasks and methods
Mike Thelwall
University of Wolverhampton, UK
Information Studies
Contents
Types of sentiment analysis task
Standard machine learning methods
Linguistic algorithms
Terminology and problems
Sentiment Analysis (SA), AKA Opinion Mining, is the task of
automatically detecting sentiment in text
Active research area since ~2002
Standard part of online market research toolkits
Commonly used for automatic processing of large numbers of texts to
identify opinions about products or brands
Opinions are personal judgements about something
It is good. It is bad. It is expensive.
Subjective text contains opinions; Objective text states only facts.
Sentiments are expressions of emotion or attitude or opinion
It is good. It is bad. It is expensive. I like it. I am happy. I am depressed. I
am angry at you.
Sentiment analysis is sometimes thought of as the prediction of
people's private/internal states from text
http://www.cs.cornell.edu/home/llee/opinion-mining-sentiment-analysis-survey.html
Opinion Mining Applications
Identify unpopular features of BMWs
Automatic analysis of thousands of comments in
BMW car forum
Identify features and sentiment
Identify if a new computer is popular
Automatic analysis of all blogs
Compare to results for other computers
Identify impact of TV advertising campaign
Automatic analysis of all blogs
Identify and detect sentiment in product mentions
Commercial sentiment analysis goals
Determine overall opinions about a
product
E.g., the M90 phone is excellent.
E.g., the M90 is expensive but excellent.
Determine opinions about parts of a
product
E.g., the screen of the M90 is too small but
its weight is very light.
I love the steering wheel on the new
Picasso!
Gamon et al. (2005). Pulse: Mining customer opinions from free text.
Lecture Notes in Computer Science, 3646, 121-132.
Commercial sentiment analysis goals (continued)
Determine changes in overall customer brand opinion
(e.g., daily proportions of positive/negative
comments)
In response to advertising
As routine monitoring
Identify individual unhappy customers
E.g., identify Tweets that mention the brand and
are negative
Endnote web is driving me mad, argggggh!!!
Social science sentiment analysis goals
Track trends in sentiment over time
(see next slide)
Identify changes in sentiment
Discover patterns in sentiment use in a
communication medium
E.g., gender, age, nationality differences
Do women/Russians use more sentiment?
#oscars
[Figure: time series for tweets, 9 Feb 2010 to 9 Mar 2010 (x-axis: date and time). Series shown: proportion of tweets mentioning the Oscars (% matching posts); average +ve sentiment strength (subjective tweets only); average -ve sentiment strength (subjective tweets only). Annotation: increase in -ve sentiment strength.]
Types of sentiment analysis task 1: Subjectivity detection
Detecting whether a text is opinionated/subjective or neutral/objective
Binary decision
Can use machine learning
Does not classify polarity
This phone is very cheap.
This phone costs 200 roubles.
I love the phone.
Types of sentiment analysis task 2: Polarity detection
Detecting whether a subjective text is
positive or negative
Binary decision
Can use machine learning
This phone is very cheap.
It is lovely.
I am frustrated with the phone.
Types of sentiment analysis task 3:
Sentiment strength detection
Measuring the strength of sentiment in
a text
Scale ratings: many different ones used
E.g.,
strong negative 1-2-3-4-5-6-7-8-9 strong positive OR
1-2-3-4-5 negative & 1-2-3-4-5 positive
The car is very good.
I am tired but happy.
Types of sentiment analysis task 4:
Multiple sentiment detection
Detecting a range of emotions
E.g., happy, sad, angry, depressed, excited
Is harder and some emotions are rare
in text.
2. Machine learning
Machine learning algorithms typically
have a variety of parameters that can
be learned
Input: set of human-classified texts
Algorithm adjusts its parameters to
perform well on the human-classified
texts
Should also be accurate on similar new
texts
Machine learning overview
Training data: (typically) human-annotated with the correct
sentiment values and used for training the algorithm
Test data: identical to the above except used for testing the
trained algorithm to see how accurate it is
Step 1: training data + untrained algorithm -> trained algorithm
Step 2: testing data + trained algorithm -> results
Training example
Features: anna, hate, i, love, you
d1: "I hate Anna." d2: "I love you."
d1 feature vector: (1,1,1,0,0)
d2 feature vector: (0,0,1,1,1)
An algorithm is told d1 is negative and
d2 is positive: what will it learn?
(-,-,?,+,+)
Training example 2
What will the algorithm learn now?
Features: anna, hate, i, love, you
d1: "I hate Anna." d2: "I love you." d3: "I love Anna."
d1 feature vector: (1,1,1,0,0) negative
d2 feature vector: (0,0,1,1,1) positive
d3 feature vector: (1,0,1,1,0) positive
(?,-,?,+,+)
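The two training examples can be reproduced with a short sketch. It builds binary feature vectors over the slide's five features and then marks each feature by the classes it has been seen in. The function names are hypothetical, and the "learning" rule here is only a toy stand-in for what a real algorithm would infer.

```python
# Toy illustration of the training examples (not a real learning algorithm).
FEATURES = ["anna", "hate", "i", "love", "you"]

def to_feature_vector(text, features=FEATURES):
    """Return a binary vector: 1 if the feature word occurs in the text."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return tuple(1 if f in words else 0 for f in features)

def naive_feature_signs(labelled_docs, features=FEATURES):
    """Mark each feature '-', '+' or '?' from the classes it occurs in."""
    signs = []
    for i, _ in enumerate(features):
        classes = {label for vec, label in labelled_docs if vec[i] == 1}
        if classes == {"negative"}:
            signs.append("-")
        elif classes == {"positive"}:
            signs.append("+")
        else:
            signs.append("?")  # seen in both classes, or never seen
    return tuple(signs)

d1 = to_feature_vector("I hate Anna.")
d2 = to_feature_vector("I love you.")
d3 = to_feature_vector("I love Anna.")

two_docs = [(d1, "negative"), (d2, "positive")]
three_docs = two_docs + [(d3, "positive")]

print(d1, d2, d3)
print(naive_feature_signs(two_docs))    # ('-', '-', '?', '+', '+')
print(naive_feature_signs(three_docs))  # ('?', '-', '?', '+', '+')
```

Adding the third document makes "anna" ambiguous, because it now occurs in both a negative and a positive text.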
Types of machine learning algorithm
Many generic and many highly tailored
machine learning algorithms
For text analysis there is an important
distinction between types:
Linguistic: use grammatical and other
knowledge about language to understand
the text analysed (e.g., SentiStrength)
Non-linguistic: use brute force methods
that do not incorporate linguistic
knowledge (e.g., with feature vector
inputs)
Non-linguistic algorithms
General mathematical methods
incorporating abstract intuitions about
how to learn to guess correct sentiment
value from training data
The algorithm makes its judgement
based solely on the feature vectors
Support Vector Machines
Popular and powerful
maps the feature vectors into a high-dimensional
space in a clever way
finds a hyperplane (a bit like a straight line) that
separates the training data into two different
classes as well as possible
uses the same hyperplane to predict the classes of
the test data or unknown data
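[Figure: scatter plot of points labelled p, n and u (presumably positive, negative and unknown-class texts)]
Support Vector Machines - Example
Find the hyperplane
[Figure: scatter plot of p, n and u points in which a separating hyperplane is to be found]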
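The idea of finding a separating hyperplane and reusing it on unknown points can be sketched in plain Python. Note that this is a perceptron, a simpler stand-in and not an SVM: it finds some separating hyperplane for linearly separable data, whereas an SVM looks for the one with the largest margin and can also map the data into a higher-dimensional space. The 2D points and function names are made up for illustration.

```python
def train_perceptron(points, labels, max_epochs=1000):
    """Find weights w and bias b with label * (w.x + b) > 0 for all points.

    Labels are +1 ("p") or -1 ("n"). This only converges if the data are
    linearly separable; otherwise it stops after max_epochs.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(max_epochs):
        updated = False
        for x, y in zip(points, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified or on the hyperplane
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                updated = True
        if not updated:  # a full pass with no mistakes: hyperplane found
            break
    return w, b

def predict(w, b, x):
    """Classify x by which side of the hyperplane w.x + b = 0 it lies on."""
    return "p" if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else "n"

# Made-up training data: five "p" points and five "n" points.
positives = [(2, 2), (4, 2), (2, 4), (4, 4), (3, 5)]
negatives = [(0, 0), (-2, 0), (0, -2), (-2, -2), (-1, 1)]
points = positives + negatives
labels = [1] * len(positives) + [-1] * len(negatives)

w, b = train_perceptron(points, labels)

# The same hyperplane is then used for the unknown ("u") points.
unknown = [(3, 3), (-1, -1)]
print([predict(w, b, u) for u in unknown])  # ['p', 'n']
```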
Other generic machine learning algorithms
Naïve Bayes: makes simple probability
assumptions
Rule generators: e.g., find simple rules like
"If document contains 'love' and doesn't
contain 'hate' then classify it as positive"
Genetic algorithms
Logistic regression
Decision tables
Boosting algorithms
Multilayer perceptron
Many more, and each one has many
variations and parameters
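As an illustration of the "simple probability assumptions" behind Naïve Bayes, here is a minimal sketch of a multinomial Naïve Bayes classifier with add-one smoothing. It assumes words occur independently given the class. The three training texts are the toy sentences from the training example; the function names are hypothetical and this is not how any particular toolkit implements it.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lower-case the text and strip basic punctuation from each word."""
    return [w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")]

def train_naive_bayes(labelled_texts):
    """Count documents per class and words per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labelled_texts:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def classify(text, model):
    """Return the class with the highest log prior + log likelihood."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocabulary:  # ignore words never seen in training
                # add-one (Laplace) smoothing avoids zero probabilities
                score += math.log((word_counts[label][word] + 1)
                                  / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes([
    ("I hate Anna.", "negative"),
    ("I love you.", "positive"),
    ("I love Anna.", "positive"),
])
print(classify("I love you", model))   # positive
print(classify("I hate Anna", model))  # negative
```

With only three training texts the class prior has a strong influence, so results on other sentences can be counter-intuitive; realistic use needs far more training data.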
Practical advice
Use Weka with many machine learning
algorithms to run tests and develop a
system (no programming needed)
www.cs.waikato.ac.nz/ml/weka/
For text analysis, need to write code to
convert data into feature vectors
Or use text-specific analysis packages like GATE that focus more on
natural language processing (gate.ac.uk)
OR SVMLight (free online, fairly easy to use)
Weka
Contains many components that can be built into processing
pipelines
Can use in five different ways
Explorer: load data and try different algorithms on it (not
large data sets)
Experimenter: set up large-scale experiments with different
algorithms and data
KnowledgeFlow: connect together multiple algorithms on the
fly
Command line interface: one algorithm at a time
Java programs: API for systematic and customised
testing
Sample Weka process
1. Load data file (add data file loading component to
interface; enter name of data file)
2. Mark one of the data columns as the "correct" value, or
class, to be predicted
3. Split the data into training and testing sets
4. Train the ML algorithm to the training data, evaluate
it on the testing data
5. Calculate accuracy statistics on the results
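The five steps above are independent of Weka and can be sketched in plain Python. This is not the Weka API: the in-memory CSV, the column names and the word-weight "classifier" are all assumptions made up for illustration, and the data set is far too small for a meaningful accuracy figure.

```python
import csv
import io
from collections import Counter

RAW = """text,label
I love this phone,positive
I hate this phone,negative
The screen is lovely,positive
The battery is awful,negative
Lovely and light,positive
Awful and expensive,negative
I love the screen,positive
I hate the battery,negative
"""

# 1. Load the data file (an in-memory CSV stands in for a real file).
rows = list(csv.DictReader(io.StringIO(RAW)))

# 2. Mark the column that holds the class to be predicted.
CLASS_COLUMN = "label"

# 3. Split the data into training and testing sets (first 75% / last 25%).
cut = int(len(rows) * 0.75)
train, test = rows[:cut], rows[cut:]

# 4. Train: weight each word by (positive texts - negative texts) containing it.
def train_word_weights(examples):
    weights = Counter()
    for row in examples:
        step = 1 if row[CLASS_COLUMN] == "positive" else -1
        for word in set(row["text"].lower().split()):
            weights[word] += step
    return weights

def classify(text, weights):
    """Sum the weights of the words in the text; unseen words count as 0."""
    score = sum(weights[w] for w in text.lower().split())
    return "positive" if score > 0 else "negative"

weights = train_word_weights(train)
predictions = [classify(row["text"], weights) for row in test]

# 5. Calculate accuracy statistics on the held-out test rows.
correct = sum(p == row[CLASS_COLUMN] for p, row in zip(predictions, test))
accuracy = correct / len(test)
print(f"accuracy = {accuracy:.2f}")  # accuracy = 1.00
```

Keeping the test rows out of the training step is the key point: accuracy measured on the training data itself would overstate how well the algorithm handles new texts.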
Weka 2
Many different options
Takes time to get used to
Is very powerful and flexible
Need ML understanding to use
Linguistic algorithms
Incorporate additional grammatical and other information about
language
Typically use a scoring function to predict sentiment
Tend to be more accurate but take much more time to run
Examples of additional power:
The word "like" can be positive (verb) or neutral (preposition):
linguistic techniques can disambiguate the two senses.
The words "hate" and "hated" have the same lexical root, and a
similar meaning to "loathe" and "loathed"
"not" often reverses the meaning of subsequent words
there are many idioms that have special meanings
sarcasm has special meanings
Linguistic knowledge of the possible meanings of words can give
algorithms a head start
E.g., SentiWordNet lists many classes of positive and negative
words
Example - SentiStrength
I love my Lada
->
I love[+3] my Lada. (-1, 3)
I do not hate traffic.
->
I do not[reverse] hate[-4] traffic. (-1, 4)
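The two examples above can be reproduced with a toy lexicon-based scorer. This is only a sketch of the general idea (a word list with strengths plus a negation rule), not the actual SentiStrength algorithm. The lexicon holds just the two word scores shown on the slide, and the rule that "not" flips only the next word is an assumption.

```python
# Hypothetical mini-lexicon reusing the two word strengths from the slide.
LEXICON = {"love": 3, "hate": -4}
NEGATORS = {"not"}

def score_sentence(text):
    """Return (negative, positive) strengths on -1..-5 and 1..5 scales."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    positive, negative = 1, -1  # 1 and -1 mean "no sentiment of that polarity"
    reverse_next = False
    for token in tokens:
        if token in NEGATORS:
            reverse_next = True
            continue
        strength = LEXICON.get(token, 0)
        if strength and reverse_next:
            strength = -strength  # "not" flips the following sentiment word
        reverse_next = False
        positive = max(positive, strength)
        negative = min(negative, strength)
    return negative, positive

print(score_sentence("I love my Lada"))         # (-1, 3)
print(score_sentence("I do not hate traffic."))  # (-1, 4)
print(score_sentence("I hate traffic."))         # (-4, 1)
```

No training data is involved: all of the "knowledge" sits in the lexicon and the negation rule.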
Linguistic resources
Part of speech tagger
Predicts use of any given word
Sentiment resource or lexicon
E.g., SentiWordNet = network of groups of
sentiment words and meanings
Chunker: identifies sentences and
phrases
Standard toolkits include GATE and
LingPipe
SentiWordNet
Example of a linguistic resource
Based on WordNet
Database of word meanings and relationships
Can use to find words similar to any given word
Uses synsets: groups of semantically equivalent words
~ 150,000 words in ~ 115,000 synsets
SentiWordNet http://sentiwordnet.isti.cnr.it/
Assigns three probability-like scores to each WordNet
Synset: for objectivity, positivity and negativity
The scores are based on an estimation algorithm
Powerful resource for estimating the sentiment of individual
words
Needs linguistic processing of source text to match words to
synsets.
E.g., to disambiguate the different "like" synsets
Unsupervised algorithms
An algorithm is supervised if it requires a
training stage
An algorithm is unsupervised if it
requires no training
An algorithm is semi-supervised if it has
a limited training stage
Machine learning algorithms tend to be
supervised or semi-supervised
Linguistic algorithms are often
unsupervised = no need for training data
Summary
There are several different sentiment analysis tasks
and many different applications of sentiment analysis
Machine learning is based upon algorithms that learn
to solve classification or clustering problems from
human-coded examples
Linguistic algorithms use knowledge of language to
improve performance, but may be less customisable
to specific domains (see later)
Bibliography
Pang, B., & Lee, L. (2008). Opinion mining and
sentiment analysis. Foundations and Trends in
Information Retrieval, 2(1-2), 1-135.
Witten, I. H., & Frank, E. (2005). Data mining:
Practical machine learning tools and techniques. San
Francisco: Morgan Kaufmann.
Gamon, M., Aue, A., Corston-Oliver, S., & Ringger, E.
(2005). Pulse: Mining customer opinions from free
text. Lecture Notes in Computer Science, 3646, 121-132.
