ratioSentimentScores

Sentiment scores with ratio rule

Syntax

compoundScores = ratioSentimentScores(documents)

[compoundScores,positiveScores,negativeScores] = ratioSentimentScores(documents)

___ = ratioSentimentScores(___,Name,Value)

Description

Use ratioSentimentScores to evaluate sentiment in tokenized text with a ratio rule. The ratioSentimentScores function, by default, uses the VADER sentiment lexicon.

compoundScores = ratioSentimentScores(documents) returns sentiment scores for tokenized documents based on the ratio of the positive and negative token scores. For each document where the ratio of the positive score to the negative score is larger than 1 (the default threshold), the function returns 1. For each document where the ratio of the negative score to the positive score is larger than 1, the function returns -1. Otherwise, the function returns 0.
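
The rule described above can be sketched in base MATLAB as follows. This is a minimal illustration, not the toolbox implementation: ratioRule is a hypothetical helper, the input values are made up, and the sketch assumes that both the positive and negative sums are expressed as nonnegative magnitudes.

```matlab
% Hypothetical helper that illustrates the ratio rule (not the toolbox code).
% pos and neg are assumed to be nonnegative per-document score sums.
ratioRule = @(pos,neg,threshold) ...
    double(pos./neg > threshold) - double(neg./pos > threshold);

pos = [3.2; 0.5; 1.0];   % made-up positive score sums
neg = [1.1; 2.0; 1.0];   % made-up negative score sums

scores = ratioRule(pos,neg,1)
% scores is [1; -1; 0]: the third document has a ratio of exactly 1,
% which is not larger than the threshold in either direction.
```

In this sketch, a document whose two sums are equal is scored 0 because neither ratio exceeds the threshold.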

[compoundScores,positiveScores,negativeScores] = ratioSentimentScores(documents) also returns the sums of the positive and negative token scores of each document, respectively.

___ = ratioSentimentScores(___,Name,Value) specifies additional options using one or more name-value pairs.

Examples

Evaluate Sentiment in Text

Create a tokenized document array.

str = [
    "The book was VERY good!!!!"
    "The book was terrible."];
documents = tokenizedDocument(str);

Evaluate the sentiment of the tokenized documents. A score of 1 indicates positive sentiment, a score of -1 indicates negative sentiment, and a score of 0 indicates neutral sentiment.

compoundScores = ratioSentimentScores(documents)

compoundScores = 2×1

     1
    -1
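
To inspect the positive and negative sums behind each compound score, you can request all three outputs. The following sketch reuses the text from this example and collects the results in a table; the variable name results is illustrative, and the output is not shown because the individual score values depend on the lexicon.

```matlab
str = [
    "The book was VERY good!!!!"
    "The book was terrible."];
documents = tokenizedDocument(str);

% Request the compound scores together with the per-document sums.
[compoundScores,positiveScores,negativeScores] = ratioSentimentScores(documents);

% Place the text and scores side by side for inspection.
results = table(str,compoundScores,positiveScores,negativeScores)
```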

Evaluate Sentiment Using Custom Lexicon

Sentiment analysis algorithms rely on annotated lists of words called sentiment lexicons. For example, the ratioSentimentScores function uses a sentiment lexicon with words annotated with a sentiment score ranging from -1 to 1, where scores close to 1 indicate strong positive sentiment, scores close to -1 indicate strong negative sentiment, and scores close to zero indicate neutral sentiment.

If the sentiment lexicon used by the ratioSentimentScores function does not suit the data you are analyzing, for example, if you have a domain-specific data set such as medical or engineering data, then you can use your own custom sentiment lexicon. For an example showing how to generate a domain-specific sentiment lexicon, see Generate Domain Specific Sentiment Lexicon.

Create a tokenized document array containing the text data to analyze.

textData = [
    "This company is showing extremely strong growth."
    "This other company is accused of misleading consumers."];
documents = tokenizedDocument(textData);

Load the example domain-specific lexicon for finance data.

filename = "financeSentimentLexicon.csv";
tbl = readtable(filename);
head(tbl)

        Token         SentimentScore
    ______________    ______________

    {'innovative'}             4    
    {'greater'   }        3.6216    
    {'efficiency'}        3.5971    
    {'enhance'   }        3.5628    
    {'better'    }        3.5532    
    {'creative'  }        3.5358    
    {'strengthen'}        3.5161    
    {'improved'  }         3.484

Evaluate the sentiment using the ratioSentimentScores function and specify the custom sentiment lexicon using the 'SentimentLexicon' option. A score of 1 indicates positive sentiment, a score of -1 indicates negative sentiment, and a score of 0 indicates neutral sentiment.

compoundScores = ratioSentimentScores(documents,'SentimentLexicon',tbl)

compoundScores = 2×1

     1
    -1

Input Arguments

`documents` — Input documents
`tokenizedDocument` array

Input documents, specified as a tokenizedDocument array.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'Threshold',0.5 sets the ratio threshold to 0.5.

`SentimentLexicon` — Sentiment lexicon
table

Sentiment lexicon, specified as a table with these variables:

Token – Token, specified as a string scalar. The tokens must be lowercase.
SentimentScore – Sentiment score of token, specified as a numeric scalar, where positive scores indicate positive sentiment, negative scores indicate negative sentiment, and 0 indicates neutral sentiment.

The default sentiment lexicon is the VADER sentiment lexicon.

Data Types: table
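
As a minimal sketch of the table layout described above, you can also build a small lexicon by hand instead of reading one from a file. The tokens and scores below are made up for illustration and do not come from any published lexicon; the variable name lexicon is likewise illustrative.

```matlab
% Hand-built lexicon with the two required variables.
% Tokens are lowercase; the scores are illustrative values only.
Token = ["growth"; "strong"; "misleading"; "accused"];
SentimentScore = [2.5; 2.0; -3.0; -2.5];
lexicon = table(Token,SentimentScore);

documents = tokenizedDocument([
    "This company is showing extremely strong growth."
    "This other company is accused of misleading consumers."]);

% Score the documents using the custom lexicon instead of the default.
compoundScores = ratioSentimentScores(documents,'SentimentLexicon',lexicon)
```

A lexicon this small leaves most tokens unscored, so in practice you would typically start from a larger domain-specific list.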

`Threshold` — Ratio threshold
1 (default) | nonnegative scalar

Ratio threshold, specified as a nonnegative scalar.

If the ratio of the positive score to negative score of documents(i) is larger than Threshold, then compoundScores(i) is 1. If the ratio of the negative score to positive score of documents(i) is larger than Threshold, then compoundScores(i) is -1. Otherwise, compoundScores(i) is 0.
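
To see how the threshold affects the result, you can score the same document with the default threshold and with a stricter one. This is a usage sketch only: the sentence is made up, and the output is not shown because it depends on the lexicon scores of the individual tokens.

```matlab
documents = tokenizedDocument("The book was good but the ending was terrible.");

% Default threshold of 1: either ratio only needs to exceed 1.
scoresDefault = ratioSentimentScores(documents)

% Stricter threshold: one score sum must be more than twice the other,
% otherwise the document is scored 0.
% In R2021a and later, this can also be written as Threshold=2.
scoresStrict = ratioSentimentScores(documents,'Threshold',2)
```

Raising the threshold makes the function more likely to return 0 for documents that mix positive and negative tokens.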

Output Arguments

`compoundScores` — Compound sentiment scores
numeric vector

Compound sentiment scores, returned as a numeric vector. The function returns one score for each input document.

`positiveScores` — Positive sentiment scores
numeric vector

Positive sentiment scores, returned as a numeric vector. The function returns one score for each input document. The value positiveScores(i) corresponds to the positive sentiment score of documents(i).

`negativeScores` — Negative sentiment scores
numeric vector

Negative sentiment scores, returned as a numeric vector. The function returns one score for each input document. The value negativeScores(i) corresponds to the negative sentiment score of documents(i).

Version History

Introduced in R2019b
