The Power of Natural Language Processing using TextBlob

The data science field has exploded with interest from developers, businesses, institutions, universities, and even criminals, all looking to turn raw data into information and realize its true value. Efforts to mimic the function of the human brain in code have shown that the ability to analyze language is one of the most significant paths toward progress for society and humanity in general. As a result, data science, machine learning, and natural language processing are now built into tools consumers use every day, such as Siri, Alexa, and Cortana.

The development of open-source data and data processing tools has encouraged robust approaches to using data for natural language processing, sentiment analysis, and emotion detection. Data is the next gold rush as we begin to understand how data needs to be extracted, transformed, loaded, and, for full benefit, turned into information. In theory, like gold, data is a commodity.

In this article, I plan to give you a basic understanding of how to use programming tools to conduct sentiment analysis with Python on social media text, and how you can analyze text-based data to understand the real emotions behind it using code.

Introducing TextBlob

As an NLP library for Python, TextBlob has been one of the go-to packages for developers.

TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. — TextBlob’s website

Understanding Sentiment Analysis Using TextBlob

The sentiment property returns a namedtuple of the form Sentiment(polarity, subjectivity). The polarity score is a float within the range [-1.0, 1.0]. The subjectivity is a float within the range [0.0, 1.0] where 0.0 is very objective and 1.0 is very subjective. — TextBlob Quickstart Tutorial

TextBlob returns polarity and subjectivity of a sentence.

Polarity is a float in the range [-1.0, 1.0].

  • -1 defines a negative sentiment.
  • 1 defines a positive sentiment.

Subjectivity lies between [0.0,1.0].

  • 0.0 is very objective and 1.0 is very subjective.
  • If subjectivity < 0.5, the sentence is more objective than subjective, and vice versa.
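
To make the two ranges concrete, here is a minimal sketch that turns a pair of scores into coarse labels. The function name `describe_sentiment` and both cutoffs (the sign of the polarity, and 0.5 as the midpoint of the subjectivity range) are assumptions chosen for illustration; they are not part of TextBlob.

```python
def describe_sentiment(polarity, subjectivity):
    """Turn TextBlob-style scores into coarse labels (cutoffs are assumptions)."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be in [-1.0, 1.0]")
    if not 0.0 <= subjectivity <= 1.0:
        raise ValueError("subjectivity must be in [0.0, 1.0]")

    # Assumed cutoff: the sign of the polarity decides the tone.
    if polarity > 0:
        tone = "positive"
    elif polarity < 0:
        tone = "negative"
    else:
        tone = "neutral"

    # Assumed cutoff: 0.5 is the midpoint of the subjectivity range.
    stance = "subjective" if subjectivity >= 0.5 else "objective"
    return tone, stance


print(describe_sentiment(0.8, 0.9))   # ('positive', 'subjective')
print(describe_sentiment(-0.4, 0.2))  # ('negative', 'objective')
```

In practice you may want a wider neutral band around 0 for polarity, since short texts often score close to zero.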

TextBlob also provides labels for semantics, which allow for the detection and analysis of emojis, tonal cues such as exclamation marks, and more.

Installing From PyPI

$ pip install -U textblob

TextBlob is also available as a conda package. To install with conda, run:

$ conda install -c conda-forge textblob

TextBlob in Python

First, import TextBlob in your preferred notebook environment.

Then, all we need to do is call TextBlob(text) to access TextBlob's many methods.

from textblob import TextBlob
text = "TextBlob makes common NLP tasks simple."  # placeholder: any string you want to analyze
blob = TextBlob(text)

Once you have created a TextBlob object, you gain access to common text-processing operations. I will treat TextBlob objects as if they were Python strings to demonstrate the power of NLP in tasks like sentiment analysis.

Analyzing Sentiment Using TextBlob

To perform sentiment analysis with TextBlob, we use the sentiment property, as shown below:

blob = TextBlob('companies that specialize in sentiment analysis are least likely to look at for data.') 
blob.sentiment

Result

Sentiment(polarity=-0.15, subjectivity=0.7)
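
Because the result is a namedtuple, its fields can be read by name or unpacked like any tuple. The sketch below uses a stand-in namedtuple with the same field names and the values from the result above, so it runs without TextBlob installed. The stand-in is an assumption for illustration; with TextBlob you would use blob.sentiment in its place.

```python
from collections import namedtuple

# Stand-in with the same field names as TextBlob's result, so the sketch
# runs without TextBlob installed. With TextBlob, use blob.sentiment instead.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])
result = Sentiment(polarity=-0.15, subjectivity=0.7)

# Read the fields by name...
print(result.polarity)      # -0.15
print(result.subjectivity)  # 0.7

# ...or unpack them like any tuple.
polarity, subjectivity = result
print(f"polarity={polarity:+.2f}, subjectivity={subjectivity:.2f}")
# polarity=-0.15, subjectivity=0.70
```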

Helpful Additions

Spelling Correction

When analyzing large sets of data, misspelled words can have a detrimental effect on the results, so it is important to correct the spelling of individual words.

tutorial = TextBlob("in sentimet analyss, which of the following is an implicit opinion?")
print(tutorial.correct())

Result


in sentiment analysis, which of the following is an implicit opinion?
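
TextBlob's documentation describes its spelling correction as based on Peter Norvig's "How to Write a Spelling Corrector". The sketch below illustrates that general idea with a toy vocabulary; it is not TextBlob's actual implementation. The names `WORD_COUNTS`, `edits1`, and `correct_word`, and the word counts themselves, are made up for illustration.

```python
import string

# Toy word counts (made up); a real corrector would learn these from a large corpus.
WORD_COUNTS = {"sentiment": 120, "analysis": 200, "opinion": 80, "implicit": 15}


def edits1(word):
    """Return every string one edit (delete, swap, replace, insert) away."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    swaps = [left + right[1] + right[0] + right[2:]
             for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + swaps + replaces + inserts)


def correct_word(word):
    """Pick the most frequent known word within one edit, else keep the word."""
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    return max(candidates, key=WORD_COUNTS.get) if candidates else word


print(correct_word("sentimet"))  # sentiment
print(correct_word("analyss"))   # analysis
```

Words with no known neighbor within one edit are returned unchanged, which is one reason a frequency-based corrector can miss or mis-correct rare terms.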

Noun Phrase Extraction

A noun phrase is a group of two or more words that centers on a noun (e.g., ‘dog’, ‘girl’, ‘man’) and includes modifiers (e.g., ‘the’, ‘a’, ‘none of’). For example, ‘boy’ is not a noun phrase but ‘a helpful boy’ is a noun phrase.

Commonly in data science, a researcher may want to extract all noun phrases within a sentence rather than identifying individual nouns.

blob = TextBlob('I wanted to learn cryptocurrency sentiment analysis.')
blob.noun_phrases

Result

WordList(['cryptocurrency sentiment analysis'])

As we can see, ‘cryptocurrency sentiment analysis’ is extracted from the sentence because it is the only noun phrase in the sentence.

Conclusion

Most commonly, companies hire developers to analyze large volumes of raw data to discover things such as how a consumer or client feels about a product or experience. A company could identify phone numbers, personal names, locations, and other specific entities just through regex-based matching of text scraped from a basic webpage. Deeper analysis draws on linguistic concepts such as meronyms, mesonyms, troponyms, synonyms, stemming, lemmatization, parts of speech, word sense disambiguation, and similar areas.
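
As a small illustration of the regex-based identification mentioned above, here is a sketch that pulls phone numbers out of a block of text with Python's built-in re module. The pattern assumes North American style numbers, and both `PHONE_PATTERN` and the sample text are made up for illustration; real-world data would need a broader pattern.

```python
import re

# Assumed format: North American numbers such as 555-123-4567 or (555) 123-4567.
PHONE_PATTERN = re.compile(r"\(?\b\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b")

# Made-up sample text standing in for scraped page content.
page_text = "Call us at 555-123-4567 or (555) 987-6543 before Friday."

phone_numbers = PHONE_PATTERN.findall(page_text)
print(phone_numbers)  # ['555-123-4567', '(555) 987-6543']
```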

I hope this article helps you discover the power of natural language processing (NLP) with some examples to test in your own environment.

Thank You for Reading

Constructive criticism and feedback are welcomed. Nicholas Resendez can be reached on Instagram @nirholas, on LinkedIn, and on Twitter @Bothersome for updates on new articles.
