Day 9: TextBlob--Finding Sentiments in Text

Today, for my 30-day challenge, I decided to take a break from JavaScript and learn about text processing in Python. This post focuses on sentiment analysis. My interest in sentiment analysis goes back a few years, to when I wanted to write an application that would process a stream of tweets about a movie and then output the overall sentiment about it. Having this information would help me decide whether to watch a particular movie.

I googled around and found that a Naive Bayes classifier can be used to solve this problem. The only programming language I knew at the time was Java, so I wrote a custom implementation and used the application for some time. I was too lazy to commit the code, so when my machine crashed, I lost both the code and the application. Now I commit all my code to GitHub, and I have close to 200 public repositories :)

In this post, I will talk about a Python package called TextBlob, which can help developers solve this problem. We will first cover some basics and then develop a simple Flask application that uses the TextBlob API.

What is TextBlob?

TextBlob is an open source text processing library written in Python. It can be used to perform various natural language processing tasks such as part-of-speech tagging, noun-phrase extraction, sentiment analysis, text translation, and many more. You can read about all the features supported by TextBlob in the official documentation.

Why should I care?

The reasons I decided to learn TextBlob are as follows:

  1. I wanted to develop applications which require text processing. When we add text processing capabilities to an application, it becomes more human in that it can better understand what its users write. Text processing is very hard to get right. TextBlob stands on the strong shoulders of NLTK, which is the leading platform for building Python programs to work with human language data.

  2. I wanted to learn how text processing can be done in Python.

Install TextBlob

Before we can get started with TextBlob, we need to install Python and virtualenv on the machine. The Python version I am using in this blog post is 2.7.

There are various ways to install TextBlob, as mentioned in the official documentation. We will use pip. For developers unfamiliar with pip, it is the Python package manager. We can install pip from the official website. Go to any convenient directory on your file system and run the following commands.

$ mkdir myapp
$ cd myapp
$ virtualenv venv --python=python2.7
$ . venv/bin/activate
$ pip install textblob
$ curl https://raw.github.com/sloria/TextBlob/master/download_corpora.py | python

The commands above will create a myapp directory on the local machine, create and activate a virtualenv with Python 2.7, install the textblob package, and finally download the necessary NLTK corpora.
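Once the install finishes, a quick way to confirm that everything works is to analyse one sentence from the Python prompt. The sketch below is only an illustration: the helper name `summarize` is my own invention, while `TextBlob`, `sentiment`, `noun_phrases`, and `tags` come from the library. The import sits inside the function so that a missing install shows up as a clear ImportError when the function is called.

```python
def summarize(text):
    """Return a few TextBlob results for `text` as a plain dict.

    `summarize` is a hypothetical helper for this post, not part of TextBlob.
    """
    # Imported here so a missing install fails at call time with a clear
    # ImportError, rather than when this file is loaded.
    from textblob import TextBlob

    blob = TextBlob(text)
    return {
        # polarity ranges from -1.0 (negative) to 1.0 (positive)
        'polarity': blob.sentiment.polarity,
        # subjectivity ranges from 0.0 (objective) to 1.0 (subjective)
        'subjectivity': blob.sentiment.subjectivity,
        # noun phrases and part-of-speech tags found in the text
        'noun_phrases': list(blob.noun_phrases),
        'tags': list(blob.tags),
    }
```

Calling `summarize("The movie was surprisingly good.")` inside the activated virtualenv should return a dict with these four keys. The exact numbers depend on the TextBlob version, so I am not listing them here.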

GitHub Repository

The code for today's demo application is available on GitHub: day9-textblob-demo-openshift.

Application

The demo application is running on OpenShift at http://showmesentiments-t20.rhcloud.com/. It is a very simple example of using the TextBlob sentiment analysis API. As the user types, the page shows whether the message is positive (green), negative (red), or neutral (orange).

Sentiment Analysis Demo app running on OpenShift

We will develop a simple Flask application which will expose a REST API. If you are not aware of Flask, you can refer to my earlier post on it.

Next we will install the Flask framework. To do so, we first activate the virtualenv and then use pip to install Flask.

$ . venv/bin/activate
$ pip install flask

As I mentioned in my earlier blog post on Flask, it is awesome for writing REST-based web services. Create a new file called app.py under the myapp folder.

$ touch app.py

Copy the following code and paste it into the app.py source file:

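from flask import Flask, jsonify, render_template
from textblob import TextBlob

app = Flask(__name__)


# Serve the single-page UI at both '/' and '/index'
@app.route('/')
@app.route('/index')
def index():
    return render_template('index.html')


# REST endpoint: <message> is the text to analyse
@app.route('/api/v1/sentiment/<message>')
def sentiment(message):
    text = TextBlob(message)
    # polarity is in [-1.0, 1.0]; subjectivity is in [0.0, 1.0]
    response = {'polarity': text.polarity, 'subjectivity': text.subjectivity}
    return jsonify(response)


if __name__ == "__main__":
    # debug=True enables the in-browser debugger and auto-reload
    app.run(debug=True)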

The code shown above does the following:

  1. It imports the Flask class, the jsonify function, and the render_template function from the flask package.
  2. It imports the TextBlob class from the textblob package.
  3. It defines routes for the '/' and '/index' URLs. So, if a user makes a GET request to either '/' or '/index', then index.html will be rendered.
  4. It defines a route for the '/api/v1/sentiment/<message>' URL. The <message> part is a placeholder and will contain the text the user wants to run sentiment analysis on. We create an instance of TextBlob, passing it the message. Next, we get the polarity and subjectivity of the message, and then create a JSON object and return it.
  5. Finally, the last block starts the development server when we run the application using the python app.py command. We also enable debugging by passing debug=True. Debug mode provides an interactive debugger in the browser when an unexpected exception occurs. Another benefit is that the server automatically reloads when the code changes. We can keep the server running in the background and work through our application. This provides a highly productive environment.
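To try the REST endpoint without the web page, we can call it from a small Python client. This is a minimal sketch, assuming the app is running locally on Flask's default development address (http://127.0.0.1:5000). The helper names `sentiment_url` and `fetch_sentiment` are hypothetical and not part of the demo repository. Because the message travels in the URL path, the client percent-encodes it first.

```python
import json

try:  # Python 3
    from urllib.parse import quote
    from urllib.request import urlopen
except ImportError:  # Python 2.7, the version used in this post
    from urllib import quote
    from urllib2 import urlopen

# Assumption: the app was started locally with `python app.py`
BASE_URL = 'http://127.0.0.1:5000'


def sentiment_url(message, base_url=BASE_URL):
    """Build the API URL, percent-encoding the message as one path segment."""
    # safe='' makes quote() encode every reserved character, including '/'
    return base_url + '/api/v1/sentiment/' + quote(message, safe='')


def fetch_sentiment(message, base_url=BASE_URL):
    """Call the running app and return the decoded JSON response as a dict."""
    response = urlopen(sentiment_url(message, base_url))
    try:
        return json.loads(response.read().decode('utf-8'))
    finally:
        response.close()
```

With the server running, `fetch_sentiment('I really love this movie')` should return a dict with the polarity and subjectivity keys. Encoding matters because characters such as spaces, '?' and '#' would otherwise change how the URL is parsed. Messages containing '/' may still fail to match the route, since Flask's default placeholder does not accept slashes.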

The index() function renders an HTML file. Create a new folder called templates in the myapp directory and then create a new file named index.html.

$ mkdir templates
$ touch templates/index.html

Copy the following content into the index.html source file. The page uses Twitter Bootstrap for styling and jQuery to make REST calls on each keyup event. We don't make REST calls if the key is backspace, tab, enter, left, right, up, or down.

<html>
<head>
    <title>Do sentiment analysis on the text</title>
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <link rel="stylesheet" type="text/css" href="static/css/bootstrap.css">
    <style type="text/css">
    body {
      padding-top:60px;
      padding-bottom: 60px;
    }
  </style>
</head>
<body>
 
<div class="navbar navbar-inverse navbar-fixed-top">
      <div class="container">
        <div class="navbar-header">
          <button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse">
            <span class="icon-bar"></span>
            <span class="icon-bar"></span>
            <span class="icon-bar"></span>
          </button>
          <a class="navbar-brand" href="#">Run Sentiment Analysis</a>
        </div>
 
    </div>
  </div>
 
<div class="container">
    <div class="row">
        <div class="col-md-6">
            <textarea class="form-control" rows="3" placeholder="Write your text. Minimum length 10 characters"></textarea>   
        </div>
        <div class="col-md-6">
            <p id="result"></p>
        </div>
    </div>
 
</div>
 
 
<script type="text/javascript" src="static/js/jquery.js"></script>
<script type="text/javascript">
    $("textarea").keyup(function(e){
        switch (e.keyCode) {
            // Navigation and editing keys: do not call the API
            case 8:  // Backspace
            case 9:  // Tab
            case 13: // Enter
            case 37: // Left
            case 38: // Up
            case 39: // Right
            case 40: // Down
                break;

            default:
                var input = $('textarea').val();
                // Clear the colour left over from the previous result
                $('#result').removeClass("alert alert-warning alert-danger alert-success");
                if (input.length > 10) {
                    // Encode the text so spaces, '?' and '#' survive in the URL path
                    $.get('/api/v1/sentiment/' + encodeURIComponent(input), function(result){
                        if (result.polarity < 0.0) {
                            $('#result').addClass("alert alert-danger").text(input);
                        } else if (result.polarity <= 0.5) {
                            $('#result').addClass("alert alert-warning").text(input);
                        } else {
                            $('#result').addClass("alert alert-success").text(input);
                        }
                    });
                }
        }
    });
 
</script>
</body>
</html>

You can copy the js and css files from my GitHub repository.
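The colour rules live in the page script, but the same thresholds could also be expressed in Python, for example if we later wanted the API to return a label along with the numbers. The function below is a hypothetical sketch, not code from the demo repository. The cut-offs (below 0.0, up to 0.5, above 0.5) are copied from the script above.

```python
def sentiment_label(polarity):
    """Map a polarity score to a label and the Bootstrap class used by the page.

    Hypothetical helper; the thresholds mirror the jQuery code in index.html.
    """
    if polarity < 0.0:
        return 'negative', 'alert-danger'    # shown in red
    if polarity <= 0.5:
        return 'neutral', 'alert-warning'    # shown in orange
    return 'positive', 'alert-success'       # shown in green
```

Note that with these cut-offs a mildly positive message (polarity between 0.0 and 0.5) is still reported as neutral. That is a choice made by the demo, not something TextBlob prescribes.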

Deploy to the cloud

Before we can deploy the application to our cloud environment, we'll have to do a few setup tasks:

  1. Sign up for an OpenShift account. It is completely free, and Red Hat gives every user three free Gears on which to run applications. At the time of this writing, the combined resources allocated to each user are 1.5 GB of memory and 3 GB of disk space.

  2. Install the rhc client tool on your machine. rhc is a Ruby gem, so you need Ruby 1.8.7 or above on your machine. To install rhc, just type:

     $ sudo gem install rhc

     If you already have rhc installed, make sure it is the latest version. To update rhc, execute the command shown below.

     $ sudo gem update rhc

     For additional assistance setting up the rhc command-line tool, see the following page: https://openshift.redhat.com/community/developers/rhc-client-tools-install

  3. Set up your OpenShift account using the rhc setup command. This command will help you create a namespace and upload your SSH keys to the OpenShift server.

To deploy the application on OpenShift, just type the command shown below.

$ rhc create-app day9demo python-2.7 --from-code https://github.com/shekhargulati/day9-textblob-demo-openshift.git --timeout 180

It will do everything, from creating the application, to setting up public DNS, to creating a private git repository, and finally deploying the application using the code from my GitHub repository. The application will be deployed at http://day9demo-{domain-name}.rhcloud.com. Please replace {domain-name} with your account domain name. The app is running here: http://showmesentiments-t20.rhcloud.com/

That's it for today. Keep giving feedback.
