Mar 26, 2017 10:36:16 AM
Gleaning Insight from Content with IBM Watson

In recent years, machine learning as a service has come of age, with robust capabilities from Amazon, Google, Microsoft and others now available through REST APIs for a fraction of the cost of deploying or developing your own. One of the better known – even if it isn't always easy to separate hype from reality – is IBM Watson. While Watson gained fame as the Jeopardy!-winning supercomputer, IBM now uses the brand for a wide variety of machine learning capabilities, from speech-to-text and conversational bots to text mining algorithms for understanding the concepts, references and tone of text-based content.

In this post, we'll cover how to integrate one such IBM service – Natural Language Understanding – and rapidly prototype an application that you can try on your own content. We'll cover how to get started with IBM's hosted service, Bluemix, and walk through the Python code to connect to the REST API. I've also included a working data application that you can run with your own text.

The code and application below use an API username and password for IBM Watson's free plan, which is limited to a few hundred requests per day. You can sign up for your own username and password by visiting IBM Bluemix. You can use its services for a one-month trial, after which you will need to provide a credit card (even for the free plan). Note that IBM Bluemix is a full application development, compute, and storage environment akin to Amazon's AWS or Microsoft Azure. It also has starter components intended to make application development simpler, much like Salesforce's Heroku platform.

Once you have valid API credentials, you can connect to Watson services using any programming language capable of making simple web-based REST API calls. We use Python because of its ubiquity as a data programming language, though similar code would work fine in most other languages. IBM also offers SDKs for Python and other languages that wrap the API calls for you.

Each service has a different endpoint and slightly different parameters for passing content and configuration options. For this post, we include information on how to use the Natural Language Understanding (NLU) service. Consult IBM's developer cloud documentation for Watson services to learn about other capabilities.

The following Python function connects to the NLU service, sends the text of interest, and returns the response object as a Python dictionary.

import json
import requests

# Credentials for your Bluemix NLU service instance (username, password).
creds = ('YOUR_API_USERNAME', 'YOUR_API_PASSWORD')

# NLU analyze endpoint; the version parameter pins the API behavior.
url_base = ('https://gateway.watsonplatform.net/'
            'natural-language-understanding/api/v1/analyze'
            '?version=2017-02-27')

def call_watson_api(text):
    # Request keywords (with sentiment), concepts, and entities,
    # capped at 10 items each.
    features = {}
    features['keywords'] = {
        "sentiment": True,
        "limit": 10
    }
    features['concepts'] = {
        "limit": 10
    }
    features['entities'] = {
        "sentiment": False,
        "limit": 10
    }

    data = {
        "text": text,
        "features": features
    }

    headers = {'Content-Type': 'application/json'}
    res = requests.post(url_base, data=json.dumps(data),
                        auth=creds, headers=headers)
    return res.json()

In this instance, we configured the algorithm to pick up concepts, keywords and entities (people, places, companies, etc.). The NLU service can also detect emotion and sentiment, relationships, and other text-based features, depending on what you request in the features object. The service returns a dictionary whose keys are the requested items (concepts, etc.) and whose values are lists of discovered items, each with text, type and relevance among other attributes.
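To make that shape concrete, here's a minimal sketch of walking such a response. The sample dictionary below is illustrative only, not real API output:

```python
# Illustrative sample shaped like an NLU response (not real API output).
sample_response = {
    "concepts": [{"text": "Machine learning", "relevance": 0.95}],
    "keywords": [{"text": "REST API", "relevance": 0.81}],
    "entities": [{"text": "IBM", "type": "Company", "relevance": 0.90}],
}

# Flatten the response into (feature, text, relevance) tuples.
discovered = [
    (feature, item["text"], item["relevance"])
    for feature, items in sample_response.items()
    for item in items
]

for feature, text, relevance in discovered:
    print("{:10s} {:20s} {:.2f}".format(feature, text, relevance))
```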

The relevance score gives you a sense of both how confident the algorithm is in what it discovered and how prominent the item is in the content. The score ranges from 0 (little confidence/prominence) to 1 (very high confidence/prominence).
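In practice, you'll often want to drop low-relevance items before visualizing. A small sketch, where the 0.5 threshold is an arbitrary choice you'd tune for your content:

```python
def filter_by_relevance(items, threshold=0.5):
    """Keep only discovered items whose relevance score meets the threshold."""
    return [item for item in items if item.get("relevance", 0.0) >= threshold]

keywords = [
    {"text": "machine learning", "relevance": 0.92},
    {"text": "recent years", "relevance": 0.31},
]
print(filter_by_relevance(keywords))  # keeps only the high-relevance keyword
```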

Like other text mining algorithms, Watson's NLU service works best on modern, everyday communication. Its usefulness out of the box is lower for archaic text or specialized domains like law or medicine. In those cases, you will likely need to either build a custom model, which Watson provides facilities for, or do pre- and post-processing to remove noise and otherwise improve on the service's results.
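As one example of the kind of pre-processing that can help, here's a small sketch that strips common noise before sending text to the service. The patterns are illustrative; real cleanup is domain-specific:

```python
import re

def preprocess(text):
    """Remove common noise before sending text off for analysis."""
    text = re.sub(r"Page \d+ of \d+", " ", text)  # page-number boilerplate
    text = re.sub(r"https?://\S+", " ", text)     # bare URLs
    text = re.sub(r"\s+", " ", text)              # collapse whitespace
    return text.strip()

cleaned = preprocess("See https://example.com  Page 1 of 2  for details.")
print(cleaned)  # "See for details."
```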

So now it's time to give the API a whirl. Copy and paste some text into the application below and click the button. The word cloud will show you items that Watson discovered, coloring them by type and giving more prominence to those with higher relevance scores. Using just a handful of pre-built components, I built this application in about 15 minutes. (You can view it full screen and edit your own version by signing up for the community edition of the Exaptive Studio with this link, which adds the application and its components to your studio so you can get started quickly.)


Here's a dataflow diagram of what's going on under the hood of the xap. Dataflow programming works well for quickly communicating and digesting how an application works. Follow the flow from left to right.

[Dataflow diagram: watson-dataflow big.png]

Pretty simple. Watson did most of the work. My script above is encapsulated in the component "IBMWatson..." All I had to do was add a visualization, the word cloud, and some UI elements to make IBM Watson accessible to just about anyone. What a world?! If you decide to iterate on it yourself in the Studio, let me know what you do with it!



