Frank D. Evans
Frank is a Data Scientist who works heavily with big data systems. His work spans financial analysis, behavioral analytics, and sports analytics. His primary interest is in machine learning and feature engineering on very large scales. You can check out his work here: https://github.com/frankdevans and http://www.slideshare.net/frankdevans.
Recent Posts

If every ‘new’ idea is derivative, derive them.

Everything is derivative. Take advantage of that. “New” ideas are the next step in an extensive network of existing people and ideas. If we can get the data and reconstruct the network, we can analyze it and understand where branches of a network have the potential for innovation.

How a Data Scientist Built a Web-Based Data Application

I’m an algorithms guy. I love exploring data sets, building cool models, and finding interesting patterns hidden in the data. Once I have a model, of course I want a great interactive, visual way to communicate it to anyone who will listen. When it comes to interactive visuals, there is nothing better than JavaScript’s D3. It’s smooth and beautiful.

But like I said, I’m an algorithms guy. Those machine learning models I’ve tuned are in Python and R. And I don’t want to spend all my time trying to glue them together with web code that I don’t understand very well and am not terribly interested in.

Topic Modeling the State of the Union: TV and Partisanship

Do you feel like partisanship is running amok? It’s not your imagination. As an example, the modern State of the Union has become hyperpartisan, and topic modeling quantifies that effect. 

Topic modeling finds broad topics that occur in a body of text. Those topics are characterized by key terms that have some relationship to each other. Here are the four dominant topic groups found in State of the Union addresses since 1945.

Text Analysis with R: Does POTUS Write the State of the Union or Vice Versa?

In this post, I apply text clustering techniques – hierarchical clustering, K-Means, and Principal Components Analysis – to every presidential State of the Union address from Truman to Obama. I used R for the setup, the clustering, and the data visualization.

It turns out that the state of the union writes the State of the Union more than the president does. The words used in the addresses appear linked to the era more than to an individual president or his party affiliation. However, there is one major exception in President George W. Bush, whose style and content mark a sharp departure from both his predecessors and his contemporaries. You can see the R scripts and more technical detail on the process here. The State of the Union addresses up to 2007 are available here and the rest you can get here.
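
To give a feel for the K-Means step mentioned above, here is a minimal sketch. It is not the R code from the post: it is written in Python with only the standard library, the four "speeches" are made-up toy strings, and the helper names (`tf_vector`, `init_centroids`, `kmeans`) and the farthest-point starting centroids are my own assumptions for illustration.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """Turn a text into a unit-length term-frequency vector over vocab."""
    counts = Counter(text.lower().split())
    vec = [counts[word] for word in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def init_centroids(vectors, k):
    """Deterministic start: first vector, then repeatedly the farthest one."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(math.dist(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(vectors, k, iterations=20):
    """Plain K-Means: assign each vector to its nearest centroid, then
    move each centroid to the mean of its members, and repeat."""
    centroids = init_centroids(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy texts standing in for speeches (not real addresses).
docs = [
    "war peace soviet defense war",
    "soviet defense peace treaty",
    "jobs economy tax jobs",
    "tax economy budget jobs",
]
vocab = sorted({word for doc in docs for word in doc.split()})
vectors = [tf_vector(doc, vocab) for doc in docs]
labels = kmeans(vectors, k=2)
print(labels)
```

In this toy run the two defense-themed texts share one label and the two economy-themed texts share the other, which is the same kind of grouping by vocabulary that the post looks for across eras. A real analysis would add stop-word removal and term weighting before clustering.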
