Frank D. Evans

Frank is a data scientist who works heavily with big data systems. His work ranges from financial analysis and behavioral analytics to sports analytics. His primary interest is in machine learning and feature engineering at very large scale.

Recent Posts

Text Analysis with R: Does POTUS Write the State of the Union or Vice Versa?

In this post, I apply three text analysis techniques (hierarchical clustering, K-Means, and Principal Components Analysis) to every presidential State of the Union address from Truman to Obama. I use R for the setup, the clustering, and the data visualization.

It turns out that the state of the union writes the State of the Union more than the president does. The words used in the addresses appear to be linked to the era more than to an individual president or his party affiliation. However, there is one major exception: President George W. Bush, whose style and content mark a sharp departure from both his predecessors and his contemporaries. You can see the R scripts and more technical detail on the process here. The State of the Union addresses up to 2007 are available here and the rest you can get here.
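To give a rough feel for the three techniques named above, here is a minimal base-R sketch. It is not taken from the post's actual scripts: the four-document corpus, the variable names, the whitespace tokenizer, and the choice of two clusters are all assumptions made up for illustration, and a real analysis of the speeches would need proper text cleaning and many more documents.

```r
# Toy stand-in for the speeches: invented text, NOT the real addresses.
docs <- c(
  a = "economy jobs growth economy taxes",
  b = "jobs taxes economy growth jobs",
  c = "war security terror war freedom",
  d = "security freedom war terror security"
)

# Tokenize on whitespace and build a document-term count matrix
# (rows = documents, columns = vocabulary terms).
tokens <- strsplit(tolower(docs), "\\s+")
vocab  <- sort(unique(unlist(tokens)))
counts <- vapply(
  tokens,
  function(tk) tabulate(match(tk, vocab), nbins = length(vocab)),
  integer(length(vocab))
)
dtm <- t(counts)
dimnames(dtm) <- list(names(docs), vocab)

# Convert counts to relative frequencies so document length does not dominate.
freq <- dtm / rowSums(dtm)

# 1. Hierarchical clustering on Euclidean distances between documents.
hc <- hclust(dist(freq), method = "ward.D2")
hc_groups <- cutree(hc, k = 2)   # k = 2 is an assumption for this toy data

# 2. K-Means with the same (assumed) number of clusters.
set.seed(1)
km <- kmeans(freq, centers = 2, nstart = 10)

# 3. Principal Components Analysis; keep the first two components for plotting.
pca <- prcomp(freq)
scores <- pca$x[, 1:2]

print(hc_groups)
print(km$cluster)
print(round(scores, 3))
```

Calling `plot(hc)` would draw the dendrogram, and `plot(scores, col = km$cluster)` would show the K-Means groups in the space of the first two principal components.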
