Blog

Finding Abstractions that Give Data Applications 'Flight'

Continuing with our recent theme of abstraction in data applications, Dave King gave a talk last month explaining his design principles for "Making Code Sing: Finding the Right Abstractions." Nailing the best abstractions is a quintessential software challenge. We strive for generality, flexibility, and reuse, but we are often forced to compromise in order to get the details right for one particular use case. We end up with projects that we know have amazing potential for use in other applications but are too hardcoded to make repurposing easy. It’s frustrating to see the possibilities locked away, just out of reach.


Epiphanies on Abstraction, Modularity, and Being Combinatorial

Six months ago I didn't understand the concept of abstraction. Now it comes up almost daily. It’s foundational to my thinking on everything from software to entrepreneurship. I can’t believe how simple it seems. When I finally grokked abstraction, it felt like my first taste of basic economics. Given a new framework, something that had always been there, intuited but blurry, came into focus.


A Novice Coder, a Finance Data Application, and the Value of Rapid Prototyping

I like to build things. I like analysis. I like programming. Interestingly, you often need to reverse that order before you’re in a position to build an application for analyzing something. You need programming knowledge to turn the analysis into a “thing.” The problem is, while I like programming, I’m still new to it. I mean, I’m Codecademy good, but that doesn’t translate into a user-facing application leveraging Python, JavaScript, and D3. So, when I recently sat down to build a minimally viable data application for looking at airline stocks, I wondered how long it might take to get to viable and, frankly, feared how minimal it might be.


How a Data Scientist Built a Web-Based Data Application

I’m an algorithms guy. I love exploring data sets, building cool models, and finding interesting patterns hidden in the data. Once I have a model, of course I want a great interactive, visual way to communicate it to anyone who will listen. When it comes to interactive visuals, there is nothing better than JavaScript’s D3. It’s smooth and beautiful.


Topic Modeling the State of the Union: TV and Partisanship

Do you feel like partisanship is running amok? It’s not your imagination. As an example, the modern State of the Union has become hyperpartisan, and topic modeling quantifies that effect. 


How to Tell an Interesting Data Story

The Laffer Curve. Anyone know what this says? It says that at this point on the revenue curve, you will get exactly the same amount of revenue as at this point. This is very controversial. Does anyone know what Vice President Bush called this in 1980? Anyone? …Bueller?... Bueller?... Bueller?


Affecting Change Using Social Influence Mapping

If you've ever tried to get a company to adopt new software you know how challenging it can be. Despite what seem to you like obvious benefits and your relentless communication, people selectively ignore or, worse, revolt against the change. Change efforts will even stumble in the face of this wisdom of the ages:


How I Made a Neural Network Web Application in an Hour

Computer vision is an exciting and quickly growing set of data science technologies. It has a broad range of applications, from industrial quality control to disease diagnosis. I have dabbled with a few technologies that fall under this umbrella, and I decided it would be worthwhile to rapidly prototype an image recognition web application that uses a neural network.


Text Analysis with R: Does POTUS Write the State of the Union or Vice Versa?

In this post, I apply text clustering techniques – hierarchical clustering, K-Means, and Principal Components Analysis – to every presidential State of the Union address from Truman to Obama. I used R for the setup, the clustering, and the data visualization.


Einstellung Effect: What You Already Know Can Hurt You

The Einstellung effect is a psychological phenomenon that changes the way we all come to solutions and impedes innovation.


Communicating Data Science: How to Captivate a Noncaptive Audience

When communicating about your latest data science project, whether verbally or in writing, your audience often needs to know the takeaway right away, or you’ll lose their attention. This is especially the case if your audience includes colleagues, conference attendees, or readers from outside your field. In an earlier post on communicating data science, I dove into how the elements of story can hold your audience’s attention through a dense presentation. This post introduces (and applies) some tried-and-true approaches for introducing the end of your story at the beginning. You’ll capture the attention of those for whom your point is valuable and hold it for the rest of your story; the rest of the audience doesn’t matter.


Data Science Wanderlust: Analyzing Global Health with Protein Sequences

Fifteen years ago, I had the unique opportunity to go on Semester at Sea, an around-the-world trip on a converted cruise ship that combined college coursework with stops in nine countries on four continents. This once-in-a-lifetime trip instilled in me a strong sense of wanderlust and a deep desire to give back to the global community.


Communicating Data Science with 'Story'

Getting your audience’s attention, keeping it, and persuading listeners of your point are all hard to do in a world where most listeners start out thinking, and feeling, “I’ve got my own scheisse to do.” John Weathington’s recent post in TechRepublic, “Be the Hemingway of Data Science Storytelling,” makes the point that presenting data, which can be dry, is more effective if it incorporates elements of story – a protagonist, a journey with challenges, and a conclusion. Jeff Leek’s “The Elements of Data Analytic Style” has a chapter about presenting data that emphasizes story as the method for communicating results.


The True Meaning of Catalyst, Crescendo, and Adaptation

People sometimes ask me what our company’s name means and why we chose it. The explanation often leads to discussions about similar but different terms. So I thought I’d use this blog post to explain, hopefully illuminate, and, while I’m at it, correct some usage that’s bugged me for some time. Actually, let’s start right there.

I'm not sure where accuracy becomes pedantry, but there are two words - catalyst and crescendo - that instantly make my ears prick up when I hear them, only because I've heard them used incorrectly for so long. One is from science and one from music, two things I tend to obsess about.


Embracing the Hairball

One of the perennial challenges in visualizing complex networks is dealing with hairballs: how do you draw a network that is so large and densely interconnected that any full rendering of it tends to turn into an inscrutable mess? There are various approaches to addressing this problem: BioFabric, Hive Plots, and many others. Most involve very different visual abstractions for the network.


The Data Scientific Method

The Oxford English Dictionary defines the scientific method as "a method or procedure that has characterized natural science since the 17th century, consisting in systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses." With more scientists today than ever, the scientific method is alive and well, and generating more data than ever. This explosion of data has brought about the field of data science and an associated plethora of analytics tools. Controversially, some have claimed, such as in this Wired magazine article, that data science is so powerful that it has made the scientific method obsolete. Google's founding philosophy is that “we don't know why this page is better than that one. If the statistics of incoming links say it is, that's good enough.” The implication is that with enough data, people will no longer need to know why something happens; it just does, and that’s good enough. Is it, really?


We work on Technology. Then it works on us.

I think I was 10 years old when my dad brought home our first microwave oven. It was an imposing black box that weighed a ton and had scary warning labels that mentioned radiation. The only time I had ever heard mention of radiation before was in regard to the atom bomb. We felt like we were supposed to run for cover whenever we turned it on, but, like everyone else I knew who had one, we did just the opposite. We huddled around it. We brought our noses right up to the translucent window, and watched, mesmerized in wonder, as the food inside got zapped by mysterious, limitless, invisible energy. When the timer beeped, and the door opened to reveal a steaming bowl of soup that had been cold only a minute ago, it seemed like a miracle. I remember those early days with the microwave vividly – experimenting with eggs, and chocolate syrup, and the off-limits gold-rimmed fine china that would send off an awe-inspiring barrage of orange sparks after just 15 seconds. Just 15 seconds! 15! I think that was the most important thing of all about the microwave oven – not what it did to my food, but what it did to my sense of time.

