
Machine Learning Helps Humans Perform Text Analysis

The rise of Big Data created the need for data applications to consume data residing in disparate databases with wildly differing schemas. The traditional approach to performing analytics on this sort of data has been to warehouse it: to move all the data into one place, under a common schema, so it can be analyzed.

This approach is no longer feasible given the volume of data being produced, the variety of data requiring specific optimized schemas, and the velocity at which new data is created. A much more promising approach is based on semantically linked data, which models data as a graph (a network of nodes and edges) instead of as a series of relational tables.
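To make the contrast concrete, here's a minimal sketch of the graph approach (our illustration, with hypothetical sources and field names, using the networkx library): rows from two differently shaped sources become nodes and edges in a single graph, with no warehouse schema required.

```python
# A minimal sketch of the graph (linked-data) approach: records from two
# differently shaped sources become nodes and edges in one graph.
# The sources and field names here are hypothetical.
import networkx as nx

# Source A: a relational-style table of people.
people = [{"id": "p1", "name": "Ada"}, {"id": "p2", "name": "Grace"}]
# Source B: a document-style store of projects with member lists.
projects = [{"slug": "apollo", "members": ["p1", "p2"]}]

G = nx.Graph()
for row in people:
    G.add_node(row["id"], kind="person", name=row["name"])
for doc in projects:
    G.add_node(doc["slug"], kind="project")
    for member in doc["members"]:
        G.add_edge(doc["slug"], member, relation="member_of")

# Queries traverse relationships instead of joining warehouse tables.
print(list(G.neighbors("apollo")))  # ['p1', 'p2']
```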

Read More »

Moving Beyond Data Visualization to Data Applications

One thing we love doing at Exaptive – aside from creating tools that facilitate innovation – is hiring intelligent, creative, and compassionate people to fill our ranks. Frank Evans is one of our data scientists. He was invited to present at the TEDxOU event on January 26, 2018.

Frank gave a great talk about how to go beyond data visualization to data applications. His script is reproduced verbatim below the video. Learn more about how to build data applications here.

Read More »

Exploring Tech Stocks: A Data Application Versus Data Visualization

A crucial aspect that sets a data application apart from an ordinary visualization is interactivity. In an application, visualizations can interact with each other. For example, clicking on a point in a scatterplot may send corresponding data to a table. Visualizations can also be enhanced with simple filtering tools, e.g., selections in a list can update the results shown in a heat map.

You can already try some linked visualizations to find the perfect taco. Now, we'll look at how some simple filtering elements enhance visualization, using a tech-stock exploration xap I built over a couple of days. (A xap is what we call a data application built with the Exaptive Studio.) A few simple but flexible interactive elements can help transform ordinary visualizations into powerful, insightful data applications. Humble checkboxes and lists produce extra value from charts and plots.
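To show the linking pattern in miniature - this is a framework-free sketch, not how the Exaptive Studio is actually wired, and the stock figures are made up - here's the click-to-table interaction described above:

```python
# A minimal sketch of "linked" views: selecting a point in one view
# pushes the matching rows to another. Illustrative only; the tickers
# and prices are made-up stand-ins.
stocks = [
    {"ticker": "AAPL", "sector": "tech", "price": 172.5},
    {"ticker": "MSFT", "sector": "tech", "price": 331.2},
    {"ticker": "XOM",  "sector": "energy", "price": 104.9},
]

class Table:
    def show(self, rows):
        for r in rows:
            print(f"{r['ticker']:5} {r['price']:>8.2f}")

class Scatterplot:
    def __init__(self):
        self.listeners = []

    def on_click(self, listener):
        self.listeners.append(listener)

    def click(self, ticker):
        # Simulate a user clicking a point; notify every linked view.
        rows = [r for r in stocks if r["ticker"] == ticker]
        for notify in self.listeners:
            notify(rows)

table = Table()
plot = Scatterplot()
plot.on_click(table.show)   # wire the two views together
plot.click("AAPL")          # clicking a point updates the table
```

The same publish-and-subscribe idea underlies the list-to-heat-map filtering mentioned above: the filtering widget publishes a selection, and any linked visualization redraws from the filtered rows.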

Read More »

A Graph-Analytics Data Application Using a Supercomputer!

We recently had to prototype a data application on a supercomputer tuned for graph analysis. We built a proof of concept in less than a week, leveraging multiple APIs, Cray's Urika-GX and its Cray Graph Engine (CGE), and a handful of programming languages.

Read More »

A Data Application for the _______ Genome Project

The ability to reuse and repurpose - exaptation - is often a catalyst for exciting breakthroughs. The Astronomical Medicine Project (yes, astronomical medicine) was founded on the realization that space phenomena could be visualized using MRI software, as if they were highly irregular brains. The first private space plane, designed by Burt Rutan, reenters the atmosphere using wings inspired by a badminton birdie. Anecdotes like these abound in many fields, and the principle applies to working with data and creating data applications as much as it does to any innovation.

To demonstrate and give our users a running start at successfully repurposing something, we want to share an editable data application, the Taco Cuisine Genome Atlas. We held an internal hackathon in which teams had a day to design and build a xap. (A xap is what we call a data application built with our platform. Learn a bit more about our dataflow programming environment here.) One team took algorithms and visualizations created for a cancer research application and applied them to tacos. Application users can identify specific tacos by their ingredients, and where to find them.

The best part is that this wasn't entirely an act of frivolity. Repurposing healthcare and life sciences tools on different, albeit mundane, data led to a potential improvement for the cancer research application - a map visualization of clinical trials for specific cancer types. 

It can't be said enough: a new perspective is a key catalyst for innovation.

So, we've made this xap available to the public to kickstart other work. Explore it, build off it, and apply it to your own data. You can also learn the basics of how it's done.

Read More »

A Data Exploration Journey with Cars and Parallel Coordinates

Parallel coordinates is one way to visually compare many variables at once and to see the correlations between them. Each variable is given a vertical axis, and the axes are placed parallel to each other. A line representing a particular sample is drawn between the axes, indicating how the sample compares across the variables.

Previously, I wrote about how it's possible to create a basic network diagram application from just three components in the Exaptive Studio. Many users will require something more scalable from a data application, and fortunately the Studio allows for the creation of something like our Parallel Coordinates Explorer. A parallel coordinates diagram can become cluttered, but our Parallel Coordinates component lets users rearrange axes and highlight samples in the data to filter the view.

It helps to use some real data to illustrate. One dataset that many R aficionados may be familiar with is the mtcars dataset. It's a list of 32 different cars, or samples, with 11 variables for each car. The list is derived from a 1974 issue of Motor Trend magazine, which compared a number of stats across cars of the era, including the number of cylinders in the engine, displacement (the size of the engine, in cubic inches), economy (in miles per gallon of fuel), and power output.

Let's say we're interested in fuel economy and want to find out what characteristics signify a car with good fuel economy. Anecdotally, you may have heard that larger engines generate more power, but that smaller engines get better fuel economy. You may also have heard that four-cylinder engines typically have less displacement than six- or eight-cylinder engines. Does this hold true for Motor Trend's mtcars data?

To find out, we'll use a xap (what we call a data application made with Exaptive) that lets a user upload either a CSV or an Excel file and generates a parallel coordinates visualization from the data. But a data application is more than a data visualization. We're going to make a data application that selects and filters the data for rich exploration.

In our dataflow programming environment, we use a few components to ingest the data and send a duffle of data to the visualization. Then a handful of helper components come together to make an application with which an end user can explore the data.
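If you'd like to poke at the same question outside the Studio, here's a minimal standalone sketch using pandas' built-in parallel-coordinates plot (it fetches mtcars over the network via statsmodels, so it needs an internet connection on first run):

```python
# A minimal standalone sketch of the mtcars exploration, using pandas'
# built-in parallel-coordinates plot rather than the Exaptive component.
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm

# Fetch the classic mtcars dataset (network access needed on first call).
mtcars = sm.datasets.get_rdataset("mtcars", "datasets").data

# Focus on the variables discussed above: economy, cylinders, size, power.
df = mtcars[["mpg", "cyl", "disp", "hp"]].copy()

# Normalize each axis to [0, 1] so the scales are comparable.
norm = (df - df.min()) / (df.max() - df.min())
norm["cyl_group"] = mtcars["cyl"].astype(int).astype(str) + " cyl"

# One line per car, colored by cylinder count.
pd.plotting.parallel_coordinates(norm, "cyl_group", colormap="viridis")
plt.title("mtcars: fuel economy vs. cylinders, displacement, and power")
plt.show()
```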

Here's the dataflow diagram, with annotations. 

Read More »

Use a Network Diagram to Uncover Relationships in Your Data

Oftentimes, when we're looking at a mass of data, we're trying to get a sense of the relationships within that data. Who is the leader of this social group? What is a common thread between different groups of people? Such relationships can be represented graphically in hundreds of ways, but few are as powerful as the classic network diagram.
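As a toy illustration of the idea (our example, with made-up names; networkx assumed), even a simple measure like degree centrality on a small friendship graph starts to answer the "who is the leader" question:

```python
# A toy sketch: who is the "leader" of a social group?
# Build a small friendship graph and rank people by degree centrality.
# Names and edges are hypothetical.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Ana", "Ben"), ("Ana", "Cho"), ("Ana", "Dev"),
    ("Ben", "Cho"), ("Dev", "Eli"),
])

centrality = nx.degree_centrality(G)
print(sorted(centrality.items(), key=lambda kv: -kv[1]))
print("Most connected:", max(centrality, key=centrality.get))  # Ana
```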

Read More »

Finding Netflix's Hidden Trove of Original Content with a Basic Network Diagram

Netflix has collected an impressive amount of data on Hollywood entertainment, made possible by tracking the viewing habits of its more than 90 million members. In 2013, Netflix took an educated guess based on that data and began streaming its own original series, risking its reputation and finances in the process. When people were subscribing to Netflix to watch a trove of television series and movies created by well-established networks and studios, why create original content? Now, few would question the move.

Read More »

What is a Data Application?

There are data visualizations. There are web applications. If they had a baby, you'd get a data application.

Data applications are a big part of where our data-driven world is headed. They're how data science gets operationalized. They are how end-users - whether they're subject matter experts, business decision makers, or consumers - interact with data, big and small. We all use a data application when we book a flight, for instance. 

Dave King, our CEO and one of our chief software architects, spoke with Software Engineering Daily about what makes data applications important and best practices for building them. Check out the podcast or read the abridged transcript beneath it. (Learn how they're built, or try building a data application if you'd like.)

Read More »

A Data Application to Foretell the Next Silicon Valley?

Can we predict what the next hub of tech entrepreneurship will be? Could we pinpoint where the next real estate boom will be and invest there? Thanks to advances in machine learning and easier access to public data through Open Data initiatives, we can now explore these types of questions.

Read More »

Finding Abstractions that Give Data Applications 'Flight'

Continuing with our recent theme of abstraction in data applications, Dave King gave a talk last month explaining his design principles for "Making Code Sing: Finding the Right Abstractions." Nailing the best abstractions is a quintessential software challenge. We strive for generality, flexibility, and reuse, but we are often forced to compromise in order to get the details right for one particular use case. We end up with projects that we know have amazing potential for use in other applications but are too hardcoded to make repurposing easy. It’s frustrating to see the possibilities locked away, just out of reach.
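Here's a tiny before-and-after sketch of that tension (our example, not from the talk): the same logic hardcoded for one use case, and again with the use-case details lifted into parameters.

```python
# Hardcoded: useful exactly once, hard to repurpose.
def top_tech_stocks(rows):
    techs = [r for r in rows if r["sector"] == "tech"]
    return sorted(techs, key=lambda r: r["return"], reverse=True)[:5]

# Abstracted: the same logic, with the use-case details as parameters.
def top_by(rows, predicate, key, n=5):
    return sorted(filter(predicate, rows), key=key, reverse=True)[:n]

# The original behavior becomes one call among many possible ones.
rows = [{"sector": "tech", "return": 0.12}, {"sector": "energy", "return": 0.31}]
print(top_by(rows, lambda r: r["sector"] == "tech", lambda r: r["return"]))
```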

Read More »

Epiphanies on Abstraction, Modularity, and Being Combinatorial

Six months ago I didn't understand the concept of abstraction. Now it comes up almost daily. It’s foundational to my thinking on everything from software to entrepreneurship. I can’t believe how simple it seems. When I finally grokked abstraction, it felt like my first taste of basic economics. Given a new framework, something that had always been there, intuited but blurry, came into focus.

Read More »

A Novice Coder, a Finance Data Application, and the Value of Rapid Prototyping

I like to build things. I like analysis. I like programming. Interestingly, you often need to reverse that order before you're in a position to build an application for analyzing something. You need programming knowledge to turn the analysis into a "thing." The problem is, while I like programming, I'm still new to it. I mean, I'm Codecademy good, but that doesn't translate into a user-facing application leveraging Python, JavaScript, and D3. So, when I recently sat down to build a minimum viable data application for looking at airline stocks, I wondered how long it might take to get to viable and, frankly, feared how minimal it might be.

Read More »

How a Data Scientist Built a Web-Based Data Application

I'm an algorithms guy. I love exploring data sets, building cool models, and finding interesting patterns hidden in that data. Once I have a model, then of course I want a great interactive, visual way to communicate it to anyone who will listen. When it comes to interactive visuals, there is nothing better than JavaScript's D3. It's smooth and beautiful.

But like I said, I’m an algorithms guy. Those machine learning models I’ve tuned are in Python and R. And I don’t want to spend all my time trying to glue them together with web code that I don't understand very well and I’m not terribly interested in.
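One common way to bridge that gap - a generic sketch, not necessarily how this application was built - is to put the Python model behind a small HTTP endpoint that a D3 frontend can fetch JSON from (Flask assumed; the predict function here is a stand-in):

```python
# A minimal sketch: serve a Python model's output as JSON so a
# JavaScript/D3 frontend can fetch and visualize it. Flask assumed;
# predict() is a stand-in for a real tuned model in Python or R.
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict(features):
    # Stand-in for a tuned machine-learning model.
    return {"score": sum(features) / max(len(features), 1)}

@app.route("/predict")
def predict_endpoint():
    # e.g. GET /predict?f=1.0&f=2.5 -> {"score": 1.75}
    features = [float(x) for x in request.args.getlist("f")]
    return jsonify(predict(features))

if __name__ == "__main__":
    app.run(port=5000)
```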

Read More »

Topic Modeling the State of the Union: TV and Partisanship

Do you feel like partisanship is running amok? It’s not your imagination. As an example, the modern State of the Union has become hyperpartisan, and topic modeling quantifies that effect. 

Topic modeling finds broad topics that occur in a body of text. Those topics are characterized by key terms that have some relationship to each other.  Here are the four dominant topic groups found in State of the Union addresses since 1945.
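As a rough illustration of the technique (scikit-learn's LDA here; the post doesn't specify its tooling, and the stand-in texts below would be replaced by the actual addresses), topic modeling amounts to fitting a model over term counts and reading off each topic's top terms:

```python
# A rough sketch of topic modeling with scikit-learn's LDA.
# The four short "speeches" are stand-ins for real State of the Union texts.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

speeches = [
    "taxes jobs economy growth budget",
    "war troops security defense peace",
    "health care insurance families children",
    "energy oil climate technology science",
]

# Turn the documents into a matrix of term counts.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(speeches)

# Fit four topics, matching the four dominant groups mentioned above.
lda = LatentDirichletAllocation(n_components=4, random_state=0)
lda.fit(counts)

# Report the key terms that characterize each topic.
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-3:][::-1]]
    print(f"Topic {i}: {', '.join(top)}")
```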

Read More »
