Parallel coordinates is one way to visually compare many variables at once and to see the correlations between them. Each variable is given a vertical axis, and the axes are placed parallel to each other. A line representing a particular sample is drawn between the axes, indicating how the sample compares across the variables.
Previously, I wrote how it's possible to create a basic network diagram application from just three components in the Exaptive Studio. Many users will require more scalable from a data application, and fortunately the Studio allows for the creation of something like our Parallel Coordinates Explorer. Often times, a parallel coordinates diagram can also become cluttered, but fortunately, our Parallel Coordinates component lets users rearrange axes and highlight samples in the data to filter the view.
It helps to use some real data to illustrate. One dataset that many R aficionados may be familiar with is the mtcars dataset. It's a list of 32 different cars, or samples, with 11 variables for each car. The list is derived from a 1974 issue of Motor Trend magazine, which compared a number of stats across cars of the era, including the number of cylinders in the engine, displacement (the size of the engine, in cubic inches), economy (in miles per gallon of fuel), and power output.
Let's say we're interested in fuel economy, and want to find out characteristics could signify a car with good fuel economy. Anecdotally, you may have heard that larger engines generate more power, but that smaller engines generate better fuel economy. You may also have heard that four-cylinder engines are typically smaller in size than larger engines. Does this hold true for Motor Trend's mtcars data?
To find out we'll use a xap (what we call a data application made with Exaptive) that lets a user upload either a csv or Excel file and generates a parallel coordinates visualization from the data. But a data application is more than a data visualization. We're going to make a data application that selects and filters the data for rich exploration.
In our dataflow programming environment, we use a few components to ingest the data and send a duffle of data to the visualization. Then a hand-full of helper components come together make the an application with which an end-user can explore the data.
Here's the dataflow diagram, with annotations.
Read More »