Plotting and Graph Terminology

An explanation of basic plotting concepts such as axes, data sources and data series.

Figure: a sine curve plotted using Gadfly.

Data Source

The source data we want to visualize is referred to as the data source. Typically it is organized as a table. In scientific programming environments such as R, Python (with pandas) and Julia (with DataFrames.jl), such a table is usually called a data frame, while Matlab simply calls it a table. A data frame is a table where each column has a label we can use to refer to it.
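As a minimal sketch of the idea, the snippet below builds a small data frame with DataFrames.jl and fetches a column by its label. The column names `angle` and `sine` are made up for illustration.

```julia
using DataFrames

# Hypothetical data source: angles in radians and their sine values.
angles = 0:0.5:2
df = DataFrame(angle = angles, sine = sin.(angles))

# Each column is addressed by its label.
sines = df.sine        # property syntax
same  = df[!, :sine]   # indexing syntax, refers to the same column

println(names(df))     # the column labels
println(nrow(df))      # the number of rows
```

Both access forms return the stored column itself rather than a copy, so either can be handed to a plotting call.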

Data Series

The data in our data source needs to be mapped into one or more data series before we can plot anything.
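One way to picture this, assuming a data series is read as a sequence of (x, y) pairs drawn from two columns, is sketched below. The column names and the sine and cosine data are chosen only for illustration.

```julia
using DataFrames

# Data source: one table holding x values plus two computed columns.
x = 0:0.25:1
df = DataFrame(x = x, sine = sinpi.(x), cosine = cospi.(x))

# Assumed reading of "data series": (x, y) pairs taken from two columns.
sine_series   = collect(zip(df.x, df.sine))
cosine_series = collect(zip(df.x, df.cosine))

println(first(sine_series))     # first (x, y) pair of the sine series
println(length(cosine_series))  # one pair per row of the table
```

Here a single data source yields two series that share the same x values.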

Example From Geology

Let me take an example from the domain I work in, which is oil and gas. It is common to send measurement instruments down a well bore and measure different properties of the rock at different depths:

  • Velocity of sound waves through the rock. A sound wave sent between two points in a rock formation takes a certain time to arrive, and this travel time correlates closely with the density of the rock.

Plot Types and Geometries

The plot type defines how the data series are visualized. As you can see below, there are many different plot types, or charts, you can use.
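To make the role of the plot type concrete, here is a small sketch that draws the same data with three different Gadfly geometries. The data frame and its column names are assumptions made for the example. Only the geometry argument differs between the calls.

```julia
using DataFrames, Gadfly

# Illustrative data: a coarse sampling of a sine curve.
xs = 0:0.5:6
df = DataFrame(x = xs, y = sin.(xs))

# Same data and same column mapping; only the geometry changes.
scatter = plot(df, x = :x, y = :y, Geom.point)  # one marker per row
curve   = plot(df, x = :x, y = :y, Geom.line)   # rows joined by a line
bars    = plot(df, x = :x, y = :y, Geom.bar)    # one bar per row
```

Each call returns a plot object that is rendered when it is displayed, for instance at the REPL or in a notebook.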

Mapping Data to Plot

When you have picked data, defined series and decided what sort of plot to use (the geometry), you still need to specify how the data should be mapped to the geometry you see in the plot. In Gadfly, each aspect of the geometry that can be mapped to some data is referred to as an aesthetic.

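using DataFrames, Gadfly

xs = 1:2:12
ys = xs.^2
# Column names may match the aesthetic names...
df = DataFrame(x = xs, y = ys)
# ...but they do not have to. Here the columns are called foo and bar.
df = DataFrame(foo = xs, bar = ys)
# Geom.line is Gadfly's usual shorthand for the line geometry.
plot(df, x = :foo, y = :bar, Geom.line)

In the plot call:

  1. The x aesthetic connects to the foo column of the data source.
  2. The y aesthetic connects to the bar column of the data source.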
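Aesthetics are not limited to x and y. As a hedged sketch, the example below binds Gadfly's color aesthetic to a third column so that each value in that column is drawn as its own line. The `curve` column and its labels are hypothetical names introduced for this example.

```julia
using DataFrames, Gadfly

xs = 0:0.25:6

# Long-format table: the hypothetical `curve` column says which
# series each row belongs to.
df = vcat(
    DataFrame(x = xs, y = sin.(xs), curve = fill("sine", length(xs))),
    DataFrame(x = xs, y = cos.(xs), curve = fill("cosine", length(xs))),
)

# Three aesthetics bound to three columns: x, y and color.
p = plot(df, x = :x, y = :y, color = :curve, Geom.line)
```

Keeping the series label in its own column means another curve can be added by appending rows, without changing the plot call.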

