Ditch Excel and Use Julia Data Frames

Manipulating and visualizing pizza sales data using Julia DataFrames.jl and Plots.jl

Erik Engheim
20 min read · Oct 27, 2020
Photo by Brett Jordan from Pexels

In this story we will look at pizza sales data found here:

https://vincentarelbundock.github.io/Rdatasets/csv/gt/pizzaplace.csv

This kind of data can be manipulated in a spreadsheet application such as Excel, or with data frames, which are popular in languages such as R, Python (Pandas) and Julia (DataFrames.jl).

Loading Data

First we will load the data in Julia and pick a subset (id, name, size and price) of columns in the table to work with:

using DataFrames, CSV

url = "https://vincentarelbundock.github.io/Rdatasets/csv/gt/pizzaplace.csv"
filename = download(url)
all_pizzas = CSV.read(filename, DataFrame)

# Get rid of column with row numbers
all_pizzas = all_pizzas[:, 2:end]

# Pick most interesting columns
pz = select(all_pizzas, :id, :name, :size, :price)
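To try the two column operations above without downloading anything, here is a minimal sketch on a tiny hand-made table. The rows and the `rownum` column name are made up for illustration and are not taken from pizzaplace.csv; only `select` and `Not` from DataFrames.jl are used.

```julia
using DataFrames

# Hypothetical rows that mimic the shape of the pizza data (values are made up)
toy = DataFrame(
    rownum = 1:3,   # stand-in for the row-number column
    id     = ["order-1", "order-1", "order-2"],
    date   = ["2015-01-01", "2015-01-01", "2015-01-01"],
    name   = ["hawaiian", "classic_dlx", "five_cheese"],
    size   = ["M", "M", "L"],
    price  = [13.25, 16.0, 18.5],
)

# Drop the first column by position, as done in the article
without_rownum = toy[:, 2:end]

# Same result by name: Not(...) keeps every column except the listed one
without_rownum2 = select(toy, Not(:rownum))

# Keep only the columns of interest; select returns a new DataFrame
small = select(without_rownum, :id, :name, :size, :price)

println(names(small))
println(size(small))
```

Dropping by name rather than by position can be a little more robust if the column order of the file ever changes.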

We can look at the first few rows to see what this looks like in the Julia REPL (Read-Eval-Print Loop):

julia> first(pz, 4)
4×4 DataFrame
│ Row │ id │ name │ size │ price │
│ │ String │ String │ String │ Float64 │
├─────┼─────────────┼─────────────┼────────┼─────────┤
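Besides `first`, a few other calls help when getting a first impression of a table. The following is a minimal sketch using made-up rows rather than the real pizza data; the variable names are chosen for illustration only.

```julia
using DataFrames

# Made-up rows for illustration; not taken from pizzaplace.csv
toy = DataFrame(
    id    = ["order-1", "order-1", "order-2", "order-3"],
    name  = ["hawaiian", "classic_dlx", "five_cheese", "hawaiian"],
    size  = ["M", "M", "L", "S"],
    price = [13.25, 16.0, 18.5, 10.5],
)

top    = first(toy, 2)           # first two rows, as a new DataFrame
bottom = last(toy, 1)            # last row
dims   = (nrow(toy), ncol(toy))  # number of rows and columns

# Element type of each column, matching the second header line in the REPL
coltypes = [eltype(col) for col in eachcol(toy)]

println(dims)
println(coltypes)
```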
