Ditch Excel and Use Julia Data Frames
Manipulating and visualizing pizza sales data using Julia DataFrames.jl and Plots.jl
20 min read · Oct 27, 2020
In this story we will look at pizza sales data found here:
https://vincentarelbundock.github.io/Rdatasets/csv/gt/pizzaplace.csv
This kind of data can be manipulated in a spreadsheet application such as Excel, or with the data frames popular in languages such as R, Python (Pandas) and Julia (DataFrames.jl).
Loading Data
First we will load the data into Julia and pick a subset of the table's columns (id, name, size and price) to work with:
using DataFrames, CSV
url = "https://vincentarelbundock.github.io/Rdatasets/csv/gt/pizzaplace.csv"
filename = download(url)
all_pizzas = CSV.read(filename, DataFrame)
# Get rid of column with row numbers
all_pizzas = all_pizzas[:, 2:end]
# Pick most interesting columns
pz = select(all_pizzas, :id, :name, :size, :price)
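If you want to try the column selection without downloading anything, here is a minimal sketch on a tiny made-up table. The names toy and toy_pz, the rownames and date columns, and every value in the table are hypothetical stand-ins, not rows from the real dataset. Only the column names id, name, size and price come from the text above.

```julia
using DataFrames

# Hypothetical stand-in for the downloaded table. The first column mimics
# the row-number column; all values are invented for illustration.
toy = DataFrame(
    rownames = 1:3,
    id       = ["order-a", "order-a", "order-b"],
    date     = ["2015-01-01", "2015-01-01", "2015-01-02"],
    name     = ["hawaiian", "classic_dlx", "hawaiian"],
    size     = ["M", "L", "S"],
    price    = [13.25, 20.5, 10.5],
)

# Same two steps as above: drop the first column by position,
# then keep the four columns of interest.
toy_pz = select(toy[:, 2:end], :id, :name, :size, :price)

# Selecting by name alone gives the same result, since select
# only keeps the columns it is asked for.
@assert toy_pz == select(toy, :id, :name, :size, :price)
```

Note that select returns a new data frame and leaves toy untouched, so you can experiment with different column subsets without reloading anything.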
We can look at the first few rows of pz in the Julia REPL (read-eval-print loop):
julia> first(pz, 4)
4×4 DataFrame
│ Row │ id │ name │ size │ price │
│ │ String │ String │ String │ Float64 │
├─────┼─────────────┼─────────────┼────────┼─────────┤
│…