Introduction
This project aims to replicate the exercises from Cole Nussbaumer Knaflic's book, "Storytelling with Data - Let's Practice!", using Python's Vega-Altair. Our primary objective is to document the reasoning behind the modifications proposed by the author, while also highlighting the challenges that arise when transitioning from the book's Excel-based approach to programming in a different software environment.
Vega-Altair was selected for this project because of its declarative syntax, interactivity, grammar-of-graphics foundation, and compatibility with web formatting tools, all within the user-friendly Python environment. Anticipated challenges include Vega-Altair's smaller documentation base and development community compared with more established libraries like Matplotlib, Seaborn, or Plotly, and tasks that are straightforward in Excel but may require multiple iterations to translate effectively into code.
In addition to the broader objective, this notebook also serves as a personal journey of learning Altair, a syntax that was previously unfamiliar to me. By delving into it, I aim to widen my repertoire in the data visualization field, discovering new ways to create compelling visual representations.
The data for all exercises can be found on the book's official website: https://www.storytellingwithdata.com/letspractice/downloads
Imports
These are the libraries necessary to run the code for this project.
# For data manipulation and visualization
import pandas as pd
import numpy as np
import altair as alt
# For animation in Chapter 6 - Exercise 6
import ipywidgets as widgets
from ipywidgets import interact
from IPython.display import clear_output
# For converting .ipynb into .html
import nbconvert
import nbformat
# For printing current date
from datetime import date
And these are the versions used.
# Python version
! python --version
Python 3.11.6
# Library version
print("Pandas version: " + pd.__version__)
print("Numpy version: " + np.__version__)
print("Altair version: " + alt.__version__)
print("Ipywidgets version: " + widgets.__version__)
print("Nbconvert version: " + nbconvert.__version__)
print("Nbformat version: " + nbformat.__version__)
Pandas version: 2.1.2
Numpy version: 1.26.0
Altair version: 5.1.2
Ipywidgets version: 8.1.1
Nbconvert version: 7.12.0
Nbformat version: 5.9.2
# Last update of notebook
print("Code last updated at:", date.today())
Code last updated at: 2023-12-10
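As a possible alternative to printing each version by hand, the installed versions could be collected in a loop. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical and not part of the original notebook, and packages that are not installed are reported as `None`.

```python
from importlib import metadata

def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

report = installed_versions(["pandas", "numpy", "altair", "ipywidgets", "nbconvert", "nbformat"])
for name, version in report.items():
    print(f"{name} version: {version}")
```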
Chapter 2 - Choose an effective visual
"When I have some data I need to show, how do I do that in an effective way?" - Cole Nussbaumer Knaflic
Loading the data
The first problem with the Excel-to-Altair translation arises from the data itself: the spreadsheet includes titles and explanatory text that aid readability in Excel but pollute the table when it is loaded into Python, so we must be careful when loading it. Similar adjustments will be needed in all subsequent exercises.
# Example of polluted loading
table = pd.read_excel(r"Data\2.1 EXERCISE.xlsx")
table
| | EXERCISE 2.1 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 |
|---|---|---|---|---|---|---|
| 0 | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | NaN | FIG 2.1a | NaN | NaN | NaN | NaN |
| 2 | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | NaN | New client tier share | NaN | NaN | NaN | NaN |
| 4 | NaN | NaN | NaN | NaN | NaN | NaN |
| 5 | NaN | Tier | # of Accounts | % Accounts | Revenue ($M) | % Revenue |
| 6 | NaN | A | 77 | 0.070772 | 4.675 | 0.25 |
| 7 | NaN | A+ | 19 | 0.017463 | 3.927 | 0.21 |
| 8 | NaN | B | 338 | 0.310662 | 5.984 | 0.32 |
| 9 | NaN | C | 425 | 0.390625 | 2.805 | 0.15 |
| 10 | NaN | D | 24 | 0.022059 | 0.374 | 0.02 |
del table
# Correctly configured loading: keep sheet columns 1 to 5 and use sheet row 6 (zero-indexed) as the header
table = pd.read_excel(r"Data\2.1 EXERCISE.xlsx", usecols = [1, 2, 3, 4, 5], header = 6)
table
| | Tier | # of Accounts | % Accounts | Revenue ($M) | % Revenue |
|---|---|---|---|---|---|
| 0 | A | 77 | 0.070772 | 4.675 | 0.25 |
| 1 | A+ | 19 | 0.017463 | 3.927 | 0.21 |
| 2 | B | 338 | 0.310662 | 5.984 | 0.32 |
| 3 | C | 425 | 0.390625 | 2.805 | 0.15 |
| 4 | D | 24 | 0.022059 | 0.374 | 0.02 |
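The `header` and `usecols` values above were found by inspecting the polluted table. When the offset is not known in advance, the header row could instead be located by searching for a known column name. The following is only a sketch under assumptions: `raw` is a hypothetical, abbreviated stand-in for the sheet as it would look when read with `header = None`, `promote_header` is a made-up helper, and the marker is assumed to be present in the sheet.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a sheet read with header=None (values abbreviated)
raw = pd.DataFrame([
    ["EXERCISE 2.1", np.nan, np.nan, np.nan],
    [np.nan, np.nan, np.nan, np.nan],
    [np.nan, "New client tier share", np.nan, np.nan],
    [np.nan, "Tier", "# of Accounts", "% Revenue"],
    [np.nan, "A", 77, 0.25],
    [np.nan, "A+", 19, 0.21],
])

def promote_header(raw, marker):
    """Use the first row containing `marker` as the header and drop everything above it.

    Assumes `marker` appears somewhere in `raw`.
    """
    # Index label of the first row in which any cell equals the marker
    header_idx = raw.eq(marker).any(axis=1).idxmax()
    # Keep the rows below the header and drop columns that are entirely empty
    body = raw.loc[header_idx + 1:].dropna(axis=1, how="all")
    body.columns = raw.loc[header_idx, body.columns].tolist()
    # Let pandas re-infer numeric dtypes for the remaining object columns
    return body.reset_index(drop=True).infer_objects()

clean = promote_header(raw, "Tier")
print(clean)
```

This trades the two hard-coded numbers for one known column name, which may be easier to maintain if the layout of the other exercise files differs.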
Table
The initial changes recommended in the book focus on improving the table's readability itself. These changes include reordering the tiers, adding a row to show the total value, incorporating a category called "All others" to account for unmentioned values when the total percentage doesn't add up to 100%, and rounding the numbers while adjusting the percentage format as required.
The following code implements these modifications.
# Ordering the tiers
table = table.loc[[1, 0, 2, 3, 4]]
# Fixing the percentages
table['% Accounts'] = table['% Accounts'].apply(lambda x: x*100)
table['% Revenue'] = table['% Revenue'].apply(lambda x: x*100)
# Calculating and adding "All other" values
other_account_per = 100 - table['% Accounts'].sum()
other_revenue_per = 100 - table['% Revenue'].sum()
other_account_num = (other_account_per*table['# of Accounts'][0])/table['% Accounts'][0]
other_revenue_num = (other_revenue_per*table['Revenue ($M)'][0])/table['% Revenue'][0]
table.loc[len(table)] = ["All other", other_account_num, other_account_per, other_revenue_num, other_revenue_per]
# Since we will not use rounded values or the total row for the graphs,
# we should create a new variable before making the following alterations
table_charts = table.copy()
# Adding total values row
table.loc[len(table)] = ["Total", table['# of Accounts'].sum(), table['% Accounts'].sum(),
table['Revenue ($M)'].sum(), table['% Revenue'].sum()]
# Rounding the numbers
table['% Accounts'] = table['% Accounts'].apply(lambda x: round(x))
table['Revenue ($M)'] = table['Revenue ($M)'].apply(lambda x: round(x, 1))
The new table is as follows:
table
| | Tier | # of Accounts | % Accounts | Revenue ($M) | % Revenue |
|---|---|---|---|---|---|
| 1 | A+ | 19.0 | 2 | 3.9 | 21.0 |
| 0 | A | 77.0 | 7 | 4.7 | 25.0 |
| 2 | B | 338.0 | 31 | 6.0 | 32.0 |
| 3 | C | 425.0 | 39 | 2.8 | 15.0 |
| 4 | D | 24.0 | 2 | 0.4 | 2.0 |
| 5 | All other | 205.0 | 19 | 0.9 | 5.0 |
| 6 | Total | 1088.0 | 100 | 18.7 | 100.0 |
or, for even better readability in Python:
table.set_index("Tier")
| Tier | # of Accounts | % Accounts | Revenue ($M) | % Revenue |
|---|---|---|---|---|
| A+ | 19.0 | 2 | 3.9 | 21.0 |
| A | 77.0 | 7 | 4.7 | 25.0 |
| B | 338.0 | 31 | 6.0 | 32.0 |
| C | 425.0 | 39 | 2.8 | 15.0 |
| D | 24.0 | 2 | 0.4 | 2.0 |
| All other | 205.0 | 19 | 0.9 | 5.0 |
| Total | 1088.0 | 100 | 18.7 | 100.0 |
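The "All other" row is not in the source data; it is back-calculated from the share that the listed tiers leave unexplained. The arithmetic can be sketched as follows, using the account figures from the table and assuming the listed percentages are shares of a single common total. The variable names are illustrative only.

```python
import pandas as pd

tiers = pd.DataFrame({
    "Tier": ["A+", "A", "B", "C", "D"],
    "# of Accounts": [19, 77, 338, 425, 24],
    "% Accounts": [1.7463, 7.0772, 31.0662, 39.0625, 2.2059],
})

known_count = tiers["# of Accounts"].sum()   # accounts in the listed tiers
known_pct = tiers["% Accounts"].sum()        # share covered by the listed tiers

other_pct = 100 - known_pct                  # share left for "All other"
implied_total = known_count / (known_pct / 100)
other_count = implied_total - known_count

print(round(other_pct, 1), round(implied_total), round(other_count))
```

The notebook's own code scales a single row instead of the sum, which gives the same result as long as every row is consistent with the same total.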
Some changes were not implemented, such as row colors, text alignment, and graphs embedded in the table, because the Pandas DataFrame format does not support them. The percentage symbol (%) was not added next to the numbers in the percentage columns, since doing this in Python would convert the data from numeric to string, and is therefore not a recommended approach.
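If the symbol is wanted for display only, one option is to format the values when rendering while leaving the stored data numeric. A minimal sketch, assuming a hypothetical two-row excerpt named `shares` and using the `formatters` argument of `DataFrame.to_string`:

```python
import pandas as pd

# Hypothetical two-row excerpt of the table above
shares = pd.DataFrame({"Tier": ["A+", "A"], "% Accounts": [2, 7], "% Revenue": [21.0, 25.0]})

def pct(value):
    """Render a number as a whole percentage, for display only."""
    return f"{float(value):.0f}%"

rendered = shares.to_string(index=False, formatters={"% Accounts": pct, "% Revenue": pct})
print(rendered)

# The underlying columns are still numeric and can be used for charts
print(shares.dtypes)
```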
Pie chart
Considering that percentages depict a fraction of a whole, the next proposal is to employ a pie chart.
Here is the default Altair graph version:
# Default pie chart
alt.Chart(table_charts).mark_arc().encode(
theta = "% Accounts",
color = alt.Color('Tier')
)
Some of the adjustments needed to bring it closer to the original include reordering the tiers, changing the label positions, altering the color palette, and adding a title.
## % of Accounts Pie Chart
# Creating a base chart with a title, aligned to the left and with normal font weight
base = alt.Chart(
table_charts,
title = alt.Title(r"% of Total Accounts", anchor = 'start', fontWeight = 'normal')
).encode(
theta = alt.Theta("% Accounts:Q", stack = True), # Encoding the angle (theta) for the pie chart
color = alt.Color('Tier', legend = None), # Encoding color based on the 'Tier' field
order = alt.Order(field = 'Tier') # Ordering the sectors of the pie chart based on the 'Tier' field
)
# Creating the pie chart with an outer radius of 115
pie = base.mark_arc(outerRadius = 115)
# Creating text labels for each sector of the pie chart
text = base.mark_text(radius = 140, size = 15).encode(text = alt.Text("Tier"))
# Combining the pie chart and text labels
acc_pie = pie + text
acc_pie
Without a data type on the field passed to order, Altair arranges the tiers alphabetically instead of using the order provided by the DataFrame. We can fix this by identifying Tier as ordinal (O).
## % of Accounts Pie Chart
base = alt.Chart(
table_charts,
title = alt.Title(r"% of Total Accounts", anchor = 'start', fontWeight = 'normal')
).encode(
theta = alt.Theta("% Accounts:Q", stack = True),
color = alt.Color('Tier',
scale = alt.Scale(
range = ['#4d71bc', '#5d9bd4', '#6fae45', '#febf0f', '#e77e2d', '#a6a6a6']
), # Setting custom colors for each sector of the pie chart
sort = None, # So that the colors don't follow the alphabetic order
legend = None
),
order = alt.Order(field = 'Tier:O'))
pie = base.mark_arc(outerRadius = 115)
text = base.mark_text(radius = 140, size = 15).encode(text = alt.Text("Tier"))
acc_pie = pie + text
acc_pie
Initially, offset was used instead of anchor, manually specifying the title's location along the x-axis in pixels. This produces a closer replica, since the text can be placed in exactly the same position as in the example, but anchoring is a faster and cleaner solution. We adopt anchoring for the remainder of this project, prioritizing efficiency and consistency across all graphs, even if it means sacrificing pinpoint accuracy in text placement.
The HEX color code values of the palette from the book were acquired through the use of the online tool "Color Picker Online", which is freely accessible at https://imagecolorpicker.com/.
The pie chart above can now be easily modified to represent the percentage of total revenue.
# % of Revenue Pie Chart
base = alt.Chart(
table_charts,
title = alt.Title(r"% of Total Revenue", anchor = 'start', fontWeight = 'normal')
).encode(
theta = alt.Theta("% Revenue:Q", stack = True),
color = alt.Color('Tier',
scale = alt.Scale(
range = ['#4d71bc', '#5d9bd4', '#6fae45', '#febf0f', '#e77e2d', '#a6a6a6']
),
sort = None,
legend = None
),
order = alt.Order(field = 'Tier:O'))
pie = base.mark_arc(outerRadius = 115)
text = base.mark_text(radius = 140, size = 15).encode(text = alt.Text("Tier"))
rev_pie = pie + text
rev_pie
With both graphs available, we can add them next to each other and include a main title.
# Combining two pie charts using the horizontal concatenation operator '|'
pies = acc_pie | rev_pie
# Setting properties for the combined pie charts
pies.properties(
title = alt.Title('New Client Tier Share', offset = 10, fontSize = 20) # Adding a title with specific offset and font size
)
Visualization as depicted in the book:

Pie charts can present readability challenges, as the human eye struggles to differentiate the relative volumes of slices effectively. While adding data percentages next to the slices can enhance comprehension, it may also introduce unnecessary clutter to the visualization.
Bar chart
The next graph to tackle is a horizontal bar chart. Since the comparison no longer involves angles and the bars are aligned at a common starting point, discerning each segment's scale is easier.
This is the default representation in Altair:
# Default altair bar chart
alt.Chart(table_charts).mark_bar().encode(
y = alt.Y('Tier'),
x = alt.X('% Accounts')
)
The necessary adjustments involve placing the Tier label in the upper left corner, displaying values next to the bars instead of using an x-axis, and adding a title while rearranging the tiers.
# Creating a base chart with a title, aligned to the left and with normal font weight
base = alt.Chart(
table_charts,
title = alt.Title('TIER | % OF TOTAL ACCOUNTS', anchor = 'start', fontWeight = 'normal')
).mark_bar().encode(
y = alt.Y('Tier', title = None), # Encoding the 'Tier' field on the y-axis, without a specific title
x = alt.X('% Accounts', axis = None), # Encoding the '% Accounts' field on the x-axis, without axis labels
order = alt.Order(field = 'Tier:O'), # Ordering the bars based on the 'Tier' field
text = alt.Text("% Accounts", format = ".0f") # Displaying the '% Accounts' values as text, formatted to have no decimal places
)
# Creating the final bar chart by combining the bars and text labels
final_acc = base.mark_bar() + base.mark_text(align = 'left', dx = 2)
# Displaying the final bar chart
final_acc
Adding the order by Tier:O did not have the same effect as it did on the pie chart. The compatible method in this case is to add a sort keyword to the axis being sorted.
base = alt.Chart(
table_charts,
title = alt.Title('TIER | % OF TOTAL ACCOUNTS |', anchor = 'start', fontWeight = 'normal')
).encode(
y = alt.Y('Tier', sort = ["A+"], title = None), # Encoding the 'Tier' field on the y-axis, with a specific sorting order
x = alt.X('% Accounts', axis = None),
text = alt.Text("% Accounts", format = ".0f")
)
final_acc = (base.mark_bar() + base.mark_text(align = 'left', dx = 2)).properties(width = 150) # Setting the width
final_acc
Now we do the same for the revenue column. In addition, the y-axis is removed so it isn't repeated when uniting the charts.
base = alt.Chart(
table_charts,
title = alt.Title('% OF TOTAL REVENUE', anchor = 'start', fontWeight = 'normal')
).encode(
y = alt.Y('Tier', sort = ["A+"]).axis(None),
x = alt.X('% Revenue').axis(None),
text = alt.Text("% Revenue", format = ".0f")
)
final_rev = (base.mark_bar() + base.mark_text(align = 'left', dx = 2)).properties(width = 150)
final_rev
Similar to the pie chart, we can arrange these graphs side by side and include a main title.
# Combining two charts horizontally using the concatenation operator '|'
hor_bar = final_acc | final_rev
# Configuring the view of the combined chart, removing strokes
hor_bar.configure_view(stroke = None).properties(
title = alt.Title('New Client Tier Share', anchor = 'start', fontSize = 20) # Adding title
)
Visualization as depicted in the book:

In both the pie and bar charts, the labels beside the values are not in the same position as in the book's examples. Matching them, with varying positions (some inside and some outside the pie), different colors, and some numbers omitted, would be a labor-intensive manual task in Altair. These adjustments are primarily aesthetic and do not significantly improve readability; in some cases they even obscure the information being presented.
Examples of how to manually define labels will be presented in future exercises.
Horizontal dual series bar chart
The two graphs in the last visualization can be merged into a single grouped bar chart.
# Altair with default settings
alt.Chart(table_charts).mark_bar().encode(
x = alt.X('value:Q'), # Encoding the quantitative variable 'value' on the x-axis
y = alt.Y('variable:N'), # Encoding the nominal variable 'variable' on the y-axis
color = alt.Color(
'variable:N',
legend = alt.Legend(title = 'Metric')
), # Encoding color based on 'variable' with legend title 'Metric'
row = alt.Row('Tier:O') # Faceting by rows based on the ordinal variable 'Tier'
).transform_fold(
fold = ['% Accounts', '% Revenue'], # Transforming the data by folding the specified columns
as_ = ['variable', 'value'] # Renaming the folded columns to 'variable' and 'value'
)
The necessary alterations involve removing the grid, adjusting label positions and reducing redundancy, adding a title and subtitle, and changing the color palette.
# Custom settings
merged_hor_bar = alt.Chart(
table_charts,
title = alt.Title('New client tier share', fontSize = 20) # Adding a title with specific font size
).mark_bar().encode(
x = alt.X(
'value:Q',
axis = alt.Axis(
title = "TIER | % OF TOTAL ACCOUNTS vs REVENUE", # Setting a custom title for the x-axis
grid = False, # Remove grid
orient = 'top', # Put axis on top
labelColor = "#888888", # Setting the label color as gray
titleColor = '#888888' # Setting the title color as gray
)
),
y = alt.Y(
'variable:N',
axis = alt.Axis(title = None, labels = False, ticks = False) # Removing y-axis title, labels, and ticks
),
color = alt.Color(
'variable:N',
legend = alt.Legend(title = 'Metric'), # Adding a legend with a custom title
scale = alt.Scale(range = ['#b4c6e4', '#4871b7']) # Setting a custom color range
),
row = alt.Row(
'Tier:O',
header = alt.Header(labelAngle = 0, labelAlign = "left"), # Rotating row labels and aligning to the left
title = None,
sort = ['A+'], # Sorting rows based on 'Tier'
spacing = 10 # Adding spacing between rows
)
).transform_fold(
fold = ['% Accounts', '% Revenue'], # Transforming the data by folding the specified columns
as_ = ['variable', 'value'] # Renaming the folded columns to 'variable' and 'value'
).properties(
width = 200 # Setting the width of the chart
).configure_view(stroke = None) # Removing the stroke from the view
merged_hor_bar
Visualization as depicted in the book:

Vertical bar chart
We can modify the bar chart to have a vertical orientation. This can be done by swapping the x and y axes, replacing the Row class with the Column class, and reorienting the labels.
# Creating a vertical bar chart
vert_bar = alt.Chart(
table_charts,
title = alt.Title('New client tier share', fontSize = 20) # Adding a title with specific font size
).mark_bar().encode(
y = alt.Y(
'value:Q',
axis = alt.Axis(
title = "% OF TOTAL ACCOUNTS vs REVENUE", # Setting a custom title for the y-axis
titleAlign = 'left', # Aligning the title to the left
titleAngle = 0, # Setting the title angle to 0 degrees
titleAnchor = 'end', # Anchoring the title to the end
titleY = -10, # Adjusting the title position
grid = False, # Turning off grid lines
labelColor = "#888888", # Setting the label color to gray
titleColor = '#888888' # Setting the title color to gray
)
),
x = alt.X(
'variable:N',
axis = alt.Axis(title = None, labels = False, ticks = False) # Removing x-axis title, labels, and ticks
),
color = alt.Color(
'variable:N',
legend = alt.Legend(title = 'Metric'), # Adding a legend with a custom title
scale = alt.Scale(range = ['#b4c6e4', '#4871b7']) # Setting a custom color range
),
column = alt.Column(
'Tier:O',
header = alt.Header(labelOrient = 'bottom', titleOrient = "bottom", titleAnchor = "start"), # Adjusting column header settings
sort = ['A+'], # Sorting columns based on 'Tier'
title = 'TIER' # Adding a title for the column
)
).transform_fold(
fold = ['% Accounts', '% Revenue'], # Transforming the data by folding the specified columns
as_ = ['variable', 'value'] # Renaming the folded columns to 'variable' and 'value'
).properties(
width = 50 # Setting the width of the chart
).configure_view(stroke = None) # Removing the stroke from the view
vert_bar
Visualization as depicted in the book:

It is worth noting that titles in Altair do not readily support the option of changing the colors of individual words within them. As a simple solution for the time being, we will retain the legend that effectively indicates which column corresponds to each word. Future exercises will delve into a more complicated way to tackle this challenge.
In the code above, we've utilized the transform_fold method to generate the grouped bar chart because our data is structured in the 'wide form', which is the standard Excel format. However, Altair (as well as other visualization languages) is inherently designed to work with 'long form' data. The transform_fold function automates this conversion within the chart, enabling us to create the graph. This approach can obscure the process, making it preferable to perform the data transformation before creating the visualizations.
# Transforms the data to the long-form format
melted_table = pd.melt(table_charts, id_vars = ['Tier'], var_name = 'Metric', value_name = 'Value')
melted_table
| | Tier | Metric | Value |
|---|---|---|---|
| 0 | A+ | # of Accounts | 19.000000 |
| 1 | A | # of Accounts | 77.000000 |
| 2 | B | # of Accounts | 338.000000 |
| 3 | C | # of Accounts | 425.000000 |
| 4 | D | # of Accounts | 24.000000 |
| 5 | All other | # of Accounts | 205.000000 |
| 6 | A+ | % Accounts | 1.746324 |
| 7 | A | % Accounts | 7.077206 |
| 8 | B | % Accounts | 31.066176 |
| 9 | C | % Accounts | 39.062500 |
| 10 | D | % Accounts | 2.205882 |
| 11 | All other | % Accounts | 18.841912 |
| 12 | A+ | Revenue ($M) | 3.927000 |
| 13 | A | Revenue ($M) | 4.675000 |
| 14 | B | Revenue ($M) | 5.984000 |
| 15 | C | Revenue ($M) | 2.805000 |
| 16 | D | Revenue ($M) | 0.374000 |
| 17 | All other | Revenue ($M) | 0.935000 |
| 18 | A+ | % Revenue | 21.000000 |
| 19 | A | % Revenue | 25.000000 |
| 20 | B | % Revenue | 32.000000 |
| 21 | C | % Revenue | 15.000000 |
| 22 | D | % Revenue | 2.000000 |
| 23 | All other | % Revenue | 5.000000 |
We can now use this table to remake the bar chart without the transform_fold method.
# Selecting specific rows from the melted table based on the 'Metric' column
selected_rows = melted_table[melted_table['Metric'].isin(['% Accounts', '% Revenue'])]
vert_bar2 = alt.Chart(
selected_rows,
title = alt.Title('New client tier share', fontSize = 20) # Adding a title with specific font size
).mark_bar().encode(
y = alt.Y(
'Value',
axis = alt.Axis(
title = "% OF TOTAL ACCOUNTS vs REVENUE", # Setting a custom title for the y-axis
titleAlign = 'left', # Aligning the title to the left
titleAngle = 0, # Setting the title angle to 0 degrees
titleAnchor = 'end', # Anchoring the title to the end
titleY = -10, # Adjusting the title position
grid = False, # Turning off grid lines
labelColor = "#888888", # Setting the label color to gray
titleColor = '#888888' # Setting the title color to gray
)
),
x = alt.X(
'Metric',
axis = alt.Axis(title = None, labels = False, ticks = False) # Removing x-axis labels and ticks
),
color = alt.Color(
'Metric',
scale = alt.Scale(range = ['#b4c6e4', '#4871b7']) # Setting a custom color range
),
column = alt.Column(
'Tier',
header = alt.Header(labelOrient='bottom', titleOrient="bottom", titleAnchor="start"), # Adjusting column header settings
sort = ['A+'], # Sorting columns based on 'Tier'
title = 'TIER' # Adding a title for the column
)
).properties(
height = 200, width = 50 # Setting the height and width of the chart
).configure_view(stroke = None) # Removing the stroke from the view
vert_bar2
Bar chart with lines
The next proposed graph is an extension of the previous bar chart, featuring the addition of lines to accentuate the endpoints of the columns within the same tier.
However, due to the nature of faceted charts, we encounter an error (ValueError: Faceted charts cannot be layered. Instead, layer the charts before faceting) when attempting to layer it. This issue arises because, in faceted charts, the x-axis structure is altered.
Now that we've transformed our data into long format, we can work around this problem by creating our graph without the column encoding, thereby avoiding faceting. Instead of specifying x as Metric, y as Value, color as Metric, and column as Tier, we can define x as Tier, y as Value, color as Metric, and introduce xOffset as Metric. In essence, column creates a separate panel for each category, while xOffset shifts the marks horizontally within each x-axis group.
The following chart incorporates the alterations we discussed and yields a graph that closely resembles the previous one.
bar = alt.Chart(
selected_rows,
title = alt.Title('New client tier share', fontSize = 20, anchor = 'start')
).mark_bar().encode(
x = alt.X(
'Tier',
axis = alt.Axis(
title = 'TIER', # Setting a custom title for the x-axis
labelAngle = 0, # Setting the label angle to 0 degrees
titleAnchor = "start", # Anchoring the title to the start
domain = False, # Hiding the x-axis domain line
ticks = False # Hiding the x-axis ticks
),
sort = ['A+'] # Sorting x-axis based on 'Tier'
),
y = alt.Y(
'Value',
axis = alt.Axis(
title = "% OF TOTAL ACCOUNTS vs REVENUE", # Setting a custom title for the y-axis
titleAlign = 'left', # Aligning the title to the left
titleAngle = 0, # Setting the title angle to 0 degrees
titleAnchor = 'end', # Anchoring the title to the end
titleY = -10, # Adjusting the title position
grid = False, # Turning off grid lines
labelColor = "#888888", # Setting the label color to gray
titleColor = '#888888' # Setting the title color to gray
)
),
color = alt.Color(
'Metric',
scale = alt.Scale(range = ['#b4c6e4', '#4871b7']) # Setting a custom color range
),
xOffset = 'Metric' # Adjusting the x-offset
).properties(
height = 250, width = 375 # Setting the height and width of the chart
)
# Removing the stroke from the view for display only; `bar` itself stays unconfigured so it can be layered later
bar.configure_view(stroke = None)
Now, we can layer the graph and introduce the lines. It's worth noting that creating the lines in Altair is not a straightforward task and a considerable amount of documentation searching was necessary to achieve it.
# x, y and y2 cannot be defined with a condition, so repetitive code is necessary
# Create a rule chart for the ascending lines (lower value on the left, higher value on the right)
rule_asc = alt.Chart(selected_rows).mark_rule(x2Offset = 10, xOffset = -10).encode(
x = alt.X('Tier', sort = ['A+']), # X-axis encoding for 'Tier', sorted in a specific order
x2 = alt.X2('Tier'), # End point of the rule line
y = alt.Y('min(Value)'), # Start point of the rule line
y2 = alt.Y2('max(Value)'), # End point of the rule line
strokeWidth = alt.value(2), # Set the stroke width of the rule line
opacity = alt.condition(
(alt.datum.Tier == 'A+') | # Condition for specific tiers
(alt.datum.Tier == 'A') | # to determine opacity settings
(alt.datum.Tier == 'B'),
alt.value(1), alt.value(0) # Opacity set to 1 if condition is met, else 0
)
)
# Create a rule chart for the descending lines (higher value on the left, lower value on the right)
rule_desc = alt.Chart(selected_rows).mark_rule(x2Offset = 10, xOffset = -10
).encode(
x = alt.X('Tier', sort = ['A+']),
x2 = alt.X2('Tier'),
y = alt.Y('max(Value)'),
y2 = alt.Y2('min(Value)'),
strokeWidth = alt.value(2),
opacity = alt.condition(
(alt.datum.Tier == 'A+') |
(alt.datum.Tier == 'A') |
(alt.datum.Tier == 'B'),
alt.value(0), alt.value(1)
)
)
# Points of % Revenue where % Revenue > % Accounts
points1 = alt.Chart(selected_rows).mark_point(filled = True, xOffset = 10, color = "black").encode(
x = alt.X('Tier', sort = ['A+']),
y = alt.Y('max(Value)'),
opacity = alt.condition(
(alt.datum.Tier == 'A+') |
(alt.datum.Tier == 'A') |
(alt.datum.Tier == 'B'),
alt.value(1), alt.value(0)
)
)
# Points of % Revenue where % Revenue < % Accounts
points2 = alt.Chart(selected_rows).mark_point(filled = True, xOffset = 10, color = "black").encode(
x = alt.X('Tier', sort = ['A+']),
y = alt.Y('min(Value)'),
opacity = alt.condition(
(alt.datum.Tier == 'A+') |
(alt.datum.Tier == 'A') |
(alt.datum.Tier == 'B'),
alt.value(0), alt.value(1)
)
)
# Points of % Accounts where % Revenue < % Accounts
points3 = alt.Chart(selected_rows).mark_point(filled = True, xOffset = -10, color = "black").encode(
x = alt.X('Tier', sort = ['A+']),
y = alt.Y('max(Value)'),
opacity = alt.condition(
(alt.datum.Tier == 'A+') |
(alt.datum.Tier == 'A') |
(alt.datum.Tier == 'B'),
alt.value(0), alt.value(1)
)
)
# Points of % Accounts where % Revenue > % Accounts
points4 = alt.Chart(selected_rows).mark_point(filled = True, xOffset = -10, color = "black").encode(
x = alt.X('Tier', sort = ['A+']),
y = alt.Y('min(Value)'),
opacity = alt.condition(
(alt.datum.Tier == 'A+') |
(alt.datum.Tier == 'A') |
(alt.datum.Tier == 'B'),
alt.value(1), alt.value(0)
)
)
bar_point = bar + rule_asc + rule_desc + points1 + points2 + points3 + points4
bar_point.configure_view(stroke = None)
Visualization as depicted in the book:

Lines only
With two types of visualizations displaying the same data, the book suggests eliminating the bars altogether. This can be done without programming another graph:
# Configure the legend to be disabled (hidden)
# The 'opacity' configuration is set to 0, making the bars transparent
point = bar_point.configure_mark(opacity = 0).configure_view(stroke = None).configure_legend(disable = True)
point
Visualization as depicted in the book:

Slope graph
At last, we can reassemble the lines to create a slope graph.
# Create base chart, setting the title
base = alt.Chart(
selected_rows,
title = alt.Title("New client tier share", anchor = 'start', fontWeight = 'normal', fontSize = 20)
)
# Line chart configuration
line = base.mark_line(
point = True # The lines have points at the end
).encode(
x = alt.X(
'Metric',
axis = alt.Axis(title = None, labelAngle = 0, domain = False, ticks = False)
),
y = alt.Y('Value', axis = None),
color = alt.Color(
'Tier',
scale = alt.Scale(range = ['black']), # All Tier lines are black
legend = None
)
).properties(
width = 300,
height = 350
)
# Labels to the right of the slope
# These labels are for the Revenue
labels1 = base.mark_text(
align = 'left',
dx = 10
).encode(
x = alt.X('Metric'),
y = alt.Y('Value'),
text = alt.Text('Value:Q', format = '.0f'),
opacity = alt.condition(alt.datum.Metric == '% Accounts', alt.value(0), alt.value(0.7))
)
# Labels to the left of the slope
# These labels are for the Accounts
labels2 = base.mark_text(
align ='left',
dx = -20
).encode(
x = alt.X('Metric'),
y = alt.Y('Value'),
text = alt.Text('Value:Q', format='.0f'),
opacity = alt.condition(alt.datum.Metric == '% Accounts', alt.value(0.7), alt.value(0))
)
# Labels for the Tiers
tier_labels = base.mark_text(
align = 'left',
dx = 30,
fontWeight = 'bold'
).encode(
x = alt.X('Metric'),
y = alt.Y('Value'),
text = 'Tier',
opacity = alt.condition(alt.datum.Metric == '% Accounts', alt.value(0), alt.value(1)) # Show in only one side
)
# Tier title
tier_title = alt.Chart(
{"values": [{"text": ['TIER']}]}
).mark_text(
align = "left",
dx = 105,
dy = -120,
fontWeight = 'bold'
).encode(
text = "text:N")
slope = line + labels1 + labels2 + tier_labels + tier_title
slope.configure_view(stroke = None)
Notice how two numbers overlap: the percentages of accounts for the A+ and D tiers. Since they are the same number when rounded, we can simply hide one of the two labels.
# Eliminate Tier A+ from display
label_condition = (alt.datum.Metric == '% Accounts') & (alt.datum.Tier != 'A+')
labels2 = base.mark_text(
align ='left',
dx = -20
).encode(
x = alt.X('Metric'),
y = alt.Y('Value'),
text = alt.Text('Value:Q', format='.0f'),
opacity = alt.condition(label_condition, alt.value(0.7), alt.value(0))
)
slope = line + labels1 + labels2 + tier_labels + tier_title
slope.configure_view(stroke = None)
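The overlap handled above was spotted by eye. A quick pandas check could flag such cases before plotting; the sketch below uses a shortened stand-in for `selected_rows` and only catches labels that round to the same number within a metric, not labels that are merely close together.

```python
import pandas as pd

long_shares = pd.DataFrame({
    "Tier": ["A+", "A", "D", "A+", "A", "D"],
    "Metric": ["% Accounts"] * 3 + ["% Revenue"] * 3,
    "Value": [1.746324, 7.077206, 2.205882, 21.0, 25.0, 2.0],
})

# Rounded label as it would be displayed on the chart
labelled = long_shares.assign(Label = long_shares["Value"].round().astype(int))

# Rows whose label repeats within the same metric (same side of the slope)
collisions = labelled[labelled.duplicated(["Metric", "Label"], keep = False)]
print(collisions)
```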
Visualization as depicted in the book:

Interactivity
In this exercise, the selected graph for interactivity is the simple vertical bar chart, without the lines.
The chosen interactive features include a simple tooltip, revealing the precise values of each column upon hovering. Additionally, it shows the legend to provide further clarity about the corresponding data categories.
Finally, the columns highlight dynamically when the viewer hovers over them. Because of this feature, the color palette was changed, since the monochromatic version made the highlighted column and its non-highlighted neighbor look too similar.
# Selection for interactive points on hover
hover = alt.selection_point(on = 'mouseover', nearest = True, empty = False)
# Bar chart configuration with interactivity
bar_interactive = alt.Chart(
selected_rows,
title = alt.Title('New client tier share', fontSize = 20, anchor = 'start')
).mark_bar().encode(
x = alt.X(
'Tier',
axis = alt.Axis(title = 'TIER', labelAngle = 0, titleAnchor = "start", domain = False, ticks = False),
sort = ['A+']
),
y = alt.Y('Value', axis = alt.Axis(
title = "% OF TOTAL ACCOUNTS vs REVENUE",
titleAlign = 'left',
titleAngle = 0,
titleAnchor = 'end',
titleY = -10,
grid = False,
labelColor = "#888888",
titleColor = '#888888'
)),
color = alt.Color('Metric', scale = alt.Scale(range = ['#0a2f73', '#096b2b'])),
xOffset = 'Metric',
opacity = alt.condition(hover, alt.value(1), alt.value(0.5)), # Set opacity based on hover
tooltip = ['Value:Q', 'Metric'] # Show tooltip with specified fields
).properties(
height = 250, width = 375
).add_params(hover) # Add the hover selection to the chart
# Configure view settings for the interactive bar chart
bar_interactive.configure_view(stroke = None)
Exercise 2.4 - Practice in your tool
This exercise proposes to display the same data in six different formats, hand-drawn by the author in the theoretical exercise 2.3. The purpose of the activity is to practice in our own tool, and while C. Nussbaumer uses Excel, we will proceed with Altair.

Loading the data
# Loading considering the NaN caused by Excel formatting
table = pd.read_excel(r"Data\2.4 EXERCISE.xlsx", usecols = [1, 2, 3], header = 4)
table
| | DATE | CAPACITY | DEMAND |
|---|---|---|---|
| 0 | 2019-04 | 29263 | 46193 |
| 1 | 2019-05 | 28037 | 49131 |
| 2 | 2019-06 | 21596 | 50124 |
| 3 | 2019-07 | 25895 | 48850 |
| 4 | 2019-08 | 25813 | 47602 |
| 5 | 2019-09 | 22427 | 43697 |
| 6 | 2019-10 | 23605 | 41058 |
| 7 | 2019-11 | 24263 | 37364 |
| 8 | 2019-12 | 24243 | 34364 |
| 9 | 2020-01 | 25533 | 34149 |
| 10 | 2020-02 | 24467 | 25573 |
| 11 | 2020-03 | 25194 | 25284 |
In the graphs for this exercise, we require the inclusion of the "unmet demand" column, which is currently absent from the dataset. To obtain this value, we can calculate the difference between demand and capacity for each date.
# Calculate Unmet Demand
table['UNMET DEMAND'] = table['DEMAND'] - table['CAPACITY']
# Show only the first five lines
table.head()
| | DATE | CAPACITY | DEMAND | UNMET DEMAND |
|---|---|---|---|---|
| 0 | 2019-04 | 29263 | 46193 | 16930 |
| 1 | 2019-05 | 28037 | 49131 | 21094 |
| 2 | 2019-06 | 21596 | 50124 | 28528 |
| 3 | 2019-07 | 25895 | 48850 | 22955 |
| 4 | 2019-08 | 25813 | 47602 | 21789 |
Now we transform the data from the wide-format used in Excel to the long-format used in Altair.
# Transforming data into long-format
melted_table = pd.melt(table, id_vars = ['DATE'], var_name = 'Metric', value_name = 'Value')
melted_table
| | DATE | Metric | Value |
|---|---|---|---|
| 0 | 2019-04 | CAPACITY | 29263 |
| 1 | 2019-05 | CAPACITY | 28037 |
| 2 | 2019-06 | CAPACITY | 21596 |
| 3 | 2019-07 | CAPACITY | 25895 |
| 4 | 2019-08 | CAPACITY | 25813 |
| 5 | 2019-09 | CAPACITY | 22427 |
| 6 | 2019-10 | CAPACITY | 23605 |
| 7 | 2019-11 | CAPACITY | 24263 |
| 8 | 2019-12 | CAPACITY | 24243 |
| 9 | 2020-01 | CAPACITY | 25533 |
| 10 | 2020-02 | CAPACITY | 24467 |
| 11 | 2020-03 | CAPACITY | 25194 |
| 12 | 2019-04 | DEMAND | 46193 |
| 13 | 2019-05 | DEMAND | 49131 |
| 14 | 2019-06 | DEMAND | 50124 |
| 15 | 2019-07 | DEMAND | 48850 |
| 16 | 2019-08 | DEMAND | 47602 |
| 17 | 2019-09 | DEMAND | 43697 |
| 18 | 2019-10 | DEMAND | 41058 |
| 19 | 2019-11 | DEMAND | 37364 |
| 20 | 2019-12 | DEMAND | 34364 |
| 21 | 2020-01 | DEMAND | 34149 |
| 22 | 2020-02 | DEMAND | 25573 |
| 23 | 2020-03 | DEMAND | 25284 |
| 24 | 2019-04 | UNMET DEMAND | 16930 |
| 25 | 2019-05 | UNMET DEMAND | 21094 |
| 26 | 2019-06 | UNMET DEMAND | 28528 |
| 27 | 2019-07 | UNMET DEMAND | 22955 |
| 28 | 2019-08 | UNMET DEMAND | 21789 |
| 29 | 2019-09 | UNMET DEMAND | 21270 |
| 30 | 2019-10 | UNMET DEMAND | 17453 |
| 31 | 2019-11 | UNMET DEMAND | 13101 |
| 32 | 2019-12 | UNMET DEMAND | 10121 |
| 33 | 2020-01 | UNMET DEMAND | 8616 |
| 34 | 2020-02 | UNMET DEMAND | 1106 |
| 35 | 2020-03 | UNMET DEMAND | 90 |
To simplify the data transformation process in the graphs, we will deviate from the "yyyy-mm" format for the date. Instead, we will create two separate columns, one for the year and another for the abbreviated name of the month. This adjustment will streamline our visualization efforts by reducing the need for extensive data transformations within the graphs themselves.
# Transform the column into datetime format
melted_table['DATE'] = pd.to_datetime(melted_table['DATE'])
# Extracting year and month
melted_table['year'] = melted_table['DATE'].dt.year
melted_table['month'] = melted_table['DATE'].apply(lambda x: x.strftime('%b'))
# The DATE column is no longer useful
melted_table.drop('DATE', axis = 1, inplace = True)
melted_table.head()
| | Metric | Value | year | month |
|---|---|---|---|---|
| 0 | CAPACITY | 29263 | 2019 | Apr |
| 1 | CAPACITY | 28037 | 2019 | May |
| 2 | CAPACITY | 21596 | 2019 | Jun |
| 3 | CAPACITY | 25895 | 2019 | Jul |
| 4 | CAPACITY | 25813 | 2019 | Aug |
To further avoid data manipulation within the chart code, we will create auxiliary tables.
# Making new sets of data
# Just 2019
table_2019 = melted_table[melted_table['year'].isin([2019])]
# Just Demand from 2019
demand_2019 = table_2019[table_2019['Metric'].isin(['DEMAND'])]
# Just Capacity from 2019
capacity_2019 = table_2019[table_2019['Metric'].isin(['CAPACITY'])]
# Just Unmet Demand from 2019
unmet_2019 = table_2019[table_2019['Metric'].isin(['UNMET DEMAND'])]
# Demand and Capacity from 2019
bar_table = table_2019[table_2019['Metric'].isin(['CAPACITY', 'DEMAND'])]
# Unmet Demand and Capacity from 2019
stacked_table = table_2019[table_2019["Metric"].isin(["CAPACITY", "UNMET DEMAND"])]
Bar chart
The author deliberately filled the Capacity bars and left Demand only outlined, to visually distinguish what can be fulfilled (Capacity) from the unmet portion of the requirement (Unmet Demand). This choice is not straightforward to reproduce in Altair.
The mark property that controls whether a mark is filled does not accept a condition as its value. Since the author herself admits the shortcomings of this approach ("I find the outline plus the white space between the bars visually jarring"), we chose to differentiate the data by color, as is traditional.
# Unfilled version
alt.Chart(
bar_table,
title = alt.Title(
"Demand vs capacity over time", anchor = "start", offset = 20, fontSize = 16 # Set customized title
),
).mark_bar(filled = False).encode( # Filled = False makes the bars unfilled
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False, # Removes grid
titleAnchor = "end",
labelColor = "#888888", # Changes the label color to gray
titleColor = "#888888", # Changes the title color to gray
titleFontWeight = "normal"
),
scale = alt.Scale(domain = [0, 60000]), # y-axis goes from 0 to 60000
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0, # Makes label horizontal
titleAnchor = "start",
labelColor = "#888888", # Changes label and title color to gray
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False # Removes ticks from axis
),
title = "2019"
),
color = alt.Color( # Sets colors based on Metric (Demand and Capacity)
"Metric", scale = alt.Scale(range = ["#b4c6e4", "#4871b7"]), sort = "descending"
),
xOffset = alt.XOffset("Metric", sort = "descending") # Sets offset on the x-axis
).configure_view(
stroke = None
) # Remove the chart border
# Filled version
alt.Chart(
bar_table,
title = alt.Title(
"Demand vs capacity over time", anchor = "start", offset = 20, fontSize = 16 # Set customized title
),
).mark_bar().encode( # Filled = True is default
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False, # Removes grid
titleAnchor = "end",
labelColor = "#888888", # Changes the label color to gray
titleColor = "#888888", # Changes the title color to gray
titleFontWeight = "normal"
),
scale = alt.Scale(domain = [0, 60000]), # y-axis goes from 0 to 60000
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0, # Makes label horizontal
titleAnchor = "start",
labelColor = "#888888", # Changes label and title color to gray
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False # Removes ticks from axis
),
title = "2019"
),
color = alt.Color( # Sets colors based on Metric (Demand and Capacity)
"Metric", scale = alt.Scale(range = ["#b4c6e4", "#4871b7"]), sort = "descending"
),
xOffset = alt.XOffset("Metric", sort = "descending") # Sets offset on the x-axis
).configure_view(
stroke = None
) # Remove the chart border
Visualization as depicted in the book:

Line graph
The next step, cleaner than the bar chart, was to convey the data with a line graph, labeling each line directly along with its final value of the year. This helps the viewer see the difference between capacity and demand.
line = (
alt.Chart(
bar_table,
title = alt.Title(
"Demand vs capacity over time",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10 # Offsets the title in the y-axis
)
)
.mark_line() # Using a line mark for the chart
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
scale = alt.Scale(domain = [0, 60000]),
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None, # Disabling sorting for better time representation
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888", # Set colors to gray
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019"
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#1f77b4", "#1f77b4"]),
legend = None
),
strokeWidth = alt.condition(
"datum.Metric == 'CAPACITY'", alt.value(3), alt.value(1)
) # Adjusting line thickness based on the metric
)
.properties(width = 350, height = 250) # Set size of the graph
)
# Adding labels
label = (
alt.Chart(bar_table)
.mark_text(align = "left", dx = 3)
.encode(
x = alt.X("month", sort = None, aggregate = "max"),
y = alt.Y("Value", aggregate = {"argmax": "month"}),
text = alt.Text("Metric"), # The text itself is the Metric
color = alt.Color("Metric", scale = alt.Scale(range = ["#1f77b4", "#1f77b4"]))
)
)
# Combining the line chart and labels
line + label
As can be seen, positioning the label with the max and argmax aggregates did not yield the intended result. This is because Altair compares the month names alphabetically (making Sep the last month), even with sort = None set on the x-axis.
Since no documentation addressing this issue was found, the next approach was to add the labels manually. This also makes it easier to add the value next to the metric name.
# Demand label
label1 = alt.Chart({"values":
[{"text": ['34K DEMAND']}]
}
).mark_text(size = 10,
align = "left",
dx = 160, dy = -15,
color = '#1f77b4' # Color it blue
).encode(text = "text:N")
# Capacity label
label2 = alt.Chart({"values":
[{"text": ['24K CAPACITY']}]
}
).mark_text(size = 10,
align = "left",
dx = 160, dy = 25,
color = '#1f77b4', # Color it blue
fontWeight = 'bold'
).encode(text = "text:N")
line_final = line + label1 + label2
line_final.configure_view(stroke = None)
Visualization as depicted in the book:

Overlapping bars
The author now explores overlapping bars, wherein two bar graphs are positioned on top of each other, sharing the same axis. The Capacity data is displayed with transparency to prevent any potential confusion that might arise with a stacked bar chart.
In this particular graph, we chose to emulate the book's series labels with a subtitle built from text marks, since Altair does not provide a straightforward way to style parts of a title differently. Unlike previous examples, where the default legend effectively distinguished colors, the distinction here ("opaque" or "transparent") is better conveyed by normal or bold text in the subtitle than by a color legend.
# Demand bar, with bigger spacing between them, unfilled
demand = (
alt.Chart(
demand_2019,
width = alt.Step(40), # Defines the width of the bars (including distance between them)
title = alt.Title(
"Demand vs capacity over time",
fontSize = 18,
fontWeight = "normal",
anchor = "start"
)
)
.mark_bar(filled = False) # Unfilled
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888", # Gray label and title
titleColor = "#888888"
),
scale = alt.Scale(domain = [0, 60000]),
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0, # Horizontal label
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
ticks = False
),
title = "2019"
)
)
)
# Capacity bar, bigger size and more transparency
capacity = (
alt.Chart(capacity_2019)
.mark_bar(size = 30) # Makes the bar thicker but keeps the distance the same
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
scale = alt.Scale(domain = [0, 60000]),
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019"
),
opacity = alt.value(0.5) # Makes the bar transparent
)
)
# Labeling for subtitle
label1 = (
alt.Chart({"values": [{"text": ["DEMAND |"]}]})
.mark_text(
size = 10,
align = "left",
dx = -235,
dy = -120,
color = "#1f77b4"
)
.encode(text = "text:N")
)
label2 = (
alt.Chart({"values": [{"text": ["CAPACITY"]}]})
.mark_text(
size = 10,
align = "left",
dx = -177,
dy = -120,
color = "#1f77b4",
fontWeight = 800
)
.encode(text = "text:N")
)
overlap = capacity + demand + label1 + label2
# Sets space (padding) between bands
overlap.configure_scale(
bandPaddingInner = 0.5
).configure_view(
stroke = None
).properties(
height = 200
)
Visualization as depicted in the book:

Stacked bars
In the stacked bars configuration, the Demand bar chart has been replaced with Unmet Demand (i.e., Demand - Capacity). This modification allows the stacking to represent the cumulative Demand value. Additionally, a color adjustment has been made, with Unmet Demand now rendered in a darker shade to emphasize its significance as a more meaningful metric.
# Stacked bar
bars = (
alt.Chart(
stacked_table,
title = alt.Title(
"Demand vs capacity over time",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10 # Offsets title in the y-axis
)
)
.mark_bar(size = 25)
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888", # Sets title and label to gray
titleColor = "#888888",
titleFontWeight = "normal"
),
scale = alt.Scale(domain = [0, 60000]),
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019"
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#d9dad9", "#4871b7"])
),
order = alt.Order("Metric", sort = "ascending") # Unmet demand on top
)
)
# Border detail, makes the graph more visible
border = (
alt.Chart(stacked_table)
.mark_bar(size = 25, filled = False) # Makes an unfilled bar
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888"
),
scale = alt.Scale(domain = [0, 60000]),
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
ticks = False
),
title = "2019"
),
order = alt.Order("Metric", sort = "ascending")
)
)
stacked = bars + border
stacked.configure_view(stroke = None).properties(width = 300, height = 200)
Dot plot
For the next graph, a dot plot, the author reveals the challenges she had in Excel. To create the circles, she employed data markers from two line graphs, concealing the lines themselves. The region connecting the dots was achieved with a stacked bar of Unmet Demand, sitting on top of a transparent Capacity series.
This serves as a noteworthy example of the limitations of Excel when dealing with charts not built into the tool. While certain graphs may be more straightforward in the Microsoft tool, unconventional visualizations might demand intricate and obscure workarounds. In Altair, where the approach to data visualization is more flexible, documentation for this graph was readily available.
# Unfilled version
dots1 = (
alt.Chart(
bar_table,
title = alt.Title(
"Demand vs capacity over time", # Set title
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_circle(size = 600, opacity = 1) # Maximum opacity
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleFontWeight = "normal",
titleColor = "#888888",
titleAnchor = "start",
ticks = False,
labels = False, # Removes labels from axis
domain = False # Removes line from axis
),
scale = alt.Scale(domain = [0, 60000]),
title = "# OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
titleFontWeight = "normal",
labelColor = "#888888",
labelPadding = 10, # Makes label more distant to the axis
titleColor = "#888888",
ticks = False
),
title = "2019"
),
color = alt.Color("Metric", scale = alt.Scale(range = ["#4871b7"]), legend = None)
)
.properties(width = 400, height = 250)
.transform_filter(alt.datum.Metric == "CAPACITY") # Alternative way to filter, similar to using auxiliary table
)
dots2 = ( # Same graph, but for Demand
alt.Chart(
bar_table,
title = alt.Title(
"Demand vs capacity over time",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_circle(size = 600, opacity = 1, filled = False) # Unfilled
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleFontWeight = "normal",
titleColor = "#888888",
titleAnchor = "start",
ticks = False,
labels = False,
domain = False
),
scale = alt.Scale(domain = [0, 60000]),
title = "# OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
titleFontWeight = "normal",
labelColor = "#888888",
labelPadding = 10, # Makes label more distant to axis
titleColor = "#888888",
ticks = False
),
title = "2019"
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#b4c6e4"]),
legend = None
) # Lighter blue
)
.properties(width = 400, height = 250)
.transform_filter(alt.datum.Metric == "DEMAND")
)
# Lines between dots
line = (
alt.Chart(bar_table)
.mark_line(strokeWidth = 25, opacity = 0.25) # Sets width and transparency
.encode(
x = alt.X("month", sort = None),
y = "Value",
detail = "month") # detail = month makes a line per month
) # instead of a single one
# Text inside the dots
text = (
alt.Chart(bar_table)
.mark_text()
.encode(
x = alt.X("month", sort = None),
y = "Value",
text = alt.Text(
"Value:Q",
format = ".2s"
), # Formats 10000 as 10k
color = alt.condition(
alt.datum.Metric == "DEMAND",
alt.value("black"),
alt.value("white") # Set color depending on metric
)
)
)
# Set legend for Metric
label1 = (
alt.Chart({"values": [{"text": ["DEMAND"]}]})
.mark_text(
size = 11,
align = "left",
dx = 200,
dy = -17,
color = "#4871b7"
)
.encode(text = "text:N")
)
label2 = (
alt.Chart({"values": [{"text": ["CAPACITY"]}]})
.mark_text(
size = 11,
align = "left",
dx = 200,
dy = 25,
color = "#4871b7",
fontWeight = "bold"
)
.encode(text = "text:N")
)
dot_plot = line + dots1 + dots2 + text + label1 + label2
dot_plot.configure_view(stroke = None)
# Dot plot, filled-only version
dots = (
alt.Chart(
bar_table,
title = alt.Title(
"Demand vs capacity over time",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_circle(size = 600, opacity = 1) # Max opacity, filled by default
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleFontWeight = "normal",
titleColor = "#888888",
titleAnchor = "start",
ticks = False,
labels = False,
domain = False
),
scale = alt.Scale(domain = [0, 60000]),
title = "# OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
titleFontWeight = "normal",
labelColor = "#888888",
labelPadding = 10,
titleColor = "#888888",
ticks = False
),
title = "2019"
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#4871b7", "#b4c6e4"]),
legend = None
)
)
.properties(width = 400, height = 250)
)
dot_plot = line + dots + text + label1 + label2
dot_plot.configure_view(stroke = None)
Visualization as depicted in the book:

Graph the difference
For the final visualization, a simple line plot of the unmet demand was chosen. Although minimalist and clean, this choice hides the actual values of demand and capacity.
alt.Chart(
unmet_2019,
title = alt.Title(
"Unmet demand over time",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
),
).mark_line().encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title="2019",
),
strokeWidth = alt.value(3) # Set thickness of the line
).properties(
width = 375, height = 250
).configure_view(
stroke = None
)
Visualization as depicted in the book:

Interactivity
The initial idea was to make an interactive version of the overlapping bars graph in which the opacity adjusts when the mouse is near a column, similar to the one in Exercise 2.1. Although the tooltip works correctly, the hover selection failed to apply to every chart in the layering. As a result, only the Demand bars were highlighted on hover.
# Define hover
hover = alt.selection_point(on = "mouseover", nearest = True, empty = False)
demand = (
alt.Chart(
demand_2019,
width = alt.Step(40),
title = alt.Title(
"Demand vs capacity over time",
fontSize = 18,
fontWeight = "normal",
anchor = "start"
)
)
.mark_bar()
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888"
),
scale = alt.Scale(domain = [0, 60000]),
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
ticks = False
),
title = "2019"
),
opacity = alt.condition(
hover,
alt.value(1),
alt.value(0.5)
), # Set opacity according to hover
tooltip = ["Value:Q", "Metric"]
)
.add_params(hover)
)
# Capacity bar, bigger size and more transparency
capacity = (
alt.Chart(capacity_2019)
.mark_bar(size = 30)
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
scale = alt.Scale(domain = [0, 60000]),
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019"
),
opacity = alt.condition(hover, alt.value(1), alt.value(0.5)),
tooltip = ["Value:Q", "Metric"]
)
.add_params(hover)
)
overlap = demand + capacity
overlap.configure_scale(
bandPaddingInner = 0.5
).configure_view(
stroke = None
).properties(
height = 200
)
Due to a lack of documentation about interactivity in layered charts, no solution to this problem was found.
The final interactive graph for this exercise was the line graph. Rather than keeping the "Capacity and Demand" graph distinct from the "Unmet Demand" one, we opted for a consolidated approach with three lines.
Users can now choose the data to emphasize through a dropdown box. Additionally, each month features a tooltip marked by a point for enhanced clarity.
# Sorts so that the metrics in the dropdown align with the order of the lines in the graph
custom_order = ["DEMAND", "CAPACITY", "UNMET DEMAND"]
sorted_metrics = [metric for metric in custom_order if metric in table_2019["Metric"].unique()]
# Creates the dropdown
dropdown = alt.binding_select(options = list(sorted_metrics), name = "SELECT LINE: ")
selection = alt.selection_point(fields = ["Metric"], bind = dropdown)
line = (
alt.Chart(
table_2019,
title = alt.Title(
"Demand and capacity over time: unmet demand calculated", # Changes title
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
),
)
.mark_line(point = True) # Create a point at each month
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
scale = alt.Scale(domain = [0, 60000]),
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None, # Disabling sorting for better time representation
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888", # Set colors to gray
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019"
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#1f77b4", "#1f77b4", "red"]), # Unmet demand is red
legend = None
),
opacity = alt.condition(
selection, alt.value(1), alt.value(0.1)
), # Adjusting opacity based on the dropdown
tooltip = ["Metric", "month", "Value"] # Sets tooltip
)
.properties(width = 350, height = 250)
).add_params(selection)
# Demand label
label1 = alt.Chart({"values":
[{"text": ['DEMAND']}]
}
).mark_text(size = 10,
align = "left",
dx = 165, dy = -15,
color = '#1f77b4', # Color it blue
fontWeight = 700 # Bold font
).encode(
text = "text:N",
opacity = alt.condition(
selection,
alt.value(1),
alt.value(0)
) # Label disappears when any line is selected
)
# Capacity label
label2 = alt.Chart({"values":
[{"text": ['CAPACITY']}]
}
).mark_text(size = 10,
align = "left",
dx = 165, dy = 25,
color = '#1f77b4', # Color it blue
fontWeight = 700
).encode(
text = "text:N",
opacity = alt.condition(
selection,
alt.value(1),
alt.value(0)
) # Label disappears when any line is selected
)
# Unmet demand label
label3 = alt.Chart({"values":
[{"text": ['UNMET DEMAND']}]
}
).mark_text(size = 10,
align = "left",
dx = 165, dy = 82,
color = 'red', # Color it red
fontWeight = 700
).encode(
text = "text:N",
opacity = alt.condition(
selection,
alt.value(1),
alt.value(0)
) # Label disappears when any line is selected
)
line_final = line + label1 + label2 + label3
line_final.configure_view(stroke = None)
Alternatively, we can make a simple selection with a radio button, including an option for all lines.
options = ['DEMAND', 'CAPACITY', 'UNMET DEMAND']
labels = [option + ' ' for option in options]
input_dropdown = alt.binding_radio(
options = options + [None], # Create the option for all lines
labels = labels + ['ALL'],
name = 'SELECT LINE: '
)
# Create the selection
selection = alt.selection_point(
fields = ['Metric'],
bind = input_dropdown
)
line = (
alt.Chart(
table_2019,
title = alt.Title(
"Demand and capacity over time: unmet demand calculated", # Changes title
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
),
)
.mark_line(point = True) # Create a point at each month
.encode(
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
scale = alt.Scale(domain = [0, 60000]),
title = "NUMBER OF PROJECT HOURS"
),
x = alt.X(
"month",
sort = None, # Disabling sorting for better time representation
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888", # Set colors to gray
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019"
),
color = alt.Color(
"Metric",
scale = alt.Scale(domain = options) # Instead of changing the opacity
), # in this example, we will limit the domain
tooltip = ["Metric", "month", "Value"] # Sets tooltip
)
.properties(width = 350, height = 250)
).add_params(selection).transform_filter(selection)
line_final = line
line_final.configure_view(stroke = None)
Loading the data
# Loading considering the NaN caused by Excel formatting
table = pd.read_excel(r"Data\2.5 EXERCISE.xlsx", usecols = [1, 2], header = 5)
table
| | Year | Attrition Rate |
|---|---|---|
| 0 | 2019 | 0.0910 |
| 1 | 2018 | 0.0820 |
| 2 | 2017 | 0.0450 |
| 3 | 2016 | 0.1230 |
| 4 | 2015 | 0.0560 |
| 5 | 2014 | 0.1510 |
| 6 | 2013 | 0.0700 |
| 7 | 2012 | 0.0100 |
| 8 | 2011 | 0.0200 |
| 9 | 2010 | 0.0970 |
| 10 | AVG | 0.0745 |
First, we will drop the AVG (Average) row, as it will not be a data point in our graphs. It is better to calculate the average separately when needed.
table.drop(10, inplace = True)
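Dropping by positional index works for this file but depends on AVG staying in row 10. A slightly more defensive sketch, assuming the summary row is always labeled "AVG" in the Year column, filters on that label instead. The three-row frame and the name yearly are illustrative only.

```python
import pandas as pd

# Three-row stand-in for the loaded table, including the summary row
table = pd.DataFrame({
    "Year": [2019, 2018, "AVG"],
    "Attrition Rate": [0.091, 0.082, 0.0745],
})

# Keep only the rows that are real years, wherever AVG happens to sit
yearly = table[table["Year"] != "AVG"].copy()

# With the text label gone, the column can be stored as integers
yearly["Year"] = yearly["Year"].astype(int)
```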
Dot plot
When attempting the first scatter plot, we realize that Altair incorrectly infers the data type of the "Year" column. This can be fixed by specifying the correct data type (:O, for ordinal).
# Without data type
alt.Chart(table).mark_point(filled = True).encode(
x = alt.X('Year'),
y = alt.Y('Attrition Rate')
)
# With the data type set to temporal
alt.Chart(table).mark_point(filled = True).encode(
x = alt.X('Year:T'),
y = alt.Y('Attrition Rate')
)
# With the data type set to ordinal
alt.Chart(table).mark_point(filled = True).encode(
x = alt.X('Year:O'),
y = alt.Y('Attrition Rate')
)
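The temporal variant appears to misbehave because a bare number in a temporal field is read as a timestamp in milliseconds since 1970, so 2019 lands about two seconds past the Unix epoch. The pandas snippet below mirrors that interpretation and shows one possible alternative for cases where a true time axis is wanted: converting the year to a real date first. The frame years and the column YearDate are hypothetical names.

```python
import pandas as pd

years = pd.DataFrame({"Year": [2019, 2018, 2017]})

# A bare integer read as epoch milliseconds lands in 1970
as_epoch_ms = pd.to_datetime(years["Year"], unit="ms")

# Parsing the value as a calendar year gives January 1st of that year
years["YearDate"] = pd.to_datetime(years["Year"].astype(str), format="%Y")
```

For this exercise the ordinal type is enough, since the years are evenly spaced and only their order matters.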
Initially, we will create a dot plot to visually represent the data over time, incorporating an average line to facilitate comparison.
# Create the base graph with title
base = alt.Chart(
table,
title = alt.Title(
"Attrition rate over time",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
# Make the filled dots
dots = base.mark_point(filled = True, size = 50, color = "#2c549d").encode(
x = alt.X(
"Year:O",
axis = alt.Axis(labelAngle = 0, labelColor = "#888888", ticks = False),
title = None,
scale = alt.Scale(align = 0) # Align the first dot with the y-axis (0 at x-axis)
),
y = alt.Y(
"Attrition Rate",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
tickCount = 9, # Set fixed number of ticks (intervals)
format = "%", # y-axis data is a percentage
titleFontWeight = "normal"
),
title = "ATTRITION RATE"
),
opacity = alt.value(1) # Maximum opacity
)
# Makes a line at the average value
# strokeDash sets the dash pattern of the line (dash length, gap length)
rule = base.mark_rule(
color = "#2c549d",
strokeDash = [3, 3]
).encode(
x = alt.value(0),
x2 = alt.value(315),
y = "mean(Attrition Rate)"
)
# Text above the average line
label = (
alt.Chart({"values": [{"text": ["AVERAGE: 7.5%"]}]})
.mark_text(
size = 10,
align = "left",
dx = -170,
dy = 0,
color = "#2c549d",
fontWeight = "bold"
)
.encode(text = "text:N")
)
final_dots = dots + rule + label
final_dots.properties(
width = 350, height = 200
).configure_view(stroke = None)
Visualization as depicted in the book:

Line graph
Next, we will link the dots with a line, aiding in the comparison of value differences.
Once more, omitting the data type in the label specification causes the labels to accumulate on the right side of the graph.
line = base.mark_line(color = "#2c549d").encode(
x = alt.X(
"Year:O",
axis = alt.Axis(labelAngle = 0, labelColor = "#888888", ticks = False),
title = None,
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Attrition Rate",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
tickCount = 9,
format = "%",
titleFontWeight = "normal"
),
title = "ATTRITION RATE"
)
)
# Label without specifying data type
label = base.mark_text(align = "left", dx = 3).encode(
x = alt.X("Year", aggregate = "max"),
y = alt.Y("Attrition Rate", aggregate = {"argmax": "Year"}),
text = alt.Text("Attrition Rate")
)
final_line = line + rule + label
final_line.properties(width = 350, height = 200).configure_view(stroke = None)
# With data type
label = base.mark_text(align = "left", dx = 3, color = "#2c549d").encode(
x = alt.X("Year:O", aggregate = "max"),
y = alt.Y("Attrition Rate", aggregate = {"argmax": "Year"}),
text = alt.Text("Attrition Rate")
)
final = line + rule + label
final.properties(
width = 350, height = 200
).configure_view(stroke = None)
The default method for placing the end label proved ineffective, as it failed to pick out 2019 as the maximum value in the Year column. This can be fixed by simply filtering the dataset to keep only the rows where Year == 2019.
# Filter 2019 manually
label = base.mark_text(align = 'left', dx = 3, color = '#2c549d', fontWeight = 'bold').encode(
x = alt.X('Year:O'),
y = alt.Y('Attrition Rate'),
text = alt.Text('Attrition Rate', format = ".1%"),
xOffset = alt.value(-10), # Offsets slightly in the x and y-axis
yOffset = alt.value(-10)
).transform_filter( # Filters
alt.FieldEqualPredicate(field = 'Year', equal = 2019)
)
# Label for the Average
label2 = alt.Chart({"values":
[{"text": ['AVG: 7.5%']}]
}
).mark_text(size = 10,
align = "left",
dx = 96, dy = 15,
color = '#2c549d',
fontWeight = 'bold'
).encode(text = "text:N")
# Filled point at the end of the line
point = base.mark_point(filled = True).encode(
x = alt.X('Year:O'),
y = alt.Y('Attrition Rate'),
opacity = alt.value(1)
).transform_filter(
alt.FieldEqualPredicate(field = 'Year', equal = 2019)
)
final_line = line + rule + label + label2 + point
final_line.properties(
width = 350,
height = 200
).configure_view(stroke = None)
Visualization as depicted in the book:

Coloring below the average line may help highlight values below it.
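The average labels in this section are typed by hand ("AVG: 7.5%"). As a minimal sketch, assuming hypothetical rates rather than the book's data, the label text could instead be derived from the computed mean so that it stays in sync with the data:

```python
from statistics import fmean

# Hypothetical attrition rates, NOT the values from the exercise file
rates = [0.05, 0.10, 0.075]

avg = fmean(rates)

# Format the mean as a percentage with one decimal place
avg_label = f"AVG: {avg:.1%}"
print(avg_label)
```

With the notebook's data, the mean would come from `table['Attrition Rate'].mean()`, and the resulting string could replace the literal in the label's "text" value.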
# Calculates average
avg = table['Attrition Rate'].mean()
# Creates a rectangle below the average line
rect = alt.Chart(pd.DataFrame({'y': [0], 'y2':[avg]})).mark_rect(
opacity = 0.2
).encode(y = 'y', y2 = 'y2', x = alt.value(0), x2 = alt.value(315))
# Makes a different average label, in lighter color
label2 = alt.Chart({"values":
[{"text": ['AVG:', '7.5%']}]
}
).mark_text(size = 10,
align = "left",
dx = 113, dy = 15,
color = '#9fb5db',
fontWeight = 'bold'
).encode(text = "text:N")
final_line2 = line + rect + label + label2 + point
final_line2.properties(
width = 350,
height = 200
).configure_view(stroke = None)
Visualization as depicted in the book:

Area graph
We also tried an area graph; however, it gives the impression that the area under the line is meaningful, which is not the case for this dataset. This graph type may not be the most suitable choice for presenting this data.
Also, we chose to depart from the book's example and connect the area to the y-axis. It is unclear whether the gap in the book's version of this graph was intentional or an error.
# Area graph
area = base.mark_area().encode(
x = alt.X(
"Year:O",
axis = alt.Axis(labelAngle = 0, labelColor = "#888888", ticks = False),
title = None,
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Attrition Rate",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
tickCount = 9,
format = "%",
titleFontWeight = "normal"
),
title = "ATTRITION RATE"
)
)
# Creates a lighter rule to contrast with area graph
rule_light = base.mark_rule(
color = "#9fb5db",
strokeDash = [3, 3]
).encode(
x = alt.value(0),
x2 = alt.value(315),
y = "mean(Attrition Rate)"
)
final_area = area + rule_light + label2
final_area.properties(
width = 350,
height = 200
).configure_view(stroke = None)
Visualization as depicted in the book:

Bar plot
Finally, we can do a classic bar plot.
# Bar plot
bar = base.mark_bar(size = 25).encode(
x = alt.X(
"Year:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
ticks = False,
domain = False
),
title = None,
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Attrition Rate",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
tickCount = 9,
format = "%",
titleFontWeight = "normal"
),
title = "ATTRITION RATE"
)
)
# Moves placement of average label
label = (
alt.Chart({"values": [{"text": ["AVG: 7.5%"]}]})
.mark_text(
size = 10,
align = "left",
dx = -130,
dy = 0,
color = "#2c549d",
fontWeight = "bold"
)
.encode(text = "text:N")
)
final_bar = bar + rule + label
final_bar.properties(
width = 320,
height = 200
).configure_view(stroke = None)
Visualization as depicted in the book:

Interactive
For this interactive graph, the dot graph was selected. Inspired by the concept of the average line, we opted to create a "cut" line, allowing users to specify a value to partition the data. Dots falling below this threshold will be colored in red for enhanced visibility and focus.
# Create the slider
slider = alt.binding_range(min = 0, max = 0.16, step = 0.005, name = "CUT: ")
selector = alt.param(name = "SelectorName", value = 0.03, bind = slider)
# Copy the column under a name without spaces, so it can be referenced as alt.datum.AttRate
table["AttRate"] = table["Attrition Rate"]
base = alt.Chart(
table,
title = alt.Title(
"Attrition rate over time",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
dots = (
base.mark_point(filled = True, size = 50)
.encode(
x = alt.X(
"Year:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
ticks = False
),
title = None,
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Attrition Rate",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
tickCount = 9,
format = "%",
titleFontWeight = "normal"
),
title = "ATTRITION RATE",
),
color = alt.condition( # Change color when Attrition Rate is less than the slider value
alt.datum.AttRate < selector,
alt.value("#ef476f"),
alt.value("#118ab2")
),
opacity = alt.value(1)
)
.add_params(selector)
)
# Normal Average Rule
rule = base.mark_rule(
color = "#118ab2",
strokeDash = [3, 3]
).encode(
x = alt.value(0),
x2 = alt.value(315),
y = "mean(AttRate)"
)
# "Cut" rule - moves based on the slider
rule2 = (
base.mark_rule(
color = "#ef476f",
strokeDash = [3, 3]
)
.encode(
x = alt.value(0),
x2 = alt.value(315),
y = alt.value(200 - selector * 1250), # alt.datum did not work, so this
opacity = alt.value(0.1) # converts the data value into a pixel position
)
.add_params(selector)
)
label = (
alt.Chart({"values": [{"text": ["AVERAGE: 7.5%"]}]})
.mark_text(
size = 10,
align = "left",
dx = -170,
dy = 0,
color = "#118ab2",
fontWeight = "bold"
)
.encode(text = "text:N")
)
# "Cut" label, moves along with rule2
label2 = (
alt.Chart({"values": [{"text": ["CUT"]}]})
.mark_text(
size = 10,
align = "left",
dx = 110,
color = "#ef476f",
fontWeight = "bold"
)
.encode(
text = "text:N",
y = alt.value(195 - selector * 1250)
)
.add_params(selector)
)
final_interactive = dots + rule + label + rule2 + label2
final_interactive.properties(
width = 350,
height = 200
).configure_view(stroke = None)
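The expression `200 - selector * 1250` used for the cut rule converts a data value into a pixel position by hand. Below is a minimal sketch of that arithmetic, assuming the y-axis runs from 0 to 0.16 (the slider's maximum) over the 200-pixel chart height. The function name `value_to_pixel` is hypothetical, and the actual axis domain is chosen by Vega-Lite, so it may differ:

```python
def value_to_pixel(value, domain_max=0.16, height=200):
    """Map a data value to a y pixel, measured from the top of the chart.

    Assumes a linear scale from 0 to domain_max drawn over `height` pixels.
    """
    return height - value * (height / domain_max)


# height / domain_max = 200 / 0.16, which gives the factor 1250 used in the notebook
print(value_to_pixel(0.03))  # position for the slider's initial value
print(value_to_pixel(0.0))   # bottom of the chart
```

If the chart height or the axis range changes, the factor has to be recomputed, which is the main drawback of positioning the rule in pixels.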
Chapter 3 - Identify and eliminate Clutter
"This lesson is simple but the impact is huge: get rid of the stuff that doesn’t need to be there" - Cole Nussbaumer Knaflic
Exercise 3.2 - how can we tie words to the graph?
The main focus of this exercise is to apply the Gestalt Principles of Visual Perception to declutter graphs. Four principles will be demonstrated, each explained alongside the visualization that employs it.
Loading the data
# Loading considering the NaN caused by Excel formatting
table = pd.read_excel(r"Data\3.2 EXERCISE.xlsx", usecols = [1, 2, 3], header = 4, skipfooter = 6)
table
| | 2019 | Rate | # exits |
|---|---|---|---|
| 0 | JAN | 0.0040 | 120 |
| 1 | FEB | 0.0010 | 30 |
| 2 | MAR | 0.0015 | 45 |
| 3 | APR | 0.0080 | 240 |
| 4 | MAY | 0.0030 | 90 |
| 5 | JUN | 0.0014 | 42 |
| 6 | JUL | 0.0044 | 132 |
| 7 | AUG | 0.0050 | 150 |
| 8 | SEP | 0.0022 | 66 |
| 9 | OCT | 0.0015 | 45 |
| 10 | NOV | 0.0005 | 15 |
| 11 | DEC | 0.0010 | 30 |
The column name for 2019 is currently an integer, which might pose issues in the future. To avoid potential complications, we will modify the column name to a string.
For example, trying to run the following code returns an error:
alt.Chart(table).mark_bar().encode(
x = alt.X('2019'),
y = alt.Y('Rate')
)
ValueError: Dataframe contains invalid column name: 2019. Column names must be strings
table.rename(columns = {2019:'Date'}, inplace = True)
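Renaming the one offending column works here. As a hedged, more general sketch (using a small hypothetical DataFrame rather than the exercise file), every column label can be converted to a string before charting:

```python
import pandas as pd

# Hypothetical frame that mimics an integer column label read from Excel
df = pd.DataFrame({2019: ["JAN", "FEB"], "Rate": [0.004, 0.001]})

# Convert every column label to a string; existing string labels are unchanged
df.columns = df.columns.map(str)

print(list(df.columns))
```

A descriptive name such as 'Date', as chosen in the notebook, remains easier to read in the encodings.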
Cluttered graph
# Graph with cluttered text
bar = (
alt.Chart(
table,
title = alt.Title(
"2019 monthly voluntary attrition rate",
fontSize = 15,
anchor = "start",
offset = 10,
fontWeight = "normal"
)
)
.mark_bar(size = 20, color = "#b0b0b0")
.encode(
x = alt.X(
"Date",
sort = None, # Avoids alphabetical order
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
ticks = False,
titleAnchor = "start",
titleFontWeight = "normal"
),
title="2019"
),
y = alt.Y(
"Rate",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%", # y-axis is a percentage
tickCount = 10 # Number of ticks (intervals) in axis
),
scale = alt.Scale(domain = [0, 0.01]),
title = "ATTRITION RATE"
)
)
.properties(width = 300, height = 200)
)
# Text next to the graph
text = (
alt.Chart(
{
"values": [
{ # Each item is a line
"text": [
"Highlights:",
" ",
"In April there was a",
"reorganization. No jobs",
"were eliminated, but many",
"people chose to leave.",
" ",
"Attrition rates tend to be",
"higher in the Summer",
"months when it is",
"common for associates",
"to leave to go back to",
"school.",
" ",
"Attrition is typically low in",
"November and December",
"due to the holidays.",
]
}
]
}
)
.mark_text(
size = 11,
align = "left",
dy = -20,
dx = -10
) # Size and placement
.encode(text = "text:N")
)
# The | operator concatenates the charts horizontally, placing the text next to the bars
final_cluttered = bar | text
final_cluttered.configure_view(stroke = None)
Visualization as depicted in the book:

Proximity
The "Proximity Principle" says that we tend to associate objects close to each other as being part of a single group. To apply this in our graph, we bring the texts near the data they represent.
# The text now needs to be broken into parts
# First paragraph
text_april = (
alt.Chart(
{
"values": [
{
"text": [
"In April there was a",
"reorganization. No jobs",
"were eliminated, but many",
"people chose to leave.",
]
}
]
}
)
.mark_text(
size = 11,
align = "left",
dx = -145,
dy = -105
)
.encode(text = "text:N")
)
# Second paragraph
text_summer = (
alt.Chart(
{
"values": [
{
"text": [
"Attrition rates tend to be",
"higher in the Summer",
"months when it is",
"common for associates to",
"leave to go back to",
"school.",
]
}
]
}
)
.mark_text(
size = 11,
align = "left",
dx = -10,
dy = -65
)
.encode(text = "text:N")
)
# Third paragraph
text_nov_dec = (
alt.Chart(
{
"values": [
{
"text": [
"Attrition is",
"typically low in",
"November &",
"December due",
"to the holidays.",
]
}
]
}
)
.mark_text(
size = 11,
align = "right",
dx = 150,
dy = 5
)
.encode(text = "text:N")
)
# Now we layer the charts with +, so that the texts lie on top of the bars instead of next to them
final_prox = bar + text_april + text_summer + text_nov_dec
final_prox.configure_view(stroke = None)
Visualization as depicted in the book:

Proximity with emphasis
We can enhance the visual impact by emphasizing the bars and keywords.
Given that Altair does not support bold text within regular content, a strategy is to introduce blank spaces in the text and create a distinct object for the bold keywords.
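The blank-space strategy can be sketched in plain Python. The helper name `blank_out` is hypothetical, and because the chart font is proportional, the number of spaces (and the `dx`/`dy` of the bold overlay) would still need manual tuning:

```python
def blank_out(line, keyword):
    """Replace `keyword` in `line` with spaces so a bold copy can be drawn over the gap."""
    return line.replace(keyword, " " * len(keyword))


line = "In April there was a"
blanked = blank_out(line, "April")

# repr() makes the run of spaces visible
print(repr(blanked))
```

The bold keyword is then drawn as its own text mark, positioned over the gap, as the code that follows does for each highlighted word.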
bar_highlight = (
alt.Chart(
table,
title = alt.Title(
"2019 monthly voluntary attrition rate",
fontSize = 15,
anchor = "start",
offset = 10,
fontWeight = "normal"
),
)
.mark_bar(size = 20)
.encode(
x = alt.X(
"Date",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleX = 12,
labelColor = "#888888",
titleColor = "#888888",
ticks = False,
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "2019"
),
y = alt.Y(
"Rate",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%",
tickCount = 10
),
scale = alt.Scale(domain = [0, 0.01]),
title = "ATTRITION RATE"
),
color = alt.Color(
"Date",
sort = None,
scale = alt.Scale(
range = [
"#b0b0b0",
"#b0b0b0",
"#b0b0b0",
"#666666", # It was also possible to color by condition
"#b0b0b0", # where Date == [list of highlighted months]
"#b0b0b0",
"#666666",
"#666666",
"#b0b0b0",
"#b0b0b0",
"#666666",
"#666666"
]
),
legend = None
)
)
.properties(width = 300, height = 200)
)
# First paragraph with blank space
text_april_blank = (
alt.Chart(
{
"values": [
{
"text": [
"In there was a",
"reorganization. No jobs",
"were eliminated, but many",
"people chose to leave."
]
}
]
}
)
.mark_text(
size = 11,
align = "left",
dx = -145,
dy = -105
)
.encode(text = "text:N")
)
# Second paragraph with blank space
text_summer_blank = (
alt.Chart(
{
"values": [
{
"text": [
"Attrition rates tend to be",
"higher in the",
"months when it is",
"common for associates to",
"leave to go back to",
"school.",
]
}
]
}
)
.mark_text(
size = 11,
align = "left",
dx = -10,
dy = -65
)
.encode(text = "text:N")
)
# Third paragraph with blank space
text_nov_dec_blank = (
alt.Chart(
{
"values": [
{
"text": [
"Attrition is",
"typically low in",
"&",
"due",
"to the holidays.",
]
}
]
}
)
.mark_text(
size = 11,
align = "right",
dx = 150,
dy = 5
)
.encode(text = "text:N")
)
# Bold "April" word
text_april_bold = (
alt.Chart({"values": [{"text": ["April"]}]})
.mark_text(
size = 11,
align = "left",
dx = -133,
dy = -105,
fontWeight = 800
)
.encode(text = "text:N")
)
# Bold "Summer" word
text_summer_bold = (
alt.Chart({"values": [{"text": ["Summer"]}]})
.mark_text(
size = 11,
align = "left",
dx = 54,
dy = -52,
fontWeight = 800
)
.encode(text = "text:N")
)
# Bold "November" word
text_nov_bold = (
alt.Chart({"values": [{"text": ["November"]}]})
.mark_text(
size = 11,
align = "left",
dx = 80,
dy = 31,
fontWeight = 800
)
.encode(text = "text:N")
)
# Bold "December" word
text_dec_bold = (
alt.Chart({"values": [{"text": ["December"]}]})
.mark_text(
size = 11,
align = "left",
dx = 68,
dy = 44,
fontWeight = 800
)
.encode(text = "text:N")
)
# Adds everything
final_prox_emph = (
bar_highlight
+ text_april_blank
+ text_april_bold
+ text_summer_blank
+ text_summer_bold
+ text_nov_dec_blank
+ text_nov_bold
+ text_dec_bold
)
final_prox_emph.configure_view(stroke=None)
Visualization as depicted in the book:

Similarity
The "Similarity Principle" pertains to our tendency to perceive objects as part of the same group when they share similar color, shape, or size. For this example, this means coloring the columns in the same shade as the chosen keywords.
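The chart that follows lists one color per month by hand. As a minimal sketch, the same list could be generated from a mapping of highlighted months to colors. The names `highlight` and `DEFAULT_COLOR` are hypothetical, while the hex values are the ones used in this section:

```python
months = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# Highlighted months and the color shared with their keyword in the text
highlight = {
    "APR": "#ed1e24",
    "JUL": "#ec7c30",
    "AUG": "#ec7c30",
    "NOV": "#5d9bd1",
    "DEC": "#5d9bd1",
}
DEFAULT_COLOR = "#b0b0b0"  # light gray for every other month

# One color per month, in the same order as the rows of the table
color_range = [highlight.get(month, DEFAULT_COLOR) for month in months]
print(color_range)
```

The resulting list could be passed as `range` to `alt.Scale`, relying on `sort = None` to keep the months in table order, as the notebook's code does.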
bar_highlight_color = (
alt.Chart(
table,
title = alt.Title(
"2019 monthly voluntary attrition rate",
fontSize = 15,
anchor = "start",
offset = 10,
fontWeight = "normal"
),
)
.mark_bar(size = 20)
.encode(
x = alt.X(
"Date",
sort = None,
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
ticks = False,
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "2019"
),
y = alt.Y(
"Rate",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%",
tickCount = 10
),
scale = alt.Scale(domain = [0, 0.01]),
title = "ATTRITION RATE"
),
color = alt.Color(
"Date",
sort = None,
scale = alt.Scale(
range = [
"#b0b0b0",
"#b0b0b0", # Since there are multiple colors,
"#b0b0b0", # setting by condition would be harder
"#ed1e24",
"#b0b0b0",
"#b0b0b0",
"#ec7c30",
"#ec7c30",
"#b0b0b0",
"#b0b0b0",
"#5d9bd1",
"#5d9bd1"
]
),
legend = None
)
)
.properties(width = 300, height = 200)
)
# Blank texts with different position
text_april_blank2 = (
alt.Chart(
{
"values": [
{
"text": [
"Highlights:",
" ",
"In there was a",
"reorganization. No jobs",
"were eliminated, but many",
"people chose to leave.",
]
}
]
}
)
.mark_text(
size = 11,
align = "left",
dy = -25,
dx = -10
)
.encode(text = "text:N")
)
text_summer_blank2 = (
alt.Chart(
{
"values": [
{
"text": [
"Attrition rates tend to be",
"higher in the",
"months when it is",
"common for associates to",
"leave to go back to",
"school.",
]
}
]
}
)
.mark_text(
size = 11, align = "left",
dx = -10, dy = 65
)
.encode(text = "text:N")
)
text_nov_dec_blank2 = (
alt.Chart(
{
"values": [
{"text": [
"Attrition is typically low in",
" ",
"due to the holidays."
]}
]
}
)
.mark_text(
size = 11,
align = "left",
dx = -10,
dy = 155
)
.encode(text = "text:N")
)
# Colored texts
text_april_color = (
alt.Chart({"values": [{"text": ["April"]}]})
.mark_text(
size = 11,
align = "left",
dx = 3,
dy = 1,
fontWeight = 800,
color = "#ed1e24"
)
.encode(text = "text:N")
)
text_summer_color = (
alt.Chart({"values": [{"text": ["Summer"]}]})
.mark_text(
size = 11,
align = "left",
dx = 55,
dy = 78,
fontWeight = 800,
color = "#ec7c30"
)
.encode(text = "text:N")
)
text_nov_dec_color = (
alt.Chart({"values": [{"text": ["November & December"]}]})
.mark_text(
size = 11,
align = "left",
dx = -10,
dy = 168,
fontWeight = 800,
color = "#5d9bd1"
)
.encode(text = "text:N")
)
# While we could have used '&' to arrange the texts vertically,
# employing '+' provides greater flexibility in determining the layout of the text.
final_sim = bar_highlight_color | (
text_april_blank2
+ text_april_color
+ text_summer_blank2
+ text_summer_color
+ text_nov_dec_blank2
+ text_nov_dec_color
)
final_sim.configure_view(stroke=None)
Visualization as depicted in the book:

Enclosure
The "Enclosure Principle" says simply that, when objects are enclosed together, we perceive them as belonging to the same group.
Attempting to combine charts using the expression (bar | text) + rect_nov_dec + rect_summer + rect_april results in an error:
Concatenated charts cannot be layered. Instead, layer the charts before concatenating.
The most straightforward way to solve this is to add the text to the bar using bar + text, but doing so means assigning another position (dx, dy) to the text.
# Assign another position to text
text_enclosure = (
alt.Chart(
{
"values": [
{
"text": [
"Highlights:",
" ",
"In April there was a",
"reorganization. No jobs",
"were eliminated, but many",
"people chose to leave.",
" ",
"Attrition rates tend to be",
"higher in the Summer",
"months when it is",
"common for associates",
"to leave to go back to",
"school.",
" ",
"Attrition is typically low in",
"November and December",
"due to the holidays.",
]
}
]
}
)
.mark_text(
size = 11, align = "left",
dx = 160, dy = -113
)
.encode(text = "text:N")
)
# Defines the rectangles that are going to enclose the text
rect_nov_dec = (
alt.Chart(
pd.DataFrame({"y": [0], "y2": [0.0019], "x": [10], "x2": [8.4]})
)
.mark_rect(opacity = 0.2) # Low opacity
.encode(
y = "y",
y2 = "y2",
x = alt.X("x", axis = None),
x2 = "x2"
)
)
rect_summer = (
alt.Chart(
pd.DataFrame({
"y": [0.0023],
"y2": [0.0063],
"x": [10],
"x2": [5.1]
}))
.mark_rect(opacity = 0.2)
.encode(
y = "y",
y2 = "y2",
x = alt.X("x", axis = None),
x2 = "x2"
)
)
rect_april = (
alt.Chart(pd.DataFrame({
"y": [0.0068],
"y2": [0.0095],
"x": [10],
"x2": [2.6]
}))
.mark_rect(opacity = 0.2)
.encode(
y = "y",
y2 = "y2",
x = alt.X("x", axis = None),
x2 = "x2"
)
)
bar + text_enclosure + rect_nov_dec + rect_summer + rect_april
Utilizing a DataFrame to define the rectangles seems to prevent them from reaching the text section. As a next step, we will explicitly define the coordinates of the rectangles in pixels.
# Rectangles with position defined by pixels
rect_nov_dec = alt.Chart(
pd.DataFrame({'values':[{}]})
).mark_rect(
opacity = 0.2,
color = "#b0b0b0"
).encode(
y = alt.value(5),
y2 = alt.value(60),
x = alt.value(75),
x2 = alt.value(440)
)
rect_summer = alt.Chart(
pd.DataFrame({'values':[{}]})
).mark_rect(
opacity = 0.2,
color = "#b0b0b0"
).encode(
y = alt.value(70),
y2 = alt.value(150),
x = alt.value(150),
x2 = alt.value(440)
)
rect_april = alt.Chart(
pd.DataFrame({'values':[{}]})
).mark_rect(
opacity = 0.2,
color = "#b0b0b0"
).encode(
y = alt.value(160),
y2 = alt.value(202),
x = alt.value(250),
x2 = alt.value(440)
)
# text_enclosure and bar come after the rectangles so that they sit on top
final_enc = (
rect_nov_dec + rect_summer +
rect_april + bar + text_enclosure
)
final_enc.configure_view(stroke = None)
Visualization as depicted in the book:

Enclosure with color differentiation
We can use color to emphasize the different enclosures.
# Colored rectangles
rect_nov_dec_color = (
alt.Chart(pd.DataFrame({"values": [{}]}))
.mark_rect(opacity = 0.2, color = "#ed1e24")
.encode(
y = alt.value(5),
y2 = alt.value(60),
x = alt.value(75),
x2 = alt.value(450)
)
)
rect_summer_color = (
alt.Chart(pd.DataFrame({"values": [{}]}))
.mark_rect(opacity = 0.2, color = "#ec7c30")
.encode(
y = alt.value(70),
y2 = alt.value(150),
x = alt.value(150),
x2 = alt.value(450)
)
)
rect_april_color = (
alt.Chart(pd.DataFrame({"values": [{}]}))
.mark_rect(opacity = 0.2, color = "#5d9bd1")
.encode(
y = alt.value(160),
y2 = alt.value(202),
x = alt.value(250),
x2 = alt.value(450)
)
)
final_enc_color = (
rect_nov_dec_color +
rect_summer_color +
rect_april_color +
bar +
text_enclosure
)
final_enc_color.configure_view(stroke = None)
Visualization as depicted in the book:

Enclosure + Similarity
We can make use of both Enclosure and Similarity principles. First, we will try to add already existing components.
bar_highlight_color + rect_april_color | (
text_april_blank2 + text_april_color
+ text_summer_blank2 + text_summer_color
+ text_nov_dec_blank2 + text_nov_dec_color
)
bar_highlight_color | (
text_april_blank2 + text_april_color
+ text_summer_blank2 + text_summer_color
+ text_nov_dec_blank2 + text_nov_dec_color
) + rect_april_color
Since attempting to layer only already assigned variables seems to be ineffective, we will recreate the texts using alternative positions to enable their addition using the + symbol.
# Same texts, different dx and dy values
text_april_blank2_enclosure = (
alt.Chart(
{
"values": [
{
"text": [
"Highlights:",
" ",
"In there was a",
"reorganization. No jobs",
"were eliminated, but many",
"people chose to leave."
]
}
]
}
)
.mark_text(
size = 11,
align = "left",
dx = 160,
dy = -113
)
.encode(text = "text:N")
)
text_summer_blank2_enclosure = (
alt.Chart(
{
"values": [
{
"text": [
"Attrition rates tend to be",
"higher in the",
"months when it is",
"common for associates to",
"leave to go back to",
"school.",
]
}
]
}
)
.mark_text(
size = 11,
align = "left",
dx = 160,
dy = -21
)
.encode(text = "text:N")
)
text_nov_dec_blank2_enclosure = (
alt.Chart(
{
"values": [
{"text": [
"Attrition is typically low in",
" ",
"due to the holidays."]}
]
}
)
.mark_text(
size = 11,
align = "left",
dx = 160,
dy = 68
)
.encode(text = "text:N")
)
text_april_color_enclosure = (
alt.Chart({"values": [{"text": ["April"]}]})
.mark_text(
size = 11,
align = "left",
dx = 172,
dy = -87,
fontWeight = 800,
color = "#ed1e24"
)
.encode(text = "text:N")
)
text_summer_color_enclosure = (
alt.Chart({"values": [{"text": ["Summer"]}]})
.mark_text(
size = 11,
align = "left",
dx = 225,
dy = -8,
fontWeight = 800,
color = "#ec7c30"
)
.encode(text = "text:N")
)
text_nov_dec_color_enclosure = (
alt.Chart({"values": [{"text": ["November & December"]}]})
.mark_text(
size = 11,
align = "left",
dx = 160,
dy = 82,
fontWeight = 800,
color = "#5d9bd1"
)
.encode(text = "text:N")
)
final_enc_sim = (
rect_nov_dec_color
+ rect_summer_color
+ rect_april_color
+ bar_highlight_color
+ text_april_blank2_enclosure
+ text_summer_blank2_enclosure
+ text_nov_dec_blank2_enclosure
+ text_april_color_enclosure
+ text_summer_color_enclosure
+ text_nov_dec_color_enclosure
)
final_enc_sim.configure_view(stroke = None)
Visualization as depicted in the book:

Connection
The "Connection Principle" relies on the fact that objects that are physically connected are often perceived as part of a single group. In this example, we will connect the texts and the data using lines.
# Rules connecting text with bar
rule_april = (
alt.Chart()
.mark_rule(point = {"fill": "gray"}) # With a dot at the end
.encode(
x = alt.value(102),
y = alt.value(45),
x2 = alt.value(300),
strokeWidth = alt.value(0.5)
)
)
rule_summer = (
alt.Chart()
.mark_rule(point = {"fill": "gray"})
.encode(
x = alt.value(202),
y = alt.value(105),
x2 = alt.value(300),
strokeWidth = alt.value(0.5)
)
)
# Part one of the November/December rule
rule_nov_dec_1 = (
alt.Chart()
.mark_rule() # Without the dot
.encode(
x = alt.value(260),
y = alt.value(175),
x2 = alt.value(300),
strokeWidth = alt.value(0.5)
)
)
# Part two of the November/December rule
rule_nov_dec_2 = (
alt.Chart()
.mark_rule(point = {"fill": "gray"}) # With the dot
.encode(
x = alt.value(260),
y = alt.value(185),
y2 = alt.value(175),
strokeWidth = alt.value(0.5)
)
)
final_conn = (
bar +
text_enclosure +
rule_april +
rule_summer +
rule_nov_dec_1 +
rule_nov_dec_2
)
final_conn.configure_view(stroke = None)
Visualization as depicted in the book:

Connection + Similarity
Now we can use the connection and the similarity principles to connect highlighted texts with colored data.
final_conn_sim = (
bar_highlight_color
+ text_april_blank2_enclosure
+ text_summer_blank2_enclosure
+ text_nov_dec_blank2_enclosure
+ text_april_color_enclosure
+ text_summer_color_enclosure
+ text_nov_dec_color_enclosure
+ rule_april
+ rule_summer
+ rule_nov_dec_1
+ rule_nov_dec_2
)
final_conn_sim.configure_view(stroke = None)
Visualization as depicted in the book:

Interactivity
In designing this interactive graph, we opted to apply the Gestalt Principle of Proximity through tooltips. Rather than displaying the text next to the bars it describes, the information will appear when the user hovers over the respective bars.
Having experimented with more sophisticated code solutions without achieving success, we opted for an alternative approach. We created a "base" bar in light gray without any tooltips. Subsequently, for each of the three groups of bars, we crafted distinct charts containing only the data relevant to the tooltips, in darker gray. Consequently, the final graph is composed of four charts added on top of each other.
# Base graph
bar_base = (
alt.Chart(
table,
title = alt.Title(
"2019 monthly voluntary attrition rate",
fontSize = 15,
anchor = "start",
offset = 10,
fontWeight = "normal"
)
)
.mark_bar(size = 20)
.encode(
x = alt.X(
"Date",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleX = 12,
labelColor = "#888888",
titleColor = "#888888",
ticks = False,
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "2019"
),
y = alt.Y(
"Rate",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%",
tickCount = 10
),
scale = alt.Scale(domain = [0, 0.01]),
title = "ATTRITION RATE",
),
color = alt.Color( # Light gray
"Date",
sort = None,
scale = alt.Scale(range = ["#b0b0b0"]),
legend = None
)
)
.properties(width = 300, height = 200)
)
# Graph for April
bar_apr = (
alt.Chart(table)
.mark_bar(size = 20)
.encode(
x = alt.X("Date", sort = None),
y = alt.Y(
"Rate",
scale = alt.Scale(domain = [0, 0.01])
),
color = alt.value("#666666"), # Darker gray
tooltip = alt.value(
"In April there was a reorganization. No jobs were eliminated, but many people chose to leave."
) # Text for April
)
.transform_filter(
alt.FieldEqualPredicate(
field = "Date",
equal = "APR"
)
) # Only data for April
)
# Graph for summer
bar_jul_aug = (
alt.Chart(table)
.mark_bar(size = 20)
.encode(
x = alt.X(
"Date",
sort = None
),
y = alt.Y(
"Rate",
scale = alt.Scale(domain = [0, 0.01])
),
color = alt.value("#666666"), # Darker gray
tooltip = alt.value(
"Attrition rates tend to be higher in the Summer months when it is common for associates to leave to go back to school."
) # Text for summer
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Date",
oneOf = ["JUL", "AUG"])
) # Filter summer data
)
# Graph for end of year
bar_dec_nov = (
alt.Chart(table)
.mark_bar(size = 20)
.encode(
x = alt.X("Date", sort = None),
y = alt.Y("Rate", scale = alt.Scale(domain = [0, 0.01])),
color = alt.value("#666666"), # Darker gray
tooltip = alt.value(
"Attrition is typically low in November and December due to the holidays."
) # Text for the end of year
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Date",
oneOf = ["NOV", "DEC"]
)
)
)
final_tooltip = (
bar_base +
bar_apr +
bar_jul_aug +
bar_dec_nov
)
final_tooltip.configure_view(stroke = None)
Chapter 4 - Focus Attention
"Where do you want your audience to look?" - Cole Nussbaumer Knaflic
Loading the data
# Loading table
table = pd.read_excel(r"Data\4.2 EXERCISE.xlsx", usecols = [1, 2, 3], header = 5, skipfooter = 30)
# Fixing names
table['Brands'] = table['Unnamed: 1']
table['Change'] = table['$ Vol % change']
table.drop(columns = ['Unnamed: 1', '$ Vol % change'], inplace = True)
table
| | spacing for dot plot | Brands | Change |
|---|---|---|---|
| 0 | 0 | Fran's Recipe | -0.14 |
| 1 | 1 | Wholesome Goodness | -0.13 |
| 2 | 2 | Lifestyle | -0.10 |
| 3 | 3 | Coat protection | -0.09 |
| 4 | 4 | Diet Lifestyle | -0.08 |
| 5 | 5 | Feline Basics | -0.05 |
| 6 | 6 | Lifestyle Plus | -0.04 |
| 7 | 7 | Feline Freedom | -0.02 |
| 8 | 8 | Feline Gold | 0.01 |
| 9 | 9 | Feline Platinum | 0.01 |
| 10 | 10 | Feline Instinct | 0.02 |
| 11 | 11 | Feline Pro | 0.03 |
| 12 | 12 | Farm Fresh Tasties | 0.04 |
| 13 | 13 | Feline Royal | 0.05 |
| 14 | 14 | Feline Focus | 0.09 |
| 15 | 15 | Feline Grain Free | 0.09 |
| 16 | 16 | Feline Silver | 0.12 |
| 17 | 17 | Nutri Balance | 0.16 |
| 18 | 18 | Farm Fresh Basics | 0.17 |
Graph without highlights
Initially, we will show the graph in a light gray color. Notice that not using sort = None results in the brands being arranged in alphabetical order.
# Brands in alphabetical order
alt.Chart(table).mark_bar().encode(
x = "Change",
y = "Brands"
)
# "Not sorted" graph
alt.Chart(table).mark_bar().encode(
x = "Change",
y = alt.Y("Brands", sort = None)
)
# Customized graph
chart = (
alt.Chart(
table,
title = alt.Title(
"Cat food brands: YoY sales change",
subtitle = "% CHANGE IN VOLUME ($)", # Add subtitle
color = "black",
subtitleColor = "gray",
offset = 10,
anchor = "start",
fontSize = 19,
subtitleFontSize = 11,
fontWeight = "normal"
),
)
.mark_bar(color = "#8b8b8b", size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]), # x-axis from -20% to 20%
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "DECREASED | INCREASED"
),
y = alt.Y("Brands", sort = None, axis = None)
)
)
# Brand names that go on the right
label1 = (
alt.Chart(table.loc[table["Change"] < 0])
.mark_text(
align = "left",
color = "#8b8b8b",
fontWeight = 700
)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands")
)
)
# Brand names that go on the left
label2 = (
alt.Chart(table.loc[table["Change"] > 0])
.mark_text(
align = "right",
color = "#8b8b8b",
fontWeight = 700
)
.encode(
x = alt.value(192),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands")
)
)
gray = chart + label1 + label2
gray.properties(width = 400).configure_view(stroke = None)
Visualization as depicted in the book:

Highlighting the Lifestyle brands
Following this, the author suggests highlighting a specific brand of cat food, the Lifestyle line. All labels and bars associated with this brand will be colored black to captivate the viewer's attention.
# Creates a list of conditions, one per brand in the Lifestyle line
conditions = [
f'datum.Brands == "{brand}"' for brand in table["Brands"] if "Lifestyle" in brand
]
condition = f"({'|'.join(conditions)})"
# Create the chart
chart = (
alt.Chart(
table,
title = alt.Title(
"Cat food brands:", # Title
subtitle = "YEAR-OVER-YEAR % CHANGE IN VOLUME ($)",
color = "black",
subtitleColor = "gray",
anchor = "start",
fontSize = 19,
subtitleFontSize = 11,
fontWeight = "normal"
)
)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "DECREASED | INCREASED"
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition( # If brand in Lifestyle, then color equals black
condition,
alt.value("black"),
alt.value("#c6c6c6")
)
)
)
# Labels to the right
label1_bw = (
alt.Chart(table.loc[table["Change"] < 0])
.mark_text(align = "left", fontWeight = 700)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
# If brand in Lifestyle, then label equals black
color = alt.condition(
condition,
alt.value("black"),
alt.value("#c6c6c6")
)
)
)
# Labels to the left
# There are no Lifestyle brands in this category
label2_bw = (
alt.Chart(table.loc[table["Change"] > 0])
.mark_text(
align = "right",
color = "#c6c6c6",
fontWeight = 700
)
.encode(
x = alt.value(192),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands")
)
)
# Add bold part of the title separately
title_bw = (
alt.Chart({"values": [
{"text": ["Lifestyle line brands decline"]}
]})
.mark_text(
size = 19,
align = "left",
dx = 172,
dy = -250,
fontWeight = 700,
color = "black"
)
.encode(text = "text:N")
)
(chart + label1_bw + label2_bw + title_bw).properties(
width = 400
).configure_view(
stroke = None
)
Because placing the bold part of the title next to the chart's built-in title did not work, we will instead generate the chart without a title and add both text fragments as separate layers.
# Graph without title
chart_bw = alt.Chart(table).mark_bar(
size = 15
).encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False, orient = "top",
labelColor = "#888888", titleColor = '#888888',
titleFontWeight = 'normal', format = "%"
),
title = "DECREASED | INCREASED"
),
y = alt.Y(
"Brands",
sort = None,
axis = None
),
color = alt.condition(
condition,
alt.value('black'),
alt.value('#c6c6c6')
)
)
# Part one of title
title_bw = (
alt.Chart({"values": [
{"text": ["Cat food brands:"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -200,
dy = -270,
fontWeight = "normal",
color = "black"
)
.encode(text = "text:N")
)
# Part two of title
title_bw_bold = (
alt.Chart({"values": [
{"text": ["Lifestyle line brands decline"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -78,
dy = -270,
fontWeight = 700,
color = "black"
)
.encode(text = "text:N")
)
# Subtitle
subtitle_bw = (
alt.Chart({"values": [
{"text": ["YEAR-OVER-YEAR % CHANGE IN VOLUME ($)"]}
]})
.mark_text(
size = 11,
align = "left",
dx = -200,
dy = -250,
fontWeight = "normal",
color = "gray"
)
.encode(text = "text:N")
)
lifestyle = (
chart_bw +
label1_bw +
label2_bw +
title_bw +
title_bw_bold +
subtitle_bw
)
lifestyle.properties(width = 400).configure_view(stroke = None)
Visualization as depicted in the book:

Highlighting the Feline brand
In the next step, our focus is on highlighting the Feline brand line. The difference is that we know the brand uses a purple logo, so we can incorporate this color into our design.
# Set condition to list of Feline products
conditions = [
f'datum.Brands == "{brand}"' for brand in table["Brands"] if "Feline" in brand
]
condition_purple = f"({'|'.join(conditions)})"
# Highlights Feline products with purple color
chart_purple = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "DECREASED | INCREASED"
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_purple,
alt.value("#713a97"),
alt.value("#c6c6c6")
)
)
)
# Labels to the right
label1_purple = (
alt.Chart(
table.loc[table["Change"] < 0]
)
.mark_text(
align = "left",
fontWeight = 700
)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.condition(
condition_purple,
alt.value("#713a97"),
alt.value("#c6c6c6")
)
)
)
# Labels to the left
# Now we need to add the condition, as there are Feline products in this category
label2_purple = (
alt.Chart(
table.loc[table["Change"] > 0]
)
.mark_text(
align = "right",
fontWeight = 700
)
.encode(
x = alt.value(192),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.condition(
condition_purple,
alt.value("#713a97"),
alt.value("#c6c6c6")
)
)
)
# Second title in purple
title_purple = (
alt.Chart({"values": [
{"text": ["most in Feline line increased"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -78,
dy = -270,
fontWeight = 700,
color = "#713a97"
)
.encode(text = "text:N")
)
feline = (
chart_purple +
label1_purple +
label2_purple +
title_bw +
title_purple +
subtitle_bw
)
feline.properties(width = 400).configure_view(stroke = None)
Visualization as depicted in the book:

In the displayed chart from the book, there is an error where the bar corresponding to "Feline Gold" cat food appears light gray instead of the correct purple, consistent with the labeling. A request for information on the process of formally submitting an erratum has been made.
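The Lifestyle and Feline conditions are built the same way, so the string construction could be factored out. The sketch below is an illustration under assumptions: `line_predicate` is a hypothetical name, the brand list is a small stand-in for `table["Brands"]`, and the tests are joined with `||`, the logical OR of Vega expressions.

```python
def line_predicate(brands, keyword, field="Brands"):
    """Build a Vega expression that is true for brands whose name contains `keyword`."""
    tests = [f'datum.{field} == "{brand}"' for brand in brands if keyword in brand]
    if not tests:
        # No brand matched: return an expression that is never true
        return "false"
    return "(" + " || ".join(tests) + ")"


# Stand-in for table["Brands"], using brand names mentioned in this exercise
brands = ["Lifestyle", "Lifestyle Plus", "Diet Lifestyle", "Feline Gold"]

condition_lifestyle = line_predicate(brands, "Lifestyle")
condition_feline = line_predicate(brands, "Feline")
print(condition_feline)
```

The resulting strings can be passed to `alt.condition` in the same way as the `condition` and `condition_purple` variables.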
Highlighting brands that declined
Moving forward, we will selectively color the brands that experienced a decline in sales. Recognizing the accessibility challenges associated with the conventional red and green combination, the author suggests using orange and blue to cater to individuals with colorblindness.
# Condition based on negative values of Change
condition = "datum.Change < 0"
# Chart with orange color
chart_orange = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y(
"Brands",
sort = None,
axis = None
),
color = alt.condition(
condition,
alt.value("#ec7c30"),
alt.value("#c6c6c6")
)
)
)
# The right label is for decreased brands
# so we will color all of it with orange
label1_orange = (
alt.Chart(
table.loc[table["Change"] < 0]
)
.mark_text(
align = "left",
fontWeight = 700
)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.value("#ec7c30")
)
)
# There is no need to create labeling for the left
# it will all be gray and we already have that variable
# Second title in orange
title_orange = (
alt.Chart({"values": [
{"text": ["8 brands decreased in sales"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -78,
dy = -270,
fontWeight = 700,
color = "#ec7c30"
)
.encode(text = "text:N")
)
# "DECREASED" text in orange, serves as a label
decreased_orange = (
alt.Chart({"values": [
{"text": ["DECREASED"]}
]})
.mark_text(
size = 11,
align = "left",
dx = -80,
dy = -220,
fontWeight = 700,
color = "#ec7c30"
)
.encode(text = "text:N")
)
# "INCREASED" text in gray
increased_gray = (
alt.Chart({"values": [
{"text": ["| INCREASED"]}
]})
.mark_text(
size = 11,
align = "left",
dx = 0,
dy = -220,
fontWeight = 700,
color = "#8b8b8b"
)
.encode(text = "text:N")
)
decreased = (
chart_orange
+ label1_orange
+ label2_bw
+ title_bw
+ title_orange
+ subtitle_bw
+ decreased_orange
+ increased_gray
)
decreased.properties(width = 400).configure_view(stroke = None)
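The counts in the titles are typed by hand in this notebook. As a hedged alternative, they could be derived from the data so the text stays correct if the table changes. The sketch below uses a made-up toy table, not the exercise data, and treats a change of exactly zero as "flat".

```python
import pandas as pd

# Toy data standing in for the exercise table (values are made up)
toy = pd.DataFrame({
    "Brands": ["A", "B", "C", "D"],
    "Change": [-0.12, -0.03, 0.00, 0.07],
})

# Count the brands on each side of zero
n_decreased = int((toy["Change"] < 0).sum())
n_flat_or_up = int((toy["Change"] >= 0).sum())

# Title fragments built from the counts
title_decreased = f"{n_decreased} brands decreased in sales"
title_increased = f"{n_flat_or_up} brands flat to increasing"
print(title_decreased)
print(title_increased)
```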
Visualization as depicted in the book:

Highlighting the two brands that decreased the most
To achieve this, the author has opted to maintain the coloration of all brands that experienced a decrease while placing additional emphasis on the two with the most significant declines by using a brighter shade of orange compared to the others.
# Create a list of brands that decreased the most
decreased_most = table.nsmallest(2, "Change")
brands_decreased = decreased_most["Brands"].tolist()
# Create a condition based on that list
conditions = [f'datum.Brands == "{brand}"' for brand in brands_decreased]
condition = f"({'|'.join(conditions)})"
# Create a list of brands with positive changes for auxiliary graph
positive_brands = table.loc[table["Change"] > 0, "Brands"].unique()
positive_brands_list = positive_brands.tolist()
# Chart of brands that decreased
chart_oranges = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
# If the Brand is in the list of decreased most,
# color it orange - else, color it light orange
color = alt.condition(
condition,
alt.value("#ec7c30"),
alt.value("#efb284")
)
)
)
# Auxiliary graph with gray bottom
chart_oranges2 = (
alt.Chart(table)
# Color equals gray
.mark_bar(
size = 15,
color = "#c6c6c6",
opacity = 1
)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y(
"Brands",
sort = None,
axis = None
)
)
.transform_filter( # Only positive brands
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = positive_brands_list
)
)
)
# Label to the right
label1_oranges = (
alt.Chart(
table.loc[table["Change"] < 0]
)
.mark_text(align = "left", fontWeight = 700)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
# Color it orange or light orange
color = alt.condition(
condition,
alt.value("#ec7c30"),
alt.value("#efb284")
)
)
)
# Again, no need for new labels for the left
# Second title in orange
title_oranges = (
alt.Chart({"values": [
{"text": ["2 brands decreased the most"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -78,
dy = -270,
fontWeight = 700,
color = "#ec7c30"
)
.encode(text = "text:N")
)
decreased2 = (
chart_oranges
+ chart_oranges2
+ label1_oranges
+ label2_bw
+ title_bw
+ title_oranges
+ subtitle_bw
+ decreased_orange
+ increased_gray
)
decreased2.properties(width = 400).configure_view(stroke = None)
Visualization as depicted in the book:

Highlighting brands that increased
This code closely resembles the one used to highlight all brands that experienced a decrease, with modifications made to the condition and color, now blue.
# Change the sign of the condition from < to >
condition = "datum.Change > 0"
# Graph highlighting increased brands
chart_blue = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
# If brand increased, color it blue, else, color it gray
color = alt.condition(
condition,
alt.value("#4772b8"),
alt.value("#c6c6c6")
)
)
)
# Create labels to the left
label2_blue = (
alt.Chart(
table.loc[table["Change"] > 0]
)
.mark_text(
align = "right",
fontWeight = 700
)
.encode(
x = alt.value(192),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.condition(
condition,
alt.value("#4772b8"),
alt.value("#c6c6c6")
)
)
)
# This time there is no need to create labels on the right
# Second title in blue
title_blue = (
alt.Chart({"values": [
{"text": ["11 brands flat to increasing"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -78,
dy = -270,
fontWeight = 700,
color = "#4772b8"
)
.encode(text = "text:N")
)
# "DECREASED" text in gray
decreased_gray = (
alt.Chart({"values": [
{"text": ["DECREASED |"]}
]})
.mark_text(
size = 11,
align = "left",
dx = -80,
dy = -220,
fontWeight = 700,
color = "#8b8b8b"
)
.encode(text = "text:N")
)
# "INCREASED" text in blue
increased_blue = (
alt.Chart({"values": [{"text": ["INCREASED"]}]})
.mark_text(
size = 11,
align = "left",
dx = 20,
dy = -220,
fontWeight = 700,
color = "#4772b8"
)
.encode(text = "text:N")
)
increased = (
chart_blue
+ label1
+ label2_blue
+ title_bw
+ title_blue
+ subtitle_bw
+ decreased_gray
+ increased_blue
)
increased.properties(width = 400).configure_view(stroke = None)
Visualization as depicted in the book:

Final view with all highlights
For the last question, the author explores the creation of a conclusive slide that highlights all the points discussed above. For this, we will create two graphs: one with both the Feline and Lifestyle brands, and one showcasing increased and decreased sales. The author initially paired the visualizations with text, but since that manual, labor-intensive method has already been explored, we will not add further textual elements.
# Graph with both brands
# Lifestyle brand
chart_bw2 = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "DECREASED | INCREASED"
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.value("black") # Color black with no condition
)
.transform_filter( # Filter to be only lifestyle products
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = ["Lifestyle", "Lifestyle Plus", "Diet Lifestyle"]
)
)
)
# We can reuse the right labeling for the Lifestyle brand
# Second title in black
title_bw_bold_2 = (
alt.Chart({"values": [
{"text": ["mixed results in sales year-over-year"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -78,
dy = -270,
fontWeight = 700,
color = "black"
)
.encode(text = "text:N")
)
# Creates list of all brands that are not Lifestyle
not_lifestyle = table[
~table["Brands"].isin(["Lifestyle", "Lifestyle Plus", "Diet Lifestyle"])
]
not_lifestyle = not_lifestyle["Brands"].tolist()
# Create the labeling in the right for the Feline brand in purple
# If we reuse the one we already have, it will overwrite the labels in black
label1_purple_2 = (
alt.Chart(
table.loc[table["Change"] < 0]
)
.mark_text(
align = "left",
fontWeight = 700
)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.condition(
condition_purple,
alt.value("#713a97"),
alt.value("#c6c6c6")
)
)
# Filter out the Lifestyle brand so that it is not overwritten
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = not_lifestyle
)
)
)
# Since there is no Lifestyle brand labeling on the left,
# there is no need to create a new left labeling for Feline
mixed = (
chart_purple
+ chart_bw2
+ label1_bw
+ label1_purple_2
+ label2_purple
+ title_bw
+ title_bw_bold_2
+ subtitle_bw
)
mixed.properties(width = 400).configure_view(stroke = None)
Visualization as depicted in the book:

# Create lists of the most increased and most decreased brands
decreased_most = table.nsmallest(2, "Change")
increased_most = table.nlargest(2, "Change")
brands_decreased = decreased_most["Brands"].tolist()
brands_increased = increased_most["Brands"].tolist()
# Create a condition for brands that increased and decreased
conditions_decreased = [f'datum.Brands == "{brand}"' for brand in brands_decreased]
condition_decreased = f"({'|'.join(conditions_decreased)})"
conditions_increased = [f'datum.Brands == "{brand}"' for brand in brands_increased]
condition_increased = f"({'|'.join(conditions_increased)})"
# Create a gray graph representing middle brands
chart_gray = (
alt.Chart(table)
.mark_bar(color = "#c6c6c6", size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y(
"Brands",
sort = None,
axis = None
)
)
)
# Labels on the right for those middle brands
label1_gray = (
alt.Chart(
table.loc[table["Change"] < 0]
)
.mark_text(
align = "left",
color = "#c6c6c6",
fontWeight = 700
)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands")
)
)
# Labels on the left for those middle brands
label2_gray = (
alt.Chart(
table.loc[table["Change"] > 0]
)
.mark_text(
align = "right",
color = "#c6c6c6",
fontWeight = 700
)
.encode(
x = alt.value(192),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands")
)
)
# Graph for decreased values
chart_oranges_mix = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
# If on top 2 decreased, bright orange
# Else, light orange
condition_decreased,
alt.value("#ec7c30"),
alt.value("#efb284")
)
)
# Filter to only the five most decreased brands
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = list(table["Brands"][0:5])
)
)
)
# Labeling on the right for decreased brands
label_oranges = (
alt.Chart(
table.loc[table["Change"] < 0]
)
.mark_text(
align = "left",
fontWeight = 700
)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.condition(
# If on top 2 decreased, bright orange
# Else, light orange
condition_decreased,
alt.value("#ec7c30"),
alt.value("#efb284")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = list(table["Brands"][0:5])
)
)
)
# Graph for increased values
chart_blue_mix = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y(
"Brands",
sort = None,
axis = None
),
color = alt.condition(
# If on top 2 increased, color blue
# Else, color it light blue
condition_increased,
alt.value("#4772b8"),
alt.value("#91a9d5")
)
)
# Filter to only increased brands
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = list(table["Brands"][14:19])
)
)
)
# Labeling on the left for increased brands
label_blue = (
alt.Chart(
table.loc[table["Change"] > 0]
)
.mark_text(align = "right", fontWeight = 700)
.encode(
x = alt.value(192),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.condition(
condition_increased,
alt.value("#4772b8"),
alt.value("#91a9d5")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = list(table["Brands"][14:19])
)
)
)
# We will unite the "DECREASED" orange with the "INCREASED" blue
# So we will create the "|" separation in gray
separation = (
alt.Chart({"values": [
{"text": ["|"]}
]})
.mark_text(
size = 11,
align = "left",
dx = 3,
dy = -220,
fontWeight = 700,
color = "#c6c6c6"
)
.encode(text = "text:N")
)
mixed2 = (
chart_gray
+ chart_oranges_mix
+ chart_blue_mix
+ label1_gray
+ label2_gray
+ label_oranges
+ label_blue
+ title_bw
+ title_bw_bold_2
+ subtitle_bw
+ decreased_orange
+ increased_blue
+ separation
)
mixed2.properties(width = 400).configure_view(stroke = None)
Visualization as depicted in the book:

Although this exercise does not have an interactive version of the graph, we will discuss its color palette in a subsequent exercise, where an interactive version relating to the colors will be demonstrated.
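The positional slices used in the final view, `table["Brands"][0:5]` and `table["Brands"][14:19]`, appear to rely on the table being sorted by `Change` and having 19 rows. As a sketch that does not depend on row order, the same selections could be made with `nsmallest` and `nlargest`. The toy table below is deliberately unsorted and its values are made up.

```python
import pandas as pd

# Toy data, deliberately unsorted (values are made up)
toy = pd.DataFrame({
    "Brands": ["A", "B", "C", "D", "E", "F"],
    "Change": [0.05, -0.12, 0.01, -0.02, 0.09, -0.07],
})

n = 2  # number of brands to pick at each extreme

# Brands with the largest declines and the largest increases
bottom = toy.nsmallest(n, "Change")["Brands"].tolist()
top = toy.nlargest(n, "Change")["Brands"].tolist()
print(bottom, top)
```

Either list can be passed to `alt.FieldOneOfPredicate(field = "Brands", oneOf = ...)` in place of the slices.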
Loading the data
# Load Excel file
table = pd.read_excel(r"Data\4.3 EXERCISE.xlsx", usecols = [1, 2, 3, 4], header = 5, skipfooter = 5)
table
| | YEAR | Total | Organic | Referral |
|---|---|---|---|---|
| 0 | 2005 | 0.087 | 0.033 | 0.054 |
| 1 | 2006 | 0.083 | 0.035 | 0.048 |
| 2 | 2007 | 0.086 | 0.037 | 0.049 |
| 3 | 2008 | 0.089 | 0.036 | 0.053 |
| 4 | 2009 | 0.084 | 0.034 | 0.050 |
| 5 | 2010 | 0.086 | 0.031 | 0.055 |
| 6 | 2011 | 0.075 | 0.032 | 0.043 |
| 7 | 2012 | 0.072 | 0.035 | 0.037 |
| 8 | 2013 | 0.069 | 0.032 | 0.037 |
| 9 | 2014 | 0.074 | 0.038 | 0.036 |
| 10 | 2015 | 0.066 | 0.035 | 0.031 |
| 11 | 2016 | 0.080 | 0.045 | 0.035 |
| 12 | 2017 | 0.073 | 0.041 | 0.032 |
| 13 | 2018 | 0.070 | 0.042 | 0.028 |
| 14 | 2019 | 0.063 | 0.038 | 0.025 |
# Transform into the long format
melted_table = pd.melt(
table,
id_vars = ['YEAR'],
var_name = 'Metric',
value_name = 'Value'
)
melted_table["Metric"] = melted_table["Metric"].str.upper()
melted_table
| | YEAR | Metric | Value |
|---|---|---|---|
| 0 | 2005 | TOTAL | 0.087 |
| 1 | 2006 | TOTAL | 0.083 |
| 2 | 2007 | TOTAL | 0.086 |
| 3 | 2008 | TOTAL | 0.089 |
| 4 | 2009 | TOTAL | 0.084 |
| 5 | 2010 | TOTAL | 0.086 |
| 6 | 2011 | TOTAL | 0.075 |
| 7 | 2012 | TOTAL | 0.072 |
| 8 | 2013 | TOTAL | 0.069 |
| 9 | 2014 | TOTAL | 0.074 |
| 10 | 2015 | TOTAL | 0.066 |
| 11 | 2016 | TOTAL | 0.080 |
| 12 | 2017 | TOTAL | 0.073 |
| 13 | 2018 | TOTAL | 0.070 |
| 14 | 2019 | TOTAL | 0.063 |
| 15 | 2005 | ORGANIC | 0.033 |
| 16 | 2006 | ORGANIC | 0.035 |
| 17 | 2007 | ORGANIC | 0.037 |
| 18 | 2008 | ORGANIC | 0.036 |
| 19 | 2009 | ORGANIC | 0.034 |
| 20 | 2010 | ORGANIC | 0.031 |
| 21 | 2011 | ORGANIC | 0.032 |
| 22 | 2012 | ORGANIC | 0.035 |
| 23 | 2013 | ORGANIC | 0.032 |
| 24 | 2014 | ORGANIC | 0.038 |
| 25 | 2015 | ORGANIC | 0.035 |
| 26 | 2016 | ORGANIC | 0.045 |
| 27 | 2017 | ORGANIC | 0.041 |
| 28 | 2018 | ORGANIC | 0.042 |
| 29 | 2019 | ORGANIC | 0.038 |
| 30 | 2005 | REFERRAL | 0.054 |
| 31 | 2006 | REFERRAL | 0.048 |
| 32 | 2007 | REFERRAL | 0.049 |
| 33 | 2008 | REFERRAL | 0.053 |
| 34 | 2009 | REFERRAL | 0.050 |
| 35 | 2010 | REFERRAL | 0.055 |
| 36 | 2011 | REFERRAL | 0.043 |
| 37 | 2012 | REFERRAL | 0.037 |
| 38 | 2013 | REFERRAL | 0.037 |
| 39 | 2014 | REFERRAL | 0.036 |
| 40 | 2015 | REFERRAL | 0.031 |
| 41 | 2016 | REFERRAL | 0.035 |
| 42 | 2017 | REFERRAL | 0.032 |
| 43 | 2018 | REFERRAL | 0.028 |
| 44 | 2019 | REFERRAL | 0.025 |
Graph without highlights
First, we will create the graph without bringing attention to any specific data.
# Line graph
line_simple = (
alt.Chart(
melted_table,
title = alt.Title(
"Conversion rate over time",
fontWeight = "normal",
anchor = "start",
fontSize = 17
)
)
.mark_line(strokeWidth = 3) # Set thickness of the line
.encode(
x = alt.X(
"YEAR:O", # Treat YEAR as ordinal data
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0) # Align horizontally
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1]) # y-axis goes from 0 to 0.1
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#aaaaaa"]),
legend = None
) # All lines gray
)
.properties(width = 500)
)
# Labeling for the lines
text_simple = (
alt.Chart(melted_table)
.mark_text(
align = "left",
dx = 20,
size = 13,
color = "#aaaaaa"
)
.encode(
# Place text at the end of the line
x = alt.X(
"YEAR",
aggregate = "max",
axis = None
),
y = alt.Y(
"Value",
aggregate = {"argmax": "YEAR"}
),
text = "Metric"
)
)
gray_line = line_simple + text_simple
gray_line.configure_view(stroke = None)
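The text layer above places each label with `aggregate = "max"` on `YEAR` and `aggregate = {"argmax": "YEAR"}` on `Value`, which picks the value at the latest year for each metric. A minimal pandas sketch of the same lookup, using a few rows copied from the melted table, may make the idea easier to check:

```python
import pandas as pd

# A few rows copied from the melted table above
toy = pd.DataFrame({
    "YEAR": [2018, 2019, 2018, 2019],
    "Metric": ["TOTAL", "TOTAL", "REFERRAL", "REFERRAL"],
    "Value": [0.070, 0.063, 0.028, 0.025],
})

# For each metric, keep the row with the largest YEAR
last_rows = toy.loc[toy.groupby("Metric")["YEAR"].idxmax()]

# Map each metric to the value where its label should sit
labels = dict(zip(last_rows["Metric"], last_rows["Value"]))
print(labels)
```

A pre-computed table like `last_rows` could also serve as the data source for the text layer, which would avoid the aggregates altogether.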
Visualization as depicted in the book:

Not implemented versions
Before going through the generated graphs, it is important to acknowledge that two highlight options, Arrows and Circles, were not implemented for this project. The decision not to include them stemmed from the difficulty of executing them in Altair and their perceived lack of effectiveness in conveying the intended information, as the author herself admits by calling them "brute force options".
Arrow highlight in the book:

Circle highlight in the book:

Transparent white boxes
In constructing this graph, the author employed transparent white boxes to obscure lines other than Referral, aiming to reduce their visibility and emphasize the desired data. However, a more elegant approach in Altair involves achieving the same effect by simply adjusting the opacity of each line.
# Create line graph
line = (
alt.Chart(
melted_table,
title = alt.Title(
"Conversion rate over time",
fontWeight = "normal",
anchor = "start",
fontSize = 17
)
)
.mark_line(strokeWidth = 3)
.encode(
x = alt.X(
"YEAR:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%",
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1])
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#aaaaaa"]),
legend = None
),
opacity = alt.condition(
# If Metric equals Referral, it has maximum opacity
# Otherwise, it is slightly transparent
alt.datum["Metric"] == "REFERRAL",
alt.value(1),
alt.value(0.5)
)
)
.properties(width = 500)
)
# Text for the end of each line
text = (
alt.Chart(melted_table)
.mark_text(
align = "left",
baseline = "middle",
dx = 20,
size = 13,
color = "#aaaaaa"
)
.encode(
x = alt.X(
"YEAR:O",
aggregate = "max",
axis = None
),
y = alt.Y(
"Value",
aggregate = {"argmax": "YEAR"}
),
text = "Metric",
opacity = alt.condition(
# If Metric equals Referral, it has maximum opacity
# Otherwise, it is slightly transparent
alt.datum["Metric"] == "REFERRAL",
alt.value(1),
alt.value(0.5)
)
)
)
final_boxes = line + text
final_boxes.configure_view(stroke = None)
Visualization as depicted in the book:

White boxes created by the author:

Thicken the line
Next, we will increase the thickness of the Referral line to draw greater attention to it.
# Line graph with thicker line
line = (
alt.Chart(
melted_table,
title = alt.Title(
"Conversion rate over time",
fontWeight = "normal",
anchor = "start",
fontSize = 17
)
)
.mark_line()
.encode(
x = alt.X(
"YEAR:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1])
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#aaaaaa"]),
legend = None
),
strokeWidth = alt.condition(
# Make Referral line thicker
alt.datum["Metric"] == "REFERRAL",
alt.value(4),
alt.value(2)
)
)
.properties(width = 500)
)
# Text at the end of the line
text = (
alt.Chart(melted_table)
.mark_text(
align = "left",
baseline = "middle",
dx = 20,
size = 13,
color = "#aaaaaa"
)
.encode(
x = alt.X(
"YEAR",
aggregate = "max",
axis = None
),
y = alt.Y(
"Value",
aggregate = {"argmax": "YEAR"}
),
text = "Metric",
opacity = alt.condition(
# Make the Referral label more opaque than the others
alt.datum["Metric"] == "REFERRAL",
alt.value(1),
alt.value(0.7)
)
)
)
final_thick = line + text
final_thick.configure_view(stroke = None)
Visualization as depicted in the book:

Change line style
We can change the Referral line to a dashed line while keeping the others solid. This type of highlight is recommended when representing uncertain data, such as a prediction or a goal.
# Line chart
line = (
alt.Chart(
melted_table,
title = alt.Title(
"Conversion rate over time",
fontWeight = "normal",
anchor = "start",
fontSize = 17
)
)
.mark_line(strokeWidth = 3)
.encode(
x = alt.X(
"YEAR:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1])
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#aaaaaa"]),
legend = None
),
strokeDash = alt.condition(
# Define dash based on condition if equal referral
alt.datum["Metric"] == "REFERRAL",
alt.value([5, 3]),
alt.value([1, 0])
)
)
.properties(width = 500)
)
# We will use the same text as the original version of the graph
final_dashed = line + text_simple
final_dashed.configure_view(stroke = None)
Visualization as depicted in the book:

Leverage intensity
As the opposite of making the other lines slightly transparent, we can make the Referral line darker in color.
The next part of the exercise is to position the Referral line visually on top of the others, which Altair already does by default here. For now, we will therefore intentionally place the Referral line below the others. This is achieved by creating the Referral line separately and adding it as the first layer of the graph.
# Line only for referral
line_referral = (
alt.Chart(melted_table)
.mark_line(strokeWidth = 3)
.encode(
x = alt.X(
"YEAR:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1])
),
color = alt.value("black"), # Color it black
opacity = alt.value(1) # Maximum opacity
)
.properties(width = 500)
# Filter out other lines
.transform_filter(
alt.FieldEqualPredicate(
field = "Metric",
equal = "REFERRAL"
)
)
)
# Other lines
line_rest = (
alt.Chart(
melted_table,
title = alt.Title(
"Conversion rate over time",
fontWeight = "normal",
anchor = "start",
fontSize = 17
)
)
.mark_line(strokeWidth = 3)
.encode(
x = alt.X(
"YEAR:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1])
),
# Gray color
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#aaaaaa"]),
legend = None
),
opacity = alt.value(1) # Maximum opacity
)
.properties(width = 500)
# Filter referral out
.transform_filter(
alt.FieldOneOfPredicate(
field = "Metric",
oneOf = ["ORGANIC", "TOTAL"]
)
)
)
# Create text that highlights Referral in black
text_highlight = (
alt.Chart(melted_table)
.mark_text(
align = "left",
dx = 20,
size = 13
)
.encode(
x = alt.X(
"YEAR",
aggregate = "max",
axis = None
),
y = alt.Y(
"Value",
aggregate = {"argmax": "YEAR"}
),
text = "Metric",
color = alt.condition(
alt.datum["Metric"] == "REFERRAL",
alt.value("black"),
alt.value("#aaaaaa")
)
)
)
final_darker = line_referral + line_rest + text_highlight
final_darker.configure_view(stroke = None)
Adding the text created an undesired axis on top, which we could not remove by setting axis = None. To work around this issue, we will position each label by pixel offsets instead.
# Text for Total
text_total = alt.Chart({"values":
[{"text": ['TOTAL']}]
}
).mark_text(size = 13,
align = "left",
dx = 230, dy = -37,
color = '#aaaaaa'
).encode(text = "text:N")
# Text for Organic
text_organic = alt.Chart({"values":
[{"text": ['ORGANIC']}]
}
).mark_text(size = 13,
align = "left",
dx = 230, dy = 37,
color = '#aaaaaa'
).encode(text = "text:N")
# Text for Referral
text_referral = alt.Chart({"values":
[{"text": ['REFERRAL']}]
}
).mark_text(size = 13,
align = "left",
dx = 230, dy = 78,
color = 'black'
).encode(text = "text:N")
final_darker = (
line_referral
+ line_rest
+ text_total
+ text_organic
+ text_referral
)
final_darker.configure_view(stroke = None)
Visualization as depicted in the book:

Position Referral on top of other lines
We can create this effect by simply making a line graph where the color has a condition for Metric == Referral, or a scale with range [gray, black, gray]. However, since we already have the visualizations from above, we can just change the order we add them so that the line referral is on top of the others.
final_top = (
line_rest
+ line_referral
+ text_total
+ text_organic
+ text_referral
)
final_top.configure_view(stroke = None)
Visualization as depicted in the book:

Change the hue
Similarly to setting the highlighted line to black, we can change it to other colors such as red.
# Create a red referral line
line_red = (
alt.Chart(
melted_table,
title = alt.Title(
"Conversion rate over time",
fontWeight = "normal",
anchor = "start",
fontSize = 17
)
)
.mark_line(strokeWidth = 3)
.encode(
x = alt.X(
"YEAR:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1])
),
color = alt.Color(
"Metric",
# The hex code in the middle corresponds to the referral line
scale = alt.Scale(
range = ["#aaaaaa", "#d24b53", "#aaaaaa"]
),
legend = None
)
)
.properties(width = 500)
)
# Create a referral red text
text_red = (
alt.Chart(melted_table)
.mark_text(
align = "left",
baseline = "middle",
dx = 20,
size = 13
)
.encode(
x = alt.X(
"YEAR",
aggregate = "max",
axis = None
),
y = alt.Y(
"Value",
aggregate = {"argmax": "YEAR"}
),
text = "Metric",
color = alt.Color(
"Metric",
# Set referral color to red
scale = alt.Scale(range = ["#aaaaaa", "#d24b53", "#aaaaaa"]),
legend = None
)
)
)
final_red = line_red + text_red
final_red.configure_view(stroke = None)
Visualization as depicted in the book:

Use words
For this example, we will add a second part to the title.
# Same graph as the one without highlights
# but with a bigger title
line = (
alt.Chart(
melted_table,
title = alt.Title(
# Add more explicit title
"Conversion rate over time: Referral decreasing markedly since 2010",
fontWeight = "normal",
anchor = "start",
fontSize = 17
),
)
.mark_line(strokeWidth = 3)
.encode(
x = alt.X(
"YEAR:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1])
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#aaaaaa"]),
legend = None
)
)
.properties(width = 500)
)
final_text = line + text_simple
final_text.configure_view(stroke = None)
Visualization as depicted in the book:

Eliminate other data
For this, we will only display the pertinent data.
# Filter other data out of the original graph
final_referral = gray_line.transform_filter(
alt.FieldEqualPredicate(
field = "Metric",
equal = "REFERRAL")
)
final_referral.configure_view(stroke = None)
Visualization as depicted in the book:

Animate to appear
This option is described in the book but not shown, since a printed page is static. The author's idea is to animate the graph so that one line appears at a time. This section doubles as our interactive graph: users can click an entry in the legend to reveal the corresponding line, and hold down the Shift key while clicking to select multiple lines.
# Define an interactive selection
metric_selection = alt.selection_point(fields = ["Metric"])
# Base line chart
line = (
alt.Chart(
melted_table,
title = alt.Title(
"Conversion rate over time: Referral decreasing markedly since 2010",
fontWeight = "normal",
anchor = "start",
fontSize = 17
)
)
.mark_line(strokeWidth = 3)
.encode(
x = alt.X(
"YEAR:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal",
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1])
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#aaaaaa"])
),
# Maximum transparency if not selected
opacity = alt.condition(
metric_selection,
alt.value(1),
alt.value(0)
)
)
.add_params(metric_selection)
.transform_filter(metric_selection)
.properties(width = 500)
)
# Text chart
text = (
alt.Chart(melted_table)
.mark_text(
align = "left",
baseline = "middle",
dx = 20,
size = 13
)
.encode(
x = alt.X(
"YEAR",
aggregate = "max",
axis = None
),
y = alt.Y(
"Value",
aggregate = {"argmax": "YEAR"}
),
text = "Metric",
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#aaaaaa"]),
legend = None
),
# Maximum transparency if not selected
opacity = alt.condition(
metric_selection,
alt.value(1),
alt.value(0)
)
)
.add_params(metric_selection)
.transform_filter(metric_selection)
)
# Clickable legend
legend = (
alt.Chart(melted_table)
.mark_point()
.encode(
alt.Y(
"Metric",
axis = alt.Axis(orient = "right")
),
color = alt.condition(
metric_selection,
alt.value("#aaaaaa"),
alt.value("lightgrey")
)
)
.add_params(metric_selection)
)
final_interactive = line + text | legend
final_interactive.configure_view(stroke = None)
Add data markers
We can add data markers simply by defining point = True in the line mark.
# Referral line with markers
line_referral_markers = (
alt.Chart(melted_table)
.mark_line(
strokeWidth = 3,
point = True
)
.encode(
x = alt.X(
"YEAR:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1])
),
color = alt.value("#aaaaaa"),
opacity = alt.value(1)
)
.properties(width = 500)
.transform_filter(
alt.FieldEqualPredicate(
field = "Metric",
equal = "REFERRAL")
)
)
# Text only for referral in gray color
# since adding the "text_simple" results in top axis
text_referral_gray = (
alt.Chart({"values": [
{"text": ["REFERRAL"]}
]})
.mark_text(
size = 13,
align = "left",
dx = 230,
dy = 78,
color = "#aaaaaa"
)
.encode(text = "text:N")
)
final_markers = (
line_rest
+ line_referral_markers
+ text_total
+ text_organic
+ text_referral_gray
)
final_markers.configure_view(stroke = None)
Visualization as depicted in the book:

Add data labels
In addition to the markers, we can also add labels.
# Coding for the labels
label = alt.Chart(
melted_table
).mark_text(
align = 'left',
dx = 3,
color = '#aaaaaa'
).encode(
x = alt.X('YEAR:O'),
y = alt.Y('Value'),
text = alt.Text(
'Value',
format = ".1%"
),
xOffset = alt.value(-10),
yOffset = alt.value(-10)
).transform_filter(
alt.FieldEqualPredicate(
field = 'Metric',
equal = "REFERRAL"
)
)
final = (
line_rest
+ line_referral_markers
+ text_total
+ text_organic
+ text_referral_gray
+ label
)
final.configure_view(stroke = None)
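The `format = ".1%"` string follows the D3 number-format mini-language: multiply by 100, keep one decimal place, and append a percent sign. Python's own format specification accepts the same string, so the label text can be previewed without rendering a chart. The sample values below are illustrative only.

```python
# Preview how ".1%" renders a few conversion rates
sample_values = [0.054, 0.034, 0.013]
labels = [format(value, ".1%") for value in sample_values]
print(labels)  # ['5.4%', '3.4%', '1.3%']
```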
Visualization as depicted in the book:

Add end markers and labels
Instead of adding a marker and a label to every data point in Referral, we can add them only to the end of each line.
# Text with the Metric
text = (
alt.Chart(melted_table)
.mark_text(
align = "left",
baseline = "middle",
dx = 55,
size = 13
)
.encode(
x = alt.X(
"YEAR",
aggregate = "max",
axis = None
),
y = alt.Y(
"Value",
aggregate = {"argmax": "YEAR"}
),
text = "Metric",
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#aaaaaa"]),
legend = None
)
)
)
# Text with the Value
text2 = (
alt.Chart(melted_table)
.mark_text(
align = "left",
color = "#aaaaaa"
)
.encode(
x = alt.X(
"YEAR:O",
axis = None
),
y = alt.Y("Value"),
text = alt.Text(
"Value",
format = ".1%"
),
xOffset = alt.value(225)
)
.transform_filter(
alt.FieldEqualPredicate(
field = "YEAR",
equal = 2019
)
)
)
# Add end point
point = (
alt.Chart(melted_table)
.mark_point(
filled = True,
color = "#aaaaaa"
)
.encode(
x = alt.X(
"YEAR:O",
axis = None
),
y = alt.Y("Value"),
opacity = alt.value(1)
)
.transform_filter(
alt.FieldEqualPredicate(
field = "YEAR",
equal = "2019")
)
)
line_simple + text + text2 + point
The points appear to be in the incorrect location on the x-axis. Interestingly, running the visualization without the text elements seems to alter their positions.
line + point
To solve this issue, we will simply offset the points along the x-axis until they are in the right place.
point = alt.Chart(
melted_table
).mark_point(
filled = True,
color = '#aaaaaa'
).encode(
x = alt.X('YEAR:O', axis = None),
y = alt.Y('Value'),
opacity = alt.value(1),
# Offset points in the x-axis
xOffset = alt.value(217)
).transform_filter(
alt.FieldEqualPredicate(
field = 'YEAR',
equal = '2019'
)
)
final = line + text + text2 + point
final.configure_view(stroke = None)
Visualization as depicted in the book:

Combine
To conclude the exercise, we will combine several of these attributes into a single graph: a thicker, colored line with data markers and data labels, along with text tied to the data by color.
# Red and thicker referral line
line_red_thicker = (
alt.Chart(
melted_table # We need to add the title separately
)
.mark_line()
.encode(
x = alt.X(
"YEAR:O",
axis = alt.Axis(
labelAngle = 0,
labelColor = "#888888",
titleColor = "#888888",
titleAnchor = "start",
titleFontWeight = "normal"
),
title = "FISCAL YEAR",
scale = alt.Scale(align = 0)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = "CONVERSION RATE",
scale = alt.Scale(domain = [0, 0.1])
),
color = alt.Color(
"Metric",
# The hex code in the middle corresponds to the referral line
scale = alt.Scale(
range = ["#aaaaaa", "#d24b53", "#aaaaaa"]
),
legend = None
),
strokeWidth = alt.condition(
# Make Referral line thicker
alt.datum["Metric"] == "REFERRAL",
alt.value(4),
alt.value(2)
)
)
.properties(width = 500)
)
# Red text with Metric
text_red = (
alt.Chart(melted_table)
.mark_text(
align = "left",
baseline = "middle",
dx = 55,
size = 13
)
.encode(
x = alt.X(
"YEAR",
aggregate = "max",
axis = None
),
y = alt.Y(
"Value",
aggregate = {"argmax": "YEAR"}
),
text = "Metric",
color = alt.Color(
"Metric",
scale = alt.Scale(
range = ["#aaaaaa", "#d24b53", "#aaaaaa"]
),
legend = None
)
)
)
# Red text with Value
text2_red = (
alt.Chart(melted_table)
.mark_text(align = "left")
.encode(
x = alt.X("YEAR:O", axis = None),
y = alt.Y("Value"),
text = alt.Text("Value", format = ".1%"),
xOffset = alt.value(225),
color = alt.Color(
"Metric",
scale = alt.Scale(
range = ["#aaaaaa", "#d24b53", "#aaaaaa"]
),
legend = None
)
)
.transform_filter(
alt.FieldEqualPredicate(
field = "YEAR",
equal = 2019
)
)
)
# Red points
point_red = (
alt.Chart(melted_table)
.mark_point(filled = True)
.encode(
x = alt.X("YEAR:O", axis = None),
y = alt.Y("Value"),
opacity = alt.value(1),
xOffset = alt.value(217),
color = alt.Color(
"Metric",
scale = alt.Scale(
range = ["#aaaaaa", "#d24b53", "#aaaaaa"]
),
legend = None
)
)
.transform_filter(
alt.FieldEqualPredicate(
field = "YEAR",
equal = "2019"
)
)
)
# End points
other_points = (
alt.Chart(melted_table)
.mark_point(filled = True, color = "#d24b53")
.encode(
x = alt.X("YEAR:O", axis = None),
y = alt.Y("Value"),
opacity = alt.value(1)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "YEAR", oneOf = ["2010", "2016"]
)
)
.transform_filter(
alt.FieldEqualPredicate(
field = "Metric", equal = "REFERRAL"
)
)
)
line_red_thicker + text_red + text2_red + point_red + other_points
Since positioning the red points through the data encodings alone failed, we will manually set the x offset of each point.
# Red point from 2010
point2010 = (
alt.Chart(melted_table)
.mark_point(filled = True, color = "#d24b53")
.encode(
x = alt.X("YEAR:O", axis = None),
y = alt.Y("Value"),
opacity = alt.value(1),
xOffset = alt.value(-84)
)
# Filter only Referral 2010
.transform_filter(
alt.FieldEqualPredicate(
field = "YEAR",
equal = "2010"
)
)
.transform_filter(
alt.FieldEqualPredicate(
field = "Metric",
equal = "REFERRAL"
)
)
)
# Red point for referral 2016
point2016 = (
alt.Chart(melted_table)
.mark_point(filled = True, color = "#d24b53")
.encode(
x = alt.X("YEAR:O", axis = None),
y = alt.Y("Value"),
opacity = alt.value(1),
xOffset = alt.value(117)
)
.transform_filter(
alt.FieldEqualPredicate(
field = "YEAR",
equal = "2016"
)
)
.transform_filter(
alt.FieldEqualPredicate(
field = "Metric",
equal = "REFERRAL"
)
)
)
# Line connecting 2016 point with text
rule2016 = (
alt.Chart()
.mark_rule()
.encode(
x = alt.value(367),
y = alt.datum(0.034),
y2 = alt.datum(0.013),
color = alt.value("#aaaaaa")
)
)
# Line connecting 2010 point with text
rule2010 = (
alt.Chart()
.mark_rule()
.encode(
x = alt.value(166),
y = alt.datum(0.054),
y2 = alt.datum(0.022),
color = alt.value("#aaaaaa")
)
)
# Red part of text for 2010
text2010red = (
alt.Chart({"values": [
{"text": ["2010: all time referral conversion high"]}
]})
.mark_text(
size = 10,
align = "left",
dx = -87,
dy = 93,
fontWeight = "bold",
color = "#d24b53"
)
.encode(text = "text:N")
)
# Red part of text for 2016
text2016red = (
alt.Chart({"values": [
{"text": ["2016: new campaigns"]}
]})
.mark_text(
size = 10,
align = "left",
dx = 115,
dy = 118,
fontWeight = "bold",
color = "#d24b53"
)
.encode(text = "text:N")
)
# Rest of text for 2010
text2010gray = (
alt.Chart(
{
"values": [
{
"text": [
"(5.5%). Strong partnerships historically",
"meant steady conversions. Entry of",
"competitor ABC has markedly impacted",
"referral quality: fewer are buying.",
]
}
]
}
)
.mark_text(
size = 10,
align = "left",
dx = -87,
dy = 105,
fontWeight = "bold",
color = "#aaaaaa"
)
.encode(text = "text:N")
)
# Rest of text for 2016
text2016gray = (
alt.Chart(
{"values": [
{"text": ["lead to brief uptick; steady", "decrease since then."]}
]}
)
.mark_text(
size = 10,
align = "left",
dx = 115,
dy = 130,
fontWeight = "bold",
color = "#aaaaaa"
)
.encode(text = "text:N")
)
# First half of title
title_black = (
alt.Chart({"values": [
{"text": ["Conversion rate over time: "]}
]})
.mark_text(
size = 16,
align = "left",
dx = -275,
dy = -168,
color = "black"
)
.encode(text = "text:N")
)
# Second half of title
title_red = (
alt.Chart({"values": [
{"text": ["referral decreasing markedly since 2010"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -83,
dy = -168,
color = "#d24b53"
)
.encode(text = "text:N")
)
final_combined = (
line_red_thicker
+ text_red
+ text2_red
+ rule2010
+ rule2016
+ point_red
+ point2010
+ point2016
+ text2010red
+ text2016red
+ text2010gray
+ text2016gray
+ title_black
+ title_red
)
final_combined.configure_view(stroke = None)
Visualization as depicted in the book:

Chapter 5 - Think like a designer
"Where do you want your audience to look?" - Cole Nussbaumer Knaflic
Exercise 4 - Design in style (Inspired)
This exercise revolves around integrating brand design into graphs. While the author in the book focuses on creating a graph with a Coke and Light Coke theme, our approach takes it a step further. We will delve into various color inspirations using the most colorful graph thus far, which is the one from exercise 4.2.
We will separate the exercise into four categories: Accessibility, Branding, Paintings, and Nature.
Loading data
This is the same data as exercise 4.2.
table = pd.read_excel(r"Data\4.2 EXERCISE.xlsx", usecols = [1, 2, 3], header = 5, skipfooter = 30)
table['Brands'] = table['Unnamed: 1']
table['Change'] = table['$ Vol % change']
table.drop(columns = ['Unnamed: 1', '$ Vol % change'], inplace = True)
table
| | spacing for dot plot | Brands | Change |
|---|---|---|---|
| 0 | 0 | Fran's Recipe | -0.14 |
| 1 | 1 | Wholesome Goodness | -0.13 |
| 2 | 2 | Lifestyle | -0.10 |
| 3 | 3 | Coat protection | -0.09 |
| 4 | 4 | Diet Lifestyle | -0.08 |
| 5 | 5 | Feline Basics | -0.05 |
| 6 | 6 | Lifestyle Plus | -0.04 |
| 7 | 7 | Feline Freedom | -0.02 |
| 8 | 8 | Feline Gold | 0.01 |
| 9 | 9 | Feline Platinum | 0.01 |
| 10 | 10 | Feline Instinct | 0.02 |
| 11 | 11 | Feline Pro | 0.03 |
| 12 | 12 | Farm Fresh Tasties | 0.04 |
| 13 | 13 | Feline Royal | 0.05 |
| 14 | 14 | Feline Focus | 0.09 |
| 15 | 15 | Feline Grain Free | 0.09 |
| 16 | 16 | Feline Silver | 0.12 |
| 17 | 17 | Nutri Balance | 0.16 |
| 18 | 18 | Farm Fresh Basics | 0.17 |
The original graph
decreased_most = table.nsmallest(2, "Change")
increased_most = table.nlargest(2, "Change")
brands_decreased = decreased_most["Brands"].tolist()
brands_increased = increased_most["Brands"].tolist()
# Build Vega expressions that are true for the two most extreme brands,
# joined with the logical OR operator
conditions_decreased = [f'datum.Brands == "{brand}"' for brand in brands_decreased]
condition_decreased = f"({' || '.join(conditions_decreased)})"
conditions_increased = [f'datum.Brands == "{brand}"' for brand in brands_increased]
condition_increased = f"({' || '.join(conditions_increased)})"
chart_gray = (
alt.Chart(table)
.mark_bar(color = "#c6c6c6", size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y("Brands", sort = None, axis = None)
)
)
chart_oranges_mix = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_decreased,
alt.value("#ec7c30"),
alt.value("#efb284")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle",
]
)
)
)
chart_blue_mix = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_increased,
alt.value("#4772b8"),
alt.value("#91a9d5")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
label1_gray = (
alt.Chart(
table.loc[table["Change"] < 0]
)
.mark_text(
align = "left",
color = "#c6c6c6",
fontWeight = 700
)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands")
)
)
label2_gray = (
alt.Chart(
table.loc[table["Change"] > 0]
)
.mark_text(
align = "right",
color = "#c6c6c6",
fontWeight = 700
)
.encode(
x = alt.value(192),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands")
)
)
label_oranges = (
alt.Chart(
table.loc[table["Change"] < 0]
)
.mark_text(align = "left", fontWeight = 700)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.condition(
condition_decreased,
alt.value("#ec7c30"),
alt.value("#efb284")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle"
]
)
)
)
label_blue = (
alt.Chart(
table.loc[table["Change"] > 0]
)
.mark_text(
align = "right",
fontWeight = 700
)
.encode(
x = alt.value(192),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.condition(
condition_increased,
alt.value("#4772b8"),
alt.value("#91a9d5")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
title_bw = (
alt.Chart({"values": [
{"text": ["Cat food brands:"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -200,
dy = -270,
fontWeight = "normal",
color = "black"
)
.encode(text = "text:N")
)
title_bw_bold = (
alt.Chart({"values": [
{"text": ["Lifestyle line brands decline"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -78,
dy = -270,
fontWeight = 700,
color = "black"
)
.encode(text = "text:N")
)
title_bw_bold_2 = (
alt.Chart({"values": [
{"text": ["mixed results in sales year-over-year"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -78,
dy = -270,
fontWeight = 700,
color = "black"
)
.encode(text = "text:N")
)
subtitle_bw = (
alt.Chart({"values": [
{"text": ["YEAR-OVER-YEAR % CHANGE IN VOLUME ($)"]}
]})
.mark_text(
size = 11,
align = "left",
dx = -200,
dy = -250,
fontWeight = "normal",
color = "gray"
)
.encode(text = "text:N")
)
decreased_orange = (
alt.Chart({"values": [
{"text": ["DECREASED"]}
]})
.mark_text(
size = 11,
align = "left",
dx = -80,
dy = -220,
fontWeight = 700,
color = "#ec7c30"
)
.encode(text = "text:N")
)
increased_blue = (
alt.Chart({"values": [
{"text": ["INCREASED"]}
]})
.mark_text(
size = 11,
align = "left",
dx = 20,
dy = -220,
fontWeight = 700,
color = "#4772b8"
)
.encode(text = "text:N")
)
separation = (
alt.Chart({"values": [{"text": ["|"]}]})
.mark_text(
size = 11,
align = "left",
dx = 3,
dy = -220,
fontWeight = 700,
color = "#c6c6c6"
)
.encode(text = "text:N")
)
original = (
chart_gray
+ chart_oranges_mix
+ chart_blue_mix
+ label1_gray
+ label2_gray
+ label_oranges
+ label_blue
+ title_bw
+ title_bw_bold_2
+ subtitle_bw
+ decreased_orange
+ increased_blue
+ separation
)
original.properties(width = 400).configure_view(stroke = None)
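The `oneOf` filters above list the five highlighted brands by hand. They could instead be derived from the data, in the same way `decreased_most` and `increased_most` are. A sketch of that idea, using the values from the table shown earlier; `bottom_five` and `top_five` are illustrative names.

```python
import pandas as pd

# Values copied from the table displayed in "Loading data"
table = pd.DataFrame({
    "Brands": [
        "Fran's Recipe", "Wholesome Goodness", "Lifestyle", "Coat protection",
        "Diet Lifestyle", "Feline Basics", "Lifestyle Plus", "Feline Freedom",
        "Feline Gold", "Feline Platinum", "Feline Instinct", "Feline Pro",
        "Farm Fresh Tasties", "Feline Royal", "Feline Focus",
        "Feline Grain Free", "Feline Silver", "Nutri Balance",
        "Farm Fresh Basics",
    ],
    "Change": [
        -0.14, -0.13, -0.10, -0.09, -0.08, -0.05, -0.04, -0.02, 0.01, 0.01,
        0.02, 0.03, 0.04, 0.05, 0.09, 0.09, 0.12, 0.16, 0.17,
    ],
})

# Five largest decreases and five largest increases
bottom_five = table.nsmallest(5, "Change")["Brands"].tolist()
top_five = table.nlargest(5, "Change")["Brands"].tolist()
```

Either list can then be passed as `oneOf = bottom_five` or `oneOf = top_five` in `alt.FieldOneOfPredicate`, so the highlighted groups follow the data if it changes.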
Accessibility
How do we assess the accessibility of the graph palette?
Colorblind
We can start by checking whether the original colors are accessible to people with color blindness. The online tool Coloring for Colorblindness simulates how a selected color palette appears to viewers with protanopia, deuteranopia (both being the inability to tell the difference between red and green), and tritanopia (inability to tell the difference between blue and green, purple and red, and yellow and pink), respectively.
The image below shows that our initial palette can be distinguished by people with these visual deficiencies. The website also offers some well-known colorblind-friendly palettes, such as the Wong palette.

Black and White
Although the complete inability to distinguish colors is very rare, having a black and white version of your graph can be beneficial in certain situations, such as when intending to print it in a newspaper or an article. By using the Microsoft Photos App, we can preview how our visualization would appear without colors.

Since the top and bottom colors cannot be distinguished easily, we can create a new black and white palette.
# Same graph as before, but
# with colors in the black and white spectrum
chart_gray = (
alt.Chart(table)
.mark_bar(color = "#666666", size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y("Brands", sort = None, axis = None)
)
)
chart_light_gray_mix = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_decreased,
alt.value("#bbbbbb"),
alt.value("#999999")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle"
]
)
)
)
chart_black_mix = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = alt.Axis(
grid = False,
orient = "top",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
format = "%"
),
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_increased,
alt.value("black"),
alt.value("#333333")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
label1_gray = (
alt.Chart(table.loc[table["Change"] < 0])
.mark_text(
align = "left",
color = "#666666",
fontWeight = 700
)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands")
)
)
label2_gray = (
alt.Chart(table.loc[table["Change"] > 0])
.mark_text(
align = "right",
color = "#666666",
fontWeight = 700
)
.encode(
x = alt.value(192),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands")
)
)
label_light_gray = (
alt.Chart(table.loc[table["Change"] < 0])
.mark_text(align = "left", fontWeight = 700)
.encode(
x = alt.value(207),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.condition(
condition_decreased,
alt.value("#bbbbbb"),
alt.value("#999999")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle"
]
)
)
)
label_black = (
alt.Chart(table.loc[table["Change"] > 0])
.mark_text(align = "right", fontWeight = 700)
.encode(
x = alt.value(192),
y = alt.Y("Brands", sort = None),
text = alt.Text("Brands"),
color = alt.condition(
condition_increased,
alt.value("black"),
alt.value("#333333")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
title_bw = (
alt.Chart({"values": [
{"text": ["Cat food brands:"]}
]})
.mark_text(
size = 16,
align = "left",
dx = -200,
dy = -270,
fontWeight = "normal",
color = "black"
)
.encode(text = "text:N")
)
decreased_light_gray = (
alt.Chart({"values": [
{"text": ["DECREASED"]}
]})
.mark_text(
size = 11,
align = "left",
dx = -80,
dy = -220,
fontWeight = 700,
color = "#999999"
)
.encode(text = "text:N")
)
increased_black = (
alt.Chart({"values": [
{"text": ["INCREASED"]}
]})
.mark_text(
size = 11,
align = "left",
dx = 20,
dy = -220,
fontWeight = 700,
color = "black"
)
.encode(text = "text:N")
)
black_and_white = (
chart_gray
+ chart_light_gray_mix
+ chart_black_mix
+ label1_gray
+ label2_gray
+ label_light_gray
+ label_black
+ title_bw
+ title_bw_bold_2
+ subtitle_bw
+ decreased_light_gray
+ increased_black
+ separation
)
black_and_white.properties(width = 400).configure_view(stroke = None)
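A grayscale preview can also be approximated in code. The sketch below converts each hex color of the original palette to a gray level using the Rec. 601 luma weights. This is one common approximation and not necessarily the conversion the Photos app applies; `hex_to_gray` is a hypothetical helper name.

```python
def hex_to_gray(hex_color):
    """Approximate the 0-255 gray level of a hex color (Rec. 601 luma weights)."""
    hex_color = hex_color.lstrip("#")
    # Split "rrggbb" into its three 8-bit channels
    red, green, blue = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return round(0.299 * red + 0.587 * green + 0.114 * blue)


# Colors of the original graph: oranges, middle gray, blues
original_palette = ["#ec7c30", "#efb284", "#c6c6c6", "#91a9d5", "#4772b8"]
for color in original_palette:
    print(color, hex_to_gray(color))
```

Colors whose gray levels are close together are the ones most likely to blur into each other in print.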
Branding
Branding is an important aspect of marketing. Most companies have established color palettes, logos, and design guidelines. Tools such as Image Color Picker and Adobe Color can help you extract those official palettes and apply them to themed data visualizations. The following graphs omit labels and titles, since they no longer represent the cat food brands and serve only as a canvas for the palettes.
Cookie Clicker
Cookie Clicker is an idle game created by the French programmer Orteil, and it has been open in a separate tab baking and accumulating cookies throughout the entirety of this project. Hence, it only seems fitting to pay homage with a dedicated color palette!
# Orange middle (for the orange milk in the game)
chart_middle = (
alt.Chart(table)
.mark_bar(color = "#ee8241", size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None),
y = alt.Y("Brands", sort = None, axis = None)
)
)
# Dark brown top (for the chocolate chips)
chart_up = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_decreased,
alt.value("#4a251d"),
alt.value("#64433a")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle"
]
)
)
)
# Light brown bottom (for the cookie dough)
chart_down = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_increased,
alt.value("#81532d"),
alt.value("#c0a681")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
cookie_clicker = chart_middle + chart_down + chart_up
cookie_clicker.properties(width = 400).configure_view(stroke = None)
Game screenshot that inspired the color palette:

Google Maps
One of the most distinctive brand designs is the vibrant Google palette, widely employed in various apps associated with the company. Inspired by the Google Maps logo, which features precisely five colors, we will demonstrate how our graph looks when incorporating this design.
# Green middle
chart_middle = (
alt.Chart(table)
.mark_bar(color = "#34A852", size = 15)
.encode(
x=alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None),
y = alt.Y("Brands", sort = None, axis = None)
)
)
# Red and yellow top
chart_up = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_decreased,
alt.value("#EA4335"),
alt.value("#FABB04")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle"
]
)
)
)
# Blue and light blue bottom
chart_down = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_increased,
alt.value("#1A73E8"),
alt.value("#4285F4")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
google = chart_middle + chart_down + chart_up
google.properties(width = 400).configure_view(stroke = None)
Google Maps logo used for reference:

Paintings
The Starry Night by Vincent van Gogh
# Light blue middle
chart_middle = (
alt.Chart(table)
.mark_bar(color = "#7B95A6", size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None),
y = alt.Y("Brands", sort = None, axis = None)
)
)
# Yellow tones top for the stars
chart_up = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_decreased,
alt.value("#A65D05"),
alt.value("#D9B13B")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle"
]
)
)
)
# Blue tones bottom for the sky
chart_down = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_increased,
alt.value("#0B1E38"),
alt.value("#304F8C")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
starry_night = chart_middle + chart_down + chart_up
starry_night.properties(width = 400).configure_view(stroke = None)
Painting that inspired the palette:

A Cuca by Tarsila do Amaral
# Green middle for the plants
chart_middle = (
alt.Chart(table)
.mark_bar(color = "#034001", size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None)
)
)
# Blue and green top for the plants and pond
chart_up = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_decreased,
alt.value("#0339A6"),
alt.value("#067302")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle"
]
)
)
)
# Yellow and orange bottom for the Cuca
chart_down = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y(
"Brands",
sort = None,
axis = None
),
color = alt.condition(
condition_increased,
alt.value("#F25C05"),
alt.value("#F29F05")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
cuca = chart_middle + chart_down + chart_up
cuca.properties(width = 400).configure_view(stroke = None)
Painting that inspired the palette:

The Great Wave Off Kanagawa by Hokusai
# Light blue for the waves
chart_middle = (
alt.Chart(table)
.mark_bar(color = "#8FBABF", size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None, title = None
),
y = alt.Y("Brands", sort = None, axis = None)
)
)
# Beige top for the sky and boats
chart_up = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_decreased,
alt.value("#E0C6A3"),
alt.value("#D9B779")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle"
]
)
)
)
# Darker blue for the waves
chart_down = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_increased,
alt.value("#010326"),
alt.value("#010B40")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
wave = chart_middle + chart_down + chart_up
wave.properties(width = 400).configure_view(stroke = None)
Painting that inspired the palette:

Nature
Inspired by the rose imagery in the works of Theresa-Marie Rhyne, we will extract color palettes from pictures of nature.
Sunset
This picture was taken by Sergio Mena Ferraira and can be found here.
# Bright orange middle for the sun
chart_middle = (
alt.Chart(table)
.mark_bar(color = "#FF9200", size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None)
)
)
# Purple tones top for the sky
chart_up = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_decreased,
alt.value("#AA4650"),
alt.value("#FD4044")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle"
]
)
)
)
# Orange tones bottom for the sky
chart_down = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_increased,
alt.value("#FA4210"),
alt.value("#FD7E37")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
sunset = chart_middle + chart_down + chart_up
sunset.properties(width = 400).configure_view(stroke = None)
Picture that inspired the palette:

# Light green middle from leaves
chart_middle = (
alt.Chart(table)
.mark_bar(color = "#5C7346", size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None)
)
)
# Lighter green top from leaves
chart_up = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_decreased,
alt.value("#ABD9A9"),
alt.value("#83A66A")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Fran's Recipe",
"Wholesome Goodness",
"Lifestyle",
"Coat protection",
"Diet Lifestyle"
]
)
)
)
# Dark green and brown bottom from leaves and tree trunk
chart_down = (
alt.Chart(table)
.mark_bar(size = 15)
.encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = alt.condition(
condition_increased,
alt.value("#140F09"),
alt.value("#3B401B")
)
)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Brands",
oneOf = [
"Feline Focus",
"Feline Grain Free",
"Feline Silver",
"Nutri Balance",
"Farm Fresh Basics"
]
)
)
)
forest = chart_middle + chart_down + chart_up
forest.properties(width = 400).configure_view(stroke = None)
Picture that inspired the palette:

Interactivity
For interactivity, we will let the user customize the palette of the graph without requiring any knowledge of Altair.
We build five charts, one for each color in this visualization. Each chart is bound to a color picker where the viewer can select a hue from the spectrum or enter the RGB, HSL or hex code.
# Create the color parameters, with default in the original graph
color_one = alt.param(
value = "#ec7c30",
bind = alt.binding(
input = 'color',
name = 'First color: '
)
)
color_two = alt.param(
value = "#efb284",
bind = alt.binding(
input = 'color',
name = 'Second color: '
)
)
color_three = alt.param(
value = "#c6c6c6",
bind = alt.binding(
input = 'color',
name = 'Third color: '
)
)
color_four = alt.param(
value = "#91a9d5",
bind = alt.binding(
input = 'color',
name = 'Fourth color: '
)
)
color_five = alt.param(
value = "#4772b8",
bind = alt.binding(
input = 'color',
name = 'Fifth color: '
)
)
# First two bars
chart_one = alt.Chart(table).mark_bar(
size = 15
).encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = color_one
).transform_filter(
alt.FieldOneOfPredicate(
field = 'Brands',
oneOf = list(table['Brands'][0:2])
)
).add_params(
color_one
)
# Three next bars
chart_two = alt.Chart(table).mark_bar(
size = 15
).encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = color_two
).transform_filter(
alt.FieldOneOfPredicate(
field = 'Brands', oneOf = list(table['Brands'][2:5])
)
).add_params(
color_two
)
# Middle bars
chart_three = alt.Chart(
table
).mark_bar(size = 15).encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = color_three
).transform_filter(
alt.FieldOneOfPredicate(
field = 'Brands', oneOf = list(table['Brands'][5:14])
)
).add_params(
color_three
)
# Three bars after the middle
chart_four = alt.Chart(
table
).mark_bar(size = 15).encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = color_four
).transform_filter(
alt.FieldOneOfPredicate(
field = 'Brands', oneOf = list(table['Brands'][14:17])
)
).add_params(
color_four
)
# Last two bars
chart_five = alt.Chart(
table
).mark_bar(size = 15).encode(
x = alt.X(
"Change",
scale = alt.Scale(domain = [-0.20, 0.20]),
axis = None,
title = None
),
y = alt.Y("Brands", sort = None, axis = None),
color = color_five
).transform_filter(
alt.FieldOneOfPredicate(
field='Brands', oneOf = list(table['Brands'][17:19])
)
).add_params(
color_five
)
interactive_colors = (
chart_one
+ chart_two
+ chart_three
+ chart_four
+ chart_five
)
interactive_colors.properties(width = 400).configure_view(stroke = None)
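The five filters above slice the `Brands` column at fixed positions (`[0:2]`, `[2:5]`, `[5:14]`, `[14:17]` and `[17:19]`). As a rough sketch of how those boundaries partition the bars, the snippet below uses placeholder brand names (hypothetical, standing in for `table['Brands']`) and assumes the table holds 19 rows, as the last slice suggests.

```python
# Placeholder names standing in for table['Brands'] (assumed to have 19 rows)
brands = [f"Brand {i + 1}" for i in range(19)]

# Boundaries taken from the slices used in the five charts above
boundaries = [0, 2, 5, 14, 17, 19]

# Pair each boundary with the next one to rebuild the five groups
groups = [
    brands[start:stop]
    for start, stop in zip(boundaries, boundaries[1:])
]

for number, group in enumerate(groups, start = 1):
    print(f"Color {number}: {len(group)} bars")
```

Every bar falls into exactly one group, so no brand is drawn twice or left without a color.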
Chapter 6 - Tell a story
"Data in a spreadsheet or facts on a slide aren’t things that naturally stick with us — they are easily forgotten. Stories, on the other hand, are memorable." - Cole Nussbaumer Knaflic
Exercise 6 - Differentiate between live & standalone stories
The objective of this exercise is to learn how to tailor visualizations to the situation in which they will be presented, whether a live presentation or a printed graph. Here, we will create a series of charts ready for a slide show, demonstrating the ideas step by step.
There is one standalone graph at the end, but its text will not be added because that process was presented earlier.
If you would like to return to the Table of Contents, you can click here.
Loading the data
# Loading the table considering Excel formatting
table = pd.read_excel(r"Data\6.6 EXERCISE.xlsx", usecols = [2, 3, 4, 5, 6, 7], header = 4, skipfooter = 5)
table
| | Unnamed: 2 | Unnamed: 3 | Internal | External | Overall | Goal |
|---|---|---|---|---|---|---|
| 0 | Jan | 2019-01-01 | 47.6 | 44.8 | 45.05 | 60 |
| 1 | Feb | 2019-02-01 | 37.9 | 48.5 | 47.25 | 60 |
| 2 | Mar | 2019-03-01 | 17.6 | 49.5 | 46.15 | 60 |
| 3 | Apr | 2019-04-01 | 18.6 | 55.2 | 50.35 | 60 |
| 4 | May | 2019-05-01 | 40.6 | 56.5 | 55.55 | 60 |
| 5 | Jun | 2019-06-01 | 28.8 | 60.7 | 53.85 | 60 |
| 6 | Jul | 2019-07-01 | 27.1 | 44.2 | 42.85 | 60 |
| 7 | Aug | 2019-08-01 | 36.9 | 29.0 | 31.15 | 60 |
| 8 | Sep | 2019-09-01 | 37.1 | 61.2 | 59.15 | 60 |
| 9 | Oct | 2019-10-01 | 25.9 | 44.9 | 41.55 | 60 |
| 10 | Nov | 2019-11-01 | 51.2 | 76.6 | 71.85 | 60 |
| 11 | Dec | 2019-12-01 | 40.6 | 34.7 | 36.15 | 60 |
# Rename columns
table.rename(columns = {'Unnamed: 2': 'Month', 'Unnamed: 3': 'Date'}, inplace = True)
# Drop useless information
table.drop(columns = ['Date', 'Goal', 'Overall'], inplace = True)
# Create long-format version
melted_table = pd.melt(table, id_vars = ['Month'], var_name = 'Metric', value_name = 'Value')
melted_table
| | Month | Metric | Value |
|---|---|---|---|
| 0 | Jan | Internal | 47.6 |
| 1 | Feb | Internal | 37.9 |
| 2 | Mar | Internal | 17.6 |
| 3 | Apr | Internal | 18.6 |
| 4 | May | Internal | 40.6 |
| 5 | Jun | Internal | 28.8 |
| 6 | Jul | Internal | 27.1 |
| 7 | Aug | Internal | 36.9 |
| 8 | Sep | Internal | 37.1 |
| 9 | Oct | Internal | 25.9 |
| 10 | Nov | Internal | 51.2 |
| 11 | Dec | Internal | 40.6 |
| 12 | Jan | External | 44.8 |
| 13 | Feb | External | 48.5 |
| 14 | Mar | External | 49.5 |
| 15 | Apr | External | 55.2 |
| 16 | May | External | 56.5 |
| 17 | Jun | External | 60.7 |
| 18 | Jul | External | 44.2 |
| 19 | Aug | External | 29.0 |
| 20 | Sep | External | 61.2 |
| 21 | Oct | External | 44.9 |
| 22 | Nov | External | 76.6 |
| 23 | Dec | External | 34.7 |
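As a small illustration of what `pd.melt` does here, the sketch below reshapes a two-month excerpt of the table above. The names `small` and `small_long` exist only in this example.

```python
import pandas as pd

# Two-month excerpt of the wide table shown above
small = pd.DataFrame({
    "Month": ["Jan", "Feb"],
    "Internal": [47.6, 37.9],
    "External": [44.8, 48.5]
})

# One row per (Month, Metric) pair; Internal rows come first, then External
small_long = pd.melt(
    small,
    id_vars = ["Month"],
    var_name = "Metric",
    value_name = "Value"
)
print(small_long)
```

In long format, a single `Metric` field can drive the color encoding and the filters used in the charts that follow.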
The graph
First, we will plot the original graph.
# Line chart
line = (
alt.Chart(
melted_table,
title = alt.Title(
"Time to fill",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_line()
.encode(
x = alt.X(
"Month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019"
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
title = "TIME TO FILL (DAYS)",
scale = alt.Scale(domain = [0, 90]) # y-axis goes from 0 to 90
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["black"]),
legend = None
)
)
.properties(width = 500)
)
# Goal line
goal = alt.Chart().mark_rule(
strokeDash = [4, 4] # Dashed
).encode(
x = alt.datum("Jan"), # From January
x2 = alt.datum("Dec"), # To December
y = alt.datum(60) # At value 60
)
# Label at the end of the line
label = alt.Chart(melted_table).mark_text(
align = "left",
dx = 3
).encode(
alt.X(
"Month",
aggregate = "max",
sort = None
),
alt.Y(
"Value",
aggregate = {"argmax": "Month"}
),
alt.Text("Metric")
)
final = line + goal + label
final.configure_view(stroke=None)
Once again, Altair treats September as the maximum month because it compares the month names alphabetically, even when using sort = None. We will solve this by setting x to December directly and filtering the data to that month.
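The alphabetical ordering described above can be reproduced in plain Python, outside Altair: comparing the month abbreviations as strings also picks September. This is only an illustration of string ordering, not of Altair's internals.

```python
months = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
]

# Strings are compared letter by letter, so "Sep" ranks last alphabetically
alphabetical_max = max(months)

# Calendar order has to be stated explicitly, here through the list position
calendar_max = max(months, key = months.index)

print(alphabetical_max, calendar_max)
```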
# Updated label
label = alt.Chart(melted_table).mark_text(
align = 'left',
dx = 4
).encode(
x = alt.datum('Dec'),
y = alt.Y('Value'),
text = alt.Text('Metric')
).transform_filter(
# Only December values
(alt.datum.Month == 'Dec')
)
# Label for the goal line
label_goal = alt.Chart({"values": [
{"text": [ "GOAL"]}
]}).mark_text(
align = 'left',
dx = 4
).encode(
x = alt.datum('Dec'),
y = alt.datum(60),
text = "text:N"
)
final_original = line + goal + label + label_goal
final_original.configure_view(stroke = None)
Visualization as depicted in the book:

Empty graph
We can start our presentation with only the skeleton of our visualization.
# Create a graph with full transparency
empty = (
alt.Chart(
melted_table,
title = alt.Title(
"Time to fill",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_line(opacity = 0) # Minimum opacity
.encode(
x = alt.X(
"Month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False,
title = "2019"
)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
),
title = "TIME TO FILL (DAYS)",
scale = alt.Scale(domain = [0, 90])
)
)
.properties(width = 500)
)
graph_1 = empty.configure_view(stroke = None)
graph_1
Visualization as depicted in the book:

Add the goal line
We can do this by adding already existing graphs.
graph_2 = (empty + goal + label_goal).configure_view(stroke = None)
graph_2
Visualization as depicted in the book:

Create the first data point
# Point for January
point_jan = alt.Chart().mark_point(
filled = True,
color = 'black',
size = 50,
opacity = 1
).encode(
x = alt.datum('Jan'),
y = alt.datum(
melted_table["Value"][12]
)
)
# Create faded (semi-transparent) versions of the goal line and label
goal_opaque = alt.Chart().mark_rule(
strokeDash = [4,4]
).encode(
x = alt.datum('Jan'),
x2 = alt.datum('Dec'),
y = alt.datum(60),
opacity = alt.value(0.4)
)
label_goal_opaque = alt.Chart({"values": [
{"text": [ "GOAL"]}
]}).mark_text(align = 'left', dx = 4).encode(
x = alt.datum('Dec'),
y = alt.datum(60),
text = "text:N",
opacity = alt.value(0.4)
)
graph_3 = (
empty
+ goal_opaque
+ label_goal_opaque
+ point_jan
).configure_view(stroke = None)
graph_3
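The point above takes its height from `melted_table["Value"][12]`, which depends on the row order of the melted table. A possible alternative, sketched below on a short excerpt of the data, looks the value up by month and metric instead. `value_for` is a hypothetical helper, not part of pandas or Altair.

```python
import pandas as pd

# Excerpt of the long-format table (January and February only)
excerpt = pd.DataFrame({
    "Month": ["Jan", "Feb", "Jan", "Feb"],
    "Metric": ["Internal", "Internal", "External", "External"],
    "Value": [47.6, 37.9, 44.8, 48.5]
})

def value_for(data, month, metric):
    """Return the Value of the row matching the given month and metric."""
    match = data[(data["Month"] == month) & (data["Metric"] == metric)]
    return float(match["Value"].iloc[0])

print(value_for(excerpt, "Jan", "External"))
```

With the full table, `value_for(melted_table, "Jan", "External")` would return the same number as `melted_table["Value"][12]`.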
Visualization as depicted in the book:

Updating for the first half of the year
# Create the line until June
partial_line1 = (
alt.Chart(
melted_table,
title = alt.Title(
"Time to fill",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_line()
.encode(
x = alt.X(
"Month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019",
scale = alt.Scale(
# We need to set all months as a domain
# to make sure they appear in the axis
domain = [
"Jan",
"Feb",
"Mar",
"Apr",
"May",
"Jun",
"Jul",
"Aug",
"Sep",
"Oct",
"Nov",
"Dec"
]
)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
title = "TIME TO FILL (DAYS)",
scale = alt.Scale(domain = [0, 90]),
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["black"]),
legend = None
)
)
.properties(width = 500)
.transform_filter(
# Filter only half of the year
alt.FieldOneOfPredicate(
field = "Month",
oneOf = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
)
)
)
# Point for June
point_jun = (
alt.Chart()
.mark_point(
filled = True,
color = "black",
size = 50,
opacity= 1
)
.encode(
x = alt.datum("Jun"),
y = alt.datum(melted_table["Value"][17])
)
)
graph_4 = (
partial_line1.transform_filter(alt.datum.Metric == "External")
+ goal_opaque
+ label_goal_opaque
+ point_jun
).configure_view(stroke = None)
graph_4
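The partial lines in this step and the following ones differ mainly in the list of months passed to `oneOf`. As a sketch, that list could be generated from a cutoff month. `months_until` is a hypothetical helper introduced here only for illustration.

```python
MONTHS = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
]

def months_until(last_month):
    """Return the months from January up to and including last_month."""
    return MONTHS[: MONTHS.index(last_month) + 1]

print(months_until("Jun"))
```

The result could then be passed to the filter as `oneOf = months_until("Jun")`, while `MONTHS` itself matches the domain set on the x-axis.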
Visualization as depicted in the book:

Rest of the year with points
We will complete the line for the rest of the year, highlighting some points.
# Condition for External points
points_condition = (alt.datum.Metric == "External") & (
(alt.datum.Month == "Aug")
| (alt.datum.Month == "Sep")
| (alt.datum.Month == "Oct")
| (alt.datum.Month == "Nov")
| (alt.datum.Month == "Dec")
)
# Define the points
points = (
alt.Chart(melted_table)
.mark_point(
filled = True,
size = 50,
opacity = 1
)
.encode(
x = alt.X("Month", sort = None),
y = "Value",
color = alt.condition(
# Color depends if it achieved the Goal
alt.datum.Value > 60,
alt.value("#f6792c"),
alt.value("#187cae")
)
)
# Only for the points in the condition
.transform_filter(points_condition)
)
graph_5 = (
# We can filter the line directly
line.transform_filter(alt.datum.Metric == "External")
+ goal_opaque
+ label_goal_opaque
+ points
).configure_view(stroke = None)
graph_5
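To make the color rule concrete, the sketch below applies the same test (`Value > 60`) in plain Python to the External values for August through December, taken from the table above. Values above the 60-day goal line get the orange hex code and the others the blue one.

```python
# External time to fill for the highlighted months (from the table above)
external = {"Aug": 29.0, "Sep": 61.2, "Oct": 44.9, "Nov": 76.6, "Dec": 34.7}

GOAL = 60

# Same rule as the alt.condition: orange above the goal line, blue otherwise
point_colors = {
    month: "#f6792c" if value > GOAL else "#187cae"
    for month, value in external.items()
}

for month, color in point_colors.items():
    print(month, color)
```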
Visualization as depicted in the book:

Focus on the internal
Now we will lighten the external line to bring focus on the internal data.
# External line with less opacity
external_opaque = alt.Chart(
melted_table,
title = alt.Title(
"Time to fill",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
).mark_line().encode(
x = alt.X(
"Month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019"
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
title = "TIME TO FILL (DAYS)",
scale = alt.Scale(domain = [0, 90])
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["black"]),
legend = None
),
opacity = alt.value(0.4) # Set transparency
).properties(width = 500
# Filter data
).transform_filter(alt.datum.Metric == "External")
# Label with less opacity
label_opaque = (
alt.Chart(melted_table)
.mark_text(align = "left", dx = 4)
.encode(
x = alt.datum("Dec"),
y = alt.Y("Value"),
text = alt.Text("Metric"),
opacity = alt.value(0.4)
)
.transform_filter((alt.datum.Month == "Dec"))
.transform_filter(alt.datum.Metric == "External")
)
# Point for January in the Internal line
point_jan2 = (
alt.Chart()
.mark_point(
filled = True,
color = "black",
size = 50,
opacity = 1
)
.encode(
x = alt.datum("Jan"),
y = alt.datum(melted_table["Value"][0])
)
)
graph_6 = (
external_opaque
+ label_opaque
+ goal_opaque
+ label_goal_opaque
+ point_jan2
).configure_view(stroke = None)
graph_6
Visualization as depicted in the book:

First months of the year
Similarly to the External line, we will plot the first months for Internal - this time stopping at April.
# Line for the first four months
partial_line2 = (
alt.Chart(
melted_table,
title = alt.Title(
"Time to fill",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_line()
.encode(
x = alt.X(
"Month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019",
scale = alt.Scale(
domain = [
"Jan",
"Feb",
"Mar",
"Apr",
"May",
"Jun",
"Jul",
"Aug",
"Sep",
"Oct",
"Nov",
"Dec"
]
)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
title = "TIME TO FILL (DAYS)",
scale = alt.Scale(domain = [0, 90]),
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["black"]),
legend = None
)
)
.properties(width = 500)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Month",
oneOf = ["Jan", "Feb", "Mar", "Apr"]
)
)
)
# Point for April
point_apr = (
alt.Chart()
.mark_point(
filled = True,
color = "black",
size = 50,
opacity = 1
)
.encode(
x = alt.datum("Apr"),
y = alt.datum(melted_table["Value"][3])
)
)
graph_7 = (
partial_line2.transform_filter(alt.datum.Metric == "Internal")
+ goal_opaque
+ label_goal_opaque
+ point_apr
+ external_opaque
+ label_opaque
).configure_view(stroke = None)
graph_7
Visualization as depicted in the book:

Add May
After showing the decrease in time to fill in the first months, we can now add the next one (May) to show an increase.
# Line with May added
partial_line3 = (
alt.Chart(
melted_table,
title = alt.Title(
"Time to fill",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_line()
.encode(
x = alt.X(
"Month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019",
scale = alt.Scale(
domain = [
"Jan",
"Feb",
"Mar",
"Apr",
"May",
"Jun",
"Jul",
"Aug",
"Sep",
"Oct",
"Nov",
"Dec"
]
)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
title = "TIME TO FILL (DAYS)",
scale = alt.Scale(domain = [0, 90])
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["black"]),
legend = None
)
)
.properties(width = 500)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Month",
oneOf = ["Jan", "Feb", "Mar", "Apr", "May"]
)
)
)
# Point for May
point_may = (
alt.Chart()
.mark_point(
filled = True,
color = "black",
size = 50,
opacity = 1
)
.encode(
x = alt.datum("May"),
y = alt.datum(melted_table["Value"][4])
)
)
graph_8 = (
partial_line3.transform_filter(alt.datum.Metric == "Internal")
+ goal_opaque
+ label_goal_opaque
+ point_may
+ external_opaque
+ label_opaque
).configure_view(stroke = None)
graph_8
Visualization as depicted in the book:

Add months until September
Here we can see a small dip before the data rises again.
# Line until September
partial_line4 = (
alt.Chart(
melted_table,
title = alt.Title(
"Time to fill",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_line()
.encode(
x = alt.X(
"Month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019",
scale = alt.Scale(
domain = [
"Jan",
"Feb",
"Mar",
"Apr",
"May",
"Jun",
"Jul",
"Aug",
"Sep",
"Oct",
"Nov",
"Dec"
]
)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
title = "TIME TO FILL (DAYS)",
scale = alt.Scale(domain = [0, 90])
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["black"]),
legend = None
)
)
.properties(width = 500)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Month",
oneOf = [
"Jan",
"Feb",
"Mar",
"Apr",
"May",
"Jun",
"Jul",
"Aug",
"Sep"
]
)
)
)
# Point for September
point_sep = (
alt.Chart()
.mark_point(
filled = True,
color = "black",
size = 50,
opacity = 1
)
.encode(
x = alt.datum("Sep"),
y = alt.datum(melted_table["Value"][8])
)
)
graph_9 = (
partial_line4.transform_filter(alt.datum.Metric == "Internal")
+ goal_opaque
+ label_goal_opaque
+ point_sep
+ external_opaque
+ label_opaque
).configure_view(stroke = None)
graph_9
Visualization as depicted in the book:

Add November
Here we can see another increase.
# Create line until November
partial_line5 = (
alt.Chart(
melted_table,
title = alt.Title(
"Time to fill",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_line()
.encode(
x = alt.X(
"Month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019",
scale = alt.Scale(
domain = [
"Jan",
"Feb",
"Mar",
"Apr",
"May",
"Jun",
"Jul",
"Aug",
"Sep",
"Oct",
"Nov",
"Dec"
]
)
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
title = "TIME TO FILL (DAYS)",
scale = alt.Scale(domain = [0, 90])
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["black"]),
legend = None
)
)
.properties(width = 500)
.transform_filter(
alt.FieldOneOfPredicate(
field = "Month",
oneOf = [
"Jan",
"Feb",
"Mar",
"Apr",
"May",
"Jun",
"Jul",
"Aug",
"Sep",
"Oct",
"Nov"
]
)
)
)
# Point for November
point_nov = (
alt.Chart()
.mark_point(
filled = True,
color = "black",
size = 50,
opacity = 1
)
.encode(
x = alt.datum("Nov"),
y = alt.datum(melted_table["Value"][10])
)
)
graph_10 = (
partial_line5.transform_filter(alt.datum.Metric == "Internal")
+ goal_opaque
+ label_goal_opaque
+ point_nov
+ external_opaque
+ label_opaque
).configure_view(stroke = None)
graph_10
Visualization as depicted in the book:

Add December
Adding the final month, we can see that, despite a final dip, the Internal time to fill finished the year higher than the External.
# Define point for December
point_dec = alt.Chart().mark_point(
filled = True,
color = 'black',
size = 50,
opacity = 1
).encode(
x = alt.datum('Dec'),
y = alt.datum(melted_table["Value"][11])
)
# Add existing graphs
graph_11 = (
line.transform_filter(alt.datum.Metric == 'Internal')
+ label.transform_filter(alt.datum.Metric == 'Internal')
+ goal_opaque
+ label_goal_opaque
+ point_dec
+ external_opaque
+ label_opaque
).configure_view(stroke = None)
graph_11
Visualization as depicted in the book:

Final graph
We can now show a final graph with all the values and the External line colored blue.
# Line chart with External colored blue
line_color = (
alt.Chart(
melted_table,
title = alt.Title(
"Time to fill",
fontSize = 18,
fontWeight = "normal",
anchor = "start",
offset = 10
)
)
.mark_line()
.encode(
x = alt.X(
"Month",
sort = None,
axis = alt.Axis(
labelAngle = 0,
titleAnchor = "start",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal",
ticks = False
),
title = "2019"
),
y = alt.Y(
"Value",
axis = alt.Axis(
grid = False,
titleAnchor = "end",
labelColor = "#888888",
titleColor = "#888888",
titleFontWeight = "normal"
),
title = "TIME TO FILL (DAYS)",
scale = alt.Scale(domain = [0, 90])
),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#1d779c", "#676767"]),
legend = None
)
)
.properties(width = 500)
)
# Labeling with External colored blue
label_color = (
alt.Chart(melted_table)
.mark_text(
align = "left",
dx = 4
)
.encode(
x = alt.datum("Dec"),
y = alt.Y("Value"),
text = alt.Text("Metric"),
color = alt.Color(
"Metric",
scale = alt.Scale(range = ["#1d779c", "#676767"]),
legend = None
)
)
.transform_filter((alt.datum.Month == "Dec"))
)
graph_12 = (
line_color
+ goal_opaque
+ label_goal_opaque
+ label_color
).configure_view(stroke = None)
graph_12
Visualization as depicted in the book:

The exercise also presents a graph that summarizes the ones above. However, instead of replicating it, we will focus on creating an animation with the visualizations we already have.
Last graph from the book:

Animation
For our animation, we will leave Altair and use ipywidgets. The code is as follows.
# List of the graphs
graphs = [
graph_1,
graph_2,
graph_3,
graph_4,
graph_5,
graph_6,
graph_7,
graph_8,
graph_9,
graph_10,
graph_11,
graph_12
]
# Create function
def demo(i):
    clear_output(wait = True)
    if 0 <= i < len(graphs):
        chart = graphs[i]
    else:
        chart = None
    display(chart)
# Create the animation
interact(demo, i = widgets.Play(
value = 0,
min = 0,
max = 11,
step = 1,
description = "Press play",
interval = 2000
))
Note, however, that the widget above only works with a live kernel, so it does not run in the exported HTML file. For that reason, we will upload a GIF of a screen recording.
GIF for the animation:

This code was created by Søren Fuglede Jørgensen and can be found here. Please make sure any changes are saved before running this cell.
with open('index.ipynb') as nb_file:
    nb_contents = nb_file.read()
# Convert using the ordinary exporter
notebook = nbformat.reads(nb_contents, as_version=4)
# HTML Export
html_exporter = nbconvert.HTMLExporter()
body, res = html_exporter.from_notebook_node(notebook)
# Create a dict mapping all image attachments to their base64 representations
images = {}
for cell in notebook['cells']:
    if 'attachments' in cell:
        attachments = cell['attachments']
        for filename, attachment in attachments.items():
            for mime, base64 in attachment.items():
                images[f'attachment:{filename}'] = f'data:{mime};base64,{base64}'
# Fix up the HTML and write it to disk
for src, base64 in images.items():
    body = body.replace(f'src="{src}"', f'src="{base64}"')
# Write HTML to file
with open('index.html', 'w') as html_output_file:
    html_output_file.write(body)
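The replacement loop above swaps each `attachment:` reference for a data URI so that the image is embedded in the HTML file. As a self-contained sketch of that substitution, the snippet below uses a made-up attachment (`chart.png` with a few placeholder bytes) and a one-line HTML body. These values are hypothetical and stand in for the real notebook contents.

```python
import base64

# Made-up attachment: file name -> {MIME type: base64-encoded content}
payload = base64.b64encode(b"placeholder image bytes").decode("ascii")
attachments = {"chart.png": {"image/png": payload}}

# Made-up HTML body referencing the attachment the way the notebook does
body = '<img src="attachment:chart.png">'

# Build the data URI for each attachment and substitute it into the HTML
for filename, attachment in attachments.items():
    for mime, encoded in attachment.items():
        data_uri = f"data:{mime};base64,{encoded}"
        body = body.replace(f'src="attachment:{filename}"', f'src="{data_uri}"')

print(body)
```

After the substitution the HTML no longer depends on the notebook's attachment store, which is why the images survive the export.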