Storytelling with Data! in Altair�

by Maisa de Oliveira Fraiz�

Introduction�

This project aims to replicate the exercises from Cole Nussbaumer Knaflic's book, "Storytelling with Data - Let's Practice!", using Python's Vega-Altair. Our primary objective is to document the reasoning behind the modifications proposed by the author, while also highlighting the challenges that arise when transitioning from the book's Excel-based approach to programming in a different software environment.

Vega-Altair was selected for this project due to its declarative syntax, interactivity, grammar of graphics, and compatibility with web formatting tools, while within the user-friendly Python environment. Anticipated challenges include the comparatively smaller documentation and development community of Vega-Altair compared to more established libraries like Matplotlib, Seaborn, or Plotly, and seemingly straightforward tasks in Excel that may require multiple iterations to translate effectively into the language.

In addition to the broader objective, this notebook also serves as a personal journey of learning Altair, a syntax that was previously unfamiliar to me. By delving into it, I aim to widen my repertoire in the data visualization field, discovering new ways to create compelling visual representations.

The data for all exercises can be found in the book's official website: https://www.storytellingwithdata.com/letspractice/downloads

Imports�

These are the libraries necessary to run the code for this project.

In�[1]:
# For data manipulation and visualization
import pandas as pd
import numpy as np
import altair as alt

# For animation in Chapter 6 - Exercise 6
import ipywidgets as widgets
from ipywidgets import interact
from IPython.display import clear_output

# For converting .ipynb into .html
import nbconvert
import nbformat

# For printing current date
from datetime import date

And these are the versions used.

In�[2]:
# Python version
! python --version
Python 3.11.6
In�[3]:
# Library version

print("Pandas version: " + pd.__version__)
print("Numpy version: " + np.__version__)
print("Altair version: " + alt.__version__)
print("Ipywidgets version: " + widgets.__version__)
print("Nbconvert version: " + nbconvert.__version__)
print("Nbformat version: " + nbformat.__version__)
Pandas version: 2.1.2
Numpy version: 1.26.0
Altair version: 5.1.2
Ipywidgets version: 8.1.1
Nbconvert version: 7.12.0
Nbformat version: 5.9.2
In�[4]:
# Last update of notebook

print("Code last updated at:", date.today())
Code last updated at: 2023-12-10

Table of Contents�

  • Chapter 2

    • Exercise 1
    • Exercise 4
    • Exercise 5
  • Chapter 3

    • Exercise 2
  • Chapter 4

    • Exercise 2
    • Exercise 3
  • Chapter 5

    • Exercise 4 (Inspired)
  • Chapter 6

    • Exercise 6

Chapter 2 - Choose an effective visual�

"When I have some data I need to show, how do I do that in an effective way?" - Cole Nussbaumer Knaflic

Exercise 2.1 - Improve this table�

For this exercise, we will start with a simple table and work our way into transforming it into different types of commonly used visualizations.

If you would like to return to the Table of Contents, you can click here.

Loading the data�

The first problem with the Excel-to-Altair translation arises from the data itself, as it is polluted with titles and texts for readability in Excel. This, however, is not friendly when dealing with Python, so we should be careful when loading it. Alterations like this will happen in all subsequent exercises.

In�[5]:
# Example of polluted loading

table = pd.read_excel(r"Data\2.1 EXERCISE.xlsx")
table
Out[5]:
EXERCISE 2.1 Unnamed: 1 Unnamed: 2 Unnamed: 3 Unnamed: 4 Unnamed: 5
0 NaN NaN NaN NaN NaN NaN
1 NaN FIG 2.1a NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN
3 NaN New client tier share NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN
5 NaN Tier # of Accounts % Accounts Revenue ($M) % Revenue
6 NaN A 77 0.070772 4.675 0.25
7 NaN A+ 19 0.017463 3.927 0.21
8 NaN B 338 0.310662 5.984 0.32
9 NaN C 425 0.390625 2.805 0.15
10 NaN D 24 0.022059 0.374 0.02
In�[6]:
del table
In�[7]:
# Correctly configures loading
table = pd.read_excel(r"Data\2.1 EXERCISE.xlsx", usecols = [1, 2, 3, 4, 5], header = 6)
table
Out[7]:
Tier # of Accounts % Accounts Revenue ($M) % Revenue
0 A 77 0.070772 4.675 0.25
1 A+ 19 0.017463 3.927 0.21
2 B 338 0.310662 5.984 0.32
3 C 425 0.390625 2.805 0.15
4 D 24 0.022059 0.374 0.02

Table�

The initial changes recommended in the book focus on improving the table's readability itself. These changes include reordering the tiers, adding a row to show the total value, incorporating a category called "All others" to account for unmentioned values when the total percentage doesn't add up to 100%, and rounding the numbers while adjusting the percentage format as required.

The following code implements these modifications.

In�[8]:
# Ordering the tiers

table = table.loc[[1, 0, 2, 3, 4]]
In�[9]:
# Fixing the percentages

table['% Accounts'] = table['% Accounts'].apply(lambda x: x*100)
table['% Revenue'] = table['% Revenue'].apply(lambda x: x*100)
In�[10]:
# Calculating and adding "All other" values

other_account_per = 100 - table['% Accounts'].sum()
other_revenue_per = 100 - table['% Revenue'].sum()

other_account_num = (other_account_per*table['# of Accounts'][0])/table['% Accounts'][0]
other_revenue_num = (other_revenue_per*table['Revenue ($M)'][0])/table['% Revenue'][0]

table.loc[len(table)] = ["All other", other_account_num, other_account_per, other_revenue_num, other_revenue_per]
In�[11]:
# Since we will not use rounded values or the total row for the graphs,
# we should create a new variable before making the following alterations

table_charts = table.copy()
In�[12]:
# Adding total values row

table.loc[len(table)] = ["Total", table['# of Accounts'].sum(), table['% Accounts'].sum(),
                        table['Revenue ($M)'].sum(), table['% Revenue'].sum()]
In�[13]:
# Rounding the numbers

table['% Accounts'] = table['% Accounts'].apply(lambda x: round(x))
table['Revenue ($M)'] = table['Revenue ($M)'].apply(lambda x: round(x, 1))

The new table is as follows:

In�[14]:
table
Out[14]:
Tier # of Accounts % Accounts Revenue ($M) % Revenue
1 A+ 19.0 2 3.9 21.0
0 A 77.0 7 4.7 25.0
2 B 338.0 31 6.0 32.0
3 C 425.0 39 2.8 15.0
4 D 24.0 2 0.4 2.0
5 All other 205.0 19 0.9 5.0
6 Total 1088.0 100 18.7 100.0

or, for even better readability in Python:

In�[15]:
table.set_index("Tier")
Out[15]:
# of Accounts % Accounts Revenue ($M) % Revenue
Tier
A+ 19.0 2 3.9 21.0
A 77.0 7 4.7 25.0
B 338.0 31 6.0 32.0
C 425.0 39 2.8 15.0
D 24.0 2 0.4 2.0
All other 205.0 19 0.9 5.0
Total 1088.0 100 18.7 100.0

Some changes were not implemented, such as colors of rows, alignment of text, and embedding graphs into the table, for lack of compatibility with the Pandas DataFrame format. The percentage symbol (%) next to the number in the percentage columns was not added since doing this in Python will transform the data from int to string, and therefore it is not a recommended approach.

Pie chart�

Considering that percentages depict a fraction of a whole, the next proposal is to employ a pie chart. Here is the default Altair graph version:

In�[16]:
# Default pie chart

alt.Chart(table_charts).mark_arc().encode(
    theta = "% Accounts",
    color = alt.Color('Tier')
)
Out[16]:

Some of the adjustments needed to bring it closer to the original include reordering the tiers, changing the labels position, altering the color palette, and adding an title.

In�[17]:
## % of Accounts Pie Chart

# Creating a base chart with a title, aligned to the left and with normal font weight
base = alt.Chart(
    table_charts, 
    title = alt.Title(r"% of Total Accounts", anchor = 'start', fontWeight = 'normal')
).encode(
    theta = alt.Theta("% Accounts:Q", stack = True), # Encoding the angle (theta) for the pie chart
    color = alt.Color('Tier', legend = None), # Encoding color based on the 'Tier' field
    order = alt.Order(field = 'Tier') # Ordering the sectors of the pie chart based on the 'Tier' field
)

# Creating the pie chart with an outer radius of 115
pie = base.mark_arc(outerRadius = 115)

# Creating text labels for each sector of the pie chart
text = base.mark_text(radius = 140, size = 15).encode(text = alt.Text("Tier"))

# Combining the pie chart and text labels
acc_pie = pie + text
acc_pie
Out[17]:

Not informing the data type for the field order makes it so Altair rearranges the Tiers alphabetically instead of using the order provided by dataframe. We can fix this by identifying Tier as Ordered (O).

In�[18]:
## % of Accounts Pie Chart

base = alt.Chart(
    table_charts, 
    title = alt.Title(r"% of Total Accounts", anchor = 'start', fontWeight = 'normal')
).encode(
    theta = alt.Theta("% Accounts:Q", stack = True),
    color = alt.Color('Tier',
                      scale = alt.Scale(
                          range = ['#4d71bc', '#5d9bd4', '#6fae45', '#febf0f', '#e77e2d', '#a6a6a6']
                        ), # Setting custom colors for each sector of the pie chart
                      sort = None, # So that the colors don't follow the alphabetic order
                      legend = None
                      ),
    order = alt.Order(field = 'Tier:O'))


pie = base.mark_arc(outerRadius = 115)
text = base.mark_text(radius = 140, size = 15).encode(text = alt.Text("Tier"))


acc_pie = pie + text
acc_pie
Out[18]:

Initially, offset was used instead of anchor, manually specifying the title location in the x-axis by pixels. This produces a more replica-like result, as you define the texts to be exactly to the same place as the example. While this approach yields a result that closely mimics the example, we acknowledge that anchoring provides a faster and cleaner solution. The decision has been made to adopt anchoring for the remainder of this project, prioritizing efficiency and universality across all graphs, even if it means sacrificing pinpoint accuracy in text placement.

The HEX color code values of the palette from the book were acquired through the use of the online tool "Color Picker Online", which is freely accessible at https://imagecolorpicker.com/.

The pie chart above can now be easily modified to represent the percentage of total revenue.

In�[19]:
# % of Revenue Pie Chart

base = alt.Chart(
    table_charts, 
    title = alt.Title(r"% of Total Revenue", anchor = 'start', fontWeight = 'normal')
).encode(
    theta = alt.Theta("% Revenue:Q", stack = True),
    color = alt.Color('Tier',
                      scale = alt.Scale(
                          range = ['#4d71bc', '#5d9bd4', '#6fae45', '#febf0f', '#e77e2d', '#a6a6a6']
                        ),
                      sort = None,
                      legend = None
                      ),
    order = alt.Order(field = 'Tier:O'))


pie = base.mark_arc(outerRadius = 115)
text = base.mark_text(radius = 140, size = 15).encode(text = alt.Text("Tier"))


rev_pie = pie + text
rev_pie
Out[19]:

With both graphs available, we can add them next to each other and include a main title.

In�[20]:
# Combining two pie charts using the vertical concatenation operator '|'
pies = acc_pie | rev_pie

# Setting properties for the combined pie charts
pies.properties(
    title = alt.Title('New Client Tier Share', offset = 10, fontSize = 20)  # Adding a title with specific offset and font size
)
Out[20]:

Visualization as depicted in the book:

Alt text

Pie charts can present readability challenges, as the human eye struggles to differentiate the relative volumes of slices effectively. While adding data percentages next to the slices can enhance comprehension, it may also introduce unnecessary clutter to the visualization.

Bar chart�

The next graph proposed to tackle is a horizontal bar chart. Since now the comparison does not involve angles and are aligned at the start point, discerning the segment's scale is easier.

This is the default representation in Altair:

In�[21]:
# Default altair bar chart

alt.Chart(table_charts).mark_bar().encode(
    y = alt.Y('Tier'),
    x = alt.X('% Accounts')
    )
Out[21]:

The necessary adjustments involve placing the Tier label in the upper left corner, displaying values next to the bars instead of using an x-axis, and adding a title while rearranging the tiers.

In�[22]:
# Creating a base chart with a title, aligned to the left and with normal font weight
base = alt.Chart(
    table_charts,
    title = alt.Title('TIER | % OF TOTAL ACCOUNTS', anchor = 'start', fontWeight = 'normal')
).mark_bar().encode(
    y = alt.Y('Tier', title = None),  # Encoding the 'Tier' field on the y-axis, without a specific title
    x = alt.X('% Accounts', axis = None),  # Encoding the '% Accounts' field on the x-axis, without axis labels
    order = alt.Order(field = 'Tier:O'),  # Ordering the bars based on the 'Tier' field
    text = alt.Text("% Accounts", format = ".0f")  # Displaying the '% Accounts' values as text, formatted to have no decimal places
)

# Creating the final bar chart by combining the bars and text labels
final_acc = base.mark_bar() + base.mark_text(align = 'left', dx = 2)

# Displaying the final bar chart
final_acc
Out[22]:

Adding the order by Tier:O didn't had the same effect as it did on the pie chart. The compatible method for this case is adding a sort keyword in the axis to be sorted.

In�[23]:
base = alt.Chart(
    table_charts,
    title = alt.Title('TIER   | % OF TOTAL ACCOUNTS     |', anchor = 'start', fontWeight = 'normal')
).encode(
    y = alt.Y('Tier', sort = ["A+"], title = None),  # Encoding the 'Tier' field on the y-axis, with a specific sorting order
    x = alt.X('% Accounts', axis = None), 
    text = alt.Text("% Accounts", format = ".0f")
)

final_acc = (base.mark_bar() + base.mark_text(align = 'left', dx = 2)).properties(width = 150) # Setting te width
final_acc
Out[23]:

Now we do the same for the revenue column. In addition, the y-axis is removed so it isn't repeated when uniting the charts.

In�[24]:
base = alt.Chart(
    table_charts, 
    title = alt.Title('% OF TOTAL REVENUE', anchor = 'start', fontWeight = 'normal')
).encode(
    y = alt.Y('Tier', sort = ["A+"]).axis(None),
    x = alt.X('% Revenue').axis(None),
    text = alt.Text("% Revenue", format = ".0f")
)

final_rev = (base.mark_bar() + base.mark_text(align = 'left', dx = 2)).properties(width = 150)
final_rev
Out[24]:

Similar to the pie chart, we can arrange these graphs side by side and include a main title.

In�[25]:
# Combining two charts horizontally using the concatenation operator '|'
hor_bar = final_acc | final_rev

# Configuring the view of the combined chart, removing strokes
hor_bar.configure_view(stroke = None).properties(
    title = alt.Title('New Client Tier Share', anchor = 'start', fontSize = 20)  # Adding title
)
Out[25]:

Visualization as depicted in the book:

Alt text

In both the pie and bar chart, the labeling beside the value is not in the same position as the examples provided. This discrepancy arises from the fact that adjusting these labels to match the book's examples, with variations in positions (some inside and some outside of the pie), different colors, and even omitting some numbers, would be a labor-intensive manual task in Altair. These adjustments are primarily for aesthetic purposes and do not significantly impact readability, in some cases even obscuring the information being presented.

Examples of how to manually define labels will be presented in future exercises.

Horizontal dual series bar chart�

The two graphs in the last visualization can be merged into a single grouped bar chart.

In�[26]:
# Altair with default settings
alt.Chart(table_charts).mark_bar().encode(
    x = alt.X('value:Q'),  # Encoding the quantitative variable 'value' on the x-axis
    y = alt.Y('variable:N'),  # Encoding the nominal variable 'variable' on the y-axis
    color = alt.Color(
        'variable:N', 
        legend = alt.Legend(title = 'Metric')
    ),  # Encoding color based on 'variable' with legend title 'Metric'
    row = alt.Row('Tier:O')  # Faceting by rows based on the ordinal variable 'Tier'
).transform_fold(
    fold = ['% Accounts', '% Revenue'],  # Transforming the data by folding the specified columns
    as_ = ['variable', 'value']  # Renaming the folded columns to 'variable' and 'value'
)
Out[26]:

The necessary alterations involve removing the grid, adjusting label positions and reducing redundancy, adding a title and subtitle, and changing the color palette.

In�[27]:
# Custom settings
merged_hor_bar = alt.Chart(
    table_charts,
    title = alt.Title('New client tier share', fontSize = 20)  # Adding a title with specific font size
).mark_bar().encode(
    x = alt.X(
        'value:Q',
        axis = alt.Axis(
            title = "TIER |  % OF TOTAL ACCOUNTS vs REVENUE",  # Setting a custom title for the x-axis
            grid = False, # Remove grid
            orient = 'top', # Put axis on top
            labelColor = "#888888",  # Setting the label color as gray
            titleColor = '#888888'  # Setting the title color as gray
        )
    ),
    y = alt.Y(
        'variable:N',
        axis = alt.Axis(title = None, labels = False, ticks = False)  # Removing y-axis title, labels, and ticks
    ),
    color = alt.Color(
        'variable:N',
        legend = alt.Legend(title = 'Metric'),  # Adding a legend with a custom title
        scale = alt.Scale(range = ['#b4c6e4', '#4871b7'])  # Setting a custom color range
    ),
    row = alt.Row(
        'Tier:O',
        header = alt.Header(labelAngle = 0, labelAlign = "left"),  # Rotating row labels and aligning to the left
        title = None,
        sort = ['A+'],  # Sorting rows based on 'Tier'
        spacing = 10  # Adding spacing between rows
    )
).transform_fold(
    fold = ['% Accounts', '% Revenue'],  # Transforming the data by folding the specified columns
    as_ = ['variable', 'value']  # Renaming the folded columns to 'variable' and 'value'
).properties(
    width = 200  # Setting the width of the chart
).configure_view(stroke = None)  # Removing the stroke from the view

merged_hor_bar
Out[27]:

Visualization as depicted in the book:

Alt text

Vertical bar chart�

We should can modify the bar chart to be in a vertical orientation. This can be done by switching the y and x axis and the Row class to the Column class, as well as reorient the labels.

In�[28]:
# Creating a vertical bar chart
vert_bar = alt.Chart(
    table_charts,
    title = alt.Title('New client tier share', fontSize = 20)  # Adding a title with specific font size
).mark_bar().encode(
    y = alt.Y(
        'value:Q',
        axis = alt.Axis(
            title = "% OF TOTAL ACCOUNTS vs REVENUE",  # Setting a custom title for the y-axis
            titleAlign = 'left',  # Aligning the title to the left
            titleAngle = 0,  # Setting the title angle to 0 degrees
            titleAnchor = 'end',  # Anchoring the title to the end
            titleY = -10,  # Adjusting the title position
            grid = False,  # Turning off grid lines
            labelColor = "#888888",  # Setting the label color to gray
            titleColor = '#888888'  # Setting the title color to gray
        )
    ),
    x = alt.X(
        'variable:N',
        axis = alt.Axis(title = None, labels = False, ticks = False)  # Removing x-axis title, labels, and ticks
    ),
    color = alt.Color(
        'variable:N',
        legend = alt.Legend(title = 'Metric'),  # Adding a legend with a custom title
        scale = alt.Scale(range = ['#b4c6e4', '#4871b7'])  # Setting a custom color range
    ),
    column = alt.Column(
        'Tier:O',
        header = alt.Header(labelOrient = 'bottom', titleOrient = "bottom", titleAnchor = "start"), # Adjusting column header settings
        sort = ['A+'],  # Sorting columns based on 'Tier'
        title = 'TIER'  # Adding a title for the column
    )
).transform_fold(
    fold = ['% Accounts', '% Revenue'],  # Transforming the data by folding the specified columns
    as_ = ['variable', 'value']  # Renaming the folded columns to 'variable' and 'value'
).properties(
    width = 50  # Setting the width of the chart
).configure_view(stroke = None)  # Removing the stroke from the view

vert_bar
Out[28]:

Visualization as depicted in the book:

Alt text

It is worth noting that titles in Altair do not readily support the option of changing the colors of individual words within them. As a simple solution for the time being, we will retain the legend that effectively indicates which column corresponds to each word. Future exercises will delve into a more complicated way to tackle this challenge.

In the code above, we've utilized the transform_fold method to generate the grouped bar chart because our data is structured in the 'wide form', which is the standard Excel format. However, Altair (as well as other visualization languages) is inherently designed to work with 'long form' data. The transform_fold function automates this conversion within the chart, enabling us to create the graph. This approach can obscure the process, making it preferable to perform the data transformation before creating the visualizations.

In�[29]:
# Transforms the data to the long-form format

melted_table = pd.melt(table_charts, id_vars = ['Tier'], var_name = 'Metric', value_name = 'Value')
melted_table
Out[29]:
Tier Metric Value
0 A+ # of Accounts 19.000000
1 A # of Accounts 77.000000
2 B # of Accounts 338.000000
3 C # of Accounts 425.000000
4 D # of Accounts 24.000000
5 All other # of Accounts 205.000000
6 A+ % Accounts 1.746324
7 A % Accounts 7.077206
8 B % Accounts 31.066176
9 C % Accounts 39.062500
10 D % Accounts 2.205882
11 All other % Accounts 18.841912
12 A+ Revenue ($M) 3.927000
13 A Revenue ($M) 4.675000
14 B Revenue ($M) 5.984000
15 C Revenue ($M) 2.805000
16 D Revenue ($M) 0.374000
17 All other Revenue ($M) 0.935000
18 A+ % Revenue 21.000000
19 A % Revenue 25.000000
20 B % Revenue 32.000000
21 C % Revenue 15.000000
22 D % Revenue 2.000000
23 All other % Revenue 5.000000

We can now use this table to remake the bar chart without the transform_fold method.

In�[30]:
# Selecting specific rows from the melted table based on the 'Metric' column
selected_rows = melted_table[melted_table['Metric'].isin(['% Accounts', '% Revenue'])]

vert_bar2 = alt.Chart(
    selected_rows,
    title = alt.Title('New client tier share', fontSize = 20)  # Adding a title with specific font size
).mark_bar().encode(
    y = alt.Y(
        'Value',
        axis = alt.Axis(
            title = "% OF TOTAL ACCOUNTS vs REVENUE",  # Setting a custom title for the y-axis
            titleAlign = 'left',  # Aligning the title to the left
            titleAngle = 0,  # Setting the title angle to 0 degrees
            titleAnchor = 'end',  # Anchoring the title to the end
            titleY = -10,  # Adjusting the title position
            grid = False,  # Turning off grid lines
            labelColor = "#888888",  # Setting the label color to gray
            titleColor = '#888888'  # Setting the title color to gray
        )
    ),
    x = alt.X(
        'Metric',
        axis = alt.Axis(title = None, labels = False, ticks = False)  # Removing x-axis labels and ticks
    ),
    color = alt.Color(
        'Metric',
        scale = alt.Scale(range = ['#b4c6e4', '#4871b7'])  # Setting a custom color range
    ),
    column = alt.Column(
        'Tier',
        header = alt.Header(labelOrient='bottom', titleOrient="bottom", titleAnchor="start"),  # Adjusting column header settings
        sort = ['A+'],  # Sorting columns based on 'Tier'
        title = 'TIER'  # Adding a title for the column
    )
).properties(
    height = 200, width = 50  # Setting the height and width of the chart
).configure_view(stroke = None)  # Removing the stroke from the view

vert_bar2
Out[30]:

Bar chart with lines�

The next proposed graph is an extension of the previous bar chart, featuring the addition of lines to accentuate the endpoints of the columns within the same tier.

However, due to the nature of faceted charts, we encounter an error (ValueError: Faceted charts cannot be layered. Instead, layer the charts before faceting) when attempting to layer it. This issue arises because, in faceted charts, the x-axis structure is altered.

Now that we've transformed our data into long-format, we can work around this problem by creating our graph without using the column method, and thereby, avoiding faceting. Instead of specifying x as Metric, y as Value, color as Metric, and column as Tier, we can redefine x as Tier, y as Value, color as Metric, and introduce XOffset for controlling the horizontal positioning of data points within a group. In essence, column primarily serves to define distinct x-axis categories, while XOffset is employed to manage the horizontal placement of data points within a group.

The following chart incorporates the alterations we discussed and yields a graph that closely resembles the previous one.

In�[31]:
bar = alt.Chart(
    selected_rows,
    title = alt.Title('New client tier share', fontSize = 20, anchor = 'start')
).mark_bar().encode(
    x = alt.X(
        'Tier',
        axis = alt.Axis(
            title = 'TIER',  # Setting a custom title for the x-axis
            labelAngle = 0,  # Setting the label angle to 0 degrees
            titleAnchor = "start",  # Anchoring the title to the start
            domain = False,  # Hiding the x-axis domain line
            ticks = False  # Hiding the x-axis ticks
        ),
        sort = ['A+']  # Sorting x-axis based on 'Tier'
    ),
    y = alt.Y(
        'Value',
        axis = alt.Axis(
            title = "% OF TOTAL ACCOUNTS vs REVENUE",  # Setting a custom title for the y-axis
            titleAlign = 'left',  # Aligning the title to the left
            titleAngle = 0,  # Setting the title angle to 0 degrees
            titleAnchor = 'end',  # Anchoring the title to the end
            titleY = -10,  # Adjusting the title position
            grid = False,  # Turning off grid lines
            labelColor = "#888888",  # Setting the label color to gray
            titleColor =' #888888'  # Setting the title color to gray
        )
    ),
    color = alt.Color(
        'Metric',
        scale = alt.Scale(range = ['#b4c6e4', '#4871b7'])  # Setting a custom color range
    ),
    xOffset = 'Metric'  # Adjusting the x-offset
).properties(
    height = 250, width = 375  # Setting the height and width of the chart
)

bar.configure_view(stroke = None)  # Removing the stroke from the view

bar
Out[31]:

Now, we can layer the graph and introduce the lines. It's worth noting that creating the lines in Altair is not a straightforward task and a considerable amount of documentation searching was necessary to achieve it.

In�[32]:
# x, y and y2 do not accept to be defined as "condition", so repetitive code is necessary

# Create a vertical rule chart for ascending lines
rule_asc = alt.Chart(selected_rows).mark_rule(x2Offset = 10, xOffset = -10).encode(
    x = alt.X('Tier', sort = ['A+']),  # X-axis encoding for 'Tier', sorted in a specific order
    x2 = alt.X2('Tier'),  # End point of the rule line
    y = alt.Y('min(Value)'),  # Start point of the rule line
    y2 = alt.Y2('max(Value)'),  # End point of the rule line
    strokeWidth = alt.value(2),  # Set the stroke width of the rule line
    opacity = alt.condition(
        (alt.datum.Tier == 'A+') |  # Condition for specific tiers
        (alt.datum.Tier == 'A') |   # to determine opacity settings
        (alt.datum.Tier == 'B'),
        alt.value(1), alt.value(0)  # Opacity set to 1 if condition is met, else 0
    )
)

# Create a vertical rule chart for descending lines
rule_desc = alt.Chart(selected_rows).mark_rule(x2Offset = 10, xOffset = -10
).encode(
    x = alt.X('Tier', sort = ['A+']),
    x2 = alt.X2('Tier'),
    y = alt.Y('max(Value)'),
    y2 = alt.Y2('min(Value)'),
    strokeWidth = alt.value(2), 
    opacity = alt.condition(
        (alt.datum.Tier == 'A+') | 
        (alt.datum.Tier == 'A')  |
        (alt.datum.Tier == 'B'), 
        alt.value(0), alt.value(1)
        )
    )

# Points of % Revenue where % Revenue > % Accounts
points1 = alt.Chart(selected_rows).mark_point(filled = True, xOffset = 10, color = "black").encode(
    x = alt.X('Tier', sort = ['A+']),
    y = alt.Y('max(Value)'),
    opacity = alt.condition(
        (alt.datum.Tier == 'A+') | 
        (alt.datum.Tier == 'A')  |
        (alt.datum.Tier == 'B'), 
        alt.value(1), alt.value(0)
        )
    )

# Points of % Revenue where % Revenue < % Accounts
points2 = alt.Chart(selected_rows).mark_point(filled = True, xOffset = 10, color = "black").encode(
    x = alt.X('Tier', sort = ['A+']),
    y = alt.Y('min(Value)'),
    opacity = alt.condition(
        (alt.datum.Tier == 'A+') | 
        (alt.datum.Tier == 'A')  |
        (alt.datum.Tier == 'B'), 
        alt.value(0), alt.value(1)
        )
    )

# Points of % Accounts where % Revenue < % Accounts
points3 = alt.Chart(selected_rows).mark_point(filled = True, xOffset = -10, color = "black").encode(
    x = alt.X('Tier', sort = ['A+']),
    y = alt.Y('max(Value)'),
    opacity = alt.condition(
        (alt.datum.Tier == 'A+') | 
        (alt.datum.Tier == 'A')  |
        (alt.datum.Tier == 'B'), 
        alt.value(0), alt.value(1)
        )
    )

# Points of % Revenue where % Revenue > % Accounts
points4 = alt.Chart(selected_rows).mark_point(filled = True, xOffset = -10, color = "black").encode(
    x = alt.X('Tier', sort = ['A+']),
    y = alt.Y('min(Value)'),
    opacity = alt.condition(
        (alt.datum.Tier == 'A+') | 
        (alt.datum.Tier == 'A')  |
        (alt.datum.Tier == 'B'), 
        alt.value(1), alt.value(0)
        )
    )

bar_point = bar + rule_asc + rule_desc + points1 + points2 + points3 + points4
bar_point.configure_view(stroke = None)
Out[32]:

Visualization as depicted in the book:

Alt text

Lines only�

With two types of visualizations displaying the same data, the book suggests to eliminate the bars altogether. This can be done without the need to program more graphs:

In�[33]:
# Configure the legend to be disabled (hidden)
# The 'opacity' configuration is set to 0, making the bars transparent
point = bar_point.configure_mark(opacity = 0).configure_view(stroke = None).configure_legend(disable = True)
point
Out[33]:

Visualization as depicted in the book:

Alt text

Slope graph�

At last, we can reassemble the lines to create a slope graph.

In�[34]:
# Create base chart, setting the title
base = alt.Chart(
    selected_rows, 
    title = alt.Title("New client tier share", anchor = 'start', fontWeight = 'normal', fontSize = 20)
)

# Line chart configuration
line = base.mark_line(
    point = True # The lines have points at the end
).encode(
    x = alt.X(
        'Metric', 
        axis = alt.Axis(title = None, labelAngle = 0, domain = False, ticks = False)
    ),
    y = alt.Y('Value', axis = None),
    color = alt.Color(
        'Tier', 
        scale = alt.Scale(range = ['black']), # All Tier lines are black
        legend = None
        )
).properties(
    width = 300,
    height = 350
)

# Labels to the right of the slope
# These labels are for the Accounts
labels1 = base.mark_text(
    align = 'left',
    dx = 10
).encode(
    x = alt.X('Metric'),
    y = alt.Y('Value'),
    text = alt.Text('Value:Q', format = '.0f'),
    opacity = alt.condition(alt.datum.Metric == '% Accounts', alt.value(0), alt.value(0.7)) 
)

# Labels to the left of the slope
# These labels are for the Revenue
labels2 = base.mark_text(
    align ='left', 
    dx = -20
).encode(
    x = alt.X('Metric'),
    y = alt.Y('Value'),
    text = alt.Text('Value:Q', format='.0f'),
    opacity = alt.condition(alt.datum.Metric == '% Accounts', alt.value(0.7), alt.value(0)) 
)

# Labels for the Tiers
tier_labels = base.mark_text(
    align = 'left',
    dx = 30,
    fontWeight = 'bold'
).encode(
    x = alt.X('Metric'),
    y = alt.Y('Value'),
    text = 'Tier',
    opacity = alt.condition(alt.datum.Metric == '% Accounts', alt.value(0), alt.value(1)) # Show in only one side
)

# Tier title
tier_title = alt.Chart(
    {"values": [{"text":  ['TIER']}]}
).mark_text(
            align = "left", 
            dx = 105, 
            dy = -120,
            fontWeight = 'bold'
).encode(
    text = "text:N")

slope = line + labels1 + labels2 + tier_labels + tier_title
slope.configure_view(stroke = None)
Out[34]:

Notice how there are two numbers overlapping, the percentage of accounts of the A+ and D tiers. Since they are the same number when rounded, we can just eliminate one of the values display.

In�[35]:
# Eliminate Tier A+ from display
label_condition = (alt.datum.Metric == '% Accounts') & (alt.datum.Tier != 'A+')


labels2 = base.mark_text(
    align ='left', 
    dx = -20
).encode(
    x = alt.X('Metric'),
    y = alt.Y('Value'),
    text = alt.Text('Value:Q', format='.0f'),
    opacity = alt.condition(label_condition, alt.value(0.7), alt.value(0)) 
)

slope = line + labels1 + labels2 + tier_labels + tier_title
slope.configure_view(stroke = None)
Out[35]:

Visualization as depicted in the book:

Alt text

Interactivity�

In this exercise, the selected graph for interactivity is the simple vertical bar chart, without the lines.

The chosen interactive features include a simple tooltip, revealing the precise values of each column upon hovering. Additionally, it shows the legend to provide further clarity about the corresponding data categories.

Finally, the columns are designed to highlight dynamically when the viewer hovers over them. Because of this feature, the color palette was changed, since the monochromatic version made the highlighted column and the not highlighted neighbor too similar.

In�[36]:
# Selection for interactive points on hover
hover = alt.selection_point(on = 'mouseover', nearest = True, empty = False)

# Bar chart configuration with interactivity
bar_interactive = alt.Chart(
    selected_rows, 
    title = alt.Title('New client tier share', fontSize = 20, anchor = 'start')
).mark_bar().encode(
    x = alt.X(
        'Tier',
        axis = alt.Axis(title = 'TIER', labelAngle = 0, titleAnchor = "start", domain = False, ticks = False),
        sort = ['A+']
    ),
    y = alt.Y('Value', axis = alt.Axis(
        title = "% OF TOTAL ACCOUNTS vs REVENUE",
        titleAlign = 'left',
        titleAngle = 0,
        titleAnchor = 'end',
        titleY = -10,
        grid = False,
        labelColor = "#888888",
        titleColor = '#888888'
    )),
    color = alt.Color('Metric', scale = alt.Scale(range = ['#0a2f73', '#096b2b'])),
    xOffset = 'Metric',
    opacity = alt.condition(hover, alt.value(1), alt.value(0.5)),  # Set opacity based on hover
    tooltip = ['Value:Q', 'Metric']  # Show tooltip with specified fields
).properties(
    height = 250, width = 375
).add_params(hover)  # Add the hover selection to the chart

# Configure view settings for the interactive bar chart
bar_interactive.configure_view(stroke = None)
Out[36]:

Exercise 2.4 - Practice in your tool�

This exercise proposes to display the same data in six different formats, hand-drawn by the author in the theoretical exercise 2.3. The purpose of the activity is to practice in our own tool, and while C. Nussbaumer uses Excel, we will proceed with Altair.

If you would like to return to the Table of Contents, you can click here.

Alt text

Loading the data�

In�[37]:
# Loading considering the NaN caused by Excel formatting
table = pd.read_excel(r"Data\2.4 EXERCISE.xlsx", usecols = [1, 2, 3], header = 4)
table
Out[37]:
DATE CAPACITY DEMAND
0 2019-04 29263 46193
1 2019-05 28037 49131
2 2019-06 21596 50124
3 2019-07 25895 48850
4 2019-08 25813 47602
5 2019-09 22427 43697
6 2019-10 23605 41058
7 2019-11 24263 37364
8 2019-12 24243 34364
9 2020-01 25533 34149
10 2020-02 24467 25573
11 2020-03 25194 25284

In the graphs for this exercise, we require the inclusion of the "unmet demand" column, which is currently absent from the dataset. To obtain this value, we can calculate the difference between demand and capacity for each date.

In�[38]:
# Calculate Unmet Demand
table['UNMET DEMAND'] = table['DEMAND'] - table['CAPACITY']

# Show only the first five lines
table.head()
Out[38]:
DATE CAPACITY DEMAND UNMET DEMAND
0 2019-04 29263 46193 16930
1 2019-05 28037 49131 21094
2 2019-06 21596 50124 28528
3 2019-07 25895 48850 22955
4 2019-08 25813 47602 21789

Now we transform the data from the wide-format used in Excel to the long-format used in Altair.

In�[39]:
# Transforming data into long-format

melted_table = pd.melt(table, id_vars = ['DATE'], var_name = 'Metric', value_name = 'Value')
melted_table
Out[39]:
DATE Metric Value
0 2019-04 CAPACITY 29263
1 2019-05 CAPACITY 28037
2 2019-06 CAPACITY 21596
3 2019-07 CAPACITY 25895
4 2019-08 CAPACITY 25813
5 2019-09 CAPACITY 22427
6 2019-10 CAPACITY 23605
7 2019-11 CAPACITY 24263
8 2019-12 CAPACITY 24243
9 2020-01 CAPACITY 25533
10 2020-02 CAPACITY 24467
11 2020-03 CAPACITY 25194
12 2019-04 DEMAND 46193
13 2019-05 DEMAND 49131
14 2019-06 DEMAND 50124
15 2019-07 DEMAND 48850
16 2019-08 DEMAND 47602
17 2019-09 DEMAND 43697
18 2019-10 DEMAND 41058
19 2019-11 DEMAND 37364
20 2019-12 DEMAND 34364
21 2020-01 DEMAND 34149
22 2020-02 DEMAND 25573
23 2020-03 DEMAND 25284
24 2019-04 UNMET DEMAND 16930
25 2019-05 UNMET DEMAND 21094
26 2019-06 UNMET DEMAND 28528
27 2019-07 UNMET DEMAND 22955
28 2019-08 UNMET DEMAND 21789
29 2019-09 UNMET DEMAND 21270
30 2019-10 UNMET DEMAND 17453
31 2019-11 UNMET DEMAND 13101
32 2019-12 UNMET DEMAND 10121
33 2020-01 UNMET DEMAND 8616
34 2020-02 UNMET DEMAND 1106
35 2020-03 UNMET DEMAND 90

To simplify the data transformation process in the graphs, we will deviate from the "yyyy-mm" format for the date. Instead, we will create two separate columns, one for the year and another for the abbreviated name of the month. This adjustment will streamline our visualization efforts by reducing the need for extensive data transformations within the graphs themselves.

In�[40]:
# Transform the column into datetime format
melted_table['DATE'] = pd.to_datetime(melted_table['DATE'])

# Extracting year and month
melted_table['year'] = melted_table['DATE'].dt.year
melted_table['month'] = melted_table['DATE'].apply(lambda x: x.strftime('%b'))
In�[41]:
# The DATE column is no longer useful

melted_table.drop('DATE', axis = 1, inplace = True)
melted_table.head()
Out[41]:
Metric Value year month
0 CAPACITY 29263 2019 Apr
1 CAPACITY 28037 2019 May
2 CAPACITY 21596 2019 Jun
3 CAPACITY 25895 2019 Jul
4 CAPACITY 25813 2019 Aug

To further avoid data manipulation within the chart code, we will create auxiliary tables.

In�[42]:
# Making new sets of data

# Just 2019
table_2019  = melted_table[melted_table['year'].isin([2019])]
# Just Demand from 2019
demand_2019 = table_2019[table_2019['Metric'].isin(['DEMAND'])]
# Just Capacity from 2019
capacity_2019 = table_2019[table_2019['Metric'].isin(['CAPACITY'])]
# Just Unmet Demand from 2019
unmet_2019 = table_2019[table_2019['Metric'].isin(['UNMET DEMAND'])]
# Demand and Capacity from 2019
bar_table = table_2019[table_2019['Metric'].isin(['CAPACITY', 'DEMAND'])]
# Unmet Demand and Capacity from 2019
stacked_table = table_2019[table_2019["Metric"].isin(["CAPACITY", "UNMET DEMAND"])]

Bar chart�

While the author deliberately filled the Capacity columns while leaving Demand only outlined in the attempt to visually distinguish between what can be fulfilled (Capacity) and the unmet portion of the requirement (Unmet Demand), Altair is not easily compatible with that choice.

The variable which dictates if the mark will be filled does not accept a condition as its value. Since the author itself admits the shortcomings of this approach ("I find the outline plus the white space between the bars visually jarring"), we chose to differentiate the data by color, as it is traditional.

In�[43]:
# Unfilled version
alt.Chart(
    bar_table,
    title = alt.Title(
        "Demand vs capacity over time", anchor = "start", offset = 20, fontSize = 16 # Set customized title
    ),
).mark_bar(filled = False).encode( # Filled = False makes the bars unfilled
    y = alt.Y(
        "Value",
        axis = alt.Axis(
            grid = False, # Removes grid
            titleAnchor = "end",
            labelColor = "#888888", # Changes the label color to gray
            titleColor = "#888888", # Changes the title color to gray
            titleFontWeight = "normal"
        ),
        scale = alt.Scale(domain = [0, 60000]), # y-axis goes from 0 to 60000
        title = "NUMBER OF PROJECT HOURS"
    ),
    x = alt.X(
        "month",
        sort = None,
        axis = alt.Axis(
            labelAngle = 0, # Makes label horizontal
            titleAnchor = "start",
            labelColor = "#888888", # Changes label and title color to gray
            titleColor = "#888888",
            titleFontWeight = "normal",
            ticks = False # Removes ticks from axis
        ),
        title = "2019"
    ),
    color = alt.Color( # Sets colors based on Metric (Demand and Capacity)
        "Metric", scale = alt.Scale(range = ["#b4c6e4", "#4871b7"]), sort = "descending"
    ), 
    xOffset = alt.XOffset("Metric", sort = "descending") # Sets offset on the x-axis 
).configure_view(
    stroke = None
)  # Remove the chart border
Out[43]:
In�[44]:
# Filled version
alt.Chart(
    bar_table,
    title = alt.Title(
        "Demand vs capacity over time", anchor = "start", offset = 20, fontSize = 16 # Set customized title
    ),
).mark_bar().encode( # Filled = True is default
    y = alt.Y(
        "Value",
        axis = alt.Axis(
            grid = False, # Removes grid
            titleAnchor = "end",
            labelColor = "#888888", # Changes the label color to gray
            titleColor = "#888888", # Changes the title color to gray
            titleFontWeight = "normal"
        ),
        scale = alt.Scale(domain = [0, 60000]), # y-axis goes from 0 to 60000
        title = "NUMBER OF PROJECT HOURS"
    ),
    x = alt.X(
        "month",
        sort = None,
        axis = alt.Axis(
            labelAngle = 0, # Makes label horizontal
            titleAnchor = "start",
            labelColor = "#888888", # Changes label and title color to gray
            titleColor = "#888888",
            titleFontWeight = "normal",
            ticks = False # Removes ticks from axis
        ),
        title = "2019"
    ),
    color = alt.Color( # Sets colors based on Metric (Demand and Capacity)
        "Metric", scale = alt.Scale(range = ["#b4c6e4", "#4871b7"]), sort = "descending"
    ), 
    xOffset = alt.XOffset("Metric", sort = "descending") # Sets offset on the x-axis 
).configure_view(
    stroke = None
)  # Remove the chart border
Out[44]:

Visualization as depicted in the book:

Alt text

Line graph�

Cleaner than the bar chart, the next step was to convey the data using the line graph, with the labeling beside each line, along with the final value of the year. This helps the viewer to visualize the difference between the capacity and the demand.

In�[45]:
line = (
    alt.Chart(
        bar_table,
        title = alt.Title(
            "Demand vs capacity over time",
            fontSize = 18,
            fontWeight = "normal",
            anchor = "start",
            offset = 10 # Offsets the title in the y-axis
        )
    )
    .mark_line()  # Using a line mark for the chart
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "NUMBER OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,  # Disabling sorting for better time representation
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888", # Set colors to gray
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019"
        ),
        color = alt.Color(
            "Metric",
            scale = alt.Scale(range = ["#1f77b4", "#1f77b4"]), 
            legend = None
        ),
        strokeWidth = alt.condition(
            "datum.Metric == 'CAPACITY'", alt.value(3), alt.value(1)
        )  # Adjusting line thickness based on the metric
    )
    .properties(width = 350, height = 250) # Set size of the graph
)

# Adding labels
label = (
    alt.Chart(bar_table)
    .mark_text(align = "left", dx = 3)
    .encode(
        x = alt.X("month", sort = None, aggregate = "max"),
        y = alt.Y("Value", aggregate = {"argmax": "month"}),
        text = alt.Text("Metric"), # The text itself is the Metric
        color = alt.Color("Metric", scale = alt.Scale(range = ["#1f77b4", "#1f77b4"]))
    )
)

# Combining the line chart and labels
line + label
Out[45]:

As it is possible to notice, defining the label position as the maximum argument of the y-axis did not yield the intended result. This is because Altair is considering the values in an alphabetical order (making Sept the last month), even when setting sort = None in the x-axis.

Since documentation fixing this issue was not found, the next approach was adding the label manually. This also assist the process of adding the value next to the metric.

In�[46]:
# Demand label
label1 = alt.Chart({"values": 
                    [{"text":  ['34K DEMAND']}]
                    }
                    ).mark_text(size = 10, 
                                align = "left", 
                                dx = 160, dy = -15, 
                                color = '#1f77b4' # Color it blue
                                ).encode(text = "text:N")

# Capacity label
label2 = alt.Chart({"values": 
                    [{"text":  ['24K CAPACITY']}]
                    }
                    ).mark_text(size = 10, 
                                align = "left", 
                                dx = 160, dy = 25, 
                                color = '#1f77b4', # Color it blue
                                fontWeight = 'bold'
                                ).encode(text = "text:N")

line_final = line + label1 +  label2
line_final.configure_view(stroke = None)
Out[46]:

Visualization as depicted in the book:

Alt text

Overlapping bars�

The author now explores overlapping bars, wherein two bar graphs are positioned on top of each other, sharing the same axis. The Capacity data is displayed with transparency to prevent any potential confusion that might arise with a stacked bar chart.

In this particular graph, our choice was to emulate the column labeling using a title with different colors, despite Altair not providing a straightforward method for such customization. Unlike previous examples where the default legend effectively distinguished colors, the current data distinction — "opaque" or "transparent" — is better conveyed by utilizing normal or bold text in the title instead of relying on a legend with colors.

In�[47]:
# Demand bar, with bigger spacing between them, unfilled
demand = (
    alt.Chart(
        demand_2019,
        width = alt.Step(40), # Defines the width of the bars (including distance between them)
        title = alt.Title(
            "Demand vs capacity over time",
            fontSize = 18,
            fontWeight = "normal",
            anchor = "start"
        )
    )
    .mark_bar(filled = False) # Unfilled
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888", # Gray label and title
                titleColor = "#888888"
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "NUMBER OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0, # Horizontal label
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                ticks = False
            ),
            title = "2019"
        )
    )
)


# Capacity bar, bigger size and more transparency
capacity = (
    alt.Chart(capacity_2019)
    .mark_bar(size = 30) # Makes the bar thicker but keeps the distance the same
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "NUMBER OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019"
        ),
        opacity = alt.value(0.5) # Makes the bar transparent
    )
)

# Labeling for subtitle
label1 = (
    alt.Chart({"values": [{"text": ["DEMAND  |"]}]})
    .mark_text(
        size = 10, 
        align = "left", 
        dx = -235, 
        dy = -120, 
        color = "#1f77b4"
        )
    .encode(text = "text:N")
)

label2 = (
    alt.Chart({"values": [{"text": ["CAPACITY"]}]})
    .mark_text(
        size = 10, 
        align = "left", 
        dx = -177, 
        dy = -120, 
        color = "#1f77b4", 
        fontWeight = 800
        )
    .encode(text = "text:N")
)

overlap = capacity + demand + label1 + label2

# Sets space (padding) between bands
overlap.configure_scale(
    bandPaddingInner = 0.5
    ).configure_view(
        stroke = None
        ).properties(
            height = 200
            )
Out[47]:

Visualization as depicted in the book:

Alt text

Stacked bars�

In the stacked bars configuration, the Demand bar chart has been replaced with Unmet Demand (i.e., Demand - Capacity). This modification allows the stacking to represent the cumulative Demand value. Additionally, a color adjustment has been made, with Unmet Demand now rendered in a darker shade to emphasize its significance as a more meaningful metric.

In�[48]:
# Stacked bar
bars = (
    alt.Chart(
        stacked_table,
        title = alt.Title(
            "Demand vs capacity over time",
            fontSize = 18,
            fontWeight = "normal",
            anchor = "start",
            offset = 10 # Offsets title in the y-axis
        )
    )
    .mark_bar(size = 25)
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888", # Sets title and label to gray
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "NUMBER OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019"
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#d9dad9", "#4871b7"])
            ),
        order = alt.Order("Metric", sort = "ascending") # Unmet demand on top
    )
)

# Border detail, makes the graph more visible
border = (
    alt.Chart(stacked_table)
    .mark_bar(size = 25, filled = False) # Makes an unfilled bar
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888"
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "NUMBER OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                ticks = False
            ),
            title = "2019"
        ),
        order = alt.Order("Metric", sort = "ascending")
    )
)

stacked = bars + border
stacked.configure_view(stroke = None).properties(width = 300, height = 200)
Out[48]:

Dot plot�

For the next graph, a dot plot, the author reveals the challenges she had in Excel. To create the circles, she employed data markers from two line graphs, concealing the lines themselves. The region connecting the dots was achieved by employing a stacked bar of Unmet Demand, sitting on top of an transparent Capacity series.

This serves as a noteworthy example of the limitations of Excel when dealing with charts not inherently programmed into the tool. While certain graphs may be more straightforward in the Microsoft tool, unconventional visualizations might demand intricate and obscure workarounds, while in Altair, where the approach to data visualization is more flexible, documentation for this graph was readily available.

In�[49]:
# Unfilled version
dots1 = (
    alt.Chart(
        bar_table,
        title = alt.Title(
            "Demand vs capacity over time", # Set title
            fontSize = 18,
            fontWeight = "normal",
            anchor = "start",
            offset = 10
        )
    )
    .mark_circle(size = 600, opacity = 1) # Maximum opacity
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleFontWeight = "normal",
                titleColor = "#888888",
                titleAnchor = "start",
                ticks = False, 
                labels = False, # Removes labels from axis
                domain = False # Removes line from axis
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "# OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                titleFontWeight = "normal",
                labelColor = "#888888",
                labelPadding = 10, # Makes label more distant to the axis
                titleColor = "#888888",
                ticks = False
            ),
            title = "2019"
        ),
        color = alt.Color("Metric", scale = alt.Scale(range = ["#4871b7"]), legend = None) 
    )
    .properties(width = 400, height = 250)
    .transform_filter(alt.datum.Metric == "CAPACITY") # Alternative way to filter, similar to using auxiliary table
)

dots2 = ( # Same graph but to Demand
    alt.Chart(
        bar_table,
        title = alt.Title(
            "Demand vs capacity over time",
            fontSize = 18,
            fontWeight = "normal",
            anchor = "start",
            offset = 10
        )
    )
    .mark_circle(size = 600, opacity = 1, filled = False) # Unfilled
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleFontWeight = "normal",
                titleColor = "#888888",
                titleAnchor = "start",
                ticks = False,
                labels = False,
                domain = False
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "# OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                titleFontWeight = "normal",
                labelColor = "#888888",
                labelPadding = 10, # Makes label more distant to axis
                titleColor = "#888888",
                ticks = False
            ),
            title = "2019"
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#b4c6e4"]), 
            legend = None
            ) # Darker blue
    )
    .properties(width = 400, height = 250)
    .transform_filter(alt.datum.Metric == "DEMAND")
)

# Lines between dots
line = (
    alt.Chart(bar_table)
    .mark_line(strokeWidth = 25, opacity = 0.25) # Sets width and transparency
    .encode(
        x = alt.X("month", sort = None), 
        y = "Value", 
        detail = "month") # detail = month makes a line per month
)                         # instead of a single one

# Text inside the dots
text = (
    alt.Chart(bar_table)
    .mark_text()
    .encode(
        x = alt.X("month", sort = None),
        y = "Value",
        text = alt.Text(
            "Value:Q", 
            format = ".2s"
            ), # Formats 10000 as 10k
        color = alt.condition(
            alt.datum.Metric == "DEMAND", 
            alt.value("black"), 
            alt.value("white") # Set color depending on metric
        )
    )
)

# Set legend for Metric
label1 = (
    alt.Chart({"values": [{"text": ["DEMAND"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 200, 
        dy = -17, 
        color = "#4871b7"
        )
    .encode(text = "text:N")
)

label2 = (
    alt.Chart({"values": [{"text": ["CAPACITY"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 200, 
        dy = 25, 
        color = "#4871b7", 
        fontWeight = "bold"
        )
    .encode(text = "text:N")
)

dot_plot = line + dots1 + dots2 + text + label1 + label2
dot_plot.configure_view(stroke = None)
Out[49]:
In�[50]:
# Dot plot, filled-only version
dots = (
    alt.Chart(
        bar_table,
        title = alt.Title(
            "Demand vs capacity over time",
            fontSize = 18,
            fontWeight = "normal",
            anchor = "start",
            offset = 10
        )
    )
    .mark_circle(size = 600, opacity = 1) # Max opacity, filled by default
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleFontWeight = "normal",
                titleColor = "#888888",
                titleAnchor = "start",
                ticks = False,
                labels = False,
                domain = False
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "# OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                titleFontWeight = "normal",
                labelColor = "#888888",
                labelPadding = 10,
                titleColor = "#888888",
                ticks = False
            ),
            title = "2019"
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#4871b7", "#b4c6e4"]), 
            legend = None
        )
    )
    .properties(width = 400, height = 250)
)

dot_plot = line + dots + text + label1 + label2
dot_plot.configure_view(stroke = None)
Out[50]:

Visualization as depicted in the book:

Alt text

Graph the difference�

For the final visualization, it was chosen a simple line plot representing the unmet demand. Although minimalist and clean, this choice occults data from the actual value of demand and capacity.

In�[51]:
alt.Chart(
    unmet_2019,
    title = alt.Title(
        "Unmet demand over time",
        fontSize = 18,
        fontWeight = "normal",
        anchor = "start",
        offset = 10
    ),
).mark_line().encode(
    y = alt.Y(
        "Value",
        axis = alt.Axis(
            grid = False,
            titleAnchor = "end",  
            labelColor = "#888888",
            titleColor = "#888888",
            titleFontWeight = "normal"
        ),
        title = "NUMBER OF PROJECT HOURS"
    ),
    x = alt.X(
        "month",
        sort = None,
        axis = alt.Axis(
            labelAngle = 0,
            titleAnchor = "start",
            labelColor = "#888888",
            titleColor = "#888888",
            titleFontWeight = "normal",
            ticks = False
        ),
        title="2019",
    ),
    strokeWidth = alt.value(3) # Set thickness of the line
).properties(
    width = 375, height = 250
).configure_view(
    stroke = None
)
Out[51]:

Visualization as depicted in the book:

Alt text

Interactivity�

The initial idea was to make an interactive version of the Overlapping graph where the opacity adjusts when the mouse in near the column, similarly to the one in Exercise 2.1. Despite the tooltip functioning correctly, the hover feature failed to identify all graphs in the layering. As a result, only the Demand bars were highlighted upon hovering it.

In�[52]:
# Define hover
hover = alt.selection_point(on = "mouseover", nearest = True, empty = False)

demand = (
    alt.Chart(
        demand_2019,
        width = alt.Step(40),
        title = alt.Title(
            "Demand vs capacity over time",
            fontSize = 18,
            fontWeight = "normal",
            anchor = "start"
        )
    )
    .mark_bar()
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888"
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "NUMBER OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                ticks = False
            ),
            title = "2019"
        ),
        opacity = alt.condition(
            hover, 
            alt.value(1), 
            alt.value(0.5)
            ),  # Set opacity according to hover
        tooltip = ["Value:Q", "Metric"]
    )
    .add_params(hover)
)

# Capacity bar, bigger size and more transparency
capacity = (
    alt.Chart(capacity_2019)
    .mark_bar(size = 30)
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "NUMBER OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019"
        ),
        opacity = alt.condition(hover, alt.value(1), alt.value(0.5)),
        tooltip = ["Value:Q", "Metric"]
    )
    .add_params(hover)
)

overlap = demand + capacity 

overlap.configure_scale(
    bandPaddingInner = 0.5
    ).configure_view(
        stroke = None
        ).properties(
            height = 200
            )
Out[52]:

Due to a lack of documentation about interactivity in layered charts, no solution to this problem was found.

The final interactive graph for this exercise was the line graph. Rather than keeping the "Capacity and Demand" graph distinct from the "Unmet Demand" one, we opted for a consolidated approach with three lines.

Users can now choose the data to emphasize through a dropdown box. Additionally, each year features a tooltip marked by a point for enhanced clarity.

In�[53]:
# Sorts so that the metrics in the tooltip aligns with the graphs order
custom_order = ["DEMAND", "CAPACITY", "UNMET DEMAND"] 
sorted_metrics = [metric for metric in custom_order if metric in table_2019["Metric"].unique()]

# Creates the dropdown
dropdown = alt.binding_select(options = list(sorted_metrics), name = "SELECT LINE:  ")
selection = alt.selection_point(fields = ["Metric"], bind = dropdown)

line = (
    alt.Chart(
        table_2019,
        title = alt.Title(
            "Demand and capacity over time: unmet demand calculated", # Changes title
            fontSize = 18,
            fontWeight = "normal",
            anchor = "start",
            offset = 10
        ),
    )
    .mark_line(point = True)  # Create a point at each month
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "NUMBER OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,  # Disabling sorting for better time representation
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888", # Set colors to gray
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019"
        ),
        color = alt.Color(
            "Metric",
            scale = alt.Scale(range = ["#1f77b4", "#1f77b4", "red"]), # Unmet demand is red
            legend = None
        ),
        opacity = alt.condition(
            selection, alt.value(1), alt.value(0.1)
        ),  # Adjusting opacity based on the dropdown
        tooltip = ["Metric", "month", "Value"] # Sets tooltip
    )
    .properties(width = 350, height = 250) 
).add_params(selection)


# Demand label
label1 = alt.Chart({"values": 
                    [{"text":  ['DEMAND']}]
                    }
                    ).mark_text(size = 10, 
                                align = "left", 
                                dx = 165, dy = -15, 
                                color = '#1f77b4', # Color it blue
                                fontWeight = 700 # Bold font
                                ).encode(
                                    text = "text:N",
                                    opacity = alt.condition(
                                            selection, 
                                            alt.value(1), 
                                            alt.value(0)
                                            ) # Label disappears when any line is selected
                                        )

# Capacity label
label2 = alt.Chart({"values": 
                    [{"text":  ['CAPACITY']}]
                    }
                    ).mark_text(size = 10, 
                                align = "left", 
                                dx = 165, dy = 25, 
                                color = '#1f77b4', # Color it blue
                                fontWeight = 700
                                ).encode(
                                    text = "text:N",
                                    opacity = alt.condition(
                                            selection, 
                                            alt.value(1), 
                                            alt.value(0)
                                            ) # Label disappears when any line is selected
                                        )
# Unmet demand label
label3 = alt.Chart({"values": 
                    [{"text":  ['UNMET DEMAND']}]
                    }
                    ).mark_text(size = 10, 
                                align = "left", 
                                dx = 165, dy = 82, 
                                color = 'red', # Color it blue
                                fontWeight = 700
                                ).encode(
                                    text = "text:N",
                                    opacity = alt.condition(
                                            selection, 
                                            alt.value(1), 
                                            alt.value(0)
                                            ) # Label disappears when any line is selected
                                        )

line_final = line + label1 + label2 + label3
line_final.configure_view(stroke = None)
Out[53]:

Alternatively, we can make a simple selection with a radio button, including an option for all lines.

In�[54]:
options = ['DEMAND', 'CAPACITY', 'UNMET DEMAND']
labels = [option + ' ' for option in options]


input_dropdown = alt.binding_radio(
    options = options + [None], # Create the option for all lines
    labels = labels + ['ALL'],
    name = 'SELECT LINE: '
)

# Create the selection
selection = alt.selection_point(
    fields = ['Metric'],
    bind = input_dropdown
)


line = (
    alt.Chart(
        table_2019,
        title = alt.Title(
            "Demand and capacity over time: unmet demand calculated", # Changes title
            fontSize = 18,
            fontWeight = "normal",
            anchor = "start",
            offset = 10
        ),
    )
    .mark_line(point = True)  # Create a point at each month
    .encode(
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            scale = alt.Scale(domain = [0, 60000]),
            title = "NUMBER OF PROJECT HOURS"
        ),
        x = alt.X(
            "month",
            sort = None,  # Disabling sorting for better time representation
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888", # Set colors to gray
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019"
        ),
        color = alt.Color(
            "Metric",
            scale = alt.Scale(domain = options) # Instead of changing the opacity 
        ),                                       # in this example, we will limit the domain
        tooltip = ["Metric", "month", "Value"] # Sets tooltip
    )
    .properties(width = 350, height = 250) 
).add_params(selection).transform_filter(selection)

line_final = line 
line_final.configure_view(stroke = None)
Out[54]:

Exercise 2.5 - How would you show this data?�

This exercise shows a table and asks the viewer how would they show the data. We will replicate the visual answers given by the author.

If you would like to return to the Table of Contents, you can click here.

Loading the data�

In�[55]:
# Loading considering the NaN caused by Excel formatting
table = pd.read_excel(r"Data\2.5 EXERCISE.xlsx", usecols = [1, 2], header = 5)
table
Out[55]:
Year Attrition Rate
0 2019 0.0910
1 2018 0.0820
2 2017 0.0450
3 2016 0.1230
4 2015 0.0560
5 2014 0.1510
6 2013 0.0700
7 2012 0.0100
8 2011 0.0200
9 2010 0.0970
10 AVG 0.0745

First, we will drop the AVG (Average) column, as it will not be a data point in our graphs. It is better to calculate it separately when needed.

In�[56]:
table.drop(10, inplace = True)

Dot plot�

When attempting the first scatter plot, we realize that Altair incorrectly classifies the data type of the "Year" column. This can be fixed by specifying the correct date type (:O, as of, Ordinary).

In�[57]:
# Without data type

alt.Chart(table).mark_point(filled = True).encode(
    x = alt.X('Year'),
    y = alt.Y('Attrition Rate')
    )
Out[57]:
In�[58]:
# With data type equals temporal

alt.Chart(table).mark_point(filled = True).encode(
    x = alt.X('Year:T'),
    y = alt.Y('Attrition Rate')
    )
Out[58]:
In�[59]:
# With data type equals ordinal 

alt.Chart(table).mark_point(filled = True).encode(
    x = alt.X('Year:O'),
    y = alt.Y('Attrition Rate')
    )
Out[59]:

Initially, we will create a dot plot to visually represent the data over time, incorporating an average line to facilitate comparison.

In�[60]:
# Create the base graph with title
base = alt.Chart(
    table,
    title = alt.Title(
        "Attrition rate over time",
        fontSize = 18,
        fontWeight = "normal",
        anchor = "start",
        offset = 10
    )
)

# Make the filled dots
dots = base.mark_point(filled = True, size = 50, color = "#2c549d").encode(
    x = alt.X(
        "Year:O",
        axis = alt.Axis(labelAngle = 0, labelColor = "#888888", ticks = False),
        title = None,
        scale = alt.Scale(align = 0) # Align the first dot with the y-axis (0 at x-axis)
    ),
    y = alt.Y(
        "Attrition Rate",
        axis = alt.Axis(
            grid = False,
            titleAnchor = "end",
            labelColor = "#888888",
            titleColor = "#888888",
            tickCount = 9, # Set fixed number of ticks (intervals)
            format = "%", # y-axis data is a percentage
            titleFontWeight = "normal"
        ),
        title = "ATTRITION RATE"
    ),
    opacity = alt.value(1) # Maximum opacity
)

# Makes a line at the average value
# strokeDash defines how dotted is the line
rule = base.mark_rule(
    color = "#2c549d", 
    strokeDash = [3, 3]
    ).encode(
        x = alt.value(0), 
        x2 = alt.value(315), 
        y = "mean(Attrition Rate)"
        )

# Text above the average line
label = (
    alt.Chart({"values": [{"text": ["AVERAGE: 7.5%"]}]})
    .mark_text(
        size = 10, 
        align = "left", 
        dx = -170, 
        dy = 0, 
        color = "#2c549d", 
        fontWeight = "bold"
        )
    .encode(text = "text:N")
)

final_dots = dots + rule + label
final_dots.properties(
    width = 350, height = 200
    ).configure_view(stroke = None)
Out[60]:

Visualization as depicted in the book:

Alt text

Line graph�

Next, we will link the dots with a line, aiding in the comparison of value differences.

Once more, omitting the data type in the label specification causes the labels to accumulate on the right side of the graph.

In�[61]:
line = base.mark_line(color = "#2c549d").encode(
    x = alt.X(
        "Year:O",
        axis = alt.Axis(labelAngle = 0, labelColor = "#888888", ticks = False),
        title = None,
        scale = alt.Scale(align = 0)
    ),
    y = alt.Y(
        "Attrition Rate",
        axis = alt.Axis(
            grid = False,
            titleAnchor = "end",
            labelColor = "#888888",
            titleColor = "#888888",
            tickCount = 9,
            format = "%",
            titleFontWeight = "normal"
        ),
        title = "ATTRITION RATE"
    )
)

# Label without specifying data type
label = base.mark_text(align = "left", dx = 3).encode(
    x = alt.X("Year", aggregate = "max"),
    y = alt.Y("Attrition Rate", aggregate = {"argmax": "Year"}),
    text = alt.Text("Attrition Rate")
)

final_line = line + rule + label

final_line.properties(width = 350, height = 200).configure_view(stroke = None)
Out[61]:
In�[62]:
# With data type

label = base.mark_text(align = "left", dx = 3, color = "#2c549d").encode(
    x = alt.X("Year:O", aggregate = "max"),
    y = alt.Y("Attrition Rate", aggregate = {"argmax": "Year"}),
    text = alt.Text("Attrition Rate")
)

final = line + rule + label

final.properties(
    width = 350, height = 200
    ).configure_view(stroke = None)
Out[62]:

The default method for placing the end label appeared ineffective, as it failed to filter out 2019 as the maximum value in the Year column. This issue can be rectified by straightforwardly filtering the entire dataset to encompass only values where Year == 2019.

In�[63]:
# Filter 2019 manually

label = base.mark_text(align = 'left', dx = 3, color = '#2c549d', fontWeight = 'bold').encode(
    x = alt.X('Year:O'),
    y = alt.Y('Attrition Rate'),
    text = alt.Text('Attrition Rate', format = ".1%"),
    xOffset = alt.value(-10), # Offsets slightly in the x and y-axis
    yOffset = alt.value(-10)
).transform_filter( # Filters
    alt.FieldEqualPredicate(field = 'Year', equal = 2019)
    )

# Label for the Average
label2 = alt.Chart({"values": 
                    [{"text":  ['AVG: 7.5%']}]
                    }
                    ).mark_text(size = 10, 
                                align = "left", 
                                dx = 96, dy = 15, 
                                color = '#2c549d',
                                fontWeight = 'bold'
                                ).encode(text = "text:N")


# Filled point at the end of the line
point = base.mark_point(filled = True).encode(
    x = alt.X('Year:O'),
    y = alt.Y('Attrition Rate'),
    opacity = alt.value(1)
).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', equal = 2019)
    )

final_line = line + rule + label + label2 + point

final_line.properties(
    width = 350,
    height = 200
).configure_view(stroke = None)
Out[63]:

Visualization as depicted in the book:

Alt text

Coloring below the average line may help highlight values below it.

In�[64]:
# Calculates average
avg = table['Attrition Rate'].mean()

# Creates a rectangle below the average line
rect = alt.Chart(pd.DataFrame({'y': [0], 'y2':[avg]})).mark_rect(
    opacity = 0.2
).encode(y = 'y', y2 = 'y2', x = alt.value(0), x2 = alt.value(315))

# Makes a different average label, in lighter color
label2 = alt.Chart({"values": 
                    [{"text":  ['AVG:', '7.5%']}]
                    }
                    ).mark_text(size = 10, 
                                align = "left", 
                                dx = 113, dy = 15, 
                                color = '#9fb5db',
                                fontWeight = 'bold'
                                ).encode(text = "text:N")

final_line2 = line + rect + label + label2 + point

final_line2.properties(
    width = 350,
    height = 200
).configure_view(stroke = None)
Out[64]:

Visualization as depicted in the book:

Alt text

Area graph�

An exploration using an area graph was undertaken; however, it conveys the impression that the area under the line holds significance, which is not the case for this dataset. This graph type may not be the most suitable choice for presenting this data.

Also, the decision was made to stray from the example given and connect the area to the y-axis. It is unclear if the decision to have it separated for this graph only was on purpose or an error.

In�[65]:
# Area graph
area = base.mark_area().encode(
    x = alt.X(
        "Year:O",
        axis = alt.Axis(labelAngle = 0, labelColor = "#888888", ticks = False),
        title = None,
        scale = alt.Scale(align = 0)
    ),
    y = alt.Y(
        "Attrition Rate",
        axis = alt.Axis(
            grid = False,
            titleAnchor = "end",
            labelColor = "#888888",
            titleColor = "#888888",
            tickCount = 9,
            format = "%",
            titleFontWeight = "normal"
        ),
        title = "ATTRITION RATE"
        )
)

# Creates a lighter rule to contrast with area graph
rule_light = base.mark_rule(
    color = "#9fb5db", 
    strokeDash = [3, 3]
    ).encode(
    x = alt.value(0), 
    x2 = alt.value(315), 
    y = "mean(Attrition Rate)"
)

final_area = area + rule_light + label2
final_area.properties(
    width = 350, 
    height = 200
    ).configure_view(stroke = None)
Out[65]:

Visualization as depicted in the book:

Alt text

Bar plot�

Finally, we can do a classic bar plot.

In�[66]:
# Bar plot 
bar = base.mark_bar(size = 25).encode(
    x = alt.X(
        "Year:O",
        axis = alt.Axis(
            labelAngle = 0, 
            labelColor = "#888888", 
            ticks = False, 
            domain = False
            ),
        title = None,
        scale = alt.Scale(align = 0)
    ),
    y = alt.Y(
        "Attrition Rate",
        axis = alt.Axis(
            grid = False,
            titleAnchor = "end",
            labelColor = "#888888",
            titleColor = "#888888",
            tickCount = 9,
            format = "%",
            titleFontWeight = "normal"
        ),
        title = "ATTRITION RATE"
    )
)

# Moves placement of average label 
label = (
    alt.Chart({"values": [{"text": ["AVG: 7.5%"]}]})
    .mark_text(
        size = 10, 
        align = "left", 
        dx = -130, 
        dy = 0, 
        color = "#2c549d", 
        fontWeight = "bold"
        )
    .encode(text = "text:N")
)

final_bar = bar + rule + label
final_bar.properties(
    width = 320, 
    height = 200
    ).configure_view(stroke = None)
Out[66]:

Visualization as depicted in the book:

Alt text

Interactive�

For this interactive graph, the dot graph was selected. Inspired by the concept of the average line, we opted to create a "cut" line, allowing users to specify a value to partition the data. Dots falling below this threshold will be colored in red for enhanced visibility and focus.

In�[67]:
# Create the slider
slider = alt.binding_range(min = 0, max = 0.16, step = 0.005, name = "CUT: ")
selector = alt.param(name = "SelectorName", value = 0.03, bind = slider)

# Remove space from column name
table["AttRate"] = table["Attrition Rate"]

base = alt.Chart(
    table,
    title = alt.Title(
        "Attrition rate over time",
        fontSize = 18,
        fontWeight = "normal",
        anchor = "start",
        offset = 10
    )
)

dots = (
    base.mark_point(filled = True, size = 50)
    .encode(
        x = alt.X(
            "Year:O",
            axis = alt.Axis(
                labelAngle = 0, 
                labelColor = "#888888", 
                ticks = False
                ),
            title = None,
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Attrition Rate",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                tickCount = 9,
                format = "%",
                titleFontWeight = "normal"
            ),
            title = "ATTRITION RATE",
        ),
        color = alt.condition( # Change Attrition Rate is less than the slider
            alt.datum.AttRate < selector, 
            alt.value("#ef476f"), 
            alt.value("#118ab2")
        ),
        opacity = alt.value(1)
    )
    .add_params(selector)
)

# Normal Average Rule
rule = base.mark_rule(
    color = "#118ab2", 
    strokeDash = [3, 3]
    ).encode(
    x = alt.value(0), 
    x2 = alt.value(315), 
    y = "mean(AttRate)"
)

# "Cut" rule - moves based on the slider
rule2 = (
    base.mark_rule(
        color = "#ef476f", 
        strokeDash = [3, 3]
        )
    .encode(
        x = alt.value(0),
        x2 = alt.value(315),
        y = alt.value(200 - selector * 1250), # alt.datum did not work,
        opacity = alt.value(0.1)              # this converts pixel value into actual data
    )
    .add_params(selector)
)

label = (
    alt.Chart({"values": [{"text": ["AVERAGE: 7.5%"]}]})
    .mark_text(
        size = 10, 
        align = "left", 
        dx = -170, 
        dy = 0, 
        color = "#118ab2", 
        fontWeight = "bold"
        )
    .encode(text = "text:N")
)

# "Cut" label, moves along with rule2
label2 = (
    alt.Chart({"values": [{"text": ["CUT"]}]})
    .mark_text(
        size = 10, 
        align = "left", 
        dx = 110, 
        color = "#ef476f", 
        fontWeight = "bold"
        )
    .encode(
        text = "text:N",  
        y = alt.value(195 - selector * 1250)
        )
    .add_params(selector)
)

final_interactive = dots + rule + label + rule2 + label2
final_interactive.properties(
    width = 350, 
    height = 200
    ).configure_view(stroke = None)
Out[67]:

Chapter 3 - Identify and eliminate Cluster�

"This lesson is simple but the impact is huge: get rid of the stuff that doesn’t need to be there" - Cole Nussbaumer Knaflic

Exercise 3.2 - how can we tie words to the graph?�

The main focus of this exercise is to apply the Gestalt Principles of Visual Perception to declutter graphs. For principles will be demonstrated, and each of them will be clarified through the visualization employing it.

If you would like to return to the Table of Contents, you can click here.

Loading the data�

In�[68]:
# Loading considering the NaN caused by Excel formatting
table = pd.read_excel(r"Data\3.2 EXERCISE.xlsx", usecols = [1, 2, 3], header = 4, skipfooter = 6)
table
Out[68]:
2019 Rate # exits
0 JAN 0.0040 120
1 FEB 0.0010 30
2 MAR 0.0015 45
3 APR 0.0080 240
4 MAY 0.0030 90
5 JUN 0.0014 42
6 JUL 0.0044 132
7 AUG 0.0050 150
8 SEP 0.0022 66
9 OCT 0.0015 45
10 NOV 0.0005 15
11 DEC 0.0010 30

The column name for 2019 is currently an integer, which might pose issues in the future. To avoid potential complications, we will modify the column name to a string.

For example, trying to run the following code returns an error:

alt.Chart(table).mark_bar().encode(
        x = alt.X('2019'), 
        y = alt.Y('Rate')
        )
ValueError: Dataframe contains invalid column name: 2019. Column names must be strings
In�[69]:
table.rename(columns = {2019:'Date'}, inplace = True)

Cluttered graph�

In�[70]:
# Graph with cluttered text

bar = (
    alt.Chart(
        table,
        title = alt.Title(
            "2019 monthly voluntary attrition rate",
            fontSize = 15,
            anchor = "start",
            offset = 10,
            fontWeight = "normal"
        )
    )
    .mark_bar(size = 20, color = "#b0b0b0")
    .encode(
        x = alt.X(
            "Date",
            sort = None, # Avoids alphabetical order
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                ticks = False,
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title="2019"
        ),
        y = alt.Y(
            "Rate",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%", # y-axis is a percentage
                tickCount = 10 # Number of ticks (intervals) in axis
            ),
            scale = alt.Scale(domain = [0, 0.01]),
            title = "ATTRITION RATE"
        )
    )
    .properties(width = 300, height = 200)
)

# Text next to the graph
text = (
    alt.Chart(
        {
            "values": [
                {   # Each item is a line
                    "text": [
                        "Highlights:",
                        " ",
                        "In April there was a",
                        "reorganization. No jobs",
                        "were eliminated, but many",
                        "people chose to leave.",
                        " ",
                        "Attrition rates tend to be",
                        "higher in the Summer",
                        "months when it is",
                        "common for associates",
                        "to leave to go back to",
                        "school.",
                        " ",
                        "Attrition is typically low in",
                        "November and December",
                        "due to the holidays.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "left", 
        dy = -20, 
        dx = -10
        ) # Size and placement
    .encode(text = "text:N")
)


# Using the | symbols makes it so Altair unites the bar and the text next to each other horizontally
final_cluttered = bar | text

final_cluttered.configure_view(stroke = None)
Out[70]:

Visualization as depicted in the book:

Alt text

Proximity�

The "Proximity Principle" says that we tend to associate objects close to each other as being part of a single group. To apply this is our graph, we bring the texts near the data they represent.

In�[71]:
# The text now needs to be broken into parts

# First paragraph
text_april = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "In April there was a",
                        "reorganization. No jobs",
                        "were eliminated, but many",
                        "people chose to leave.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -145, 
        dy = -105
        )
    .encode(text = "text:N")
)

# Second paragraph
text_summer = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "Attrition rates tend to be",
                        "higher in the Summer",
                        "months when it is",
                        "common for associates to",
                        "leave to go back to",
                        "school.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -10, 
        dy = -65
        )
    .encode(text = "text:N")
)

# Third paragraph
text_nov_dec = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "Attrition is",
                        "typically low in",
                        "November &",
                        "December due",
                        "to the holidays.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "right", 
        dx = 150, 
        dy = 5
        )
    .encode(text = "text:N")
)

# Now we sum the graphs, so that the texts lie on top of the bar, instead of next to it
final_prox = bar + text_april + text_summer + text_nov_dec

final_prox.configure_view(stroke = None)
Out[71]:

Visualization as depicted in the book:

Alt text

Proximity with emphasis�

We can enhance the visual impact by emphasizing the bars and keywords.

Given that Altair does not support bold text within regular content, a strategy is to introduce blank spaces in the text and create a distinct object for the bold keywords.

In�[72]:
bar_highlight = (
    alt.Chart(
        table,
        title = alt.Title(
            "2019 monthly voluntary attrition rate",
            fontSize = 15,
            anchor = "start",
            offset = 10,
            fontWeight = "normal"
        ),
    )
    .mark_bar(size = 20)
    .encode(
        x = alt.X(
            "Date",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleX = 12,
                labelColor = "#888888",
                titleColor = "#888888",
                ticks = False,
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "2019"
        ),
        y = alt.Y(
            "Rate",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%",
                tickCount = 10
            ),
            scale = alt.Scale(domain = [0, 0.01]),
            title = "ATTRITION RATE"
        ),
        color = alt.Color(
            "Date",
            sort = None,
            scale = alt.Scale(
                range = [
                    "#b0b0b0",
                    "#b0b0b0",
                    "#b0b0b0",
                    "#666666", # It was also possible to color by condition
                    "#b0b0b0", # where Date == [list of highlighted months]
                    "#b0b0b0",
                    "#666666",
                    "#666666",
                    "#b0b0b0",
                    "#b0b0b0",
                    "#666666",
                    "#666666"
                ]
            ),
            legend = None
        )
    )
    .properties(width = 300, height = 200)
)

# First paragraph with blank space
text_april_blank = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "In            there was a",
                        "reorganization. No jobs",
                        "were eliminated, but many",
                        "people chose to leave."
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -145, 
        dy = -105
        )
    .encode(text = "text:N")
)

# Second paragraph with blank space
text_summer_blank = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "Attrition rates tend to be",
                        "higher in the",
                        "months when it is",
                        "common for associates to",
                        "leave to go back to",
                        "school.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -10, 
        dy = -65
        )
    .encode(text = "text:N")
)

# Third paragraph with blank space
text_nov_dec_blank = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "Attrition is",
                        "typically low in",
                        "&",
                        "due",
                        "to the holidays.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "right", 
        dx = 150, 
        dy = 5
        )
    .encode(text = "text:N")
)

# Bold "April" word
text_april_bold = (
    alt.Chart({"values": [{"text": ["April"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -133, 
        dy = -105, 
        fontWeight = 800
        )
    .encode(text = "text:N")
)

# Bold "Summer" word
text_summer_bold = (
    alt.Chart({"values": [{"text": ["Summer"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 54, 
        dy = -52, 
        fontWeight = 800
        )
    .encode(text = "text:N")
)

# Bold "November" word
text_nov_bold = (
    alt.Chart({"values": [{"text": ["November"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 80, 
        dy = 31, 
        fontWeight = 800
        )
    .encode(text = "text:N")
)

# Bold "December" word
text_dec_bold = (
    alt.Chart({"values": [{"text": ["December"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 68, 
        dy = 44, 
        fontWeight = 800
        )
    .encode(text = "text:N")
)

# Adds everything
final_prox_emph = (
    bar_highlight
    + text_april_blank
    + text_april_bold
    + text_summer_blank
    + text_summer_bold
    + text_nov_dec_blank
    + text_nov_bold
    + text_dec_bold
)

final_prox_emph.configure_view(stroke=None)
Out[72]:

Visualization as depicted in the book:

Alt text

Similarity�

The "Similarity Principle" pertains to our tendency to perceive objects as part of the same group when they share similar color, shape, or size. For this example, this means coloring the columns in the same shade as the chosen keywords.

In�[73]:
bar_highlight_color = (
    alt.Chart(
        table,
        title = alt.Title(
            "2019 monthly voluntary attrition rate",
            fontSize = 15,
            anchor = "start",
            offset = 10,
            fontWeight = "normal"
        ),
    )
    .mark_bar(size = 20)
    .encode(
        x = alt.X(
            "Date",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                ticks = False,
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "2019"
        ),
        y = alt.Y(
            "Rate",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%",
                tickCount = 10
            ),
            scale = alt.Scale(domain = [0, 0.01]),
            title = "ATTRITION RATE"
        ),
        color = alt.Color(
            "Date",
            sort = None,
            scale = alt.Scale(
                range = [
                    "#b0b0b0",
                    "#b0b0b0", # Since there are multiple colors,
                    "#b0b0b0", # setting by condition would be harder
                    "#ed1e24",
                    "#b0b0b0",
                    "#b0b0b0",
                    "#ec7c30",
                    "#ec7c30",
                    "#b0b0b0",
                    "#b0b0b0",
                    "#5d9bd1",
                    "#5d9bd1"
                ]
            ),
            legend = None
        )
    )
    .properties(width = 300, height = 200)
)

# Blank texts with different position
text_april_blank2 = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "Highlights:",
                        " ",
                        "In            there was a",
                        "reorganization. No jobs",
                        "were eliminated, but many",
                        "people chose to leave.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "left", 
        dy = -25, 
        dx = -10
        )
    .encode(text = "text:N")
)

text_summer_blank2 = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "Attrition rates tend to be",
                        "higher in the",
                        "months when it is",
                        "common for associates to",
                        "leave to go back to",
                        "school.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, align = "left", 
        dx = -10, dy = 65
        )
    .encode(text = "text:N")
)

text_nov_dec_blank2 = (
    alt.Chart(
        {
            "values": [
                {"text": [
                    "Attrition is typically low in", 
                    " ", 
                    "due to the holidays."
                    ]}
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -10, 
        dy = 155
        )
    .encode(text = "text:N")
)

# Colored texts
text_april_color = (
    alt.Chart({"values": [{"text": ["April"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 3, 
        dy = 1, 
        fontWeight = 800, 
        color = "#ed1e24"
        )
    .encode(text = "text:N")
)

text_summer_color = (
    alt.Chart({"values": [{"text": ["Summer"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 55, 
        dy = 78, 
        fontWeight = 800, 
        color = "#ec7c30"
        )
    .encode(text = "text:N")
)

text_nov_dec_color = (
    alt.Chart({"values": [{"text": ["November & December"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -10, 
        dy = 168, 
        fontWeight = 800, 
        color = "#5d9bd1"
        )
    .encode(text = "text:N")
)

# While we could have used '&' to arrange the texts vertically,
# employing '+' provides greater flexibility in determining the layout of the text.

final_sim = bar_highlight_color | (
    text_april_blank2
    + text_april_color
    + text_summer_blank2
    + text_summer_color
    + text_nov_dec_blank2
    + text_nov_dec_color
)

final_sim.configure_view(stroke=None)
Out[73]:

Visualization as depicted in the book:

Alt text

Enclosure�

The "Enclosure Principle" says simply that, when objects are enclosed together, we perceive them as belonging to the same group.

Attempting to combine charts using the expression (bar | text) + rect_nov_dec + rect_summer + rect_april results in an error:

Concatenated charts cannot be layered. Instead, layer the charts before concatenating.

The most straightforward way to solve this is to add the text to the bar using bar + text, but doing so means assigning another position (dx, dy) to the text.

In�[74]:
# Assign another position to text
text_enclosure = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "Highlights:",
                        " ",
                        "In April there was a",
                        "reorganization. No jobs",
                        "were eliminated, but many",
                        "people chose to leave.",
                        " ",
                        "Attrition rates tend to be",
                        "higher in the Summer",
                        "months when it is",
                        "common for associates",
                        "to leave to go back to",
                        "school.",
                        " ",
                        "Attrition is typically low in",
                        "November and December",
                        "due to the holidays.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, align = "left", 
        dx = 160, dy = -113
        )
    .encode(text = "text:N")
)

# Defines the rectangles that are going to enclose the text
rect_nov_dec = (
    alt.Chart(
        pd.DataFrame({"y": [0], "y2": [0.0019], "x": [10], "x2": [8.4]})
        )
    .mark_rect(opacity = 0.2) # Low opacity
    .encode(
        y = "y", 
        y2 = "y2", 
        x = alt.X("x", axis = None), 
        x2 = "x2"
        )
)

rect_summer = (
    alt.Chart(
        pd.DataFrame({
            "y": [0.0023], 
            "y2": [0.0063], 
            "x": [10], 
            "x2": [5.1]
            }))
    .mark_rect(opacity = 0.2)
    .encode(
        y = "y", 
        y2 = "y2", 
        x = alt.X("x", axis = None), 
        x2 = "x2"
        )
)

rect_april = (
    alt.Chart(pd.DataFrame({
        "y": [0.0068], 
        "y2": [0.0095], 
        "x": [10], 
        "x2": [2.6]
        }))
    .mark_rect(opacity = 0.2)
    .encode(
        y = "y", 
        y2 = "y2", 
        x = alt.X("x", axis = None), 
        x2 = "x2"
        )
)

bar + text_enclosure + rect_nov_dec + rect_summer + rect_april
Out[74]:

Utilizing a DataFrame to define the rectangles seems to prevent them from reaching the text section. As a next step, we will explicitly define the coordinates of the rectangles in pixels.

In�[75]:
# Rectangles with position defined by pixels
rect_nov_dec = alt.Chart(
    pd.DataFrame({'values':[{}]})
    ).mark_rect(
      opacity = 0.2, 
      color = "#b0b0b0"
      ).encode(
         y = alt.value(5), 
         y2 = alt.value(60), 
         x = alt.value(75), 
         x2 = alt.value(440)
         )

rect_summer = alt.Chart(
   pd.DataFrame({'values':[{}]})
   ).mark_rect(
      opacity = 0.2, 
      color = "#b0b0b0"
      ).encode(
         y = alt.value(70), 
         y2 = alt.value(150), 
         x = alt.value(150), 
         x2 = alt.value(440)
         )

rect_april = alt.Chart(
   pd.DataFrame({'values':[{}]})
   ).mark_rect(
      opacity = 0.2, 
      color = "#b0b0b0"
      ).encode(
         y = alt.value(160), 
         y2 = alt.value(202), 
         x = alt.value(250), 
         x2 = alt.value(440)
         )

# The text_enclosure and bar comes after the rectangles so that they sit on top
final_enc = (
   rect_nov_dec + rect_summer + 
   rect_april + bar + text_enclosure
)

final_enc.configure_view(stroke = None)
Out[75]:

Visualization as depicted in the book:

Alt text

Enclosure with color differentiation�

We can use color to emphasize the different enclosures.

In�[76]:
# Colored rectangles
rect_nov_dec_color = (
    alt.Chart(pd.DataFrame({"values": [{}]}))
    .mark_rect(opacity = 0.2, color = "#ed1e24")
    .encode(
        y = alt.value(5), 
        y2 = alt.value(60), 
        x = alt.value(75), 
        x2 = alt.value(450)
        )
)

rect_summer_color = (
    alt.Chart(pd.DataFrame({"values": [{}]}))
    .mark_rect(opacity = 0.2, color = "#ec7c30")
    .encode(
        y = alt.value(70), 
        y2 = alt.value(150), 
        x = alt.value(150), 
        x2 = alt.value(450)
        )
)

rect_april_color = (
    alt.Chart(pd.DataFrame({"values": [{}]}))
    .mark_rect(opacity = 0.2, color = "#5d9bd1")
    .encode(
        y = alt.value(160), 
        y2 = alt.value(202), 
        x = alt.value(250), 
        x2 = alt.value(450)
        )
)

final_enc_color = (
    rect_nov_dec_color + 
    rect_summer_color + 
    rect_april_color + 
    bar + 
    text_enclosure
    )

final_enc_color.configure_view(stroke = None)
Out[76]:

Visualization as depicted in the book:

Alt text

Enclosure + Similarity�

We can make use of both Enclosure and Similarity principles. First, we will try to add already existing components.

In�[77]:
bar_highlight_color + rect_april_color | (
    text_april_blank2 + text_april_color 
    + text_summer_blank2 + text_summer_color 
    + text_nov_dec_blank2 + text_nov_dec_color
    )
Out[77]:
In�[78]:
bar_highlight_color | (
    text_april_blank2 + text_april_color 
    + text_summer_blank2 + text_summer_color 
    + text_nov_dec_blank2 + text_nov_dec_color
    ) + rect_april_color
Out[78]:

Since attempting to layer only already assigned variables seems to be ineffective, we will recreate the texts using alternative positions to enable their addition using the + symbol.

In�[79]:
# Same texts, different dx and dy values

text_april_blank2_enclosure = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "Highlights:",
                        " ",
                        "In            there was a",
                        "reorganization. No jobs",
                        "were eliminated, but many",
                        "people chose to leave."
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 160, 
        dy = -113
        ) 
    .encode(text = "text:N")
)

text_summer_blank2_enclosure = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "Attrition rates tend to be",
                        "higher in the",
                        "months when it is",
                        "common for associates to",
                        "leave to go back to",
                        "school.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 160, 
        dy = -21
        )
    .encode(text = "text:N")
)

text_nov_dec_blank2_enclosure = (
    alt.Chart(
        {
            "values": [
                {"text": [
                    "Attrition is typically low in", 
                    " ", 
                    "due to the holidays."]}
            ]
        }
    )
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 160, 
        dy = 68
        )
    .encode(text = "text:N")
)

text_april_color_enclosure = (
    alt.Chart({"values": [{"text": ["April"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 172, 
        dy = -87, 
        fontWeight = 800, 
        color = "#ed1e24"
        )
    .encode(text = "text:N")
)

text_summer_color_enclosure = (
    alt.Chart({"values": [{"text": ["Summer"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 225, 
        dy = -8, 
        fontWeight = 800, 
        color = "#ec7c30"
        )
    .encode(text = "text:N")
)

text_nov_dec_color_enclosure = (
    alt.Chart({"values": [{"text": ["November & December"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 160, 
        dy = 82, 
        fontWeight = 800, 
        color = "#5d9bd1"
        )
    .encode(text = "text:N")
)

final_enc_sim = (
    rect_nov_dec_color
    + rect_summer_color
    + rect_april_color
    + bar_highlight_color
    + text_april_blank2_enclosure
    + text_summer_blank2_enclosure
    + text_nov_dec_blank2_enclosure
    + text_april_color_enclosure
    + text_summer_color_enclosure
    + text_nov_dec_color_enclosure
)

final_enc_sim.configure_view(stroke = None)
Out[79]:

Visualization as depicted in the book:

Alt text

Connection�

The "Connection" relies on the fact that objects that are physically connected are often perceived as part of a single group. In this example, we will connect the texts and the data using a line.

In�[80]:
# Rules connecting text with bar
rule_april = (
    alt.Chart()
    .mark_rule(point = {"fill": "gray"}) # With a dot at the end
    .encode(
        x = alt.value(102), 
        y = alt.value(45), 
        x2 = alt.value(300), 
        strokeWidth = alt.value(0.5)
    )
)

rule_summer = (
    alt.Chart()
    .mark_rule(point = {"fill": "gray"})
    .encode(
        x = alt.value(202),
        y = alt.value(105),
        x2 = alt.value(300),
        strokeWidth = alt.value(0.5)
    )
)

# Part one of the November/December rule
rule_nov_dec_1 = (
    alt.Chart()
    .mark_rule() # Without the dot
    .encode(
        x = alt.value(260),
        y = alt.value(175),
        x2 = alt.value(300),
        strokeWidth = alt.value(0.5)
    )
)

# Part two of the November/December rule
rule_nov_dec_2 = (
    alt.Chart()
    .mark_rule(point = {"fill": "gray"}) # With the dot
    .encode(
        x = alt.value(260),
        y = alt.value(185),
        y2 = alt.value(175),
        strokeWidth = alt.value(0.5)
    )
)

final_conn = (
    bar + 
    text_enclosure + 
    rule_april + 
    rule_summer + 
    rule_nov_dec_1 + 
    rule_nov_dec_2
)

final_conn.configure_view(stroke = None)
Out[80]:

Visualization as depicted in the book:

Alt text

Connection + Similarity�

Now we can use the connection and the similarity principles to connect highlighted texts with colored data.

In�[81]:
final_conn_sim = (
    bar_highlight_color
    + text_april_blank2_enclosure
    + text_summer_blank2_enclosure
    + text_nov_dec_blank2_enclosure
    + text_april_color_enclosure
    + text_summer_color_enclosure
    + text_nov_dec_color_enclosure
    + rule_april
    + rule_summer
    + rule_nov_dec_1
    + rule_nov_dec_2
)

final_conn_sim.configure_view(stroke = None)
Out[81]:

Visualization as depicted in the book:

Alt text

Interactivity�

In designing this interactive graph, we opted to apply the Gestalt Principle of Proximity for tooltips. Rather than displaying the text next to the bars it represents, the information will appear when the user hovers over the respective bars.

Having experimented with more sophisticated code solutions without achieving success, we opted for an alternative approach. We created a "base" bar in light gray without any tooltips. Subsequently, for each of the three groups of bars, we crafted distinct charts containing only the data relevant to the tooltips, in darker gray. Consequently, the final graph is composed of four charts added on top of each other.

In�[82]:
# Base graph
bar_base = (
    alt.Chart(
        table,
        title = alt.Title(
            "2019 monthly voluntary attrition rate",
            fontSize = 15,
            anchor = "start",
            offset = 10,
            fontWeight = "normal"
        )
    )
    .mark_bar(size = 20)
    .encode(
        x = alt.X(
            "Date",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleX = 12,
                labelColor = "#888888",
                titleColor = "#888888",
                ticks = False,
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "2019"
        ),
        y = alt.Y(
            "Rate",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%",
                tickCount = 10
            ),
            scale = alt.Scale(domain = [0, 0.01]),
            title = "ATTRITION RATE",
        ),
        color = alt.Color( # Light gray
            "Date", 
            sort = None, 
            scale = alt.Scale(range = ["#b0b0b0"]), 
            legend = None
        )
    )
    .properties(width = 300, height = 200)
)

# Graph for April
bar_apr = (
    alt.Chart(table)
    .mark_bar()
    .encode(
        x = alt.X("Date", sort = None),
        y = alt.Y(
            "Rate", 
            scale = alt.Scale(domain = [0, 0.01])
            ),
        color = alt.value("#666666"), # Darker gray
        tooltip = alt.value(
            "In April there was a reorganization. No jobs were eliminated, but many people chose to leave."
        ) # Text for April
    )
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "Date", 
            equal = "APR"
            )
        ) # Only data for April
)

# Graph for summer
bar_jul_aug = (
    alt.Chart(table)
    .mark_bar(size = 20)
    .encode(
        x = alt.X(
            "Date", 
            sort = None
            ),
        y = alt.Y(
            "Rate", 
            scale = alt.Scale(domain = [0, 0.01])
            ),
        color = alt.value("#666666"), # Darker gray
        tooltip = alt.value(
            "Attrition rates tend to be higher in the Summer months when it is common for associates to leave to go back to school."
        ) # Text for summer
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Date", 
            oneOf = ["JUL", "AUG"])
            )
) # Filter summer data

# Graph for end of year
bar_dec_nov = (
    alt.Chart(table)
    .mark_bar(size = 20)
    .encode(
        x = alt.X("Date", sort = None),
        y = alt.Y("Rate", scale = alt.Scale(domain = [0, 0.01])),
        color = alt.value("#666666"), # Darker gray
        tooltip = alt.value(
            "Attrition is typically low in November and December due to the holidays."
        ) # Text for the end of year
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Date", 
            oneOf = ["NOV", "DEC"]
            )
        )
)

final_tooltip = (
    bar_base + 
    bar_apr + 
    bar_jul_aug + 
    bar_dec_nov
    )

final_tooltip.configure_view(stroke = None)
Out[82]:

Chapter 4 - Focus Attention�

"Where do you want your audience to look?" - Cole Nussbaumer Knaflic

Exercise 2 - Focus on...�

This exercise focuses on identifying what you want to highlight in a graph, the specific insights you wish the viewer to take from the visual presentation.

If you would like to return to the Table of Contents, you can click here.

Loading the data�

In�[83]:
# Loading table
table = pd.read_excel(r"Data\4.2 EXERCISE.xlsx", usecols = [1, 2, 3], header = 5, skipfooter = 30)

# Fixing names
table['Brands'] = table['Unnamed: 1']
table['Change'] = table['$ Vol % change']

table.drop(columns = ['Unnamed: 1', '$ Vol % change'], inplace = True)
table
Out[83]:
spacing for dot plot Brands Change
0 0 Fran's Recipe -0.14
1 1 Wholesome Goodness -0.13
2 2 Lifestyle -0.10
3 3 Coat protection -0.09
4 4 Diet Lifestyle -0.08
5 5 Feline Basics -0.05
6 6 Lifestyle Plus -0.04
7 7 Feline Freedom -0.02
8 8 Feline Gold 0.01
9 9 Feline Platinum 0.01
10 10 Feline Instinct 0.02
11 11 Feline Pro 0.03
12 12 Farm Fresh Tasties 0.04
13 13 Feline Royal 0.05
14 14 Feline Focus 0.09
15 15 Feline Grain Free 0.09
16 16 Feline Silver 0.12
17 17 Nutri Balance 0.16
18 18 Farm Fresh Basics 0.17

Graph without highlights�

Initially, we will show the graph in a light gray color. Notice that not using sort = None results in the brands being arranged in alphabetical order.

In�[84]:
# Brans in alphabetical order
alt.Chart(table).mark_bar().encode(
    x = "Change",
    y = "Brands"
    )
Out[84]:
In�[85]:
# "Not sorted" graph

alt.Chart(table).mark_bar().encode(
    x = "Change",
    y = alt.Y("Brands", sort = None)
    )
Out[85]:
In�[86]:
# Customized graph

chart = (
    alt.Chart(
        table,
        title = alt.Title(
            "Cat food brands: YoY sales change",
            subtitle = "% CHANGE IN VOLUME ($)", # Add subtitle
            color = "black",
            subtitleColor = "gray",
            offset = 10,
            anchor = "start",
            fontSize = 19,
            subtitleFontSize = 11,
            fontWeight = "normal"
        ),
    )
    .mark_bar(color = "#8b8b8b", size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]), # x-axis from -20% to 20%
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "DECREASED | INCREASED"
        ),
        y = alt.Y("Brands", sort = None, axis = None)
    )
)


# Brand names that go on the right
label1 = (
    alt.Chart(table.loc[table["Change"] < 0])
    .mark_text(
        align = "left", 
        color = "#8b8b8b", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(207), 
        y = alt.Y("Brands", sort = None), 
        text = alt.Text("Brands")
        )
)

# Brand names that go on the left
label2 = (
    alt.Chart(table.loc[table["Change"] > 0])
    .mark_text(
        align = "right", 
        color = "#8b8b8b", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(192), 
        y = alt.Y("Brands", sort = None), 
        text = alt.Text("Brands")
        )
)

gray = chart + label1 + label2

gray.properties(width = 400).configure_view(stroke = None)
Out[86]:

Visualization as depicted in the book:

Alt text

Highlighting the Lifestyle brands�

Following this, the author suggests highlighting a specific brand of cat food, the Lifestyle line. All labels and bars associated with this brand will be colored black to captivate the viewer's attention.

In�[87]:
# Creates a list of brand in the Lifestyle line
conditions = [
    f'datum.Brands == "{brand}"' for brand in table["Brands"] if "Lifestyle" in brand
]
condition = f"({'|'.join(conditions)})"


# Create the chart
chart = (
    alt.Chart(
        table,
        title = alt.Title(
            "Cat food brands:", # Title
            subtitle = "YEAR-OVER-YEAR % CHANGE IN VOLUME ($)",
            color = "black",
            subtitleColor = "gray",
            anchor = "start",
            fontSize = 19,
            subtitleFontSize = 11,
            fontWeight = "normal"
        )
    )
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "DECREASED | INCREASED"
        ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition( # If brand in Lifestyle, then color equals black
            condition,
            alt.value("black"),
            alt.value("#c6c6c6")
        )
    )
)

# Labels to the right
label1_bw = (
    alt.Chart(table.loc[table["Change"] < 0])
    .mark_text(align = "left", fontWeight = 700)
    .encode(
        x = alt.value(207),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        # If brand in Lifestyle, then label equals black
        color = alt.condition(
            condition, 
            alt.value("black"), 
            alt.value("#c6c6c6")
            )
    )
)

# Labels to the left
# There are no lifestyle brand line in this category
label2_bw = (
    alt.Chart(table.loc[table["Change"] > 0])
    .mark_text(
        align = "right", 
        color = "#c6c6c6", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(192), 
        y = alt.Y("Brands", sort = None), 
        text = alt.Text("Brands")
        )
)

# Add bold part of the title separately
title_bw = (
    alt.Chart({"values": [
        {"text": ["Lifestyle line brands decline"]}
        ]})
    .mark_text(
        size = 19, 
        align = "left", 
        dx = 172, 
        dy = -250, 
        fontWeight = 700, 
        color = "black"
        )
    .encode(text = "text:N")
)

(chart + label1_bw + label2_bw + title_bw).properties(
    width = 400
    ).configure_view(
        stroke = None
        )
Out[87]:

Given that incorporating the bold section of the title next to the chart's predefined title proved unsuccessful, we will proceed by generating the chart without a title. Subsequently, we will add both texts separately.

In�[88]:
# Graph without title
chart_bw = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
            x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = alt.Axis(
                grid = False, orient = "top", 
                labelColor = "#888888", titleColor = '#888888', 
                titleFontWeight = 'normal', format = "%"
                ),
        title = "DECREASED | INCREASED"
        ),
    y = alt.Y(
        "Brands", 
        sort = None, 
        axis = None
        ),
    color = alt.condition(
        condition,
        alt.value('black'), 
        alt.value('#c6c6c6')
        )
    )

# Part one of title
title_bw = (
    alt.Chart({"values": [
        {"text": ["Cat food brands:"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -200, 
        dy = -270, 
        fontWeight = "normal", 
        color = "black"
    )
    .encode(text = "text:N")
)

# Part two of title
title_bw_bold = (
    alt.Chart({"values": [
        {"text": ["Lifestyle line brands decline"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -78, 
        dy = -270, 
        fontWeight = 700, 
        color = "black"
        )
    .encode(text = "text:N")
)

# Subtitle
subtitle_bw = (
    alt.Chart({"values": [
        {"text": ["YEAR-OVER-YEAR % CHANGE IN VOLUME ($)"]}
        ]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -200, 
        dy = -250, 
        fontWeight = "normal", 
        color = "gray"
    )
    .encode(text = "text:N")
)


lifestyle = (
    chart_bw + 
    label1_bw + 
    label2_bw + 
    title_bw + 
    title_bw_bold + 
    subtitle_bw
    )

lifestyle.properties(width = 400).configure_view(stroke = None)
        
Out[88]:

Visualization as depicted in the book:

Alt text

Highlighting the Feline brand�

In the next step, our focus is on highlighting the Feline brand line. The difference is that we have the information that the brand uses a purple logo, and therefore, we can incorporate this color into our design.

In�[89]:
# Set condition to list of Feline products
conditions = [
    f'datum.Brands == "{brand}"' for brand in table["Brands"] if "Feline" in brand
]
condition_purple = f"({'|'.join(conditions)})"


# Highlights Feline products with purple color
chart_purple = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "DECREASED | INCREASED"
        ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_purple, 
            alt.value("#713a97"), 
            alt.value("#c6c6c6")
        )
    )
)

# Labels to the right
label1_purple = (
    alt.Chart(
        table.loc[table["Change"] < 0]
        )
    .mark_text(
        align = "left", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(207),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.condition(
            condition_purple, 
            alt.value("#713a97"), 
            alt.value("#c6c6c6")
        )
    )
)

# Labels to the left
# Now we need to add condition as there are Feline products in this
label2_purple = (
    alt.Chart(
        table.loc[table["Change"] > 0]
        )
    .mark_text(
        align = "right", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(192),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.condition(
            condition_purple, 
            alt.value("#713a97"), 
            alt.value("#c6c6c6")
        )
    )
)

# Second title in purple
title_purple = (
    alt.Chart({"values": [
        {"text": ["most in Feline line increased"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -78, 
        dy = -270, 
        fontWeight = 700, 
        color = "#713a97"
        )
    .encode(text = "text:N")
)


feline = (
    chart_purple + 
    label1_purple + 
    label2_purple + 
    title_bw + 
    title_purple + 
    subtitle_bw
)

feline.properties(width = 400).configure_view(stroke = None)
Out[89]:

Visualization as depicted in the book:

Alt text

In the displayed chart from the book, there is an error where the bar corresponding to "Feline Gold" cat food appears light gray instead of the correct purple, consistent with the labeling. A request for information on the process of formally submitting an erratum has been made.

Highlighting brands that declined�

Moving forward, we will selectively color the brands that experienced a decline in sales. Recognizing the accessibility challenges associated with the conventional red and green combination, the author suggests using orange and blue to cater to individuals with colorblindness.

In�[90]:
# Condition based on negative values of Change
condition = "datum.Change < 0"

# Chart with orange color
chart_orange = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y(
            "Brands", 
            sort = None, 
            axis = None
            ),
        color = alt.condition(
            condition, 
            alt.value("#ec7c30"), 
            alt.value("#c6c6c6")
            )
    )
)

# The right label is for decreased brands
# so we will color all of it with orange
label1_orange = (
    alt.Chart(
        table.loc[table["Change"] < 0]
        )
    .mark_text(
        align = "left", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(207),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.value("#ec7c30")
    )
)

# There is no need to create labeling for the left
# it will all be gray and we already have that variable

# Second title in orange
title_orange = (
    alt.Chart({"values": [
        {"text": ["8 brands decreased in sale"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -78, 
        dy = -270, 
        fontWeight = 700, 
        color = "#ec7c30"
        )
    .encode(text = "text:N")
)

# "DECREASED" text in orange, serves as a label
decreased_orange = (
    alt.Chart({"values": [
        {"text": ["DECREASED"]}
        ]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -80, 
        dy = -220, 
        fontWeight = 700, 
        color = "#ec7c30"
        )
    .encode(text = "text:N")
)

# "INCREASED" text in gray
increased_gray = (
    alt.Chart({"values": [
        {"text": ["|    INCREASED"]}
        ]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -0, 
        dy = -220, 
        fontWeight = 700, 
        color = "#8b8b8b"
        )
    .encode(text = "text:N")
)

decreased = (
    chart_orange
    + label1_orange
    + label2_bw
    + title_bw
    + title_orange
    + subtitle_bw
    + decreased_orange
    + increased_gray
)

decreased.properties(width = 400).configure_view(stroke = None)
Out[90]:

Visualization as depicted in the book:

Alt text

Highlighting the two brands that decreased the most�

To achieve this, the author has opted to maintain the coloration of all brands that experienced a decrease while placing additional emphasis on the two with the most significant declines by using a brighter shade of orange compared to the others.

In�[91]:
# Create a list of brands that decreased the most
decreased_most = table.nsmallest(2, "Change")
brands_decreased = decreased_most["Brands"].tolist()

# Create a condition based on that list
conditions = [f'datum.Brands == "{brand}"' for brand in brands_decreased]
condition = f"({'|'.join(conditions)})"

# Create a list of brands with positive changes for auxiliary graph
positive_brands = table.loc[table["Change"] > 0, "Brands"].unique()
positive_brands_list = positive_brands.tolist()

# Chart of brands that decreased
chart_oranges = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y("Brands", sort = None, axis = None),
        # If the Brand is in the list of decreased most,
        # color it orange - else, color it light orange
        color = alt.condition(
            condition, 
            alt.value("#ec7c30"), 
            alt.value("#efb284")
            )
    )
)

# Auxiliary graph with gray bottom
chart_oranges2 = (
    alt.Chart(table)
    # Color equals gray
    .mark_bar(
        size = 15, 
        color = "#c6c6c6", 
        opacity = 1
        )
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y(
            "Brands", 
            sort = None, 
            axis = None
            )
    )
    .transform_filter( # Only positive brands
        alt.FieldOneOfPredicate(
            field = "Brands", 
            oneOf = positive_brands_list
            )
    )
)

# Label to the right
label1_oranges = (
    alt.Chart(
        table.loc[table["Change"] < 0]
        )
    .mark_text(align = "left", fontWeight = 700)
    .encode(
        x = alt.value(207),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        # Color it orange or light orange
        color = alt.condition(
            condition, 
            alt.value("#ec7c30"), 
            alt.value("#efb284")
            )
    )
)

# Again, no need for new labels for the left

# Second title in orange
title_oranges = (
    alt.Chart({"values": [
        {"text": ["2 brands decreased the most"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -78, 
        dy = -270, 
        fontWeight = 700, 
        color = "#ec7c30"
        )
    .encode(text = "text:N")
)


decreased2 = (
    chart_oranges
    + chart_oranges2
    + label1_oranges
    + label2_bw
    + title_bw
    + title_oranges
    + subtitle_bw
    + decreased_orange
    + increased_gray
)

decreased2.properties(width = 400).configure_view(stroke = None)
Out[91]:

Visualization as depicted in the book:

Alt text

Highlighting brands that increased�

This code closely resembles the one used to highlight all brands that experienced a decrease, with modifications made to the condition and color, now blue.

In�[92]:
# Change signal of condition from < to >
condition = "datum.Change > 0"

# Graph highlighting increased brands
chart_blue = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y("Brands", sort = None, axis = None),
        # If brand increased, color it blue, else, color it gray
        color = alt.condition(
            condition, 
            alt.value("#4772b8"), 
            alt.value("#c6c6c6")
            )
    )
)

# Create labels to the left
label2_blue = (
    alt.Chart(
        table.loc[table["Change"] > 0]
        )
    .mark_text(
        align = "right", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(192),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.condition(
            condition, 
            alt.value("#4772b8"), 
            alt.value("#c6c6c6")
            )
    )
)

# This time there is no need to create labels on the right

# Second title in blue
title_blue = (
    alt.Chart({"values": [
        {"text": ["11 brands flat to increasing"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -78, 
        dy = -270, 
        fontWeight = 700, 
        color = "#4772b8"
        )
    .encode(text = "text:N")
)

# "DECREASED" text in gray
decreased_gray = (
    alt.Chart({"values": [
        {"text": ["DECREASED    |"]}
        ]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -80, 
        dy = -220, 
        fontWeight = 700, 
        color = "#8b8b8b"
        )
    .encode(text = "text:N")
)

# "INCREASED" text in blue
increased_blue = (
    alt.Chart({"values": [{"text": ["INCREASED"]}]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 20, 
        dy = -220, 
        fontWeight = 700, 
        color = "#4772b8"
        )
    .encode(text = "text:N")
)

increased = (
    chart_blue
    + label1
    + label2_blue
    + title_bw
    + title_blue
    + subtitle_bw
    + decreased_gray
    + increased_blue
)

increased.properties(width = 400).configure_view(stroke = None)
Out[92]:

Visualization as depicted in the book:

Alt text

Final view with all highlights�

For the last question, the author explores the creation of a conclusive slide that highlights all the points discussed above. For this, we will create two graphs, one with both Feline and Lifestyle Brands and one showcasing increased and decreased sales. The author initially paired the visualizations with text, but as the manual labor-intensive method has already been explored, we will abstain from incorporating additional textual elements.

In�[93]:
# Graph with both brands

# Lifestyle brand
chart_bw2 = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "DECREASED | INCREASED"
        ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.value("black") # Color black with no condition
    )
    .transform_filter( # Filter to be only lifestyle products
        alt.FieldOneOfPredicate(
            field = "Brands", 
            oneOf = ["Lifestyle", "Lifestyle Plus", "Diet Lifestyle"]
        )
    )
)

# We can reuse the left labeling for the lifestyle brand

# Second title in black
title_bw_bold_2 = (
    alt.Chart({"values": [
        {"text": ["mixed results in sales year-over-year"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -78, 
        dy = -270, 
        fontWeight = 700, 
        color = "black"
        )
    .encode(text = "text:N")
)

# Creates list of all brands that are not Lifestyle
not_lifestyle = table[
    ~table["Brands"].isin(["Lifestyle", "Lifestyle Plus", "Diet Lifestyle"])
]
not_lifestyle = not_lifestyle["Brands"].tolist()

# Create the labeling in the right for the Feline brand in purple
# If we reuse the one we already have, it will overwrite the labels in black
label1_purple_2 = (
    alt.Chart(
        table.loc[table["Change"] < 0]
        )
    .mark_text(
        align = "left", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(207),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.condition(
            condition_purple, 
            alt.value("#713a97"), 
            alt.value("#c6c6c6")
        )
    )
    # Filter out the Lifestyle brand so that it is not overwritten
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands", 
            oneOf = not_lifestyle
            )
        )
)

# Since there are no Lifestyle brand labeling on the left,
# there is no need to create a new left labeling for Feline

mixed = (
    chart_purple
    + chart_bw2
    + label1_bw
    + label1_purple_2
    + label2_purple
    + title_bw
    + title_bw_bold_2
    + subtitle_bw
)

mixed.properties(width = 400).configure_view(stroke = None)
Out[93]:

Visualization as depicted in the book:

Alt text

In�[94]:
# Create a list to the most increased and most decreased brands
decreased_most = table.nsmallest(2, "Change")
increased_most = table.nlargest(2, "Change")

brands_decreased = decreased_most["Brands"].tolist()
brands_increased = increased_most["Brands"].tolist()

# Create a condition for brands that increased and decreased
conditions_decreased = [f'datum.Brands == "{brand}"' for brand in brands_decreased]
condition_decreased = f"({'|'.join(conditions_decreased)})"

conditions_increased = [f'datum.Brands == "{brand}"' for brand in brands_increased]
condition_increased = f"({'|'.join(conditions_increased)})"

# Create a gray graph representing middle brands
chart_gray = (
    alt.Chart(table)
    .mark_bar(color = "#c6c6c6", size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y(
            "Brands", 
            sort = None, 
            axis = None
            )
    )
)

# Labels on the right for those middle brands
label1_gray = (
    alt.Chart(
        table.loc[table["Change"] < 0]
        )
    .mark_text(
        align = "left", 
        color = "#c6c6c6", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(207), 
        y = alt.Y("Brands", sort = None), 
        text = alt.Text("Brands")
        )
)

# Labels on the left for those middle brands
label2_gray = (
    alt.Chart(
        table.loc[table["Change"] > 0]
        )
    .mark_text(
        align = "right", 
        color = "#c6c6c6", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(192), 
        y = alt.Y("Brands", sort = None), 
        text = alt.Text("Brands")
        )
)

# Graph for decreased values
chart_oranges_mix = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            # If on top 2 decreased, bright orange
            # Else, light orange
            condition_decreased, 
            alt.value("#ec7c30"), 
            alt.value("#efb284")
        )
    )
    # Filter to only the five most decreased brands
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = list(table["Brands"][0:5])
        )
    )
)

# Labeling on the right for decreased brands
label_oranges = (
    alt.Chart(
        table.loc[table["Change"] < 0]
        )
    .mark_text(
        align = "left", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(207),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.condition(
            # If on top 2 decreased, bright orange
            # Else, light orange
            condition_decreased, 
            alt.value("#ec7c30"), 
            alt.value("#efb284")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = list(table["Brands"][0:5])
        )
    )
)

# Graph for increased values
chart_blue_mix = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y(
            "Brands", 
            sort = None, 
            axis = None
            ),
        color = alt.condition(
            # If on top 2 increased, color blue
            # Else, color it light blue
            condition_increased, 
            alt.value("#4772b8"), 
            alt.value("#91a9d5")
        )
    )
    # Filter to only increased brands
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = list(table["Brands"][14:19])
        )
    )
)

# Labeling on the left for increased brands
label_blue = (
    alt.Chart(
        table.loc[table["Change"] > 0]
        )
    .mark_text(align = "right", fontWeight = 700)
    .encode(
        x = alt.value(192),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.condition(
            condition_increased, 
            alt.value("#4772b8"), 
            alt.value("#91a9d5")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = list(table["Brands"][14:19])
        )
    )
)

# We will unite the "DECREASED" orange with the "INCREASED" blue
# So we will create the "|" separation in gray 
separation = (
    alt.Chart({"values": [
        {"text": ["|"]}
        ]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 3, 
        dy = -220, 
        fontWeight = 700, 
        color = "#c6c6c6"
        )
    .encode(text = "text:N")
)

mixed2 = (
    chart_gray
    + chart_oranges_mix
    + chart_blue_mix
    + label1_gray
    + label2_gray
    + label_oranges
    + label_blue
    + title_bw
    + title_bw_bold_2
    + subtitle_bw
    + decreased_orange
    + increased_blue
    + separation
)

mixed2.properties(width = 400).configure_view(stroke = None)
Out[94]:

Visualization as depicted in the book:

Alt text

Although this exercise does not have an interactive version of the graph, we will discuss the color palette of it in a subsequent exercise. There, an interactive version relating to the colors will be demonstrated.

Exercise 3 - Direct attention many ways�

While the last visualizations was about highlighting key elements with color, in this exercise we will explore other ways to bring the viewer's attention, such as contrast and size.

If you would like to return to the Table of Contents, you can click here.

Loading the data�

In�[95]:
# Load Excel file
table = pd.read_excel(r"Data\4.3 EXERCISE.xlsx", usecols = [1, 2, 3, 4], header = 5, skipfooter = 5)

table
Out[95]:
YEAR Total Organic Referral
0 2005 0.087 0.033 0.054
1 2006 0.083 0.035 0.048
2 2007 0.086 0.037 0.049
3 2008 0.089 0.036 0.053
4 2009 0.084 0.034 0.050
5 2010 0.086 0.031 0.055
6 2011 0.075 0.032 0.043
7 2012 0.072 0.035 0.037
8 2013 0.069 0.032 0.037
9 2014 0.074 0.038 0.036
10 2015 0.066 0.035 0.031
11 2016 0.080 0.045 0.035
12 2017 0.073 0.041 0.032
13 2018 0.070 0.042 0.028
14 2019 0.063 0.038 0.025
In�[96]:
# Transform into the long format
melted_table = pd.melt(
    table, 
    id_vars = ['YEAR'], 
    var_name = 'Metric', 
    value_name = 'Value'
    )

melted_table["Metric"] = melted_table["Metric"].str.upper()

melted_table
Out[96]:
YEAR Metric Value
0 2005 TOTAL 0.087
1 2006 TOTAL 0.083
2 2007 TOTAL 0.086
3 2008 TOTAL 0.089
4 2009 TOTAL 0.084
5 2010 TOTAL 0.086
6 2011 TOTAL 0.075
7 2012 TOTAL 0.072
8 2013 TOTAL 0.069
9 2014 TOTAL 0.074
10 2015 TOTAL 0.066
11 2016 TOTAL 0.080
12 2017 TOTAL 0.073
13 2018 TOTAL 0.070
14 2019 TOTAL 0.063
15 2005 ORGANIC 0.033
16 2006 ORGANIC 0.035
17 2007 ORGANIC 0.037
18 2008 ORGANIC 0.036
19 2009 ORGANIC 0.034
20 2010 ORGANIC 0.031
21 2011 ORGANIC 0.032
22 2012 ORGANIC 0.035
23 2013 ORGANIC 0.032
24 2014 ORGANIC 0.038
25 2015 ORGANIC 0.035
26 2016 ORGANIC 0.045
27 2017 ORGANIC 0.041
28 2018 ORGANIC 0.042
29 2019 ORGANIC 0.038
30 2005 REFERRAL 0.054
31 2006 REFERRAL 0.048
32 2007 REFERRAL 0.049
33 2008 REFERRAL 0.053
34 2009 REFERRAL 0.050
35 2010 REFERRAL 0.055
36 2011 REFERRAL 0.043
37 2012 REFERRAL 0.037
38 2013 REFERRAL 0.037
39 2014 REFERRAL 0.036
40 2015 REFERRAL 0.031
41 2016 REFERRAL 0.035
42 2017 REFERRAL 0.032
43 2018 REFERRAL 0.028
44 2019 REFERRAL 0.025

Graph without highlights�

First, we will create the graph without bringing attention to any specific data.

In�[97]:
# Line graph
line_simple = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Conversion rate over time",
            fontWeight = "normal",
            anchor = "start",
            fontSize = 17
        )
    )
    .mark_line(strokeWidth = 3) # Set thickness of the line
    .encode(
        x = alt.X(
            "YEAR:O", # Inform that Year is an Ordinal data
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0) # Align horizontally
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1]) # y-axis goes from 0 to 0.1
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#aaaaaa"]), 
            legend = None
            ) # All lines gray
    )
    .properties(width = 500)
)

# Labeling for the lines
text_simple = (
    alt.Chart(melted_table)
    .mark_text(
        align = "left", 
        dx = 20, 
        size = 13, 
        color = "#aaaaaa"
        )
    .encode(
        # Place text at the end of the line
        x = alt.X(
            "YEAR", 
            aggregate = "max", 
            axis = None
            ),
        y = alt.Y(
            "Value", 
            aggregate = {"argmax": "YEAR"}
            ),
        text = "Metric"
    )
)

gray_line = line_simple + text_simple
gray_line.configure_view(stroke = None)
Out[97]:

Visualization as depicted in the book:

Alt Text

Not implemented versions�

Before initiating the generated graphs, it is important to acknowledge that two highlight options—Arrows and Circles—were ultimately not implemented for this project. These versions are: Arrows and Circles. The decision not to include them stemmed from the challenges associated with their execution in Altair and their perceived lack of effectiveness in conveying the intended information, as the author herself admits by calling them "brute force options".

Arrow highlight in the book:

Alt text

Circle highlight in the book:

Alt text

Transparent white boxes�

In constructing this graph, the author employed transparent white boxes to obscure lines other than Referral, aiming to reduce their visibility and emphasize the desired data. However, a more elegant approach in Altair involves achieving the same effect by simply adjusting the opacity of each line.

In�[98]:
# Create line graph
line = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Conversion rate over time",
            fontWeight = "normal",
            anchor = "start",
            fontSize = 17
        )
    )
    .mark_line(strokeWidth = 3)
    .encode(
        x = alt.X(
            "YEAR:O",
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%",
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1])
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#aaaaaa"]), 
            legend = None
            ),
        opacity = alt.condition(
            # If Metric equals Referral, it has maximum opacity
            # Otherwise, it is slightly transparent
            alt.datum["Metric"] == "REFERRAL", 
            alt.value(1), 
            alt.value(0.5)
        )
    )
    .properties(width = 500)
)

# Text for the end of each line
text = (
    alt.Chart(melted_table)
    .mark_text(
        align = "left", 
        baseline = "middle", 
        dx = 20, 
        size = 13, 
        color = "#aaaaaa"
        )
    .encode(
        x = alt.X(
            "YEAR:O", 
            aggregate = "max", 
            axis = None
            ),
        y = alt.Y(
            "Value", 
            aggregate = {"argmax": "YEAR"}
            ),
        text = "Metric",
        opacity = alt.condition(
            # If Metric equals Referral, it has maximum opacity
            # Otherwise, it is slightly transparent        
            alt.datum["Metric"] == "REFERRAL", 
            alt.value(1), 
            alt.value(0.5)
        )
    )
)

final_boxes = line + text
final_boxes.configure_view(stroke = None)
Out[98]:

Visualization as depicted in the book:

Alt text

White boxes created by the author:

Alt text

Thicken the line�

Next, we will increase the thickness of the Referral line to draw greater attention to it.

In�[99]:
# Line graph with thicker line
line = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Conversion rate over time",
            fontWeight = "normal",
            anchor = "start",
            fontSize = 17
        )
    )
    .mark_line()
    .encode(
        x = alt.X(
            "YEAR:O",
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1])
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#aaaaaa"]), 
            legend = None
            ),
        strokeWidth = alt.condition(
            # Make Referral line thicker
            alt.datum["Metric"] == "REFERRAL", 
            alt.value(4), 
            alt.value(2)
        )
    )
    .properties(width = 500)
)

# Text at the end of the line
text = (
    alt.Chart(melted_table)
    .mark_text(
        align = "left", 
        baseline = "middle", 
        dx = 20, 
        size = 13, 
        color = "#aaaaaa"
        )
    .encode(
        x = alt.X(
            "YEAR", 
            aggregate = "max", 
            axis = None
            ),
        y = alt.Y(
            "Value", 
            aggregate = {"argmax": "YEAR"}
            ),
        text = "Metric",
        opacity = alt.condition(
            # Make Referral line more opaque
            alt.datum["Metric"] == "REFERRAL", 
            alt.value(1), 
            alt.value(0.7)
        )
    )
)

final_thick = line + text
final_thick.configure_view(stroke = None)
Out[99]:

Visualization as depicted in the book:

Alt text

Change line style�

We can change the Referral line to a dashed line, while maintaining other continuous. This type of highlight is recommended when you are representing some type of uncertain data, such as a prediction or goal.

In�[100]:
# Line chart
line = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Conversion rate over time",
            fontWeight = "normal",
            anchor = "start",
            fontSize = 17
        )
    )
    .mark_line(strokeWidth = 3)
    .encode(
        x = alt.X(
            "YEAR:O",
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1])
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#aaaaaa"]), 
            legend = None
            ),
        strokeDash = alt.condition(
            # Define dash based on condition if equal referral
            alt.datum["Metric"] == "REFERRAL", 
            alt.value([5, 3]), 
            alt.value([1, 0])
        )
    )
    .properties(width = 500)
)

# We will use the same text as the original version of the graph

final_dashed = line + text_simple
final_dashed.configure_view(stroke = None)
Out[100]:

Visualization as depicted in the book:

Alt text

Leverage intensity�

In the opposite idea of making the other lines slightly transparent, we can make the referral line darker in color.

Since the next part of the exercise is to covert the Referral line on visually top of the others and the default behavior Altair already does that, we will undertake the task of intentionally positioning the Referral line below for now. This will be achieved by creating the Referral line separately and adding it as the initial layer in the graph.

In�[101]:
# Line only for referral
line_referral = (
    alt.Chart(melted_table)
    .mark_line(strokeWidth = 3)
    .encode(
        x = alt.X(
            "YEAR:O",
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1])
        ),
        color = alt.value("black"), # Color it black
        opacity = alt.value(1) # Maximum opacity
    )
    .properties(width = 500)
    # Filter out other lines
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "Metric", 
            equal = "REFERRAL"
            )
        )
)

# Other lines
line_rest = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Conversion rate over time",
            fontWeight = "normal",
            anchor = "start",
            fontSize = 17
        )
    )
    .mark_line(strokeWidth = 3)
    .encode(
        x = alt.X(
            "YEAR:O",
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1])
        ),
        # Gray color
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#aaaaaa"]), 
            legend = None
            ),
        opacity = alt.value(1) # Maximum opacity
    )
    .properties(width = 500)
    # Filter referral out
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Metric", 
            oneOf = ["ORGANIC", "TOTAL"]
            )
        )
)

# Create text that highlights Referral in black
text_highlight = (
    alt.Chart(melted_table)
    .mark_text(
        align = "left", 
        dx = 20, 
        size = 13
        )
    .encode(
        x = alt.X(
            "YEAR", 
            aggregate = "max", 
            axis = None
            ),
        y = alt.Y(
            "Value", 
            aggregate = {"argmax": "YEAR"}
            ),
        text = "Metric",
        color = alt.condition(
            alt.datum["Metric"] == "REFERRAL", 
            alt.value("black"), 
            alt.value("#aaaaaa")
        )
    )
)

final_darker = line_referral + line_rest + text_highlight
final_darker.configure_view(stroke = None)
Out[101]:

Adding the text created an undesired axis on top, that we could not remove by setting Axis = None. To solve this issue, we will define each word by pixel value.

In�[102]:
# Text for Total
text_total = alt.Chart({"values": 
                    [{"text":  ['TOTAL']}]
                    }
                    ).mark_text(size = 13, 
                                align = "left", 
                                dx = 230, dy = -37, 
                                color = '#aaaaaa'
                                ).encode(text = "text:N")

# Text for Organic
text_organic = alt.Chart({"values": 
                    [{"text":  ['ORGANIC']}]
                    }
                    ).mark_text(size = 13, 
                                align = "left", 
                                dx = 230, dy = 37, 
                                color = '#aaaaaa'
                                ).encode(text = "text:N")

# Text for Referral
text_referral = alt.Chart({"values": 
                    [{"text":  ['REFERRAL']}]
                    }
                    ).mark_text(size = 13, 
                                align = "left", 
                                dx = 230, dy = 78, 
                                color = 'black'
                                ).encode(text = "text:N")

final_darker = (
    line_referral 
    + line_rest 
    + text_total 
    + text_organic 
    + text_referral
    )

final_darker.configure_view(stroke = None)
Out[102]:

Visualization as depicted in the book:

Alt text

Position Referral on top of other lines�

We can create this effect by simply making a line graph where the color has a condition for Metric == Referral, or a scale with range [gray, black, gray]. However, since we already have the visualizations from above, we can just change the order we add them so that the line referral is on top of the others.

In�[103]:
final_top = (
    line_rest 
    + line_referral 
    + text_total 
    + text_organic 
    + text_referral
    )

final_top.configure_view(stroke = None)
Out[103]:

Visualization as depicted in the book:

Alt text

Change the hue�

Similarly to setting the highlighted line to black, we can change it to other colors such as red.

In�[104]:
# Create a red referral line
line_red = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Conversion rate over time",
            fontWeight = "normal",
            anchor = "start",
            fontSize = 17
        )
    )
    .mark_line(strokeWidth = 3)
    .encode(
        x = alt.X(
            "YEAR:O",
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1])
        ),
        color = alt.Color(
            "Metric",
            # The hex code in the middle corresponds to the referral line
            scale = alt.Scale(
                range = ["#aaaaaa", "#d24b53", "#aaaaaa"]
                ),
            legend = None
        )
    )
    .properties(width = 500)
)

# Create a referral red text
text_red = (
    alt.Chart(melted_table)
    .mark_text(
        align = "left", 
        baseline = "middle", 
        dx = 20, 
        size = 13
        )
    .encode(
        x = alt.X(
            "YEAR", 
            aggregate = "max", 
            axis = None
            ),
        y = alt.Y(
            "Value", 
            aggregate = {"argmax": "YEAR"}
            ),
        text = "Metric",
        color = alt.Color(
            "Metric",
            # Set referral color to red
            scale = alt.Scale(range = ["#aaaaaa", "#d24b53", "#aaaaaa"]),
            legend = None
        )
    )
)

final_red = line_red + text_red
final_red.configure_view(stroke = None)
Out[104]:

Visualization as depicted in the book:

Alt text

Use words�

For this example, we will add a second part for the title.

In�[105]:
# Same graph as the one without highlights
# but with a bigger title
line = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            # Add more explicit title
            "Conversion rate over time: Referral decreasing markedly since 2010",
            fontWeight = "normal",
            anchor = "start",
            fontSize = 17
        ),
    )
    .mark_line(strokeWidth = 3)
    .encode(
        x = alt.X(
            "YEAR:O",
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1])
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#aaaaaa"]), 
            legend = None
            )
    )
    .properties(width = 500)
)

final_text = line + text_simple
final_text.configure_view(stroke = None)
Out[105]:

Visualization as depicted in the book:

Alt text

Eliminate other data�

For this, we will only display the pertinent data.

In�[106]:
# Filter other data out of the original graph
final_referral = gray_line.transform_filter(
    alt.FieldEqualPredicate(
        field = "Metric", 
        equal = "REFERRAL")
        )
final_referral.configure_view(stroke = None)
Out[106]:

Visualization as depicted in the book:

Alt text

Animate to appear�

This option is only described in the book and not visually demonstrated due to its static nature. The idea of the author is to animate in such a way that one line appears at a time. We will double this section as our interactive graph and display a visualization where users can click on the legend to reveal the desired line. Holding down the Shift key while clicking will enable the selection of multiple lines.

In�[107]:
# Define an interactive selection
metric_selection = alt.selection_point(fields = ["Metric"])

# Base line chart
line = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Conversion rate over time: Referral decreasing markedly since 2010",
            fontWeight = "normal",
            anchor = "start",
            fontSize = 17
        )
    )
    .mark_line(strokeWidth = 3)
    .encode(
        x = alt.X(
            "YEAR:O",
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal",
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1])
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#aaaaaa"])
            ),
        # Maximum transparency if not selected
        opacity = alt.condition(
            metric_selection, 
            alt.value(1), 
            alt.value(0)
            )
    )
    .add_params(metric_selection)
    .transform_filter(metric_selection)
    .properties(width = 500)
)

# Text chart
text = (
    alt.Chart(melted_table)
    .mark_text(
        align = "left", 
        baseline = "middle", 
        dx = 20, 
        size = 13
        )
    .encode(
        x = alt.X(
            "YEAR", 
            aggregate = "max", 
            axis = None
            ),
        y = alt.Y(
            "Value", 
            aggregate = {"argmax": "YEAR"}
            ),
        text = "Metric",
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#aaaaaa"]), 
            legend = None
            ),
        # Maximum transparency if not selected
        opacity = alt.condition(
            metric_selection, 
            alt.value(1), 
            alt.value(0)
            )
    )
    .add_params(metric_selection)
    .transform_filter(metric_selection)
)

# CLickable legend
legend = (
    alt.Chart(melted_table)
    .mark_point()
    .encode(
        alt.Y(
            "Metric", 
            axis = alt.Axis(orient = "right")
            ),
        color = alt.condition(
            metric_selection, 
            alt.value("#aaaaaa"), 
            alt.value("lightgrey")
        )
    )
    .add_params(metric_selection)
)

final_interactive = line + text | legend
final_interactive.configure_view(stroke = None)
Out[107]:

Add data markers�

We can add data markers simply by defining point = True in the line mark.

In�[108]:
# Referral line with markers
line_referral_markers = (
    alt.Chart(melted_table)
    .mark_line(
        strokeWidth = 3, 
        point = True
        )
    .encode(
        x = alt.X(
            "YEAR:O",
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1])
        ),
        color = alt.value("#aaaaaa"),
        opacity = alt.value(1)
    )
    .properties(width = 500)
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "Metric", 
            equal = "REFERRAL")
            )
)

# Text only for referral in gray color
# since adding the "text_simple" results in top axis
text_referral_gray = (
    alt.Chart({"values": [
        {"text": ["REFERRAL"]}
        ]})
    .mark_text(
        size = 13, 
        align = "left", 
        dx = 230, 
        dy = 78, 
        color = "#aaaaaa"
        )
    .encode(text = "text:N")
)

final_markers = (
    line_rest 
    + line_referral_markers 
    + text_total 
    + text_organic 
    + text_referral_gray
)
final_markers.configure_view(stroke = None)
Out[108]:

Visualization as depicted in the book:

Alt text

Add data labels�

In addition to the markers, we can also add labels.

In�[109]:
# Coding for the labels
label = alt.Chart(
    melted_table
    ).mark_text(
        align = 'left', 
        dx = 3, 
        color = '#aaaaaa'
        ).encode(
    x = alt.X('YEAR:O'),
    y = alt.Y('Value'),
    text = alt.Text(
        'Value', 
        format = ".1%"
        ),
    xOffset = alt.value(-10),
    yOffset = alt.value(-10)
).transform_filter(
    alt.FieldEqualPredicate(
        field = 'Metric', 
        equal = "REFERRAL"
        )
    )

final = (
    line_rest 
    + line_referral_markers 
    + text_total 
    + text_organic 
    + text_referral_gray
    + label
)
final.configure_view(stroke = None)
Out[109]:

Visualization as depicted in the book:

Alt text

Add end markers and labels�

Instead of adding a marker and label to every data in Referral, we can add it only to the end of each line.

In�[110]:
# Text with the Metric
text = (
    alt.Chart(melted_table)
    .mark_text(
        align = "left", 
        baseline = "middle", 
        dx = 55, 
        size = 13
        )
    .encode(
        x = alt.X(
            "YEAR", 
            aggregate = "max", 
            axis = None
            ),
        y = alt.Y(
            "Value", 
            aggregate = {"argmax": "YEAR"}
            ),
        text = "Metric",
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#aaaaaa"]), 
            legend = None
            )
    )
)

# Text with the Value
text2 = (
    alt.Chart(melted_table)
    .mark_text(
        align = "left", 
        color = "#aaaaaa"
        )
    .encode(
        x = alt.X(
            "YEAR:O", 
            axis = None
            ),
        y = alt.Y("Value"),
        text = alt.Text(
            "Value", 
            format = ".1%"
            ),
        xOffset = alt.value(225)
    )
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "YEAR", 
            equal = 2019
            )
        )
)

# Add end point
point = (
    alt.Chart(melted_table)
    .mark_point(
        filled = True, 
        color = "#aaaaaa"
        )
    .encode(
        x = alt.X(
            "YEAR:O", 
            axis = None
            ), 
        y = alt.Y("Value"), 
        opacity = alt.value(1)
        )
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "YEAR", 
            equal = "2019")
        )
)

line_simple + text + text2 + point
Out[110]:

The points appear to be in the incorrect location on the x-axis. Interestingly, running the visualization without the text elements seems to alter their positions.

In�[111]:
line + point
Out[111]:

To solve this issue, we will simply offset the point in the axis until they are in the right place.

In�[112]:
point = alt.Chart(
    melted_table
    ).mark_point(
        filled = True, 
        color = '#aaaaaa'
        ).encode(
    x = alt.X('YEAR:O', axis = None),
    y = alt.Y('Value'),
    opacity = alt.value(1),
    # Offset points in the x-axis
    xOffset = alt.value(217)
).transform_filter(
    alt.FieldEqualPredicate(
        field = 'YEAR', 
        equal = '2019'
        )
    )

final = line + text + text2 + point
final.configure_view(stroke = None)
Out[112]:

Visualization as depicted in the book:

Alt text

Combine�

To conclude the exercise, we will combine some attributes into one single graph. We will make a thicker colored line, with data markers and data labels, along with tied texts.

In�[113]:
# Red and thicker referral line
line_red_thicker = (
    alt.Chart(
        melted_table # We need to add the title separately
    )
    .mark_line()
    .encode(
        x = alt.X(
            "YEAR:O",
            axis = alt.Axis(
                labelAngle = 0,
                labelColor = "#888888",
                titleColor = "#888888",
                titleAnchor = "start",
                titleFontWeight = "normal"
            ),
            title = "FISCAL YEAR",
            scale = alt.Scale(align = 0)
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = "CONVERSION RATE",
            scale = alt.Scale(domain = [0, 0.1])
        ),
        color = alt.Color(
            "Metric",
            # The hex code in the middle corresponds to the referral line
            scale = alt.Scale(
                range = ["#aaaaaa", "#d24b53", "#aaaaaa"]
                ),
            legend = None
        ),
        strokeWidth = alt.condition(
            # Make Referral line thicker
            alt.datum["Metric"] == "REFERRAL", 
            alt.value(4), 
            alt.value(2)
        )        
    )
    .properties(width = 500)
)

# Red text with Metric
text_red = (
    alt.Chart(melted_table)
    .mark_text(
        align = "left", 
        baseline = "middle", 
        dx = 55, 
        size = 13
        )
    .encode(
        x = alt.X(
            "YEAR", 
            aggregate = "max", 
            axis = None
            ),
        y = alt.Y(
            "Value", 
            aggregate = {"argmax": "YEAR"}
            ),
        text = "Metric",
        color = alt.Color(
            "Metric",
            scale = alt.Scale(
                range = ["#aaaaaa", "#d24b53", "#aaaaaa"]
                ),
            legend = None
        )
    )
)

# Red text with Value
text2_red = (
    alt.Chart(melted_table)
    .mark_text(align = "left")
    .encode(
        x = alt.X("YEAR:O", axis = None),
        y = alt.Y("Value"),
        text = alt.Text("Value", format = ".1%"),
        xOffset = alt.value(225),
        color = alt.Color(
            "Metric",
            scale = alt.Scale(
                range = ["#aaaaaa", "#d24b53", "#aaaaaa"]
                ),
            legend = None
        )
    )
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "YEAR", 
            equal = 2019
            )
        )
)

# Red points
point_red = (
    alt.Chart(melted_table)
    .mark_point(filled = True)
    .encode(
        x = alt.X("YEAR:O", axis = None),
        y = alt.Y("Value"),
        opacity = alt.value(1),
        xOffset = alt.value(217),
        color = alt.Color(
            "Metric",
            scale = alt.Scale(
                range = ["#aaaaaa", "#d24b53", "#aaaaaa"]
                ),
            legend = None
        )
    )
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "YEAR", 
            equal = "2019"
            )
        )
)

# End points
other_points = (
    alt.Chart(melted_table)
    .mark_point(filled = True, color = "#d24b53")
    .encode(
        x = alt.X("YEAR:O", axis = None),
        y = alt.Y("Value"),
        opacity = alt.value(1)
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "YEAR", oneOf = ["2010", "2016"]
            )
        )
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "Metric", equal = "REFERRAL"
            )
        )
)

line_red_thicker + text_red + text2_red + point_red + other_points
Out[113]:

Since rationally positioning the red points failed, we will manually insert the x and y values for each of them.

In�[114]:
# Red point from 2010
point2010 = (
    alt.Chart(melted_table)
    .mark_point(filled = True, color = "#d24b53")
    .encode(
        x = alt.X("YEAR:O", axis = None),
        y = alt.Y("Value"),
        opacity = alt.value(1),
        xOffset = alt.value(-84)
    )
    # Filter only Referral 2010
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "YEAR", 
            equal = "2010"
            )
        )
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "Metric", 
            equal = "REFERRAL"
            )
        )
)

# Red point for referral 2016
point2016 = (
    alt.Chart(melted_table)
    .mark_point(filled = True, color = "#d24b53")
    .encode(
        x = alt.X("YEAR:O", axis = None),
        y = alt.Y("Value"),
        opacity = alt.value(1),
        xOffset = alt.value(117)
    )
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "YEAR", 
            equal = "2016"
            )
        )
    .transform_filter(
        alt.FieldEqualPredicate(
            field = "Metric", 
            equal = "REFERRAL"
            )
        )
)

# Line connecting 2016 point with text
rule2016 = (
    alt.Chart()
    .mark_rule()
    .encode(
        x = alt.value(367),
        y = alt.datum(0.034),
        y2 = alt.datum(0.013),
        color = alt.value("#aaaaaa")
    )
)

# Line connecting 2010 point with text
rule2010 = (
    alt.Chart()
    .mark_rule()
    .encode(
        x = alt.value(166),
        y = alt.datum(0.054),
        y2 = alt.datum(0.022),
        color = alt.value("#aaaaaa")
    )
)

# Red part of text for 2010
text2010red = (
    alt.Chart({"values": [
        {"text": ["2010: all time referral conversion high"]}
        ]})
    .mark_text(
        size = 10, 
        align = "left", 
        dx = -87, 
        dy = 93, 
        fontWeight = "bold", 
        color = "#d24b53"
        )
    .encode(text = "text:N")
)

# Red part of text for 2016
text2016red = (
    alt.Chart({"values": [
        {"text": ["2016: new campaigns"]}
        ]})
    .mark_text(
        size = 10, 
        align = "left", 
        dx = 115, 
        dy = 118, 
        fontWeight = "bold", 
        color = "#d24b53"
    )
    .encode(text = "text:N")
)

# Rest of text for 2010
text2010gray = (
    alt.Chart(
        {
            "values": [
                {
                    "text": [
                        "(5.5%). Strong partnerships historically",
                        "meant steady conversions. Entry of",
                        "competitor ABC has markedly impacted",
                        "referral quality: fewer are buying.",
                    ]
                }
            ]
        }
    )
    .mark_text(
        size = 10, 
        align = "left", 
        dx = -87, 
        dy = 105, 
        fontWeight = "bold", 
        color = "#aaaaaa"
    )
    .encode(text = "text:N")
)

# Rest of text for 2016
text2016gray = (
    alt.Chart(
        {"values": [
            {"text": ["lead to brief uptick; steady", "decrease since then."]}
            ]}
    )
    .mark_text(
        size = 10, 
        align = "left", 
        dx = 115, 
        dy = 130, 
        fontWeight = "bold", 
        color = "#aaaaaa"
    )
    .encode(text = "text:N")
)

# First half of title
title_black = (
    alt.Chart({"values": [
        {"text": ["Conversion rate over time: "]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -275, 
        dy = -168, 
        color = "black"
        )
    .encode(text = "text:N")
)

# Second half of title
title_red = (
    alt.Chart({"values": [
        {"text": ["referral decreasing markedly since 2010"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -83, 
        dy = -168, 
        color = "#d24b53"
        )
    .encode(text = "text:N")
)

final_combined = (
    line_red_thicker
    + text_red
    + text2_red
    + rule2010
    + rule2016
    + point_red
    + point2010
    + point2016
    + text2010red
    + text2016red
    + text2010gray
    + text2016gray
    + title_black
    + title_red
)

final_combined.configure_view(stroke = None)
Out[114]:

Visualization as depicted in the book:

Alt text

Chapter 5 - Think like a designer�

"Where do you want your audience to look?" - Cole Nussbaumer Knaflic

Exercise 4 - Design in style (Inspired)�

This exercise revolves around integrating brand design into graphs. While the author in the book focuses on creating a graph with a Coke and Light Coke theme, our approach takes it a step further. We will delve into various color inspirations using the most colorful graph thus far, which is the one from exercise 4.2.

We will separate the exercise into four categories: Accessibility, Branding, Paintings, and Nature.

If you would like to return to the Table of Contents, you can click here.

Loading data�

This is the same data as exercise 4.2.

In�[115]:
table = pd.read_excel(r"Data\4.2 EXERCISE.xlsx", usecols = [1, 2, 3], header = 5, skipfooter = 30)
table['Brands'] = table['Unnamed: 1']
table['Change'] = table['$ Vol % change']

table.drop(columns = ['Unnamed: 1', '$ Vol % change'], inplace = True)
table
Out[115]:
spacing for dot plot Brands Change
0 0 Fran's Recipe -0.14
1 1 Wholesome Goodness -0.13
2 2 Lifestyle -0.10
3 3 Coat protection -0.09
4 4 Diet Lifestyle -0.08
5 5 Feline Basics -0.05
6 6 Lifestyle Plus -0.04
7 7 Feline Freedom -0.02
8 8 Feline Gold 0.01
9 9 Feline Platinum 0.01
10 10 Feline Instinct 0.02
11 11 Feline Pro 0.03
12 12 Farm Fresh Tasties 0.04
13 13 Feline Royal 0.05
14 14 Feline Focus 0.09
15 15 Feline Grain Free 0.09
16 16 Feline Silver 0.12
17 17 Nutri Balance 0.16
18 18 Farm Fresh Basics 0.17

The original graph�

In�[116]:
decreased_most = table.nsmallest(2, "Change")
increased_most = table.nlargest(2, "Change")

brands_decreased = decreased_most["Brands"].tolist()
brands_increased = increased_most["Brands"].tolist()

conditions_decreased = [f'datum.Brands == "{brand}"' for brand in brands_decreased]
condition_decreased = f"({'|'.join(conditions_decreased)})"

conditions_increased = [f'datum.Brands == "{brand}"' for brand in brands_increased]
condition_increased = f"({'|'.join(conditions_increased)})"

chart_gray = (
    alt.Chart(table)
    .mark_bar(color = "#c6c6c6", size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y("Brands", sort = None, axis = None)
    )
)

chart_oranges_mix = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_decreased, 
            alt.value("#ec7c30"), 
            alt.value("#efb284")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle",
            ]
        )
    )
)

chart_blue_mix = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_increased, 
            alt.value("#4772b8"), 
            alt.value("#91a9d5")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

label1_gray = (
    alt.Chart(
        table.loc[table["Change"] < 0]
        )
    .mark_text(
        align = "left", 
        color = "#c6c6c6", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(207), 
        y = alt.Y("Brands", sort = None), 
        text = alt.Text("Brands")
        )
)

label2_gray = (
    alt.Chart(
        table.loc[table["Change"] > 0]
        )
    .mark_text(
        align = "right", 
        color = "#c6c6c6", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(192), 
        y = alt.Y("Brands", sort = None), 
        text = alt.Text("Brands")
        )
)

label_oranges = (
    alt.Chart(
        table.loc[table["Change"] < 0]
        )
    .mark_text(align = "left", fontWeight = 700)
    .encode(
        x = alt.value(207),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.condition(
            condition_decreased, 
            alt.value("#ec7c30"), 
            alt.value("#efb284")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle"
            ]
        )
    )
)


label_blue = (
    alt.Chart(
        table.loc[table["Change"] > 0]
        )
    .mark_text(
        align = "right", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(192),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.condition(
            condition_increased, 
            alt.value("#4772b8"), 
            alt.value("#91a9d5")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

title_bw = (
    alt.Chart({"values": [
        {"text": ["Cat food brands:"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -200, 
        dy = -270, 
        fontWeight = "normal", 
        color = "black"
    )
    .encode(text = "text:N")
)

title_bw_bold = (
    alt.Chart({"values": [
        {"text": ["Lifestyle line brands decline"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -78, 
        dy = -270, 
        fontWeight = 700, 
        color = "black"
        )
    .encode(text = "text:N")
)

title_bw_bold_2 = (
    alt.Chart({"values": [
        {"text": ["mixed results in sales year-over-year"]}
        ]})
    .mark_text(
        size = 16, 
        align = "left", 
        dx = -78, 
        dy = -270, 
        fontWeight = 700, 
        color = "black"
        )
    .encode(text = "text:N")
)

subtitle_bw = (
    alt.Chart({"values": [
        {"text": ["YEAR-OVER-YEAR % CHANGE IN VOLUME ($)"]}
        ]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -200, 
        dy = -250, 
        fontWeight = "normal", 
        color = "gray"
    )
    .encode(text = "text:N")
)

decreased_orange = (
    alt.Chart({"values": [
        {"text": ["DECREASED"]}
        ]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -80, 
        dy = -220, 
        fontWeight = 700, 
        color = "#ec7c30"
        )
    .encode(text = "text:N")
)

increased_blue = (
    alt.Chart({"values": [
        {"text": ["INCREASED"]}
        ]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 20, 
        dy = -220, 
        fontWeight = 700, 
        color = "#4772b8"
        )
    .encode(text = "text:N")
)

separation = (
    alt.Chart({"values": [{"text": ["|"]}]})
    .mark_text(
        size = 11,
        align = "left", 
        dx = 3, 
        dy = -220, 
        fontWeight = 700, 
        color = "#c6c6c6"
        )
    .encode(text = "text:N")
)

original = (
    chart_gray
    + chart_oranges_mix
    + chart_blue_mix
    + label1_gray
    + label2_gray
    + label_oranges
    + label_blue
    + title_bw
    + title_bw_bold_2
    + subtitle_bw
    + decreased_orange
    + increased_blue
    + separation
)

original.properties(width = 400).configure_view(stroke = None)
Out[116]:

Accessibility�

How do we assess the accessibility of the graph palette?

Colorblind�

First, we can start by checking if the original colors are accessible to people with color blindness. The online tool Coloring for Colorblindness helps simulate how your selected color palette appears to viewers with protanopia, deuteranopia (both being the inability to tell the difference between red and green), and tritanopia (inability to tell the difference between blue and green, purple and red, and yellow and pink), respectively.

The image below shows that our initial palette can be distinguished by those with these visual deficiency. The website also offers some famous colorblind friendly palette, such as the Wong palette.

Alt text

Black and White�

Although the complete inability to distinguish colors is very rare, having a black and white version of your graph can be beneficial in certain situations, such as when intending to print it in a newspaper or an article. By using the Microsoft Photos App, we can preview how our visualization would appear without colors.

Alt text

Since the top and bottom colors can not be distinguished easily, we can create a new black and white palette.

In�[117]:
# Same graph as before, but
# with colors in the black and white spectrum

chart_gray = (
    alt.Chart(table)
    .mark_bar(color = "#666666", size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y("Brands", sort = None, axis = None)
    )
)

chart_light_gray_mix = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_decreased, 
            alt.value("#bbbbbb"), 
            alt.value("#999999")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle"
            ]
        )
    )
)

chart_black_mix = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change",
            scale = alt.Scale(domain = [-0.20, 0.20]),
            axis = alt.Axis(
                grid = False,
                orient = "top",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                format = "%"
            ),
            title = None
        ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_increased, 
            alt.value("black"), 
            alt.value("#333333")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

label1_gray = (
    alt.Chart(table.loc[table["Change"] < 0])
    .mark_text(
        align = "left", 
        color = "#666666", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(207), 
        y = alt.Y("Brands", sort = None), 
        text = alt.Text("Brands")
        )
)

label2_gray = (
    alt.Chart(table.loc[table["Change"] > 0])
    .mark_text(
        align = "right", 
        color = "#666666", 
        fontWeight = 700
        )
    .encode(
        x = alt.value(192), 
        y = alt.Y("Brands", sort = None), 
        text = alt.Text("Brands")
        )
)

label_light_gray = (
    alt.Chart(table.loc[table["Change"] < 0])
    .mark_text(align = "left", fontWeight = 700)
    .encode(
        x = alt.value(207),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.condition(
            condition_decreased, 
            alt.value("#bbbbbb"), 
            alt.value("#999999")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle"
            ]
        )
    )
)


label_black = (
    alt.Chart(table.loc[table["Change"] > 0])
    .mark_text(align = "right", fontWeight = 700)
    .encode(
        x = alt.value(192),
        y = alt.Y("Brands", sort = None),
        text = alt.Text("Brands"),
        color = alt.condition(
            condition_increased, 
            alt.value("black"), 
            alt.value("#333333")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

title_bw = (
    alt.Chart({"values": [
        {"text": ["Cat food brands:"]}
        ]})
    .mark_text(
        size = 16,
        align = "left", 
        dx = -200, 
        dy = -270, 
        fontWeight = "normal", 
        color = "black"
    )
    .encode(text = "text:N")
)

decreased_light_gray = (
    alt.Chart({"values": [
        {"text": ["DECREASED"]}
        ]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = -80, 
        dy = -220,
        fontWeight = 700, 
        color = "#999999"
        )
    .encode(text = "text:N")
)

increased_black = (
    alt.Chart({"values": [
        {"text": ["INCREASED"]}
        ]})
    .mark_text(
        size = 11, 
        align = "left", 
        dx = 20, 
        dy = -220, 
        fontWeight = 700, 
        color = "black"
        )
    .encode(text = "text:N")
)

black_and_white = (
    chart_gray
    + chart_light_gray_mix
    + chart_black_mix
    + label1_gray
    + label2_gray
    + label_light_gray
    + label_black
    + title_bw
    + title_bw_bold_2
    + subtitle_bw
    + decreased_light_gray
    + increased_black
    + separation
)

black_and_white.properties(width = 400).configure_view(stroke = None)
Out[117]:

Branding�

Branding is a really important aspect of marketing. Most companies have set color palettes, logos and design. Tools such as Image Color Picker and Adobe Color can help you extract and implement those official palettes into themed data visualizations. For the following graphs, we will not use labeling or titles, as it will not represent the cat food brands and be used only as a canvas for the palettes.

Cookie Clicker�

Cookie Clicker is an idle game created by the French programmer Orteil, and it has been open in a separate tab baking and accumulating cookies throughout the entirety of this project. Hence, it only seems fitting to pay homage with a dedicated color palette!

In�[118]:
# Orange middle (for the orange milk in the game)
chart_middle = (
    alt.Chart(table)
    .mark_bar(color = "#ee8241", size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None),
        y = alt.Y("Brands", sort = None, axis = None)
    )
)

# Dark brown top (for the chocolate chips)
chart_up = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_decreased, 
            alt.value("#4a251d"), 
            alt.value("#64433a")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle"
            ]
        )
    )
)

# Light brown bottom (for the cookie dough)
chart_down = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_increased, 
            alt.value("#81532d"), 
            alt.value("#c0a681")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

cookie_clicker = chart_middle + chart_down + chart_up

cookie_clicker.properties(width = 400).configure_view(stroke = None)
Out[118]:

Game screenshot that inspired the color palette:

Alt text

Google Maps�

One of the most distinctive brand designs is the vibrant Google palette, widely employed in various apps associated with the company. Inspired by the Google Maps logo, which features precisely five colors, we will demonstrate how our graph looks when incorporating this design.

In�[119]:
# Green middle
chart_middle = (
    alt.Chart(table)
    .mark_bar(color = "#34A852", size = 15)
    .encode(
        x=alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None),
        y = alt.Y("Brands", sort = None, axis = None)
    )
)

# Red and yellow top
chart_up = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_decreased, 
            alt.value("#EA4335"), 
            alt.value("#FABB04")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle"
            ]
        )
    )
)

# Blue and light blue bottom
chart_down = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_increased, 
            alt.value("#1A73E8"), 
            alt.value("#4285F4")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

google = chart_middle + chart_down + chart_up

google.properties(width = 400).configure_view(stroke = None)
Out[119]:

Google maps logo used for reference:

Alt text

Paintings�

Artist spend their lives dedicated to the study of colors and how we as humans perceive them. We can draw inspiration from their work when searching to evoke a specific emotion through our visualizations.

Starry Night by Vincent Van Gogh�
In�[120]:
# Light blue middle
chart_middle = (
    alt.Chart(table)
    .mark_bar(color = "#7B95A6", size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None),
        y = alt.Y("Brands", sort = None, axis = None)
    )
)

# Yellow tones top for the stars
chart_up = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_decreased, 
            alt.value("#A65D05"), 
            alt.value("#D9B13B")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle"
            ]
        )
    )
)

# Blue tones bottom for the sky
chart_down = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_increased, 
            alt.value("#0B1E38"), 
            alt.value("#304F8C")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

starry_night = chart_middle + chart_down + chart_up

starry_night.properties(width = 400).configure_view(stroke = None)
Out[120]:

Painting that inspired the palette:

Alt text

A Cuca by Tarsila do Amaral�
In�[121]:
# Green middle for the plants
chart_middle = (
    alt.Chart(table)
    .mark_bar(color = "#034001", size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None)
    )
)

# Blue and green top for the plants and pond
chart_up = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_decreased, 
            alt.value("#0339A6"), 
            alt.value("#067302")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle"
            ]
        )
    )
)

# Yellow and orange bottom for the Cuca
chart_down = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y(
            "Brands", 
            sort = None, 
            axis = None
            ),
        color = alt.condition(
            condition_increased, 
            alt.value("#F25C05"), 
            alt.value("#F29F05")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

cuca = chart_middle + chart_down + chart_up

cuca.properties(width = 400).configure_view(stroke = None)
Out[121]:

Painting that inspired the palette:

Alt text

The Great Wave Off Kanagawa by Hokusai�
In�[122]:
# Light blue for the waves
chart_middle = (
    alt.Chart(table)
    .mark_bar(color = "#8FBABF", size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None)
    )
)

# Beige top for the sky and boats
chart_up = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_decreased, 
            alt.value("#E0C6A3"), 
            alt.value("#D9B779")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle"
            ]
        )
    )
)

# Darker blue for the waves
chart_down = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_increased, 
            alt.value("#010326"), 
            alt.value("#010B40")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

wave = chart_middle + chart_down + chart_up

wave.properties(width = 400).configure_view(stroke = None)
Out[122]:

Painting that inspired the palette:

Alt text

Nature�

Inspired by the use of imagery of roses in the works of Theresa-Marie Rhyne, we will extract the color palette from pictures of nature.

Sunset�

This picture was taken by Sergio Mena Ferraira and can be found here.

In�[123]:
# Bright orange middle for the sun
chart_middle = (
    alt.Chart(table)
    .mark_bar(color = "#FF9200", size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None)
    )
)

# Purple tones top for the sky
chart_up = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_decreased, 
            alt.value("#AA4650"), 
            alt.value("#FD4044")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle"
            ]
        )
    )
)

# Orange tones bottom for the sky
chart_down = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_increased, 
            alt.value("#FA4210"), 
            alt.value("#FD7E37")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

sunset = chart_middle + chart_down + chart_up

sunset.properties(width = 400).configure_view(stroke = None)
Out[123]:

Picture that inspired the palette:

Alt text

Forest�

This picture was taken by Sam Abell and can be found here.

In�[124]:
# Light green middle from leaves
chart_middle = (
    alt.Chart(table)
    .mark_bar(color = "#5C7346", size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None)
    )
)

# Lighter green top from leaves
chart_up = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_decreased, 
            alt.value("#ABD9A9"), 
            alt.value("#83A66A")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Fran's Recipe",
                "Wholesome Goodness",
                "Lifestyle",
                "Coat protection",
                "Diet Lifestyle"
            ]
        )
    )
)

# Dark green and brown bottom from leaves and tree trunk
chart_down = (
    alt.Chart(table)
    .mark_bar(size = 15)
    .encode(
        x = alt.X(
            "Change", 
            scale = alt.Scale(domain = [-0.20, 0.20]), 
            axis = None, 
            title = None
            ),
        y = alt.Y("Brands", sort = None, axis = None),
        color = alt.condition(
            condition_increased, 
            alt.value("#140F09"), 
            alt.value("#3B401B")
        )
    )
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Brands",
            oneOf = [
                "Feline Focus",
                "Feline Grain Free",
                "Feline Silver",
                "Nutri Balance",
                "Farm Fresh Basics"
            ]
        )
    )
)

forest = chart_middle + chart_down + chart_up

forest.properties(width = 400).configure_view(stroke = None)
Out[124]:

Picture that inspired the palette:

Alt text

Interactivity�

For interactivity, we will empower the user to customize the palette of the graph without requiring any knowledge of Altair.

We have devised five charts, each representing a distinct color in this visualization. Additionally, we've provided a space where the viewer can select the hue in the spectrum or input the RGB, HSL or the hex code for each chart.

In�[125]:
# Create the color parameters, with default in the original graph
color_one = alt.param(
    value = "#ec7c30", 
    bind = alt.binding(
        input = 'color', 
        name = 'First color: '
        )
    )

color_two = alt.param(
    value = "#efb284", 
    bind = alt.binding(
        input = 'color', 
        name = 'Second color: '
        )
    )

color_three = alt.param(
    value = "#c6c6c6", 
    bind = alt.binding(
        input = 'color', 
        name = 'Third color: '
        )
    )

color_four = alt.param(
    value = "#91a9d5", 
    bind = alt.binding(
        input = 'color', 
        name = 'Fourth color: '
        )
    )

color_five = alt.param(
    value = "#4772b8", 
    bind = alt.binding(
        input = 'color', 
        name = 'Fifth color: '
        )
    )

# First two bars
chart_one = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = None,
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = color_one
    ).transform_filter(
    alt.FieldOneOfPredicate(
        field = 'Brands', 
        oneOf = list(table['Brands'][0:2])
    )
    ).add_params(
    color_one
)

# Three next bars
chart_two = alt.Chart(table).mark_bar( 
        size = 15
        ).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = None,
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = color_two
).transform_filter(
    alt.FieldOneOfPredicate(
        field = 'Brands', oneOf = list(table['Brands'][2:5])
        )
    ).add_params(
    color_two
)                                         

# Middle bars
chart_three = alt.Chart(
    table
    ).mark_bar(size = 15).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = None,
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = color_three
).transform_filter(
    alt.FieldOneOfPredicate(
        field = 'Brands', oneOf = list(table['Brands'][5:14])
        )
    ).add_params(
    color_three
)        
    
# Three bars after the middle
chart_four = alt.Chart(
    table
    ).mark_bar(size = 15).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = None,
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = color_four
).transform_filter(
    alt.FieldOneOfPredicate(
        field = 'Brands', oneOf = list(table['Brands'][14:17])
        )
    ).add_params(
    color_four
)      

# Last two bars
chart_five = alt.Chart(
    table
    ).mark_bar(size = 15).encode(
    x = alt.X(
        "Change", 
        scale = alt.Scale(domain = [-0.20, 0.20]), 
        axis = None,
        title = None
        ),
    y = alt.Y("Brands", sort = None, axis = None),
    color = color_five
).transform_filter(
    alt.FieldOneOfPredicate(
        field='Brands', oneOf = list(table['Brands'][17:19])
        )
    ).add_params(
    color_five
)      

interactive_colors = (
    chart_one 
    + chart_two 
    + chart_three 
    + chart_four 
    + chart_five
)

interactive_colors.properties(width = 400).configure_view(stroke = None)
Out[125]:

Chapter 6 - Tell a story�

"Data in a spreadsheet or facts on a slide aren’t things that naturally stick with us — they are easily forgotten. Stories, on the other hand, are memorable." - Cole Nussbaumer Knaflic

Exercise 6 - Differentiate between live & standalone stories �

The objective of this exercise is to know how to create visualizations depending in the situation you are going to present them, being that a live presentation or a printed graph. Here, we will be creating a series of charts ready to be in a slide show, demonstrating the ideas step by step.

There is one standalone graph at the end, but the texts will not be inserted due to that process being presented earlier.

If you would like to return to the Table of Contents, you can click here.

Loading the data�

In�[126]:
# Loading the table considering Excel formatting
table = pd.read_excel(r"Data\6.6 EXERCISE.xlsx", usecols = [2, 3, 4, 5, 6, 7], header = 4, skipfooter = 5)
table
Out[126]:
Unnamed: 2 Unnamed: 3 Internal External Overall Goal
0 Jan 2019-01-01 47.6 44.8 45.05 60
1 Feb 2019-02-01 37.9 48.5 47.25 60
2 Mar 2019-03-01 17.6 49.5 46.15 60
3 Apr 2019-04-01 18.6 55.2 50.35 60
4 May 2019-05-01 40.6 56.5 55.55 60
5 Jun 2019-06-01 28.8 60.7 53.85 60
6 Jul 2019-07-01 27.1 44.2 42.85 60
7 Aug 2019-08-01 36.9 29.0 31.15 60
8 Sep 2019-09-01 37.1 61.2 59.15 60
9 Oct 2019-10-01 25.9 44.9 41.55 60
10 Nov 2019-11-01 51.2 76.6 71.85 60
11 Dec 2019-12-01 40.6 34.7 36.15 60
In�[127]:
# Rename columns
table.rename(columns = {'Unnamed: 2': 'Month', 'Unnamed: 3': 'Date'}, inplace = True)

# Drop useless information
table.drop(columns = ['Date', 'Goal', 'Overall'], inplace = True)

# Create long-format version
melted_table = pd.melt(table, id_vars = ['Month'], var_name = 'Metric', value_name = 'Value')
melted_table
Out[127]:
Month Metric Value
0 Jan Internal 47.6
1 Feb Internal 37.9
2 Mar Internal 17.6
3 Apr Internal 18.6
4 May Internal 40.6
5 Jun Internal 28.8
6 Jul Internal 27.1
7 Aug Internal 36.9
8 Sep Internal 37.1
9 Oct Internal 25.9
10 Nov Internal 51.2
11 Dec Internal 40.6
12 Jan External 44.8
13 Feb External 48.5
14 Mar External 49.5
15 Apr External 55.2
16 May External 56.5
17 Jun External 60.7
18 Jul External 44.2
19 Aug External 29.0
20 Sep External 61.2
21 Oct External 44.9
22 Nov External 76.6
23 Dec External 34.7

The graph�

First, we will plot the normal graph.

In�[128]:
# Line chart
line = (
    alt.Chart(
        melted_table, 
        title = alt.Title(
            "Time to fill", 
            fontSize = 18, 
            fontWeight = "normal", 
            anchor = "start", 
            offset = 10
        )
)
    .mark_line()
    .encode(
        x = alt.X(
            "Month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019"
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            title = "TIME TO FILL (DAYS)",
            scale = alt.Scale(domain = [0, 90]) # y-axis goes from 0 to 90
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["black"]), 
            legend = None
        )
    )
    .properties(width = 500)
)

# Goal line
goal = alt.Chart().mark_rule(
    strokeDash = [4, 4] # Dashed
    ).encode(
        x = alt.datum("Jan"), # From January
        x2 = alt.datum("Dec"), # To December
        y = alt.datum(60) # At value 60
        )


# Label at the end of the line
label = alt.Chart(melted_table).mark_text(
    align = "left", 
    dx = 3
    ).encode(
        alt.X(
            "Month", 
            aggregate = "max", 
            sort = None
            ),
        alt.Y(
            "Value", 
            aggregate = {"argmax": "Month"}
            ),
        alt.Text("Metric")
    )


final = line + goal + label

final.configure_view(stroke=None)
Out[128]:

Once again, we encounter the challenge of Altair considering September as the maximum value for months, given its alphabetical order, even when using Sorted = None. We will solve this by defining x as December and filtering the data.

In�[129]:
# Updated label
label = alt.Chart(melted_table).mark_text(
    align = 'left', 
    dx = 4
    ).encode(
    x = alt.datum('Dec'),
    y = alt.Y('Value'),
    text = alt.Text('Metric')
).transform_filter(
    # Only December values
    (alt.datum.Month == 'Dec') 
)

# Label for the goal line
label_goal = alt.Chart({"values": [
    {"text":  [ "GOAL"]}
    ]}).mark_text(
        align = 'left', 
        dx = 4
        ).encode(
    x = alt.datum('Dec'),
    y = alt.datum(60),
    text = "text:N"
)

final_original = line + goal + label + label_goal

final_original.configure_view(stroke = None)
Out[129]:

Visualization as depicted in the book:

Alt text

Empty graph�

We can start our presentation with only the skeleton of our visualization.

In�[130]:
# Create a graph with full transparency
empty = (
    alt.Chart(
        melted_table, 
        title = alt.Title(
            "Time to fill", 
            fontSize = 18, 
            fontWeight = "normal", 
            anchor = "start", 
            offset = 10
        )
        )
    .mark_line(opacity = 0) # Minimum opacity
    .encode(
        x = alt.X(
            "Month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False,
                title = "2019"
            )
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
            ),
            title = "TIME TO FILL (DAYS)",
            scale = alt.Scale(domain = [0, 90])
        )
    )
    .properties(width = 500)
)

graph_1 = empty.configure_view(stroke = None)
graph_1
Out[130]:

Visualization as depicted in the book:

Alt text

Add the goal line�

We can do this by adding already existing graphs.

In�[131]:
graph_2 = (empty + goal + label_goal).configure_view(stroke = None)
graph_2
Out[131]:

Visualization as depicted in the book:

Alt text

Create the first data point�

In�[132]:
# Point for january
point_jan = alt.Chart().mark_point(
    filled = True, 
    color = 'black', 
    size = 50, 
    opacity = 1
    ).encode(
    x = alt.datum('Jan'),
    y = alt.datum(
        melted_table["Value"][12]
        )
    )

# Create opaque versions of the goal line and label
goal_opaque = alt.Chart().mark_rule(
    strokeDash = [4,4]
    ).encode(
    x = alt.datum('Jan'),
    x2 = alt.datum('Dec'),
    y = alt.datum(60),
    opacity = alt.value(0.4)
    )

label_goal_opaque = alt.Chart({"values": [
    {"text":  [ "GOAL"]}
    ]}).mark_text(align = 'left', dx = 4).encode(
    x = alt.datum('Dec'),
    y = alt.datum(60),
    text = "text:N",
    opacity = alt.value(0.4)
)  

graph_3 = (
    empty 
    + goal_opaque 
    + label_goal_opaque 
    + point_jan
    ).configure_view(stroke = None)
graph_3
Out[132]:

Visualization as depicted in the book:

Alt text

Updating for the first half of the year�

In�[133]:
# Create the line until June
partial_line1 = (
    alt.Chart(
        melted_table, 
        title = alt.Title(
            "Time to fill", 
            fontSize = 18, 
            fontWeight = "normal", 
            anchor = "start", 
            offset = 10
        )
    )
    .mark_line()
    .encode(
        x = alt.X(
            "Month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019",
            scale = alt.Scale(
                # We need to set all months as a domain
                # to make sure they appear in the axis
                domain = [
                    "Jan",
                    "Feb",
                    "Mar",
                    "Apr",
                    "May",
                    "Jun",
                    "Jul",
                    "Aug",
                    "Sep",
                    "Oct",
                    "Nov",
                    "Dec"
                ]
            )
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            title = "TIME TO FILL (DAYS)",
            scale = alt.Scale(domain = [0, 90]),
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["black"]), 
            legend = None
        )
    )
    .properties(width = 500)
    .transform_filter(
        # Filter only half of the year
        alt.FieldOneOfPredicate(
            field = "Month", 
            oneOf = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
        )
    )
)

# Point for June
point_jun = (
    alt.Chart()
    .mark_point(
        filled = True, 
        color = "black", 
        size = 50, 
        opacity= 1
        )
    .encode(
        x = alt.datum("Jun"), 
        y = alt.datum(melted_table["Value"][17])
        )
)

graph_4 = (
    partial_line1.transform_filter(alt.datum.Metric == "External")
    + goal_opaque
    + label_goal_opaque
    + point_jun
).configure_view(stroke = None)

graph_4
Out[133]:

Visualization as depicted in the book:

Alt text

Rest of the year with points�

We will complete the line for the rest of the year, highlighting some points.

In�[134]:
# Condition for External points
points_condition = (alt.datum.Metric == "External") & (
    (alt.datum.Month == "Aug")
    | (alt.datum.Month == "Sep")
    | (alt.datum.Month == "Oct")
    | (alt.datum.Month == "Nov")
    | (alt.datum.Month == "Dec")
)

# Define the points
points = (
    alt.Chart(melted_table)
    .mark_point(
        filled = True, 
        size = 50, 
        opacity = 1
        )
    .encode(
        x = alt.X("Month", sort = None),
        y = "Value",
        color = alt.condition(
            # Color depends if it achieved the Goal
            alt.datum.Value > 60, 
            alt.value("#f6792c"), 
            alt.value("#187cae")
        )
    )
    # Only for the points in the condition
    .transform_filter(points_condition)
)

graph_5 = (
    # We can filter the line directly
    line.transform_filter(alt.datum.Metric == "External")
    + goal_opaque
    + label_goal_opaque
    + points
).configure_view(stroke = None)

graph_5
Out[134]:

Visualization as depicted in the book:

Alt text

Focus on the internal�

Now we will lighten the external line to bring focus on the internal data.

In�[135]:
# External line with less opacity
external_opaque = alt.Chart(
        melted_table, 
        title = alt.Title(
            "Time to fill", 
            fontSize = 18, 
            fontWeight = "normal", 
            anchor = "start", 
            offset = 10
            )
        ).mark_line().encode(
        x = alt.X(
            "Month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019"
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            title = "TIME TO FILL (DAYS)",
            scale = alt.Scale(domain = [0, 90])
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["black"]), 
            legend = None
        ),
        opacity = alt.value(0.4) # Set transparency
    ).properties(width = 500
    # Filter data
    ).transform_filter(alt.datum.Metric == "External")


# Label with less opacity
label_opaque = (
    alt.Chart(melted_table)
    .mark_text(align = "left", dx = 4)
    .encode(
        x = alt.datum("Dec"),
        y = alt.Y("Value"),
        text = alt.Text("Metric"),
        opacity = alt.value(0.4)
    )
    .transform_filter((alt.datum.Month == "Dec"))
    .transform_filter(alt.datum.Metric == "External")
)

# Point for January in the Internal line
point_jan2 = (
    alt.Chart()
    .mark_point(
        filled = True, 
        color = "black", 
        size = 50, 
        opacity = 1
        )
    .encode(
        x = alt.datum("Jan"), 
        y = alt.datum(melted_table["Value"][0])
        )
)

graph_6 = (
    external_opaque 
    + label_opaque 
    + goal_opaque 
    + label_goal_opaque 
    + point_jan2
).configure_view(stroke = None)

graph_6
Out[135]:

Visualization as depicted in the book:

Alt text

First months of the year�

Similarly to the External line, we will plot the first months for Internal - this time stopping at April.

In�[136]:
# Line for the first four months
partial_line2 = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Time to fill", 
            fontSize = 18, 
            fontWeight = "normal", 
            anchor = "start", 
            offset = 10
        )
    )
    .mark_line()
    .encode(
        x = alt.X(
            "Month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019",
            scale = alt.Scale(
                domain = [
                    "Jan",
                    "Feb",
                    "Mar",
                    "Apr",
                    "May",
                    "Jun",
                    "Jul",
                    "Aug",
                    "Sep",
                    "Oct",
                    "Nov",
                    "Dec"
                ]
            )
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            title = "TIME TO FILL (DAYS)",
            scale = alt.Scale(domain = [0, 90]),
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["black"]), 
            legend = None
        )
    )
    .properties(width = 500)
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Month", 
            oneOf = ["Jan", "Feb", "Mar", "Apr"]
            )
    )
)

# Point for April
point_apr = (
    alt.Chart()
    .mark_point(
        filled = True, 
        color = "black", 
        size = 50, 
        opacity = 1
        )
    .encode(
        x = alt.datum("Apr"), 
        y = alt.datum(melted_table["Value"][3])
        )
)

graph_7 = (
    partial_line2.transform_filter(alt.datum.Metric == "Internal")
    + goal_opaque
    + label_goal_opaque
    + point_apr
    + external_opaque
    + label_opaque
).configure_view(stroke = None)

graph_7
Out[136]:

Visualization as depicted in the book:

Alt text

Add May�

After showing the decrease in time to fill in the first months, we can now add the next one (May) to show an increase.

In�[137]:
#Line with May added

partial_line3 = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Time to fill", 
            fontSize = 18, 
            fontWeight = "normal", 
            anchor = "start", 
            offset = 10
        )
    )
    .mark_line()
    .encode(
        x = alt.X(
            "Month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019",
            scale = alt.Scale(
                domain = [
                    "Jan",
                    "Feb",
                    "Mar",
                    "Apr",
                    "May",
                    "Jun",
                    "Jul",
                    "Aug",
                    "Sep",
                    "Oct",
                    "Nov",
                    "Dec"
                ]
            )
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            title = "TIME TO FILL (DAYS)",
            scale = alt.Scale(domain = [0, 90])
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["black"]), 
            legend = None
        )
    )
    .properties(width = 500)
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Month", 
            oneOf = ["Jan", "Feb", "Mar", "Apr", "May"]
        )
    )
)

# Point for May
point_may = (
    alt.Chart()
    .mark_point(
        filled = True, 
        color = "black", 
        size = 50, 
        opacity = 1
        )
    .encode(
        x = alt.datum("May"), 
        y = alt.datum(melted_table["Value"][4])
        )
)

graph_8 = (
    partial_line3.transform_filter(alt.datum.Metric == "Internal")
    + goal_opaque
    + label_goal_opaque
    + point_may
    + external_opaque
    + label_opaque
).configure_view(stroke = None)

graph_8
Out[137]:

Visualization as depicted in the book:

Alt text

Add months until September�

Here we can see a small dip, before the data rising again.

In�[138]:
# Line until September
partial_line4 = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Time to fill", 
            fontSize = 18, 
            fontWeight = "normal", 
            anchor = "start", 
            offset = 10
        )
    )
    .mark_line()
    .encode(
        x = alt.X(
            "Month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019",
            scale = alt.Scale(
                domain = [
                    "Jan",
                    "Feb",
                    "Mar",
                    "Apr",
                    "May",
                    "Jun",
                    "Jul",
                    "Aug",
                    "Sep",
                    "Oct",
                    "Nov",
                    "Dec"
                ]
            )
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            title = "TIME TO FILL (DAYS)",
            scale = alt.Scale(domain = [0, 90])
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["black"]), 
            legend = None
        )
    )
    .properties(width = 500)
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Month",
            oneOf = [
                "Jan", 
                "Feb", 
                "Mar", 
                "Apr", 
                "May", 
                "Jun", 
                "Jul", 
                "Aug", 
                "Sep"
                ]
        )
    )
)

# Point for September
point_sep = (
    alt.Chart()
    .mark_point(
        filled = True, 
        color = "black", 
        size = 50, 
        opacity = 1
        )
    .encode(
        x = alt.datum("Sep"), 
        y = alt.datum(melted_table["Value"][8])
        )
)

graph_9 = (
    partial_line4.transform_filter(alt.datum.Metric == "Internal")
    + goal_opaque
    + label_goal_opaque
    + point_sep
    + external_opaque
    + label_opaque
).configure_view(stroke = None)

graph_9
Out[138]:

Visualization as depicted in the book:

Alt text

Add November�

Here we can see another increase.

In�[139]:
# Create line until November
partial_line5 = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Time to fill", 
            fontSize = 18, 
            fontWeight = "normal", 
            anchor = "start", 
            offset = 10
        )
    )
    .mark_line()
    .encode(
        x = alt.X(
            "Month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019",
            scale = alt.Scale(
                domain = [
                    "Jan",
                    "Feb",
                    "Mar",
                    "Apr",
                    "May",
                    "Jun",
                    "Jul",
                    "Aug",
                    "Sep",
                    "Oct",
                    "Nov",
                    "Dec"
                ]
            )
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            title = "TIME TO FILL (DAYS)",
            scale = alt.Scale(domain = [0, 90])
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["black"]), 
            legend = None
        )
    )
    .properties(width = 500)
    .transform_filter(
        alt.FieldOneOfPredicate(
            field = "Month",
            oneOf = [
                "Jan",
                "Feb",
                "Mar",
                "Apr",
                "May",
                "Jun",
                "Jul",
                "Aug",
                "Sep",
                "Oct",
                "Nov"
            ]
        )
    )
)

# Point for November
point_nov = (
    alt.Chart()
    .mark_point(
        filled = True, 
        color = "black", 
        size = 50, 
        opacity = 1
        )
    .encode(
        x = alt.datum("Nov"), 
        y = alt.datum(melted_table["Value"][10])
        )
)

graph_10 = (
    partial_line5.transform_filter(alt.datum.Metric == "Internal")
    + goal_opaque
    + label_goal_opaque
    + point_nov
    + external_opaque
    + label_opaque
).configure_view(stroke = None)

graph_10
Out[139]:

Visualization as depicted in the book:

Alt text

Add December�

Adding the final month we can see that, despise having a last dip, the Internal time to fill finished the year higher than the External.

In�[140]:
# Define point for December
point_dec = alt.Chart().mark_point(
    filled = True, 
    color = 'black', 
    size = 50, 
    opacity = 1
    ).encode(
    x = alt.datum('Dec'),
    y = alt.datum(melted_table["Value"][11])
    )

# Add existing graphs
graph_11 = (
    line.transform_filter(alt.datum.Metric == 'Internal') 
    + label.transform_filter(alt.datum.Metric == 'Internal') 
    + goal_opaque 
    + label_goal_opaque 
    + point_dec 
    + external_opaque 
    + label_opaque
    ).configure_view(stroke = None)
graph_11
Out[140]:

Visualization as depicted in the book:

Alt text

Final graph�

We can now exhibit a final graph with all the values and the External lines colored blue.

In�[141]:
# Line chart with External colored blue
line_color = (
    alt.Chart(
        melted_table,
        title = alt.Title(
            "Time to fill", 
            fontSize = 18, 
            fontWeight = "normal", 
            anchor = "start", 
            offset = 10
        )
    )
    .mark_line()
    .encode(
        x = alt.X(
            "Month",
            sort = None,
            axis = alt.Axis(
                labelAngle = 0,
                titleAnchor = "start",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal",
                ticks = False
            ),
            title = "2019"
        ),
        y = alt.Y(
            "Value",
            axis = alt.Axis(
                grid = False,
                titleAnchor = "end",
                labelColor = "#888888",
                titleColor = "#888888",
                titleFontWeight = "normal"
            ),
            title = "TIME TO FILL (DAYS)",
            scale = alt.Scale(domain = [0, 90])
        ),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#1d779c", "#676767"]), 
            legend = None
        )
    )
    .properties(width = 500)
)

# Labeling with External colored blue
label_color = (
    alt.Chart(melted_table)
    .mark_text(
        align = "left", 
        dx = 4
        )
    .encode(
        x = alt.datum("Dec"),
        y = alt.Y("Value"),
        text = alt.Text("Metric"),
        color = alt.Color(
            "Metric", 
            scale = alt.Scale(range = ["#1d779c", "#676767"]), 
            legend = None
        )
    )
    .transform_filter((alt.datum.Month == "Dec"))
)

graph_12 = (
    line_color 
    + goal_opaque 
    + label_goal_opaque 
    + label_color
    ).configure_view(stroke = None)

graph_12
Out[141]:

Visualization as depicted in the book:

Alt text

The exercise still presents a graph to be used as a summarization of the ones above. However, instead of replicating it, we will focus on creating an animation with the visualizations we already have.

Last graph from the book:

Alt text

Animation�

For our animation, we will leave Altair and use ipywidgets. The code is as follows.

In�[142]:
# List of the graphs
graphs = [
    graph_1, 
    graph_2, 
    graph_3, 
    graph_4, 
    graph_5, 
    graph_6, 
    graph_7, 
    graph_8, 
    graph_9, 
    graph_10, 
    graph_11, 
    graph_12
    ]

# Create function
def demo(i):
    clear_output(wait = True)
    if 0 <= i < len(graphs):
        chart = graphs[i]
    else:
        chart = None 
    display(chart)

# Create the animation
interact(demo, i = widgets.Play(
    value = 0,
    min = 0,
    max = 11,
    step = 1,
    description = "Press play",
    interval = 2000
    ))
interactive(children=(Play(value=0, description='Press play', interval=2000, max=11), Output()), _dom_classes=…
Out[142]:
<function __main__.demo(i)>

It is important to note, however, that the application above only works in Live Kernels (therefore not running on an HTML file). For that reason, we will upload a GIF of a screen recording.

GIF for the animation:

Alt text

The end - for now! :)�

If you have any questions, suggestions on how to improve my code, or requests for new exercises from the book, you can contact me by opening an issue in the Github repository for the project, available here. I would love to hear from you!

Cell to create the .html�

This code was created by Søren Fuglede Jørgensen and can be found here. Please make sure any changes are saved before running this cell.

In�[143]:
with open('index.ipynb') as nb_file:
    nb_contents = nb_file.read()

# Convert using the ordinary exporter
notebook = nbformat.reads(nb_contents, as_version=4)

# HTML Export
html_exporter = nbconvert.HTMLExporter()
body, res = html_exporter.from_notebook_node(notebook)

# Create a dict mapping all image attachments to their base64 representations
images = {}
for cell in notebook['cells']:
    if 'attachments' in cell:
        attachments = cell['attachments']
        for filename, attachment in attachments.items():
            for mime, base64 in attachment.items():
                images[f'attachment:{filename}'] = f'data:{mime};base64,{base64}'

# Fix up the HTML and write it to disk
for src, base64 in images.items():
    body = body.replace(f'src="{src}"', f'src="{base64}"')

# Write HTML to file
with open('index.html', 'w') as html_output_file:
    html_output_file.write(body)