Minard’s Diagram of Napoleon’s Invasion of Russia in R
This is a recreation of Charles Minard's 1869 chart showing the number of men in Napoleon’s 1812 Russian campaign army, their movements and the temperature they encountered on their return path.
1. Data
We’re using three distinct CSV files for cities, troop movements and temperatures, all courtesy of Michael Friendly’s Re-Visions of Minard.
2. Plotting the data
We start out by plotting the troop movements using the ggplot2 package. Let’s add it to our environment first.
install.packages("ggplot2")
library(ggplot2)
Next up, we read the troop data into the troops variable.
troops <- read.table("troops.csv", header = TRUE, stringsAsFactors = FALSE)
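Before plotting, it can help to check what was read. The sketch below does not touch the real file: `troops_demo` is a hypothetical stand-in that only reuses the column names mentioned in this article, and its values are illustrative assumptions.

```r
# Hypothetical stand-in rows with the same column names as troops;
# the values are illustrative and are not read from the real file.
troops_demo <- data.frame(
  long = c(24.0, 24.5, 25.5),
  lat = c(54.9, 55.0, 54.5),
  survivors = c(340000, 340000, 340000),
  direction = c("A", "A", "A"),
  group = c(1, 1, 1),
  stringsAsFactors = FALSE
)

str(troops_demo)   # column names and types
head(troops_demo)  # first rows of the data frame
```

Running the same two calls on the real troops variable should show whether direction was read as a string and whether survivors is numeric.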
All of the contained variables will be used for this plot:
- the troops’ paths will be drawn using long and lat, corresponding to the x and y axes respectively
- since Napoleon had various groups in the field, each group will have its own path
- the paths will be drawn in different colors for the two directions: advance (A) and retreat (R)
- the width of the paths will correspond to the number of survivors
ggsave(
  "/results/minard-1.svg",
  ggplot(troops, aes(x = long, y = lat, group = group, color = direction, size = survivors)) +
    geom_path(),
  width = 7, height = 2.2
)
This already looks familiar, but there are still some things to do:
- We want the style and colors to look similar to those in Minard’s diagram.
- Minard chose to conflate the paths of the different groups at the start of their approach. This adds more drama to the graph: the army starts out as one very wide line that is reduced to a sliver by the time it returns.
- We need to show the cities over the troops’ path.
- We still need to show the temperature graph below the troops’ path.
ggsave(
  "/results/minard-2.svg",
  ggplot(troops, aes(x = long, y = lat, group = group, color = direction, size = survivors)) +
    geom_path(lineend = "round") +
    scale_size(range = c(0.5, 15)) +
    scale_colour_manual(values = c("#e5cbaa", "#242021")) +
    labs(x = NULL, y = NULL) +
    # "none" hides the legends; FALSE is deprecated in recent ggplot2 versions
    guides(color = "none", size = "none") +
    theme_void(),
  width = 7, height = 2.2
)
Next up, we want to add the city labels. For this, we’ll read the city data into the cities variable and use the ggrepel library to lay the labels out so that they don’t overlap. We also add extrafont for some additional font options.
cities <- read.table("cities.csv", header = TRUE, stringsAsFactors = FALSE)
install.packages(c("ggrepel", "extrafont"))
library(ggrepel)
library(extrafont)
font_import()
ggplot() +
  geom_path(data = troops, aes(x = long, y = lat, group = group, color = direction, size = survivors), lineend = "round") +
  geom_label_repel(data = cities, aes(x = long, y = lat, label = city), color = "#000000", size = 2, family = "Helvetica") +
  scale_size(range = c(0.5, 15)) +
  scale_colour_manual(values = c("#e5cbaa", "#242021")) +
  labs(x = NULL, y = NULL) +
  # "none" hides the legends; FALSE is deprecated in recent ggplot2 versions
  guides(color = "none", size = "none") +
  theme_void() +
  theme(aspect.ratio = 0.3)
Okay, next we’ll add the temperature graph, starting by reading the temperature data into the temps variable.
temps <- read.table("temps.csv", header = TRUE, stringsAsFactors = FALSE)
In temps, the date column is actually of type string, so we have to convert it to actual dates. For that we use the lubridate package, which parses the strings into dates.
install.packages("lubridate")
library(lubridate)
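As a minimal sketch of that conversion, the example below parses a couple of date strings with lubridate’s `dmy()`. It does not use the real file: `temps_demo`, its values, and the day-month-year string format (such as "18OCT1812") are assumptions made for illustration, so check the actual format in temps before relying on it. Month abbreviations are matched against the current locale, so an English locale is assumed as well.

```r
library(lubridate)

# Hypothetical stand-in for the temps data; the values and the
# "18OCT1812" string format are assumptions for illustration only.
temps_demo <- data.frame(
  long = c(37.6, 36.0),
  temp = c(0, 0),
  date = c("18OCT1812", "24OCT1812"),
  stringsAsFactors = FALSE
)

# dmy() parses day-month-year strings into Date objects.
temps_demo$date <- dmy(temps_demo$date)

class(temps_demo$date)  # inspect the resulting column type
```

If the strings in temps follow a different order, lubridate offers matching parsers such as `ymd()` and `mdy()`.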