Hi folks!
Today we take a glimpse at plot animation in R with the gganimate. Since the 10th anniversary of Early Childhood in Context (aka the lab I am working in) is coming up, we will have a celebration in the form of a scientific symposium. Among other things, my boss plans to illustrate how productive we have been and asked me for a visualization of our conference contributions. This is not scientific work, but I enjoy it anyways and because it will not be published in a journal I can share how I have done it with you.
Before we start, here is what my boss envisions: She wants a world map with all our conference contributions which will be shown as a slide in her presentation. Besides the location, this map should provide information about the types of conferences and the number of our contributions. Ideally, it should also include a time axis so that our guest can see how busy we have been.
Let’s see if we can meet these requirements.
Preparations
We start with some preparatory steps.
packages
At first we load some required packages. Note that we will load gganimate
as the core of this post later on.
library(tidyverse)
library(magrittr)
library(ggmap)
data
Next, we need the data. I have used datapasta (a package I like) so that you can easily copy and paste it and do not need to download any files.
dat_raw <- tibble::tribble(
~conference, ~date, ~location, ~n,
"IAC", "Aug 13", "Pavia/Italien", 1L,
"650-Jahres-Jubiläum Universität Wien", "Apr 15", "Wien/Österreich", 6L,
"APS", "Mai 09", "San Francisco/USA", 1L,
"ARGE", "Mai 13", "Wien/Österreich", 1L,
"Biennial International Conference on Infant Studies", "Mai 16", "New Orleans/USA", 2L,
"DGfE", "Mär 14", "Berlin/Deutschland", 1L,
"DGfE", "Mär 08", "Dresden/Deutschland", 2L,
"DGfE", "Mär 12", "Osnabrück/Deutschland", 2L,
"DGPs", "Sep 09", "Hildesheim/Deutschland", 6L,
"DGPs", "Sep 10", "Bremen/Deutschland", 8L,
"DGPs", "Sep 12", "Bielefeld/Deutschland", 1L,
"DGPs", "Sep 16", "Leipzig/Deutschland", 7L,
"DGPs", "Sep 18", "Frankfurt/Deutschland", 5L,
"DGPs-Fachtagung Entwicklungspsychologie", "Aug 15", "Frankfurt/Deutschland", 5L,
"DGPs-Fachtagung Entwicklungspsychologie", "Sep 17", "Münster/Deutschland", 8L,
"DGPs Entwicklungspsychologie", "Sep 13", "Saarbrücken/Deutschland", 5L,
"DGPs Fachtagung der Different. Psychologie", "Sep 13", "Greifswald/Deutschland", 1L,
"DGPs Fachtagung Entw.-psych.", "Sep 11", "Erfurt/Deutschland", 4L,
"DGPs Fachtagung Päd.-Psych.", "Sep 11", "Erfurt/Deutschland", 3L,
"EAM-SMABs", "Jul 10", "Berlin/Deutschland", 1L,
"EAM", "Jul 12", "Santiago de Compostela/Spanien", 1L,
"EARLI", "Aug 10", "Luzern/Schweiz", 3L,
"EARLI SIG 5", "Aug 18", "Berlin/Deutschland", 1L,
"ECDP", "Aug 11", "Bergen/Norwegen", 4L,
"ECDP", "Sep 13", "Lausanne/Schweiz", 3L,
"ECP", "Jul 11", "Istanbul/Türkei", 1L,
"EPC", "Jul 09", "Oslo/Norwegen", 3L,
"ESFR", "Sep 12", "Lillehammer/Norwegen", 1L,
"European Conference on Developmental Psychology", "Aug 17", "Utrecht/Niederlande", 2L,
"European Conference on Psychology", "Jul 15", "Milano/Italien", 1L,
"GAIMH", "Feb 11", "Wien/Österreich", 6L,
"ICIS", "Mai 14", "Berlin/Deutschland", 1L,
"IDAD-Workshop on Fathers", "Jun 16", "Ann Arbor/Michigan", 2L,
"International Attachment Conference", "Aug 15", "New York City/USA", 5L,
"International Attachment Conference", "Jun 17", "London/UK", 3L,
"ISPNE", "Sep 17", "Zürich/Schweiz", 1L,
"ISSBD", "Jul 08", "Würzburg/Deutschland", 1L,
"ISSBD", "Jul 10", "Lusaka/Sambia", 2L,
"ISSBD", "Jul 12", "Edmonton/Alberta, Canada", 3L,
"ISSBD", "Jul 14", "Shanghai/China", 2L,
"ISSBD", "Jul 16", "Vilnius/Litauen", 1L,
"ISSBD", "Jul 18", "Gold Coast/Australia", 2L,
"Jacobs Experten-Workshop", "Mai 14", "Marbach/Schweiz", 2L,
"ÖGP", "Apr 10", "Salzburg/Österreich", 1L,
"ÖGP", "Apr 12", "Graz/Österreich", 5L,
"ÖGP", "Apr 14", "Wien/Österreich", 9L,
"SIG 5", "Aug 12", "Utrecht/Niederlande", 1L,
"SRCD", "Apr 09", "Denver/USA", 5L,
"SRCD", "Apr 13", "Seattle/USA", 1L,
"SRCD", "Apr 17", "Austin/Texas/USA", 5L,
"SRCD", "Mär 11", "Montreal/Canada", 3L,
"SRCD", "Mär 15", "Philadelphia/USA", 1L,
"SRCD", "Feb 12", "Tampa, Florida/USA", 1L,
"WAIMH", "Jul 10", "Leipzig/Deutschland", 3L,
"WAIMH", "Mai 16", "Prag/Tschechien", 3L,
"WAIMH", "Mai 18", "Rom/Italien", 3L
)
Let’s take a glimpse at the data:
dat_raw %>% glimpse()
## Observations: 56
## Variables: 4
## $ conference <chr> "IAC", "650-Jahres-Jubiläum Universität Wien", "APS...
## $ date <chr> "Aug 13", "Apr 15", "Mai 09", "Mai 13", "Mai 16", "...
## $ location <chr> "Pavia/Italien", "Wien/Österreich", "San Francisco/...
## $ n <int> 1, 6, 1, 1, 2, 1, 2, 2, 6, 8, 1, 7, 5, 5, 8, 5, 1, ...
OK, we have four variables:
conference
: the name (abbreviated) of the conferencedate
: the date in a German short formatlocation
: the location of the conference in format city/countryn
: the number of our contributions (e.g., talks, posters, …) at the conference
Before we can get to the visualization, we have to prepare the data a little.
timestamps
At the moment date
is a character-variable; hence we can only sort alphabetically, which is not too helpful. Moreover, it is in German and no international standard format. To solve this problem, we transform date
into a date-variable for sorting and create a new character-variable for annotating.
(dat_time <- dat_raw %>%
mutate(
## strptime cannot deal with German "Mär" and "Mai"
## convert to Mar and May, respectively
date = str_replace(date, "ä", "a"),
date = str_replace(date, "i", "y"),
date = lubridate::dmy(paste("01 ", date, locale="de_DE")),
date_chr = format(date, "%m/%Y")
) %>%
arrange(date))
## # A tibble: 56 x 5
## conference date location n date_chr
## <chr> <date> <chr> <int> <chr>
## 1 DGfE 2008-03-01 Dresden/Deutschland 2 03/2008
## 2 ISSBD 2008-07-01 Würzburg/Deutschland 1 07/2008
## 3 SRCD 2009-04-01 Denver/USA 5 04/2009
## 4 APS 2009-05-01 San Francisco/USA 1 05/2009
## 5 EPC 2009-07-01 Oslo/Norwegen 3 07/2009
## 6 DGPs 2009-09-01 Hildesheim/Deutschland 6 09/2009
## 7 ÖGP 2010-04-01 Salzburg/Österreich 1 04/2010
## 8 EAM-SMABs 2010-07-01 Berlin/Deutschland 1 07/2010
## 9 ISSBD 2010-07-01 Lusaka/Sambia 2 07/2010
## 10 WAIMH 2010-07-01 Leipzig/Deutschland 3 07/2010
## # ... with 46 more rows
conferences
The data includes many different conferences, but my boss wants them to be color-coded in the visualization. To make this feasible, we have to reduce the number of different categories. We do this by collapsing all DGPs conferences into one category and lumping all infrequent conferences together.
dat_conf <- dat_time %>%
mutate(
conf = str_replace(conference,"DGP.*", "DGPs"),
conf = as_factor(conf),
conf = fct_infreq(conf),
conf = fct_lump(conf, n=6)
)
place and coordinates
Since my boss wants the location indicated on a map, we need to get their coordinates. For this task several providers are available, but their APIs have in common that they do not work well with German umlauts and special characters. So the first step is to replace those.
dat_umlauts <- dat_conf %>%
mutate(
loc_um = str_replace_all(location, "Ö", "Oe"),
loc_um = str_replace_all(loc_um, "ü", "ue"),
loc_um = str_replace_all(loc_um, "[/,]", " ")
)
Note: This is not a robust solution and only works on the given data set because it only deals with some umlauts and special characters.
Unfortunately, the google maps API is not as accessible like it used to be in former times (you have to register, create a project, set up billing, …) so we will work with the free OSM Nominatim API instead. Its most significant drawback is that it allows only one request per second, but we do not have that many locations that this would be a problem.
We start by creating a wrapper function for the Nominatim API.
geocode <- function(location = NULL)
{
require(tibble)
if(suppressWarnings(is.null(location)))
return(tibble())
base_url <- "http://nominatim.openstreetmap.org/search/@location@?format=json&addressdetails=0&limit=1"
loc_html <- str_replace(base_url, "\\@location\\@", location)
final_url <- str_replace_all(loc_html, "\\s+", "\\%20")
tryCatch(
d <- jsonlite::fromJSON(
final_url
), error = function(c) return(tibble())
)
if(length(d) == 0) return(tibble())
return(tibble(lon = as.numeric(d$lon), lat = as.numeric(d$lat)))
## Nominatim allows 1 request per second
## --> pause
Sys.sleep(1)
}
Next, we apply it to our locations.
(dat_coord <- dat_umlauts %>%
mutate(
coord = map(loc_um, geocode)
))
## # A tibble: 56 x 8
## conference date location n date_chr conf loc_um coord
## <chr> <date> <chr> <int> <chr> <fct> <chr> <list>
## 1 DGfE 2008-03-01 Dresden/De… 2 03/2008 DGfE Dresden … <tibb…
## 2 ISSBD 2008-07-01 Würzburg/D… 1 07/2008 ISSBD Wuerzbur… <tibb…
## 3 SRCD 2009-04-01 Denver/USA 5 04/2009 SRCD Denver U… <tibb…
## 4 APS 2009-05-01 San Franci… 1 05/2009 Other San Fran… <tibb…
## 5 EPC 2009-07-01 Oslo/Norwe… 3 07/2009 Other Oslo Nor… <tibb…
## 6 DGPs 2009-09-01 Hildesheim… 6 09/2009 DGPs Hildeshe… <tibb…
## 7 ÖGP 2010-04-01 Salzburg/Ö… 1 04/2010 ÖGP Salzburg… <tibb…
## 8 EAM-SMABs 2010-07-01 Berlin/Deu… 1 07/2010 Other Berlin D… <tibb…
## 9 ISSBD 2010-07-01 Lusaka/Sam… 2 07/2010 ISSBD Lusaka S… <tibb…
## 10 WAIMH 2010-07-01 Leipzig/De… 3 07/2010 WAIMH Leipzig … <tibb…
## # ... with 46 more rows
Since, geocode()
returns a tibble we have to unnest()
its result.
(dat_geo <- dat_coord %>%
unnest(coord))
## # A tibble: 56 x 9
## conference date location n date_chr conf loc_… lon lat
## <chr> <date> <chr> <int> <chr> <fct> <chr> <dbl> <dbl>
## 1 DGfE 2008-03-01 Dresden… 2 03/2008 DGfE Dres… 13.7 51.0
## 2 ISSBD 2008-07-01 Würzbur… 1 07/2008 ISSBD Wuer… 9.93 49.8
## 3 SRCD 2009-04-01 Denver/… 5 04/2009 SRCD Denv… -105. 39.7
## 4 APS 2009-05-01 San Fra… 1 05/2009 Other San … -122. 37.8
## 5 EPC 2009-07-01 Oslo/No… 3 07/2009 Other Oslo… 10.7 59.9
## 6 DGPs 2009-09-01 Hildesh… 6 09/2009 DGPs Hild… 9.95 52.2
## 7 ÖGP 2010-04-01 Salzbur… 1 04/2010 ÖGP Salz… 13.0 47.8
## 8 EAM-SMABs 2010-07-01 Berlin/… 1 07/2010 Other Berl… 13.4 52.5
## 9 ISSBD 2010-07-01 Lusaka/… 2 07/2010 ISSBD Lusa… 28.3 -15.4
## 10 WAIMH 2010-07-01 Leipzig… 3 07/2010 WAIMH Leip… 12.4 51.3
## # ... with 46 more rows
annotation
As a last data preparatory step, we create a new character-variable that combines date, conference, and location. We will use this variable for annotating.
(dat_ann <- dat_geo %>%
## separate with linebreaks --> \n
unite("ann", date_chr, conference, location, sep = "\n") %>%
mutate(ann = as.factor(ann),
ann = fct_inorder(ann)))
## # A tibble: 56 x 7
## ann date n conf loc_um lon lat
## <fct> <date> <int> <fct> <chr> <dbl> <dbl>
## 1 "03/2008\nDGfE\nDres… 2008-03-01 2 DGfE Dresden Deu… 13.7 51.0
## 2 "07/2008\nISSBD\nWür… 2008-07-01 1 ISSBD Wuerzburg D… 9.93 49.8
## 3 "04/2009\nSRCD\nDenv… 2009-04-01 5 SRCD Denver USA -105. 39.7
## 4 "05/2009\nAPS\nSan F… 2009-05-01 1 Other San Francis… -122. 37.8
## 5 "07/2009\nEPC\nOslo/… 2009-07-01 3 Other Oslo Norweg… 10.7 59.9
## 6 "09/2009\nDGPs\nHild… 2009-09-01 6 DGPs Hildesheim … 9.95 52.2
## 7 "04/2010\nÖGP\nSalzb… 2010-04-01 1 ÖGP Salzburg Oe… 13.0 47.8
## 8 "07/2010\nEAM-SMABs\… 2010-07-01 1 Other Berlin Deut… 13.4 52.5
## 9 "07/2010\nISSBD\nLus… 2010-07-01 2 ISSBD Lusaka Samb… 28.3 -15.4
## 10 "07/2010\nWAIMH\nLei… 2010-07-01 3 WAIMH Leipzig Deu… 12.4 51.3
## # ... with 46 more rows
map
Now, that the data is in shape, we can turn to the visualization. As a first step, we download a map which we will overlay with our data later on. Please note that the map’s bounding box does not include the whole world, because some regions are not required for the figure.
(map_world <- ggmap::get_map(location=c(-126.92,-49.61,160.67,72.29),
source = "stamen", maptype = "toner-lite", zoom=3))
## 932x1636 toner-lite map image from Stamen Maps. see ?ggmap to plot it.
Let’s take a look at the map.
plot(map_world)
Seems reasonable to me - all the locations of all conferences should be covered (we could check whether all coordinates are within the bounding box, but I believe we are good without it).
Plot
With all preparation finished we can move to the visualization. My idea is to include the time axis as an animation. However, before we tend to animation, it is a good idea to set up a static version.
create a static map
Let’s try what happens when we project our data on the map without much further configuration.
ggmap(map_world) +
geom_point(data = dat_geo,
aes(x=lon, y=lat, size = n, color=conf),
alpha=.6)
OK, that does look pretty usable. Let’s make some minor adjustments (increase point size in the map, choose more sensible breaks, select another color palette, increase the point size in the legend, and remove the axes).
ggmap(map_world) +
geom_point(data = dat_geo,
aes(x=lon, y=lat, size = n, color=conf),
alpha=.6) +
scale_size(name="Contributions",range = c(2, 8), breaks = c(2,4,6,8)) +
scale_color_brewer(palette="Dark2", name="Conference") +
guides(color = guide_legend(override.aes = list(size=5))) +
theme(axis.line=element_blank(),axis.text.x=element_blank(),
axis.text.y=element_blank(),axis.ticks=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank())
Well, it is not perfect, but good enough. I am sure that my boss will have multiple aspects that she wants to be amended. However, this should be an acceptable starting point for further improvements. But before I can get to work out the fine-tuning with my boss, we need to add the animation of the time axis.
create animation
For the animation, we will use the gganimate package. Other animation packages for R exist, but I have never before worked with this one, so it is an excellent opportunity to learn something new.
CRAN does not distribute the package, so we install it from GitHub.
devtools::install_github("dgrtwo/gganimate")
Next, we load it.
library(gganimate)
With that, we can get directly to the animation. I want the points to appear in chronological order with the past points remaining on the map. According to the packages documentation transition_reveal()
seems the right way to get this effect. Unfortunately, it did not work for me with past points vanishing as new ones were shown. Instead, I used a workaround with a combination of transition_states()
and shadow_mark()
.
p <- ggmap(map_world) +
geom_point(data = dat_ann,
aes(x=lon, y=lat, size = n, color=conf, group=ann),
alpha=.6) +
scale_size(name="Contributions",range = c(2, 8), breaks = c(2,4,6,8)) +
scale_color_brewer(palette="Dark2", name="Conference") +
guides(color = guide_legend(override.aes = list(size=10))) +
transition_states(ann,1,1) +
shadow_mark(past=TRUE) +
labs(title = '{closest_state}') +
theme(axis.line=element_blank(),axis.text.x=element_blank(),
axis.text.y=element_blank(),axis.ticks=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank())
Now, that we have set up the plot, we can animate it.
animate(p)
Hm, that is strange. The last conferences are missing. By inspecting the animation with frame_vars(p_ani)
I realized that they weren’t included. The reason is the default frame number of 100 with 2 frames per state. We have 56 data points, and the default frame number is not sufficient.
Another problem I have noticed is that the graphic is rather coarse. Again the default setting is the culprit because they scale down the output. We will change that, but in turn, we have to increase the base point size and the legend text size too. Let’s begin with this task.
p_all <- ggmap(map_world) +
geom_point(data = dat_ann,
aes(x=lon, y=lat, size = n, color=conf, group=ann),
alpha=.6) +
scale_size(name="Contributions",range = c(8, 14), breaks = c(2,4,6,8)) +
scale_color_brewer(palette="Dark2", name="Conference") +
guides(color = guide_legend(override.aes = list(size=10))) +
transition_states(ann,1,1) +
shadow_mark(past=TRUE) +
labs(title = '{closest_state}') +
theme(axis.line=element_blank(),axis.text.x=element_blank(),
axis.text.y=element_blank(),axis.ticks=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
text = element_text(size=20))
Next, we increase the output’s resolution …
options(gganimate.dev_args = list(width = 1800, height = 860))
… and finish by animating all data points (additionally, I slowed the animation down to only 3 frames per second).
p_ani <- animate(p_all, nframes = 112, fps=3, detail = 1)
Let’s have a look at the result.
p_ani
That looks good to me, but I wish it would pause a little before the next loop starts. Unfortunately, our workaround with transition_states()
does not allow control over the individual frame viewing times. One option would be to try to get transition_reveal()
to work after all, but I am lazy and have little time, so I take a different approach.
For that, we first need to save the animation as a GIF.
anim_save("map_ani.gif")
Next, we use the command line tool Gifsicle to add a pause to the last frame.
gifsicle -U map_ani.gif "#0--2" -d500 "#-1" -O2 > map_pause.gif
## gifsicle:map_ani.gif: warning: GIF too complex to unoptimize
## (The reason was local color tables or complex transparency.
## Try running the GIF through ‘gifsicle --colors=255’ first.)
## gifsicle: warning: too many colors, using local colormaps
## (You may want to try ‘--colors 256’.)
This gives us:
With that, I am satisfied for now and look forward to all the improvements my boss will add.
Closing Remarks
I hope you have enjoyed our little digression into graphics animation with R. Animating an R graphic with gganimate was easy, and of course much more sophisticated animations are possible with it. If you try out gganimate with your data, please share the results.
Regarding the symposium: If you are interested in developmental psychology (with a focus on fatherhood or extra-familial care), you are more than welcome to join us. We could win an impressive list of international scientists in this area as our keynote speakers and my Ph.D. siblings, and I will present our top article from our dissertation bundles. Last but not least the final animation will be presented. If this has not convinced you by now, then I can offer a 20% reduction on the admission fee - just enter bernhardlearns20 as a comment in the registration form.
If you have any questions or comments, please post them in the comments section. If something is not working as outlined here, please check the package versions you are using. The system I have used was:
sessionInfo()
## R version 3.4.4 (2018-03-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] gganimate_0.9.9.9999 bindrcpp_0.2.2 ggmap_2.7.904
## [4] magrittr_1.5 forcats_0.3.0 stringr_1.3.1
## [7] dplyr_0.7.7 purrr_0.2.5 readr_1.1.1
## [10] tidyr_0.8.2 tibble_1.4.2 ggplot2_3.1.0
## [13] tidyverse_1.2.1 conflicted_1.0.1
##
## loaded via a namespace (and not attached):
## [1] progress_1.2.0 tidyselect_0.2.5 haven_1.1.2
## [4] lattice_0.20-35 colorspace_1.3-2 htmltools_0.3.6
## [7] yaml_2.2.0 utf8_1.1.4 rlang_0.3.0.1
## [10] pillar_1.3.0 glue_1.3.0 withr_2.1.2
## [13] tweenr_1.0.0 RColorBrewer_1.1-2 modelr_0.1.2
## [16] readxl_1.1.0 jpeg_0.1-8 bindr_0.1.1
## [19] plyr_1.8.4 munsell_0.5.0 gtable_0.2.0
## [22] cellranger_1.1.0 rvest_0.3.2 RgoogleMaps_1.4.2
## [25] memoise_1.1.0 evaluate_0.12 labeling_0.3
## [28] knitr_1.20 curl_3.2 gifski_0.8.6
## [31] fansi_0.4.0 broom_0.5.0 Rcpp_0.12.19
## [34] scales_1.0.0 backports_1.1.2 jsonlite_1.5
## [37] farver_1.0 rjson_0.2.20 hms_0.4.2
## [40] png_0.1-7 digest_0.6.18 stringi_1.2.4
## [43] grid_3.4.4 rprojroot_1.3-2 cli_1.0.1
## [46] tools_3.4.4 bitops_1.0-6 lazyeval_0.2.1
## [49] crayon_1.3.4 pkgconfig_2.0.2 prettyunits_1.0.2
## [52] xml2_1.2.0 lubridate_1.7.4 assertthat_0.2.0
## [55] rmarkdown_1.10 httr_1.3.1 rstudioapi_0.8
## [58] R6_2.3.0 nlme_3.1-131 compiler_3.4.4
It is very interesting,but if I want to tag it in my site ,or even on my face book account,how can I do that ,I mean with the animation?
ReplyDeleteI'm no expert on this - as far as I know gifs get embedded into facebook mostly via 3rd party providers like e.g., giphy.com - so tagging might be a problem. An easy fix for this problem is to use another container for your animation, e.g., animate(p, renderer = ffmpeg_renderer()) renders the animation as a video instead of a gif.
DeletePS: thx to Magdalena Lischka for preparing the data.
ReplyDelete