For background complete the introductory tutorials on rgbif and on rotl to familiarize yourself.
https://ropensci.org/tutorials/rotl_tutorial/
https://ropensci.org/tutorials/rgbif_tutorial/
In order to examine geographic distributions of species, it is useful to tie together information about phylogenetic relationships, which we can get from Open Tree of Life, with information about where individuals of that species are found.
install.packages(c("httr", "jsonlite", "rgbif", "rotl", "phytools", "viridis"))
## Installing packages into '/home/ejmctavish/R/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
library(httr)
library(jsonlite)
library(rgbif)
library(rotl)
library(phytools)
## Loading required package: ape
## Loading required package: maps
library(viridis)
## Loading required package: viridisLite
##
## Attaching package: 'viridis'
## The following object is masked from 'package:maps':
##
## unemp
Lets look for some fun species, A polar bear, “Ursus maritimus”, a hippo “Hippopotamus amphibius”, a gorilla “Gorilla gorilla” and a wild boar “Sus scrofa”.
polarbear <-"Ursus maritimus"
hippo <-"Hippopotamus amphibius"
boar <-"Sus scrofa"
gorilla <- "Gorilla gorilla"
dolphin <- "Tursiops truncatus"
spp <- c(polarbear, boar, gorilla, hippo,dolphin)
We can use Open Tree of Life to generate a tree for our species!
taxa <- rotl::tnrs_match_names(spp, context="Animals")
tr <- rotl::tol_induced_subtree(ott_id(taxa), label="name")
##
Progress [----------------------------------] 0/62 ( 0%) ?s
Progress [================================] 62/62 (100%) 0s
## Warning in collapse_singles(tr, show_progress): Dropping singleton
## nodes with labels: Euarchontoglires, mrcaott786ott112387, Primates,
## mrcaott786ott3428, Haplorrhini, Simiiformes, Catarrhini, mrcaott786ott3607729,
## mrcaott786ott83926, mrcaott83926ott3607702, mrcaott83926ott96938,
## mrcaott83926ott6145147, mrcaott83926ott3607728, mrcaott83926ott770295,
## mrcaott83926ott3607876, mrcaott83926ott3607873, Homininae,
## Gorilla, Laurasiatheria, mrcaott1548ott6790, mrcaott1548ott3607484,
## mrcaott1548ott4942380, mrcaott1548ott4942547, mrcaott1548ott3021, Artiodactyla,
## mrcaott1548ott5256, mrcaott5256ott4944931, Cetacea, mrcaott5256ott3615450,
## mrcaott5256ott44568, Odontoceti, mrcaott5256ott5269, mrcaott5269ott6470,
## mrcaott5269ott47843, mrcaott47843ott194312, Delphinidae, mrcaott47843ott111103,
## mrcaott111103ott825969, mrcaott111103ott124220, mrcaott124220ott250045,
## mrcaott124220ott155492, mrcaott124220ott124224, mrcaott124220ott246130,
## Tursiops, Ancodonta, Hippopotamidae, Hippopotamus, Suina, Suidae,
## Sus, mrcaott4697ott263949, Carnivora, mrcaott4697ott6940, Caniformia,
## mrcaott4697ott10732, mrcaott10732ott22064, Ursidae, mrcaott10732ott679667,
## mrcaott10732ott297463, mrcaott10732ott153282, mrcaott10732ott7067610,
## mrcaott10732ott350099
plot(tr)
Pulling supporting studies isn’t included in rotl yet (or I couldn’t find it). So we will use an Open Tree API call directly.
url_ot <- 'https://api.opentreeoflife.org/v3/tree_of_life/induced_subtree'
body <- list(ott_ids=taxa$ott_id)
r <- httr::POST(url_ot, body = body, encode = "json")#gets which studies support the subtree from the open tree API
for(studytree in httr::content(r)$supporting_studies){
study <- strsplit(studytree, '@')[[1]][1]
meta <- get_study_meta(study) #pulls the metadata for each study
pub <- get_publication(meta) #grabs teh publication information
cat(pub,"\n\n")
}
## Lartillot, Nicolas, Frédéric Delsuc. 2012. Joint reconstruction of divergence times and life-history evolution in placental mammals using a phylogenetic covariance model. Evolution 66 (6): 1773-1787.
##
## O'Leary, M. A., J. I. Bloch, J. J. Flynn, T. J. Gaudin, A. Giallombardo, N. P. Giannini, S. L. Goldberg, B. P. Kraatz, Z.-X. Luo, J. Meng, X. Ni, M. J. Novacek, F. A. Perini, Z. S. Randall, G. W. Rougier, E. J. Sargis, M. T. Silcox, N. B. Simmons, M. Spaulding, P. M. Velazco, M. Weksler, J. R. Wible, A. L. Cirranello. 2013. The placental mammal ancestor and the post-K-Pg radiation of placentals. Science 339 (6120): 662-667.
##
## Springer, Mark S., Robert W. Meredith, John Gatesy, Christopher A. Emerling, Jong Park, Daniel L. Rabosky, Tanja Stadler, Cynthia Steiner, Oliver A. Ryder, Jan E. Janečka, Colleen A. Fisher, William J. Murphy. 2012. Macroevolutionary dynamics and historical biogeography of primate diversification inferred from a species supermatrix. PLoS ONE 7 (11): e49521.
##
## Mark S. Springer, Christopher A. Emerling, Robert W. Meredith, Jan E. Janečka, Eduardo Eizirik, William J. Murphy, 2017, 'Waking the undead: Implications of a soft explosive model for the timing of placental mammal diversification', Molecular Phylogenetics and Evolution, vol. 106, pp. 86-102
##
## Meredith, R.W., Janecka J., Gatesy J., Ryder O.A., Fisher C., Teeling E., Goodbla A., Eizirik E., Simao T., Stadler T., Rabosky D., Honeycutt R., Flynn J., Ingram C., Steiner C., Williams T., Robinson T., Herrick A., Westerman M., Ayoub N., Springer M., & Murphy W. 2011. Impacts of the Cretaceous Terrestrial Revolution and KPg Extinction on Mammal Diversification. Science 334 (6055): 521-524.
##
## Tarver, J. E., P. C. J. Donoghue, M. J. Benton. 2011. Is evolutionary history repeatedly rewritten in light of new fossil discoveries? Proceedings of the Royal Society B: Biological Sciences 278 (1705): 599-604.
##
## Graham J. Slater, Jeremy A. Goldbogen, Nicholas D. Pyenson, 2017, 'Independent evolution of baleen whale gigantism linked to Plio-Pleistocene ocean dynamics', Proceedings of the Royal Society B: Biological Sciences, vol. 284, no. 1855, p. 20170546
##
## Perelman, P., Johnson W.E., Roos C., Seuánez H.N., Horvath J.E., Moreira M.A., Kessing B.D., Pontius J., Roelke M., Rumpler Y., Schneider M.C., Silva A., O'brien S., & Pecon-slattery J. 2011. A Molecular Phylogeny of Living Primates. PLoS Genetics 7(3): e1001342.
##
## Nyakatura, Katrin, Olaf RP Bininda-Emonds. 2012. Updating the evolutionary history of Carnivora (Mammalia): a new species-level supertree complete with divergence time estimates. BMC Biology 10 (1): 12
##
## Krause, Johannes, Tina Unger, Aline Noçon, Anna-Sapfo Malaspinas, Sergios-Orestis Kolokotronis, Mathias Stiller, Leopoldo Soibelzon, Helen Spriggs, Paul H Dear, Adrian W Briggs, Sarah CE Bray, Stephen J O'Brien, Gernot Rabeder, Paul Matheus, Alan Cooper, Montgomery Slatkin, Svante Pääbo, Michael Hofreiter. 2008. Mitochondrial genomes reveal an explosive radiation of extinct and extant bears near the Miocene-Pliocene boundary. BMC Evolutionary Biology 8 (1): 220.
##
## Toljagić O., Voje K.L., Matschiner M., Liow L., & Hansen T.F. 2017. Millions of Years Behind: Slow Adaptation of Ruminants to Grasslands. Systematic Biology, .
##
## Steeman, M., Hebsgaard M., Fordyce R., Ho S., Rabosky D., Nielsen R., Rahbek C., Glenner H., Sørensen M., & Willerslev E. 2009. Radiation of Extant Cetaceans Driven by Restructuring of the Oceans. Systematic Biology 58 (6): 573-585.
##
## McGowen, M., Spaulding M., and Gatesy J. 2009. Divergence date estimation and a comprehensive molecular tree of extant cetaceans. Molecular Phylogenetics and Evolution 53 (3): 891-906.
We will use the Global Biodiversity Information Facility (GBIF) to search for records of these species, and arbitrarily chose the first record returned as the location for the species. (sometimes the first record has no lat-long info, and we need to find a better one)
dat1 <- rgbif::occ_search(scientificName = polarbear, fields=c('name','decimalLatitude', 'decimalLongitude'), limit = 10)
locsdat1 <- cbind(polarbear, dat1$data) #labels the lat long columns with the species name
colnames(locsdat1)<-c("species","Long","Lat")
dat2 <- rgbif::occ_search(scientificName = hippo, fields=c('name','decimalLatitude', 'decimalLongitude'), limit = 10)
locsdat2 <- cbind(hippo, dat2$data)
colnames(locsdat2)<-c("species","Long","Lat")
dat3 <- rgbif::occ_search(scientificName = boar, fields=c('name','decimalLatitude', 'decimalLongitude'), limit = 10)
locsdat3 <- cbind(boar, dat3$data)
colnames(locsdat3)<-c("species","Long","Lat")
dat4 <- rgbif::occ_search(scientificName = gorilla, fields=c('name','decimalLatitude', 'decimalLongitude'), limit = 10)
locsdat4 <- cbind(gorilla, dat4$data)
colnames(locsdat4)<-c("species","Long","Lat")
dat5 <- rgbif::occ_search(scientificName = dolphin, fields=c('name','decimalLatitude', 'decimalLongitude'), limit = 10)
locsdat5 <- cbind(dolphin, dat5$data)
colnames(locsdat5)<-c("species","Long","Lat")
locs <- as.data.frame(rbind(locsdat1,locsdat2, locsdat3, locsdat4, locsdat5)) #Combine the data from each species
locs
## species Long Lat
## 1 Ursus maritimus -93.816720 58.733208
## 2 Ursus maritimus -94.152214 58.775311
## 3 Ursus maritimus -156.858589 71.226550
## 4 Ursus maritimus 12.389055 78.930327
## 5 Ursus maritimus -93.959003 58.766739
## 6 Ursus maritimus -156.723045 71.350298
## 7 Ursus maritimus -93.192528 58.649208
## 8 Ursus maritimus -20.507641 74.428475
## 9 Ursus maritimus -156.670606 71.272698
## 10 Ursus maritimus -94.104818 58.765814
## 11 Hippopotamus amphibius 37.315509 -2.705606
## 12 Hippopotamus amphibius 27.055363 -25.321550
## 13 Hippopotamus amphibius 32.461800 -28.284100
## 14 Hippopotamus amphibius 19.834524 -34.665625
## 15 Hippopotamus amphibius 32.529800 -28.133800
## 16 Hippopotamus amphibius 38.021624 -2.822808
## 17 Hippopotamus amphibius 32.408200 -28.236500
## 18 Hippopotamus amphibius 39.191352 -3.399534
## 19 Hippopotamus amphibius 32.528700 -28.133000
## 20 Hippopotamus amphibius 32.530200 -28.134200
## 21 Sus scrofa 6.425629 50.869137
## 22 Sus scrofa 8.907829 49.988293
## 23 Sus scrofa 8.509169 49.979488
## 24 Sus scrofa 8.210533 50.410290
## 25 Sus scrofa 8.952374 50.132347
## 26 Sus scrofa 8.567899 49.976768
## 27 Sus scrofa 9.147406 52.178852
## 28 Sus scrofa 10.269532 48.920700
## 29 Sus scrofa 9.723029 49.976425
## 30 Sus scrofa 7.982176 50.326050
## 31 Gorilla gorilla 12.972422 -1.600545
## 32 Gorilla gorilla 11.667805 -0.317679
## 33 Gorilla gorilla 10.053514 -2.998760
## 34 Gorilla gorilla 12.868204 -1.605592
## 35 Gorilla gorilla 13.163600 -1.624496
## 36 Gorilla gorilla 12.849542 -1.651610
## 37 Gorilla gorilla -106.663485 35.078382
## 38 Gorilla gorilla 12.894200 -1.721151
## 39 Gorilla gorilla 12.884102 -1.738712
## 40 Gorilla gorilla 12.930430 -1.711025
## 41 Tursiops truncatus -110.689321 25.008885
## 42 Tursiops truncatus -118.392640 33.811742
## 43 Tursiops truncatus -119.614499 34.057997
## 44 Tursiops truncatus -97.164837 27.940834
## 45 Tursiops truncatus 168.418270 -46.565697
## 46 Tursiops truncatus -118.255547 33.715199
## 47 Tursiops truncatus -85.314767 29.812679
## 48 Tursiops truncatus -97.392456 27.751821
## 49 Tursiops truncatus -97.066288 27.982494
## 50 Tursiops truncatus 4.045470 43.452470
locs[,1] <- gsub(' ','_',locs[,1])#On the tree, names have underscores instead of spaces.
#Sometimes the lat longs have spaces in the records.
#We use trimws to get rid of the whitespace, and as.numeric to record the the latitudes and longitudes as numbers instead of characters.
locs[,2]<-as.numeric(trimws(as.character(locs[,2])))
locs[,3]<-as.numeric(trimws(as.character(locs[,3])))
latlong <- as.matrix(locs[,c(3,2)]) #flip the columns to lat, long instead of long,lat
rownames(latlong) <- locs[,1]
Check that your species are all included and have the correct names (taxonomic name recognition sometimes updates them)
latlong
## Lat Long
## Ursus_maritimus 58.733208 -93.816720
## Ursus_maritimus 58.775311 -94.152214
## Ursus_maritimus 71.226550 -156.858589
## Ursus_maritimus 78.930327 12.389055
## Ursus_maritimus 58.766739 -93.959003
## Ursus_maritimus 71.350298 -156.723045
## Ursus_maritimus 58.649208 -93.192528
## Ursus_maritimus 74.428475 -20.507641
## Ursus_maritimus 71.272698 -156.670606
## Ursus_maritimus 58.765814 -94.104818
## Hippopotamus_amphibius -2.705606 37.315509
## Hippopotamus_amphibius -25.321550 27.055363
## Hippopotamus_amphibius -28.284100 32.461800
## Hippopotamus_amphibius -34.665625 19.834524
## Hippopotamus_amphibius -28.133800 32.529800
## Hippopotamus_amphibius -2.822808 38.021624
## Hippopotamus_amphibius -28.236500 32.408200
## Hippopotamus_amphibius -3.399534 39.191352
## Hippopotamus_amphibius -28.133000 32.528700
## Hippopotamus_amphibius -28.134200 32.530200
## Sus_scrofa 50.869137 6.425629
## Sus_scrofa 49.988293 8.907829
## Sus_scrofa 49.979488 8.509169
## Sus_scrofa 50.410290 8.210533
## Sus_scrofa 50.132347 8.952374
## Sus_scrofa 49.976768 8.567899
## Sus_scrofa 52.178852 9.147406
## Sus_scrofa 48.920700 10.269532
## Sus_scrofa 49.976425 9.723029
## Sus_scrofa 50.326050 7.982176
## Gorilla_gorilla -1.600545 12.972422
## Gorilla_gorilla -0.317679 11.667805
## Gorilla_gorilla -2.998760 10.053514
## Gorilla_gorilla -1.605592 12.868204
## Gorilla_gorilla -1.624496 13.163600
## Gorilla_gorilla -1.651610 12.849542
## Gorilla_gorilla 35.078382 -106.663485
## Gorilla_gorilla -1.721151 12.894200
## Gorilla_gorilla -1.738712 12.884102
## Gorilla_gorilla -1.711025 12.930430
## Tursiops_truncatus 25.008885 -110.689321
## Tursiops_truncatus 33.811742 -118.392640
## Tursiops_truncatus 34.057997 -119.614499
## Tursiops_truncatus 27.940834 -97.164837
## Tursiops_truncatus -46.565697 168.418270
## Tursiops_truncatus 33.715199 -118.255547
## Tursiops_truncatus 29.812679 -85.314767
## Tursiops_truncatus 27.751821 -97.392456
## Tursiops_truncatus 27.982494 -97.066288
## Tursiops_truncatus 43.452470 4.045470
Trees from OpenTree don’t automatically come with branch lengths - we need to infer some branch lengths for this tree in order to plot it. (in future we can use R-Datelife! datelife.org)
tr_bl<-ape::compute.brlen(tr)
Combine the phylogeny and the mapped locations
cols<-setNames(sample(viridis(n=ape::Ntip(tr_bl))),
tr_bl$tip.label)# This sets the colors to pretty colors
tdobj<-phytools::phylo.to.map(tr_bl,latlong,plot = FALSE, direction="rightwards")
## objective: 4
## objective: 4
## objective: 4
## objective: 2
plot(tdobj, colors=cols, direction="rightwards")
#If there is an error with 'cols' try this instead:
#plot(tdobj, direction="rightwards")
Lets try some plants
spp2 <- c("Mimulous guttatus","Arabidopsis thaliana", "Musa gracilis", "Carex capitata")
taxa2 <- rotl::tnrs_match_names(spp2)
tr2 <- rotl::tol_induced_subtree(ott_id(taxa2), label="name")
##
Progress [---------------------------------] 0/100 ( 0%) ?s
Progress [==============================] 100/100 (100%) 0s
## Warning in collapse_singles(tr, show_progress): Dropping singleton nodes with
## labels: mrcaott2ott10930, mrcaott2ott2441, mrcaott2ott969, mrcaott2ott62529,
## mrcaott2ott8379, eudicotyledons, Gunneridae, mrcaott2ott2464, mrcaott2ott8384,
## rosids, malvids, mrcaott96ott14140, mrcaott96ott50744, mrcaott96ott378,
## mrcaott378ott29446, mrcaott378ott1697, Brassicales, mrcaott378ott307071,
## mrcaott378ott32461, mrcaott378ott509555, mrcaott378ott318175,
## mrcaott378ott9635, mrcaott378ott125843, mrcaott378ott509568, mrcaott378ott28763,
## mrcaott378ott83547, Brassicaceae, mrcaott378ott299734, mrcaott378ott4673,
## mrcaott378ott4671, mrcaott4671ott58909, mrcaott4671ott6278, mrcaott6278ott15318,
## mrcaott6278ott158438, mrcaott6278ott10585, mrcaott6278ott34460,
## mrcaott6278ott9083, mrcaott9083ott19798, mrcaott19798ott31487,
## mrcaott31487ott88883, mrcaott31487ott152275, mrcaott248ott10053,
## mrcaott248ott20991, mrcaott248ott557, asterids, mrcaott248ott650,
## mrcaott248ott320, lamiids, mrcaott248ott979274, mrcaott248ott26035,
## mrcaott248ott1191, mrcaott248ott2108, Lamiales, mrcaott248ott11341,
## mrcaott248ott55259, Core_Lamiales, mrcaott248ott1016, mrcaott1016ott25224,
## mrcaott1016ott1452, mrcaott1452ott5046, mrcaott1452ott2133, mrcaott1452ott1607,
## mrcaott1452ott108490, mrcaott1452ott305916, mrcaott1452ott39175, Phrymaceae,
## Erythranthe, Liliopsida, mrcaott121ott290, Petrosaviidae, mrcaott121ott4474,
## mrcaott121ott1439, mrcaott121ott334, mrcaott121ott4575, mrcaott121ott3449,
## Zingiberales, mrcaott8470ott44226, Musaceae, Musa, mrcaott252ott213153,
## Poales, mrcaott252ott128594, mrcaott252ott1477, mrcaott1477ott591692,
## mrcaott1477ott2066, Cyperaceae, Cyperoideae, mrcaott2066ott957772,
## mrcaott2066ott297890, mrcaott2066ott57510, mrcaott2066ott262874, Cariceae,
## Carex, mrcaott2078ott4161, mrcaott4161ott193247, mrcaott4161ott980217,
## mrcaott4161ott5264517, mrcaott4161ott265965, mrcaott4161ott8152,
## mrcaott4161ott164308
tr_bl2<-ape::compute.brlen(tr2)
taxa2[,"gbif"] <- NA
locs = data.frame(matrix(vector(), nrow(taxa2), 2))
i=1
for(id in taxa2$ott_id){
tax_info <- rotl::taxonomy_taxon_info(id)
for(source in tax_info[[1]]$tax_sources){
if (grepl('gbif', source, fixed=TRUE)){
gbif_id <- strsplit(source,":")[[1]][2]
taxa2[i,]$gbif <-gbif_id
result <- rgbif::occ_search(taxonKey = gbif_id, fields=c('name','decimalLatitude', 'decimalLongitude'), limit = 5)
if (dim(result$data)[1] > 0) {
try(locs[i,1] <- as.data.frame(result$data)[2])
try(locs[i,2] <- as.data.frame(result$data)[1])
}
}}
i = i + 1}
## Warning in `[<-.data.frame`(`*tmp*`, i, 1, value =
## structure(list(decimalLatitude = c(-44.170553, : replacement element 1 has 5
## rows to replace 1 rows
## Warning in `[<-.data.frame`(`*tmp*`, i, 2, value =
## structure(list(decimalLongitude = c(170.995575, : replacement element 1 has 5
## rows to replace 1 rows
## Warning in `[<-.data.frame`(`*tmp*`, i, 1, value =
## structure(list(decimalLatitude = c(51.493676, : replacement element 1 has 5 rows
## to replace 1 rows
## Warning in `[<-.data.frame`(`*tmp*`, i, 2, value =
## structure(list(decimalLongitude = c(-0.015267, : replacement element 1 has 5
## rows to replace 1 rows
## Warning in `[<-.data.frame`(`*tmp*`, i, 1, value =
## structure(list(decimalLatitude = c(-2.361221, : replacement element 1 has 4 rows
## to replace 1 rows
## Warning in `[<-.data.frame`(`*tmp*`, i, 2, value =
## structure(list(decimalLongitude = c(101.599041, : replacement element 1 has 4
## rows to replace 1 rows
## Warning in `[<-.data.frame`(`*tmp*`, i, 1, value =
## structure(list(decimalLatitude = c(64.611437, : replacement element 1 has 5 rows
## to replace 1 rows
## Warning in `[<-.data.frame`(`*tmp*`, i, 2, value =
## structure(list(decimalLongitude = c(-149.292663, : replacement element 1 has 5
## rows to replace 1 rows
rownames(locs) <- gsub(' ','_',taxa2$unique_name)
colnames(locs) <- c("latitude","longitude")
One of our species got reconciled to a synonym!
plot(tr2)
Mimulous guttatus is a synonym of Erythranthe lutea https://tree.opentreeoflife.org/taxonomy/browse?id=662909
taxa2
## search_string unique_name approximate_match ott_id is_synonym
## 1 mimulous guttatus Erythranthe guttata TRUE 504496 FALSE
## 2 arabidopsis thaliana Arabidopsis thaliana FALSE 309263 FALSE
## 3 musa gracilis Musa gracilis FALSE 8478 FALSE
## 4 carex capitata Carex capitata FALSE 4161 FALSE
## flags number_matches gbif
## 1 5 7346102
## 2 1 3052436
## 3 sibling_higher 1 2762973
## 4 sibling_higher 1 2728838
Because we are mapping tips based on keys instead of names, the name replacement doesn’t matter affect our mapping
tdobj2<-phytools::phylo.to.map(tr_bl2,locs, plot = FALSE, direction="rightwards")
## objective: 4
## objective: 2
## objective: 2
plot(tdobj2, direction="rightwards")