Statistical Exploration of Landscapes of Phylogenetic Trees • treespace

treespace implements new methods for the exploration and analysis of distributions of phylogenetic trees for a given set of taxa.

Installing treespace

To install the development version from GitHub (this requires the devtools package):

library(devtools)
install_github("thibautjombart/treespace")

The stable version can be installed from CRAN using:

install.packages("treespace")

Then, to load the package, use:

library("treespace")

## Loading required package: ape

## Loading required package: ade4

## Registered S3 methods overwritten by 'adegraphics':
##   method         from
##   biplot.dudi    ade4
##   kplot.foucart  ade4
##   kplot.mcoa     ade4
##   kplot.mfa      ade4
##   kplot.pta      ade4
##   kplot.sepan    ade4
##   kplot.statis   ade4
##   scatter.coa    ade4
##   scatter.dudi   ade4
##   scatter.nipals ade4
##   scatter.pco    ade4
##   score.acm      ade4
##   score.mix      ade4
##   score.pca      ade4
##   screeplot.dudi ade4

Content overview

The main functions implemented in treespace are:

treespace: explore landscapes of phylogenetic trees
treespaceServer: open up an application in a web browser for an interactive exploration of the diversity in a set of trees
findGroves: identify clusters of similar trees
plotGroves: scatterplot of groups of trees, and plotGrovesD3 which enables interactive plotting based on d3.js
medTree: find geometric median tree(s) to summarise a group of trees
wiwTreeDist: find the distance between transmission trees by comparing their most recent common infector (MRCI) depth matrices
wiwMedTree: find the median of a list of transmission scenarios
relatedTreeDist: calculate the distances between trees whose tips belong to the same categories but are not necessarily identically labelled
treeConcordance: calculate the concordance between a category tree and an individuals tree
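
As a rough illustration of how several of these functions fit together, the sketch below runs a small non-interactive workflow on the bundled woodmiceTrees dataset. The choices nf = 3 and nclust = 6 are illustrative assumptions, not recommended defaults; they are set explicitly so that the functions do not prompt for input.

```r
library(treespace)

# Illustrative set of trees distributed with the package
data(woodmiceTrees)

# Pairwise tree distances plus a principal coordinate analysis,
# retaining 3 axes (assumed value, chosen for illustration)
res <- treespace(woodmiceTrees, nf = 3)

# Cluster the trees into 6 groups (assumed value, chosen for illustration)
groves <- findGroves(woodmiceTrees, nf = 3, nclust = 6)

# Scatterplot of the trees, coloured by cluster
plotGroves(groves)

# Geometric median tree(s) for each cluster
med <- medTree(woodmiceTrees, groups = groves$groups)
first_median <- med[[1]]$trees[[1]]
```

Here res$D holds the pairwise distances and res$pco the ordination, while groves$groups gives the cluster membership of each tree.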

Other functions are central to the computations of distances between trees:

treeVec: characterise a tree by a vector
treeDist: find the distance between two tree vectors
multiDist: find the pairwise distances of a list of trees
refTreeDist: find the distances of a list of trees from a reference tree
tipDiff: for a pair of trees, list the tips with differing ancestry
plotTreeDiff: plot a pair of trees, highlighting the tips with differing ancestry
findMRCIs: find the most recent common infector (MRCI) matrix from “who infected whom” information
tipsMRCAdepths: similar to treeVec but the output is a matrix where columns 1 and 2 correspond to tip labels and column 3 gives the depth of the MRCA of that pair of tips
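
The distance functions can also be called directly. The following is a minimal sketch using small random trees generated with ape; the tree sizes, the seed and the object names are arbitrary choices made for illustration.

```r
library(ape)
library(treespace)

set.seed(1)

# Five random rooted trees on the same six tips (t1..t6)
trees <- rmtree(5, 6)
tree_a <- trees[[1]]
tree_b <- trees[[2]]

# Vector characterising a single tree
vec_a <- treeVec(tree_a)

# Distance between two trees
d_ab <- treeDist(tree_a, tree_b)

# Pairwise distances between all trees in the list
d_all <- multiDist(trees)

# Distances of every tree in the list from a reference tree
d_ref <- refTreeDist(tree_a, trees)

# Tips whose ancestry differs between the two trees
diffs <- tipDiff(tree_a, tree_b)
```

All calls above use the default lambda = 0, so the comparison is based on topology only.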

Distributed datasets include:

woodmiceTrees: illustrative set of 201 trees built using the neighbour-joining and bootstrapping example from the woodmouse dataset in the ape documentation.
DengueTrees: 500 trees sampled from a BEAST posterior set of trees (Drummond and Rambaut, 2007).
DengueSeqs: 17 dengue virus serotype 4 sequences (Lanciotti et al., 1997), from which the DengueTrees were inferred.
DengueBEASTMCC: the maximum clade credibility (MCC) tree from the DengueTrees.
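
These datasets can be loaded with data() once the package is attached. The short sketch below simply loads them and checks their sizes and classes.

```r
library(treespace)

# Load the bundled datasets by name
data(woodmiceTrees)
data(DengueTrees)
data(DengueSeqs)
data(DengueBEASTMCC)

# Number of trees in each collection
n_woodmice <- length(woodmiceTrees)
n_dengue <- length(DengueTrees)

# Classes of the sequence alignment and the MCC tree
class(DengueSeqs)
class(DengueBEASTMCC)
```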

Documentation

treespace comes with the following vignettes:

introduction: general introduction using a worked example.
Dengue example: worked example using a Dengue dataset, used in the treespace publication.
transmission trees: worked example using transmission trees.
tip categories: introduction to the measures for comparing trees with shared tip label “categories”.
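
The titles listed above are descriptive and may differ from the installed vignette file names, so a safe way to find them is to list what the installed package provides. A minimal sketch:

```r
# List the vignettes shipped with the installed package
available <- vignette(package = "treespace")
available$results[, c("Item", "Title")]

# Alternatively, open an index of the vignettes in a web browser
# (only useful in an interactive session)
if (interactive()) browseVignettes("treespace")
```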