HowTo/Phylogenetic Independent Contrasts
Phylogenetic Independent Contrasts
This example assumes you have installed R and the package ape (if not start here) and downloaded the Geospiza tree and tip data. Upload the tree and data file into your R workspace as a phylo Class for the tree (read.nexus) and a dataframe for the data (read.table).
library(ape) geodata<-read.table("geospiza.txt") geotree<-read.nexus("geospiza.nex")
The following example will calculate phylogenetic independent contrasts (Felsenstein, 1985) using the pic function in ape.
IMPORTANT you can only calculate contrasts on a single variable, the soon to be released CAIC package will allow you to run more than one trait at a time as well as discrete traits.
--Simonb 03:08, 17 March 2008 (EDT)Discrete traits can be coded as continuous variables (see ?contr.treatment) and analysed as usual, if they are independent (explanatory) variables. The BRUNCH option in CAIC (to be included in the package?) involves arbitrarily discarding data from deep in the tree. This has never been justified. Categorical dependent variables require other methods of analysis.
WARNING your tree must be fully dichotomous and there must be no gaps in your data.
So remove the outgroup "olivacea" from the tree as there are no data for it in the dataframe.
geotree <- drop.tip(geotree, "olivacea")
Create a numeric vector for the traits wingL and tarsusL
wingL <- geodata$wingL tarsusL <- geodata$tarsusL
However if you leave the data like this the analysis will run but the data will not be associated with the correct tips. So as long as your row names in your dataframe (mydata) match the tip labels in the phylogeny (mytree$tip.label) you can associate the data with the correct tip by:
names(wingL) <- row.names(geodata) names(tarsusL) <- row.names(geodata)
To calculate contrasts that have been scaled using the expected variances:
ContrastwingL <- pic(wingL, geotree) ContrasttarsusL <- pic(tarsusL, geotree)
If you want to extract the expected variances as well as the contrasts:
ContrastwingL.var <- pic(wingL, geotree, var.contrasts=TRUE) ContrasttarsusL.var <- pic(tarsusL, geotree, var.contrasts=TRUE)
If you don't want to scale the contrasts using the expected variances:
ContrastwingL <- pic(wingL, geotree, scaled=FALSE) ContrasttarsusL <- pic(tarsusL, geotree, scaled=FALSE)
--Simonb 03:08, 17 March 2008 (EDT)If you want to calculate contrasts for several variables, create a matrix with your variables in columns, then just use apply:
contrasts.geodata <- apply(geodata, 2, pic, geotree)
To see the contrasts of the wingL variable type (first with variances, in a data frame, then just the contrasts, as a vector only):
To see the contrasts plotted at the appropriate nodes, when you have saved both expected variances and contrasts to the object ContrastwingL and ContrasttarsusL, use the function nodelabels:
plot (geotree) nodelabels(round(ContrastwingL.var [,1], 3), adj = c(0, -0.5), frame="n") nodelabels(round(ContrasttarsusL.var [,1], 3), adj = c(0, 1), frame="n")
So what is this code actually doing? it is plotting the first column ([,1]) of the object ContrastwingL.var to 3 decimal places (round(ContrastwingL.var [,1], 3). You will notice that the instructions for the 2 sets of contrasts are slightly different, this is to stop the 2 values being printed upon one another!
To see the contrasts plotted at the appropriate nodes, when you have only saved the contrasts to the object ContrastwingL and ContrasttarsusL, use the function nodelabels:
plot (geotree) nodelabels(round(ContrastwingL, 3), adj = c(0, -0.5), frame="n") nodelabels(round(ContrasttarsusL, 3), adj = c(0, 1), frame="n")
To run a linear regression on the wing and tarsus contrasts when we have not extracted the variances:
RegressTarsusWing <- lm(ContrastwingL~ContrasttarsusL -1)
The -1 specifies that the regression is through the origin (the intercept is set to zero) as recommended by Garland et al., 1992.
If you have extracted the expected variances there will be more than one column in the objects ContrastwingL and ContrasttarsusL so you need to specify that the contrasts are in the first column, using ContrastwingL.var[,1]:
RegressTarsusWing <- lm(ContrastwingL ~ ContrasttarsusL -1)
To see the regression statistics:
To plot the contrasts to visualize the regression - which is always wise to check that they meet the assumption that there is a linear relationship between the 2 variables:
To add the regression line to the plot you add the regression coefficients from the linear regression.
You can change lots of things about the plot such as the line thickness, colour etc. to see more information type:
--Simonb 19:53, 17 March 2008 (EDT)Note that using independent contrasts and fitting regressions through the origin is numerically identical to using PGLS, assuming equal and constant within-species variances (which corresponds to using an ultrametric tree). See Garland and Ives (2000) and Rohlf (2001). For most purposes, it is usually easier and more flexible to use PGLS.
Felsenstein, J. (1985) Phylogenies and the comparative method. Am. Nat. 125, 1-15.
Garland, Jr. T., Harvey, P.H. & Ives, A. R. (1992) Procedures for the analysis of comparative data using phylogenetically independent contrasts. Syst. Biol 41, 18-32.
Purvis, A. & Rambaut, A. (1995) Comparative Analysis by Independent Contrasts (CAIC): an Apple Macintosh application for analysing comparative data. Bioinformatics 11(3), 247-251.
Garland, T., Jr. and A. R. Ives (2000). Using the past to Predict the Present: Confidence Intervals for Regression Equations in Phylogenetic Comparative Methods. The American Naturalist, Vol. 155, No. 3. (Mar., 2000), pp. 346-364.
Rohlf, F. J. (2001). Comparative Methods for the Analysis of Continuous Variables: Geometric Interpretations. Evolution Vol. 55, No. 11 (Nov., 2001), pp. 2143-2160