Introduction

03/02/20: BAli-Phy 3.5.0 released - Download
Covarion models + automatic ancestral state reconstruction + bug fixes (release notes)

BAli-Phy is software by Ben Redelings that estimates multiple sequence alignments and evolutionary trees from DNA, amino acid, or codon sequences. It places gaps using likelihood-based evolutionary models of substitutions, insertions, and deletions. It has been used in published analyses on data sets of up to 117 taxa.

High alignment accuracy: Redelings (2014) showed that BAli-Phy had 3.5 times fewer alignment errors than MUSCLE and MAFFT on simulated data:

Figure 4 from Redelings BD. Erasing Errors Due to Alignment Ambiguity When Estimating Positive Selection. Mol. Biol. Evol. 31(8), 2014.

Eliminate bias: Fletcher and Yang (2010) showed that relying on a ClustalW alignment estimate could lead to a 99% false-positive rate in detecting positive selection. Inferring evolutionary trees and branch lengths from a single alignment can also lead to bias if the alignment is ambiguous. BAli-Phy solves the problem of alignment bias by using MCMC to estimate evolutionary trees, positive selection, and branch lengths while averaging over alternative alignments in a Bayesian paradigm.

uncertain                                          certain
              ....310.......320.......330.......340.......350.......360.......370.......
Thermotoga    DEVEIIGLSYEIKKTV---VTSVEMFRKELDEGIAGDNVGCLLRGIDKDEVERGQVLA-----APGSIKPHKRF
Anacystis     ETIEIVGLR-DTRSTT---VTGVEMFQKTLDEGLAGDNVGLLLRGIQKTDIERGMVLA-----KPGSITPHTKF
Escheria      EEVEIVGIK-ETQKST---CTGVEMFRKLLDEGRAGENVGVLLRGIKREEIERGQVLA-----KPGTIKPHTKF
Pyrococcus    EVVIFEPASTIFHKPIQGEVKSIEMHHEPLEEALPGDNIGFNVRGVSKNDIKRGDVAGHTTN-PPTVVRTKDTF
Halobacterium DNVSFQPSDVG------GEVKTIEMHHEEVPNAEPGDNVGFNVRGIGKDDIRRGDVCGPADD-PPSVA---DTF
Methanococcus DKVVFEPAGAI------GEIKTVEMHHEQLPSAEPGDNIGFNVRGVGKKDIKRGDVLGHTTN-PPTVA---TDF
Aeropyrum     DKVVFMPPGVV------GEVRSIEMHYQQLQQAEPGDNIGFAVRGVSKSDIKRGDVAGHLDK-PPTVA---EEF
Sulfolobus    DKIVFMPVGKI------GEVRSIETHHTKIDKAEPGDNIGFNVRGVEKKDVKRGDVAGSVQN-PPTVA---DEF
Giardia       MKVVFAPTSQV------SEVKSVEMHHEELKKAGPGDNVGFNVRGLAVKDLKKGYVVGDVTNDPPVGC---KSF
Homo          MVVTFAPVNVT------TEVKSVEMHHEALSEALPGDNVGFNVKNVSVKDVRRGNVAGDSKNDPPMEA---AGF
Euglena       DVVTFAPNNLT------TEVKSVEMHHEALTEAVPGDNVGFNVKNVSVKDIRRGYVASNAKNDPAKEA---ADF
Nicotiana     MVVTFGPTGLT------TEVKSVEMHHEALQEALPGDNVGFNVKNVAVKDLKRGFVASNSKDDPAKGA---ASF
 

This ambiguity can be displayed graphically in an alignment uncertainty (AU) plot.
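
One way to picture the quantity behind such a plot is a per-column support score: for every pair of residues that share a column in a reference alignment, count how often sampled alignments also place that pair together. The Python sketch below is only an illustration of that idea under simple assumptions. The function names (pairs_by_column, column_support) and the toy input are hypothetical, and this is not BAli-Phy's own code or its exact definition of an AU plot.

```python
from itertools import combinations

def pairs_by_column(alignment):
    """List, for each column, the set of residue pairs aligned in that column.

    `alignment` maps a sequence name to its gapped string ('-' marks a gap).
    A residue is identified as (name, index in the ungapped sequence).
    """
    names = sorted(alignment)
    length = len(alignment[names[0]])
    next_index = dict.fromkeys(names, 0)
    columns = []
    for col in range(length):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, next_index[name]))
                next_index[name] += 1
        columns.append(set(combinations(present, 2)))
    return columns

def column_support(reference, samples):
    """Average, per reference column, the fraction of samples that keep each aligned pair.

    Columns with fewer than two residues have no pairs and get None.
    """
    sample_pairs = [set().union(*pairs_by_column(s)) for s in samples]
    support = []
    for pairs in pairs_by_column(reference):
        if not pairs or not sample_pairs:
            support.append(None)
            continue
        hits = sum(pair in sp for pair in pairs for sp in sample_pairs)
        support.append(hits / (len(pairs) * len(sample_pairs)))
    return support

# Toy data (hypothetical): two sampled alignments of the same two sequences.
reference = {"A": "AC-G", "B": "ACTG"}
samples = [
    {"A": "AC-G", "B": "ACTG"},
    {"A": "A-CG", "B": "ACTG"},
]
support = column_support(reference, samples)
```

Columns whose score is close to 1 are aligned the same way in nearly every sample, while low scores mark the ambiguous regions that a single alignment estimate would hide.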

Model-based alignment: BAli-Phy can make use of complex substitution models while estimating alignments (and trees). These include the free-rates and Gamma+INV models, codon models such as the M3 and M8 models, and covarion models such as Tuffley-Steel.
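
To make the rate-variation models concrete, the Python sketch below computes the likelihood of one site in a two-sequence comparison as a mixture over rate categories plus a class of invariant sites. It is a simplified illustration, not BAli-Phy's implementation: it assumes a Jukes-Cantor substitution model, takes the category rates and weights as given inputs instead of deriving them from a Gamma distribution, and does not rescale the rates to a mean of 1. The function names are hypothetical.

```python
import math

def jc69_pair_likelihood(same, distance):
    """Likelihood of one site for two sequences under Jukes-Cantor.

    `same` says whether the two letters match; `distance` is the expected
    number of substitutions per site separating the sequences.
    """
    decay = math.exp(-4.0 * distance / 3.0)
    p = 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay
    return 0.25 * p  # 0.25 is the equilibrium frequency of the first letter

def mixture_site_likelihood(same, t, rates, weights, p_inv):
    """Mix the site likelihood over rate categories and an invariant class.

    `rates` and `weights` describe the variable-rate categories (weights sum
    to 1); `p_inv` is the proportion of sites that never change (rate 0).
    """
    variable = sum(
        w * jc69_pair_likelihood(same, r * t) for r, w in zip(rates, weights)
    )
    invariant = jc69_pair_likelihood(same, 0.0)
    return p_inv * invariant + (1.0 - p_inv) * variable

# Hypothetical inputs: two equally weighted rate categories, 20% invariant sites.
l_same = mixture_site_likelihood(True, 0.1, [0.5, 1.5], [0.5, 0.5], 0.2)
l_diff = mixture_site_likelihood(False, 0.1, [0.5, 1.5], [0.5, 0.5], 0.2)
```

With equal weights and rates taken from a discretized Gamma distribution, this has the shape of a Gamma+INV mixture. Letting both the rates and the weights vary freely corresponds to the free-rates idea.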

Fixed-alignment: BAli-Phy can also estimate phylogenies from a fixed alignment (like MrBayes and BEAST) using complex substitution models like GTR+gamma.

Multi-gene: BAli-Phy automatically estimates relative rates for each gene, as described in the Manual and the tutorial.

Ancestral sequence reconstruction: BAli-Phy automatically reconstructs ancestral sequences (with gaps) for each gene, while averaging over both topological and alignment uncertainty, as described in the Manual.

References

  1. Redelings BD and Suchard MA. Joint Bayesian Estimation of Alignment and Phylogeny. Systematic Biology, 54(3):401-418, 2005. [PDF]
  2. Suchard MA and Redelings BD. BAli-Phy: simultaneous Bayesian inference of alignment and phylogeny. Bioinformatics, 22:2047-2048, 2006. [PDF]
  3. Redelings BD and Suchard MA. Incorporating indel information into phylogeny estimation for rapidly emerging pathogens. BMC Evolutionary Biology, 7:40, 2007. [PDF]
  4. Redelings BD. Erasing Errors Due to Alignment Ambiguity When Estimating Positive Selection. Mol. Biol. Evol. 31(8), 2014. [WWW]

comments and suggestions: benjamin . redelings * gmail + com