BAli-Phy User's Guide v4.0-beta13

Question

15.1.1.

Does BAli-Phy accept the wildcard characters "N" or "X"? How does it treat them?

Answer 1

Yes, BAli-Phy accepts the wildcard characters "N" (for DNA) and "X" (for proteins). These characters indicate that some letter is present (as opposed to a gap), but that you don't know which letter it is.

Answer 2

No. "?" characters are often used to indicate either letter presence (e.g. "N", "X") or absence (e.g. "-"). BAli-Phy will insist that you replace each "?" with either "N"/"X" or "-" to indicate which one you mean.

(Most programs ignore indels and consider only substitutions, and in that case "N" and "-" have the same effect on the likelihood or parsimony score. However, since BAli-Phy takes indels into account, these two alternatives are quite different.)
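As a minimal sketch of the replacement described above, the following uses standard awk to turn every "?" into "N" on sequence lines while leaving the FASTA header lines alone. The file names and the sample sequences are hypothetical, and the choice of "N" assumes DNA data in which "?" means that a letter is present but unknown.

```shell
# Hypothetical input file; substitute your own alignment.
cat > example.fasta <<'EOF'
>seq1 what?
AC?T-G
>seq2
A??TTG
EOF

# Header lines (starting with ">") are printed unchanged.
# On all other lines, every "?" is replaced with "N".
# Use "X" instead of "N" for protein data, or "-" if "?" means absence.
awk '/^>/ {print; next} {gsub(/\?/, "N"); print}' example.fasta > example.fixed.fasta

cat example.fixed.fasta
```

If a data set uses "?" for both presence and absence, a blanket replacement like this is not appropriate and each "?" has to be decided individually.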

Answer 3

Yes. BAli-Phy accepts the characters Y, R, W, S, K, M, B, D, H, and V for DNA, RNA, and Codon alphabets. BAli-Phy also accepts the characters B, Z, and J for amino acids. These characters indicate partial knowledge about a letter. For example, R indicates that a nucleotide is present, and is a puRine (A or G). J indicates that an amino acid is present and is either I or L.

(Note that sequences sometimes contain such ambiguity codes because the DNA that was sequenced contains both values. This might occur when sequencing a heterozygote or when sequencing pooled DNA from several individuals. However, the model in BAli-Phy (and other phylogeny inference programs) is that only one letter is correct, but we do not know which one it is. This is probably not problematic when dealing with pooled sequences, but should be considered.)

Answer 4

Yes. Add -Inone or -I none on the command line.

Answer 5

Yes. Add --fix=topology --tree=treefile on the command line.

Answer 6

Yes. Add --fix=tree --tree=treefile on the command line.

Answer 7

Yes. Add --fix=tree --tree=treefile '--scale=~gamma(0.5,2)' on the command line.

Answer 8

You are probably using the C-shell as your command line shell. It is trying to interpret lg08 +> Rates.gamma(6) as an array before running the command, and it is not succeeding. Therefore, it doesn't even run bali-phy.

To avoid this, put quotes around the substitution model, like this: -S 'lg08 +> Rates.gamma(6)'. This will keep the C-shell from interfering with your command.
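The effect of the quotes can be checked without running bali-phy at all. The sketch below is written for sh or bash rather than the C-shell, and `count_args` is a hypothetical helper that only reports how many arguments it received. With the quotes in place, the whole substitution model arrives as a single argument after `-S`.

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

# The quoted model is one argument, so together with -S there are two.
count_args -S 'lg08 +> Rates.gamma(6)'
```

Without the quotes, the shell would try to interpret the `>` and the parentheses itself before any program is started.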

Answer 9

It runs until you stop it. Stop it when it's done.

The longer answer is that it is hard to predict how long MCMC will take to converge, since it depends on each data set in complex ways. Automatic rules for determining when to stop an MCMC chain can be difficult to get right. BAli-Phy does not contain an automatic stopping rule yet, so it relies on the user to run convergence diagnostics and determine when to stop the run.

Answer 10

Simply kill the process -- there is no special command to stop bali-phy. If you are running it on your personal workstation, then you can use the command kill. To do that, you need to find the PID (process ID) of the running program. You can find this by examining the beginning of the file C1.out. For example:

% less 5d-1/C1.out
command: bali-phy 5d.fasta -I none --iter=10 --seed=0
start time: Wed Jul  4 17:13:25 2018

VERSION: 3.3-b1  [HEAD -> logging, origin/logging commit 96a43e550]  (Jul 04 2018 16:25:09)
BUILD: Jul  4 2018 17:12:29
ARCH: linux x86_64
COMPILER: gcc 8.1.0 x86_64
directory: /home/bredelings/Work
subdirectory: 5d-675
hostname: telomere
PID: 18838
...

Here the PID is 18838. Therefore you can type:

% kill 18838

On some operating systems you can also type:

% killall bali-phy

However, be aware that this will terminate all of your bali-phy runs on that computer.
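The lookup of the PID can also be scripted. This is only a sketch: it assumes that the header of `C1.out` contains a line of the form `PID: <number>`, as in the example above, and it uses a hypothetical directory `demo-run` with a shortened stand-in for the real file.

```shell
# Hypothetical stand-in for the header of a real C1.out file.
mkdir -p demo-run
cat > demo-run/C1.out <<'EOF'
command: bali-phy 5d.fasta -I none --iter=10 --seed=0
hostname: telomere
PID: 18838
EOF

# Print the second field of the first line that starts with "PID:".
pid=$(awk '/^PID:/ {print $2; exit}' demo-run/C1.out)
echo "$pid"

# To stop the run you would then type the following
# (left commented out here on purpose):
# kill "$pid"
```

A PID is only meaningful on the machine where the program is running, so check the `hostname` line in the same file before killing anything.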

Answer 11

Simply terminate the submitted job. The specific command to terminate a job will depend on the queue manager that is installed on your cluster. Examine the documentation for your cluster, or ask your cluster support staff how to delete running jobs on your cluster.

As an example, if the SGE software is used to submit jobs, then the command qstat should list your jobs and their job ID numbers (which are different from process ID numbers). You can then use the command qdel to delete jobs by ID number. The SGE documentation describes how to use these commands.

Answer 12

You can stop when it has both converged and also run for long enough to give you >1000 effectively independent samples.

Answer 13

See Section 11, “Convergence and Mixing: Is it done yet?”.

Answer 14

Run wc -l C1.log inside the output directory, and subtract 2.
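The same arithmetic can be done in one step by the shell. In this sketch `demo.log` is a hypothetical stand-in for `C1.log`, and the subtraction of 2 simply follows the rule stated above. It assumes, as that rule implies, that two lines of the file are not counted as iterations.

```shell
# Hypothetical stand-in for C1.log: two uncounted lines plus three sample lines.
printf 'line1\nline2\nsample1\nsample2\nsample3\n' > demo.log

# Count the lines and subtract 2, as described above.
iterations=$(( $(wc -l < demo.log) - 2 ))
echo "$iterations"
```

For a real run, replace `demo.log` with `C1.log` and run the command inside the output directory.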

Answer 15

The program draw-tree was not distributed on this platform (Windows, Mac). This is not a fatal error; it just means that a pretty picture of the tree will not be generated automatically. You can still view the tree with FigTree, for example.

Answer 16

This is because you have not installed gnuplot. This is not a fatal error; it just means that pictures of partition support and SRQ plots will not be generated automatically.

Answer 17

This is because you have not installed R. This is not a fatal error; it just means that a plot showing differences in clade probabilities between runs will not be generated.

Answer 18

Look in the file Results/commands.log. This should contain the actual commands that were run, along with error messages from these commands. These error messages should give you a hint as to what the problem might be.

Answer 19

Actually, BAli-Phy uses unrooted trees, so it only estimates bi-partition support. A bi-partition is a division of taxa into two groups, but it does not specify which group contains the root.

Answer 20

After you analyze the output (Section 5.4, “Summarizing the output - scripted”), the partition support is indicated in Results/consensus and in Results/c50.PP.tree.

Answer 21

% alignment-cat filename1.fasta filename2.fasta > result.fasta

The alignments must have the same sequence names, but the names need not be in the same order.

Answer 22

You can now select columns for analysis by specifying a range:

% bali-phy sequences.fasta:1-200,401-600 sequences.fasta:201-400

You can create a new alignment from selected columns using alignment-cat:

% alignment-cat -c1-10,50-100,600- filename.fasta > result.fasta

The resulting alignment will contain the selected columns in the order you specified.

Answer 23

To constrain the alignment to match some alignment file filename.fasta in columns 100, 200-250, and 300, run:

% alignment-indices -c100,200-250,300 filename.fasta > filename.constraint

UNIX-style	Windows-style
/home/username	C:\cygwin64\home\username
~/file	C:\cygwin64\home\username\file
/cygdrive/c/file	C:\file

C1.out	Iteration numbers, probabilities, success probabilities for transition kernels, etc.
C1.P$n$.fastas	Sampled alignments for partition $n$, including ancestral sequences.
C1.err	Log file for hopefully irrelevant error messages.
C1.MAP	Successive estimates of the MAP alignment, tree and parameters.
C1.log	Numeric parameters: indel and substitution rates, etc. (One sample per line.)
C1.trees	Tree samples in Newick format. (One sample per line.)
C1.run.json	JSON file containing information about the command line, models, hostname, start time, etc.

prior	The log prior probability.
likelihood	The log likelihood.
posterior	The log of the posterior probability. (The posterior probability is proportional to the product of the prior and the likelihood.)
prior_A	The log-probability of the alignments in all partitions.
|A|	The total number of alignment columns across all partitions.
#indels	The total number of indel events across all partitions. (Adjacent indels that occur on the same branch are merged.)
|indels|	The total length of indel events across all partitions. (Adjacent indels that occur on the same branch are merged.)
#substs	The total unweighted parsimony score for substitutions across all partitions.
P$n$/likelihood	The substitution log-likelihood for partition $n$.
P$n$/prior_A	The log-probability of the alignment for partition $n$.
P$n$/|A|	The length of the alignment in the $n$th partition.
P$n$/#indels	The number of indel events in partition $n$, if we group adjacent indels that occur on the same branch.
P$n$/|indels|	The length of indel events in partition $n$, if we group adjacent indels that occur on the same branch.
P$n$/#substs	The unweighted parsimony score for substitutions in partition $n$.
Scale[$m$] * |T|	The scaled branch lengths for partition group $m$.
|T|	The unscaled tree length. (This will probably be around 1.0.)
Scale[$m$]	The average number of substitutions per site on the entire tree for partitions in the $m$th scale group.
S$n$/`name`	Parameter `name` in the $n$th substitution model.
I$n$/`name`	Parameter `name` in the $n$th insertion/deletion model.

Report	A summary of numerical parameters: credible intervals and mixing.
consensus	A summary of supported splits (clades).
c-levels.plot	The number of splits (clades) supported at each LOD level.
c50.tree	The majority consensus topology + branch lengths (Newick format)
c50.PP.tree	The majority consensus topology + branch lengths + Posterior Probabilities (Newick format)
MAP.tree	An estimate of the MAP topology + branch lengths (Newick format)

P`p`-max.fasta	An estimate of the alignment for partition `p` using maximum posterior decoding.
P`p`-max-AU.html	An AU plot of the maximum posterior decoding alignment for partition `p` (AA/DNA color-scheme).

partitions.bs	Confidence intervals on the support for partitions, generated using a block bootstrap.
partitions.SRQ	A collection of SRQ plots for the supported partitions.
c50.SRQ	An SRQ plot for the majority consensus tree.

Model	d.f.	Summary
`jc69`	0	Equal rates and equal base frequencies. (Jukes and Cantor, 1969)
`k80`	1	Unequal transition & transversion rates, equal base frequencies. (Kimura, 1980)
`f81`	3	Equal exchangeabilities, unequal frequencies. (Felsenstein, 1981)
`hky85`	4	Unequal transition & transversion rates, unequal base frequencies. (Hasegawa, Kishino, and Yano, 1985)
`tn93`	5	Unequal rates for transitions (purines), transitions (pyrimidines) and transversions, unequal base frequencies. (Tamura and Nei, 1993)
`gtr`	8	Unequal exchangeabilities, unequal frequencies. (Tavare, 1986)

Model	d.f.	Summary
`jc69`	0	Equal rates and equal frequencies. (Jukes and Cantor, 1969)
`f81`	19	Equal exchangeabilities, unequal frequencies. (Felsenstein, 1981)
`jtt +> f`	19	Empirical exchange rates, all proteins. (Jones, Taylor, and Thornton, 1992)
`wag +> f`	19	Empirical exchange rates, all proteins. (Whelan and Goldman, 2001)
`lg08 +> f`	19	Empirical exchange rates, all proteins. (Le and Gascuel, 2008)
`empirical(file) +> f`	19
`gtr`	208	Unequal exchangeabilities, unequal frequencies. (Tavare, 1986)

Model	d.f.	Summary
`Frequencies.uniform`	0	Equal frequencies
`wag_freq`	0	The constant amino-acid frequencies from the WAG paper.
`lg08_freq`	0	The constant amino-acid frequencies from the LG08 paper.

Model	d.f.	Summary
`nuc_model +> x2`	df(nuc_model)	The same as `nuc_model`, but on dinucleotides instead of nucleotides. Simultaneous changes of both letters are not allowed. Dinucleotide frequencies are the product of independent nucleotide frequencies.
`nuc_model +> x2 +> mut_sel`	df(nuc_model)+15	Mutation-selection model: neutral mutation follows `nuc_model` and scaled selection coefficients 2Ns on dinucleotides. Simultaneous changes of both letters are not allowed.
`nuc_model +> x2_sym +> f`	df(nuc_model)+15	This model has separate frequencies for each dinucleotide. Simultaneous changes of both letters are not allowed.
`RNA.m16a`	19	This model has separate frequencies for each dinucleotide, and distinguishes between transitions and transversions between match states (including GU/UG). Simultaneous changes of both letters are allowed, but only between match states. (Savill et al., 2001)
`gtr`	134	Unequal exchangeabilities, unequal frequencies. It is unlikely that you would want to use this model, since it has so many parameters. (Tavare, 1986)
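The d.f. values listed for `gtr` in the nucleotide, amino-acid, and doublet tables are consistent with a simple parameter count. The count below is offered as a plausible reading of the tables, not as a statement of how BAli-Phy computes the values: for an alphabet of $n$ letters it assumes $n(n-1)/2$ symmetric exchangeabilities with one fixed by normalization, plus $n-1$ free frequencies.

```latex
\begin{align*}
\text{d.f.}(n) &= \left[\frac{n(n-1)}{2} - 1\right] + (n-1) \\
n = 4\ \text{(nucleotides)}: &\quad (6 - 1) + 3 = 8 \\
n = 20\ \text{(amino acids)}: &\quad (190 - 1) + 19 = 208 \\
n = 16\ \text{(doublets)}: &\quad (120 - 1) + 15 = 134
\end{align*}
```

The same reasoning matches the `f81` rows, which have only the $n-1$ free frequencies: 3 for nucleotides and 19 for amino acids.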

Model	d.f.	Summary
`Frequencies.uniform`	0	Equal frequencies
`f1x4`	3	Constructs triplet frequencies from independent nucleotide frequencies.
`f3x4`	9	Constructs triplet frequencies from independent nucleotide frequencies for each codon position.

Model	d.f.	Summary
`gy94`	62	Model of dN/dS with a separate frequency for each codon. Rate for changing a nucleotide depends on neighboring nucleotides. (Goldman and Yang, 1994)
`gy94(pi=f1x4)`	5	The GY94 model with codon frequencies constructed from nucleotide frequencies. (Goldman and Yang, 1994)
`gy94(pi=f3x4)`	11	The GY94 model with codon frequencies constructed from nucleotide frequencies for each codon position. (Goldman and Yang, 1994)
`gy94_ext(nuc_model)`	df(`nuc_model`)+61	GY94 model extended with a generic nucleotide exchangeability matrix. (Goldman and Yang, 1994)
`mg94`	4	Model of dN/dS with f81 as the neutral model. Rate for changing a nucleotide depends only on that nucleotide. (Muse and Gaut, 1994)
`mg94k`	5	Model of dN/dS with hky85 as the neutral model. (Muse and Gaut, 1994)
`mg94_ext(nuc_model)`	df(`nuc_model`)+1	Model of dN/dS with `nuc_model` as the neutral model. (Muse and Gaut, 1994)
`fMutSel`	65	MG94-like model with fitnesses for each codon. (Yang and Nielsen, 2008)
`fMutSel0`	24	MG94-like model with fitnesses for each amino-acid. (Yang and Nielsen, 2008)
`nuc_model +> x3_sym +> f`	df(nuc_model)+60	GY94-style rate matrix constructed from nucleotide exchangeability matrix (dN/dS = 1). This model should give the same likelihood as `nuc_model` on codons only if the frequency of stop codons is zero.
`nuc_model +> x3`	df(nuc_model)	MG94-style rate matrix constructed from nucleotide rate matrix (dN/dS = 1).
`codon_model +> dNdS(omega)`	df(`codon_model`)+1	Scales non-synonymous rates by `omega`.
`codon_model +> mut_sel`	df(`codon_model`)+60	Mutation-selection model with neutral mutation following `codon_model` and scaled selection coefficients 2Ns for each codon.
`nuc_model +> x3 +> mut_sel_aa`	df(`nuc_model`)+19	Mutation-selection model with neutral mutation following `nuc_model` and scaled selection coefficients 2Ns for each amino acid.

Name	Number	Description
standard	1	Standard
mt-vert	2	Mt: Vertebrate
mt-yeast	3	Mt: Yeast
mt-protozoa	4	*: Mold, Protozoan and Coelenterate Mitochondrial Code and Mycoplasma/Spiroplasma
mt-invert	5	Mt: Invertebrate
nuc-ciliate	6	Nuc: Ciliate, Dasycladacean and Hexamita
mt-echinoderm	9	Mt: Echinoderm and Flatworm
nuc-euplotid	10	Nuc: Euplotid
bacteria	11	*: Bacterial, Archaeal and Plant Plastid
nuc-yeast-alt	12	Nuc: Alternative Yeast
mt-ascidian	13	Mt: Ascidian
mt-flatworm-alt	14	Mt: Alternative Flatworm
nuc-blepharisma	15	Nuc: Blepharisma Nuclear Code
mt-chlorophycean	16	Mt: Chlorophycean
mt-trematode	21	Mt: Trematode
mt-scenedesmus-obliquus	22	Mt: Scenedesmus obliquus
mt-thraustochytrium	23	Mt: Thraustochytrium
mt-rhabdopleuridae	24	Mt: Rhabdopleuridae
bacteria-sr1	25	*: Candidate Division SR1 and Gracilibacteria
nuc-pachysolen-tannophilus	26	Nuc: Pachysolen tannophilus
nuc-karyorelict	27	Nuc: Karyorelict
nuc-condylostoma	28	Nuc: Condylostoma
nuc-mesodinium	29	Nuc: Mesodinium
nuc-peritrich	30	Nuc: Peritrich
nuc-blastocrithidia	31	Nuc: Blastocrithidia
mt-cephalodiscidae	33	Mt: Cephalodiscidae UAA-Tyr

Model	d.f.	Summary
`m1a`	df(`submodel`)+2	A mixture of conserved and neutral sites. (Wong et al., 2004)
`m2a`	df(`submodel`)+4	A mixture of conserved, neutral, and positively-selected sites. (Wong et al., 2004)
`m2a_test`	df(`submodel`)+4	A Bayesian test for positive selection that compares M2a with M1a. (Wong et al., 2004)
`m3`	df(`submodel`)+2*$n$-1	A free mixture of $n$ categories of conserved dN/dS values. (Yang et al., 2000)
`m3_test`	df(`submodel`)+2*$n$+1	A Bayesian test for positive selection based on the M3 model extended with an extra category of either neutral or positively-selected sites.
`m7`	df(`submodel`)+2	The M7 model places a beta distribution on dN/dS. (Yang et al., 2000)
`m8a`	df(`submodel`)+3	The M8a model adds a category of neutral sites to the M7 model. (Swanson et al., 2003)
`m8`	df(`submodel`)+4	The M8 model adds a category of positively-selected sites to the M7 model. (Yang et al., 2000)
`m8a_test`	df(`submodel`)+4	A Bayesian test for positive selection that compares the M8 to the M8a model. (Swanson et al., 2003)
`branch_site`	df(`submodel`)+4	A Bayesian test for positive selection on some (unknown) sites and some (known) branches. (Zhang et al., 2005)

Model	d.f.	Summary
`submodel +> Rates.gamma`	df(`submodel`)+1	Site rates follow a discrete approximation to the Gamma distribution (Yang, 1994)
`submodel +> Rates.log_normal`	df(`submodel`)+1	Site rates follow a discrete approximation to the logNormal distribution
`submodel +> Rates.free`	df(`submodel`)+2($n$-1)	Sites fall in one of $n$ categories. Each category has its own rate. (Yang, 1995)
`submodel +> multi_rate(dist)`	df(`submodel`)+df(`dist`)	Site rates follow a discrete approximation to the distribution `dist`.
`submodel +> inv`	df(`submodel`)+1	Some fraction inv:p_inv of sites are invariable.

Model	d.f.	Summary
`Q +> Covarion.ts98`	df(`submodel`)+2	Each state in rate matrix Q is split into an ON and OFF variant. Models burstiness. (Tuffley and Steel, 1998)
`Q +> Rates.gamma +> Covarion.hb02` `submodel +> Covarion.hb02`	df(`Q+Rates.gamma`)+2 df(`submodel`)+2	Combines Gamma (or other) rate heterogeneity with the Tuffley-Steel model. (Huelsenbeck, 2002)
`Q +> Rates.gamma +> Covarion.gt01` `submodel +> Covarion.gt01`	df(`Q+Rates.gamma`)+2 df(`submodel`)+2	Allows switching between Gamma (or other) rate classes over time. Models changes in conservation. (Galtier, 2001)
`Q +> Rates.gamma +> Covarion.wssr07` `submodel +> Covarion.wssr07`	df(`Q+Rates.gamma`)+4 df(`submodel`)+4	Allows switching between ON/OFF states and also between Gamma (or other) rate classes over time. Models both burstiness and changes in conservation. (Wang et al., 2007)

15.2.1.	Can I fix the alignment and ignore indel information, like MrBayes, BEAST, PhyloBayes and other MCMC programs?
	Yes. Add `-Inone` or `-I none` on the command line.
15.2.2.	Can I fix the tree topology, while allowing the alignment to vary?
	Yes. Add `--fix=topology --tree=treefile` on the command line.
15.2.3.	Can I fix the tree topology and absolute branch lengths in all data partitions, while allowing the alignment to vary?
	Yes. Add `--fix=tree --tree=treefile` on the command line.
15.2.4.	Can I fix the tree topology and relative branch lengths, while allowing the alignment to vary?
	Yes. Add `--fix=tree --tree=treefile '--scale=~gamma(0.5,2)'` on the command line.

15.4.1.	Why is bali-phy still running? How long will it take?
	It runs until you stop it. Stop it when it's done. The longer answer is that it is hard to predict how long MCMC will take to converge, since it depends on each data set in complex ways. Automatic rules for determining when to stop an MCMC chain can be difficult to get right. BAli-Phy does not contain an automatic stopping rule yet, so it relies on the user to run convergence diagnostics and determine when to stop the run.
15.4.2.	How do I stop a bali-phy run on my personal computer?
	Simply kill the process -- there is no special command to stop bali-phy. If you are running it on your personal workstation, then you can use the command kill. To do that, you need to find the PID (process ID) of the running program. You can find this by examining the beginning of the file `C1.out`. For example:
	`%` less 5d-1/C1.out
	command: bali-phy 5d.fasta -I none --iter=10 --seed=0
	start time: Wed Jul 4 17:13:25 2018
	VERSION: 3.3-b1 [HEAD -> logging, origin/logging commit 96a43e550] (Jul 04 2018 16:25:09)
	BUILD: Jul 4 2018 17:12:29
	ARCH: linux x86_64
	COMPILER: gcc 8.1.0 x86_64
	directory: /home/bredelings/Work
	subdirectory: 5d-675
	hostname: telomere
	PID: 18838
	...
	Here the PID is 18838. Therefore you can type:
	`%` kill 18838
	On some operating systems you can also type:
	`%` killall bali-phy
	However, be aware that this will terminate all of your bali-phy runs on that computer.
15.4.3.	How do I stop a bali-phy run on a computing cluster?
	Simply terminate the submitted job. The specific command to terminate a job will depend on the queue manager that is installed on your cluster. Examine the documentation for your cluster, or ask your cluster support staff how to delete running jobs on your cluster. As an example, if the SGE software is used to submit jobs, then the command qstat should list your jobs and their job ID numbers (which are different from process ID numbers). You can then use the command qdel to delete jobs by ID number. The SGE documentation describes how to use these commands.
15.4.4.	So, how can I know when to stop it?
	You can stop when it has both converged and also run for long enough to give you >1000 effectively independent samples.
15.4.5.	How can I tell when the chain has converged?
	See Section 11, “Convergence and Mixing: Is it done yet?”.
15.4.6.	How can I check how many iterations the chain has finished?
	Run wc -l C1.log inside the output directory, and subtract 2.

15.5.1.	Why does bp-analyze say "Program 'draw-tree' not found. Tree pictures will not be generated"?
	The program draw-tree was not distributed on this platform (Windows, Mac). This is not a fatal error; it just means that a pretty picture of the tree will not be generated automatically. You can still view the tree with FigTree, for example.
15.5.2.	Why does bp-analyze say "Program 'gnuplot' not found. Trace plots will not be generated"?
	This is because you have not installed gnuplot. This is not a fatal error; it just means that pictures of partition support and SRQ plots will not be generated automatically.
15.5.3.	Why does bp-analyze say "Program 'R' not found. Some mixing graphs will not be generated"?
	This is because you have not installed R. This is not a fatal error; it just means that a plot showing differences in clade probabilities between runs will not be generated.
15.5.4.	Why is bp-analyze stopping early, or failing to generate some files?
	Look in the file `Results/commands.log`. This should contain the actual commands that were run, along with error messages from these commands. These error messages should give you a hint as to what the problem might be.

15.6.1.	How do I compute the clade support?
	Actually, BAli-Phy uses unrooted trees, so it only estimates bi-partition support. A bi-partition is a division of taxa into two groups, but it does not specify which group contains the root.
15.6.2.	How do I compute the split/bi-partition support?
	After you analyze the output (Section 5.4, “Summarizing the output - scripted”), the partition support is indicated in `Results/consensus` and in `Results/c50.PP.tree`.

15.7.1.	How do I concatenate alignments?
	`%` alignment-cat `filename1.fasta` `filename2.fasta` > result.fasta
	The alignments must have the same sequence names, but the names need not be in the same order.
15.7.2.	How do I select columns from an alignment?
	You can now select columns for analysis by specifying a range:
	`%` `bali-phy sequences.fasta:1-200,401-600 sequences.fasta:201-400`
	You can create a new alignment from selected columns using `alignment-cat`:
	`%` alignment-cat -c1-10,50-100,600- `filename.fasta` > result.fasta
	The resulting alignment will contain the selected columns in the order you specified.
15.7.3.	How do I create an alignment-constraint file from an alignment?
	To constrain the alignment to match some alignment file `filename.fasta` in columns 100, 200-250, and 300, run: `%` alignment-indices -c100,200-250,300 `filename.fasta` > filename.constraint

Benjamin Redelings

1. Introduction

2. Installation

2.1. Hardware requirements

2.2. Upgrades

2.3. Install on MS Windows

2.3.1. Install a Unix command line: Cygwin (recommended)

2.4. Install on Mac OS X

2.4.1. Install BAli-Phy using homebrew (recommended)

2.4.2. Install BAli-Phy using executables from website (alternative)

2.4.3. Install programs used by bp-analyze using homebrew

2.4.4. Install some of the programs used for viewing the results using homebrew

2.5. Install on Linux

2.5.1. Install BAli-Phy using apt-get

2.5.2. Install BAli-Phy using executables from website (alternative)

2.5.3. Install programs used by bp-analyze

2.5.4. Install programs used to view the results

2.6. Add BAli-Phy to your PATH

2.6.1. Is bali-phy in your PATH already?

2.6.2. Quick version

2.6.3. I have a path?

2.6.4. Examining your PATH

2.6.5. Adding BAli-Phy to your PATH

2.6.6. Making the change stick

2.7. Test the installed software

2.8. Install programs used for viewing the results

3. Running the program

3.1. Quick Start

3.2. Input

3.3. Command line options

3.4. Option files (Scripts)

3.5. Running on computing clusters

4. Input

4.1. Sequence formats

4.2. Is my data set too large?

4.2.1. Too many taxa?

4.2.2. Sequences too long?

5. Output

5.1. Output directory

5.2. Output files

5.2.1. Field names in C1.log

5.3. Summarizing the output

5.3.1. Finding the majority consensus tree

5.3.2. Finding the greedy consensus tree

5.3.3. Finding the M.A.P. tree

5.3.4. Checking topology convergence

5.3.5. Summarizing numerical parameters

5.3.6. Computing an alignment using Posterior Decoding

5.3.7. Create an AU (Alignment Uncertainty) plot

5.4. Summarizing the output - scripted

5.4.1. Meaning of generated files

5.4.2. Mixing/partitions.bs: partition mixing

6. Substitution models

6.1. DNA and RNA models

6.1.1. Substitution rates

6.1.2. Frequencies

6.2. Protein models

6.2.1. Substitution rates

6.2.2. Frequencies

6.3. Doublet models (RNA stems)

6.3.1. Doublet data

6.3.2. Substitution rates

6.3.3. Frequencies

6.3.4. Branch lengths

6.4. Triplet models

6.4.1. Substitution rates

6.4.2. Frequencies

6.5. Codon models

6.5.1. Substitution rates

6.5.2. Frequencies

6.5.3. Genetic Codes

6.5.4. Heterogeneous dN/dS and tests for positive selection

6.5.5. The branch-site substitution model

6.6. Heterogeneous Rates across Sites

6.7. Heterotachy models

7. Insertion/deletion models

8. Models and Priors

8.2. Models and '`+>`' notation

9.5. Linking models via the `link` command