Contrast-FEL (Fixed Effects Likelihood - Contrast)#
What question does this method answer?
Which sites in a gene are subject to different selective pressures among various clades or sets of branches in a phylogenetic tree? In other words, FEL-contrast tests whether selective pressures (measured as dN/dS) at specific sites vary significantly between predefined groups of branches.
Recommended Applications
- Clade/Environment Comparisons: Ideally suited to compare selective pressures when a gene is sampled from different selective environments (e.g., HIV evolution in different hosts or compartments, or species with different lifestyles/phenotypes).
- Binary Environment Splits: Test hypotheses where branches are partitioned a priori into two environments (reference vs. test), allowing site-level resolution for comparing selection pressures.
| HyPhy version required | ≥ 2.3.4 |
| Parallel support | MP and/or MPI |
| File path | LIB/TemplateBatchFiles/SelectionAnalyses/FEL-contrast.bf |
| Standard analysis menu | Evolutionary Hypothesis Testing > FEL-contrast |
What is the statistical procedure and statistical test is used to establish significance for this method?#
For each site, three rates are inferred, with other parameters (frequencies, branch lengths) inferred jointly and held at
- α : synonymous substitution rate
- β r : non-synonymous substitution rate along reference branches
- β p : non-synonymous substitution rate along test branches
Two models are compared using a likelihood ratio test
-
HA: α, β r, and β t are inferred by maximum likelihood as free parameters
-
H0: The β r := β t constraint is enforced.
The models are nested and differ by one degree of freedom. p-values are computed using the limit χ2 distribution with one degree of freedom.
How should one interpret positive and negative test results?#
A significant result at a site means that dN/dS (β/α) is different between the two sets of branches, with either an increase or a decrease on the test branches relative to the reference branches. A significant finding does not make any claims about positive (dN/dS > 1) or negative (dN/dS < 1), just that dN/dS differ among sets of branches, i.e., a difference need not change the mode of selection.
Negative results do not mean that there is no difference, rather that whatever difference there may be does not rise to the level of statistical significance/
Rules of thumb for when this method is likely to work well, and when it is not.#
- Generally, you need 10 or more branches in each set to be able to have any statistical power.
- Too little divergence is also likely to severely throttle statistical power.
Example#
We will analyze HIV-1 env sequences from a transmission pair: sequences are isolated from the putative source individual and the putative recipient individual from the 2005 study by Frost et al.
-
Partition the tree into the source and recipient (here we include the transmission branch with the source sequences), for example as described here. For convenience, download a NEXUS file with the tree already partitioned.
-
Run
HYPHYMPorHYPHYMPI, selectEvolutionary Hypothesis Testingfrom the menu of analyses then selectUse a FEL method to test which sites in a gene may be associated with adaptation to a different environment.Alternatively, you can supply the path of the file as a command line argument, e.g. (by default/path/to/hyphylibshould be/usr/local/lib/hyphy)$HYPHYMP /path/to/hyphylib/TemplateBatchFiles/SelectionAnalyses/FEL-contrast.bf -
Select
Universal genetic code -
Input the path to the example file downloaded from the link above
-
Choose
SOURCEas the test set -
Select
Yesto include synonymous rate variation -
Input
0.1for the default p-value
The analysis will now run for a few minutes and output the following results
Branches to use as the test set in the FEL-contrast analysis#
Selected 24 branches to include in FEL calculations: 0564_7, 0564_11, 0564_4, Node6, 0564_1, 0564_21, 0564_5, Node11, Node9, Node5, 0564_17, Node4, 0564_13, 0564_15, Node16, 0564_22, 0564_6, Node20, 0564_3, Node19, Node15, Node3, 0564_9, Node2
Obtaining branch lengths and nucleotide substitution biases under the nucleotide GTR model#
- Log(L) = -5524.85, AIC-c = 11151.77 (51 estimated parameters)
Obtaining the global omega estimate based on relative GTR branch lengths and nucleotide substitution biases#
- Log(L) = -5436.84, AIC-c = 10991.98 (59 estimated parameters)
- non-synonymous/synonymous rate ratio for background = 0.9178
- non-synonymous/synonymous rate ratio for test = 0.8293
Improving branch lengths, nucleotide substitution biases, and global dN/dS ratios under a full codon model#
- Log(L) = -5436.29
- non-synonymous/synonymous rate ratio for background = 1.1136
- non-synonymous/synonymous rate ratio for test = 0.7748
For partition 1 these sites are significant at p <=0.1#
| Codon | alpha | beta-reference | beta-test | LRT | Difference detected? |
|---|---|---|---|---|---|
| 4 | 0.000 | 22.380 | 0.000 | 3.390 | Decr. p = 0.0656 |
| 52 | 0.000 | 20.982 | 0.000 | 3.384 | Decr. p = 0.0658 |
| 83 | 0.000 | 20.365 | 0.000 | 3.389 | Decr. p = 0.0656 |
| 118 | 0.000 | 17.179 | 0.000 | 3.404 | Decr. p = 0.0651 |
| 124 | 0.000 | 23.346 | 0.000 | 3.396 | Decr. p = 0.0653 |
| 155 | 0.000 | 0.000 | 64.943 | 5.045 | Incr. p = 0.0247 |
| 187 | 0.000 | 20.934 | 0.000 | 3.577 | Decr. p = 0.0586 |
| 218 | 0.000 | 20.825 | 0.000 | 3.519 | Decr. p = 0.0607 |
| 222 | 0.000 | 22.658 | 0.000 | 3.459 | Decr. p = 0.0629 |
| 224 | 0.000 | 25.874 | 0.000 | 3.681 | Decr. p = 0.0550 |
| 352 | 0.000 | 19.420 | 0.000 | 3.411 | Decr. p = 0.0648 |
| 386 | 0.000 | 20.334 | 0.000 | 3.387 | Decr. p = 0.0657 |
| 417 | 0.000 | 21.316 | 0.000 | 3.383 | Decr. p = 0.0659 |
| 455 | 0.000 | 22.010 | 0.000 | 3.398 | Decr. p = 0.0653 |
| 462 | 0.000 | 69.066 | 10.567 | 3.860 | Decr. p = 0.0494 |
| 466 | 0.000 | 55.142 | 0.000 | 7.562 | Decr. p = 0.0060 |
| 506 | 0.000 | 33.154 | 0.000 | 3.438 | Decr. p = 0.0637 |
| 526 | 0.000 | 50.810 | 5.313 | 3.351 | Decr. p = 0.0672 |
| 533 | 0.000 | 21.489 | 0.000 | 3.485 | Decr. p = 0.0619 |
| 598 | 0.000 | 18.103 | 0.000 | 3.392 | Decr. p = 0.0655 |
| 633 | 7.019 | 20.227 | 0.000 | 3.393 | Decr. p = 0.0655 |
| 748 | 0.000 | 36.773 | 0.000 | 6.388 | Decr. p = 0.0115 |
| 751 | 0.000 | 18.447 | 0.000 | 3.123 | Decr. p = 0.0772 |
| 762 | 0.000 | 18.868 | 0.000 | 3.402 | Decr. p = 0.0651 |
| 788 | 0.000 | 26.735 | 0.000 | 3.937 | Decr. p = 0.0472 |
| 820 | 0.000 | 56.371 | 0.000 | 9.657 | Decr. p = 0.0019 |
| 824 | 0.000 | 19.630 | 0.000 | 3.604 | Decr. p = 0.0576 |