HyPhy - Hypothesis Testing using Phylogenies - Information, documentation, and news about the HyPhy software package

Using HyPhy#

There are many ways to use HyPhy:

Run HyPhy on our accompanying Datamonkey webserver#

This option is the easiest, supports most popular analyses, and does not require the use of the command line. Access Datamonkey here.

Run HyPhy from the command line#

This option is the most flexible approach through which you can access all available analyses and pipelines as well as customize your own HyPhy analyses. Follow these instructions for download and installation.
HyPhy can be run from the command line in two ways:

Standard command line tool: additional information available here
Interactive command line prompt: additional information available here

Run HyPhy locally (on your own computer) without the command line#

Two graphical user interfaces (GUI) applications are available for running HyPhy locally:

A new GUI application of HyPhy which is very similar to the datamonkey interface except jobs is run on your own machine rather than on the datamonkey server (only available on Mac). Download, installation, and usage tutorials available here.
A legacy GUI application of HyPhy which can run on Windows as well as Mac OS X (no longer developed, but still supporting many popular analyses). Follow these instructions for download and installation.

Use HyPhy on Galaxy#

Many of the HyPhy methods are available for use as tools on a Galaxy server at galaxy.hyphy.org. These tools can also be installed on any Galaxy instance from the Galaxy tool shed.

Use HyPhy from MEGA#

The popular MEGA software package has begun integrating HyPhy tools into the MEGA interface. The current release has a limited set of HyPhy tools but this is expected to be expanded in the future.

Use HyPhy for software/pipeline development#

Compile HyPhy as a library that can be accessed via Python, R, or other language bindings. Follow these instructions for download and installation.

Typical uses of HyPhy#

HyPhy ships with a library of standard analyses that implement ~100 different methods from start to finish. HyPhy is most commonly used for characterizing the evolutionary process, in particular:

Detecting signatures of selection
Estimating evolutionary rates
Comparing different evolutionary models
Fitting custom models to sequence alignments

Characterizing selective pressures#

HyPhy provides a suite of diverse phylogenetic methodologies for testing specific hypotheses about the selection in protein-coding and/or amino-acid multiple sequence alignments. Which method you select will depend on your specific question. Below we recommend several methods for different purposes, linked to more in-depth descriptions. Tutorials for using these methods are also available here.

Note that you may find it useful to perform pre-processing on your dataset, specifically by screening for recombination breakpoints using our GARD (Genetic Algorithm for Recombination Detection) method before proceeding to selection analysis.

Are individual sites subject to pervasive (across the whole phylogeny) positive or purifying selection?#

FEL (Fixed Effects Likelihood) is suitable for small-to-medium sized data sets.
SLAC (Single-Likelihood Ancestor Counting) is an approximate method with accuracy similar to FEL, but suitable for larger datasets. However, SLAC is not suitable for highly-diverged sequences.
FUBAR (Fast, Unconstrained Bayesian AppRoximation) is suitable for medium-to-large data sets and is expected to have more power than FEL for detecting pervasive selection at sites. FUBAR is the preferred approach for inferring pervasive selection.

Are individual sites subject to episodic (at a subset of branches) positive or purifying selection?#

MEME (Mixed Effects Model of Evolution) tests for episodic selection at individual sites. Note that MEME does not accept a priori branch specifications (this feature is being introduced with v2.3-dev and later). MEME is the preferred approach for detecting positive selection at individual sites.

Are individual branches subject to episodic (at a subset of sites) positive or purifying selection?#

aBSREL (adaptive Branch-Site Random Effects Likelihood) is an improved version of the common "branch-site" class of models. aBSREL allows either for a priori specification of branch(es) to test for selection or can test each lineage for selection in an exploratory fashion. Note that the exploratory approach will sacrifice power. aBSREL is the preferred approach for detecting positive selection at individual branches.

Has a gene experienced positive selection at any site on a particular branch or set of branches?#

BUSTED (Branch-Site Unrestricted Statistical Test for Episodic Diversification) will test for gene-wide selection at pre-specified lineages. This method is particularly useful for relatively small datasets (fewer than 10 taxa) where other methods may not have sufficient power to detect selection. This method is not suitable for identifying specific sites subject to positive selection.

Has gene-wide selection pressure been relaxed or intensified along with a certain subset of branches?#

RELAX tests for a relaxation (e.g. where purifying selection has become less stringent) or an intensification (e.g. where purifying selection has become stronger) of selection pressures along with a specified set of "test" branches. This method is not suitable for detecting positive selection.