Frequently Asked Questions
This page is a work in progress. Current status of the page: being populated.
What is HyPhy?
- HyPhy stands for "Hypothesis testing using Phylogenies". It is a software package designed around the fundamental objects of phylogenetics: the alignment, the tree, and the likelihood function. This design is based on the premise that any analysis of genetic sequences must take place within a phylogenetic framework, because all sequences are the products of molecular evolution.
- HyPhy is designed to be extensively customizable. To make this possible, it interprets its own scripting language. Historically, scripting languages were also known as "batch languages"; we maintain this naming convention and call ours the HyPhy Batch Language. In a nutshell, a batch language allows you to combine a sequence of commands, loop over parts of that sequence, and have alternative command chains that are conditional on the state of the process.
- HyPhy is also designed to be user-friendly when your analytical needs are conventional. There is a complete graphical user interface (GUI) that displays sequence alignments, trees, and likelihood functions in interactive windows, where you can click on the components of each object to inspect and, in many cases, reset their values.
- HyPhy is a mature package: development began in 1997 and the first release took place in 2001. However, this also means that some components of the package are getting a little long in the tooth. With that in mind, the Development Team has started a major redesign effort to make the package easier to use and simpler to integrate into pipelines and workflows, to write extensive documentation (starting with this Wiki), and to streamline the core features of the program. Please bear with us while the changes are being put in place!
What can I do in HyPhy?
- Some of the more popular analyses in HyPhy include:
- Detection of natural selection (diversifying, purifying, or directional)
- Detection of recombination
- Detection of co-evolving residues in protein sequences
- Genomic and multiple-gene evolutionary inference
- Molecular clock and relative rate tests
- Nucleotide, protein and codon model selection
- Use as a likelihood analysis engine for other software and web services
- Many of these popular analyses are also made available as web applications (Datamonkey), so that you can upload your data and select a method to run on our public high-performance computing cluster.
- With the batch language, the complexity of an analysis that you want to carry out on sequence data is essentially unlimited. (Or for practical purposes, limited by the power of your hardware and your patience!) In other words, you can implement a customized analysis, either by modifying one of the template batch files that are distributed along with HyPhy, or by writing one from scratch.
How do I get a copy of HyPhy?
- Go to the main page of this wiki by clicking on the logo in the top-left corner, and go to the Downloads page.
- If that's too complicated, then just click on this link.
Why do I have to register to download HyPhy?
- Generally, we don't get paid to write scientific software. We give away the software for free. There are only two benefits to continuing to develop and maintain HyPhy:
- We use it extensively for our own scientific research.
- We can apply for grants from various agencies.
- Our success at the latter is highly contingent on being able to prove that there is a large and active user community. Hence, you are doing us a big favour by filling out the user registration, so please do it!
Common hiccups in using HyPhy
Why can't I select my files?
- If you are unable to select a file (such as a sequence alignment), for example because the file appears "faded out" in the Open File dialog window, there is a simple workaround.
- This is a known issue affecting the graphical user interface on Mac OS X. Without getting into too much detail, it has to do with how OS X handles file types. You need to select "All Documents" in the Enable drop-down menu instead of the default setting, "All Readable Documents".
Who develops HyPhy?
- The majority of the source code was written by Sergei Kosakovsky Pond, initially based on programs written by Spencer V. Muse. Art F. Y. Poon made major contributions to the source code in the areas of machine learning (stochastic grammars, Bayesian networks).
- The entire code base is presently being refactored and documented. Much of this work is being carried out by Steven Weaver.
How do I cite HyPhy?
- Sergei L. Kosakovsky Pond, Simon D. W. Frost and Spencer V. Muse (2005) HyPhy: hypothesis testing using phylogenies. Bioinformatics 21(5): 676-679
- Wayne Delport, Art F. Poon, Simon D. W. Frost and Sergei L. Kosakovsky Pond (2010). Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics, 29 July 2010 [Epub ahead of print; PMID: 20671151]
- Sergei L. Kosakovsky Pond and Simon D. W. Frost (2005). Datamonkey: rapid detection of selective pressure on individual sites of codon alignments. Bioinformatics 21(10): 2531-2533
Specific methods implemented in HyPhy
- Selection detection (SLAC/FEL/REL) - Sergei L. Kosakovsky Pond and Simon D. W. Frost (2005) Not So Different After All: A Comparison of Methods for Detecting Amino Acid Sites Under Selection. Molecular Biology and Evolution 22(5): 1208-1222
- Internal Fixed Effects Likelihood (IFEL) - Sergei L Kosakovsky Pond, Simon DW Frost, Zehava Grossman, Michael B Gravenor, Douglas D Richman and Andrew J Leigh Brown (2006). Adaptation to different human populations by HIV-1 revealed by codon-based analyses. PLoS Computational Biology 2(6): e62
- TOGGLE - Wayne Delport, Konrad Scheffler and Cathal Seoighe (2008). Frequent Toggling between Alternative Amino Acids Is Driven by Selection in HIV-1. PLoS Pathogens 4(12): e1000242.
- Directional Evolution in Protein Sequences (DEPS) - Sergei L Kosakovsky Pond, Art FY Poon, Andrew J Leigh Brown and Simon Frost (2008). A Maximum Likelihood Method for Detecting Directional Evolution in Protein Sequences and Its Application to Influenza A Virus. Molecular Biology and Evolution 25(9): 1809-1824
- PARRIS - Konrad Scheffler, Darren P. Martin and Cathal Seoighe (2006). Robust inference of positive selection from recombining coding sequences. Bioinformatics 22(20): 2493-2499
- GA-Branch - S.L. Kosakovsky Pond and S.D.W. Frost (2005). A Genetic Algorithm Approach to Detecting Lineage-specific Variation in Selection Pressure. Molecular Biology and Evolution 22(3): 478-485
- Evolutionary Selection Distance (ESD) - Sergei L Kosakovsky Pond, Konrad Scheffler, Michael B Gravenor, Art FY Poon and Simon DW Frost (2009). Evolutionary Fingerprinting of Genes. Molecular Biology and Evolution 27(3): 520-536
- Spidermonkey/BGM - Art Poon, Fraser Lewis, Sergei Kosakovsky Pond and Simon Frost (2007). An evolutionary-network model reveals stratified interactions in the V3 loop of the HIV-1 envelope. PLoS Computational Biology 3(11): e231
- Codon Model Selection (CMS) - Wayne Delport, Konrad Scheffler, Gordon Botha, Michael B Gravenor, Spencer V. Muse and Sergei L Kosakovsky Pond (2010) CodonTest: modeling amino-acid substitution preferences in coding sequences. PLoS Computational Biology 6(8): e1000885
- Branch-site REL - Sergei L. Kosakovsky Pond, Ben Murrell, Mathieu Fourment, Simon D. W. Frost, Wayne Delport and Konrad Scheffler (2011). A random effects branch-site model for detecting episodic diversifying selection. Molecular Biology and Evolution (first published online June 13, 2011, doi:10.1093/molbev/msr125)
- MEME - Ben Murrell, Joel O. Wertheim, Sasha Moola, Thomas Weighill, Konrad Scheffler and Sergei L. Kosakovsky Pond (2012). Detecting Individual Sites Subject to Episodic Diversifying Selection. PLoS Genetics 8(7): e1002764
- SBP/GARD - Sergei L Kosakovsky Pond, David Posada, Michael B Gravenor, Christopher H Woelk and Simon DW Frost (2006). Automated Phylogenetic Detection of Recombination Using a Genetic Algorithm. Molecular Biology and Evolution 23(10): 1891-1901
- SCUEAL - Sergei L Kosakovsky Pond, David Posada, Eric Stawiski, Colombe Chappey, Art FY Poon, Gareth Hughes, Esther Fearnhill, Mike B Gravenor, Andrew J Leigh Brown and Simon DW Frost (2009). An Evolutionary Model-Based Algorithm for Accurate Phylogenetic Breakpoint Mapping and Subtype Prediction in HIV-1. PLoS Computational Biology 5(11): e1000581
- Ancestral Sequence Reconstruction (ASR) (joint) - Tal Pupko, Itsik Pe'er, Ron Shamir and Dan Graur (2000). A Fast Algorithm for Joint Reconstruction of Ancestral Amino Acid Sequences. Molecular Biology and Evolution 17: 890-896
- ASR (marginal) - Z Yang, S Kumar and M Nei (1995). A New Method of Inference of Ancestral Nucleotide and Amino Acid Sequences. Genetics 141: 1641-1650
- ASR (sampled) - Rasmus Nielsen (2002). Mapping mutations on phylogenies. Systematic Biology 51(5): 729-739