GPGPU adaptation of the likelihood computing engine
From HyPhy Wiki
Adding OpenCL support for the tree pruning algorithm with non-nucleotide alphabets
- Currently, HyPhy creates a caching structure for the likelihood function (LF) while it is being optimized (a document describing this structure is to follow). For the OpenCL branch, most of this structure would have to be created and maintained on the device.
- Transition matrix computation is usually an insignificant part of the run time (typically <5%), so it can still be done on the host. Each branch of the tree has its own N x N transition matrix, where N is the number of characters (e.g. 61 for codons under the Universal code). These matrices need to be communicated to the device efficiently. Also, note that not all matrices change between LF evaluations: at least one will, but any number between 1 and the branch count is possible.
- In the 2010 practice example (add link), all branches were "touched" for every evaluation. In the general case, only a subset of them may need to be processed. This subset (in post-order traversal) will be computed on and supplied by the host.
- The kernel will populate likelihood caches for each internal node/site and either
  - return the vector of values (plus scaling factors) to the host, or
  - collate the results (i.e. sum them) on the device and return a single number (plus the cumulative scaling factor) to the host.
- There are two modes of calculation (to be elaborated on later):
  - Many branches change (the initial focus), e.g. during gradient descent or when the optimizer updates global variables.
  - One branch changes many times in a row, e.g. during branch length optimization. In this case, the tree can be rerooted to place the branch of interest at the root, which significantly simplifies the calculation.
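
The "many branches change" mode can be illustrated with a small host-side sketch in plain Python. It is not HyPhy code and not an OpenCL kernel: all names (`dirty_internal_nodes`, `update_caches`, `log_likelihood`, `two_state_matrix`) are hypothetical, the toy alphabet has 2 states rather than 61, and the scaling scheme (divide each cached vector by its maximum and store the log of that maximum) is an assumption chosen for simplicity. It shows how the host could derive the post-order subset of stale internal nodes from the list of changed branches, refresh only those caches, and collate per-site results into a single number.

```python
import math

def two_state_matrix(t):
    """Transition matrix of a symmetric 2-state model for branch length t."""
    same = 0.5 + 0.5 * math.exp(-2.0 * t)
    return [[same, 1.0 - same], [1.0 - same, same]]

def dirty_internal_nodes(parent, changed_branches):
    """Internal nodes with stale caches, in post-order.
    Assumes nodes are numbered so that every child index < its parent index;
    a branch is identified by the node at its lower end."""
    dirty = set()
    for node in changed_branches:
        p = parent[node]
        while p is not None and p not in dirty:  # ancestors of a dirty node are already dirty
            dirty.add(p)
            p = parent[p]
    return sorted(dirty)

def update_caches(children, P, cache, log_scale, nodes):
    """Recompute per-site conditional likelihood vectors for the given nodes.
    Assumes every site has non-zero likelihood, so the scaler is positive."""
    for node in nodes:
        for site in range(len(cache[node])):
            vec = [1.0] * len(cache[node][site])
            for child in children[node]:
                child_vec = cache[child][site]
                for s, row in enumerate(P[child]):
                    vec[s] *= sum(p * v for p, v in zip(row, child_vec))
            top = max(vec)
            cache[node][site] = [v / top for v in vec]  # rescale to avoid underflow
            log_scale[node][site] = math.log(top)       # remember the factor removed

def log_likelihood(root, pi, cache, log_scale):
    """Collate per-site values into one number, adding back the scaling factors."""
    total = 0.0
    for site, root_vec in enumerate(cache[root]):
        site_lik = sum(f * v for f, v in zip(pi, root_vec))
        total += math.log(site_lik) + sum(ls[site] for ls in log_scale)
    return total

# Toy tree ((0,1)3,2)4: leaves 0..2, internal node 3, root 4.
parent = [3, 3, 4, 4, None]
children = {3: [0, 1], 4: [3, 2]}
n_states, n_sites = 2, 2
pi = [0.5, 0.5]
leaf_states = {0: [0, 1], 1: [0, 0], 2: [1, 0]}  # observed state per site

cache = [None] * 5
for leaf, states in leaf_states.items():
    cache[leaf] = [[1.0 if s == x else 0.0 for s in range(n_states)] for x in states]
for node in children:
    cache[node] = [[0.0] * n_states for _ in range(n_sites)]
log_scale = [[0.0] * n_sites for _ in range(5)]
P = {b: two_state_matrix(0.1 * (b + 1)) for b in range(4)}  # one matrix per branch

update_caches(children, P, cache, log_scale, [3, 4])  # first evaluation touches everything
ll_before = log_likelihood(4, pi, cache, log_scale)

P[2] = two_state_matrix(0.7)                # host changes a single branch
stale = dirty_internal_nodes(parent, [2])   # only the root cache is stale
update_caches(children, P, cache, log_scale, stale)
ll_after = log_likelihood(4, pi, cache, log_scale)
```

In a device implementation, the loops over sites and states in `update_caches` would presumably map onto work-items, while `dirty_internal_nodes` would stay on the host and only the changed matrices plus the node list would be transferred per evaluation.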