Pathogen Genome Cluster Computing

Statistics and Computer Architecture

Maximum likelihood

Maximum likelihood searches a likelihood landscape for the solution with the highest (maximum) probability of producing the observed data under a predefined model. The approach is widespread in phylogenetics, where it is used to search among alternative tree structures, and this in turn is the bricks and mortar of phylogenomic analysis. Phylogenetics usually models DNA or amino acid point mutations, although we have also applied it to modeling the presence or absence of genes in DNA-DNA microarray data. Phylogenetic nucleotide models center on recovering information lost to reversion mutations, particularly at "neutral" sites. Elaborations on this basic model incorporate skews in purine and pyrimidine mutation rates, base homogeneity and site rate heterogeneity. Parameter-rich models add further processor demand to an approach that is already notoriously processor hungry, so applying maximum likelihood models across entire genomes results in highly computationally demanding calculations. Bayesian approaches use the same model-based framework to reconstruct a phylogenetic tree but obtain the solution via simulation, notably using Markov chain Monte Carlo.
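
As an illustration of the idea, the following is a minimal sketch of a maximum likelihood search under the simplest nucleotide model, Jukes-Cantor (JC69), which corrects an observed pairwise difference for mutations hidden by reversion. The model choice, the function names and the example counts (100 sites, 20 differences) are assumptions made for illustration and are not taken from our own analyses. The numerical search is a plain golden-section search, which is sufficient here because the likelihood has a single peak; real tree searches are far more demanding.

```python
import math

def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood (up to an additive constant) of a pairwise distance d,
    in substitutions per site, under the Jukes-Cantor model."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e   # probability that a site shows the same base
    p_diff = 0.75 - 0.75 * e   # probability that a site shows a different base
    return (n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff)

def golden_section_max(f, lo, hi, tol=1e-8):
    """Locate the maximum of a single-peaked function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d   # the peak lies in [a, d]
        else:
            a = c   # the peak lies in [c, b]
    return (a + b) / 2.0

# Hypothetical data: 20 differing sites out of 100 aligned sites.
n_sites, n_diff = 100, 20

# Numerical maximum likelihood estimate of the distance.
d_ml = golden_section_max(
    lambda d: jc69_log_likelihood(d, n_sites, n_diff), 1e-9, 5.0
)

# Closed-form JC69 estimate, valid when the observed proportion is below 0.75.
p_observed = n_diff / n_sites
d_closed = -0.75 * math.log(1.0 - 4.0 * p_observed / 3.0)

print(f"observed proportion of differences: {p_observed:.4f}")
print(f"ML distance (numerical search):     {d_ml:.6f}")
print(f"ML distance (closed form):          {d_closed:.6f}")
```

The estimated distance is larger than the raw proportion of differing sites, which is the "recovered" information: some sites have mutated more than once, or mutated back, and the model accounts for them. Richer models add parameters to the same search, which is where the processor cost grows.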

Stochastic Approaches

Stochastic simulations, or microsimulations, which model the behaviour of individuals, are very RAM intensive, and RAM is often the limiting factor of this approach. When the memory needed to hold the simulated population exceeds the RAM of a machine, the calculation freezes (as we know from bitter experience). A computer designed around simulation therefore usually has a smaller number of processors, each with a very large RAM capacity.
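
The memory constraint can be made concrete with a rough sketch. The example below checks whether a population fits in RAM before running a toy individual-based simulation. Everything in it is a hypothetical illustration: the figure of 64 bytes per individual, the 80% headroom, the 2 GiB machine and the simple infection rule are assumptions, not a description of our own simulations.

```python
import random
from array import array

def memory_needed(population, bytes_per_individual):
    """Rough RAM requirement: every individual is held in memory at once."""
    return population * bytes_per_individual

def fits_in_ram(population, bytes_per_individual, ram_bytes, headroom=0.8):
    """True if the population fits in the assumed usable fraction of RAM."""
    return memory_needed(population, bytes_per_individual) <= headroom * ram_bytes

def simulate(population, p_infect, steps, seed=0):
    """Toy microsimulation: each individual carries its own infection state."""
    rng = random.Random(seed)
    infected = array("b", [0]) * population   # one byte of state per individual
    infected[0] = 1                           # a single initial case
    for _ in range(steps):
        prevalence = sum(infected) / population
        for i in range(population):
            # Each susceptible individual is infected independently at random.
            if not infected[i] and rng.random() < p_infect * prevalence:
                infected[i] = 1
    return sum(infected)

RAM_BYTES = 2 * 1024**3        # assumed machine with 2 GiB of RAM
BYTES_PER_INDIVIDUAL = 64      # assumed size of one individual's record

for population in (1_000_000, 100_000_000):
    ok = fits_in_ram(population, BYTES_PER_INDIVIDUAL, RAM_BYTES)
    print(f"{population:>11,d} individuals fit in RAM: {ok}")

print("infected after 10 steps:", simulate(1000, 0.5, 10))
```

Because the requirement grows linearly with population size and with the amount of state kept per individual, a check of this kind before launching a run is a cheap way to avoid a frozen calculation.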

London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK | Tel: +44 (0) 20 7636 8636 | Comments and enquiries | Last updated 28th July, 2005 MWG.