Pathogen Genome Cluster Computing

History and Itinerary

Information for Users

This section could be expanded and moved to a separate web page in due course.

Following a recent Ganglia upgrade, the load activity of each slave node is no longer displayed automatically. Click on the total cluster load diagram pictured above to display the load activity of each slave node.

Things I just never got round to doing

  • MPI needs activating. Normally I write my own parallel processing routines, but several users now need MPI for freeware packages.
  • Mail server needs fixing. The machine is not an email server, although it is handy to send yourself an email when a job is done.

History of Bugs and Fixes

  • August - September, 2004. Machine shipped
  • September - October, 2004. Hardware setup and provisional software setup
  • October, 2004. Sun Grid Engine functioning
  • October, 2004. DHCP leakage caused the machine to be removed from the network
  • October, 2004. DHCP leakage fixed using a firewall rule patch
  • December, 2004 - June, 2005. Continual software updates and installations
  • May, 2005. Firewall tightened using a layered structure; server access is informally called 'Fort Knox'
  • May, 2005. Internet access given to all slave nodes for core software updates outside NAT
  • June, 2005. Prioritized queuing system implemented
  • July, 2005. Ganglia update reversed due to incompatibility with current system

Software Itinerary

Two categories of bioinformatics software are available: restricted access software, which either serves the specialist needs of individual users or is under a commercial licence, and core installations.

Restricted Access Software

  • PAUP - comprehensive phylogenetics package (licensed to M Gaunt)
  • GAMESS - protein modeling software (licensed to D Warhurst)
  • Structure - Bayesian population genetics software
  • Various population genetics packages
  • OligoArrayAux - hybridization prediction
  • LDhat - linkage disequilibrium software

Core Installations

  • EMBOSS - generic bioinformatics package, a good resource for building pipelines
  • BioPerl - brilliant for building pipelines
  • NCBI Blast - currently installed; WU-Blast2 will be installed
  • ClustalW - alignment program
  • PAML - comprehensive phylogenetics package
  • MrBayes - Bayesian tree building software
  • Phylip - generic phylogenetic package
  • Many specialist phylogenetic freeware packages such as the archive developed by the Oxford group


Software for Epidemiologists

We do not currently host epidemiological software because it has traditionally been built around Windows GUI-based programs, which are hard to run on a UNIX cluster. Regrettably, we cannot run the following programs.

      • WinBUGS - popular epidemiological package for Bayesian analysis by MCMC (Gibbs sampling), driven through a Windows GUI. Admittedly, calculations of this kind are ideally suited to clusters, but no-can-do here.

      • ModGen - extremely cool high-level language for microsimulations. I have heard that it can be run on Linux, so it is probably portable, but for the most part it is Windows-only.

      We can run any Java-based software with complete ease (hint to Richard White).

We could almost certainly run Stata and SAS on the cluster with the relevant licences, but no one has yet requested this. However, industrial-scale microsimulations should ideally run on machines with much more RAM.


London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK | Tel: +44 (0) 20 7636 8636

Last updated 28th July, 2005 MWG.