Pathogen Genome Cluster Computing


What is a cluster?


©2005 Photo by Anne Koerber, LSHTM. Side view of cluster, front view of Michael Gaunt

Hardware setup

A cluster is simply a stack of dual-processor machines, usually mounted in a rack (pictured). The stack is configured so that one machine is designated the head node and acts as a server to all the other machines, technically known as slave nodes, via an internal network. The head node is connected to the external network, so we can log into it, whilst it controls and monitors its own network of slave nodes.

The cluster pictured is a Mac G5 system comprising a head node, 15 slave nodes (16 dual-processor machines, i.e. 32 processors in total) and two power packs. The big advantage of the hardware is that each machine uses 64-bit processors and the software architecture is built on UNIX (Darwin flavour).

Software setup

"The digital revolution is far more significant than the invention of writing or even of printing."

Douglas Engelbart

The Mac G5 cluster is managed using Sun Grid Engine (SGE), which spreads the workload evenly across the cluster by restricting each node to a maximum of two separate calculations, or jobs. When the cluster is full to capacity, additional jobs are held in a queue.
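The slot-and-queue behaviour described above can be illustrated with a small Python sketch. This is only a toy model, not SGE's actual scheduling algorithm: it assumes that jobs run on the 15 slave nodes only (giving 30 job slots) and that each new job goes to the least-loaded node. The node names and the functions schedule and finish are hypothetical, introduced purely for illustration.

```python
from collections import deque

SLAVE_NODES = 15      # from the text: 15 slave nodes
SLOTS_PER_NODE = 2    # from the text: at most 2 jobs per node


def schedule(job_ids, nodes=SLAVE_NODES, slots=SLOTS_PER_NODE):
    """Place each job on the least-loaded node; queue whatever does not fit.

    Assumption: only slave nodes run jobs, and "least loaded" is our own
    placement rule, not necessarily the one SGE applies.
    """
    # Zero-padded hypothetical node names so that sorting by name is numeric.
    running = {f"node{n:02d}": [] for n in range(1, nodes + 1)}
    queue = deque()
    for job in job_ids:
        # Pick the node running the fewest jobs (ties go to the lowest name).
        target = min(running, key=lambda name: (len(running[name]), name))
        if len(running[target]) < slots:
            running[target].append(job)
        else:
            queue.append(job)  # cluster is full: the job waits in the queue
    return running, queue


def finish(job, running, queue):
    """Mark a job as finished and start the next queued job in its slot."""
    for name, jobs in running.items():
        if job in jobs:
            jobs.remove(job)
            if queue:
                jobs.append(queue.popleft())
            return name
    raise ValueError(f"job {job!r} is not running")


if __name__ == "__main__":
    running, queue = schedule(range(1, 36))  # 35 hypothetical jobs
    in_use = sum(len(jobs) for jobs in running.values())
    print(in_use, "jobs running,", len(queue), "jobs queued")
    node = finish(1, running, queue)
    print("first queued job started on", node)
```

With 35 jobs submitted, the sketch fills all 30 slots and holds the remaining 5 jobs in the queue; as soon as one running job finishes, the job at the front of the queue takes over the freed slot.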

The example below shows two sequential genomic-scale calculations spread across the 15 slave nodes, as displayed by Ganglia under the iNquiry software system.

Key: Grayscale bars above represent processor activity. Total processor activity is shown in the graph at the top left-hand corner, and processor activity for 12 of the 15 slave nodes is shown in the central graphs. In all cases the y-axis denotes the number of processors and the x-axis the time on the 24-hour clock.
London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK | Tel: +44 (0) 20 7636 8636

Comments and enquiries Last updated 28th July, 2005 MWG.