CUC3 Nimbus User Notes

Nimbus is a cluster of 33 dual-processor Opteron servers. The CPUs are Opteron 248s (2.2 GHz clock speed) and each machine has 2 GB of RAM. They all run SuSE Linux 9.3.

Nimbus can only be used by sshing into the head node, whose external name is nimbus.ch.cam.ac.uk. Almost all work is done from there. In particular, passwords and shells should only be changed on the head node. Every node has a name on the cluster's internal network of the form comp??.ch.cam.ac.uk. The compute nodes can be logged into from the head node but not from outside. You should almost never need to log into a compute node.

Nimbus's firewall is configured only to allow logins from Chemistry machines.

Home directories are on a disk array attached to the head node. The /home filesystem is 200 GB in size and is shared between all nodes, so you see the same home directory wherever you (or your job) are on the machine. /home is backed up nightly and two weeks of incrementals are kept. There are quotas on /home: currently the soft limit is 6 GB and the hard limit is 8 GB.
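
As a rough way of keeping an eye on your usage against these limits, the sketch below adds up file sizes under a directory and compares the total with the soft and hard limits quoted above. It is an illustration only. The function names are invented for this example, and it assumes the limits are measured in units of 1024^3 bytes. The quota system may count usage differently (for example in disk blocks), so its own report is the authoritative one.

```python
import os

GIB = 1024 ** 3
SOFT_LIMIT = 6 * GIB  # soft limit from the notes; unit size is an assumption
HARD_LIMIT = 8 * GIB  # hard limit from the notes; unit size is an assumption


def directory_usage(path):
    """Return the total size in bytes of the regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so that targets outside the tree are not counted.
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                pass  # file disappeared or is unreadable; ignore it
    return total


def quota_status(used_bytes, soft=SOFT_LIMIT, hard=HARD_LIMIT):
    """Classify a usage figure against the soft and hard limits."""
    if used_bytes >= hard:
        return "at or over hard limit"
    if used_bytes > soft:
        return "over soft limit"
    return "within quota"
```

Passing the result of directory_usage(os.path.expanduser("~")) to quota_status would give an estimate for your own home directory.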

There is also a 900 GB shared filesystem called /sharedscratch in which you will have a directory. This is not backed up. At the moment I have no plans to purge it regularly, so please clean up old files when you're done with them. Please try to use /sharedscratch appropriately: it should be used for data that you could recreate in a reasonable time. Where you can, avoid writing to it over the network, i.e. from a running serial job.

Each node also has a local /scratch filesystem on which you will have your own directory. These filesystems are about 50 GB in size with no quota restriction, and are the most appropriate place for your jobs to write temporary files during a run when this is possible. They are local to each node and so considerably faster than the NFS-mounted /home and /sharedscratch. Please clean up files on /scratch when you are done with them; see the queueing documentation for how to find out which node's /scratch to look at for each job.
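
The pattern described above (write temporary files to local scratch, keep only what you need, then clean up) might look something like the following sketch. It is not a site-provided tool. The function name run_in_scratch is invented for this example, and you would need to pass in the path of your own scratch directory, whose exact location is not given here.

```python
import os
import shutil
import tempfile


def run_in_scratch(work, results_dir, scratch_dir):
    """Run work(tmpdir) in a private temporary directory on local scratch.

    scratch_dir is your own directory on the node's /scratch filesystem
    (ASSUMPTION: you supply the correct path yourself).
    If work() succeeds, the regular files it left in tmpdir are copied
    to results_dir. The temporary directory is always removed.
    """
    tmpdir = tempfile.mkdtemp(prefix="job_", dir=scratch_dir)
    try:
        work(tmpdir)
        os.makedirs(results_dir, exist_ok=True)
        for name in os.listdir(tmpdir):
            src = os.path.join(tmpdir, name)
            if os.path.isfile(src):
                shutil.copy2(src, os.path.join(results_dir, name))
    finally:
        # Clean up scratch whether or not the work succeeded.
        shutil.rmtree(tmpdir, ignore_errors=True)
```

Here work is any function that takes a directory name and writes its files there. Copying results back once at the end, rather than writing to /home or /sharedscratch throughout the run, keeps network traffic to a minimum.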

The following software is installed on all nodes in addition to a very minimal Linux installation:

  • Intel Fortran and C 8.x (ifort/icc) in 32 and 64 bit versions
  • Intel Fortran and C 9.0 (ifort/icc) in 32 and 64 bit versions
  • Intel Math Kernel Library 7.2.1 in 32 and 64 bit versions
  • Portland Group Fortran (pgf90, pgf77) and C compilers (pgcc) releases 6.1, 6.0 and 5.2 in 32 and 64 bit versions
  • GNU compilers (gcc, g77)
  • SCore parallel environment
  • LAM-MPI parallel environment
  • MPI-CH parallel environment

In order to manage all the combinations, the modules environment is installed. By default the 64-bit Portland 6.0-2, 64-bit ifort 8.1.026, and SCore modules are loaded, as these are likely to be the most popular.

The head node also has some extra software packages, such as popular editors, as it is intended for interactive work. If there is a package missing from the head node that you would like to use then please ask; it will probably be possible to install it provided it is a sensible size.

The parallel environments on the system are dealt with in a separate document. SCore is the default as it generally outperforms LAM.

All compute jobs should be run through the queueing system as there is an interactive CPU time limit on the nodes. The queueing system will run each job on a free compute node, copying the output back to a user-specified file at the end of the job. The only reason you would log into a compute node is to clean up old scratch files.

The queueing system is Torque with the Maui scheduler. Torque is almost identical to OpenPBS from the user's point of view, so the system will be familiar to anyone who has used the local Athens, Rama, Sword, or Destiny clusters. The main difference here is the queue names. Read the OpenPBS/Maui introduction for how to submit jobs, and Nimbus's queue setup to see what queues are available on this particular machine.
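
For orientation, the sketch below generates the text of a simple single-processor job script using standard Torque/PBS directives. It is only an illustration. The function name build_job_script and the default walltime are invented for this example, and no queue is set unless you supply one, because the queue names are specific to this machine. The documents mentioned above remain the authoritative guide to submitting jobs here.

```python
def build_job_script(job_name, command, walltime="01:00:00", queue=None):
    """Return the text of a single-processor Torque/PBS job script.

    queue is left unset by default; valid names are site-specific.
    """
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",           # job name
        f"#PBS -l walltime={walltime}",  # requested wall-clock time
        "#PBS -l nodes=1:ppn=1",         # one processor on one node
        "#PBS -j oe",                    # merge stderr into stdout
    ]
    if queue is not None:
        lines.append(f"#PBS -q {queue}")  # destination queue
    lines += [
        'cd "$PBS_O_WORKDIR"',  # directory from which qsub was run
        command,
    ]
    return "\n".join(lines) + "\n"
```

The returned text would be saved to a file on the head node and submitted with qsub. The status of the job can then be checked with qstat.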

Problems with Nimbus should be reported to <cen1001@cam.ac.uk> in the first instance.