Overview


The trend in high performance computing is towards the use of Linux clusters. Concurrently, there has been a growing interest in the use of Linux clusters for scientific research at Berkeley Lab. For many, a cluster assembled from inexpensive commodity off-the-shelf hardware and open source software promises to be a cost-effective way to obtain a high performance system.

Though many of the concepts are simple, it remains difficult for scientists to navigate a myriad of technologies in order to arrive at a cluster configuration that will meet their needs. Similarly, it is harder to efficiently manage a multi-node compute cluster than it is to do the same for a desktop workstation. Consequently, adopters of this technology have had to invest large amounts of effort to realize the full potential of their systems.

The Scientific Cluster Support (SCS) program was developed to address the difficulties of obtaining and running a Linux cluster system so that PIs can have access to a dedicated resource that provides the fast turnaround needed to facilitate scientific inquiry and development. The ultimate goals are to increase the overall use of scientific computing in Lab research projects and to promote parallel computing within the Berkeley Lab community.


Service Description

The High Performance Computing Services Group in the IT Division offers the following services for LBNL and UC Berkeley researchers who own or want to acquire a Linux cluster to meet their computational needs.

  • Pre-purchase consulting - understanding the customer's application; determining the cluster hardware architecture and interconnect; identifying required software
  • Procurement assistance - help with developing a budget and an RFP
  • Setup and configuration - installation and setup of the cluster hardware and networking; installation and configuration of cluster software, the scheduler, and application software
  • Ongoing systems administration and cyber security - operating system and cluster software maintenance and upgrades; security updates; monitoring of cluster nodes; user account management
  • Computer room space with networking and cooling - Clusters will be hosted in either the Bldg 50B-1275 or the Earl Warren Hall datacenter to ensure access to sufficient electrical, cooling, and networking infrastructure.

Clusters are hosted in a Supercluster infrastructure provided by HPC Services, consisting of a dedicated firewalled subnet; one-time-password authentication; multiple interactive login nodes; access to shared third-party compilers, scientific libraries, and applications; and shared home directory storage with a Lustre parallel filesystem. This Supercluster infrastructure is used by all the clusters within the datacenter to facilitate the movement of researchers across projects and the sharing of compute resources.


Requirements

Systems in the SCS or Berkeley Research Computing Program must meet the following requirements to be eligible for support:

  • Intel x86-64 architecture
  • Participating cluster must have a minimum of 8 compute nodes
  • Dedicated cluster architecture; no interactive logins on compute nodes
  • Scientific Linux 6.x operating system
  • Warewulf3 cluster implementation toolkit
  • SchedMD SLURM scheduler or Adaptive Computing Moab or Maui scheduler with Torque resource manager
  • OpenMPI message passing library
  • All slave nodes reachable only from the master node
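The scheduler and MPI requirements above define how users typically run work on a supported cluster. As a hedged illustration only, a minimal SLURM batch script for an OpenMPI job might look like the following; the job name, resource counts, and program name are placeholders, not actual SCS configuration:

```shell
#!/bin/bash
# Illustrative SLURM batch script; all values below are placeholders.
#SBATCH --job-name=example       # hypothetical job name
#SBATCH --nodes=4                # number of compute nodes requested
#SBATCH --ntasks-per-node=8      # MPI ranks per node
#SBATCH --time=01:00:00          # wall-clock limit

# OpenMPI's mpirun detects the SLURM allocation and launches one rank
# per task; ./my_mpi_program stands in for the user's application.
mpirun ./my_mpi_program
```

Such a script would be submitted from a login node with `sbatch`, and the scheduler would dispatch it to the cluster's compute nodes.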

Clusters that will be located in the datacenter must meet the following additional requirements:

  • Rack-mounted hardware is required; desktop form factor hardware is not allowed
  • Equipment will be installed into APC NetShelter AR3300 computer racks using two APC AP8867 PDUs per rack. Prospective cluster owners should include the cost of these racks and PDUs in their planning budget

Rates

Berkeley Lab has determined that it will cover the cost of the Lab's HPC infrastructure, including investment in the data center and the cost of maintaining HPC expertise. Therefore, PIs are charged only for the incremental effort of adding their cluster to our support pool. Pricing for PI-owned clusters can be calculated using the following rates, applied to the cluster's configuration.

  • Master node: $300/mo.
  • Infiniband support: $300/mo.
  • Myrinet support: $100/mo.
  • Storage node: $300/mo. per storage node
  • IBM GPFS or Lustre support: $300/mo.
  • Compute node: $15/mo. per compute node
  • 1275 Data Center Colocation: $100/mo. per rack

For example, support for a 20-node standalone cluster with an Infiniband interconnect would be priced at:
$300/mo. master + $300/mo. Infiniband support + (20 nodes x $15/mo./node) = $900/mo.
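To show how the published rates combine, the sketch below encodes them in a small Python helper. The rate names, function, and structure are hypothetical conveniences for illustration, not part of the SCS program:

```python
# Hypothetical pricing helper; rate names below mirror the published
# SCS rate list but are otherwise illustrative only.
RATES = {
    "master": 300,        # $/mo. per master node
    "infiniband": 300,    # $/mo. for Infiniband support
    "myrinet": 100,       # $/mo. for Myrinet support
    "storage_node": 300,  # $/mo. per storage node
    "parallel_fs": 300,   # $/mo. for IBM GPFS or Lustre support
    "compute_node": 15,   # $/mo. per compute node
    "colocation": 100,    # $/mo. per rack in the 1275 data center
}

def monthly_cost(compute_nodes, masters=1, **options):
    """Sum the master fee, per-compute-node fee, and any selected options.

    options maps rate names to counts, e.g. infiniband=1 or colocation=2.
    """
    total = masters * RATES["master"] + compute_nodes * RATES["compute_node"]
    for name, count in options.items():
        total += count * RATES[name]
    return total

# The 20-node Infiniband example from the text:
print(monthly_cost(20, infiniband=1))  # 300 + 300 + 20*15 = 900
```

Running the helper on the worked example above reproduces the $900/mo. figure.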

Backups for storage servers are available from the IT Backups Group and are priced separately.

PIs should check the SCS Service Level Agreement for a full description of the program provisions and requirements.