Laboratory Research Computing - Systems

Overview

Lawrencium is Berkeley Lab's institutional computational resource for Berkeley Lab PIs and their collaborators. It consists of multiple cluster pools, each equipped with a high-performance, low-latency FDR or EDR Infiniband interconnect, providing a stable, high-performing resource for running a wide diversity of scientific applications. Currently, all pools are coupled with a Hitachi HNAS high-performance NFS storage server that provides home directory storage and a Data Direct Networks 1.8PB Lustre parallel filesystem that provides high-performance scratch space to users.

Compute Nodes
LR5 is the latest addition to the Lawrencium condo cluster, consisting of 144 ea. 28-core Broadwell compute nodes connected with a Mellanox FDR Infiniband fabric. Each node is a Dell PowerEdge C6320 server blade equipped with two 14-core Intel Xeon Broadwell processors (28 cores in all) on a single board configured as an SMP unit. The core frequency is 2.4GHz and supports 16 floating-point operations per clock period, giving a peak performance of 1,075 GFLOPS/node. Each node contains 64GB of 2400MHz memory.
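For reference, the quoted per-node peak figures follow from cores per node × core frequency × floating-point operations per clock period; for LR5 this works out to 28 × 2.4 GHz × 16 = 1,075 GFLOPS/node, and the LR4, LR3, and LR2 figures below are obtained the same way.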

LR4 consists of 108 ea. 24-core Haswell compute nodes connected with a Mellanox FDR Infiniband fabric. Each node is a Dell PowerEdge C6320 server blade equipped with two 12-core Intel Xeon Haswell processors (24 cores in all) on a single board configured as an SMP unit. The core frequency is 2.3GHz and supports 16 floating-point operations per clock period, giving a peak performance of 883 GFLOPS/node. Each node contains 64GB of 2133MHz memory.

LR3 consists of 172 16-core compute nodes and 36 20-core nodes connected with a Mellanox FDR Infiniband fabric. Each 16-core node is a Dell PowerEdge C6220 server blade equipped with two 8-core Intel Xeon Sandy Bridge processors (16 cores in all) on a single board configured as an SMP unit. The core frequency is 2.6GHz and supports 8 floating-point operations per clock period, giving a peak performance of 20.8 GFLOPS/core or 332 GFLOPS/node. Each node contains 64GB of 1600MHz memory. The newer 20-core C6220 nodes are similar, but use two 10-core Ivy Bridge processors running at 2.5GHz and have 64GB of 1866MHz memory.

LR2 is an additional pool of 170 compute nodes intended to augment LR1 to meet the increased computational demand from LBNL researchers. Nodes are a combination of IBM iDataPlex dx360 M2, HP SL390, and Dell C6100 servers, each equipped with two hex-core 64-bit Intel Xeon Westmere processors (12 cores per node) on a single board configured as an SMP unit. The core frequency is 2.66GHz and supports 4 floating-point operations per clock period, giving a peak performance of 10.64 GFLOPS/core or 128 GFLOPS/node. Each node contains 24GB of NUMA memory connected via triple QuickPath Interconnect (QPI) channels. LR2 has a peak performance rating of 20TF, similar to LR1's, but its better computational efficiency makes it much faster than LR1.

Lawrencium GPU: Also purchased as part of the LR4 procurement, the Lawrencium GPU nodes are intended for researchers using graphics processing units (GPUs) for computation or image processing. The pool consists of 4 ea. Finetec Computer Supermicro 1U nodes, each equipped with 2 ea. Intel Xeon 4-core 3.0GHz Haswell processors (8 cores total) and 2 ea. Nvidia K80 dual-GPU accelerator boards (4 GPUs total per node). Each GPU offers 2,496 NVIDIA CUDA cores.

Data Transfer (data-xfer)
The data transfer node (DATA-XFER) is an LRC server dedicated to performing transfers between data storage resources, such as the NFS Home Directory storage and the Global Scratch Lustre Parallel Filesystem, and remote storage resources at other sites. High-speed data transfers are facilitated by a direct connection to ESnet via LBL's Science DMZ network architecture. This server, and not the Login Nodes, should be used for large data transfers so as not to impact the interactive performance for other users on the Login Nodes. Users needing to transfer data to and from Lawrencium should ssh from the Login Nodes or their own computers to the Data Transfer Node, lrc-xfer.lbl.gov, before doing their transfers, or else connect directly to that node from file transfer software running on their own computers.
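For example, from a Login Node or your own computer, you would connect to the Data Transfer Node like this (the user name below is a placeholder; substitute your own LRC user name):

    ssh myusername@lrc-xfer.lbl.gov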

File Transfer Software
For smaller files you can use Secure Copy (SCP) or Secure FTP (SFTP) to transfer files between Lawrencium and another host. For larger files, BBCP and GridFTP provide better transfer rates. In addition, Globus Connect - a user-friendly, web-based tool that is especially well suited to large file transfers - makes GridFTP transfers trivial, so users do not have to learn command-line options for manual performance tuning. Globus Connect also does automatic performance tuning and has been shown to perform comparably to - or, in some cases, even better than - expert-tuned GridFTP. The Globus endpoint name for the LRC DTN server is lbnl#lrc.
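As an illustration, typical SCP transfers run from your own computer through the Data Transfer Node might look like the following; the user name and paths are placeholders, not actual Lawrencium paths:

    # copy a local file to your storage space on Lawrencium via the DTN
    scp results.tar.gz myusername@lrc-xfer.lbl.gov:/path/to/my/scratch/

    # pull a file from Lawrencium back to the current local directory
    scp myusername@lrc-xfer.lbl.gov:/path/to/my/scratch/results.tar.gz .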

Restrictions
In order to keep the data transfer nodes performing optimally for data transfers, we request that users restrict interactive use of these systems to tasks that are related to preparing data for transfer or are directly related to data transfer itself. Examples of intended usage would be running Python scripts to download data from a remote source, running client software to load data from a file system into a remote database, or compressing (gzip) or bundling (tar) files in preparation for data transfer. Examples of work that should not be done on the data transfer nodes include running a database on the server to do data processing, or running tests that saturate the nodes' resources. The goal is to maintain data transfer systems that have adequate memory and CPU available for interactive user data transfers.
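For instance, bundling and compressing a results directory into a single archive before transfer is an appropriate use of the data transfer node (the directory and archive names below are placeholders):

    # bundle and compress a directory in preparation for transfer
    tar -czf results.tar.gz results/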