8 Hardware
Apt can allocate experiments on any one of several federated clusters.
8.1 Apt Cluster
This is the cluster that is currently used by default for all experiments on Apt.
The main Apt cluster is housed in the University of Utah’s Downtown Data Center in Salt Lake City, Utah. It contains two classes of nodes:
r320 | 128 nodes (Sandy Bridge, 8 cores)
CPU | 1 x Xeon E5-2450 processor (8 cores, 2.1 GHz)
RAM | 16GB Memory (4 x 2GB RDIMMs, 1.6 GHz)
Disks | 4 x 500GB 7.2K SATA Drives (RAID5)
NIC | 1GbE Dual port embedded NIC (Broadcom)
NIC | 1 x Mellanox MX354A Dual port FDR CX3 adapter w/1 x QSA adapter

c6220 | 64 nodes (Ivy Bridge, 16 cores)
CPU | 2 x Xeon E5-2650v2 processors (8 cores each, 2.6 GHz)
RAM | 64GB Memory (8 x 8GB DDR-3 RDIMMs, 1.86 GHz)
Disks | 2 x 1TB SATA 3.5” 7.2K rpm hard drives
NIC | 4 x 1GbE embedded Ethernet Ports (Broadcom)
NIC | 1 x Intel X520 PCIe Dual port 10Gb Ethernet NIC
NIC | 1 x Mellanox FDR CX3 Single port mezz card
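The node specifications above can be summed to get the overall size of the Apt cluster. The following is a minimal sketch of that arithmetic; the `NodeClass` type and the helper function names are hypothetical and exist only for this illustration, while the counts are copied from the specifications above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeClass:
    """One class of node in the Apt cluster (illustrative type, not an Apt API)."""
    name: str
    count: int           # number of nodes of this class
    cores_per_node: int  # CPU cores per node
    ram_gb: int          # memory per node, in GB


# Figures copied from the r320 and c6220 specifications.
APT_NODE_CLASSES = [
    NodeClass("r320", count=128, cores_per_node=8, ram_gb=16),
    NodeClass("c6220", count=64, cores_per_node=16, ram_gb=64),
]


def total_nodes(classes):
    """Total number of nodes across all classes."""
    return sum(c.count for c in classes)


def total_cores(classes):
    """Total number of CPU cores across all classes."""
    return sum(c.count * c.cores_per_node for c in classes)


def total_ram_gb(classes):
    """Total memory across all classes, in GB."""
    return sum(c.count * c.ram_gb for c in classes)


if __name__ == "__main__":
    print("nodes:", total_nodes(APT_NODE_CLASSES))
    print("cores:", total_cores(APT_NODE_CLASSES))
    print("RAM (GB):", total_ram_gb(APT_NODE_CLASSES))
```

Both node classes contribute the same number of cores (128 x 8 and 64 x 16), so an experiment that needs many cores per node can use the c6220 nodes without drawing on a smaller share of the cluster's total core count.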
All nodes are connected to three networks with one interface each:
A 1 Gbps Ethernet “control network”: this network is used for remote access, experiment management, etc., and is connected to the public Internet. When you log in to nodes in your experiment using ssh, this is the network you are using. You should not use this network as part of the experiments you run in Apt.
A “flexible fabric” that can run up to 56 Gbps and runs either FDR InfiniBand or Ethernet. This fabric uses NICs and switches with Mellanox’s VPI technology. This means that we can, on demand, configure each port to be either FDR InfiniBand or 40 Gbps (or even non-standard 56 Gbps) Ethernet. This fabric consists of seven edge switches (Mellanox SX6036G) with 28 connected nodes each. There are two core switches (also SX6036G), and each edge switch connects to both cores with a 3.5:1 blocking factor. This fabric is ideal if you need very low latency, InfiniBand, or a few high-bandwidth Ethernet links.
A 10 Gbps Ethernet “commodity fabric”. On the r320 nodes, a port on the Mellanox NIC (permanently set to Ethernet mode) is used to connect to this fabric; on the c6220 nodes, a dedicated Intel 10 Gbps NIC is used. This fabric is built from two Dell Z9000 switches, each of which has 96 nodes connected to it. It is ideal for creating large LANs: each of the two switches has full bisection bandwidth for its 96 ports, and there is a 3.5:1 blocking factor between the two switches.
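To make the blocking factors above concrete, the following sketch works through the arithmetic for the flexible fabric. It assumes that a blocking factor is simply the ratio of node-facing links to uplinks and that all links on a switch run at the same speed; neither assumption is stated in this document, and the function names are hypothetical.

```python
from fractions import Fraction


def blocking_factor(downlinks, uplinks):
    """Ratio of node-facing links to uplinks (assumes equal link speeds)."""
    return Fraction(downlinks, uplinks)


def implied_uplinks(downlinks, factor):
    """Number of equal-speed uplinks implied by a stated blocking factor."""
    return Fraction(downlinks) / Fraction(factor)


# Flexible fabric figures from the text: 7 edge switches, 28 nodes each,
# and a stated 3.5:1 blocking factor toward the two core switches.
EDGE_SWITCHES = 7
NODES_PER_EDGE = 28
STATED_FACTOR = Fraction(7, 2)  # 3.5:1

# Commodity fabric figures from the text: 2 switches, 96 nodes each.
COMMODITY_SWITCHES = 2
NODES_PER_COMMODITY_SWITCH = 96

if __name__ == "__main__":
    uplinks = implied_uplinks(NODES_PER_EDGE, STATED_FACTOR)
    print("implied uplinks per edge switch:", uplinks)
    print("check:", blocking_factor(NODES_PER_EDGE, int(uplinks)))
    print("flexible fabric node ports:", EDGE_SWITCHES * NODES_PER_EDGE)
    print("commodity fabric node ports:",
          COMMODITY_SWITCHES * NODES_PER_COMMODITY_SWITCH)
```

Under these assumptions, traffic that stays within one edge switch of the flexible fabric sees no oversubscription, while traffic that crosses the core shares uplink capacity at the stated ratio. Experiments that are sensitive to bandwidth may therefore benefit from keeping their nodes on as few edge switches as possible.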
There is no remote dataset capability at the Apt cluster.
8.2 CloudLab Utah
This cluster is part of CloudLab, but is also available to Apt users.
The CloudLab cluster at the University of Utah is being built in partnership with HP. The first phase of this cluster consists of 315 64-bit ARM servers with 8 cores each, for a total of 2,520 cores. The servers are built on HP’s Moonshot platform using X-GENE system-on-chip designs from Applied Micro. The cluster is hosted in the University of Utah’s Downtown Data Center in Salt Lake City.
More technical details can be found at https://www.aptlab.net/hardware.php#utah
m400 | 315 nodes (64-bit ARM)
CPU | Eight 64-bit ARMv8 (Atlas/A57) cores at 2.4 GHz (APM X-GENE)
RAM | 64GB ECC Memory (8 x 8GB DDR3-1600 SO-DIMMs)
Disk | 120GB of flash (SATA3 / M.2, Micron M500)
NIC | Dual-port Mellanox ConnectX-3 10 Gb NIC (PCIe v3.0, 8 lanes)
There are 45 nodes in a chassis, and this cluster consists of seven chassis. Each chassis has two 45XGc switches; each node is connected to both switches, and each chassis switch has four 40 Gbps uplinks, for a total of 320 Gbps of uplink capacity from each chassis. One switch is used for control traffic, connecting to the Internet, etc. The other is used to build experiment topologies, and should be used for most experimental purposes.
All chassis are interconnected through a large HP FlexFabric 12910 switch which has full bisection bandwidth internally.
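The chassis figures above can be checked with a short calculation. This is a sketch only: the node, core, and uplink counts come from the text, while the per-switch oversubscription ratio is derived here under the assumption that each node port runs at 10 Gbps; that ratio is not a figure published in this document.

```python
from fractions import Fraction

# Figures from the text.
CHASSIS = 7
NODES_PER_CHASSIS = 45
CORES_PER_NODE = 8
SWITCHES_PER_CHASSIS = 2
UPLINKS_PER_SWITCH = 4
UPLINK_GBPS = 40

# Assumption for this sketch: each node connects to each chassis switch
# with one 10 Gbps port of its dual-port NIC.
NODE_PORT_GBPS = 10


def total_nodes():
    """Nodes in the whole cluster."""
    return CHASSIS * NODES_PER_CHASSIS


def total_cores():
    """Cores in the whole cluster."""
    return total_nodes() * CORES_PER_NODE


def chassis_uplink_gbps():
    """Uplink capacity of one chassis, summed over both switches."""
    return SWITCHES_PER_CHASSIS * UPLINKS_PER_SWITCH * UPLINK_GBPS


def switch_oversubscription():
    """Node-facing bandwidth divided by uplink bandwidth for one switch."""
    down = NODES_PER_CHASSIS * NODE_PORT_GBPS
    up = UPLINKS_PER_SWITCH * UPLINK_GBPS
    return Fraction(down, up)


if __name__ == "__main__":
    print("nodes:", total_nodes())
    print("cores:", total_cores())
    print("uplink per chassis (Gbps):", chassis_uplink_gbps())
    print("oversubscription per switch:", float(switch_oversubscription()))
```

Because only one of the two chassis switches carries experiment traffic, the capacity available to an experiment leaving a chassis is that of a single switch's uplinks rather than the combined per-chassis total.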
We have plans to enable some users to allocate entire chassis; when allocated in this mode, it will be possible to have complete administrator control over the switches in addition to the nodes.