Cluster information

by Costin Caramarcu last modified Mar 19, 2018 04:42 PM

The cluster consists of:

  • 124 worker nodes
  • 2 submit nodes
  • 2 master nodes

 

Worker node details:

  • HPE ProLiant XL190r Gen9
  • 2 Intel(R) Xeon(R) E5-2695 v4 CPUs @ 2.10GHz
  • NUMA node0 CPU(s): 0-8,18-26
  • NUMA node1 CPU(s): 9-17,27-35
  • Thread(s) per core: 1
  • Core(s) per socket: 18
  • Socket(s): 2
  • NUMA node(s): 2
  • 2 NVIDIA K80 or 2 NVIDIA P100 cards per node (each K80 card carries two GPUs, so 4 K80 or 2 P100 devices per node)
  • 256 GB Memory
  • InfiniBand EDR connectivity
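
The NUMA layout listed above can be turned into a lookup table, which may help when deciding which cores to pin a task to. The following is a minimal sketch: the CPU lists are copied from this page, while the names `parse_cpu_list`, `NUMA_NODES`, and `numa_node_of` are hypothetical helpers introduced only for illustration.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '0-8,18-26' into a sorted list of CPU ids."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return sorted(cpus)


# CPU lists copied from the worker node details on this page.
NUMA_NODES = {
    0: parse_cpu_list("0-8,18-26"),
    1: parse_cpu_list("9-17,27-35"),
}


def numa_node_of(cpu):
    """Return the NUMA node that owns the given CPU id."""
    for node, cpus in NUMA_NODES.items():
        if cpu in cpus:
            return node
    raise ValueError(f"CPU {cpu} is not present on this node type")


if __name__ == "__main__":
    for node, cpus in NUMA_NODES.items():
        print(f"node{node}: {len(cpus)} CPUs -> {cpus}")
```

Each NUMA node expands to 18 CPU ids, which matches the 18 cores per socket listed above. On Linux, a list such as `NUMA_NODES[0]` could be passed to `os.sched_setaffinity(0, ...)` to keep the current process on one socket, although whether that is appropriate depends on how the scheduler has already bound the job.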

 

Storage:

  • 1.9 TB of local disk storage per node
  • 1 PB of GPFS distributed storage

 

Partitions:

partition   time limit   allowed qos   default time   preempt mode   user availability
debug       30 minutes   normal        5 minutes      off            ic
long        24 hours     normal        5 minutes      off            ic
scavenger   6 hours      scavenger     5 minutes      cancel         ic
gen3        72 hours     gen3          2 hours        off            cfn only
gen4        72 hours     gen4          2 hours        off            cfn only
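
The limits in the partition table above can be checked before a job is submitted. The sketch below assumes the scheduler is Slurm, which the partition and QOS terminology suggests but this page does not state. The `--partition`, `--qos`, and `--time` options are standard `sbatch` options, while `PARTITIONS`, `format_walltime`, and `sbatch_args` are hypothetical names used only for illustration.

```python
from datetime import timedelta

# Limits copied from the partition table on this page.
PARTITIONS = {
    "debug":     {"limit": timedelta(minutes=30), "qos": "normal",    "default": timedelta(minutes=5)},
    "long":      {"limit": timedelta(hours=24),   "qos": "normal",    "default": timedelta(minutes=5)},
    "scavenger": {"limit": timedelta(hours=6),    "qos": "scavenger", "default": timedelta(minutes=5)},
    "gen3":      {"limit": timedelta(hours=72),   "qos": "gen3",      "default": timedelta(hours=2)},
    "gen4":      {"limit": timedelta(hours=72),   "qos": "gen4",      "default": timedelta(hours=2)},
}


def format_walltime(td):
    """Format a timedelta as days-hours:minutes:seconds."""
    total = int(td.total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


def sbatch_args(partition, walltime=None):
    """Build submit options, rejecting requests above the partition limit.

    If no walltime is given, the partition's default time is used.
    """
    if partition not in PARTITIONS:
        raise KeyError(f"unknown partition: {partition}")
    info = PARTITIONS[partition]
    if walltime is None:
        walltime = info["default"]
    if walltime > info["limit"]:
        raise ValueError(
            f"{format_walltime(walltime)} exceeds the {partition} limit "
            f"of {format_walltime(info['limit'])}"
        )
    return [
        f"--partition={partition}",
        f"--qos={info['qos']}",
        f"--time={format_walltime(walltime)}",
    ]


if __name__ == "__main__":
    print(" ".join(sbatch_args("long", timedelta(hours=12))))
```

Note that a job which does not request a time gets only the default time (5 minutes on debug, long, and scavenger), so an explicit request is usually needed for anything longer. Jobs in the scavenger partition run with preempt mode cancel, so they should be written to tolerate being cancelled.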