NCEP CCS Conversion Guide
Changes to NCEP Production LoadLeveler Configuration
Changes to NCEP Production LoadLeveler Queues

On asp/bsp, production uses two LoadLeveler queues:

Class Name:      prod
Nodes in Class:  250 nodes
Typical Use:     Largely used for multiple processor/multiple node jobs

Class Name:      2
Nodes in Class:  6 nodes
Typical Use:     Used for single processor/serial jobs

On frost/snow, production still uses two queues, but the serial job queue has been renamed:

Class Name:      prod
Nodes in Class:  152 nodes
Typical Use:     Largely used for multiple processor/multiple node jobs

Class Name:      prodser
Nodes in Class:  4 nodes
Typical Use:     Used for single processor/serial jobs
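
Because the serial queue was renamed, any command file that selects the old serial class needs its class directive updated. The following is a minimal sketch of how that could be scripted, not the procedure NCEP actually used. The function name rename_class is hypothetical, and the sketch assumes the class is selected with a single "# @ class = NAME" line written in the same style as the other directives in this guide. The old class name is passed in by the caller rather than hard-coded.

```python
import re

def rename_class(text, old_name, new_name):
    """Rewrite '# @ class = old_name' directives so they name new_name.

    Hypothetical helper: assumes one class directive per line and an
    exact match on the class name (so 'prod' does not match 'prodser').
    """
    pattern = re.compile(
        r'^([ \t]*#[ \t]*@[ \t]*class[ \t]*=[ \t]*)'
        + re.escape(old_name)
        + r'[ \t]*$',
        re.MULTILINE,
    )
    # Keep the original directive prefix and swap only the class name.
    return pattern.sub(lambda m: m.group(1) + new_name, text)
```

Lines that name any other class, such as prod, are left unchanged.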
 

Changes to NCEP Production LoadLeveler Command Files

LoadLeveler is configured essentially the same way on frost/snow as on asp/bsp.  Two minor differences required us to change every LoadLeveler command file in production.  First, the "Feature" tags of prod, dev and beta are no longer used on frost/snow.  Second, each job must now specify that its communications be striped across both switch "planes", since each node (logical partition) has two switch adapters.

On asp/bsp, the "Feature" tags of prod, dev and beta were available on each node.  This let us set preferences and requirements so that certain jobs ran on certain nodes of the system.  On frost/snow, we will not use this mechanism to run prod and dev jobs on the same machine.  Instead, the gang scheduler within LoadLeveler will give us the ability to perform preemptive scheduling.  Therefore, every occurrence of the lines listed below was removed from the production LoadLeveler command files.

Removed from command files:
# @ requirements = Feature == "prod"
# @ preferences = Feature == "prod"
# @ requirements = Feature != "beta"
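
The removal listed above could be automated along the following lines. This is a sketch under stated assumptions rather than the tool that was actually used: the function name strip_feature_lines is hypothetical, and the pattern only matches lines whose entire expression is a single Feature test on prod, dev or beta. A line that combines a Feature test with other requirements is deliberately left alone so that it can be reviewed by hand.

```python
import re

# Matches a directive whose whole expression is one Feature test, e.g.
#   # @ requirements = Feature == "prod"
FEATURE_LINE = re.compile(
    r'^[ \t]*#[ \t]*@[ \t]*(requirements|preferences)[ \t]*=[ \t]*'
    r'\(?[ \t]*Feature[ \t]*(==|!=)[ \t]*"(prod|dev|beta)"[ \t]*\)?[ \t]*$'
)

def strip_feature_lines(text):
    """Return text with Feature-only requirements/preferences lines removed."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        # Compare without the line ending so the pattern sees only the directive.
        if not FEATURE_LINE.match(line.rstrip('\r\n'))
    ]
    return ''.join(kept)
```

All other directives and the script body pass through unchanged.
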

The frost/snow nodes have two switch adapters per logical partition (node).  On asp/bsp, each node had only one switch adapter.  When jobs run on the frost/snow nodes, their switch communications traffic can be striped across the two adapters automatically.  It is possible for a job to run successfully on a frost/snow node with one switch adapter down.  In fact, when a job is running and a switch adapter goes down, LoadLeveler will drain the node, letting the current jobs finish.  To take advantage of this striping across switch "planes", we were required to modify the "network.MPI" parameter to use "csss" instead of "css0".

Modification to command files:
Changed
#@ network.MPI = css0,shared,us
to
#@ network.MPI = csss,shared,us
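
The css0 to csss change can be sketched in the same way. Again, this is an illustration and not the conversion script that was actually run: the function name use_striped_adapter is hypothetical, and the pattern assumes the adapter name is the first field of a network.MPI directive that sits on a single line.

```python
import re

# Captures everything before and after the adapter name so that only
# 'css0' itself is replaced, e.g.  #@ network.MPI = css0,shared,us
NETWORK_MPI = re.compile(
    r'^([ \t]*#[ \t]*@[ \t]*network\.MPI[ \t]*=[ \t]*)css0([ \t]*,.*)$',
    re.MULTILINE,
)

def use_striped_adapter(text):
    """Switch network.MPI directives from css0 to the striped csss adapter."""
    return NETWORK_MPI.sub(r'\1csss\2', text)
```

The remaining fields of the directive (shared,us in the example above) are carried over untouched.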

 