

4. Resources

This chapter describes the system resources that LSF tracks and how to use LSF resource specifications.

Introduction to Resources

A computer may be thought of as a collection of resources used to execute programs. Different applications often require different resources. For example, a number crunching program may take a lot of CPU power, but a large spreadsheet may need a lot of memory to run well. Some applications may run only on machines of a specific type, and not on others. To run applications as efficiently as possible, the LSF system needs to take these factors into account.

In LSF, resources are handled by naming them and tracking information relevant to them. LSF schedules work according to each application's resource requirements and the resources available on individual hosts. LSF classifies resources in several ways.

Classification by Availability

general resources
These are resources that are available on all hosts, e.g. all the load indices, number of processors on a host, total swap space, host status.
special resources
These are resources that are only associated with some hosts, e.g. FileServer, aix, solaris, SYSV.

Classification by the Way Values Change

dynamic resources
These are resources that change their values dynamically, e.g. all the load indices, host status.
static resources
These are resources whose values do not change. All resources except the load indices and host status are static.

Classification by Types of Values

numerical resources
These are resources that take numerical values, e.g. all the load indices, number of processors on a host, host CPU factor.
string resources
These are resources that take string values, e.g. host type, host model, host status.
Boolean resources
These are resources that denote the availability of specific features, e.g. hspice, FileServer, SYSV, aix.

Classification by Definition

configured resources
These are resources defined by user sites, such as external indices and resources defined in the lsf.shared file, e.g. FileServer, fddi.
built-in resources
These are resources that are always defined by LIM, e.g. load indices, number of CPUs, total swap space.

Classification by Location

host-based resources
These are resources that are not shared among hosts, but are tied to individual hosts. An application must run on a particular host to access such resources, e.g. CPU, memory (using up memory on one host does not affect the available memory on another host), swap space.
shared resources
These are resources that are not associated with individual hosts in the same way, but are "owned" by the entire cluster, or a subset of hosts within the cluster. An application can access such a resource from any host which is configured to share it, but doing so affects its value as seen by other hosts, e.g. floating licenses, shared file systems.

Resource names are case sensitive, and can be up to 29 characters in length (excluding some characters reserved as operators in resource requirement strings). You can list the resources available in your cluster using the lsinfo command.

Load Indices

Load indices measure the availability of dynamic, non-shared resources on hosts in the LSF cluster. Load indices built into the LIM are updated at fixed time intervals. External load indices are updated when new values are received from the external load collection program, ELIM, configured by the LSF administrator. Load indices are numeric in value.

Table 1 summarizes the load indices collected by the LIM.

Table 1. Load Indices

Index   Measures                                             Units                            Direction   Averaged over  Update interval

status  host status                                          string                                                      15 seconds
r15s    run queue length                                     processes                        increasing  15 seconds     15 seconds
r1m     run queue length                                     processes                        increasing  1 minute       15 seconds
r15m    run queue length                                     processes                        increasing  15 minutes     15 seconds
ut      CPU utilisation                                      percent                          increasing  1 minute       15 seconds
pg      paging activity                                      pages in + pages out per second  increasing  1 minute       15 seconds
ls      logins                                               users                            increasing  N/A            30 seconds
it      idle time                                            minutes                          decreasing  N/A            30 seconds
swp     available swap space                                 megabytes                        decreasing  N/A            15 seconds
mem     available memory                                     megabytes                        decreasing  N/A            15 seconds
tmp     available space in temporary filesystem [1]          megabytes                        decreasing  N/A            120 seconds
io      disk I/O (shown by lsload -l)                        kilobytes per second             increasing  1 minute       15 seconds
name    external load index configured by LSF administrator  site defined

[1] Directory C:\temp on NT and /tmp on UNIX.

The status index is a string indicating the current status of the host. This status applies to the LIM and RES. The possible values for status are:

ok
The LIM can select the host for remote execution
busy
A load index exceeds the threshold defined by the LSF administrator; the LIM will not select the host for interactive jobs
lockU
The host is locked by a user or the LSF administrator
lockW
The host's availability time window is closed
unavail
The LIM on the host is not responding
unlicensed
The host does not have a valid LSF license.

If the LIM is available but the RES server is not responding, status begins with a `-'.

Here is an example of the output from lsload:

% lsload
HOST_NAME  status  r15s  r1m  r15m  ut   pg   ls  it  tmp  swp   mem
hostN      ok      0.0   0.0  0.1   1%   0.0  1   224 43M  67M   3M
hostK      -ok     0.0   0.0  0.0   3%   0.0  3   0   38M  40M   7M
hostG      busy    *6.2  6.9  9.5   85%  1.1  30  0   5M   400M  385M
hostF      busy    0.1   0.1  0.3   7%   *17  6   0   9M   23M   28M
hostV      unavail

The r15s, r1m and r15m load indices are the 15-second, 1-minute and 15-minute average CPU run queue lengths. This is the average number of processes ready to use the CPU during the given interval.

Note

Run queue length indices are not necessarily the same as the load averages printed by the uptime(1) command; uptime load averages on some platforms also include processes that are in short term wait states (such as paging or disk I/O).

On multiprocessor systems more than one process can execute at a time. LSF scales the run queue value on multiprocessor systems to make the CPU load of uniprocessors and multiprocessors comparable. The scaled value is called the effective run queue length. The -E option of lsload shows the effective run queue length.

LSF also adjusts the CPU run queue based on the relative speeds of the processors (the CPU factor). The normalized run queue length is adjusted for both the number of processors and the CPU speed. The host with the lowest normalized run queue length will run a CPU-intensive job the fastest. The -N option of lsload shows the normalized CPU run queue lengths.

The ut index measures CPU utilization, which is the percentage of time spent running system and user code. A host with no process running has a ut value of 0 percent; a host on which the CPU is completely busy has a ut of 100 percent.

The pg index gives the virtual memory paging rate in pages per second. This index is closely tied to the amount of available RAM and the total size of the processes running on a host; if there is not enough RAM to satisfy all processes, the paging rate will be high. Paging rate is a good measure of how a machine will respond to interactive use; a machine that is paging heavily feels very slow.

The paging rate is reported in units of pages rather than kilobytes, because the relationship between interactive response and paging rate is largely independent of the page size.

The ls index gives the number of users logged in. Each user is counted once, no matter how many times they have logged into the host.

The it index is the interactive idle time of the host, in minutes. Idle time is measured from the last input or output on a directly attached terminal or a network pseudo-terminal supporting a login session.

Note

This does not include activity directly through the X server such as CAD applications or emacs windows, except on SunOS 4, Solaris, and HP-UX systems.

The tmp index is the space available on the file system that contains the /tmp (UNIX) or the C:\temp (NT) directory in megabytes.

The swp index gives the currently available swap space in megabytes. This represents the largest process that can be started on the host.

The mem index is an estimate of the real memory currently available to user processes. This represents the approximate size of the largest process that could be started on a host without causing the host to start paging. This is an approximation because the virtual memory behaviour of operating systems varies from system to system and is hard to predict.

The io index is only displayed with the -l option to lsload. This index measures I/O throughput to disks attached directly to this host, in kilobytes per second. It does not include I/O to disks that are mounted from other hosts.

External load indices are defined by the LSF administrator. The lsinfo command lists the external load indices and the lsload -l command displays the values of all load indices. If you need more information about the external load indices defined at your site, contact your LSF administrator.

Static Resources

Static resources represent host information that does not change over time, such as the maximum RAM available to user processes and the number of processors in a machine. Most static resources are determined by the LIM at start-up time. Table 2 lists the static resources reported by the LIM.

Table 2. Static Resources

Index   Measures                                           Units             Determined by

type    host type                                          string            configuration
model   host model                                         string            configuration
hname   host name                                          string            configuration
cpuf    CPU factor                                         relative          configuration
server  host can run remote jobs                           Boolean           configuration
rexpri  execution priority (UNIX only)                     nice(2) argument  configuration
ncpus   number of processors                               processors        LIM
ndisks  number of local disks                              disks             LIM
maxmem  maximum RAM memory available to users              megabytes         LIM
maxswp  maximum available swap space                       megabytes         LIM
maxtmp  maximum available space in temporary file system   megabytes         LIM

The type and model static resources are strings specifying the host type and model.

The CPU factor is the speed of the host's CPU relative to other hosts in the cluster. If one processor is twice the speed of another, its CPU factor should be twice as large. The CPU factors are defined by the LSF administrator. For multiprocessor hosts the CPU factor is the speed of a single processor; LSF automatically scales the host CPU load to account for additional processors.

The server static resource is Boolean; its value is 1 if the host is configured to execute tasks from other hosts, and 0 if the host is a client.

Static resources can be used to select appropriate hosts for particular jobs based on binary architecture, relative CPU speed, and system configuration.

Shared Resources

A shared resource is a resource that is not tied to a specific host, but is associated with the entire cluster, or a specific subset of hosts within the cluster. Examples of shared resources include floating licenses and shared file systems.

An application may use a shared resource by running on any host from which that resource is accessible. For example, in a cluster in which each host has a local disk but can also access a disk on a file server, the disk on the file server is a shared resource, and the local disk is a host-based resource. There will be one value for the entire cluster which measures the utilization of the shared resource, but each host-based resource is measured separately.

LSF does not contain any built-in shared resources. All shared resources must be configured by the LSF administrator. A shared resource may be configured to be dynamic or static. In the above example, the total space on the shared disk may be static while the amount of space currently free is dynamic. A site may also configure the shared resource to report numeric, string or Boolean values.

Viewing Shared Resources

To view the shared resources in the cluster, use the -s option of the lshosts, lsload, and bhosts commands. For example, suppose a cluster consists of two hosts, each of which has access to a total of five floating licenses for a particular package. They also access a scratch directory, containing 500MB of disk space, from a file server. The LSF administrator has set the resource definitions as shown in Table 3.

Table 3. Example of Shared Resources

Resource Name  Describes

tot_lic        Total number of licenses in cluster
tot_scratch    Total amount of space in shared scratch directory (in MB)
avail_lic      Currently available number of licenses
avail_scratch  Currently available space in shared scratch dir (in MB)

The output of lshosts -s could be:

% lshosts -s
RESOURCE     VALUE  LOCATION
tot_lic      5      host1 host2
tot_scratch  500    host1 host2

The "VALUE" field indicates the amount of that resource. The "LOCATION" column shows the hosts which share this resource. The information displayed by lshosts(1) is static, meaning that the value will not change over time. lsload -s displays the information about shared resources which are dynamic:

% lsload -s
RESOURCE       VALUE  LOCATION
avail_lic      2      host1 host2
avail_scratch  100    host1 host2

The above output indicates that 2 licenses are available, and that the shared scratch directory currently has 100MB of free space.

Under LSF Batch, shared resources may be viewed using bhosts -s:

% bhosts -s
RESOURCE       TOTAL  RESERVED  LOCATION
tot_lic        5      0.0       host1 host2
tot_scratch    500    0.0       host1 host2
avail_lic      2      3.0       host1 host2
avail_scratch  100    400.0     host1 host2

The "TOTAL" column gives the value of the resource. For dynamic resources, the "RESERVED" column displays the amount that has been reserved by running jobs.
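If you want to use these listings in a script, the three-column layout printed by lshosts -s and lsload -s is easy to split on whitespace. The following Python sketch is only an illustration: the function name parse_shared_resources is hypothetical, the sample text is copied from the example above rather than captured from a live cluster, and it assumes every value is numeric (a site may also configure string or Boolean shared resources). It does not handle the four-column bhosts -s layout.

```python
def parse_shared_resources(text):
    """Parse 'RESOURCE VALUE LOCATION' rows into {name: (value, [hosts])}.

    Hypothetical helper; assumes numeric values and whitespace-separated columns.
    """
    resources = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and an echoed "% lsload -s" prompt line.
        if not fields or fields[0] in ("RESOURCE", "%"):
            continue
        name, value, hosts = fields[0], fields[1], fields[2:]
        resources[name] = (float(value), hosts)
    return resources


# Sample text taken from the lsload -s example in this section.
sample = """\
% lsload -s
RESOURCE       VALUE  LOCATION
avail_lic      2      host1 host2
avail_scratch  100    host1 host2
"""

shared = parse_shared_resources(sample)
for name, (value, hosts) in shared.items():
    print(name, value, ",".join(hosts))
```

Every host listed in the LOCATION column sees the same value, so one dictionary entry per resource is enough.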

Boolean Resources

Boolean resource names are used to describe features that may be available only on some machines in a cluster, for example a host's role in the cluster, its operating system, or the software available on it.

Any characteristics or attributes of certain hosts that can be useful in selecting hosts for remote jobs may be configured as Boolean resources. Specifying a Boolean resource in the resource requirements of a job limits the set of computers that can execute the job. Table 4 lists some examples of Boolean resources.

Table 4. Examples of Boolean Resources

Resource Name  Describes           Meaning of Example Name

cs             role in cluster     compute server
fs             role in cluster     file server
solaris        operating system    Solaris operating system
frame          available software  FrameMaker license

Listing Resources

The lsinfo command lists all the resources configured in the LSF cluster. See `Displaying Available Resources' on page 14 for an example of the lsinfo command. The lsinfo -l option gives more detail about each index:

% lsinfo -l r1m
RESOURCE_NAME:  r1m
DESCRIPTION: 1-minute CPU run queue length (alias: cpu)
TYPE     ORDER  INTERVAL  BUILTIN  DYNAMIC
Numeric  Inc    15        Yes      Yes

Resource Requirement Strings

A resource requirement string describes the resources a job needs. LSF uses resource requirements to select hosts for remote execution and job execution.

A resource requirement string is divided into four sections: a selection section, an ordering section, a resource usage section, and a job spanning section.

The selection section specifies the criteria for selecting hosts from the system. The ordering section indicates how the hosts that meet the selection criteria should be sorted. The resource usage section specifies the expected resource consumption of the task. The job spanning section indicates if a (parallel) batch job should span across multiple hosts.

The syntax of a resource requirement expression is:

select[selectstring] order[orderstring] rusage[usagestring] span[spanstring]

The section names are select, order, rusage, and span. The syntax for each of selectstring, orderstring, usagestring, and spanstring is defined below.

Note

The square brackets are an essential part of the resource requirement expression.

Depending on the command, one or more of these sections may apply. The lshosts command only selects hosts, but does not order them. The lsload command selects and orders hosts, while lsplace uses the information in select, order, and rusage sections to select an appropriate host for a task. The lsloadadj command uses the rusage section to determine how the load information should be adjusted on a host, while bsub uses all four sections. Sections that do not apply for a command are ignored.

If no section name is given, then the entire string is treated as a selection string. The select keyword may be omitted if the selection string is the first string in the resource requirement.

Selection String

The selection string specifies the characteristics a host must have to match the resource requirement. It is a logical expression built from a set of resource names. The lsinfo command lists all the resource names and their descriptions. The resource names swap, idle, logins, and cpu are accepted as aliases for swp, it, ls, and r1m respectively.

The selection string can combine resource names with logical and arithmetic operators. Non-zero arithmetic values are treated as logical TRUE, and zero as logical FALSE. Boolean resources (for example, server to denote LSF server hosts) have a value of one if they are defined for a host, and zero otherwise.

Table 5 shows the operators that can be used in selection strings. The operators are listed in order of decreasing precedence.

Table 5. Operators in Resource Requirements

Syntax    Meaning

-a        Negative of a
!a        Logical not: 1 if a==0, 0 otherwise

a * b     Multiply a and b
a / b     Divide a by b

a + b     Add a and b
a - b     Subtract b from a

a > b     1 if a is greater than b, 0 otherwise
a < b     1 if a is less than b, 0 otherwise
a >= b    1 if a is greater than or equal to b, 0 otherwise
a <= b    1 if a is less than or equal to b, 0 otherwise

a == b    1 if a is equal to b, 0 otherwise
a != b    1 if a is not equal to b, 0 otherwise

a && b    Logical AND: 1 if both a and b are non-zero, 0 otherwise

a || b    Logical OR: 1 if either a or b is non-zero, 0 otherwise

The selection string is evaluated for each host; if the result is non-zero, then that host is selected. For example:

select[(swp > 50 && type == MIPS) || (swp > 35 && type == ALPHA)]
select[((2*r15s + 3*r1m + r15m) / 6 < 1.0) && !fs && (cpuf > 4.0)]
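To make the evaluation rule concrete, the two example selection strings can be written out by hand as Python predicates. This is a sketch of the semantics only, not of how the LIM parses or evaluates requirement strings. The host names and load values are invented for illustration, and each host is modelled as a plain dictionary.

```python
def select_first(host):
    # select[(swp > 50 && type == MIPS) || (swp > 35 && type == ALPHA)]
    return int((host["swp"] > 50 and host["type"] == "MIPS")
               or (host["swp"] > 35 and host["type"] == "ALPHA"))


def select_second(host):
    # select[((2*r15s + 3*r1m + r15m) / 6 < 1.0) && !fs && (cpuf > 4.0)]
    weighted = (2 * host["r15s"] + 3 * host["r1m"] + host["r15m"]) / 6
    # A Boolean resource such as fs is 1 if defined for the host, 0 otherwise.
    return int(weighted < 1.0 and not host["fs"] and host["cpuf"] > 4.0)


# Hypothetical hosts; the values are made up for this example.
hosts = [
    {"name": "hostA", "type": "MIPS", "swp": 60,
     "r15s": 0.2, "r1m": 0.3, "r15m": 0.4, "fs": 0, "cpuf": 5.0},
    {"name": "hostB", "type": "ALPHA", "swp": 40,
     "r15s": 3.0, "r1m": 2.5, "r15m": 2.0, "fs": 0, "cpuf": 8.0},
    {"name": "hostC", "type": "ALPHA", "swp": 30,
     "r15s": 0.0, "r1m": 0.0, "r15m": 0.0, "fs": 1, "cpuf": 6.0},
]

# A host is selected when the expression evaluates to a non-zero value.
first = [h["name"] for h in hosts if select_first(h)]
second = [h["name"] for h in hosts if select_second(h)]
print(first)   # hostA and hostB satisfy the first string
print(second)  # only hostA satisfies the second string
```

In the second string, hostB is rejected because its weighted run queue length is above 1.0, and hostC is rejected because it is a file server.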

For the string resources type and model, the special value any selects any value and local selects the same value as that of the local host. For example, type==local selects hosts of the same type as the host submitting the job. If a job can run on any type of host, include type==any in the resource requirements. If no type is specified, the default depends on the command. For lshosts, lsload, lsmon and lslogin the default is type==any. For lsplace, lsrun, lsgrun, and bsub the default is type==local unless a model or Boolean resource is specified, in which case it is type==any.

Order String

The order string allows the selected hosts to be sorted according to the values of resources. The syntax of the order string is

[-]res[:[-]res]...

Each res must be a dynamic load index; that is, one of the indices r15s, r1m, r15m, ut, pg, io, ls, it, tmp, swp, mem, or an external load index defined by the LSF administrator. For example, swp:r1m:tmp:r15s is a valid order string.

Note

The values of r15s, r1m, and r15m used for sorting are the normalized load indices returned by lsload -N (see `Load Indices' on page 37).

The order string is used for host sorting and selection. The ordering begins with the rightmost index in the order string and proceeds from right to left. The hosts are sorted into order based on each load index, and if more hosts are available than were requested, the LIM drops the least desirable hosts according to that index. The remaining hosts are then sorted by the next index.

After the hosts are sorted by the leftmost index in the order string, the final phase of sorting orders the hosts according to their status, with hosts that are currently not available for load sharing (that is, not in the ok state) listed at the end.

Because the hosts are resorted for each load index, only the host status and the leftmost index in the order string actually affect the order in which hosts are listed. The other indices are only used to drop undesirable hosts from the list.
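The right-to-left procedure described above can be sketched in Python as follows. This is a simplified model, not the LIM's implementation. In particular, the text does not say how many hosts are dropped on each pass, so the keep_ratio parameter is purely an assumption made for this sketch, as are the function name sort_hosts and the sample hosts. The default sort directions are taken from Table 1.

```python
import math

# Indices whose Table 1 direction is "increasing": a smaller value is better.
# All other indices in this sketch are treated as "decreasing" (larger is better).
INCREASING = {"r15s", "r1m", "r15m", "ut", "pg", "io", "ls"}


def sort_hosts(hosts, order, wanted, keep_ratio=0.5):
    """Sort hosts by an order string such as 'swp:r1m'.

    keep_ratio is a hypothetical knob: the source does not specify how many
    hosts are dropped per pass, only that the least desirable ones are.
    """
    candidates = list(hosts)
    # Start with the rightmost index and work towards the leftmost.
    for term in reversed(order.split(":")):
        reversed_by_user = term.startswith("-")
        name = term.lstrip("-")
        best_is_small = name in INCREASING
        # Best-first by default; a leading '-' flips the order to worst-first.
        descending = (not best_is_small) != reversed_by_user
        candidates.sort(key=lambda h: h[name], reverse=descending)
        if len(candidates) > wanted:
            keep = max(wanted, math.ceil(len(candidates) * keep_ratio))
            candidates = candidates[:keep]
    # Final phase: hosts that are not ok go to the end (the sort is stable).
    candidates.sort(key=lambda h: h["status"] != "ok")
    return candidates


# Hypothetical load values for four hosts.
hosts = [
    {"name": "hostA", "status": "ok", "r1m": 0.1, "swp": 100},
    {"name": "hostB", "status": "ok", "r1m": 0.5, "swp": 300},
    {"name": "hostC", "status": "ok", "r1m": 2.0, "swp": 500},
    {"name": "hostD", "status": "busy", "r1m": 0.2, "swp": 50},
]

narrow = [h["name"] for h in sort_hosts(hosts, "swp:r1m", wanted=2)]
wide = [h["name"] for h in sort_hosts(hosts, "swp:r1m", wanted=2, keep_ratio=0.75)]
print(narrow)
print(wide)
```

In both calls r1m is only used to discard the most heavily loaded hosts; the final listing is ordered by swp (the leftmost index) and then by status, which matches the behaviour described above.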

When sorting is done on each index, the direction in which the hosts are sorted (increasing vs decreasing values) is determined by the default order returned by lsinfo for that index. This direction is chosen such that after sorting, the hosts are ordered from best to worst on that index.

When an index name is preceded by a minus sign `-', the sorting order is reversed so that hosts are ordered from worst to best on that index.

The default sorting order is r1m:pg (except for lslogin(1): ls:r1m).

Resource Usage String

This string defines the expected resource usage of the task. It is used to specify resource reservations for LSF Batch jobs, or for mapping tasks onto hosts and adjusting the load when running interactive jobs.

LSF Batch Jobs

For LSF Batch jobs, the resource usage section is used along with the queue configuration parameter RES_REQ (see `Scheduling Conditions' on page 65). External indices are also considered in the resource usage string.

The syntax of the resource usage string is

res=value[:res=value]...[:res=value][:duration=value][:decay=value]

The res parameter can be any load index. The value parameter is the initial reserved amount. If res or value is not given, the default is not to reserve that resource.

The duration parameter is the time period within which the specified resources should be reserved. It is specified in minutes by default. If the value is followed by the letter 'h', it is specified in hours. For example, 'duration=30' and 'duration=2h' specify a duration of 30 minutes and two hours respectively. If duration is not specified, the default is to reserve the total amount for the lifetime of the job.

The decay parameter indicates how the reserved amount should decrease over the duration. A value of 1, 'decay=1', indicates that the system should linearly decrease the amount reserved over the duration. The default decay value is 0, which causes the total amount to be reserved for the entire duration. Values other than 0 or 1 are unsupported. If duration is not specified, decay is ignored.

rusage[mem=50:duration=100:decay=1]

The above example indicates that 50MB of memory should be reserved for the job. As the job runs, the amount reserved will decrease at approximately 0.5 megabytes per minute until the 100 minutes are up.
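The arithmetic behind duration and decay can be sketched in a few lines of Python. This is an illustration under stated assumptions, not LSF Batch's implementation: the function names are hypothetical, the linear formula is one reading of "linearly decrease", and the sketch assumes that nothing remains reserved once the duration has elapsed.

```python
def parse_duration(text):
    """Return a duration in minutes; a trailing 'h' means hours."""
    if text.endswith("h"):
        return float(text[:-1]) * 60
    return float(text)


def reserved_amount(value, elapsed, duration=None, decay=0):
    """Amount still reserved after `elapsed` minutes of run time.

    Assumptions of this sketch: the reservation drops to zero once the
    duration is over, and decay=1 means a straight-line decrease to zero.
    """
    if duration is None:
        # No duration: the full amount is reserved for the lifetime of the
        # job, and decay is ignored.
        return value
    if elapsed >= duration:
        return 0.0
    if decay == 1:
        return value * (1 - elapsed / duration)
    return value


# rusage[mem=50:duration=100:decay=1]
for minute in (0, 50, 100):
    print(minute, reserved_amount(50, minute, duration=100, decay=1))
```

With mem=50 and duration=100 the reserved amount falls by 50/100 = 0.5 megabytes per minute, which is the rate quoted above.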

LSF Base Jobs

Resource reservation is only available for LSF Batch. If you run jobs using LSF Base, such as through lsrun, the LIM uses resource usage to determine the placement of jobs. The LIM's placement is limited in comparison to LSF Batch in that the LIM does not track when an application finishes. Resource usage requests are used to temporarily increase the load so that a host is not overloaded. When the LIM gives placement advice, external load indices are not considered in the resource usage string. In this case, the syntax of the resource usage string is

res[=value]:res[=value]: ... :res[=value]

The res is one of the resources whose value is returned by the lsload command.

rusage[r1m=0.5:mem=20:swp=40]

The above example indicates that the task is expected to increase the 1-minute run queue length by 0.5, consume 20 Mbytes of memory and 40 Mbytes of swap space.

If no value is specified, the task is assumed to be intensive in using that resource. In this case no more than one task will be assigned to a host regardless of how many CPUs it has.

The default resource usage for a task is r15s=1.0:r1m=1.0:r15m=1.0. This indicates a CPU intensive task which consumes few other resources.
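The LSF Base form of the usage string is simple enough to split by hand. The Python sketch below is illustrative only; parse_base_rusage is a hypothetical helper, not part of LSF. It returns None for a resource given without a value, which the text above describes as a task that is intensive in that resource.

```python
def parse_base_rusage(usage):
    """Parse 'res[=value]:res[=value]...' into {res: value or None}.

    Hypothetical helper. None marks a resource named without a value.
    """
    result = {}
    for term in usage.split(":"):
        name, sep, value = term.partition("=")
        result[name.strip()] = float(value) if sep else None
    return result


# The default resource usage for a task, as stated above.
DEFAULT_USAGE = "r15s=1.0:r1m=1.0:r15m=1.0"

example = parse_base_rusage("r1m=0.5:mem=20:swp=40")
default = parse_base_rusage(DEFAULT_USAGE)
print(example)
print(default)
```

A wrapper script could use a dictionary like this to build the string passed to lsrun -R, but how the LIM applies the values to its load information is not modelled here.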

Job Spanning String

This string specifies the locality of a parallel job (see `Specifying Locality' on page 104). Currently only the following two cases are supported:

span[hosts=1]

This indicates that all the processors allocated to this job must be on the same host.

span[ptile=n]

This indicates that only n processors on each host should be allocated to the job regardless of how many processors the host possesses.

If span is omitted, LSF Batch will allocate the required processors for the job from the available set of processors.

Specifying Shared Resources

A shared resource may be used in the resource requirement string of any LSF command. For example, when submitting an LSF Batch job which requires a certain amount of shared scratch space, you might submit the job as follows:

% bsub -R "avail_scratch > 200 && swap > 50" myjob

The above assumes that all hosts in the cluster have access to the shared scratch space. The job will only be scheduled if the value of the "avail_scratch" resource is more than 200MB and will go to a host with at least 50MB of available swap space.

It is possible for a system to be configured so that only some hosts within the LSF cluster have access to the scratch space. In order to exclude hosts which cannot access a shared resource, the "defined(resource_name)" function must be specified in the resource requirement string. For example:

% bsub -R "defined(avail_scratch) && avail_scratch > 100 && swap > 100" myjob

would exclude any hosts which cannot access the scratch resource. The LSF administrator configures which hosts do and do not have access to a particular shared resource.

Shared resources can also work together with the resource reservation mechanism of LSF Batch to prevent over-committing resources when scheduling. To indicate that a shared resource is to be reserved while a job is running, specify the resource name in the 'rusage' section of the resource requirement string. For example:

% bsub -R "select[defined(verilog_lic)] rusage[verilog_lic=1]" myjob

would schedule the job on a host when a verilog license is available. The license will be reserved by the job after it is scheduled, until it completes.

Configuring Resource Requirements

Some applications require resources other than the default. LSF can store resource requirements for specific applications so that the application automatically runs with the correct resources. For frequently used commands and software packages, the LSF administrator can set up cluster-wide resource requirements available to all users in the cluster. See the LSF Batch Administrator's Guide for more information.

You may have applications that you need to control yourself. Perhaps your administrator did not set them up for load sharing for all users, or you need a non-standard setup. You can use LSF commands to find out resource names available in your system, and tell LSF about the needs of your applications. LSF stores the resource requirements for you from then on.

Remote Task List File

A task is a UNIX or NT command or a user-created executable program; the terms application or job are also used to refer to tasks.

The resource requirements of applications are stored in the remote task list file. When you run a job through LSF, LSF automatically picks up the job's default resource requirement string from the remote task list files, unless you explicitly override the default by specifying the resource requirement string on the command line.

There are three sets of task list files: the system-wide default file lsf.task, the cluster default file lsf.task.cluster, and the user file $HOME/.lsftask. The system and cluster default files apply to all users. The user file specifies the tasks to be added to or removed from the system lists for your jobs. Resource requirements specified in your user file override those in the system lists.

Managing Your Task List

The lsrtasks command inspects and modifies the remote task list. Invoking lsrtasks with no arguments displays the tasks in the remote list, each followed by its resource requirements, separated from the task name by `/'.

% lsrtasks
cc/cpu
cfd3d/type == SG1 && cpu
compressdir/cpu:mem
f77/cpu
verilog/cpu && cadence
compress/cpu
dsim/type == any
hspice/cpu && cadence
nas/swp > 200 && cpu
compress/-:cpu:mem
epi/hpux11 sparc
regression/cpu
cc/type == local
synopsys/swp >150 && cpu

You can specify resource requirements when tasks are added to the user's remote task list. If the task to be added is already in the list, its resource requirements are replaced.

% lsrtasks + "myjob/swap>=100 && cpu"

This adds myjob to the remote task list with its resource requirements. The quotation marks keep the shell from interpreting the > and && characters.

Using Resource Requirements

Most LSF commands accept a -R resreq argument to specify resource requirements. The exact behaviour depends on the command; for example, specifying a resource requirement for the lsload command displays the load levels for all hosts that have the requested resources.

Specifying resource requirements for the lsrun command causes LSF to select the best host out of the set of hosts that have the requested resources. The -R resreq option overrides any resource requirements specified in the remote task list. For an example of the lsrun command with the -R resreq option see `Running Remote Jobs with lsrun' on page 140.




doc@platform.com

Copyright © 1994-1998 Platform Computing Corporation.
All rights reserved.