This appendix describes how to run LSF on Windows NT. It is assumed that you are already familiar with LSF concepts, and have installed LSF on Windows NT following the instructions in the LSF Installation Guide.
pview) for monitoring processes.
telnet daemon to enable remote login sessions or some other form of remote access software to allow for easier management.
cmd.exe instead of /bin/sh as on UNIX. For example, the queue-level pre- and post-exec commands are invoked as:
cmd.exe /C pre-exec command
NUL rather than /dev/null as on UNIX. LSF translates /dev/null to NUL for Windows NT.
The /etc directory on UNIX corresponds to the %SYSTEMROOT% directory on NT.
/tmp as the temporary directory. On Windows NT, the temporary directory used by LSF can be configured by setting LSF_TMPDIR as a system environment variable. If that variable is not found, LSF goes to the next item in the following list, until a directory is defined:
LSF_TMPDIR environment variable
LSF_TMPDIR variable in the lsf.conf file
TMP environment variable (C:\temp by default)
TEMP environment variable (C:\temp by default)
%SYSTEMROOT%
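The lookup order above can be sketched in a few lines of Python. This is only an illustration of the fallback logic, not LSF's actual implementation. The function name `resolve_lsf_tmpdir` and the `lsf_conf` dictionary (standing in for variables already parsed from lsf.conf) are assumptions made for the example.

```python
def resolve_lsf_tmpdir(env, lsf_conf):
    """Return the first temporary directory that is defined, or None.

    env      -- mapping of system environment variables (e.g. os.environ)
    lsf_conf -- hypothetical mapping of variables parsed from lsf.conf
    """
    candidates = [
        env.get("LSF_TMPDIR"),       # 1. LSF_TMPDIR environment variable
        lsf_conf.get("LSF_TMPDIR"),  # 2. LSF_TMPDIR variable in lsf.conf
        env.get("TMP"),              # 3. TMP environment variable
        env.get("TEMP"),             # 4. TEMP environment variable
        env.get("SYSTEMROOT"),       # 5. %SYSTEMROOT%
    ]
    for path in candidates:
        if path:  # skip variables that are unset or empty
            return path
    return None
```

For instance, with only TMP and SYSTEMROOT set in the environment and LSF_TMPDIR defined in lsf.conf, the sketch returns the lsf.conf value, because it comes earlier in the list.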
The -s option of bkill has no meaning on Windows NT. LSF, however, supports the job control functionality by providing the equivalent of SIGSTOP, SIGCONT, and SIGTERM to suspend, resume, and terminate a job. These can be accessed through the commands bstop, bresume, and bkill.
The umask parameter is ignored on Windows NT.
bsub, remember that the syntax of the commands must be specified in the form understood by Windows NT batch files. For example, to specify multiple commands on a single line, use `&&' as the command separator instead of `;' as in UNIX. Use:
bsub `cmd1 && cmd2'
instead of:
bsub `cmd1; cmd2'
Also, when specifying commands from standard input, use CTRL-Z to indicate EOF (on UNIX, CTRL-D is used). For example:
c:\temp> bsub -q simulation
bsub> myjob arg1 arg2
bsub> ^Z
The tmp index returned by lim measures the space on the drive specified by the TEMP system environment variable.
The directories work, logs, bin, lib, etc, and conf are all subdirectories of your LSF directory.
For all LSF files, Platform Computing recommends that you give full control permission to the Domain Admins user group. Other permissions should be set as shown:
work, logs
    LSF primary administrator: full control (All) (All)
    domain administrator: full control (All) (All)
    everyone: special access (R) (R)
bin, lib, etc
    LSF primary administrator: full control (All) (All)
    domain administrator: full control (All) (All)
    everyone: special access (RX) (RX)
conf
    LSF primary administrator: full control (All) (All)
    domain administrator: full control (All) (All)
When LSF needs to send email to users, it invokes the program defined by LSB_MAILPROG in the lsf.conf file (in the etc subdirectory). If LSB_MAILPROG is not defined, no email is sent.
To use email, you need to use LSF's lsmail.exe program, which can send email to a UNIX host by using the Windows NT rsh utility (%WINDIR%\system32\rsh) to invoke sendmail(1) on the UNIX host. For this to work, the UNIX machine must be set up to accept commands from the NT rsh client.
To support this method of sending email, lsmail.exe should be copied to a file corresponding to the name of the UNIX host. For example,
copy lsmail.exe unixhost.exe
Here unixhost is a UNIX machine which supports sendmail(1). The LSB_MAILPROG should correspond to the unixhost.exe file. For example:
LSB_MAILPROG=//serverA/tools/lsf/bin/unixhost.exe
See `LSB_MAILPROG' on page 162 for details on how LSB_MAILPROG is invoked.
The command shell (cmd.exe) under Windows NT 4.0 cannot be started from a directory which is specified as a UNC name. For example, if you type the following on the command line, cmd.exe will end up starting in the directory specified by %WINDIR%, the system root directory of the current machine.
start /d\\serverA\share\username cmd.exe
As a result, jobs submitted from a shared directory will not start in the correct directory on the execution host.
The command shell from Windows NT 3.51, however, does support this feature. Microsoft has confirmed that this is a bug in NT 4.0, and included a fix in Service Pack 3 (refer to article Q156276 in the Microsoft Knowledge Base for information).
In order for LSF to work correctly on Windows NT 4.0 machines, you can use one of three methods.
cmd.exe with the cmd.exe from service pack 3. The cmd.exe file typically resides in the %WINDIR%\system32 directory. LSF modifies the appropriate registry keys mentioned in article Q156276 to allow the UNC path to work.
cmd.exe into the %WINDIR%\system32 directory under another name, e.g. cmd351.exe, and set the LSF_CMD_SHELL variable in the lsf.conf file to tell LSF to use this shell instead of cmd.exe.
For example, put the following line into the lsf.conf file:
LSF_CMD_SHELL=cmd351.exe
To run jobs in a mixed cluster, LSF users should have a user account with the same user name on UNIX and Windows NT. It is particularly important that the LSF primary administrator user account always have the same user name on both platforms.
If you used the Windows NT version of LSF Setup to create the UNIX/NT mixed cluster, as described in the LSF Installation Guide, the following settings have already been configured.
The LSF configuration files must be accessible from both NT and UNIX hosts. You can set up a shared file system between the UNIX and NT machines via an NFS client on NT or an SMB server on UNIX, or, alternatively, you can replicate the configuration files. No matter how you arrange your configuration files, you must make sure that the port numbers (LSF_LIM_PORT, LSF_RES_PORT, LSF_SBD_PORT and LSF_MBD_PORT) defined in the lsf.conf file are the same on both UNIX and NT.
For example, if you use an SMB server on the UNIX side, you would simply set the three variables--LSF_CONFDIR, LSB_CONFDIR, and LSF_SHAREDIR--in the lsf.conf file to point to the corresponding directories used by the UNIX hosts. The LSF_CONFDIR and LSB_CONFDIR directories must be accessible to all users (read permission). However, only the LSF primary administrator should have full control of these directories (read and write permissions).
By default, LSF transfers environment variables from the submission host to the execution host. However, some environment variables are not applicable to another operating system.
When submitting a job from a Windows NT machine to a UNIX machine, the -L option of the bsub command can be used to reinitialize the environment variables. When submitting a job from a UNIX machine to a Windows NT machine, you can set the environment variables explicitly in your job script. Alternatively, the Job Starter feature can be used to reset the environment variables before starting the job. LSF automatically resets the PATH on the execution host if the submission host is of a different type.
If the submission host is Windows NT and the execution host is UNIX, then the PATH variable is set to /bin:/usr/bin:/sbin:/usr/sbin and LSF_BINDIR (if defined in the lsf.conf file) is appended to it. If the submission host is UNIX and the execution host is Windows NT, the PATH variable is set to the system PATH variable with LSF_BINDIR appended to it.
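The PATH reset rules can be illustrated with a small Python sketch. It is not LSF code: the function name `execution_path`, the "nt"/"unix" labels, and the choice of `:` and `;` as path separators (the usual conventions on each platform) are assumptions for the example, as is the decision to skip the append step when LSF_BINDIR is not defined in the UNIX-to-NT case.

```python
UNIX_DEFAULT_PATH = "/bin:/usr/bin:/sbin:/usr/sbin"  # value given in the text


def execution_path(submission_os, execution_os, submitted_path,
                   system_path="", lsf_bindir=None):
    """Return the PATH a job would see on the execution host.

    submission_os, execution_os -- "nt" or "unix" (labels are assumptions)
    submitted_path -- PATH transferred from the submission host
    system_path    -- system PATH on an NT execution host
    lsf_bindir     -- LSF_BINDIR from lsf.conf, or None if not defined
    """
    if submission_os == execution_os:
        # Same platform type: the transferred PATH is kept as is.
        return submitted_path
    if execution_os == "unix":
        base, sep = UNIX_DEFAULT_PATH, ":"
    else:
        base, sep = system_path, ";"
    if lsf_bindir:
        return base + sep + lsf_bindir
    return base
```

A job submitted from NT to UNIX with LSF_BINDIR set to a hypothetical /usr/local/lsf/bin would therefore run with /bin:/usr/bin:/sbin:/usr/sbin:/usr/local/lsf/bin as its PATH, regardless of the PATH on the submission host.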
The lssrvcntrl.exe binary only works when invoked from a Windows NT host. You will not be able to start LSF daemons on a Windows NT machine from a UNIX host.
LSF supports signal conversion between UNIX and Windows NT for remote interactive execution through RES (when you are using lsrun and bsub -I).
On Windows NT, the CTRL+C and CTRL+BREAK key combinations are treated as signals for console applications (these signals are also called console control events). LSF supports these two NT console signals for remote interactive execution, i.e. on the execution host LSF regenerates these signals for users' tasks. In a mixed NT/UNIX environment, LSF has the following default conversion between the NT console signals and the UNIX signals:
UNIX/NT Signal Conversion
For example, if you issue lsrun or bsub -I commands from an NT console, but the task is running on a UNIX host, pressing CTRL+C sends a UNIX SIGINT signal to your task on the UNIX host. The reverse is also true.
For lsrun (but not bsub -I), LSF allows users to define their own signal conversion using the following two environment variables.
For example, suppose a user sets the following:
LSF_NT2UNIX_CLTRC=SIGXXXX
LSF_NT2UNIX_CLTRB=SIGYYYY
Here, SIGXXXX/SIGYYYY are UNIX signal names such as SIGQUIT, SIGTTIN, etc. The conversions will then be: CTRL+C = SIGXXXX and CTRL+BREAK = SIGYYYY.
If both LSF_NT2UNIX_CLTRC and LSF_NT2UNIX_CLTRB are set to the same value (for example, LSF_NT2UNIX_CLTRC=SIGXXXX and LSF_NT2UNIX_CLTRB=SIGXXXX), then CTRL+C is generated on the Windows NT execution host.
For bsub -I, there is no conversion other than the default conversion.
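The conversion rules described in this section can be sketched as follows. This is a rough Python model, not how RES implements it. Only the CTRL+C to SIGINT default is taken from the text; the default for CTRL+BREAK is not reproduced here, so the sketch takes it as a required argument. The function names, the `interactive_batch` flag (standing in for bsub -I), and the reading that the same two variables govern both directions are assumptions.

```python
DEFAULT_CTRL_C = "SIGINT"  # from the text: CTRL+C converts to SIGINT by default

# Environment variables that override the default conversion for lsrun.
OVERRIDE_VARS = {
    "CTRL+C": "LSF_NT2UNIX_CLTRC",
    "CTRL+BREAK": "LSF_NT2UNIX_CLTRB",
}


def nt_to_unix(event, env, ctrl_break_default, interactive_batch=False):
    """Map an NT console event ("CTRL+C" or "CTRL+BREAK") to a UNIX signal name."""
    defaults = {"CTRL+C": DEFAULT_CTRL_C, "CTRL+BREAK": ctrl_break_default}
    if interactive_batch:
        # bsub -I: only the default conversion applies.
        return defaults[event]
    return env.get(OVERRIDE_VARS[event]) or defaults[event]


def unix_to_nt(signal_name, env, ctrl_break_default, interactive_batch=False):
    """Map a UNIX signal name to the NT console event to regenerate, or None."""
    # CTRL+C is checked first, so it wins when both events map to one signal.
    for event in ("CTRL+C", "CTRL+BREAK"):
        if nt_to_unix(event, env, ctrl_break_default, interactive_batch) == signal_name:
            return event
    return None
```

Checking CTRL+C first is what reproduces the rule that CTRL+C is generated when both variables are set to the same signal.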
The LSF service and daemons on each LSF server host will start automatically when the machine is restarted.
If you cannot restart each host at this time, log on as an LSF cluster administrator (a member of the LSF Global Administrators group) and start the LSF service and daemons manually.
You should not use the primary LSF administrator's account (normally lsfadmin) to start or stop the LSF service and daemons.
To start the LSF service and daemons, use any one of the following methods:
lssrvcntrl start -m all
lssrvman
Usage information for lssrvcntrl is available by typing lssrvcntrl with no options.
Each user who wants to use LSF needs to supply the password of their domain user account. Use the lspasswd.exe command, and follow the instructions. For example:
lspasswd [-u user_name]
If you do not specify the -u option as above, the user is assumed to be the current user.
In addition, all users need to have the "Logon as a batch job" privilege on every LSF server host. For this purpose, you can simply put all LSF users into the `LSF user group' created for or assigned by you during the installation. The LSF user group has the "Logon as a batch job" privilege on all LSF server hosts.
pty-type options for lsrun and bsub -I, i.e. the -P and -S options for lsrun and the -Ip and -Is options for bsub, are not supported.
elim, esub, or eexec), the command must be a binary executable, that is, elim.exe or esub.exe. It cannot be a batch file such as elim.bat.
The LSF_USE_HOSTEQUIV parameter in lsf.conf is ignored on Windows NT.
nice>=0 corresponds to an NT priority class of IDLE
nice<0 corresponds to an NT priority class of NORMAL
HIGH or REAL-TIME priority classes.
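As a minimal illustration of the mapping above, the following Python sketch returns the NT priority class for a given nice value. The function name `nt_priority_class` is hypothetical, and the sketch covers only the two cases listed.

```python
def nt_priority_class(nice):
    """Return the NT priority class name that corresponds to a nice value."""
    # nice >= 0 maps to IDLE; nice < 0 maps to NORMAL.
    return "IDLE" if nice >= 0 else "NORMAL"
```

Note that a nice value of 0 falls in the IDLE class under this rule.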
The io index shows 0, unless the disk performance counters are turned on. To turn on disk performance counters, use the DISKPERF command.
Turning on the performance counters incurs extra overhead in disk I/O.
SBD_SLEEP_TIME. This is because sbatchd periodically checks if the limit has been exceeded. On UNIX systems, the CPU limit can be enforced by the OS at the process level.
html subdirectory of your LSF directory.