This appendix describes how to run LSF on Windows NT. It is assumed that you are already familiar with LSF concepts, and have installed LSF on Windows NT following the instructions in the LSF Installation Guide.
pview) for monitoring processes.
telnet daemon to enable remote login sessions or some other form of remote access software to allow for easier management.
cmd.exe instead of /bin/sh as on UNIX. For example, the queue-level pre and post-exec commands are invoked as:
cmd.exe /C pre-exec command
NUL rather than /dev/null
as on UNIX. LSF translates /dev/null to NUL for
Windows NT.
/etc directory on UNIX corresponds to the %SYSTEMROOT% directory on NT.
/tmp as the temporary directory. On
Windows NT, the temporary directory used by LSF can be configured by setting
LSF_TMPDIR as a system environment variable. If that variable
is not found, LSF goes to the next item in the following list, until a directory
is defined:
LSF_TMPDIR environment variable
LSF_TMPDIR variable in the lsf.conf file
TMP environment variable (C:\temp by default)
TEMP environment variable (C:\temp by default)
%SYSTEMROOT%
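The lookup order above can be sketched as a simple first-match search. This is an illustrative sketch only, not LSF's actual implementation: the function name `resolve_lsf_tmpdir` and the dictionary-based inputs are hypothetical, and treating an empty value the same as an unset one is an assumption.

```python
def resolve_lsf_tmpdir(env, lsf_conf):
    """Return (directory, source) for the first candidate that is defined.

    env      -- mapping of environment variables (hypothetical input)
    lsf_conf -- mapping of settings read from lsf.conf (hypothetical input)
    """
    candidates = [
        ("LSF_TMPDIR environment variable", env.get("LSF_TMPDIR")),
        ("LSF_TMPDIR in lsf.conf", lsf_conf.get("LSF_TMPDIR")),
        ("TMP environment variable", env.get("TMP")),
        ("TEMP environment variable", env.get("TEMP")),
        ("%SYSTEMROOT%", env.get("SYSTEMROOT")),
    ]
    for source, value in candidates:
        # Assumption: an empty string counts as "not defined".
        if value:
            return value, source
    return None, None
```

For example, with only TMP and SYSTEMROOT set, the sketch picks the TMP directory; adding LSF_TMPDIR to lsf.conf makes that entry win over TMP.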
-s option of bkill
has no meaning on Windows NT. LSF, however, supports the job control functionality
by providing the equivalent of SIGSTOP, SIGCONT,
and SIGTERM to suspend, resume, and terminate a job. These can
be accessed through the commands bstop, bresume,
and bkill.
umask parameter is ignored on Windows NT.
bsub, remember that the syntax of
the commands must be specified in the form understood by Windows NT batch
files. For example, to specify multiple commands on a single line, use `&&'
as the command separator instead of `;' as on UNIX. That is,
use:
bsub "cmd1 && cmd2"
instead of:
bsub 'cmd1; cmd2'
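A script that submits jobs to both platforms might build the command line with a small helper such as the one below. This is a minimal sketch under stated assumptions: the function name `build_bsub_command` and the "nt"/"unix" labels are hypothetical, double quotes are assumed for the NT form because cmd.exe does not treat single quotes as quoting characters, and no escaping of quotes embedded in the commands is attempted.

```python
def build_bsub_command(commands, target_os):
    """Join several commands into one quoted bsub argument.

    target_os -- "nt" or "unix" (hypothetical labels for the command syntax)
    """
    separators = {"nt": " && ", "unix": "; "}
    if target_os not in separators:
        raise ValueError(f"unknown target_os: {target_os!r}")
    joined = separators[target_os].join(commands)
    # Assumption: double quotes for cmd.exe, single quotes for a UNIX shell.
    quote = '"' if target_os == "nt" else "'"
    return f"bsub {quote}{joined}{quote}"
```

Note that in cmd.exe `&&' runs the second command only if the first one succeeds, whereas `;' in a UNIX shell runs it unconditionally, so the two forms are not exactly equivalent when the first command fails.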
Also, when specifying commands from standard input, use CTRL-Z to indicate EOF (on UNIX, CTRL-D is used). For example:
c:\temp> bsub -q simulation
bsub> myjob arg1 arg2
bsub> ^Z
tmp index returned by lim measures the space
on the drive specified by the TEMP system environment variable.
The directories work, logs, bin, lib, etc, and conf, are all subdirectories of your LSF directory.
For all LSF files, Platform Computing recommends that you give full control permission to the Domain Admins user group. Other permissions should be set as shown:
work, logs
LSF primary administrator: full control (All) (All)
domain administrator: full control (All) (All)
everyone: special access (R) (R)
bin, lib, etc
LSF primary administrator: full control (All) (All)
domain administrator: full control (All) (All)
everyone: special access (RX) (RX)
conf
LSF primary administrator: full control (All) (All)
domain administrator: full control (All) (All)
When LSF needs to send email to users, it invokes the program defined by LSB_MAILPROG in the lsf.conf file (in the etc subdirectory). If LSB_MAILPROG is not defined, no email is sent.
To use email, you need to use LSF's lsmail.exe program, which can send email to a UNIX host by using the Windows NT rsh utility (%WINDIR%\system32\rsh) to invoke sendmail(1) on the UNIX host. In order for this to work, the UNIX machine must be set up to allow the NT rsh client to run on it.
To support this method of sending email, lsmail.exe should be copied to a file corresponding to the name of the UNIX host. For example,
copy lsmail.exe unixhost.exe
Here unixhost is a UNIX machine which supports sendmail(1). The LSB_MAILPROG should correspond to the unixhost.exe file. For example:
LSB_MAILPROG=//serverA/tools/lsf/bin/unixhost.exe
See `LSB_MAILPROG' on page 162 for details on how LSB_MAILPROG is invoked.
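The naming convention above ties the target UNIX host to the file name of the mail program. The sketch below shows how a configuration check might recover that host name from an LSB_MAILPROG value. It is illustrative only: the function name `mail_host_from_mailprog` is hypothetical, and the rule that an unrenamed lsmail.exe yields no host is an assumption made for the check, not documented LSF behavior.

```python
from pathlib import PureWindowsPath


def mail_host_from_mailprog(lsb_mailprog):
    """Return the UNIX host name implied by the mail program's file name.

    Returns None when the program is still named lsmail.exe, i.e. it has
    not been copied to a host-named file (assumption for this check).
    """
    stem = PureWindowsPath(lsb_mailprog).stem
    if stem.lower() == "lsmail":
        return None
    return stem
```

With the example setting shown above, `mail_host_from_mailprog("//serverA/tools/lsf/bin/unixhost.exe")` yields `unixhost`.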
The command shell (cmd.exe) under Windows NT 4.0 cannot be started from a directory which is specified as a UNC name. For example, if you type the following on the command line, cmd.exe will end up starting in the directory specified by %WINDIR%, the system root directory of the current machine.
start /d\\serverA\share\username cmd.exe
As a result, jobs submitted from a shared directory will not start in the correct directory on the execution host.
The command shell from Windows NT 3.51, however, does support this feature. Microsoft has confirmed that this is a bug in NT 4.0, and included a fix in Service Pack 3 (refer to article Q156276 in the Microsoft Knowledge Base for information).
In order for LSF to work correctly on Windows NT 4.0 machines, you can use one of three methods.
cmd.exe with the cmd.exe from service pack 3. The cmd.exe file typically resides in the %WINDIR%\system32 directory. LSF modifies the appropriate registry keys mentioned in article Q156276 to allow the UNC path to work.
cmd.exe into the %WINDIR%\system32 directory under another name, e.g. cmd351.exe, and set the LSF_CMD_SHELL variable in the lsf.conf file to tell LSF to use this shell instead of cmd.exe.
For example, put the following line into the lsf.conf file:
LSF_CMD_SHELL=cmd351.exe
To run jobs in a mixed cluster, LSF users should have a user account with the same user name on UNIX and Windows NT. It is particularly important that the LSF primary administrator user account always have the same user name on both platforms.
If you used the Windows NT version of LSF Setup to create the UNIX/NT mixed cluster, as described in the LSF Installation Guide, the following settings have already been configured.
The LSF configuration files must be accessible from both NT and UNIX hosts. You can set up a shared file system between the UNIX and NT machines via an NFS client on NT or an SMB server on UNIX, or, alternatively, you can replicate the configuration files. No matter how you arrange your configuration files, you must make sure that the port numbers (LSF_LIM_PORT, LSF_RES_PORT, LSB_SBD_PORT and LSB_MBD_PORT) defined in the lsf.conf file are the same on both UNIX and NT.
For example, if you use an SMB server on the UNIX side, you would simply set the three variables--LSF_CONFDIR, LSB_CONFDIR, and LSF_SHAREDIR--in the lsf.conf file to point to the corresponding directories used by the UNIX hosts. The LSF_CONFDIR and LSB_CONFDIR directories must be accessible to all users (read permission). However, only the LSF primary administrator should have full control of these directories (read and write permissions).
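If the configuration files are replicated rather than shared, a quick comparison can confirm that the port settings agree on both sides. The following is a rough sketch, assuming lsf.conf consists of NAME=value lines with `#` comments; the helper names are hypothetical, and the sketch compares every setting whose name ends in `_PORT` rather than hard-coding the variable names.

```python
def parse_lsf_conf(text):
    """Parse NAME=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def port_mismatches(unix_conf_text, nt_conf_text):
    """Return {name: (unix_value, nt_value)} for port settings that differ."""
    unix_conf = parse_lsf_conf(unix_conf_text)
    nt_conf = parse_lsf_conf(nt_conf_text)
    port_keys = {k for k in (*unix_conf, *nt_conf) if k.endswith("_PORT")}
    return {
        key: (unix_conf.get(key), nt_conf.get(key))
        for key in sorted(port_keys)
        if unix_conf.get(key) != nt_conf.get(key)
    }
```

An empty result means the port settings match; a setting present on only one side is reported with `None` for the other.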
By default, LSF transfers environment variables from the submission host to the execution host. However, some environment variables are not applicable to another operating system.
When submitting a job from a Windows NT machine to a UNIX machine, the -L option of the bsub command can be used to reinitialize the environment variables. If submitting a job from a UNIX machine to an NT machine, you can set the environment variables explicitly in your job script. Alternatively, the Job Starter feature can be used to reset the environment variables before starting the job. LSF automatically resets the PATH on the execution host if the submission host is of a different type.
If the submission host is Windows NT and the execution host is UNIX, then the PATH variable is set to /bin:/usr/bin:/sbin:/usr/sbin and LSF_BINDIR (if defined in the lsf.conf file) is appended to it. If the submission host is UNIX and the execution host is Windows NT, the PATH variable is set to the system PATH variable with LSF_BINDIR appended to it.
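The PATH reset rules above can be summarized in a short sketch. This is not LSF code: the function name `reset_path`, its parameters, and the "nt"/"unix" labels are hypothetical, the use of `;` as the NT path separator is standard Windows behavior rather than something stated here, and skipping the append when LSF_BINDIR is undefined on NT is an assumption.

```python
UNIX_DEFAULT_PATH = "/bin:/usr/bin:/sbin:/usr/sbin"


def reset_path(submission_os, execution_os, inherited_path,
               system_path="", lsf_bindir=None):
    """Return the PATH a job would see on the execution host.

    inherited_path -- PATH transferred from the submission host
    system_path    -- system PATH of an NT execution host
    lsf_bindir     -- LSF_BINDIR from lsf.conf, or None if not defined
    """
    if submission_os == execution_os:
        # Same host type: the transferred PATH is kept.
        return inherited_path
    if execution_os == "unix":
        base, sep = UNIX_DEFAULT_PATH, ":"
    elif execution_os == "nt":
        base, sep = system_path, ";"
    else:
        raise ValueError(f"unknown execution_os: {execution_os!r}")
    # Append LSF_BINDIR only when it is defined.
    return base + sep + lsf_bindir if lsf_bindir else base
```

For instance, an NT-to-UNIX submission with a hypothetical LSF_BINDIR of `/usr/local/lsf/bin` gives `/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/lsf/bin`.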
The lssrvcntrl.exe binary only works when invoked from a Windows NT host. You will not be able to start LSF daemons on a Windows NT machine from a UNIX host.
LSF supports signal conversion between UNIX and Windows NT for remote interactive execution through RES (when you are using lsrun and bsub -I).
On Windows NT, the CTRL+C and CTRL+BREAK key combinations are treated as signals for console applications (these signals are also called console control events). LSF supports these two NT console signals for remote interactive execution; that is, LSF regenerates these signals for users' tasks on the execution host. In a mixed NT/UNIX environment, LSF has the following default conversion between the NT console signals and the UNIX signals:
UNIX/NT Signal Conversion
For example, if you issue lsrun or bsub -I commands from an NT console, but the task is running on a UNIX host, pressing the CTRL+C keys will generate a UNIX SIGINT signal to your task on the UNIX host. The reverse is also true.
For lsrun (but not bsub -I), LSF allows users to define their own signal conversion using the following two environment variables.
For example, suppose a user sets the following:
LSF_NT2UNIX_CLTRC=SIGXXXX
LSF_NT2UNIX_CLTRB=SIGYYYY
Here, SIGXXXX/SIGYYYY are UNIX signal names such as SIGQUIT, SIGTTIN, etc. The conversions will then be: CTRL+C = SIGXXXX and CTRL+BREAK = SIGYYYY.
If both LSF_NT2UNIX_CLTRC and LSF_NT2UNIX_CLTRB are set to the same value (for example, LSF_NT2UNIX_CLTRC=SIGXXXX and LSF_NT2UNIX_CLTRB=SIGXXXX), then CTRL+C is generated on the Windows NT execution host.
For bsub -I, there is no conversion other than the default conversion.
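The conversion rules in this section can be modeled with two small lookup functions. This is a sketch for illustration, not LSF's implementation: the function names are hypothetical, and only the CTRL+C to SIGINT default is encoded because it is the only default mapping stated in the text; a default for CTRL+BREAK has to be supplied by the caller.

```python
# Only the default stated in the text; CTRL+BREAK's default is left to the caller.
DEFAULT_NT_TO_UNIX = {"CTRL+C": "SIGINT"}

OVERRIDE_VARS = {
    "CTRL+C": "LSF_NT2UNIX_CLTRC",
    "CTRL+BREAK": "LSF_NT2UNIX_CLTRB",
}


def nt_to_unix(event, env, command="lsrun", defaults=DEFAULT_NT_TO_UNIX):
    """Map an NT console event to a UNIX signal name (or None if unknown)."""
    if command == "lsrun":
        # User-defined conversions apply to lsrun only, not to bsub -I.
        override = env.get(OVERRIDE_VARS[event])
        if override:
            return override
    return defaults.get(event)


def unix_to_nt(signal_name, env, command="lsrun", defaults=DEFAULT_NT_TO_UNIX):
    """Map a UNIX signal name back to an NT console event (or None)."""
    # CTRL+C is checked first, so it wins when both events map to one signal.
    for event in ("CTRL+C", "CTRL+BREAK"):
        if nt_to_unix(event, env, command, defaults) == signal_name:
            return event
    return None
```

With both variables set to the same signal, `unix_to_nt` returns `CTRL+C`, matching the tie-breaking rule described above.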
The LSF service and daemons on each LSF server host will start automatically when the machine is restarted.
If you cannot restart each host at this time, log on as an LSF cluster administrator (a member of the LSF Global Administrators group) and start the LSF service and daemons manually.
You should not use the primary LSF administrator's account (normally lsfadmin) to start or stop the LSF service and daemons.
To start the LSF service and daemons, use any one of the following methods:
lssrvcntrl start -m all
lssrvman
Usage information for lssrvcntrl is available by typing lssrvcntrl with no options.
Each user who wants to use LSF needs to supply the password of his/her domain user account. Use the lspasswd.exe command, and follow the instructions. For example:
lspasswd [-u user_name]
If you do not specify the -u option as above, the user is assumed to be the current user.
In addition, all users need to have the "Logon as a batch job" privilege on every LSF server host. For this purpose, you can simply put all LSF users into the `LSF user group' created for or assigned by you during the installation. The LSF user group has the "Logon as a batch job" privilege on all LSF server hosts.
pty-type options for lsrun and bsub -I, i.e. the -P and -S options for lsrun and the -Ip and -Is options for bsub, are not supported.
elim, esub, or eexec), the command must be a binary executable, that is, elim.exe or esub.exe. It cannot be a batch file such as elim.bat.
LSF_USE_HOSTEQUIV parameter in lsf.conf is ignored on Windows NT.
nice>=0 corresponds to an NT priority class of IDLE
nice<0 corresponds to an NT priority class of NORMAL
HIGH or REAL-TIME priority classes.
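The nice-value mapping above amounts to a single threshold test, sketched here for illustration. The function name `nt_priority_class` is hypothetical, and only the two mappings stated above are encoded.

```python
def nt_priority_class(nice):
    """Return the NT priority class for a UNIX-style nice value."""
    # nice >= 0 maps to IDLE; nice < 0 maps to NORMAL.
    return "IDLE" if nice >= 0 else "NORMAL"
```

For example, a nice value of 10 or 0 maps to IDLE, while -5 maps to NORMAL.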
io index shows 0, unless the disk performance counters are turned on. To turn on disk performance counters, use the DISKPERF command.
Turning on the performance counters incurs extra overhead in disk I/O.
SBD_SLEEP_TIME. This is because sbatchd periodically checks if the limit has been exceeded. On UNIX systems, the CPU limit can be enforced by the OS at the process level.
html subdirectory of your LSF directory.