This section of the LSF installation documentation describes how to add another host machine to an existing LSF cluster, any time after the initial installation and setup. The procedures contained in this section assume that an LSF cluster is installed, configured, and running correctly at your site. Adding an additional host to your LSF cluster at this point involves most of the same steps that were required to add hosts when LSF was installed initially.
You do not need to shut down the LSF daemons before you add another host to the cluster. LSF can continue to operate while you configure the new machine.
This section of the LSF installation documentation describes how to use the lsfsetup program, on each LSF host in the cluster, to:
- set up a shared lsf.conf file, simplifying setup and maintenance by allowing all hosts to access the same configuration file
- add the new host to the lsf.cluster.cluster_name configuration file

If you are using LSF's custom installation procedure, you must create the symbolic links to the host type specific directories manually rather than using the lsfsetup program.
Adding a host to an existing LSF cluster after initial installation and setup is done, for the most part, the same way as adding hosts to the cluster at install time.
The primary difference is that when you add one or more new hosts later on, you must find out if they are hosts of the same type as those currently in the cluster. If they are not, you must obtain the LSF distribution file for the host type of the host machine being added, and install those LSF files before adding the new host.
You can check this as follows:
1. Log in to any host in the cluster (as root).
2. Change to the LSF_MACHDEP directory (/usr/local/lsf by default) and list its contents.
This is the directory that contains all host type dependent LSF files, which include programs, daemons, and libraries compiled for a specific type of host machine. These files can be shared by all hosts of the same type in the cluster.
More information on the LSF_MACHDEP directory can be found in `LSF Directory Structure' on page 3.
If the appropriate host type is currently installed, there will be a subdirectory bearing the name of the host type.
If the appropriate host type is not currently installed, there will be no subdirectory bearing its name, and you will have to obtain the appropriate LSF distribution file and install the software.
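The check above can be sketched as the following shell fragment. The host type name sparc-sol2 and the scratch directory are stand-ins used only to keep the example self-contained; on a real system, point LSF_MACHDEP at your installation prefix (/usr/local/lsf by default) and skip the mkdir line:

```shell
# Simulate the LSF_MACHDEP layout in a scratch directory so the example is
# self-contained; on a real host, set LSF_MACHDEP to your installation
# prefix instead and skip the mkdir line.
LSF_MACHDEP=$(mktemp -d)
mkdir "$LSF_MACHDEP/sparc-sol2"        # pretend this host type is installed

host_type=sparc-sol2                   # the type of the host being added
if [ -d "$LSF_MACHDEP/$host_type" ]; then
    echo "$host_type: installed"
else
    echo "$host_type: not installed - install its distribution file first"
fi
```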
Instructions for installing software for another LSF host type after the initial installation and setup are contained in `Adding a Host Type' on page 75.
If the LSF software appropriate for the host type of the host machine you are adding to the cluster is already installed, continue with the steps in the next section, `Host Setup Procedures'.
Once you know that the LSF software appropriate for the host type of the host machine you are adding to the cluster has been installed, you can follow the host setup procedures in `LSF Host Setup' on page 19.
You should repeat this procedure for each host that you want to add to an existing LSF cluster. When you have finished setting up all the hosts you are adding to your cluster, you can move on to the next section, `Adding Host Information to the Cluster Configuration File'.
After you have successfully run the lsfsetup program's Host Setup functions on the host you are adding to the cluster, you must add information about the new host (such as its name) to your cluster's configuration file, lsf.cluster.cluster_name. This file is located in the LSF_CONFDIR directory.
1. Log in as root.
2. Run the lsfsetup program, which is installed in the LSF_MACHDEP/etc directory.
You will be prompted to confirm the location of the LSF configuration file (lsf.conf) you want to use to set up the new host.
During installation, the lsfsetup command created the lsf.conf file in the LSF_SERVERDIR directory based on your decisions. Confirm that lsfsetup has found the correct configuration file, or enter the path to the correct one if, for any reason, the path displayed is incorrect.
You are prompted to input the name of the host you are adding to the cluster.
The vi text editor is started on the LSF configuration file where the host thresholds are configured.
You may want to use the default values for that host type now, and change them later on when you have more experience or more information. This can be done without interrupting LSF service.
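As an illustration, the Host section you are editing in lsf.cluster.cluster_name typically looks something like the sketch below. The host names, model and type strings, and threshold columns here are assumptions made for the example; the exact columns depend on your LSF version and site configuration:

```
Begin Host
HOSTNAME   model      type     server  r1m   pg    RESOURCES
hosta      SparcIPC   SUNSOL   1       3.5   15    (sparc bsd)
hostb      HP735      HPPA     1       2.0   ()    (hppa hpux)
End Host
```

An empty value, written (), leaves that threshold unset so the default applies; you can fill it in later without interrupting LSF service.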
Once you have entered all desired host information into the configuration file, you can proceed to the next section, `Reconfiguring the Cluster'.
After changing the cluster configuration file to include the information for the new host(s), you must tell LSF that it should reread the file to pick up the changes.
To do this, run the lsadmin command. Running lsadmin this way causes LSF to check the configuration file for errors. If no errors are found, a message indicating this is displayed, and you are asked whether you want to restart LSF's LIMs on all hosts and reconfigure the LIM daemons.
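In LSF the lsadmin subcommand that performs this check-and-restart cycle is reconfig. The transcript below is a sketch; the exact messages and prompts on your system may differ:

```
% lsadmin reconfig
Checking configuration files ...
No errors found.
Restart LIMs on all hosts? [y/n] y
```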
The changes to the configuration files are committed. Once you have successfully completed this step, you can proceed to `Starting LSF Servers at Boot Time' on page 86.
LSF uses UDP and TCP ports for communication. All hosts in the cluster must use the same port numbers so that each host can connect to the servers on other hosts. There are three alternative places to configure the port numbers for the LSF services:
- the /etc/services file
- the NIS or NIS+ services database
- the /etc/lsf.conf file
To determine which is used in your system, run the command ypwhich -m services. If this command displays a host name, your network is using NIS. On Solaris 2.3 systems, run the command:
% nismatch name=login services.org_dir
If this command returns a service entry for the login service, your network is using NIS+.
The Host Setup option in the lsfsetup command tries to find out where the services should be registered. If the services database is in the /etc/services file, lsfsetup adds the LSF services to that file.
If your services database is in an NIS or NIS+ database, you must add the entries to your database by hand. The following shows the contents of the example.services file provided in the distribution directory. This file contains examples of the entries you must add to the services database.
# /etc/services entries for LSF daemons.
#
res 3878/tcp # remote execution server
lim 3879/udp # load information manager
mbatchd 3881/tcp # master lsbatch daemon
sbatchd 3882/tcp # slave lsbatch daemon
#
# Add this if ident is not already defined
# in your /etc/services file
ident 113/tcp auth tap # identd
Some NIS implementations fail if the NIS source file contains blank lines, causing many system services to become unavailable. Make sure that all the lines you add either contain valid service entries or begin with the comment character `#'.
If any other service listed in your services database has the same port number as one of the LSF services, you can change the port number for the LSF service. You must use the same port numbers on every LSF host.
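A quick way to look for such clashes is to grep the services database for each LSF port. In the sketch below, a small sample file stands in for /etc/services so that the example is self-contained; on a real host, point the loop at /etc/services (or at your NIS source file) instead. The fakesvc entry is invented for illustration:

```shell
# Build a small stand-in for /etc/services; on a real host, scan the real
# file (or your NIS source file) instead.
cat > /tmp/services.sample <<'EOF'
ftp       21/tcp
fakesvc   3879/udp    # invented entry that clashes with lim
EOF

# Scan for each LSF port/protocol pair from this section.
for entry in 3878/tcp 3879/udp 3881/tcp 3882/tcp; do
    if grep -q "[[:space:]]$entry" /tmp/services.sample; then
        echo "$entry already in use - choose another port for that LSF service"
    fi
done
```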
If you are running NIS, you only need to modify the services database once per NIS master. On some hosts the NIS database and commands are in the /var/yp directory; on others NIS is found in /etc/yp. Follow these steps:
1. Run the ypwhich -m services command to find the name of the NIS master host.
2. Log in to the NIS master host as root.
3. Edit the /var/yp/src/services or /etc/yp/src/services file on the NIS master host and add the contents of the example.services file.
4. Change to the /var/yp or /etc/yp directory.
5. Run the command:
% ypmake services
On some hosts the master copy of the services database is stored in a different location; refer to your system documentation for more information.
On systems running NIS+ the procedure is similar; again, please refer to your system documentation.
If you do not want to change the /etc/services file or the NIS database, you can configure the service port numbers in the lsf.conf file (typically installed in /etc). Edit the lsf.conf file and add the following lines:
LSF_RES_PORT=3878
LSF_LIM_PORT=3879
LSB_MBD_PORT=3881
LSB_SBD_PORT=3882
LSF_ID_PORT=113
You must make sure that the same entries are added to the /etc/lsf.conf file on every host.
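One way to verify this consistency is to compare the port lines from each host's copy of the file. The sketch below fabricates two local copies so the example is self-contained; on a real cluster you would gather /etc/lsf.conf from each host and compare the copies the same way:

```shell
# Fabricate two hosts' lsf.conf files for the example; on a real cluster,
# use the copies collected from each host instead.
cat > /tmp/lsf.conf.hosta <<'EOF'
LSF_RES_PORT=3878
LSF_LIM_PORT=3879
LSB_MBD_PORT=3881
LSB_SBD_PORT=3882
EOF
cp /tmp/lsf.conf.hosta /tmp/lsf.conf.hostb

# Compare only the port settings; any diff output marks a mismatch.
grep 'PORT=' /tmp/lsf.conf.hosta | sort > /tmp/ports.hosta
grep 'PORT=' /tmp/lsf.conf.hostb | sort > /tmp/ports.hostb
if diff /tmp/ports.hosta /tmp/ports.hostb > /dev/null; then
    echo "port settings match"
fi
```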
The lsfsetup Host Setup procedure normally configures each LSF server host to start the LSF daemons when the host boots. This section describes the changes lsfsetup makes to your system, and describes how to perform this setup by hand.
The LSF daemons must be run by root on every server host in the cluster. The steps required to set up daemons are different under different versions of UNIX. In any case, the LSF daemons should be started after all other networking and NFS daemons, and after the filesystems containing the LSF executables and configuration files are available.
On BSD-based UNIX systems such as ULTRIX, SunOS 4, and ConvexOS, the startup commands should be placed at the end of the /etc/rc.local script. lsfsetup adds the following text to the /etc/rc.local script to start the daemons:
# %LSF_START% Start LSF daemons
/usr/local/lsf/etc/lsf_daemons start
# %LSF_END%
On HP-UX 9.x, you should add the above command to the localrc function in the /etc/rc file. If your site has created a local startup file such as /etc/rc.local, you should put the startup command into that file instead.
On System V- and POSIX-based systems such as Digital UNIX, Solaris, SGI IRIX, and HP-UX 10.x, daemons are started and stopped by scripts in the /etc/init.d and /etc/rc*.d, or /sbin/init.d and /sbin/rc*.d, directories. lsfsetup links the LSF_SERVERDIR/lsf_daemons script file from the distribution into the appropriate place, depending on the run state defined in /etc/inittab. If the /etc/init.d directory exists, lsfsetup creates the symbolic links in the /etc directories; if /sbin/init.d exists, the links are created in /sbin. As an example, lsfsetup will create the following links if the run state in /etc/inittab is defined as 3:
# ln -s /usr/local/lsf/etc/lsf_daemons /etc/init.d/lsf
# ln -s /etc/init.d/lsf /etc/rc3.d/S95lsf
The LSF daemons must be started on every server host in the LSF cluster.