Configuration
The configuration of hanythingondemand starts with the hod.conf file. This is an ini-style configuration file with at least two sections: Meta and Config. Here is an example taken from the Hadoop 2.3 configs:
[Meta]
version=1
[Config]
modules=Hadoop/2.3.0-cdh5.0.0
master_env=HADOOP_HOME,EBROOTHADOOP,JAVA_HOME
services=resourcemanager.conf,nodemanager.conf,screen.conf
config_writer=hod.config.writer.hadoop_xml
workdir=/tmp
directories=$localworkdir/dfs/name,$localworkdir/dfs/data
autogen=hadoop
Here the Meta section sets version to 1. The version refers to the hanythingondemand configuration version; it is a placeholder in case the configuration format changes. That is all the information the Meta section needs (for now). The following parameters are set in the Config section:
autogen
- Configuration files to autogenerate. This can be hadoop, hadoop_on_lustre2, or left blank. If it is set then hanythingondemand will create a basic configuration for you. This is particularly useful since it will calculate values for memory settings. You can then override any settings you feel necessary.
config_writer
- a reference to the python code that will output the configuration used by the services.
directories
- directories to create. If the service would fail without some directories being created, they should be entered here.
master_env
- environment variables to pass from the master node to the slave nodes. This is used because MPI slaves don’t have an environment.
modules
- modules that must be loaded when the cluster begins.
services
- a list of service files containing start and stop script information.
workdir
- place where logs and temporary data is written. Configuration files will be copied here as well. localworkdir is a subdirectory of workdir and is useful for when workdir is on a shared file system.
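As an illustration of how the Config parameters above are laid out, here is a minimal sketch that reads the example hod.conf with Python's standard configparser module. It is not hanythingondemand's own loader; the helper names split_list and load_hod_conf are hypothetical, and treating modules, master_env, services and directories as comma-separated lists is an assumption based on the example.

```python
import configparser

# The example hod.conf from above, embedded so the sketch is self-contained.
HOD_CONF = """\
[Meta]
version=1
[Config]
modules=Hadoop/2.3.0-cdh5.0.0
master_env=HADOOP_HOME,EBROOTHADOOP,JAVA_HOME
services=resourcemanager.conf,nodemanager.conf,screen.conf
config_writer=hod.config.writer.hadoop_xml
workdir=/tmp
directories=$localworkdir/dfs/name,$localworkdir/dfs/data
autogen=hadoop
"""


def split_list(value):
    """Split a comma-separated value, dropping empty entries (hypothetical helper)."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_hod_conf(text):
    """Parse hod.conf text into a plain dict (illustrative, not hod's real loader)."""
    # interpolation=None keeps '$' and '%' characters literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    config = parser["Config"]
    return {
        "version": parser["Meta"].getint("version"),
        "modules": split_list(config.get("modules", fallback="")),
        "master_env": split_list(config.get("master_env", fallback="")),
        "services": split_list(config.get("services", fallback="")),
        "directories": split_list(config.get("directories", fallback="")),
        "config_writer": config.get("config_writer"),
        "workdir": config.get("workdir"),
        # autogen may be left blank, so default to an empty string.
        "autogen": config.get("autogen", fallback=""),
    }


conf = load_hod_conf(HOD_CONF)
print(conf["services"])     # ['resourcemanager.conf', 'nodemanager.conf', 'screen.conf']
print(conf["directories"])  # template variables are still unexpanded at this point
```

Note that the directories entries still contain $localworkdir after parsing; template variables are expanded in a separate step.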
Template parameters
There are some templating variables that can be entered into the configuration files. These use a dollar sign ($) prefix.
masterhostname
- hostname for the master node.
masterdataname
- hostname for the Infiniband interface of the master node.
hostname
- hostname for the local node.
hostaddress
- IP address for the local node.
dataname
- hostname for the Infiniband interface of the local node.
dataaddress
- IP address for the Infiniband interface of the local node.
user
- user name of the person running the cluster.
pid
- process ID.
workdir
- workdir as defined.
localworkdir
- subdirectory of workdir qualified using the node name and a pid. This is used for keeping distinct per-node directories on a shared file system.
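The substitution described above can be sketched with Python's string.Template, which uses the same dollar-sign prefix. This is an illustration under assumptions, not a description of hanythingondemand's internals: the host names, user name and pid are made-up sample values, the exact layout of localworkdir is hypothetical, and treating a doubled dollar sign ($$) as an escaped literal dollar is string.Template's behavior, which is assumed here to match what the service configs rely on.

```python
from string import Template


def expand(text, params):
    """Replace $name placeholders in text using string.Template semantics."""
    # substitute() raises KeyError if the text uses an unknown placeholder.
    return Template(text).substitute(params)


# Sample values only; a real run would fill these in from the node.
params = {
    "masterhostname": "node001",
    "hostname": "node002",
    "user": "alice",
    "pid": 12345,
    "workdir": "/tmp",
}
# Hypothetical layout: the source only says localworkdir is qualified
# by the node name and a pid, not how the path is spelled.
params["localworkdir"] = "{workdir}/{user}.{hostname}.{pid}".format(**params)

data_dir = expand("$localworkdir/dfs/name", params)
start_cmd = expand("$$EBROOTHADOOP/sbin/yarn-daemon.sh start nodemanager", params)

print(data_dir)   # /tmp/alice.node002.12345/dfs/name
print(start_cmd)  # $EBROOTHADOOP/sbin/yarn-daemon.sh start nodemanager
```

With these semantics, $$EBROOTHADOOP survives expansion as $EBROOTHADOOP, so the environment variable can be resolved later by the shell rather than by the templating step.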
Service configs
Service configs have three sections: Unit, Service and Environment. Here is an example:
[Unit]
Name=nodemanager
RunsOn=all
[Service]
ExecStart=$$EBROOTHADOOP/sbin/yarn-daemon.sh start nodemanager
ExecStop=$$EBROOTHADOOP/sbin/yarn-daemon.sh stop nodemanager
[Environment]
YARN_NICENESS=1 /usr/bin/ionice -c2 -n0 /usr/bin/hwloc-bind socket:0
HADOOP_CONF_DIR=$localworkdir/conf
YARN_LOG_DIR=$localworkdir/log
YARN_PID_DIR=$localworkdir/pid
Name
- name of the service.
RunsOn
- (all|master|slave). Determines which nodes or group of nodes the service runs on.
ExecStartPre
- script to run before starting the service, e.g. used in HDFS to run the -format script.
ExecStart
- script to start the service.
ExecStop
- script to stop the service.
Environment
- environment variable definitions used for the service.
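To make the structure of a service file concrete, the following sketch parses the nodemanager example with Python's configparser and checks the RunsOn value against the three allowed settings. The load_service function and the shape of the returned dict are hypothetical; hanythingondemand may read and validate these files differently.

```python
import configparser

# The nodemanager service example from above, embedded for a self-contained sketch.
SERVICE_CONF = """\
[Unit]
Name=nodemanager
RunsOn=all
[Service]
ExecStart=$$EBROOTHADOOP/sbin/yarn-daemon.sh start nodemanager
ExecStop=$$EBROOTHADOOP/sbin/yarn-daemon.sh stop nodemanager
[Environment]
YARN_NICENESS=1 /usr/bin/ionice -c2 -n0 /usr/bin/hwloc-bind socket:0
HADOOP_CONF_DIR=$localworkdir/conf
YARN_LOG_DIR=$localworkdir/log
YARN_PID_DIR=$localworkdir/pid
"""

VALID_RUNS_ON = {"all", "master", "slave"}


def load_service(text):
    """Parse a service config into a dict (illustrative, not hod's real loader)."""
    parser = configparser.ConfigParser(interpolation=None)
    # By default configparser lowercases option names; keep them as written
    # so environment variable names stay upper case.
    parser.optionxform = str
    parser.read_string(text)

    runs_on = parser["Unit"]["RunsOn"]
    if runs_on not in VALID_RUNS_ON:
        raise ValueError("RunsOn must be one of all, master, slave: %r" % runs_on)

    service = parser["Service"]
    return {
        "name": parser["Unit"]["Name"],
        "runs_on": runs_on,
        "pre": service.get("ExecStartPre"),  # None when the option is absent
        "start": service["ExecStart"],
        "stop": service["ExecStop"],
        "env": dict(parser["Environment"]),
    }


svc = load_service(SERVICE_CONF)
print(svc["name"], svc["runs_on"])  # nodemanager all
print(sorted(svc["env"]))
```

The values are returned unexpanded, so placeholders such as $localworkdir still need the template substitution step before the commands are run.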
Autogenerated configuration
Autogenerating configurations is a powerful feature that lets you run services inside hanythingondemand on new clusters without having to calculate all the memory settings by hand.
For example,
- If your administrators installed a brand spanking new cluster with a large amount of memory available, you don’t have to create a bunch of new configuration files to reflect the new system. It should all work seamlessly.
- You are holding a class and would like to allocate each student half a node. They can then use autogenerated settings along with --rm-ppn=<half-the-number-of-cores>.
To autogenerate configurations, set the autogen setting to an appropriate value in the Config section.
Preview configuration
To preview the output configuration files that your hod.conf file would produce, use the genconfig command:
hod genconfig --hodconf=/path/to/hod.conf --workdir=/path/to/workdir
Here, --workdir is the output directory (which will be created if it doesn’t yet exist) and --hodconf is the input configuration file.