.. _configuration:

Configuration
=============

The configuration of hanythingondemand starts with the ``hod.conf`` file. This is an ``ini`` style configuration file with at least two sections: ``Meta`` and ``Config``. Here is an example taken from the Hadoop 2.3 configs::

    [Meta]
    version=1

    [Config]
    modules=Hadoop/2.3.0-cdh5.0.0
    master_env=HADOOP_HOME,EBROOTHADOOP,JAVA_HOME
    services=resourcemanager.conf,nodemanager.conf,screen.conf
    config_writer=hod.config.writer.hadoop_xml
    workdir=/tmp
    directories=$localworkdir/dfs/name,$localworkdir/dfs/data
    autogen=hadoop

The ``Meta`` section sets ``version`` to 1. This refers to the hanythingondemand configuration version and is a placeholder in case the configuration format changes. No other ``Meta`` information is needed (for now).

The following parameters are set in the ``Config`` section:

* ``autogen`` - configuration files to autogenerate. This can be ``hadoop``, ``hadoop_on_lustre2``, or left blank. If it is set, hanythingondemand will create a basic configuration for you. This is particularly useful since it will calculate values for memory settings. You can then override any settings you feel necessary.
* ``config_writer`` - a reference to the Python code that will output the configuration used by the services.
* ``directories`` - directories to create. If the service would fail without some directories being created, they should be entered here.
* ``master_env`` - environment variables to pass from the master node to the slave nodes. This is needed because MPI slaves don't have an environment.
* ``modules`` - modules that must be loaded when the cluster starts.
* ``services`` - a list of service files containing start and stop script information.
* ``workdir`` - place where logs and temporary data are written. Configuration files will be copied here as well. ``localworkdir`` is a subdirectory of ``workdir`` and is useful when ``workdir`` is on a shared file system.

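
To make the layout of ``hod.conf`` concrete, here is a minimal sketch that reads the example above with Python's standard ``configparser`` module. It is not hanythingondemand's own loader: the helper names ``split_list`` and ``load_hod_conf`` are hypothetical, and splitting ``master_env``, ``services`` and ``directories`` on commas is an assumption based on how the example values are written.

```python
import configparser

# The example hod.conf from above, inlined so the sketch is self-contained.
HOD_CONF = """\
[Meta]
version=1

[Config]
modules=Hadoop/2.3.0-cdh5.0.0
master_env=HADOOP_HOME,EBROOTHADOOP,JAVA_HOME
services=resourcemanager.conf,nodemanager.conf,screen.conf
config_writer=hod.config.writer.hadoop_xml
workdir=/tmp
directories=$localworkdir/dfs/name,$localworkdir/dfs/data
autogen=hadoop
"""


def split_list(value):
    """Split a comma-separated option into a list, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_hod_conf(text):
    """Hypothetical helper: parse hod.conf text into (version, config dict)."""
    # interpolation=None keeps values such as $localworkdir untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    for section in ("Meta", "Config"):
        if not parser.has_section(section):
            raise ValueError("missing required section: " + section)
    config = dict(parser["Config"])
    # Assumption: these options are comma-separated lists, as in the example.
    for key in ("master_env", "services", "directories"):
        config[key] = split_list(config.get(key, ""))
    return parser["Meta"].getint("version"), config


version, config = load_hod_conf(HOD_CONF)
print(version)
print(config["services"])
print(config["directories"])
```

Note that the ``$localworkdir`` placeholders are left unresolved here; the sketch only shows how the two required sections and the list-valued options fit together.
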

Template parameters
-------------------

There are some templating variables that can be entered into the configuration files. These use a dollar sign (``$``) prefix.

* ``masterhostname`` - hostname for the master node.
* ``masterdataname`` - hostname for the Infiniband interface of the master node.
* ``hostname`` - hostname for the local node.
* ``hostaddress`` - IP address of the local node.
* ``dataname`` - hostname for the Infiniband interface of the local node.
* ``dataaddress`` - IP address of the Infiniband interface of the local node.
* ``user`` - user name of the person running the cluster.
* ``pid`` - process ID.
* ``workdir`` - workdir as defined.
* ``localworkdir`` - subdirectory of workdir qualified using the node name and a pid. This is used for keeping distinct per-node directories on a shared file system.

Service configs
---------------

Service configs have three sections: ``Unit``, ``Service`` and ``Environment``. Here is an example::

    [Unit]
    Name=nodemanager
    RunsOn=all

    [Service]
    ExecStart=$$EBROOTHADOOP/sbin/yarn-daemon.sh start nodemanager
    ExecStop=$$EBROOTHADOOP/sbin/yarn-daemon.sh stop nodemanager

    [Environment]
    YARN_NICENESS=1 /usr/bin/ionice -c2 -n0 /usr/bin/hwloc-bind socket:0
    HADOOP_CONF_DIR=$localworkdir/conf
    YARN_LOG_DIR=$localworkdir/log
    YARN_PID_DIR=$localworkdir/pid

* ``Name`` - name of the service.
* ``RunsOn`` - ``(all|master|slave)``. Determines on which node or group of nodes the service runs.
* ``ExecStartPre`` - script to run before starting the service, e.g. used in HDFS to run the ``-format`` script.
* ``ExecStart`` - script to start the service.
* ``ExecStop`` - script to stop the service.
* ``Environment`` - environment variable definitions used for the service.

Autogenerated configuration
---------------------------

Autogenerating configurations is a powerful feature that lets you run services inside hanythingondemand on new clusters without having to calculate all the memory settings by hand.
For example:

* If your administrators installed a brand new cluster with a large amount of memory available, you don't have to create a bunch of new configuration files to reflect the new system. It should all work seamlessly.
* If you are holding a class and would like to allocate each student half a node, they can use autogenerated settings along with ``--rm-ppn=``.

To autogenerate some configurations, set the ``autogen`` setting to an appropriate value in the ``Config`` section.

Preview configuration
---------------------

To preview the output configuration files that your ``hod.conf`` file would produce, use the ``genconfig`` command::

    hod genconfig --hodconf=/path/to/hod.conf --workdir=/path/to/workdir

Here, ``--hodconf`` is the input configuration file and ``--workdir`` is the output directory (which will be created if it doesn't yet exist).
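
When reading previewed output, it helps to know how the ``$`` template parameters from the service example might resolve. The sketch below is an illustration only. It assumes substitution follows the rules of Python's ``string.Template``, where ``$name`` is replaced and ``$$`` yields a literal dollar sign, so ``$$EBROOTHADOOP`` is left for the shell to expand. The parameter values, including the form of ``localworkdir``, are hypothetical.

```python
from string import Template

# Hypothetical values; in a real run hanythingondemand supplies these.
params = {
    "workdir": "/tmp",
    "localworkdir": "/tmp/node01.12345",
}

# Two lines taken from the service config example.
lines = [
    "ExecStart=$$EBROOTHADOOP/sbin/yarn-daemon.sh start nodemanager",
    "HADOOP_CONF_DIR=$localworkdir/conf",
]

# Assumption: placeholders behave like string.Template placeholders.
# "$$" collapses to a single "$"; "$localworkdir" is replaced by its value.
resolved = [Template(line).substitute(params) for line in lines]
for line in resolved:
    print(line)
```

Under that assumption, the first line keeps a single ``$`` in front of ``EBROOTHADOOP`` for later expansion as an environment variable, while the second line receives the per-node directory.
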