Version 8 (modified by /C=FR/O=CNRS/OU=UMR8607/CN=Michel Jouvin/emailAddress=jouvin@…, 18 years ago)

--

QWG Templates and Site Customization

This page describes the template framework that the QWG templates are part of, and how to integrate site customizations into this framework.

The QWG template framework has been designed to achieve the following goals :

  • Provide generic templates for software installation and configuration, with local site parameters separated from provided templates.
  • Preserve site customization across software upgrades (OS or middleware)
  • Support several sites sharing one configuration database, reducing the overall management load while keeping the required flexibility.

Note : QWG templates, starting with release 2.7.0-7, require the PAN compiler version 5.0.4 or later.

Template Hierarchies

QWG LCG2 templates rely on a basic template structure based on template hierarchies. Each template hierarchy is dedicated to one specific aspect of the configuration, e.g. OS, LCG middleware, site specific parameters...

No naming is imposed on template hierarchies, except for the clusters hierarchy, which must contain the cluster definitions. clusters is the only hierarchy searched for machine profiles.

The suggested layout is :

  • os : this hierarchy is used for OS related templates (e.g. RPMs associated with each feature group). Generally this hierarchy is made of one sub-hierarchy per OS version/architecture (e.g. sl305-i386, sl305-x64). Most of the templates in this hierarchy are generated from the OS distribution and should not be edited.
  • grid : this hierarchy is used for templates related to EGEE/LCG middleware installation and configuration. Generally this hierarchy is made of one sub-hierarchy per middleware version (e.g. lcg-2.7.0, glite-1.5.0). This hierarchy typically contains templates provided by QWG LCG. Most of these templates are configurable through variable definition and should require no edit.
  • standard : this hierarchy is used for other kinds of standard templates provided by some products, e.g. Quattor core templates, pan standard templates, Lemon templates... Generally this hierarchy contains one directory tree for each product. The templates in this hierarchy should not be edited.
  • sites : this hierarchy is used for templates that are not standard (site specific templates or site customized versions of standard templates) but are (potentially) common to several clusters. This hierarchy generally contains one sub-hierarchy per site. The site concept is explained in more detail later; a site does not have to be linked to a physical location.
  • clusters : this hierarchy is used for cluster specific templates. There should be one sub-hierarchy per cluster. A cluster defines a group of machines sharing some common configuration. One specific requirement of a cluster is that it must contain a directory profiles containing the machine profiles (i.e. the object templates used to define a machine configuration). It is valid for a cluster to have an empty profiles directory.

Other pages describe the layout of LCG2 templates and gLite templates in more detail.

Clusters and Template Hierarchies

Each cluster can be associated with one specific OS version, middleware version or set of sites by defining the appropriate include path used by the pan compiler to locate templates. This include path is defined in the file cluster.build.properties at the top of each cluster hierarchy (cfg/clusters/cluster-name).

This file must contain one line defining the property cluster.pan.includes as a space-separated list of hierarchies. Each hierarchy is interpreted as a file pattern relative to the cfg directory (or whatever has been specified for the cfg property in file quattor.build.xml). It must end with *. To specify all sub-directories below a given directory, the pattern **/* must be used.

The include path is processed in the order specified. If a template exists in several hierarchies, the first one found according to the include path order is used. However, if a template exists in several directories of the same hierarchy, the inclusion order is unspecified.

A cluster.build.properties example is :

cluster.pan.includes=sites/lal/**/* sites/grif/**/* grid/lcg-2.7.0/**/* os/sl305-i386/**/* standard/**/* 

As reflected by this example, standard templates (grid, os and standard) are generally inserted last into the include path, using this relative order.

By definition, standard templates are generic templates and are easily shared. If one site or cluster really requires a specific version of a template, it can be duplicated in the cluster or site hierarchy.

Note : no hierarchy is included implicitly, except clusters/cluster-name. In particular, if you want to use standard, you must specify it explicitly.
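The first-match rule can be illustrated with a small Python sketch. It is only a model of the lookup order described above, not the code used by the pan compiler or SCDB: each pattern is reduced to its root directory, the template file names are hypothetical, and the implicit clusters/cluster-name hierarchy is ignored.

```python
from pathlib import PurePosixPath


def parse_includes(properties_text):
    """Return the ordered hierarchy patterns from a cluster.build.properties text."""
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "cluster.pan.includes":
            return value.split()
    return []


def hierarchy_root(pattern):
    """Reduce a pattern such as 'sites/lal/**/*' to its root directory 'sites/lal'."""
    parts = []
    for part in PurePosixPath(pattern).parts:
        if "*" in part:
            break
        parts.append(part)
    return PurePosixPath(*parts)


def resolve(template, patterns, files):
    """Return the first file providing `template`, following the include path order."""
    for pattern in patterns:
        root = hierarchy_root(pattern)
        # The order inside one hierarchy is unspecified; sorting here only
        # makes this sketch deterministic.
        for name in sorted(files):
            path = PurePosixPath(name)
            if path.name == template + ".tpl" and root in path.parents:
                return name
    return None


props = (
    "cluster.pan.includes=sites/lal/**/* sites/grif/**/* "
    "grid/lcg-2.7.0/**/* os/sl305-i386/**/* standard/**/*"
)
# Hypothetical template files, relative to the cfg directory.
files = [
    "grid/lcg-2.7.0/common/pro_lcg2_config_site.tpl",
    "sites/grif/pro_lcg2_config_site.tpl",
    "standard/quattor/pro_declaration_types.tpl",
]
patterns = parse_includes(props)
chosen = resolve("pro_lcg2_config_site", patterns, files)
```

In this model the copy under sites/grif is chosen, because sites/grif comes before grid/lcg-2.7.0 in the include path. This is how a site-customized version of a standard template takes precedence.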

Cluster parameters

For every cluster, it is possible to customize its configuration in template pro_site_cluster_info.tpl. There must be one such template per cluster. As a general rule, you need to define the following properties for each cluster (the values shown are just examples) :

#
# basic site information
#
"/system/cluster/name" = "LCG 2.7.0";
"/system/cluster/type" = "batch";
"/system/state" = "production";
"/system/siterelease" = "SL 3.05";
"/system/rootmail" = "grid.support@lal.in2p3.fr";

You can also define variable FILESYSTEM_CONFIG_SITE as an alternative template name containing a filesystem layout for the cluster (or node if this is in a machine profile). For example :

FILESYSTEM_CONFIG_SITE = "pro_lcg2_system_filesystems";

pro_site_cluster_info.tpl is often used to define the default root password in a cluster. This can be done with the following PAN instructions :

#
# set root password on machines
#
include pro_software_component_accounts;
"/software/components/accounts/rootpwd" = default("$1$57qRuCXe$NPngMkg4BrPBf5hfJzJh21");
"/software/components/accounts/shadowpwd" = true;

The encrypted password value must be provided. It can be obtained with the following command :

openssl passwd -1 my_preferred_password
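The -1 option of openssl passwd produces an MD5-based crypt string of the form $1$salt$hash. As a sanity check before pasting a value into a template, the following Python sketch verifies that a string has this shape. It checks the format only, not that the hash matches any particular password, and the helper name split_md5_crypt is hypothetical.

```python
import re

# MD5-crypt layout: "$1$", a salt of up to 8 characters, "$", then a
# 22-character hash, all drawn from the alphabet ./0-9A-Za-z.
MD5_CRYPT = re.compile(r"\$1\$([./0-9A-Za-z]{1,8})\$([./0-9A-Za-z]{22})")


def split_md5_crypt(value):
    """Return (salt, hash) if `value` looks like an MD5-crypt string."""
    match = MD5_CRYPT.fullmatch(value)
    if match is None:
        raise ValueError("not an MD5-crypt string: %r" % value)
    return match.group(1), match.group(2)


# The example value used on this page.
salt, digest = split_md5_crypt("$1$57qRuCXe$NPngMkg4BrPBf5hfJzJh21")
```

A clear-text password passed by mistake raises ValueError, so the error is caught before the profile is compiled and deployed.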

Selecting OS version

There are two ways to select the OS version used by a cluster or a specific node :

  • Define the OS version at the OS level : this is done by adding the appropriate path in cluster.build.properties. See the example above, in section Clusters and Template Hierarchies. With this method, all the machines in the cluster MUST run the same OS version/architecture. This is the only method available for SL versions before SL 4.2.
  • Define the OS version at the machine level : this is done by defining the appropriate PAN loadpath in the machine template. A cluster default can be defined in pro_site_cluster_info.tpl and a specific setting can be defined in a machine profile to override this default. A template for OS selection is provided in LCG2 templates, in directory os (there is one template per OS version/architecture). This method allows different version/architecture combinations in the same cluster. This is the recommended method for any cluster running SL4.

The second method requires the OS template path in cluster.build.properties to be defined as os instead of the hierarchy used in the first method.

Both methods can also be combined by specifying the OS template path as follows in cluster.build.properties :

os os/sl305-i386/**/*

With these entries, the cluster uses SL 3.05 i386 as the default OS version, but it is possible to override this in any machine template, using the second method. This is the only way to have both SL3 and SL4 in the same cluster, as SL3 templates don't support the second method.
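The resulting precedence can be summarized with a short Python sketch. It is a conceptual model only and does not reproduce how the pan compiler applies a loadpath. It assumes that a setting in the machine profile wins over a cluster default from pro_site_cluster_info.tpl, which in turn wins over the OS hierarchy named in cluster.build.properties. The name sl440-x86_64 is a hypothetical SL4 hierarchy.

```python
def effective_os(include_path_default=None, cluster_default=None, machine_setting=None):
    """Return the OS version/architecture a machine ends up with.

    Precedence assumed in this sketch, from strongest to weakest:
    machine profile setting, cluster default, include path entry.
    """
    for choice in (machine_setting, cluster_default, include_path_default):
        if choice is not None:
            return choice
    raise ValueError("no OS version selected for this machine")


# Combined configuration: 'os os/sl305-i386/**/*' in cluster.build.properties.
sl3_node = effective_os(include_path_default="sl305-i386")
# A machine profile that selects another OS with the second method
# ('sl440-x86_64' is a hypothetical name).
sl4_node = effective_os(include_path_default="sl305-i386",
                        machine_setting="sl440-x86_64")
```

Here sl3_node keeps the cluster-wide SL 3.05 default, while sl4_node uses the version selected in its own profile.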

RPM repositories

Each machine profile must contain the list of RPM repositories used by the profile (cluster). No standard template does this. It should be done last in the machine profile, and only once, because it is a time-consuming operation.

This is generally done by one template and, by convention, this template is located in the repository directory of the cluster hierarchy. Examples of such templates are provided as part of QWG templates. After creating the repository list, the template must call the following two functions, which check that every required RPM is present in one of the repositories and purge the profile of all unneeded information :

#
# Standard stuff: resolve repository and purge not used entries
#
"/software/packages" = resolve_pkg_rep(value("/software/repositories"));
"/software/repositories" = purge_rep_list(value("/software/packages"));

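The two steps can be pictured with the following Python sketch. It is only a conceptual model of "resolve, then purge" and makes no claim about the real data structures or the implementation of resolve_pkg_rep and purge_rep_list. The function names, repository names and package names below are all hypothetical.

```python
def resolve_packages(packages, repositories):
    """Map each required package to the first repository that provides it."""
    resolved = {}
    for pkg in packages:
        for repo_name, contents in repositories.items():
            if pkg in contents:
                resolved[pkg] = repo_name
                break
        else:
            # No repository provides this package: fail early.
            raise LookupError("package %s not found in any repository" % pkg)
    return resolved


def purge_repositories(resolved, repositories):
    """Keep only the repositories, and the entries in them, that are actually used."""
    used = {}
    for pkg, repo_name in resolved.items():
        used.setdefault(repo_name, set()).add(pkg)
    return {name: sorted(used[name]) for name in repositories if name in used}


# Hypothetical repository contents and package list.
repositories = {
    "sl305_i386": {"bash-2.05b", "openssh-3.6.1p2"},
    "lcg_2_7_0": {"lcg-CE-2.7.0"},
    "unused_repo": {"emacs-21.3"},
}
packages = ["bash-2.05b", "lcg-CE-2.7.0"]

resolved = resolve_packages(packages, repositories)
purged = purge_repositories(resolved, repositories)
```

In this model a missing package is reported as an error, and the purged result no longer mentions unused_repo or the unused openssh entry. This mirrors the intent of removing unneeded information from the profile.
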
Supporting several sites with one database

The recommended layout for Quattor templates provides flexible support for multiple sites from a single configuration database. Depending on the include path defined for it, each cluster can share the standard templates, and part of its configuration information, with other clusters.

Configuration common to several clusters is stored in a site hierarchy, under the sites directory. As far as Quattor is concerned, a site doesn't have to match a geographical location but is an abstract entity corresponding to a set of shared configuration parameters. Very often, a cluster will belong to several sites corresponding to different sets of common parameters. For example, in the cluster.build.properties example above, the cluster belongs to 2 sites : lal and grif. lal defines parameters corresponding to a specific geographical location (like network related parameters) that are common to all kinds of clusters (grid machines, non grid servers, desktops...). grif, on the other hand, defines parameters that are common to all grid machines, whatever geographical location they belong to.
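As an illustration, the sites a cluster belongs to can be read directly from its include path. The Python sketch below assumes the suggested layout, in which site hierarchies live under a directory named sites. The second cluster and its site name are hypothetical.

```python
def sites_of(include_path):
    """List the sites named in a cluster.pan.includes value, in include order."""
    sites = []
    for pattern in include_path.split():
        parts = pattern.split("/")
        # A site hierarchy looks like 'sites/<site-name>/...' in the suggested layout.
        if len(parts) >= 2 and parts[0] == "sites":
            sites.append(parts[1])
    return sites


# The include path from the example earlier on this page.
grid_cluster = sites_of(
    "sites/lal/**/* sites/grif/**/* grid/lcg-2.7.0/**/* os/sl305-i386/**/* standard/**/*"
)
# A hypothetical grid cluster at another location that shares the grif site.
other_cluster = sites_of(
    "sites/other-lab/**/* sites/grif/**/* grid/lcg-2.7.0/**/* os/sl305-i386/**/* standard/**/*"
)
shared_sites = [site for site in grid_cluster if site in other_cluster]
```

Both clusters pick up the grid-wide parameters from grif, while each keeps its own location-specific parameters in a separate site hierarchy.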

Template Compilation Tool

The recommended method to process all the templates and build machine profiles is to use the ant tool, a Java-based equivalent of make, provided with SCDB (it can also be used without SCDB). ant brings the advantage of platform independence, so Quattor management tasks can be done on any platform (Unix, Windows, MacOS).

See the SCDB usage documentation for more information on how to use this tool.