Service-Independent gLite Configuration

This section contains information about how to configure gLite service-independent parameters. Refer to the dedicated page for configuring individual gLite services.

Machine types

QWG templates provide one template per machine type (CE, SE, RB, ...). They are located in the machine-types directory and are intended to be generic templates; no modification should be needed.

To configure a specific machine with gLite middleware, include the appropriate machine type template in the machine profile. Before this include, set the variable xxx_CONFIG_SITE to the name of a template containing the configuration specific to this particular machine (look in the machine type template for the exact name of the variable).

Here is an example for configuring a Torque-based CE :

object template profile_grid10;

# Define specific configuration for a GRIF CE to be added to
# standard configuration
variable CE_CONFIG_SITE = "pro_ce_torque_grif";

# Configure as a CE (Torque) + Site's BDII
include machine-types/ce;

#
# software repositories (should be last)
#
include repository_common;

In this example, CE_CONFIG_SITE specifies the name of a template defining the Torque configuration.

All the machine types share a common basic configuration, described in template machine-types/base.tpl. This template allows you to add site-specific configuration to this common base configuration (e.g. configuration of a monitoring agent). This is done by defining the variable GLITE_BASE_CONFIG_SITE to a template containing the site-specific configuration to be added to the common configuration (at the end of the common configuration). This variable can be defined, for example, in the template pro_site_cluster_info.tpl.
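
As a minimal sketch of the mechanism described above, the variable could be set as follows. The template name site/glite/base_config is a hypothetical placeholder for a site template (for example one configuring a monitoring agent); it is not part of the standard templates.

# Site-specific additions to the common gLite base configuration
# (template name is a hypothetical example)
variable GLITE_BASE_CONFIG_SITE ?= 'site/glite/base_config';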

The following sections describe specific variables that can be used with each machine type. The machine type template to include is specified at the beginning of the section as Base template. In addition, to get more details, you can look at examples.

Creating a New Machine Type

All gLite machine types use a common base configuration, described in machine-types/base.tpl. This template is responsible in particular for the base OS configuration, VO configuration and NFS configuration.

When creating a new machine type derived from this gLite base machine type, it is necessary, at the very end of the new machine type, to include the gLite update and postconfig templates, using the following Pan statements:

# gLite updates
include { 'update/config' };

# Do any final OS configuration needed
include { return(GLITE_OS_POSTCONFIG) };

Without the gLite OS postconfig template, machine-types/base.tpl is not expected to compile successfully.
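
Put together, a new machine type might be laid out roughly as sketched below. The template name machine-types/mytype and the placeholder comment for the service configuration are hypothetical; only the base template and the two final includes come from the description above.

template machine-types/mytype;

# Common gLite base configuration (base OS, VO and NFS configuration)
include { 'machine-types/base' };

# ... configuration specific to the new machine type goes here (hypothetical) ...

# gLite updates
include { 'update/config' };

# Do any final OS configuration needed
include { return(GLITE_OS_POSTCONFIG) };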

Site Information

Every grid (gLite) site must publish some general information about itself, mainly:

  • SITE_NAME: the site name
  • SITE_LOC: site geographical location. Format must be "City, Country".
  • SITE_LAT: site latitude (number)
  • SITE_LONG: site longitude (number)
  • SITE_EMAIL : sysadmin contact for the site
  • SITE_SECURITY_EMAIL : site email contact for security issues. Defaults to SITE_EMAIL.
  • SITE_USER_SUPPORT_EMAIL : site email contact for user support. Defaults to SITE_EMAIL.
  • SITE_OTHER_INFO : this is used to set the GlueSiteOtherInfo attribute of the site object. This attribute generally contains a list of key/value pairs. The value can be either a simple string or a list of strings. The exact list is grid dependent and may change from time to time. For example there is generally a GRID key listing all the grids the site belongs to. See example of site parameters for more details.
    • For WLCG sites, it must define the WLCG role (tier) and the attached T1. See example of site parameters for more details.

See GOC wiki for more information on site information.
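
For illustration, the basic site variables listed above might be defined as follows. All values are placeholders, not real site data; SITE_OTHER_INFO is left out because its exact content is grid dependent.

# General site information (all values are hypothetical placeholders)
variable SITE_NAME ?= 'EXAMPLE-SITE';
# Format must be "City, Country"
variable SITE_LOC ?= 'Orsay, France';
variable SITE_LAT ?= 48.7;
variable SITE_LONG ?= 2.2;
variable SITE_EMAIL ?= 'grid-admins@example.org';
# Optional: both default to SITE_EMAIL when not defined
variable SITE_SECURITY_EMAIL ?= 'grid-security@example.org';
variable SITE_USER_SUPPORT_EMAIL ?= 'grid-support@example.org';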

VO Configuration

The list of VOs to configure on a specific node is defined in the variable VOS. Generally a site-wide default value is defined in site/glite/config.tpl (defined with operator ?=). This value can be overridden on a specific machine by defining the VOS variable in the machine profile, before including the machine type template.

An example VOS definition is :

variable VOS ?= list('alice',
                     'atlas',
                     'biomed',
                     'calice',
                     'cms',
                     'cppm',
                     'dteam',
                     'dzero',
                     'egeode',
                     'lhcb',
                     'ops',
                     'planck',
                     );

Note : dteam and ops are mandatory VOs for EGEE sites.
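
To override the site-wide default on one machine, VOS can be defined in the machine profile before the machine type is included. The sketch below assumes a hypothetical profile name (profile_grid11) and an arbitrary reduced VO list.

object template profile_grid11;

# Node-specific VO list: must be defined before including the machine type
variable VOS = list('atlas', 'dteam', 'ops');

include { 'machine-types/ce' };

# software repositories (should be last)
include { 'repository_common' };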

As an alternative to listing explicitly all the VOs supported on a node, it is possible to define variable VOS as the string ALL (instead of a list). In this case, all VOs with parameters available in the configuration (normally all the VOs registered in the CIC portal) are configured. This specific value should normally be restricted to UIs where there are no VO accounts created. Its main usage is to let a user on a UI act as a member of any VO they may be registered in. On a gsissh-enabled UI, it is advisable to restrict the VOs allowed to connect to the UI with gsissh to a limited number of VOs when VOS='ALL'. See the section on UI configuration for more details.
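
A possible use of the ALL value in a UI profile is sketched below. The profile name and the machine type template name machine-types/ui are assumptions made for the example; check the machine-types directory for the actual name.

object template profile_ui01;

# Configure every VO with parameters available in the configuration
variable VOS = 'ALL';

# Machine type template name is assumed here
include { 'machine-types/ui' };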

For each VO listed in VOS, there must be a template defining the VO parameters in vo/params or an entry in vo/site/aliases. The template name in vo/params must be the VO full name even though a VO alias name is used in VOS. If the VO to be added has no template to define its parameters, refer to the next section about adding a new VO.

Note: VO alias names are locally defined alternative names for VOs. Unlike VO names, which are guaranteed to be unique, a VO alias may clash with another alias or with a full VO name. Aliases should be used mainly to maintain backward compatibility in existing configurations where a name other than the VO full name was used. Their use is strongly discouraged for a new configuration or for new VOs added to an existing configuration.

For some specific purposes, it is possible to execute a site-specific template just before starting the VO configuration, after the site parameters have been read and the OS configuration has been done. Use the variable NODE_VO_CONFIG to specify the name of the template.

VO accounts

Templates related to VO configuration handle everything related to VO configuration on a specific node, including creation of VO accounts (pool accounts, SW manager...). See below for the parameters related to account creation.

Account names are generated from the first letters of the VO name plus some arbitrary characters unique to each VO. It is recommended to generate these characters with the new algorithm, which better ensures their uniqueness. For backward compatibility, the new algorithm is disabled by default. To enable it, define the following variable:

variable VO_USE_LEGACY_ACCOUNT_SUFFIX ?= false;

By default, the VO accounts are created locked to prevent their interactive use. There is one exception: if the variable GSISSH_SERVER_ENABLED equals true, these accounts are automatically unlocked. This happens mainly on UI and VOBOX.

Defining a VO alias name

Note: this feature was added mainly to ease the transition with first generation of configuration. Even though it is still supported, its use is discouraged as it tends to make configuration troubleshooting harder.

VO names, now based on a DNS-like name, can be quite long. It is possible to define a local alias for the VO name and use it in the site configuration in place of the VO name.

To define such an alias, a template aliases.tpl must exist in directory vo/site in your site or cluster directory. This template must define the variable VOS_ALIASES as a nlist where the key is the VO alias name and the value the actual VO name.

For example:

variable VOS_ALIASES ?= nlist(
  'agata',  'vo.agata.org',
  'apc',  'vo.apc.univ-paris7.fr',
  'astro',  'astro.vo.eu-egee.org',
  'lal',  'vo.lal.in2p3.fr',
);

Site-Specific Defaults for VO Parameters

It is possible to define site-specific defaults for VOs that override the standard defaults. This must be done by defining the entry DEFAULT in the nlist variable VOS_SITE_PARAMS. This entry is used to define parameters that will apply to all VOs if they are not defined explicitly in VO parameters.

Each entry value must be the name of a structure template or a nlist defining any of these properties :

  • create_home : Create home directories for VO accounts. Default defined by variable CREATE_HOME.
  • create_keys : Create SSH keys for VO accounts. Default defined by variable CREATE_KEYS.
  • unlock_accounts : a regexp defining host names where the VO accounts must be unlocked
  • pool_digits : default number of digits to use when creating pool accounts
  • pool_offset : offset from VO base uid for the first pool account (normal users)
  • pool_start : index of the first account to create for a VO in its allocated VO range
  • pool_size : number of pool accounts to create by default for a VO (normal users)
  • fqan_pool_size : number of pool accounts to create for specific FQANs if there is no specific value defined in the FQAN mapping entry. 1 disables the use of pool accounts (a static account is created instead). Default: 1.
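  • sw_mgr_role : description of VO software manager role. Avoid changing the default.
  • swmgr_pool_accounts_disabled : when true, pool accounts are not used for the VO software manager, even if pool accounts are enabled for other FQANs.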
  • Location of standard services. See below.

Note: some properties are invalid in the context of the DEFAULT entry, in particular: account_prefix, base_uid, gid, name, voms_servers, voms_mappings (formerly voms_roles).
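
As an illustration, a DEFAULT entry using a few of the properties listed above could look like this. The values are arbitrary examples, not recommendations.

variable VOS_SITE_PARAMS ?= nlist(
  # Applied to every VO that does not define these properties explicitly
  'DEFAULT', nlist('pool_size', 50,
                   'pool_digits', 3,
                   'create_keys', false)
);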

Overriding default VO Parameters

In addition to defining default values for VO parameters, it is possible to override default VO parameters, as specified in templates located in vo/params, with site-specific values. This can be done on a per-VO basis or for all VOs configured on a machine, using the previously described variable (nlist), VOS_SITE_PARAMS:

  • To override default parameters for one specific VO, the key must be the VO name, as used in VOS variable.
  • To override default parameters for all configured VOs, use special entry LOCAL.

The value can be either a nlist defining the site-specific parameters or a string referring to a template. When the entry for a VO is a nlist or is not defined, if a template vo/site/voname can be located, it is loaded before applying the parameters specified in VOS_SITE_PARAMS.

The allowed properties are the same as for default parameters.

Note: some properties are invalid in the context of the LOCAL entry (as with DEFAULT), in particular: account_prefix, base_uid, gid, name, voms_servers, voms_mappings.

The site-specific parameters are merged with the default ones for each VO. They never replace default parameters. In particular, for voms_servers and voms_mappings, attributes specified in site-specific parameters are merged with attributes specified in the standard parameters for the same VOMS server or VOMS mapping. Site parameters need only specify non-default attributes, not the whole list of servers or roles with all their attributes.

For voms_servers, if the entry in site-specific parameters has the attribute host defined and if there is no matching entry in standard parameters, a new VOMS server is added. If the entry in site parameters has no host attribute defined but the name attribute is present, the site parameters are taken into account only if there is a matching entry in standard parameters.
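
The following sketch shows what adding a site-specific VOMS server for one VO might look like in a vo/site template. The VO (biomed), the server name, host and port are all hypothetical values chosen for the example.

structure template vo/site/biomed;

# The host attribute is defined: if no standard entry matches,
# this server is added to the VOMS servers of the VO
'voms_servers' = list(
  nlist('name', 'voms.example.org',
        'host', 'voms.example.org',
        'port', 15000)
);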

For example, to define a site-specific WMS for VO Alice, the recommended solution is to create a template vo/site/alice.tpl in your site directory like :

structure template vo/site/alice;

'wms_hosts' = 'wms.example.org';

Alternatively, you can define these parameters directly in VOS_SITE_PARAMS :

variable VOS_SITE_PARAMS = nlist ('alice', nlist('wms_hosts' , 'wms.example.org',
                                                ),
                                 );

Site-specific parameters for VOMS role accounts

VOs often define roles in VOMS for specific purposes. For example, the ATLAS VO defines the role production which is used by users in charge of running production jobs. The roles defined for a VO in its VO ID card (see https://cic.gridops.org) are automatically retrieved by the SCDB update.vo.config ant task. By default, a single account with an arbitrary suffix is automatically generated for each role found. For example, the following is an extract of the mapping information generated for roles in the ATLAS VO:

"voms_mappings" ?= list(
     nlist("description", "SW manager",
       "fqan", "/atlas/Role=lcgadmin",
       "suffix", "s"),
     nlist("description", "production",
       "fqan", "/atlas/Role=production",
       "suffix", "p"),
     nlist("description", "pilot",
       "fqan", "/atlas/Role=pilot",
       "suffix", "hs"),
...

A particular site may wish to define its own parameters for a particular VOMS role. This can be done easily by defining the attribute voms_mappings in VO site-specific parameters. If the entry in site-specific parameters has the attribute fqan defined and if there is no matching entry in standard parameters, a new VOMS mapping is added at the end of the list of standard mappings. If the entry in site parameters has no fqan attribute defined but the description attribute is present, the site parameters are taken into account only if there is a matching entry in standard parameters.

Note: in previous versions of the QWG templates, there used to be a variable VOMS_ROLE_CONFIG_SITE to do the site-specific configuration of VOMS mappings. This variable is now ignored and must be replaced by a voms_mappings definition in VO site-specific parameters, as explained above.

For each mapping, in addition to the standard attributes (description, fqan, suffix, suffix2), the following attributes can be used:

  • enabled: when set to false, the matching mapping in standard templates is ignored.
  • pool_size: if greater than 1, the number of pool accounts to create for this mapping. If 1, disable the use of pool accounts for this mapping.

For example, to configure the CMS role production to use pool accounts (with 20 accounts) and disable the role t1production, you may add the following to your vo/site/cms.tpl (or directly in VOS_SITE_PARAMS variable):

'voms_mappings' = list(
     nlist('description', 'production',
           'pool_size', 20,
          ),
     nlist('fqan', '/cms/Role=t1production',
           'enabled', false,
          ),
);

To use pool accounts with all the specific FQANs declared in VO parameters, using the same number of accounts in the pool for each FQAN, it is possible to define the property fqan_pool_size in the VO-specific entry or in the DEFAULT entry of the VOS_SITE_PARAMS variable. In addition, it is possible to exclude the use of pool accounts for the software manager (as it has implications on software area permissions), even if pool accounts are enabled for other FQANs, by defining the VO attribute swmgr_pool_accounts_disabled to true, either in a VO-specific entry or in the DEFAULT entry.

For example, to use pool accounts for each specific FQAN (except software manager) of each VO, creating 10 accounts per FQAN, except for Atlas where 20 accounts per FQAN are created:

variable VOS_SITE_PARAMS ?= nlist(
  'DEFAULT', nlist('fqan_pool_size', 10,
                   'swmgr_pool_accounts_disabled', true,
                  ),
  'atlas',   nlist('fqan_pool_size', 20),
);

When using pool accounts for specific FQANs, a group is created for all pool accounts related to each FQAN. By default the pool account primary group remains the VO group and this FQAN group is added as a secondary group. Variable VO_FQAN_POOL_ACCOUNTS_USE_FQAN_GROUP, if defined to true, changes the primary group to the FQAN group, the VO group becoming the secondary group.
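
For example, to make the FQAN group the primary group of FQAN pool accounts, the variable can be defined as sketched here (typically in a site-wide template, which is an assumption about where a site would place it):

# FQAN group becomes the primary group, VO group the secondary group
variable VO_FQAN_POOL_ACCOUNTS_USE_FQAN_GROUP ?= true;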

Adding a New VO

Note: the procedure to create a new VO definition here is for very specific cases. The normal procedure is to register it properly on CIC Portal and generate the configuration information from the portal with ant update.vo.config (when using SCDB).

Adding a new VO involves the creation of a template defining VO parameters. This template name must be the name you use to refer to the VO in the rest of the configuration but is not required to be the real VO name (it can be an alias used in the configuration). This template must be located in directory vo/params, in one of your cluster- or site-specific hierarchies of templates or in gLite templates.

Note : if you create a template for a new VO, be sure to commit it to the QWG repository if you have write access to it, or to send it to the QWG developers. There is normally no reason for a VO definition not to be generally available.

To create a template to describe a new VO, the easiest is to copy the template for an already configured VO. The main variables supported in this template are :

  • name : VO official name. No default.
  • account_prefix : prefix to use when creating accounts for the VO. Generally the first 3 letters of the VO name. No default.
  • voms_servers : a nlist describing VOMS server used by the VO, if any. If the VO has several (redundant) VOMS servers, this property can be a list of nlist. For each VOMS server, supported properties are :
    • name : name of the VOMS server. This is a name used internally by template. By default, template defining VOMS server certificate has the same name. No default.
    • host : VOMS server host name. No default.
    • port : VOMS server port associated with this VO. No default.
    • cert : template name, in vo/certs , defining VOMS server certificate. If not specified, defaults to the VOMS server name.
  • voms_mappings (replaces deprecated voms_roles) : list of VOMS groups/roles supported by the VO. This property is optional. This is a nlist with one entry per mapping (mapped accounts). The supported properties for each entry are :
    • description : description of the mapping. This property is informational, except for VO software manager where it must be SW manager (with this exact casing).
    • pattern (replaces deprecated name) : VO group/role combinations mapped to this account. This can be a string or a list of strings (if several group/role combinations are mapped to the same account). Each value can be either a role name (without /ROLE=) or a group/role combination in standard format /group1/group2/.../ROLE=rolename. Note that the /ROLE keyword is required to be upper case, that there may be several groups but only one role and that, if both are present, the role must be the last one. Look at LHCb VO parameters for an example.
    • suffix : suffix to append to account_prefix to build account name associated with this role.
  • base_uid : first uid to use for the VO.
  • create_home : Create home directories for VO accounts. Default defined by variable CREATE_HOME.
  • create_keys : Create SSH keys for VO accounts. Default defined by variable CREATE_KEYS.
  • gid : GID associated with VO accounts. Default : first pool account UID.
  • pool_size : number of pool accounts to create for the VO. Default : 200.
  • pool_digits : number of digits to use for pool accounts. Must be large enough to handle pool_size. Default is 3.
  • pool_offset : define offset from VO base uid for the first pool account
  • Location of standard services. See below.

In addition to this template, you need to have another template defining the public key of the VOMS server used by the VO. This template has the name of the VOMS server by default. It can be explicitly defined with the cert property of a VOMS server entry. If the new VO is using an already used VOMS server, there is no need to add the certificate.
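
A VO parameter template built from the properties above might look like the following sketch. The VO name vo.example.org, the VOMS server, the port and the uid values are hypothetical. The mapping entry follows the attribute names (description, fqan, suffix) of the generated ATLAS extract shown earlier, which is an assumption about the layout expected by the templates in use at your site.

structure template vo/params/vo.example.org;

'name' ?= 'vo.example.org';
'account_prefix' ?= 'exa';

# Single VOMS server; use a list of nlist for redundant servers
'voms_servers' ?= nlist('name', 'voms.example.org',
                        'host', 'voms.example.org',
                        'port', 20000);

# The description must be exactly 'SW manager' for the software manager
'voms_mappings' ?= list(
  nlist('description', 'SW manager',
        'fqan', '/vo.example.org/Role=lcgadmin',
        'suffix', 's')
);

'base_uid' ?= 50000;
'pool_size' ?= 50;
# Must be large enough to handle pool_size
'pool_digits' ?= 3;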

Default Services for a VO

Location of standard services to use with a specific VO can be defined either in the VO parameters or in the site-specific parameters for a VO. Services that can be configured are :

  • proxy : name of the proxy server used by the VO. No default, optional.
  • rb_hosts : LCG RB host name to use by default. Service ports will be set to default values. Can be a list or a single value.
  • wms_hosts : gLite WMS host name to use by default. Service ports will be set to default values. Can be a list or a single value.
  • catalog : defines the catalog type used by the VO. Optional. Must be defined only for VOs still using RLS (value must be rls or RLS).

In addition to variables above, it is possible to use the following variables if you need more control over service location or endpoints :

  • nshosts : name:port of the RB used by the VO (Network Server). No default.
  • lbhosts : name:port of the RB used by the VO (Logging and Bookkeeping). No default.
  • wms_nshosts : name:port of the WMS used by the VO (Network Server). Can be a list or a single value. No default.
  • wms_lbhosts : name:port of the WMS used by the VO (Logging and Bookkeeping). Can be a list or a single value. No default.
  • wms_proxies : endpoint URI of WMProxy used by the VO. Can be a list or a single value. No default.
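
As a short sketch, the default services of a VO could be set in its site-specific template like this. The host names are placeholders; the example uses only the simple properties (proxy, wms_hosts), leaving service ports at their default values.

structure template vo/site/atlas;

# MyProxy server used by the VO (hypothetical host)
'proxy' = 'myproxy.example.org';

# Several WMS hosts can be given as a list
'wms_hosts' = list('wms1.example.org', 'wms2.example.org');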

VO-Specific Areas

There are a couple of variables available to customize VO-specific areas (software area, VO accounts home directories...) :

  • VO_SW_AREAS : a nlist with one entry per VO (key is the VO name as used in VOS variable). The value is a directory to use for the VO software area. Be sure to list this directory or its parent in WN_SHARED_AREAS if you want to use a shared filesystem for this area (this is highly recommended). Directories listed in this variable will be created with the appropriate permissions (0755 for VO group). In addition to per VO entries, entry DEFAULT may be used to create one SW area for each configured VO on the current node : in this case the value is the parent directory for SW areas and the per VO directory name is the VO name (default) or the SW manager userid if variable VO_SW_AREAS_USE_SWMGR is defined to true.
  • VO_SW_AREAS_USE_SWMGR : when set to true, VO SW manager userid is used as a directory name for the SW area for VOs without an explicit entry in VO_SW_AREAS.
  • VO_HOMES : a nlist with one entry per VO (key is the VO name as used in VOS variable). The value is a directory prefix to use when creating home directories for accounts. A suffix will be added to this name corresponding to the VO role suffix for role accounts or the account number for pool accounts. By default, VO accounts are created in /home. Two keywords allow creating a subdirectory per VO under the parent directory to avoid too many entries at the same level. Look at documentation about LCG CE for more information.
  • VO_SWMGR_HOMES : a nlist with one entry per VO (key is the VO name as used in VOS variable). The value is a directory to use as the home directory for the VO software manager. If there is no entry for a VO, VO_HOMES is used. The main purpose of this variable is to define the home directory for the software manager as the VO software area. This can be achieved easily by assigning VO_SW_AREAS to this variable.
  • CREATE_HOME : this variable controls creation of VO accounts home directories. It accepts 3 values : true, false and undef. undef is a conditional true : home directories are not created if they reside on an NFS shared file system (it is listed in WN_SHARED_AREAS) and the NFS server is not the current machine.
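
The variables above could be combined as in the following sketch. The directory names are hypothetical; remember that the parent directory should also be listed in WN_SHARED_AREAS if the area is on a shared file system.

variable VO_SW_AREAS ?= nlist(
  # Parent directory: one subdirectory per configured VO is created under it
  'DEFAULT', '/swareas',
  # Explicit software area for one VO
  'atlas', '/swareas_large/atlas'
);

# Name per-VO directories after the VO (default) rather than the SW manager userid
variable VO_SW_AREAS_USE_SWMGR ?= false;

# Use the VO software area as the software manager home directory
variable VO_SWMGR_HOMES ?= VO_SW_AREAS;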

Tuning VO configuration on a specific node

Each machine type template defines the VO configuration (pool accounts, gridmap file/dir...) appropriate to the machine type. If you want to change this configuration on a specific node, you can use the following variables :

  • NODE_VO_ACCOUNTS (boolean) : VO accounts must be created for each VO initialized. Default : true.
  • NODE_VO_GRIDMAPDIR_CONFIG (boolean) : gridmapdir entries must be initialized for pool accounts. Default : false.
  • NODE_VO_WLCONFIG (boolean) : initialize workload management environment for each VO. Normally enabled only on resource brokers. Default : false.
  • NODE_VO_CREATEHOME (boolean) : create home directories for pool accounts. Default : true.

In addition you can execute actions specific to the local machine by defining the following variable (mainly used to define a VO list specific to a node by assigning a non default value to VOS variable) :

  • NODE_VO_CONFIG (string) : site-specific template that must be included before actually doing VO initialization. Allows node-specific modifications to the default VO configuration. Default : none.

Note : before modifying default VO configuration for a specific machine, be sure what you want to do is valid. Misconfiguring VO can have dramatic effects on service availability.
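
A minimal sketch of node-level tuning is shown below, as two fragments. The template name site/vo/node_config is hypothetical, and the choice of variables is only an example of what such a customization might contain.

# Fragment 1: in the machine profile, before including the machine type
variable NODE_VO_GRIDMAPDIR_CONFIG = true;
variable NODE_VO_CONFIG = 'site/vo/node_config';

# Fragment 2: the site-specific template referenced above (separate file)
template site/vo/node_config;

# Node-specific VO list, applied just before VO initialization
variable VOS = list('atlas', 'dteam', 'ops');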

Mapping of VOMS groups/roles into grid-mapfile

grid-mapfile is used as a source of mapping information between user DNs and Unix accounts when this cannot be obtained from VOMS.

Default behaviour for describing user mapping in grid-mapfile used to be mapping users with a specific role to the account corresponding to this role. Unfortunately, the result is unpredictable if a user has several roles in the VO. The default in QWG templates, starting with release gLite-3.0.2-12, is to always map users to normal users in grid-mapfile. To obtain a mapping based on a specific role, users have to get a proxy with the required VOMS extensions using voms-proxy-init --voms.

Two variables allow modifying this default behaviour for generating grid-mapfile:

  • VO_GRIDMAPFILE_MAP_VOMS_ROLES: when set to true, a grid-mapfile entry is added for each valid VO FQAN in addition to the VO members.
  • VO_VOMS_FQAN_FILTER: this nlist allows to define on a per-VO basis what are the VOMS FQANs to add to the grid-mapfile. The key is a VO name or DEFAULT for the default entry. Default entry if present is applied to all VOs without an explicit entry. If there is no entry for a VO and there is no default entry defined, all VO users and valid FQANs are added to the grid-mapfile. This variable is ignored if VO_GRIDMAPFILE_MAP_VOMS_ROLES is not true. The entry value must be either a FQAN declared in VO parameters (without the initial /voname), a VOMS mapping description as declared in the VO parameters or / to allow all users and valid FQANs (this can be used to override a more specific value defined later in the configuration).

These 2 variables are mainly used on VO boxes where they should be defined with appropriate values by the standard configuration.
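
For illustration, the two variables might be combined as follows. The VO and the FQAN are examples only; the FQAN is given without the initial /voname, as required above.

# Add grid-mapfile entries for VO FQANs in addition to VO members
variable VO_GRIDMAPFILE_MAP_VOMS_ROLES ?= true;

variable VO_VOMS_FQAN_FILTER ?= nlist(
  # All users and valid FQANs for VOs without an explicit entry
  'DEFAULT', '/',
  # Only this FQAN for atlas (example value)
  'atlas', '/Role=production'
);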

Note: further restrictions can be imposed in authorization defined by the FQAN filter through user banning.

.lsc file support

QWG templates support .lsc files to authenticate VOMS servers. This is the default for all services except WMS, which does not support this file format yet.

To disable this option, add

 variable VOMS_LSC_FILE ?= false;

to the machine template.

Allocation of Service Accounts

Some services allow defining a specific account to be used to run the service. In this case, there is one template for each of these accounts in common/users. The name of the template generally matches the user account created or, when the template is empty, the name of the service.

A site can redefine account names or characteristics (uid, home directory...). To do this, you should not edit the standard templates directly, as the changes will be lost in the next version of the template (or you will have to redo them by hand). Instead, create a users directory somewhere in your site or cluster hierarchy (e.g. under the site directory, not directly at the same level, else it will not work without adjusting cluster.build.properties) and put your customized version of the template there.

Note : don't change the name of the template, even if you change the name of the account used (else you'll need to modify standard templates requiring this user).

Trusted CAs

There is one template defining the list of trusted CAs (called the CA trust policy). A default trust policy, called the EGI core policy, is distributed as part of the standard templates. If you need to adjust it or use another trust policy, produce a site template with the information about all the CAs you accept (generally a list of RPMs) and define the variable SECURITY_CA_TRUST_POLICY to the name of this template.

If you want a trust policy example, look at the default policy. If you need to update this template, refer to the standard procedure to do it.

Trusted CAs are defined through a set of RPMs. The default template used to configure the RPM repository holding them is repository/ca : this template should be provided by the site as part of the site RPM repository templates. If you want to use another template to configure the repository holding the CA RPMs, you must define the variable SECURITY_CA_RPM_REPOSITORY to the name of the template to use. If there is no specific template for this (or you use another means of configuring it), define this variable to null.
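
A site using its own trust policy and no dedicated CA repository template could define the two variables as sketched here. The template name site/security/ca_policy is a hypothetical example.

# Site-specific CA trust policy (hypothetical template name)
variable SECURITY_CA_TRUST_POLICY ?= 'site/security/ca_policy';

# No specific template configures the CA RPM repository
variable SECURITY_CA_RPM_REPOSITORY = null;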

Globus

Globus is used by most of the gLite services. The following variables configure Globus parameters, in particular the Globus ephemeral port ranges.

  • GLOBUS_TCP_PORT_RANGE_MIN: lower port in TCP ephemeral port range. Default: 20000.
  • GLOBUS_TCP_PORT_RANGE_MAX: upper port in TCP ephemeral port range. Must be greater than or equal to the lower port. Default: 25000.
  • GLOBUS_UDP_PORT_RANGE_MIN: lower port in UDP ephemeral port range. Default: none.
  • GLOBUS_UDP_PORT_RANGE_MAX: upper port in UDP ephemeral port range. Must be greater than or equal to the lower port. Default: none.
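
For example, a site whose firewall opens ports 30000 to 35000 for grid services (an arbitrary range chosen for the example) could define:

variable GLOBUS_TCP_PORT_RANGE_MIN ?= 30000;
variable GLOBUS_TCP_PORT_RANGE_MAX ?= 35000;

# UDP range has no default; define it only if needed
variable GLOBUS_UDP_PORT_RANGE_MIN ?= 30000;
variable GLOBUS_UDP_PORT_RANGE_MAX ?= 35000;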

LCAS / LCMAPS

LCAS and LCMAPS are 2 underlying services, generally used together, by most grid services to manage authorization and user mapping. LCAS is responsible for managing authorization based on configured policies (banned users, timeslots permitted...) and LCMAPS is responsible for mapping a grid DN to a Unix user account.

LCMAPS configuration is based on the configured VOs and on the VOMS group/role mapping rules.

LCAS can be configured with the following variables to restrict access to a grid resource like a CE:

  • LCAS_BANNED_USERS: list of user DNs forbidden access to the resource. By default, this list is effectively empty (it contains only a template DN which will never match a real user).
  • LCAS_TIMESLOT_ENTRIES: a list of timeslot specifications defining when the resource is open to grid access. See LCAS documentation for more information on the format. By default, there is no restriction.
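
As a sketch, banning two users from a resource could be expressed as follows. The DNs are made-up examples.

variable LCAS_BANNED_USERS ?= list(
  '/C=XX/O=Example Org/CN=Banned User One',
  '/C=XX/O=Example Org/CN=Banned User Two'
);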

Shared gridmapdir

QWG templates support configuration of a shared gridmapdir between different machines. This is typically used when several CEs share the same WNs to ensure a consistent mapping of DNs to userids through all CEs. The QWG implementation is not restricted to CEs, even though it doesn't really make sense for other services.

If several machines have to share the same gridmapdir, the variable GRIDMAPDIR_SHARED_PATH must be defined in their profile. This variable is undefined by default. When defined it must refer to an existing path on the machine that will use it or the gridmapdir will not be configured as shared.

Even though it is not mandatory, gridmapdir is generally shared using NFS. To enable NFS-sharing of the gridmapdir, you must define the variable GRIDMAPDIR_SHARED_SERVER to the host name serving the gridmapdir. It doesn't need to be one of the machines using it (for example it can be a dedicated NFS server). If the server is managed with Quattor, Quattor will ensure that NFS is properly configured to export the reference gridmapdir (as specified by SITE_DEF_GRIDMAPDIR on this machine) as GRIDMAPDIR_SHARED_PATH. On the "clients" (the other machines using the shared gridmapdir), NFS will be configured to mount the shared gridmapdir and SITE_DEF_GRIDMAPDIR will be redefined as a link to this mount point.

Two other variables allow customizing gridmapdir sharing according to your needs:

  • GRIDMAPDIR_SHARED_PROTOCOL: if anything different from nfs, QWG templates will not configure NFS for sharing the gridmapdir. The sharing must be done by other means in such a way that GRIDMAPDIR_SHARED_PATH is available when gridmapdir is configured on the client machine. Default: nfs.
  • GRIDMAPDIR_SHARED_CLIENTS: a list of machines sharing the gridmapdir. Default: CE_HOSTS variable (all the CEs sharing the same WNs).
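
The gridmapdir sharing variables described above could be combined as in the following sketch. The path and host names are hypothetical; only the variable names come from this page:

# Hypothetical path and host names, for illustration only
variable GRIDMAPDIR_SHARED_PATH = '/gridmapdir_shared';
variable GRIDMAPDIR_SHARED_SERVER = 'nfs.example.org';

# Optional: restrict the machines sharing the gridmapdir
# (default is the CE_HOSTS variable)
variable GRIDMAPDIR_SHARED_CLIENTS = list(
  'ce1.example.org',
  'ce2.example.org',
);

GRIDMAPDIR_SHARED_PROTOCOL is left undefined here, so the default (nfs) applies and QWG templates configure the NFS export and mounts.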

Shared File Systems

It is recommended to use a shared file system mounted (at least) on CE and WNs for VO software areas. It is also sometimes convenient to use a shared file system for VO pool accounts (this is more or less a requirement to run MPI jobs). Currently, QWG templates support the use of NFS or non-NFS shared file systems, but only the NFS service is configured by the templates. For other distributed file systems (AFS, LUSTRE, GPFS...), you must add the necessary configuration to the site-specific configuration.

Configuration is done by the following variables :

  • WN_SHARED_AREAS : a nlist with one entry per file system which is shared between worker nodes and CE (key is the escaped file system mount point). See below the format of the entries for NFS-served file systems. For other distributed file systems providing a global namespace (like AFS, LUSTRE, GPFS), the entry value must be undef. It is important to add an entry in this list for each shared file system, even if it is not NFS-served, as some parts of the configuration (e.g. Torque configuration) use this information to distinguish between local and shared file systems.
  • NFS_AUTOFS : when true, use autofs to mount NFS file systems on NFS clients. This is the recommended setting, as this is the only one to avoid complex inter-dependency in startup order. But for backward compatibility, default value is false.

Note : variable WN_NFS_AREAS has been deprecated and replaced by WN_SHARED_AREAS. If the latter is not defined, WN_NFS_AREAS is used if defined.

Note : non shared filesystem for home directories is supported only with Globus job manager lcgpbs.

NFS server is configured on any machine (whatever its machine type) managed by Quattor and listed as the NFS server for one of the entries in WN_SHARED_AREAS. All actions required are done automatically. If the NFS server listed is not managed by Quattor, it is necessary to force CREATE_HOME to true on one machine.

NFS client can potentially be configured on any machine type, but by default this is done only on CE and WNs. To configure the client on other machine types, define variable NFS_SERVER_ENABLED to one of the following values:

  • undef: configure NFS client if needed according to the configuration (WN_SHARED_AREAS contents).
  • true: force configuration of NFS client even though there is no NFS file system to mount on the machine.

Specifying server of a NFS file system

In variable WN_SHARED_AREAS, the value of each entry specifies the NFS server for the file system and, optionally, the file system mount point on the server if it is different from the one used on the clients. The general format for the value is a URL like:

nfs|nfs3|nfs4://server[/mount/point]

When the protocol specified is nfs (without an explicit version), the server will be configured with both versions and the client will be configured with v3, unless an explicit version is requested (see next section).

The legacy format:

server[:/mount/point]

is still supported and is equivalent to:

nfs3://server[/mount/point]
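
To make the URL format concrete, here is a minimal sketch of a WN_SHARED_AREAS definition for two NFS-served file systems. The server name and paths are hypothetical:

# Hypothetical server name and mount points, for illustration only
variable WN_SHARED_AREAS = nlist(
  # Same mount point on server and clients, NFS version left unspecified
  escape('/home'),    'nfs://nfs.example.org',
  # Mounted as /swareas on clients, exported from /export/swareas with NFS v4
  escape('/swareas'), 'nfs4://nfs.example.org/export/swareas',
);

# Recommended setting according to the description of NFS_AUTOFS above
variable NFS_AUTOFS = true;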

Selecting NFS version to use on the client

For NFS, both v3 and v4 are supported. When the version is specified in the protocol token of the server URL (see previous section), this version is used both on the server and on the clients. Otherwise both versions are configured on the server and the version configured on the client depends on the following variables:

  • NFS_CLIENT_VERSION: a nlist with one entry per node whose key is the escaped client host name and the value is a string ('3' or '4'). If the client configured has an entry in this variable, the specified NFS version is used.
  • NFS_CLIENT_DEFAULT_VERSION: a nlist where entry keys are either an escaped regexp matched against the node being configured or 'DEFAULT'. If host name of the client being configured is matched by one of the regexp, the specified value is used. Else if DEFAULT entry is present it is used.
  • If no match was found with the previous variables, v3 is used.

Suppose you want to configure v4 on all your grid nodes, and only on these nodes, and that their host names always start with the prefix grid and belong to the domain example.org. You can then use the following definition:

variable NFS_CLIENT_DEFAULT_VERSION = nlist(
  'DEFAULT',     '3',
  escape('^grid.*\.example\.org$'),   '4',
);
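
A per-node override with NFS_CLIENT_VERSION follows the same pattern. In this sketch the host name is hypothetical; as described above, the key is the escaped client host name:

# Hypothetical host name, for illustration only
variable NFS_CLIENT_VERSION = nlist(
  escape('grid10.example.org'), '4',
);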

Specifying NFS options

There are two variables to define mount options to be used with NFS file systems :

  • NFS_DEFAULT_MOUNT_OPTIONS : defines mount options to be used by default, if none are explicitly defined for a filesystem.
  • NFS_MOUNT_OPTS : defines mount options to be used for a specific file system. This variable is a nlist with one entry per file system : key must be the escaped path of the mount point.
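
The two variables above might be used as follows. This is only a sketch: the option strings and the /swareas path are hypothetical, and it assumes that each value is a plain comma-separated string of mount options, which this page does not state explicitly:

# Assumption: values are standard comma-separated mount option strings
variable NFS_DEFAULT_MOUNT_OPTIONS = 'rw,noatime';

# Hypothetical file system mounted read-only, overriding the default options
variable NFS_MOUNT_OPTS = nlist(
  escape('/swareas'), 'ro,noatime',
);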

Defining NFS exports

NFS exports can be defined using a set of variables. By default, only CE and worker nodes are given access to the NFS server. These variables can be redefined either in the NFS server profile, in the cluster the NFS server belongs to, or in the gLite site parameters used by the NFS server.

Note : the following variables don't configure filesystem mounting. For this see Configuring shared filesystems.

Variables available to customize the NFS export ACL are :

  • NFS_CE_HOSTS : list of CE hosts requiring access to NFS server (default is CE_HOST)
  • NFS_WN_HOSTS : list of WN hosts requiring access to NFS server (default is WN_HOSTS)
  • NFS_LOCAL_CLIENTS : list of other local hosts requiring access to NFS server

These variables can be a string, a list or a nlist. A string value is interpreted as a list with one element. When specified as a list or string, the value must be a regexp matching the names of nodes that must be given access to the NFS server. In this case, the access rights (export options) are the string specified in variable NFS_DEFAULT_RIGHTS. When specified as a nlist, the key must be an escaped regexp matching node names, in exports format (only * and ? wildcards permitted), and the value is the export options between ().

Note : when possible, it is recommended to replace the default value for NFS_WN_HOSTS (list of all WNs) by one or several regexps matching WN names, to reduce the number of hosts on the export line.
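
As a sketch of the nlist and list forms described above, the following replaces the default WN list by a single pattern. The host name patterns and export options are hypothetical:

# Hypothetical pattern in exports format (only * and ? wildcards),
# with explicit export options between ()
variable NFS_WN_HOSTS = nlist(
  escape('grid*.example.org'), '(rw,no_root_squash)',
);

# List form: access rights are then taken from NFS_DEFAULT_RIGHTS
variable NFS_LOCAL_CLIENTS = list(
  'ui?.example.org',
);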

NFS_DEFAULT_RIGHTS is a nlist which must contain a DEFAULT entry used for any file system without an explicit entry and optionally one entry per file system (key is the escaped file system path) when defaults are not appropriate. If not defined, default is rw with root squashing enabled for all file systems (DEFAULT entry), except /home where root squashing is disabled.

Another variable, NFS_CLIENT_HOSTS, lets you define the clients allowed to access the file system on a per file system basis. There is a default entry (DEFAULT) used for any file system without an explicit entry. The default value for the default entry is all the hosts specified by NFS_CE_HOSTS, NFS_WN_HOSTS and NFS_LOCAL_CLIENTS. Keys specifying file systems must be the escaped file system mount point. The host list of allowed clients may be specified using regexps in exports format.

Note: currently NFS_CLIENT_HOSTS is used to build the list of hosts in exports file but has no impact on the mounting of file systems on clients.

Relocation of Home Directories of VO Accounts

When using an NFS-served file system for home directories, the traditional approach of mounting it under '/home' has several drawbacks. In particular, all service accounts also have NFS-based home directories, and this may impact all services when NFS becomes unavailable or unresponsive. On the other hand, it is desirable to keep a unified configuration shared by machines sharing the NFS file systems and the other machines (e.g. WMS, VOBOX... all machines with VO accounts).

With QWG templates, this is easily done by defining variable VO_HOMES_NFS_ROOT to the parent directory to use for home directories on a machine with NFS client configured, when the parent described in variable VO_HOMES is /home. The directory pointed to by VO_HOMES_NFS_ROOT must correspond to an entry (or a child of an entry) in WN_SHARED_AREAS. Look at the site parameter example for more details.
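
For instance, a site could relocate NFS-served home directories as sketched below. The path is hypothetical and, as stated above, it must match an entry (or a child of an entry) in WN_SHARED_AREAS:

# Hypothetical parent directory for VO account home directories
# on machines with the NFS client configured
variable VO_HOMES_NFS_ROOT = '/home_nfs';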

When modifying an existing configuration, careful planning is needed. This cannot be done on the fly. To avoid a long reconfiguration of ncm-accounts, this generally involves:

  • On the NFS server, remount the file system containing home directories on the new mount point
  • Delete accounts using /home (except ident) from /etc/passwd. This can be done with a script deployed and executed with ncm-filecopy
  • Remove symlink /home if any (autofs configuration) and create a directory /home
  • Update your site parameters and deploy the changes, defining ncm-useraccess as a post-dependency for ncm-accounts if it is used in the configuration. This will ensure that during deployment all accounts are recreated and the ssh, Kerberos... configuration for the user is done.

NFS Server

Base template : machine-types/nfs.

When using this template, it is possible to configure a machine as a dedicated NFS server whose configuration is shared with grid machines for file system configuration and accounts. But in QWG templates, any gLite machine type will be configured as a NFS server as soon as the machine is listed as the NFS server for one of the file systems in WN_SHARED_AREAS.

Monitoring

The variable MONITORING_CONFIG_SITE, which defaults to 'site/monitoring/config', can be used to specify the monitoring tools template that will be included.
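
A minimal sketch of overriding this variable, assuming a hypothetical site template named site/monitoring/lemon:

# Hypothetical template name; it must exist in the site templates
variable MONITORING_CONFIG_SITE = 'site/monitoring/lemon';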

RPMs Repositories

repository/config/glite.tpl describes the RPM repositories used to locate RPMs required by gLite templates. The default RPM repository configuration in QWG Templates requires 5 RPM repositories plus an optional one for each gLite version. The names given here are the default ones.

  • glite_repos_prefix : gLite RPMs shipped with gLite.
  • glite_repos_prefix_externals : RPMs required by gLite and shipped with it but developed and maintained outside gLite.
  • glite_repos_prefix_updates : official updates to gLite base RPMs, as provided by gLite releases.
  • glite_repos_prefix_unofficial (optional) : unofficial updates to gLite base RPMs used at the site. Normally empty.
  • mpi : RPMs related to MPI.
  • ca : CA RPMs as distributed by Grid PMA.

glite_repos_prefix can be customized without editing the standard template, by defining the REPOSITORY_GLITE_PREFIX variable. If not explicitly defined, it defaults to glite_3_0_0 for gLite 3.0 and glite_3_1 for gLite 3.1.
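
For example, a site using its own naming convention could define the prefix as follows. The value is hypothetical; the repository templates (prefix, prefix_externals, prefix_updates...) must then be named accordingly:

# Hypothetical prefix, for illustration only
variable REPOSITORY_GLITE_PREFIX = 'glite_3_1_local';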

All required repositories must have an associated template whose name is the same as the repository, in site- or cluster-specific templates. The optional repository is ignored if its associated template is not present. Each template describes the content of its repository. When using SCDB, these templates are maintained with the command ant update.rep.templates.

Note : it is not required to use this structure and you can edit this template to match your local conventions, if different. When upgrading QWG templates, be sure to revert changes to this template.

A template version of these RPM repositories is distributed as part of the examples (templates/trunk/sites/example/repository). They can be used to compile the examples, but for deployment of a real configuration you need to build your own version of these templates. You can create an initial version of these repositories by downloading RPMs from the URL mentioned at the top of the template examples, with wget or src/utils/misc/rpmUpdates.pl. Then update the URL at the top of the template examples to match your local repositories.

Last modified on Mar 2, 2012, 5:05:55 PM