= gLite Template Customization =

[[TracNav]]

[[TOC(inline)]]

Site customization of QWG templates is done through a small set of templates that define the variables used as input by QWG templates. This page doesn't cover basic OS configuration, which is described in the page about the [wiki:Doc/TemplateCustom template framework].

All site parameters related to QWG middleware are supposed to be declared in template {{{pro_lcg2_config_site.tpl}}}. A sensible default value is provided for all required variables in template source:template/trunk/grid/glite-3.0.0/defaults/site.tpl provided as part of QWG templates. This template must be included in the site {{{pro_lcg2_config_site.tpl}}}, which must provide an explicit value for at least all the variables set to {{{undef}}} in template source:template/trunk/grid/glite-3.0.0/defaults/site.tpl.

To ease the transition from LCG2 to gLite3, the template defining default parameters can still be accessed as source:template/trunk/grid/lcg-2.7.0/sources/pro_lcg2_config_system_defaults.tpl.

== Machine types ==

QWG templates provide a template per machine type (CE, SE, RB, ...). They are located in the {{{machine-types}}} directory and are intended to be generic templates: no modification should be needed. To configure a specific machine with gLite middleware, you just need to include the appropriate machine type template in the machine profile, after setting the variable {{{xxx_CONFIG_SITE}}} to the name of a template containing the configuration specific to this machine (look in the machine type template for the exact name of the variable).
Here is an example for configuring a Torque-based CE:

{{{
object template profile_grid10;

# Define specific configuration for a GRIF CE to be added to
# standard configuration
variable CE_TORQUE_CONFIG_SITE = "pro_ce_torque_grif";

# Configure as a CE (Torque) + Site's BDII
include pro_ce_torque;

#
# software repositories (should be last)
#
include repository_common;
}}}

In this example, {{{CE_TORQUE_CONFIG_SITE}}} specifies the name of a template defining the Torque configuration.

All the machine types share a common basic configuration, described in template `machine-types/base.tpl`. This template allows site-specific configuration (e.g. configuration of a monitoring agent) to be added to this common basic configuration. This is done by setting variable {{{GLITE_BASE_CONFIG_SITE}}} to the name of a template containing the site-specific configuration, which is added at the end of the common configuration. This variable can be defined, for example, in template {{{pro_site_cluster_info.tpl}}}.

== VO Configuration ==

The list of VOs to configure on a specific node is defined in variable `VOS`. Generally a site-wide default value is defined in `pro_lcg2_config_site.tpl` (with operator `?=`). This value can be overridden on a specific machine by defining the `VOS` variable in the machine profile, before including the machine type template. An example of `VOS` definition is:

{{{
variable VOS ?= list('alice',
                     'atlas',
                     'biomed',
                     'calice',
                     'cms',
                     'cppm',
                     'dteam',
                     'dzero',
                     'egeode',
                     'lhcb',
                     'ops',
                     'planck',
                    );
}}}

''Note: `dteam` and `ops` are mandatory VOs.''

For each VO listed in `VOS`, there must be a template defining the VO parameters in `vo/params`. The template name must be the same as the VO name used in `VOS`. If the VO to be added has no template defining its parameters, refer to the section below about adding a new VO.

=== Defining Site Specific Defaults for VOs ===

It is possible to define site-specific defaults for VOs that override the standard defaults.
This must be done by defining variable `VOS_SITE_PARAMS` as a nlist with an entry `DEFAULT`. The value must be the name of a structure template defining any of these properties:
 * `create_home` : create home directories for VO accounts. Default is defined by variable `CREATE_HOME`.
 * `create_keys` : create SSH keys for VO accounts. Default is defined by variable `CREATE_KEYS`.
 * `pool_digits` : default number of digits to use when creating pool accounts.
 * `pool_offset` : offset from the VO base uid for the first pool account.
 * `pool_size` : number of pool accounts to create by default for a VO.
 * `sw_mgr_role` : description of the VO software manager role. Avoid changing the default.

=== Adding a new VO ===

Adding a new VO involves creating a template defining the VO parameters. The template name must be the name you use to refer to the VO in the rest of the configuration, but it is not required to be the real VO name (it can be an alias used in the configuration). This template must be located in directory `vo/params`, in one of your cluster or site specific hierarchies of templates or in gLite templates.

''Note: if you create a template for a new VO, be sure to commit it to the QWG repository if you have write access to it, or send it to QWG developers. There is normally no reason for a VO definition not to be generally available.''

To create a template describing a new VO, the easiest way is to copy the template of an already configured VO. The main variables supported in this template are:
 * `name` : VO official name. No default.
 * `account_prefix` : prefix to use when creating accounts for the VO. Generally the first 3 letters of the VO name. No default.
 * `voms_servers` : a nlist describing the VOMS server used by the VO, if any. If the VO has several (redundant) VOMS servers, this property can be a list of nlists. For each VOMS server, supported properties are:
   * `name` : name of the VOMS server. This is a name used internally by templates.
     By default, the template defining the VOMS server certificate has the same name. No default.
   * `host` : VOMS server host name. No default.
   * `port` : VOMS server port associated with this VO. No default.
   * `cert` : name of the template, in `vo/certs`, defining the VOMS server certificate. If not specified, defaults to the VOMS server name.
 * `voms_roles` : list of VOMS roles supported by the VO. This property is optional. For each role, the entry is a nlist with the following possible properties:
   * `description` : description of the VO role. This property is informational, except for the VO software manager, where it must be "SW manager".
   * `name` : VO role name, as defined on the VOMS server.
   * `suffix` : suffix to append to `account_prefix` to build the account name associated with this role.
 * `proxy` : name of the proxy server used by the VO. No default, optional.
 * `nshosts` : name:port of the RB used by the VO (Network Server). No default.
 * `lbhosts` : name:port of the RB used by the VO (Logging and Bookkeeping). No default.
 * `pool_size` : number of pool accounts to create for the VO. Default: 200.
 * `pool_digits` : number of digits to use for pool accounts. Must be large enough to handle `pool_size`. Default: 3.
 * `base_uid` : first uid to use for the VO.
 * `catalog` : catalog type used by the VO. Optional. Must be defined only for VOs still using `RLS` (value must be `rls` or `RLS`).
 * `create_home` : create home directories for VO accounts. Default is defined by variable `CREATE_HOME`.
 * `create_keys` : create SSH keys for VO accounts. Default is defined by variable `CREATE_KEYS`.

In addition to this template, you need another template defining the public key of the VOMS server used by the VO. By default, this template has the name of the VOMS server. Its name can be set explicitly with the `cert` property of a VOMS server entry. If the new VO uses a VOMS server that is already in use, there is no need to add the certificate.
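
As an illustration of the properties listed above, a template for a hypothetical VO could look like the sketch below. The VO name `myvo`, the server names, ports, role name and uid are placeholders invented for this example, and the `structure template` form and the value types are assumptions: check an existing template in `vo/params` for the exact layout before reusing it.

{{{
# Hypothetical VO definition, to be saved as vo/params/myvo.tpl.
# Every value below is a placeholder, not a real VO or server.
structure template vo/params/myvo;

"name" = "myvo.example.org";
"account_prefix" = "myv";

# Single VOMS server. 'cert' is omitted, so the certificate template
# defaults to the VOMS server name ('myvoms', looked up in vo/certs).
"voms_servers" = nlist("name", "myvoms",
                       "host", "voms.example.org",
                       "port", 15000
                      );

# One role: the VO software manager, mapped to account 'myv' + 's'.
"voms_roles" = list(
    nlist("description", "SW manager",
          "name", "lcgadmin",
          "suffix", "s"
         )
);

# RB endpoints, in name:port form
"nshosts" = "rb.example.org:7772";
"lbhosts" = "rb.example.org:9000";

# 50 pool accounts fit in 3 digits
"pool_size" = 50;
"pool_digits" = 3;
"base_uid" = 40000;
}}}

With such a template in place, the VO would be enabled on a node by adding `'myvo'` to the `VOS` list, since the template name must match the name used in `VOS`.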
=== Tuning VO configuration on a specific node ===

Each machine type template defines the VO configuration (pool accounts, gridmap file/dir...) appropriate to the machine type. If you want to change this configuration on a specific node, you can use the following variables:
 * `NODE_VO_POOLACCOUNTS` (boolean) : pool accounts must be created for each VO initialized. Default: true.
 * `NODE_VO_GRIDMAPDIR_CONFIG` (boolean) : gridmapdir entries must be initialized for pool accounts. Default: value of the `NODE_VO_POOLACCOUNTS` variable.
 * `NODE_VO_WLCONFIG` (boolean) : initialize workload management environment for each VO. Normally enabled only on resource brokers. Default: false.
 * `NODE_VO_CREATEHOME` (boolean) : create home directories for pool accounts. Default: true.

In addition, you can execute actions specific to the local site or machine by defining the following variable:
 * `NODE_VO_SITE_CONFIG` (string) : site-specific template that must be included before actually doing VO initialization. Allows specific modifications to the default VO configuration. Default: none.

'''Note: before modifying the default VO configuration for a specific machine, be sure that what you want to do is valid. Misconfiguring a VO can have dramatic effects on service availability.'''

== Allocation of Service Accounts ==

Some services allow a specific account to be defined for running the service. In this case, there is one template for each of these accounts in `common/users`. The name of the template generally matches the user account created or, when the template is empty, the name of the service.

A site can redefine account names or characteristics (uid, home directory...). To do this, you should not directly edit the standard templates, as the changes will be lost in the next version of the template (or you will have to redo them by hand). You should create a `users` directory somewhere in your site or cluster hierarchy (e.g.
under the `site` directory, not directly at the same level, else it will not work without adjusting `cluster.build.properties`) and put your customized version of the template there.

'''Note: don't change the name of the template, even if you change the name of the account used''' (else you'll need to modify the standard templates that need this user).

== Accepted CAs ==

There is one template defining all the accepted CAs. We generally produce a new one each time there is a new release of the list of CAs officially accepted by EGEE. If you need to adjust it, create a site or cluster specific copy of `common/security/cas.tpl` in a directory `common/security`. If you need to update this template, refer to the standard [wiki:Development/AutoTemplates#TrustedCAsTemplate procedure] to do it.

== LCG CE Configuration ==

QWG templates handle configuration of the LCG CE and the selected batch system (LRMS). To select the LRMS you want to use, you have to define variable `CE_BATCH_NAME`. '''There is no default'''. If you want to use Torque/MAUI, the recommended version is `torque2`. The value of `CE_BATCH_NAME` must match a directory in the `common` directory of gLite3 templates.

''Note: as of gLite 3.0.2, the supported LRMS are Torque v1 (`torque1`) and Torque v2 (`torque2`), with the MAUI scheduler.''

Previous versions of QWG templates required the definition of `CE_BATCH_SYS`. This is deprecated: this variable is now computed from `CE_BATCH_NAME`.

=== PBS/Torque ===

PBS/Torque related templates support the following variables:
 * `CE_QUEUES` : a nlist with one entry per queue (key is the queue name). For each queue, the value itself is a nlist. One mandatory key is `attr`, which defines the queue parameters (`qmgr set queue` options). Another, optional, key is `vos`, used to explicitly define the VOs which have access to the queue (by default, only the VO with the same name as the queue has access).
   Look at the [source:templates/trunk/grid/lcg-2.7.0/site/pro_lcg2_config_site.tpl pro_lcg2_config_site.tpl] example to see how to define one queue for each supported VO.
 * `WN_NFS_AREAS` : a nlist with one entry per file system that must be NFS mounted on worker nodes (key is the escaped file system mount point). The value for each entry is the name of the NFS server and optionally the path on the NFS server, if different from the path on the worker node.
 * `WN_ATTRS` : a nlist with one entry per worker node (key is the escaped node fullname). Each value is a set of PBS/Torque attributes to set on the node. Valid values are any `key=value` supported by the `qmgr set server` command. One useful value is `status=offline` to cause a specific node to drain, or `status=online` to re-enable the node. Simply removing `status=offline` is not enough to re-enable the node. One specific entry in `WN_ATTRS` is `DEFAULT`: this entry is applied to any node that doesn't have a specific entry.
 * `WN_CPUS_DEF` : default number of CPUs per worker node.
 * `WN_CPUS` : a nlist with one entry per worker node (key is the node fullname) having a number of CPUs different from the default.

=== MAUI ===

MAUI related templates support the following variables:
 * `MAUI_CFG` : this variable must contain the full content of the `maui.cfg` file. Look at the [source:templates/trunk/grid/lcg-2.7.0/site/pro_lcg2_config_site_maui.tpl pro_lcg2_config_site_maui.tpl] example to see how to define this variable from other configuration elements.
 * `MAUI_WN_PART_DEF` : default node partition to use with worker nodes.
 * `MAUI_WN_PART` : a nlist with one entry per worker node (key is the node fullname). The value is the name of the MAUI partition in which to place the worker node.

=== CE Status ===

CE related templates use variable `CE_STATUS` to control the CE state. Supported values are:
 * `Production` : this is the normal state. The CE receives and processes jobs.
 * `Draining` : the CE doesn't accept new jobs but continues to execute queued jobs (as long as there are WNs available to execute them).
 * `Closed` : the CE doesn't accept new jobs and jobs already queued are not executed. Only running jobs can complete.
 * `Queuing` : the CE accepts new jobs but will not execute them.

`CE_STATUS` indicates the desired status of the CE. All the necessary actions are taken to put the CE in the requested status. The default status (if the variable is not specified) is `Production`.

This variable can be used in conjunction with [wiki:Doc/LCG2/TemplateLayout#PBSTorque WN_ATTRS] to drain queues and/or nodes.

=== Run-Time Environment ===

gLite 3.0 templates introduce a new way to define `GlueHostApplicationSoftwareRunTimeEnvironment`. Previously it was necessary to define a list of all tags in the site configuration template. As most of these tags are standard tags attached to a release of the middleware, there is now a default list of tags defined in the default site configuration template, [source:templates/trunk/grid/glite-3.0.0/defaults/site.tpl defaults/site.tpl]. To supplement this list with tags specific to the site (e.g. `LCG_SC3`), define a variable `CE_RUNTIMEENV_SITE` instead of defining `CE_RUNTIMEENV`:

{{{
variable CE_RUNTIMEENV_SITE = list("LCG_SC3");
}}}

This change is backward compatible: if `CE_RUNTIMEENV` is defined in the site configuration template, its value will be used.

== DPM Configuration ==

DPM related standard templates require a site template to describe the site configuration of the service. The variable `DPM_CONFIG_SITE` must contain the name of this template. This template defines the whole DPM configuration, including all disk servers used, and is used to configure all the machines that are part of the DPM configuration.

There is no default template provided for DPM configuration.
To build your own template, you can look at template [source:templates/trunk/sites/example/site/pro_se_dpm_config.tpl pro_se_dpm_config.tpl] in the examples provided with QWG templates.

On the DPM head node, variable `SEDPM_SRM_SERVER` must be set to `true`. This variable is `false` by default (DPM disk servers).

If you want to use the Oracle version of the DPM server, define the following variable in your machine profile:

{{{
variable DPM_SERVER_MYSQL = false;
}}}

As of DPM 1.5.10, the script used to publish dynamic information for DPM into the BDII (space used/free per VO) has not been updated to interact properly with VOMS mapping. As a result, VO specific pools are not counted in the published values. QWG templates provide a fixed version of the script that can be installed by adding the following line to the DPM head node profile:

{{{
include glite/se_dpm/server/info_dynamic_voms;
}}}

To work properly, this script requires `/opt/lcg/etc/DPMCONFIG` (or whatever file you defined for DPNS database connection information) to be world-readable. This can be achieved by adding the following line to your DPM configuration in your site-specific template:

{{{
"/software/components/dpmlfc/options/dpm/db/configmode" = "644";
}}}

== LFC Configuration ==

LFC related standard templates require a site template to describe the site configuration of the service. The variable `LFC_CONFIG_SITE` must contain the name of this template.

Normally the only things really required in this site-specific template are the passwords for the LFC user (by default `lfc`) and the MySQL administrator (by default `root`). There are no default values provided for these passwords. Look at the standard LFC [source:templates/trunk/glite-3.0.0/glite/lfc/config] configuration template for the syntax.
If you want to use the Oracle version of the LFC server, define the following variable in your machine profile:

{{{
variable LFC_SERVER_MYSQL = false;
}}}

LFC templates allow an LFC server to act as a central LFC server (registered in the BDII) for some VOs and as a local LFC server for the others. There are 2 variables controlling what is registered in the BDII:
 * `LFC_CENTRAL_VOS` : list of VOs for which the LFC server must be registered in the BDII as a central server. Default is an empty list.
 * `LFC_LOCAL_VOS` : list of VOs for which the server must be registered in the BDII as a local server. Defaults to all supported VOs (`VOS` variable). If a VO is in both lists, it is removed from `LFC_LOCAL_VOS`. If you don't want this server to be registered as a local server for any VO, even if configured on this node (present in `VOS` list), you must define this variable as an empty list:

{{{
variable LFC_LOCAL_VOS = list();
}}}

VOs listed in either of these lists must be present in the `VOS` variable. These 2 variables have no impact on GSI (security) configuration and don't control access to the server.
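
As a sketch of how the two lists combine, assume a node configured for three VOs where the LFC server acts as the central catalog for one of them. The VO names are chosen only for illustration, and the outcome described in the comments simply follows the rules stated above.

{{{
# Illustrative values only
variable VOS = list('alice', 'atlas', 'biomed');

# Register this server in the BDII as the central LFC for biomed
variable LFC_CENTRAL_VOS = list('biomed');

# LFC_LOCAL_VOS is left undefined here: it defaults to the content of VOS,
# and biomed is removed from it because it is also in LFC_CENTRAL_VOS.
# The server is therefore registered as a local LFC for alice and atlas.
}}}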