
Release Notes for gLite3.0 Templates

Table of Contents

  1. QWG Releases
  2. gLite Updates
  3. Main Changes and Known Problems
    1. gLite-3.0.2-14 : default name for VO SW area changed to VO name
    2. gLite-3.0.2-14 : syntax of templates compliant with PAN v8
    3. gLite-3.0.2-14 : AII upgraded to v2
    4. gLite-3.0.2-14 : ncm-query default output format change
    5. gLite-3.0.2-14 : support for WMS added
    6. gLite-3.0.2-14 : FTS Client Configuration Improvement
    7. gLite-3.0.2-14 : DPM and LFC 1.6.7 available as unofficial update
    8. gLite-3.0.2-14 : Improvements to configuration of RPM repositories
    9. gLite-3.0.2-14 : Torque MOM no longer started on non WN machines
    10. gLite-3.0.2-14 : MAUI monitoring script improved
    11. gLite-3.0.2-14 : BDII search filter adjusted to work with Glue 1.3 schema
    12. gLite-3.0.2-14 : LFC, DPM and CE BDII publication changes
    13. gLite-3.0.2-14 : support for BDII caching enabled
    14. gLite-3.0.2-13 : support for BDII caching
    15. gLite-3.0.2-13 : replacement of Globus MDS by a resource BDII in gLite …
    16. gLite-3.0.2-13 : configuration of NFS exports changes
    17. gLite-3.0.2-12 : gLite WMS clients configured
    18. gLite-3.0.2-12 : new GIP plugin for LCG CE
    19. gLite-3.0.2-12 : DPM/LFC namespace configured for supported VOs
    20. gLite-3.0.2-12 : DPM_USER variable renamed DPM_DAEMON_USER
    21. gLite-3.0.2-12 : DPM daemons must be restarted after upgrade
    22. gLite-3.0.2-12 : standard GIP plugin must be used for DPM
    23. gLite-3.0.2-12 : standard/pan/structures.tpl removed
    24. gLite-3.0.2-12 : PAN function default() removed
    25. gLite-3.0.2-12 : AII major reorganization
    26. gLite-3.0.2-12 : common/glite moved to components/glite
    27. gLite-3.0.2-12 : components migrated to namespace
    28. gLite-3.0.2-12 : glite-3.0.0/defaults/ and standard/os/ reorganized
    29. gLite-3.0.2-12 : repository templates migrated to namespace repository/
    30. gLite-3.0.2-12 : DPM/LFC 1.6.4 issues with non VOMS proxies
    31. gLite-3.0.2-12 : DPM/LFC 1.6.4 configuration changes
    32. gLite-3.0.2-12 : default update set to 25
    33. gLite-3.0.2-12 : new BDII configuration options
    34. gLite-3.0.2-12 : VOMS groups/roles mapping changes
    35. gLite-3.0.2-11 : upgrade to panc v7 recommended
    36. gLite-3.0.2-11 : panc v6 restrictions
    37. gLite-3.0.2-11 : gLite update 24/25 doesn't work
    38. gLite-3.0.2-11 : new names of gLite repository templates
    39. gLite-3.0.2-11 : SE_HOSTS format change
    40. gLite-3.0.2-11 : migration to namespaced components and standard templates
    41. gLite-3.0.2-10 : SE_HOST_DEFAULT_SC3 required with panc v6
    42. gLite-3.0.2-10 : DPM and LFC 1.6.3 upgrade
    43. gLite-3.0.2-10 : Torque/MAUI restart required on CE and WNs
    44. gLite-3.0.2-10 : SE removed from NFS exports
    45. gLite-3.0.2-10 : SE_HOST_DEFAULT deprecated
    46. gLite-3.0.2-10 : SEDPM_DISK_HOSTS no longer used
    47. gLite-3.0.2-10 : DPM : support added for SRM v2.2
    48. gLite-3.0.2-10 : pro_software_component_dpmlfc included in standard …
    49. gLite-3.0.2-9 requires panc >= 6.0.3
    50. lfc/config.tpl compilation error
    51. Upgrading a LCG RB to update 13 and later
    52. Update of voms.cern.ch certificate
    53. AII : aii-shellfe error about bootloader
    54. AII : ncm-template required
    55. Change in how to run MPI jobs
    56. Shared working areas for MPI jobs (Torque v2)
    57. quattor/config not found
    58. LCMAPS error after upgrading from LCG 2.7.0
    59. DPM upgrade from LCG 2.6/2.7
    60. Condor RPM name not matching internal name
    61. LCG RB upgrade
    62. fetch-crl
    63. GlueHostApplicationSoftwareRunTimeEnvironment
  4. Change Log

This page contains information about each release of QWG Templates for gLite 3.0, in particular new or changed features and known problems. To learn how to configure the templates, refer to the page on gLite templates customization.

Note : Information on this page, particularly in the Known Problems sections, may refer to a release that has not yet been announced. Such information relates to an upcoming release and documents things already in the gLite-3.0.0 branch.

QWG Releases

Note : you can follow ongoing developments and the progress of the upcoming release through the Roadmap button, the last entries in the gLite-3.0.0 branch ChangeLog or the full log of the trunk branch.

Date Release Description
24/7/2006 Creation of branch gLite-3.0.0
26/7/2006 First release of QWG templates for gLite 3.0.0
26/7/2006 Second release of QWG templates for gLite 3.0.0
29/7/2006 Third release of QWG templates for gLite 3.0.0
17/8/2006 Fourth release of QWG templates for gLite 3.0
18/8/2006 Fifth release of QWG templates for gLite 3.0.0
13/9/2006 First release of QWG templates for gLite 3.0.2
15/9/2006 Second release of QWG templates for gLite 3.0.2 (CA 1.9)
20/10/2006 Third release of QWG templates for gLite 3.0.2 (CA 1.10, gLite update 7 including critical security fixes)
7/12/2006 Fourth release of QWG templates for gLite 3.0.2 (gLite update 9, LRMS configuration)
19/12/2006 Fifth release of QWG templates for gLite 3.0.2 (gLite update 10, new VO configuration)
21/12/2006 Sixth release of QWG templates for gLite 3.0.2 (gLite update 11)
12/01/2007 Seventh release of QWG templates for gLite 3.0.2 (CA RPMs 1.11)
03/02/2007 Eighth release of QWG templates for gLite 3.0.2 (gLite 3.0 update 12)
16/02/2007 Ninth release of QWG templates for gLite 3.0.2 (gLite 3.0 update 13, CA RPMs 1.12)
25/03/2007 Tenth release of QWG templates for gLite 3.0.2 (gLite 3.0 update 14-18, CA RPMs 1.13)
21/05/2007 Eleventh release of QWG templates for gLite 3.0.2 (gLite 3.0 update 19-24, dCache support)
31/7/2007 Twelfth release of QWG templates for gLite 3.0.2 (gLite 3.0 update 25-29, CA 1.15, gLite WMS)
7/12/2007 Thirteenth release of QWG templates for gLite 3.0.2 (gLite 3.0 update 30-37, CA 1.18, resource BDII)
26/10/2008 Last release of QWG templates for gLite 3.0.2 (gLite 3.0 update 38-44, CA 1.19-25, AII v2, WMS)

Note: no further releases of QWG templates for gLite 3.0 are planned, as all the node types are now supported under gLite 3.1.

gLite Updates

QWG templates releases deliver the latest gLite updates available at the time of the release. There is no correspondence between the QWG release number (-n) and gLite update numbers : one QWG templates release sometimes delivers several gLite updates. Each QWG release has a default associated gLite update (generally the latest one).

Starting with QWG release 3.0.2-10, QWG releases provide a standard mechanism for selecting the gLite update you want to deploy on a per node, per cluster or per site basis.

For example, QWG release 3.0.2-10 delivers update 18 as the default update. If you want to stay with update 15 on your DPM server, you may define the following variable in the DPM server profile :

variable GLITE_UPDATE_VERSION = '15';

Content of gLite updates and associated release notes can be viewed at http://glite.web.cern.ch/glite/packages/R3.0/updates.asp.

Main Changes and Known Problems

gLite-3.0.2-14 : default name for VO SW area changed to VO name

In previous versions of the QWG templates, SW areas created using the DEFAULT entry in VO_SW_AREAS had a name based on the SW manager userid for the VO. The default name is now the VO name (the VO full name, even if a VO alias is used). The old behaviour can be restored by defining variable VO_SW_AREAS_USE_SWMGR to true.

gLite-3.0.2-14 : syntax of templates compliant with PAN v8

PAN compiler v8 introduces some syntax changes. A few constructs that were valid in previous versions are now marked as deprecated and will no longer be supported in v9. All templates have been made compliant with the new syntax, removing all deprecated constructs. The main changes are:

  • include now requires a DML: include mytempl; must be replaced by include {'mytempl'};. This is to prepare for changes planned for v9.
  • All automatic variables like self, object, argv, argc must now be written in upper case (SELF, OBJECT, ARGV, ARGC).
  • type is no longer a valid keyword to bind a path to a type: bind must be used instead. type is now restricted to type definitions.
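
The include change lends itself to a mechanical rewrite of site templates. As a minimal sketch (not part of the QWG tools), the hypothetical helper pan_v8_include below converts bare include statements to the DML form with sed. It assumes one include statement per line; lines that already use braces do not match the pattern and are passed through unchanged.

```shell
#!/bin/sh
# Hypothetical helper (not part of QWG): rewrite 'include mytempl;' as
# "include {'mytempl'};" for every line read on standard input.
# Assumption: one include statement per line, template names made of
# letters, digits, '_', '/', '.' and '-'.
pan_v8_include() {
  sed -E "s%^([[:space:]]*)include[[:space:]]+([A-Za-z0-9_/.-]+)[[:space:]]*;%\1include {'\2'};%"
}

# Preview the result for one template without modifying it
# (the path is an example only):
#   pan_v8_include < cfg/sites/example/mytemplate.tpl | diff cfg/sites/example/mytemplate.tpl -
```

Reviewing the diff before overwriting any file is advisable, since the pattern knows nothing about comments or multi-line statements.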

gLite-3.0.2-14 : AII upgraded to v2

QWG release gLite-3.0.2-14 delivers AII v2. This new version of AII, used for initial installation, brings much more flexibility in the definition of file systems and block devices and in the ability to specify site-specific actions during installation. In particular, the former Kickstart template has been replaced by hooks that are much easier to specify. You can find some documentation about the new AII configuration and migration issues in the page about initial installation.

The price for this new flexibility is that non backward-compatible changes are required in the site configuration. This part of the configuration is normally not used after initial installation, so there is no real risk of breaking a working configuration. The main problem will be difficulties in installing new machines or reinstalling old ones.

It is possible to install QWG release gLite-3.0.2-14 without updating AII to v2. To do it, proceed with the normal update procedure, following upgrade instructions, and revert the changes to cfg/standard/quattor directories before committing the changes.

gLite-3.0.2-14 : ncm-query default output format change

The latest version of ncm-query, provided by QWG release gLite-3.0.2-14, introduces a slight change in the default display of nlist (hash) keys. Previously the key was printed literally; it is now unescaped by default to enhance readability.

Should this new default not be appropriate for your needs, use new option --no-unescape.

gLite-3.0.2-14 : support for WMS added

gLite 3.0 version of WMS 3.1 is now supported by QWG templates, both with WMS and LB on a separate machine (recommended) or on the same one.

Look at gLite customization page for configuration information.

In the current release, as provided by r3153, there is a known problem with the startup of LB services when LB runs on a dedicated machine (#141). After booting the machine, services must be started manually with the command :

/etc/init.d/gLite start

gLite-3.0.2-14 : FTS Client Configuration Improvement

FTS client configuration has been reworked to be simpler and more flexible. The main variable to configure the FTS client is now FTS_SERVER_HOST. Look at gLite customization for more details.

gLite-3.0.2-14 : DPM and LFC 1.6.7 available as unofficial update

DPM 1.6.7 has been released as part of gLite 3.1 update 9 but not yet as part of gLite 3.0. Release gLite-3.0.2-14 of QWG Templates brings this version as an unofficial update, not deployed by default. To use it, you need to include template update/unofficial/dpmlfc-1.6.7 in your configuration. Recommended place to do it is either the DPM/LFC node profiles or update/unofficial/rpms if you want to use it on all your gLite 3.0 DPM/LFC nodes.

Note : don't forget to remove this include as soon as DPM 1.6.7 for gLite 3.0 has been officially released.

The gLite 3.0 standard version (DPM 1.6.5) can run without problems in a DPM configuration with gLite 3.1 nodes running DPM 1.6.7, as there is no schema change between 1.6.5 and 1.6.7.

gLite-3.0.2-14 : Improvements to configuration of RPM repositories

Release gLite-3.0.2-14 of QWG Templates brings more flexibility in configuration of RPM repositories. In previous versions, a site had to provide 3 RPM repositories per gLite version (release, externals, updates) with fixed names or had to edit the standard template repository/config/glite.tpl.

There are now limited possibilities to tune the standard configuration without editing the standard templates :

  • Define the repository prefix used for all the repositories of a particular gLite version using variable REPOSITORY_GLITE_PREFIX. Default is glite_3_0_0.
  • Define a site specific repository associated with each gLite version by providing a repository template whose name is repository_prefix_unofficial.tpl.

See gLite template customization for more details.

gLite-3.0.2-14 : Torque MOM no longer started on non WN machines

Previous versions of the templates started Torque MOM on any machine with the Torque client configured, even if the machine was not listed as a WN, e.g. a VO box or a WN installed but not yet integrated in the CE. This resulted in a lot of messages in /var/log/messages on the CE. Now the client is configured but Torque MOM is not started if the machine is not listed in the WORKER_NODES variable.

gLite-3.0.2-14 : MAUI monitoring script improved

When MAUI is found running but not responding, the previous version of the monitoring script restarted MAUI. As experience has shown that the problem is often due to Torque, the new version first tries to restart Torque and restarts MAUI only if the problem persists.
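
The recovery order can be illustrated with a short shell sketch. This is not the monitoring script shipped with the templates: the probe and restart commands below (showq, service pbs_server restart, service maui restart) are assumptions chosen for illustration and can be overridden through the three variables. The sketch also simplifies the real logic by only testing whether MAUI answers.

```shell
#!/bin/sh
# Sketch of the recovery order only. All command names are overridable
# assumptions, not necessarily those used by the real monitoring script.
MAUI_CHECK=${MAUI_CHECK:-"showq"}                          # probe: fails if MAUI does not answer
TORQUE_RESTART=${TORQUE_RESTART:-"service pbs_server restart"}
MAUI_RESTART=${MAUI_RESTART:-"service maui restart"}

recover_maui() {
  # Nothing to do when MAUI answers.
  if $MAUI_CHECK >/dev/null 2>&1; then
    echo "maui ok"
    return 0
  fi
  # Suspect Torque first: restart it and probe MAUI again.
  $TORQUE_RESTART >/dev/null 2>&1
  if $MAUI_CHECK >/dev/null 2>&1; then
    echo "recovered after torque restart"
    return 0
  fi
  # The problem persists: restart MAUI itself.
  $MAUI_RESTART >/dev/null 2>&1
  echo "maui restarted"
}
```

Calling recover_maui from cron would reproduce the behaviour described above; the function prints which action, if any, was needed.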

gLite-3.0.2-14 : BDII search filter adjusted to work with Glue 1.3 schema

BDII used to define a search filter when querying other BDIIs (resource BDIIs for site BDII, site BDIIs for top-level BDII), requiring GlueSchemaVersion information to be present in each object. This information is deprecated in Glue 1.3 and is no longer present in the Glue 1.3 templates provided with gLite 3.1. As a consequence, gLite 3.1 resources (like LCG CE) are not published into BDII.

The BDII configuration provided with gLite-3.0.2-14 no longer defines a BDII search filter (unlike YAIM). This works well with Glue 1.3 compliant resources. The only side effect could concern sites using BDII to publish non Glue-compliant resources.

For more information, see GGUS ticket #30127.

gLite-3.0.2-14 : LFC, DPM and CE BDII publication changes

A GIP provider is now used to dynamically publish LFC information into BDII, in replacement for old purely static LDIF file.

DPM publication is now compliant with the Glue 1.3 schema and recommendations from GSSD for publishing space tokens (SRM 2.2). This configuration passes all LCG GSSD BDII tests (that should be incorporated into SAM in the future). In addition, the default access point for DPM systems is now based on the DNS domain name of the DPM server instead of variable SITE_DOMAIN. If it doesn't match your needs, you must explicitly define the access point with the accessPoint attribute in the SE_HOSTS variable.

On the CE, information published as part of GlueSubCluster now includes the total number of CPUs computed from the WN_HOSTS and WN_CPUS variables.

Information about the site (GlueSite object) is now published by site BDIIs (configured as combined BDIIs), as is done in gLite 3.1, instead of by the CE as was done until now. This avoids publishing the site information several times if you run several CEs or if your site includes a hierarchy of subsite BDIIs.

gLite-3.0.2-14 : support for BDII caching enabled

In gLite-3.0.2-14, BDII caching in site BDII is now enabled by default. In this configuration, resource BDIIs are no longer queried by the BDII itself but by a GIP provider. This allows the information to remain available even if the BDII is not answering.

Note : this may complicate BDII troubleshooting as a malfunctioning BDII will not have an immediate impact on published information. Use BDII log files (/opt/bdii/var/bdii.log and /opt/bdii/var/tmp/stderr.log) to check BDII is working properly or to get information to troubleshoot problems.

To revert to old behaviour (BDII without caching), define the following variable in your profile or site parameters :

variable BDII_USE_GIP_CACHE = false;

gLite-3.0.2-13 : support for BDII caching

gLite update 35 introduced a new configuration for site BDIIs that allows information about resources to be cached, so that resources do not disappear from the BDII when their resource BDII doesn't answer. Because this feature has not been thoroughly tested, it is disabled by default. To activate it, you need to define the following variable in your site parameters :

variable BDII_USE_GIP_CACHE = true;

This configuration should become the default in next release of QWG templates.

gLite-3.0.2-13 : replacement of Globus MDS by a resource BDII in gLite update 35

With gLite update 36, all services that previously used a Globus MDS to publish their resources are now using a resource BDII. This change, in QWG templates gLite-3.0.2-13, is almost transparent with the following exceptions :

  • Corresponding BDII URLs must be adjusted by editing the BDII_URLS variable in your site parameters. For each URL, the port must be changed from GRIS_PORT (2135) to BDII_PORT (2170) and mds-vo-name must be changed from local to resource.
  • User edguser must be added to ADMIN3 list in MAUI configuration if you are using the new MAUI-based GIP plugin for CE (GIP_CE_USE_MAUI=true)
  • Globus MDS must be stopped manually after the update. This is not critical, keeping it running is harmless. To do it use the following commands :
    chkconfig --level 345 globus-mds off
    service ldap stop
    

In addition, all machine types publishing resources into BDII (almost all except UI, WN and disk servers) can now be configured as site BDII. Refer to documentation about gLite customization for more information.

gLite-3.0.2-13 : configuration of NFS exports changes

Configuration of NFS exports has been enhanced to provide more flexibility in specifying the list of hosts with access to an NFS server and their access rights. Look at gLite customization for more details.

This includes new variables and variables whose name has changed. Old variables (SITE_xxx_ACL) are still used if the new ones are not defined. As part of the changes, default SE is no longer in the export list : if this is required, you must add it to variable NFS_LOCAL_CLIENTS.

gLite-3.0.2-12 : gLite WMS clients configured

QWG release 3.0.2-12 is the first release to properly configure gLite WMS clients, both the legacy LB/NS client (glite-job-xxx commands) and the WMProxy interface (glite-wms-job-xxx commands).

Currently, no VO has a default WMS host defined in its parameters. To be able to use gLite WMS clients for a VO, you need to define a WMS host in your VO site parameters, either for a specific VO or in the default parameters applied to all VOs.

gLite-3.0.2-12 : new GIP plugin for LCG CE

A new GIP plugin for LCG CE has been developed at LAL. This new plugin uses the MAUI command diagnose to collect data about CE usage instead of Torque qstat.

The reason for this new plugin is to be able to correctly publish CE information when using advanced MAUI features like Standing Reservations (which allow some job slots to be reserved for certain types of jobs). With lcg-info-dynamic-pbs, if you declare standing reservations, they appear as free job slots even if they are not accessible by the queue. The new plugin handles this properly. It could be extended in the future to support other advanced MAUI features. This plugin should work properly even with a basic MAUI configuration without any advanced features used.

This new plugin is disabled by default. To use it as a replacement for lcg-info-dynamic-pbs, you need to define the GIP_CE_USE_MAUI variable to true in your cluster or site templates (for example site/example/pro_lcg2_config_site.tpl) :

variable GIP_CE_USE_MAUI ?= true;

gLite-3.0.2-12 : DPM/LFC namespace configured for supported VOs

In QWG templates 3.0.2-12, DPM and LFC standard configuration templates add all supported VOs on the DPM/LFC server (as specified by the VOS variable) to the node configuration. The namespace is then initialized (mkdir, chown, chmod, setacl) for all these VOs. No attempt is made to update an already configured namespace.

gLite-3.0.2-12 : DPM_USER variable renamed DPM_DAEMON_USER

Variable DPM_USER, used to configure the userid under which DPM daemons run, has been renamed DPM_DAEMON_USER. If you explicitly defined DPM_USER in your site configuration, you are advised to update the variable name. If DPM_USER is defined, it is used as the default value for DPM_DAEMON_USER.

gLite-3.0.2-12 : DPM daemons must be restarted after upgrade

Installation of a new DPM version doesn't trigger a restart of DPM daemons (in case a schema upgrade is required). To use the new version, don't forget to restart them. You can do it either by running the service xxx restart command for each daemon or by doing the following on each DPM node :

rm /etc/shift.conf
ncm-ncd --configure dpmlfc

gLite-3.0.2-12 : standard GIP plugin must be used for DPM

As of gLite 3.0.2 update 27, there is a new GIP plugin for DPM which solves problems of the (very old) previous plugin. If you are using a site specific version of the plugin (one was provided by QWG Templates to fix issues of the previous standard plugin), you need to revert to the standard plugin as dpm-qryconf output format has changed and breaks previous versions.

gLite-3.0.2-12 : standard/pan/structures.tpl removed

Template standard/pan/structures.tpl has been removed. Since 3.0.2-10, it was only a wrapper for standard/quattor/schema.tpl. All the standard templates using it have been updated. If any of your cluster or site templates still use it, you need to update them. You can use the same script as suggested for other changes, just replacing $lines[$i] =~.... by :

    $lines[$i] =~ s%pan/structures%quattor/schema%;

The following command builds the list of templates that need to be updated :

 files=`find cfg -name \*.tpl -exec grep -l pan/structures {} \;`

gLite-3.0.2-12 : PAN function default() removed

PAN function default() has been removed from standard templates (it was a user-defined function, not a built-in one). This function has been deprecated for a very long time and must be replaced by the '?=' operator. All standard templates have been free of the default() function since release gLite-3.0.2-10.

If you have any site/cluster specific templates still making use of the default() function, you need to edit them and replace each default() function call by the ?= operator. The following script does this on a large number of templates :

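#!/usr/bin/perl
# This script replaces every occurrence of the PAN default() function by the '?=' operator.
# It must be passed the list of templates to update as parameters.

use strict;
use warnings;

foreach my $file (@ARGV) {
  print "Updating $file...\n";
  # Read the whole template; stop rather than truncate it if it cannot be read.
  open (FILE, '<', $file) or die "Cannot read $file: $!";
  my @lines = <FILE>;
  close FILE;

  # Rewrite 'X = default(value);' as 'X ?= value;' on every line.
  my $i = 0;
  while ( $i < @lines ) {
    $lines[$i] =~ s%=\s*default\s*[(](.+)[)]\s*;%?= $1;%;
    $i++;
  }

  # Write the updated contents back to the same file.
  my $contents = join "", @lines;
  open (FILE, '>', $file) or die "Cannot write $file: $!";
  print FILE $contents;
  close FILE;
}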

To select templates needing to be updated, you can use the following command :

find cfg/clusters cfg/sites -name \*.tpl -exec grep -l 'default(' {} \;
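
Before running the script on many templates, it may help to preview what would change. The sketch below is a sed equivalent of the Perl substitution, used only to display the differences; preview_default_fix is a hypothetical name, not a tool provided with QWG templates.

```shell
#!/bin/sh
# Sketch: sed equivalent of the substitution done by the Perl script,
# reading a template on standard input and printing the converted text.
# 'preview_default_fix' is a hypothetical helper name.
preview_default_fix() {
  sed -E 's%=[[:space:]]*default[[:space:]]*[(](.+)[)][[:space:]]*;%?= \1;%'
}

# Show what would change in each candidate template, without writing anything:
#   for f in $(find cfg/clusters cfg/sites -name \*.tpl -exec grep -l 'default(' {} \;); do
#     preview_default_fix < "$f" | diff "$f" -
#   done
```

Like the Perl script, this only handles a default() call that fits on a single line.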

gLite-3.0.2-12 : AII major reorganization

QWG release gLite-3.0.2-12 contains the latest release of the standard templates to configure AII. This is a major change compared to previous releases, and it is not entirely backward compatible.

Here is the list of major changes affecting cluster/site templates :

  • /software/components/aii moved to /system/aii : as a consequence all cluster/site templates with references to /software/components/aii should be edited. This should normally be done quite easily with the following commands :
    #!/bin/sh
    files=`find cfg/clusters cfg/sites -name \*.tpl -exec grep -l /software/components/aii {} \;`
    sed -i -e 's%/software/components/aii%/system/aii%' $files
    
  • To be able to compile your templates, you need to define in your cluster or site templates variable AII_OSINSTALL_ROOT as the start of the URL where OS RPMs can be downloaded from. This must not include the OS version number. The recommendation is to define this variable in your site template site/pro_site_global_variables.tpl. Look at example for more information.
  • If you have any reference to /software/components/aii/active or /software/components/aii/restart in your templates, you need to remove them to compile successfully.
  • In the last version of AII templates, all the AII customization can be done through variables. Look at quattor/aii/config.tpl to get the whole list of available variables. It is recommended to update existing configurations to use these variables instead of direct assignment to AII configuration path.

gLite-3.0.2-12 : common/glite moved to components/glite

For consistency, sub-directories of common/glite directories, containing ncm-glite sub-components templates, have been moved under the ncm-glite configuration directory, components/glite.

This should have no impact on site specific templates as these templates are rarely referenced directly by site templates. If you are affected by this change, you'll need to edit your site templates to reflect the new path.

gLite-3.0.2-12 : components migrated to namespace

This release updates most of the components to the latest version, in order to use the namespaced version of the components. This changes the template name used to include the component in the configuration. Most of the changes are internal to QWG templates but some components are also included in site templates. The change is pretty straightforward : pro_software_component_xxx must be replaced by components/xxx/config.

The following script may help to do it in several templates. It receives as argument the templates to update.

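#!/usr/bin/perl
# This script replaces every occurrence of 'pro_software_component_xxx' by the namespaced path. It receives
# the list of templates to update as arguments (space separated).
 
use strict;
use warnings;

foreach my $file (@ARGV) {
  # Read the whole template; stop rather than truncate it if it cannot be read.
  open (FILE, '<', $file) or die "Cannot read $file: $!";
  my @lines = <FILE>;
  close FILE;

  # Rewrite 'pro_software_component_xxx;' as 'components/xxx/config;' on every line.
  my $i = 0;
  while ( $i < @lines ) {
    $lines[$i] =~ s%pro_software_component_(.*);%components/$1/config;%;
    $i++;
  }
  
  # Write the updated contents back to the same file.
  my $contents = join "", @lines;
  open (FILE, '>', $file) or die "Cannot write $file: $!";
  print FILE $contents;
  close FILE;
}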

To select your site templates to pass as arguments to the script, use a command like :

find cfg/clusters cfg/sites -name \*.tpl -exec grep -l 'pro_software_component' {} \;

To check that the differences produced by this conversion are only related to RPM updates, do the following :

  1. Save your current build directory as build.saved (in the same directory) :
    rm -Rf build.saved
    cp -R build build.saved
    
  2. Compile your new configuration :
    external/ant/bin/ant
    
  3. Check the differences with the following (horrible) command that will list all differences that are configuration differences (should be none) :
    src/utils/profiles/compare_xml -v | egrep '^-|\+' | grep -v version | \
                          egrep -v '^(-|\+)</*[0-9a-f_]*>$' | grep -v '^@@.*@@$'
    

gLite-3.0.2-12 : glite-3.0.0/defaults/ and standard/os/ reorganized

To avoid template name clashes, defaults/ namespace used in glite-3.0.0 to provide default configuration of gLite has been reorganized. Changes are :

  • Templates have been moved to defaults/glite
  • defaults/glite.tpl has been renamed defaults/glite/config.tpl
  • A new version.tpl has been added to define the gLite major version and default update (formerly in defaults/glite/config.tpl). This template is included very early in the configuration so that this information can be used by any other template.

Templates in standard/os have been renamed with a shorter name (pro_os_ prefix removed) and converted to namespace.

All these changes should be transparent as these templates are normally not accessed directly from site/cluster templates.

gLite-3.0.2-12 : repository templates migrated to namespace repository/

In QWG release gLite-3.0.2-12, templates related to RPM repositories configuration have been migrated to namespace repository/, to match the agreed convention. In addition, the templates have been renamed without the leading repository_lal or repository_common prefix.

Note : templates for very old OS versions (SL 3.05 and SL 4.2) have not been migrated to the new structure.

repository/ namespace now has the following structure :

  • config.tpl is the new name for the per-cluster repository_common.tpl that defines the RPM repositories used in the cluster. Look at example.
  • config/ sub-namespace contains the repository used by a specific part of QWG templates. Normally this namespace is used mainly in standard, grid, or os (SL 4.4 i386 example).
  • Other .tpl files in repository/ define the repository contents. These templates are generally located in the site specific templates. Look at examples.

This change is not completely backward compatible, as it requires updating repository/config.tpl (former repository/repository_common.tpl) to use the new names.

You are advised to update your configuration to adhere to this standard naming scheme, as it should allow smoother upgrades in the future. But if you want to ignore this change in a first stage, you can revert changes affecting repository/*.tpl templates : this way you should be able to install this release without changes in your site configuration.

To update your configuration to use the new naming scheme for repositories, you need to :

  • Rename your repository templates (probably in your site template hierarchy) to adhere to new repository template names with svn mv.
  • Remove everything in these templates after the initial comments and execute ant update.rep.templates.
  • Check and if necessary update cluster.build.properties for each of your clusters : be sure to have the namespace form of your site directory in the include path. Look at cluster example.

If you are using SWrep to manage repositories and repository templates, upgrade to a version supporting repository namespace (part of Quattor 1.3).

For more information about repository templates used by the default repository/config/glite.tpl, look at gLite templates customization.

Note : be sure to use SCDB 2.1.2 or later when upgrading to new naming scheme. There is a known issue with previous versions.

gLite-3.0.2-12 : DPM/LFC 1.6.4 issues with non VOMS proxies

There is a potential issue with DPM/LFC 1.6.4 if a user has a proxy without VOMS extensions. Look at http://glite.web.cern.ch/glite/packages/R3.0/updates.asp (section about DPM/LFC 1.6.4) for more information on the workaround, if you need it.

gLite-3.0.2-12 : DPM/LFC 1.6.4 configuration changes

If you install update 24 or later, you'll get DPM/LFC 1.6.4 installed. This new version requires a database schema upgrade. To complete this upgrade, you need to :

  1. Install the new version ; currently running daemons will be unaffected.
  2. On DPM head node, stop your DPM daemons.
  3. Back up your current database with mysqldump.
  4. Run the script to upgrade database schema, /opt/lcg/share/DPM/dpm-secondary-groups
  5. Restart your DPM daemons
  6. Restart daemons on DPM disk servers
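
Steps 2 to 5 on the DPM head node can be sketched as a small script. This is an illustration under stated assumptions, not a supported procedure: the daemon list, the database names (dpm_db, cns_db) and the backup directory are assumptions to adapt to your installation, and only the path of the upgrade script comes from the text above. The sketch defaults to a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of steps 2-5 on the DPM head node. Daemon names, database names
# and the backup directory are assumptions: adjust them before use.
DPM_DAEMONS=${DPM_DAEMONS:-"dpm dpnsdaemon srmv1 srmv2 rfiod"}
DPM_DATABASES=${DPM_DATABASES:-"dpm_db cns_db"}
BACKUP_DIR=${BACKUP_DIR:-/root/dpm-backup}
DRY_RUN=${DRY_RUN:-1}          # 1 = only print the commands

run() {
  # Print the command in dry-run mode, execute it otherwise.
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

dpm_schema_upgrade() {
  # Step 2: stop the DPM daemons.
  for d in $DPM_DAEMONS; do run service "$d" stop || return 1; done
  # Step 3: back up each database (credentials expected in ~/.my.cnf, an assumption).
  for db in $DPM_DATABASES; do
    run mysqldump --result-file="$BACKUP_DIR/$db.sql" "$db" || return 1
  done
  # Step 4: upgrade script named in the text; check its usage for required arguments.
  run /opt/lcg/share/DPM/dpm-secondary-groups || return 1
  # Step 5: restart the DPM daemons.
  for d in $DPM_DAEMONS; do run service "$d" start || return 1; done
}
```

Calling dpm_schema_upgrade with DRY_RUN=1 lists the commands for review; setting DRY_RUN=0 executes them and stops at the first failure, so that the schema upgrade is never attempted without a successful backup.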

This upgrade implies a DPM downtime of 10-15 minutes. It is recommended to declare a downtime of your CE during this period.

DPM 1.6.4 is using BDII instead of Globus MDS to publish information into the BDII. After the upgrade, you need to change BDII_URL in your pro_lcg2_config_site.tpl for the upgraded SE. The changes required are :

  • Port number is BDII port (2170) instead of MDS port (2135)
  • DN base is mds-vo-name=resource,o=grid instead of mds-vo-name=local,o=grid.
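
As an illustration of these two changes, the following sketch converts a Globus MDS URL into the corresponding resource BDII URL. The helper name mds_to_bdii_url and the host se.example.org are hypothetical; the helper only shows the transformation and does not edit pro_lcg2_config_site.tpl.

```shell
#!/bin/sh
# Sketch: convert a Globus MDS URL into the matching resource BDII URL.
# 'mds_to_bdii_url' is a hypothetical helper, not part of QWG templates.
mds_to_bdii_url() {
  echo "$1" | sed -e 's%:2135/%:2170/%' \
                  -e 's%mds-vo-name=local,%mds-vo-name=resource,%'
}

# Example with a hypothetical SE host:
mds_to_bdii_url 'ldap://se.example.org:2135/mds-vo-name=local,o=grid'
# prints ldap://se.example.org:2170/mds-vo-name=resource,o=grid
```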

gLite-3.0.2-12 : default update set to 25

Default gLite update has been set to 25. If upgrading from a previous version of QWG templates, be aware that this implies a DPM upgrade that requires a schema change. Be sure to stop your DPM server before updating QWG templates or to define GLITE_UPDATE_VERSION in DPM nodes profiles to the currently running version to prevent a DPM upgrade during QWG templates update.

gLite-3.0.2-12 : new BDII configuration options

BDII configuration has been enhanced to add flexibility (in particular support for Freedom of Choice) and support new resource BDII (replacement for Globus MDS). New configuration is backward compatible. To take advantage of new options, look at BDII configuration documentation.

gLite-3.0.2-12 : VOMS groups/roles mapping changes

Previously, mapping of VOMS roles was described in variable voms_roles of VO parameters. The role to map was given as a string in name key. To allow a more flexible mapping, this key can now be a list and each value can be a simple value interpreted as a role name or a group/role specification using '/GROUP=.../ROLE=...'. This fixes a problem with LHCb Software Manager. In addition, a few variable names have been changed : the old variable name is still used if the new one is not present. Look at documentation about gLite templates customization for more details about changes in variable names.

In addition, previous behaviour of mapping users to their role in grid-mapfile has been changed. By default, a user is always mapped as a normal user in grid-mapfile (grid-mapfile is used only if the user has no VOMS extensions in his proxy). Thus, to be mapped to an account corresponding to a specific role (e.g. SW manager), the user has to get a proxy using voms-proxy-init --voms. To revert to the previous behaviour, you need to define variable VO_GRIDMAPFILE_MAP_VOMS_ROLES to true in your machine profile or a site specific template.
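
To make the proxy request concrete, the sketch below builds the argument that voms-proxy-init --voms takes when a role is requested, in the form vo:/vo/Role=role. The helper name is hypothetical, and the VO and role used in the example (lhcb, lcgadmin) are only examples : use the role actually defined by your VO.

```shell
#!/bin/sh
# Build the value passed to "voms-proxy-init --voms" to request a role.
# Format : <vo>:/<vo>/Role=<role>
voms_role_arg() {
    vo="$1"
    role="$2"
    printf '%s:/%s/Role=%s\n' "$vo" "$vo" "$role"
}

# Example, not executed here :
#   voms-proxy-init --voms "$(voms_role_arg lhcb lcgadmin)"
```

A proxy obtained this way carries the VOMS role, so the user is mapped to the account associated with that role rather than through grid-mapfile.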

gLite-3.0.2-11 : upgrade to panc v7 recommended

gLite-3.0.2-11 is the last version to support panc v6. See related note about restrictions.

New versions of the QWG templates will begin to take advantage of new features introduced in panc v7. Although it is not required for QWG templates release gLite-3.0.2-11, you are advised to upgrade to panc v7 after upgrading the QWG templates, in order to prepare for future releases.

If you are using SCDB, just update to the latest version of SCDB Tools : panc v7 is the default compiler since SCDB Tools v2. Follow the instructions about upgrading SCDB. It is recommended to do the SCDB upgrade after installing QWG release 3.0.2-11, to avoid problems if profiles contain non ASCII characters (look at SCDB Release Notes).

If you are not using SCDB, follow the instructions about installing PAN Compiler.

gLite-3.0.2-11 : panc v6 restrictions

There are a few restrictions if you want to use panc v6 with QWG templates gLite-3.0.2-11 :

  • SE_HOST_DEFAULT_SC3 variable must be defined in pro_lcg2_config_site.tpl, even though it is deprecated and no longer used.
  • At least one of the VOs you are supporting must have a SW area defined.
  • SE_HOSTS must be defined even if there is no SE in your configuration. In this case, define as an empty nlist :
    variable SE_HOSTS = nlist();
    

gLite-3.0.2-11 : gLite update 24/25 doesn't work

gLite update 24 is available in QWG release gLite-3.0.2-11 but is not the default update. It was discovered after the release that its support in this release is broken (dependency issues for WN, misconfiguration for DPM). Should you need to install this update before the next release, be sure to use at least r1828 of gLite-3.0.0. Also, be aware that the new DPM version requires a database schema change and some changes to site BDII configuration (look at related entry in release notes).

Note : The high priority part of update 24, the new certificate of lcg-voms.cern.ch, is part of gLite-3.0.2-11 independently of the actual update installed. You are advised to install gLite-3.0.2-11 before expiration of lcg-voms.cern.ch certificate, May 29th.

gLite-3.0.2-11 : new names of gLite repository templates

repository/glite.tpl shipped with gLite-3.0.2-11 uses namespaces to access RPM repository templates. The repository templates have been renamed without the repository_lal_ prefix.

If you want to ignore this change, you can just revert repository/glite.tpl to the version supplied with the previous version of QWG templates. This is recommended during the upgrade to 3.0.2-11. In this case, you probably need to edit your previous template and replace the line include pro_declaration_functions_general; by :

include pan/functions;

To update your configuration to use namespaced templates for repositories, you first need to upgrade SCDB Tools to 2.1.2 or later. Then execute the following steps :

  • Rename your repository templates (probably in your site template hierarchy).
  • Remove everything after the initial comments and execute ant update.rep.templates.
  • Check and if necessary update cluster.build.properties for each of your clusters : be sure to have the namespace form of your site directory in the include path. Look at cluster example.

If you are using SWrep to manage repositories and repository templates, upgrade to a version supporting repository namespace (part of Quattor 1.3).

For more information about repository templates used by the default repository/glite.tpl, look at gLite templates customization.

Note : if, after updating your repository templates to use repository namespace, you get an error in SPMA functions, look at SCDB release notes.

gLite-3.0.2-11 : SE_HOSTS format change

SE_HOSTS variable format has changed. Previously, it used to be a list of SE host names with several "companion" variables (SE_TYPES, SE_ARCH, SE_ACCESS, STORAGE_DIRS). This was quite hard to keep in sync.

SE_HOSTS is now a nlist with one entry per SE. The key is the SE host name, the value is a nlist describing SE parameters. Look at gLite templates customization for more details.

Old format is still accepted but you are advised to update your site configuration and change SE_HOSTS to conform to new format. All the previous SE_xxx variables can be removed. As part of this change, you may have to update how BDII_URLS is built in your pro_lcg2_config_site.tpl if you use the suggested loop over SE_HOSTS. Previously the SE host name was the value (third parameter from first()/next() functions), now this is the key (second parameter). Look at example for more information.

gLite-3.0.2-11 : migration to namespaced components and standard templates

QWG templates release gLite-3.0.2-11 introduces migration of PAN/Quattor standard templates and component templates to namespaced version. Namespaces are a PAN feature allowing a better organization of templates and improving ability to easily locate where a template sits.

This migration implies changes in the names of templates, which can lead to some backward compatibility problems. In order to minimize the impact on site specific templates, templates with the previous name are still maintained as wrappers to new templates. To keep the distribution as clean as possible, these templates are not part of the release but can be downloaded from QWG repository trunk.

If, after installing this release, you cannot compile because some templates are missing, you are advised to fix your site templates to use the new names. If this is not possible immediately, you can download the compatibility templates and install them in your site or cluster directory.

For components, the rule to convert from old name to new name is the following :

  • pro_software_component_xxx becomes components/xxx/config
  • pro_declaration_component_xxx becomes components/xxx/schema
  • pro_declaration_functions_xxx becomes components/xxx/functions

Look at the compatibility templates to find the exact new name.
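
The conversion rule above can be expressed as a small helper, for instance to list the new names of the old component templates still referenced by site templates. This is a sketch of the rule as stated here, nothing more : it only applies to component templates, the function name is hypothetical, and the compatibility templates remain the reference for the exact new name.

```shell
#!/bin/sh
# Map an old style component template name to its namespaced name,
# following the three rules listed above. Other names are printed unchanged.
component_new_name() {
    case "$1" in
        pro_software_component_*)
            echo "components/${1#pro_software_component_}/config" ;;
        pro_declaration_component_*)
            echo "components/${1#pro_declaration_component_}/schema" ;;
        pro_declaration_functions_*)
            echo "components/${1#pro_declaration_functions_}/functions" ;;
        *)
            echo "$1" ;;
    esac
}

# Possible use, not executed here : list old names found under a directory
# of site templates (the "sites" path is an assumption) with their new names.
#   grep -rhoE 'pro_(software_component|declaration_component|declaration_functions)_[A-Za-z0-9_]+' sites |
#       sort -u | while read -r old; do echo "$old -> $(component_new_name "$old")"; done
```

For example, component_new_name pro_software_component_dpmlfc prints components/dpmlfc/config.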

As a consequence of the migration of these templates to namespace, you can probably clean up cluster.build.properties in your clusters. For gLite template hierarchy the only required elements in include path are :

grid/glite-3.0.0 grid/glite-3.0.0/components

Look at cluster example for more information.

gLite-3.0.2-10 : SE_HOST_DEFAULT_SC3 required with panc v6

In QWG templates release gLite-3.0.2-10, SE_HOST_DEFAULT_SC3 has been made optional. Unfortunately, this causes a problem if you are using PAN Compiler v6. There are 2 possible workarounds :

  • Define this variable to your SE. This will have no impact as it is no longer used by any part of the middleware.
  • Upgrade to PAN Compiler v7. If you are using SCDB, upgrade SCDB Tools to the latest version.

gLite-3.0.2-10 : DPM and LFC 1.6.3 upgrade

QWG templates release gLite-3.0.2-10 provides the DPM and LFC version released as part of gLite update 16 (1.6.3). This version requires a schema upgrade for DPM and LFC databases. It is necessary to shut down the services and run the YAIM script to achieve this (/opt/glite/yaim/functions/config_DPM_upgrade for DPM or /opt/glite/yaim/functions/config_lfc_upgrade for LFC) or follow the instructions at https://twiki.cern.ch/twiki/bin/view/LCG/DpmSrmv2Support. This requires careful planning : to avoid causing job failures during the upgrade, the CE must be closed and a scheduled downtime must be defined in GOC DB.

Note : be aware that doing an unplanned upgrade of DPM can result in database corruption.

To allow more flexibility it is possible to deploy the QWG release on all nodes except DPM and LFC nodes by defining GLITE_UPDATE_VERSION variable in the profile of these nodes.

gLite-3.0.2-10 : Torque/MAUI restart required on CE and WNs

QWG release gLite-3.0.2-10 delivers a new version of Torque/MAUI. This version is a fixed version of what was released in gLite update 16 and should be released in an upcoming gLite update.

After installing QWG release gLite-3.0.2-10 with gLite update 16 or later (see above for information on gLite update selection), you need to restart Torque/MAUI on CE and WNs. This involves :

  • Logging in to the CE, stopping services pbs_server and maui (maui must generally be stopped with kill -TERM), then starting services pbs_server and maui.
  • Defining LRMS_CLIENT_RESTART to force a Torque client restart on each WN.
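
On the CE, the first step above might look like the following sketch. It assumes that the init scripts are called pbs_server and maui, as the service names above suggest, and it uses pkill to send the TERM signal to maui, which is one possible way of doing the kill -TERM mentioned above. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the Torque/MAUI restart on the CE.
# DRY_RUN=1 (the default) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when in dry run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_torque_maui() {
    run service pbs_server stop
    # maui generally has to be stopped with a TERM signal ;
    # -x matches the exact process name.
    run pkill -TERM -x maui
    run service pbs_server start
    run service maui start
}
```

Calling restart_torque_maui prints the four commands ; set DRY_RUN=0 to execute them.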

gLite-3.0.2-10 : SE removed from NFS exports

Until QWG release gLite-3.0.2-10, default export list for NFS served file systems contained an entry for the default SE. This has been changed. The only node added to the export list by default is the CE. All others must be added using variable SITE_WN_HOSTS whose value is typically a regexp matching name of nodes requiring access to the NFS file system. See site parameters example.

gLite-3.0.2-10 : SE_HOST_DEFAULT deprecated

To allow greater flexibility in definition of close and default SE, SE_HOST_DEFAULT variable has been replaced by 2 variables supporting per VO definitions. Look at section on SE configuration for more details.

For backward compatibility, if SE_HOST_DEFAULT variable is present and new variables are not defined, its value is used for both close and default SE of all VOs.

gLite-3.0.2-10 : SEDPM_DISK_HOSTS no longer used

SEDPM_DISK_HOSTS was used to configure GridIce on DPM disk servers. GridIce configuration is now based on DPM configuration. This variable is no longer used and can be safely removed.

gLite-3.0.2-10 : DPM : support added for SRM v2.2

QWG Templates and ncm-dpmlfc have been updated to allow management of DPM SRM v2.2 service. To enable it, you need to edit your DPM site configuration template pointed to by variable SEDPM_CONFIG_SITE and add an entry for SRM v2.2 service, similar to the entry for SRM v2. Look at example of DPM site configuration template.

gLite-3.0.2-10 : pro_software_component_dpmlfc included in standard DPM configuration

pro_software_component_dpmlfc is now included as part of the standard DPM configuration, before including the template defining DPM site configuration. Thus, it is no longer necessary to include it in the template defining the local DPM configuration.

It is recommended that you edit your template defining DPM site configuration to suppress include of pro_software_component_dpmlfc, as the name of this template will change in a future release as a consequence of conversion to namespace.

gLite-3.0.2-9 requires panc >= 6.0.3

As of QWG Templates release gLite-3.0.2-9, minimum required version of panc compiler is 6.0.3.

lfc/config.tpl compilation error

After installing QWG Templates release gLite-3.0.2-9, if you get an error compiling lfc/config.tpl, be sure to read section on LFC site parameters. This happens because password defaults are no longer provided.

Upgrading a LCG RB to update 13 and later

If you want to upgrade a LCG RB from gLite 3.0 <= update 12 to gLite 3.0 >= update 13 (corresponding to QWG templates release >= gLite-3.0.2-9), be sure to read the release notes. Because of an internal change, all unfinished jobs submitted through the RB will be forgotten. Thus it is recommended to drain the RB at least 2 days before doing the upgrade.

To drain a RB, the easiest is to stop the network server with the following command :

service edg-wl-ns stop

While the RB is draining, no new job can be submitted and output of completed jobs cannot be retrieved, but users can still get information about the status of their jobs.

It is a good idea to stop the Quattor client on the RB during this period using command :

service ncm-cdispd stop

Update of voms.cern.ch certificate

Release 3.0.2-6 of QWG templates provides an updated version of vo/certs/cern-alt.tpl (certificate of voms.cern.ch), named vo/certs/cern-alt.tpl.new, as provided by gLite 3.0 update 11. It cannot be activated right now as the certificate has not yet been updated on the server.

Once the server has been updated (should happen 9/1/07), you'll have to replace the current cern-alt.tpl with this new one, overwriting the existing certificate, and then deploy as usual.

AII : aii-shellfe error about bootloader

Release 3.0.2-9 of QWG templates introduces the support for explicit specification of the disk to use to install the boot loader. This is required for systems with a very large number of disks.

Because of this change, it is necessary to update the Kickstart template you use. You can find an up to date working Kickstart template in the QWG repository. This template must be installed in the directory pointed to by templatedir in /etc/aii-osinstall.conf on your Quattor server (normally /usr/lib/aii/osinstall).

AII : ncm-template required

Release 3.0.2-5 of QWG templates upgrades component ncm-ncd. The new version requires ncm-template.

As this component is installed as part of Kickstart initial installation during post installation script, it is necessary to update the Kickstart configuration template. You can find a working Kickstart template in the QWG repository. This template must be installed in the directory pointed to by templatedir in /etc/aii-osinstall.conf on your Quattor server (normally /usr/lib/aii/osinstall).

Change in how to run MPI jobs

MPI integration into middleware changed substantially in release 3.0.2-5 of QWG templates. These changes are the result of an effort to make the MPI integration more efficient, more flexible and... more stable. The new design for MPI integration has been agreed upon by a large community in a meeting held in Dublin in December 2006.

More information on how to use MPI in grid jobs is available at url http://grid.ie/mpi/wiki/FrontPage.

Shared working areas for MPI jobs (Torque v2)

Because EDG_WL_SCRATCH is defined unconditionally to the directory created by Torque on the worker node for the job, MPI jobs have no shared working areas even if home directories are shared. An attempt to fix this was made in 3.0.2-6 but broke the normal behaviour for non MPI jobs in shared home directories configurations (which is to have the working area on the WN local directory). Thus the change was reverted in 3.0.2-7.

This problem should be fixed in 3.0.2-8. As a temporary workaround, you can keep common/torque2/client/config from 3.0.2-6 if it worked for you.

quattor/config not found

After upgrading to QWG template release gLite-3.0.2-5, if PAN compiler complains it cannot find quattor/config, you need to add standard before standard/**/* in the cluster.build.properties of your clusters.

LCMAPS error after upgrading from LCG 2.7.0

This is caused by VOMS related libraries having been moved from /opt/edg to /opt/glite.

ncm-ldconf, run as part of the upgrade, updates the shared libraries cache (/etc/ld.so.cache) only if the contents of /etc/ld.so.conf have changed. Unfortunately this is not the case between LCG 2.7 and gLite 3.0 : some libraries have simply been moved from one path to another.

To fix this problem, log on the machine and run :

ldconfig

No service restart is needed.

DPM upgrade from LCG 2.6/2.7

gLite 3.0 DPM (1.5) includes integration with VOMS and requires a database schema upgrade. This must be done manually on the DPM master node. The following steps are needed :

  • Create a script to call the upgrade procedure (replace the values with those for your site) :
    #!/bin/sh
    
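    # Stub for the requires helper that the YAIM function is expected to
    # call ; it does nothing except print a message.
    requires () {
        echo "requires : nothing done"
    }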
    
    # Edit to match your site
    export MY_DOMAIN='your.dom.ain'
    export DPM_HOST='dpm.your.dom.ain'
    export DPM_DB_USER='AdminDbUser'      # Generally root
    export DPM_DB_PASSWORD='AdminDBPwd'
    
    . /opt/glite/yaim/functions/config_DPM_upgrade
    
    config_DPM_upgrade
    
  • Check that AdminDbUser/AdminDBPwd has full privileges on your database server
  • Run the script
  • Restart all DPM daemons. The easiest is to delete /etc/shift.conf and run the following command :
    ncm-ncd --configure dpmlfc
    
  • Run the command in /etc/cron.d/lcgdm-mapfile-update.ncm-cron.cron
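
For the last step, the command can be read from the cron file rather than copied by hand. The helper below is a sketch that assumes the usual /etc/cron.d line layout (five time fields, then the user, then the command) ; its name is hypothetical. It prints the command part of each job line and skips comments, blank lines and variable assignments.

```shell
#!/bin/sh
# Print the command part of each job line of a cron.d style file.
# ASSUMPTION: job lines have five time fields, a user name, then the command.
cron_commands() {
    awk '
        /^[ \t]*(#|$)/ { next }                           # comments and blank lines
        /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ { next }   # VAR=value settings
        {
            line = $0
            # Drop the five time fields and the user field.
            for (i = 1; i <= 6; i++)
                sub(/^[ \t]*[^ \t]+[ \t]+/, "", line)
            print line
        }' "$1"
}
```

Running cron_commands /etc/cron.d/lcgdm-mapfile-update.ncm-cron.cron shows the command to execute. After checking it, the output can be piped to sh as the user named in the cron entry.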

Condor RPM name not matching internal name

RB, VOBOX, WMS normally require Condor RPM condor-6.7.10-linux-x86-glibc23-dynamic-1.i386.rpm. Unfortunately the internal name of this RPM is condor-6.7.10-1.i386. This doesn't work with SPMA, which uses the internal name to know if a RPM is already installed.

To work around this problem, templates load condor-6.7.10-1.i386.rpm, which also exists (but seems different) in gLite 3.0 distribution (external packages). The following steps are required to load the right RPM :

  • In RPM repository for gLite external packages, rename condor-6.7.10-1.i386.rpm to something else.
  • Create a symlink called condor-6.7.10-1.i386.rpm to condor-6.7.10-linux-x86-glibc23-dynamic-1.i386.rpm.
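
The two steps above can be sketched as follows. The function takes the directory of the RPM repository for gLite external packages as argument ; the function name and the .orig suffix given to the renamed file are arbitrary choices, not something imposed by the templates.

```shell
#!/bin/sh
# In the given repository directory, set the short Condor RPM name aside
# and make it a symlink to the RPM actually required.
fix_condor_rpm() {
    repo_dir="$1"
    short="condor-6.7.10-1.i386.rpm"
    long="condor-6.7.10-linux-x86-glibc23-dynamic-1.i386.rpm"

    # The RPM the symlink will point to must be present.
    if [ ! -f "$repo_dir/$long" ]; then
        echo "missing $repo_dir/$long" >&2
        return 1
    fi
    # Rename the existing short-named RPM, unless it is already a symlink.
    if [ -e "$repo_dir/$short" ] && [ ! -L "$repo_dir/$short" ]; then
        mv "$repo_dir/$short" "$repo_dir/$short.orig"
    fi
    # Relative symlink, so that it stays valid if the repository is moved.
    ln -sf "$long" "$repo_dir/$short"
}
```

The function can be run again safely : if the short name is already a symlink, it is simply recreated.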

This problem has been logged to GGUS, ticket 10567.

LCG RB upgrade

gLite 3.0 includes a new version of Condor that is no longer installed in /opt/condor but /opt/condor-version. Also default name for Condor configuration file is now condor_config instead of condor.conf.

Condor relies on CONDORG_INSTALL_PATH and CONDOR_CONFIG environment variables to know where it is installed and where the configuration file is. Unfortunately, the script starting Condor (/etc/init.d/edg-wl-jc) relies on /opt/edg/etc/profile.d/edg-wl-config.sh to get these variables defined in the context of the script (from /etc/sysconfig/globus, the actual place where they are defined). But this script doesn't take care of exporting these variables when they are defined in /etc/sysconfig/globus. As a consequence, Condor master doesn't see them. This has been logged into GGUS as ticket 10628.

In the meantime, before the problem is fixed, you need a patched version of edg-wl-config.sh. It is provided as part of LCG RB configuration, by QWG templates. But there is no way to ensure that a further reinstallation of the RPM will not overwrite this patched version.

If CondorG refuses to start, complaining that CONDOR_CONFIG is not defined, you should use the following command to reinstall the patched version :

ncm-ncd --configure filecopy

fetch-crl

gLite templates require the most recent fetch-crl version released by EUGRID PMA in spring 2007 (2.6.0-1). Before gLite-3.0.0-3, the RPMs list provided only version 2.0-1, which does not work properly with the configuration set up by templates. As a result, you quickly reach expiration of CRLs and nothing works anymore...

Starting with gLite-3.0.0-3, the RPMs list requires the right version. But this version is not yet part of gLite distribution, so you need to get it directly from EUGRID PMA site and put it in the RPM repository for gLite 3.0 updates.

GlueHostApplicationSoftwareRunTimeEnvironment

This Glue attribute should normally contain a list of tags describing the software / middleware environment available on the CE. This list needs to be updated with each new release of the middleware. Previously it was the responsibility of the site to update variable CE_RUNTIMEENV. There is now a more flexible method described in gLite3 customization page.

Change Log

ChangeLog built from repository commit messages...

Changelog not available

Last modified on Oct 26, 2008, 11:39:37 PM