Version 10 (modified by /O=GRID-FR/C=FR/O=CNRS/OU=LAL/CN=Michel Jouvin, 15 years ago)

OS Errata Management and Deployment

Quattor can greatly help with OS errata deployment. The QWG templates have a few specific features and tools to help manage them.

Downloading RPM Errata

RPM errata should normally be downloaded from the official public servers for the distribution you use. With Scientific Linux it is possible to use rsync to synchronize a local directory with the official repository. The official rsync URL prefixes are:

  • For SL4:
    rsync://rsync.scientificlinux.org/scientific/VERSION/ARCH/errata/SL/RPMS/
    
  • For SL5:
    rsync://rsync.scientificlinux.org/scientific/VERSION/ARCH/updates/security
    

GRIF tends to have a rather up-to-date mirror of these repositories that you can access with http at:

http://quattorsrv.lal.in2p3.fr/packages/os/VERSION/errata

These are only suggestions: there are many ways to synchronize with a reference repository, including YUM and a script provided by SCDB, utils/misc/sync-os-errata, which can be run from a cron job using rsync or wget.

Generating Templates for OS Errata

After downloading the RPM errata, it is necessary to generate a template that will be used for deploying them. This is done with the SCDB script utils/misc/rpmErrata.pl. This script accepts one argument, the name of the local directory that contains the errata and is configured as an SCDB RPM repository. It writes to stdout a template with a pkg_ronly entry for the latest version of every RPM found in that directory. The output must be redirected to a template.

For example, assuming your RPM errata for SL 4.7 x86_64 are located in directory /www/htdocs/packages/os/sl470-x86_64/errata, the command would be:

utils/misc/rpmErrata.pl /www/htdocs/packages/os/sl470-x86_64/errata/ > cfg/os/sl470-x86_64/rpms/errata.tpl

Note: rpmErrata.pl is very verbose. All the information messages are sent to stderr and can be redirected separately.

Because upgrading kernels requires specific handling, kernel entries are not added to the resulting template. See the section on kernel errata below.

Because the template uses the SPMA function pkg_ronly(), an erratum is included in the configuration only if another version of the same package and architecture is already part of the configuration.

Note: normally, the generated template can be used as is, without any manual editing. Because pkg_ronly only replaces an RPM that is already part of the configuration, this may not work in the very rare cases where an RPM is renamed. In this case, you need to manually update the template to replace pkg_ronly with pkg_repl (same arguments) and add a pkg_del line, with the old package name as its only argument, to remove the old package. This is also necessary for kernel modules where the kernel version is part of the module's RPM name.
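
As an illustration of the manual fix described in this note, the edited lines might look like the sketch below. The package names, version and architecture are hypothetical placeholders, not taken from a real erratum.

```pan
# Hypothetical case: package 'oldname' was renamed to 'newname' in an erratum.
# Line generated by rpmErrata.pl, with pkg_ronly changed to pkg_repl (same arguments):
'/software/packages' = pkg_repl('newname', '1.0-2', 'x86_64');
# Added line: remove the package under its old name (the name is the only argument).
'/software/packages' = pkg_del('oldname');
```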

Deploying OS Errata

Errata deployment is controlled through the variable PKG_DEPLOY_OS_ERRATA. By default, to avoid any problem at a site, errata deployment is disabled, but sites are strongly encouraged to set this variable to true to enable it. The most usual places to define this variable are site/cluster-info.tpl (source:templates/trunk/clusters/example-3.1/site/cluster-info.tpl), to control it at the cluster level, or site/config.tpl (source:templates/trunk/sites/example/site/config.tpl) in your site-specific templates, to control it at the site level. It is recommended to define a default value (using the ?= operator) to allow further redefinition in a node profile.
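
A minimal sketch of such a default definition, as a fragment to place in the cluster-level or site-level template mentioned above:

```pan
# Enable errata deployment unless the variable was already defined earlier,
# for example at the beginning of a node profile.
variable PKG_DEPLOY_OS_ERRATA ?= true;
```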

When enabled, OS errata deployment uses the template rpms/errata.tpl in the templates for the OS version used on a specific node. This default name can be changed with the variable PKG_OS_ERRATA_TEMPLATE, which defines the template to use. It is strongly recommended to use a non-default name to avoid any clash with the errata.tpl provided by the standard templates. This also makes it possible to produce a different template for each version of the errata (using the date in the template name, for example), giving finer control over what is deployed, when, and on which machine.
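
As a sketch of the dated naming scheme, the errata template could be selected as follows. The template name is hypothetical, and the value format (a template name without the .tpl extension) is an assumption that should be checked against your version of the QWG templates.

```pan
# Assumption: the value is a template name, given without the '.tpl' extension.
# 'rpms/errata-20090115' is a hypothetical name for a template produced by rpmErrata.pl.
variable PKG_OS_ERRATA_TEMPLATE ?= 'rpms/errata-20090115';
```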

It sometimes happens that an errata RPM causes conflicts or has dependencies that are difficult to resolve in the site context. If this RPM is not used or not critical for the site/node, it is possible to remove it by adding to the errata template something like:

'/software/packages' = pkg_del('myconflictingrpm');

Note: never try to remove the RPM from the base templates used to configure the OS. First, it may break things when the errata are not deployed. Also, a given RPM is often added by several templates. But the main reason is that these templates are entirely generated from the distribution's official list. The line shown above gives the same result, except that its effect is limited to the context of these errata.

A good place for this type of modification to the base errata template, as well as for the kernel upgrade handling described below, is a dedicated template referenced by the variable PKG_OS_ERRATA_FIX_TEMPLATE.
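
A dedicated fix template might look like the sketch below. The template name site/os/errata-fix is hypothetical; the pkg_del line is the one shown earlier.

```pan
# Hypothetical errata fix template, stored as site/os/errata-fix.tpl
unique template site/os/errata-fix;

# Drop an errata RPM that conflicts in the site context.
'/software/packages' = pkg_del('myconflictingrpm');
```

It would then be selected by setting PKG_OS_ERRATA_FIX_TEMPLATE to 'site/os/errata-fix', for instance in site/cluster-info.tpl, assuming the variable takes a template name without the .tpl extension.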

Note: if you use the dummy WN feature in the gLite templates, you may want to disable it temporarily. Otherwise, when the errata are deployed on the reference profile, all the nodes using it will receive the errata too.

Kernel Errata

Kernel errata need specific handling because of some restrictions in the current version of SPMA and because an improper upgrade may leave a machine unable to restart.

The kernel version, for the kernel itself and for all the kernel modules you may use, is selected with the standard kernel selection method, based on the OS_KERNEL_VERSION variable. This variable is typically defined at the beginning of the node profile, as it has to match the OS version used (it can also be defined in the cluster's site/cluster-info.tpl if all machines in the cluster that share the same OS version use the same kernel version). The value must be the kernel RPM version.
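
For example, a node profile could start with a definition like the following sketch. The version string is only an illustrative value for an SL4 kernel RPM; use the version of the kernel RPM you actually want to run.

```pan
# Illustrative value only: must be the version of the kernel RPM to use on this node.
variable OS_KERNEL_VERSION = '2.6.9-78.0.13.EL';
```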

With the current version of SPMA, however, it is not possible to tell SPMA never to uninstall a kernel, even if it is no longer part of the configuration. As a result, if you just replace the kernel, the one actually in use is removed at the same time as the new one is installed, and in case of a problem you may not be able to reboot. A workaround is to add the following lines at the end of the node profile (before the repository configuration), or in any template that is part of the errata configuration if you want to avoid editing a large number of profiles (a good place may be the errata fix template, see above). The lines to add are:

'/software/components/spma/userpkgs' = 'yes';
'/software/packages' = pkg_add(PKG_KERNEL_RPM_NAME,'old-kernel-version',PKG_ARCH_KERNEL,"multi");

with old-kernel-version replaced by the kernel RPM version currently installed.

Note: for the kernel, pkg_add must be used with the multi option to allow several kernel versions to be installed concurrently.