
Initial Installation

This is a quick introduction to AII configuration. AII is the Quattor component in charge of producing the Kickstart configuration file used for initial installation.

Configuration specific to initial installation is made of 2 parts:

  • AII_xxx variables used to configure the base environment (keyboard, language...). This works with both AII v1 and v2.
  • File system and block device definitions: this is specific to AII v2.

AII Site Configuration

The typical place to customize AII configuration is the site-specific template pointed to by variable AII_CONFIG_SITE, or site/aii-config.tpl if the variable is not defined. This template is executed quite late in the configuration, after all the base OS configuration has completed. It can therefore refer to most of the variables built during the configuration, in particular those reflecting the OS flavour used (version and arch), the file systems configured...
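As a minimal sketch of this mechanism, a site could point AII at its own template as follows. The template name site/aii-config-mycluster is a hypothetical example, not a name provided by QWG templates, and the statement is assumed to be placed in a template processed before the AII configuration (for example a cluster-level template).

```pan
# Hypothetical template name: replace it with the template that holds
# your site-specific AII customizations.
variable AII_CONFIG_SITE ?= 'site/aii-config-mycluster';
```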

All AII options may be customized through variables. For the complete list, look at the man pages aii-pxelinux(8) and aii-ks(8) or at the source code in standard/quattor/aii/ks/config.tpl and standard/quattor/aii/pxelinux/config.tpl.

A typical AII configuration is:

# AII specific parameters
variable AII_OSINSTALL_SRV ?= "quattorsrv.lal.in2p3.fr";
# variable AII_ACKSRV ?= "quattorsrv.lal.in2p3.fr";
variable AII_OSINSTALL_ROOT = '/packages/os';

Controlling disks used by Kickstart

Among the AII-related variables available, a few control the disk where the GRUB boot loader is written, which disks can be used by Kickstart during installation, and which disks must have their existing partition table cleared. In particular:

  • AII_OSINSTALL_BOOTDISK_ORDER: list of disk names (without /dev) where the GRUB boot loader is written.
  • AII_OSINSTALL_OPTION_CLEARPART: list of disks (without /dev) whose existing partition table must be cleared by Kickstart.
    • Note that, although the variable name suggests it configures the Kickstart clearpart option, no clearpart option is set in the Kickstart configuration file produced by aii-shellfe. The relevant actions are done in the %pre script instead.
  • AII_OSINSTALL_IGNOREDISKS: list of disks (without /dev) that Kickstart should not touch in any way.
  • AII_OSINSTALL_CLEARPART_BOOT_ONLY: if false and nothing is explicitly specified, all disks managed by Quattor (present in /system/blockdevices/physical_devs) have their existing partition table cleared. If true, only disks having the property boot set to true in the hardware configuration (/hardware/harddisks) are cleared. Default is false.

The default values for these variables should be appropriate as long as the hardware description for the machine is correct. In particular, all the disks present in the hardware configuration are added either to the list of disks whose existing partitions must be cleared or to the list of disks to ignore. When a machine has several disks, it is possible to set the property boot to true for the disk entry in /hardware/harddisks corresponding to the system disk (if there is only one disk, it is assumed to be the system disk).
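As an illustration, the following sketch flags one disk as the system disk and restricts partition table clearing to it. The disk name sda is an assumption about the hardware, and the two statements are assumed to be placed in the hardware description and in the AII site template respectively.

```pan
# In the hardware description of the machine: flag sda (assumed disk name)
# as the system disk.
"/hardware/harddisks/sda/boot" = true;

# In the AII site template: only clear disks whose boot property is true.
variable AII_OSINSTALL_CLEARPART_BOOT_ONLY ?= true;
```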

Note that, when using AII_OSINSTALL_OPTION_CLEARPART and AII_OSINSTALL_IGNOREDISKS, the final list for each kind of disk may differ from what was specified, because disks present in the configuration but not mentioned in the variables are also taken into account. Nevertheless, the final configuration reflects what was explicitly specified (a disk is never moved to the other list). A check is also made that a given disk belongs to only one list.
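When the defaults are not appropriate, the lists can be set explicitly. The sketch below assumes a machine with two disks named sda and sdb; both names and the intended roles are illustrative, not taken from any real configuration.

```pan
# Assumed layout: sda is the system disk, sdb holds data that must not be
# touched during installation. Disk names are given without the /dev prefix.
variable AII_OSINSTALL_BOOTDISK_ORDER ?= list('sda');
variable AII_OSINSTALL_OPTION_CLEARPART ?= list('sda');
variable AII_OSINSTALL_IGNOREDISKS ?= list('sdb');
```

Each disk appears in only one of the last two lists, consistent with the check described above.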

Kickstart Site-Specific Installation Actions

AII makes it possible to define, in a flexible way, site-specific actions that must be performed at various points in Kickstart/Anaconda. This is done through AII hooks. Writing and configuring hooks is not yet documented...

Configuration of Filesystems: The Recommended Way

QWG Templates provide a generic template to configure file systems on a typical system with 1 or 2 disks. It is based on template standard/filesystem/config.tpl, which may be customized according to your site needs with a limited set of variables. In addition to this generic template for defining disk layouts, QWG templates provide 4 examples of how to use it:

  • site/filesystems/glite.tpl: defines a default system disk layout for a typical grid system running gLite, using LVM for everything except /boot, / and swap. This template also provides some variables to further refine this default layout without redefining everything.
  • site/filesystems/ce_nfs_server.tpl: adds 2 file systems to the basic layout provided by the previous template, one for home directories and one for VO software areas. It is intended to be used on a machine configured as an NFS server to serve these file systems to WNs (and possibly the CE if it is not the NFS server). It illustrates how to configure the basic layout template without redefining everything.
  • site/filesystems/extended.tpl: defines a default system disk layout for a typical grid system running gLite, using logical partitions instead of LVM for everything except /boot, / and swap. This illustrates how to customize the default layout. It is an alternative to site/filesystems/glite.tpl. This template also provides some variables to further refine this default layout without redefining everything.
  • site/filesystems/sw_raid.tpl: defines a default system disk layout for a typical grid system running gLite, using software raid (md devices) instead of LVM for everything except swap. This illustrates how to customize the default layout. It is an alternative to site/filesystems/glite.tpl. This template also provides some variables to further refine this default layout without redefining everything.

These templates are provided only as examples of how to use standard/filesystem/config.tpl, which can handle many other layouts. You can build your own from scratch, mix the examples provided (there is no restriction to using only LVM, logical partitions or raid volumes in one layout; this was done just to keep the examples clear) or use other types of block devices like hardware raid.

The basic idea is to declare in these base layout templates (like site/filesystems/glite.tpl, site/filesystems/extended.tpl, site/filesystems/sw_raid.tpl) all the possible file systems that can be found on a system, most of them with a zero size, meaning they will not be created unless the variable defining their size is explicitly set to a non-zero value in another template that includes the basic layout (as site/filesystems/ce_nfs_server.tpl does with site/filesystems/glite.tpl). The main effort is maintaining the basic layout, but it has the advantage of providing a central point for defining any potential configuration, avoiding a lot of duplication.

To use these templates, you need to define the following 2 variables in your node profile or one of your site-specific templates (typically site/cluster_info.tpl):

Customizing File System List

The responsibility of the layout template (like the provided examples site/filesystems/glite.tpl, site/filesystems/extended.tpl, site/filesystems/sw_raid.tpl) is to build a variable DISK_VOLUME_PARAMS, which is an nlist that contains one entry per file system or block device (the key in the nlist is just an arbitrary identifier used for cross-referencing entries). Both of them are described the same way in the same variable: they all have a set of attributes declared in an nlist. Possible attributes vary depending on whether the entry is a file system or a block device, and on the block device type (partition, software raid, hardware raid, LVM). An entry is considered a file system if it has an attribute mountpoint defined.

When the layout template is executed, DISK_VOLUME_PARAMS already exists with some typical default entries. For this reason, the layout template doesn't create the variable from scratch with function nlist() but uses the function filesystem_mod() to update it. This is also why, in the examples (site/filesystems/glite.tpl, site/filesystems/extended.tpl, site/filesystems/sw_raid.tpl), file system entries like root and usr don't have a mountpoint attribute defined in the layout template. The defaults are defined in the file system configuration template. Look at the DISK_VOLUME_PARAMS definition in it if you need to check what the default entries and attributes are.

One important feature of the file system configuration template is that if an entry in DISK_VOLUME_PARAMS has a zero size, it is removed from the list of file systems or block devices to create. The template also ensures that an entry with no explicit size that uses an underlying block device with a zero size is removed as well. For example, this may be the case for a software raid block device whose size is derived from the partitions it uses but is not explicitly defined. This is also the case for a file system relying on a raid block device.

The defaults defined in the file system configuration template include the creation of an LVM volume group called vg.01 using the unused part of the system disk (because the default configuration is LVM-based). As a result, if you would like an LVM-free file system configuration, you need to update the entry for vg.01 and set its size to 0. Look at site/filesystems/sw_raid.tpl (a pure software raid configuration) for an example. You can also keep an LVM-based configuration but redefine in your layout template the volume group to use for a given file system (using the volgroup attribute).
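A minimal sketch of disabling the default volume group is shown below. It assumes the vg.01 entry can be updated in place through SELF in a DML block; the provided examples may use a helper function instead, so check site/filesystems/sw_raid.tpl for the form actually used.

```pan
# Sketch only: give the default LVM volume group a zero size so that it is
# not created. Assumed to run in the layout template, where
# DISK_VOLUME_PARAMS already holds the default entries.
variable DISK_VOLUME_PARAMS = {
    SELF['vg.01']['size'] = 0;
    SELF;
};
```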

Another important feature demonstrated in some of the examples provided (site/filesystems/extended.tpl, site/filesystems/sw_raid.tpl) is how to control the creation order of partitions on a disk. This is done by defining a list (called DISK_GLITE_PARTS in the examples) that is used to determine the number appended to the base device when creating the partition (this can also be used for MD devices). Look at site/filesystems/sw_raid.tpl for an example.

Note: you will not see this variable in site/filesystems/glite.tpl because it relies on defaults provided by the file system configuration template.

Some useful hints to help maintain DISK_VOLUME_PARAMS:

  • There is no need to explicitly declare the characteristics of disk partitions or logical volumes. This is why you will not see entries for them in the examples. If an entry is using a disk partition (eg. a file system) and has a size attribute defined, a block device corresponding to the declared size will be created. If the partition number is greater than or equal to 5, an extended partition will automatically be added as the fourth partition on the device. This is true in particular for file systems and software raid 1 (md) devices. The same applies to LVM logical volumes, but a file system using a logical volume is required to have an attribute volgroup defined.
  • A special value for the size attribute is -1. This means you want to use all the remaining unused space in the underlying block device (eg. partition or logical volume). The file system configuration template will ensure that this entry is created last in the underlying block device and that there are not 2 entries with a size of -1 in the same block device.
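The two hints above can be sketched together as follows. The entry key home is an arbitrary identifier, and updating DISK_VOLUME_PARAMS through SELF is an assumption made for illustration; additional attributes may be required depending on the defaults already present in the file system configuration template.

```pan
# Sketch only: a /home file system on a logical volume that takes all the
# space left in volume group vg.01 (size -1). The volgroup attribute is
# required because the file system uses a logical volume.
variable DISK_VOLUME_PARAMS = {
    SELF['home']['mountpoint'] = '/home';
    SELF['home']['volgroup'] = 'vg.01';
    SELF['home']['size'] = -1;
    SELF;
};
```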

Note: if you run into trouble while designing your own layout template, it is recommended to enable debugging in the file system configuration template. The output is fairly verbose but self-explanatory and generally helps to locate the problem quickly. Because the output is verbose and the PAN compiler is highly parallel, messages from the compilation of different profiles may be interleaved, so it is recommended to compile only one profile when debugging. With SCDB you can easily select a cluster and enable debugging with the following command:

ant -Dpan.debug.include=filesystem/config -Dcluster.name=your-cluster-name

Defining Default File System Type

The default file system type is ext3. A site can customize this by defining the variable FILESYSTEM_DEFAULT_FS_TYPE.
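For instance, a site could change the default as sketched below. The value ext4 is only illustrative and is assumed to be supported by the installer of the OS version being deployed; the statement is assumed to be placed in a site template processed before the file system configuration template.

```pan
# Illustrative value: use ext4 instead of the ext3 default.
variable FILESYSTEM_DEFAULT_FS_TYPE ?= 'ext4';
```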

Defining Default Formatting Policy

It is possible to define the site defaults for attributes format and preserve that are used when reinstalling a system with valid partitions. This is done respectively with variables FILESYSTEM_DEFAULT_FORMAT and FILESYSTEM_DEFAULT_PRESERVE. They are both true by default.
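A sketch of overriding one of these defaults is given below. The intent assumed here, which is not confirmed by this page, is to avoid reformatting existing partitions on reinstallation; check the file system configuration template for the exact interplay between the two attributes before relying on it.

```pan
# Sketch only: change the site default for the format attribute and keep
# the preserve attribute at its default value (true).
variable FILESYSTEM_DEFAULT_FORMAT ?= false;
variable FILESYSTEM_DEFAULT_PRESERVE ?= true;
```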

File Systems and Block Devices : The Gory Details

In AII v2, block devices and file systems are declared separately. The sections below give some details about how to configure each of them. Block device entries are used only if referenced by a file system or another block device entry.

Note: directly configuring file systems and block devices is not recommended, as it may be quite complex to ensure the consistency of all the parts involved. If possible, use the generic template to configure file systems. The two methods cannot be mixed.

File Systems

File systems are declared as an ordered list, which allows customizing different aspects of a file system (mount point, mount options, format, ...). Each file system in the list is described as an nlist. One property in the nlist describing a file system is block_device: its value is an entry in the structure describing block devices (physical disks, logical volumes, ...). To add or modify a file system on a system, it is recommended to use the filesystem_mod function. An example is:

"/system/filesystems" = filesystem_mod(
  list(nlist ("block_device", "partitions/" + DISK_PART_BOOT,
              "mountpoint", "/boot",
              "format", true,
              "mount", true,
              "preserve", false,
              "type","ext2"),
       nlist ("block_device", "partitions/" + DISK_PART_SWAP,
              "format", true,
              "mount", true,
              "preserve", false,
              "type","swap",
              "mountpoint","swap"),
       nlist ("block_device", "partitions/" + DISK_PART_ROOT,
              "format", true,
              "preserve", false,
              "mount", true,
              "type","ext3",
              "mountpoint","/"),
       nlist ("block_device", "logical_volumes/usrvol",
              "format", true,
              "preserve", false,
              "mount", true,
              "type","ext3",
              "mountpoint","/usr")
  )
);

Block Devices

Block devices define the logical and physical devices used by file systems. They can be logical devices, HW or SW raid devices, physical disks... A block device can be made of other block devices; for example, a logical device is made of one or more physical devices.

Unlike file systems, which are described with an ordered list, block devices are defined as an nlist. At installation time, they are processed (partition creation, logical volume creation...) in the order of the file systems that use them.

Several functions are available to help declare block devices, in particular:

  • partitions_add(PHYS_DISK,partition_nlist): adds entries efficiently to the /system/blockdevices/partitions nlist.
  • lvs_add(VOL_GROUP,logvol_nlist): adds entries efficiently to the /system/blockdevices/logical_volumes nlist.

An example of block device definition is:

"/system/blockdevices/physical_devs" = npush (
  DISK_BOOT_DEV, nlist ("label", "msdos")
);

"/system/blockdevices/partitions" = partitions_add (DISK_BOOT_DEV, nlist (DISK_PART_BOOT, 64*MB,
                                                                          DISK_PART_SWAP, 4*GB,
                                                                          DISK_PART_ROOT, 1*GB,
                                                                          DISK_PART_LOGPARTS, -1
                                                                         )
                                                   );
    
"/system/blockdevices/volume_groups" = npush (
    "vg0", nlist ("device_list", list ("partitions/" + DISK_PART_LOGPARTS))
);

"/system/blockdevices/logical_volumes" = lvs_add ("vg0", nlist("usrvol", 5*GB,
                                                               "homevol", 512*MB,
                                                               "tmpvol", 1*GB,
                                                               "varvol", -1
                                                              )
);

Look at site/filesystems/glite.tpl for more details.

Troubleshooting

Installation failure during %pre script

The so-called %pre script is a script used early in the installation process by the standard RH/SL/CentOS installer (Anaconda). This script is created by Quattor to handle the disk partitioning based on the disk configuration described in the node profile. If the installer reports an error during this script (generally no reason is given), an error occurred during disk partitioning or file system initialization. It is generally necessary to go to the alternate consoles to do the troubleshooting.

2 important causes of problems are:

  • Lack of msdos label: the model for file systems and block devices allows partition labels other than msdos. It is even possible not to use partition tables at all on non-system disks. However, Anaconda doesn't like this and will stop your installation if it finds disks with non-msdos labels. For this reason, AII will only create file systems that are fully enclosed in msdos-labeled disks. This applies to the partition beneath a file system, all members of a given software RAID, or whatever combination you can think of. File systems on other partition tables must be created after the node is installed, for instance with ncm-filesystems. Use the second alternate console to check the labels on your disks.
  • LVM volumes are not destroyed: when reinstalling a system that was previously installed with an LVM partition using the same disk partitioning, the old LVM information (which is on disk) is reused and the logical volumes are not recreated. One possible workaround is to use the second alternate console to destroy the volume group or to reformat the LVM partition as something different in order to clear the previous information.

Error creating Fetch object for …

Should you get this message when running aii-shellfe, it indicates that /etc/ccm.conf is missing. This may happen if your Quattor server is not managed with Quattor. To fix the problem, create one with the following contents:

debug 0
force 0
cache_root /var/lib/ccm
get_timeout 30
lock_retries 3
lock_wait 30
retrieve_retries 3
retrieve_wait 30
world_readable 1

Tainted mode warnings (Insecure dependency...)

These messages indicate that you have insecure hooks running and that a fix is needed. AII runs in "tainted" mode, meaning that all input must be sanitized. In user hooks you'll usually find warnings when attempting to open a file whose name is given in the profile or when running a command, for instance:

my $filename = $config->getElement (SOME_PATH)->getValue ();
open (FH, ">$filename");

will issue a warning, meaning that $filename must be sanitized. This is a sanitized version:

my $filename = $config->getElement (SOME_PATH)->getValue ();
if ($filename =~ m{^(/.+)$}) {
    $filename = $1;
} else {
    throw_error ("Expected an absolute path on $filename");
    return ();
}
open (FH, ">$filename");

Note that the above example assumes you expect an absolute path. If you expect something different (for instance, a path under /osinstall/ks), adjust your regular expression accordingly.

The same applies when you run commands:

my $param = $config->getElement (SOME_OTHER_PATH)->getValue ();
# $param is tainted!!!
system ("ls", "$param");

will fail, so you'll have to specify what you are expecting exactly:

my $param = $config->getElement (SOME_OTHER_PATH)->getValue ();
# I expected just a bunch of flags!!
if ($param =~ m{^(-[-=\w]+)$}) {
    $param = $1;
} else {
    throw_error ("Unexpected flags passed to the command");
    return ();
}
system ("ls", $param);

When you get a warning, it will point out the line where the insecure data is used, but please fix it at the place where the insecure data is received. This will greatly reduce the amount of code and effort needed.

You'll find more information on tainted mode in the perlsec man page.

Tweaking KS parameters for a specific OS version

From time to time, the parameters required by Kickstart change, with either the introduction of new required options or the deprecation of some existing ones. This was for example the case in SL/RHEL6.

This can generally be handled by creating a version-specific template for this OS version.

  • If the tweaking applies to all minor versions of this major version, this is done by:
    1. Going into the directory containing the AII KS plugin templates. In QWG templates, this is in cfg/standard/quattor/aii/ks.
    2. Under this directory, create a directory variants if it doesn't exist yet.
    3. Put the necessary configuration in a template in this directory, named after the distribution and major OS version, eg. sl6.tpl.
  • If the specific configuration applies only to one specific OS version, the necessary configuration can be put in template config/quattor/ks.tpl in the OS-specific templates.

When there is a configuration specific to one OS version, the configuration specific to the related major OS version is included first, if it exists.

The OS-specific templates related to KS configuration that are included in a given profile can be displayed by enabling Pan debug information in template quattor/aii/ks.

Last modified on Jun 24, 2011, 3:01:31 PM