= Initial Installation =

[[TracNav]]

[[TOC(inline)]]

This is a quick introduction to AII configuration. AII is the Quattor component in charge of producing the Kickstart configuration file used for initial installation.

Configuration specific to initial installation is made of 2 parts:
 * AII_xxx variables used to configure the base environment (keyboard, language...). This works for both v1 and v2.
 * File system and block device definitions: this is specific to AII v2.

== AII Site Configuration ==

The typical place to customize AII configuration is the site-specific template pointed to by variable `AII_CONFIG_SITE`, or [source:templates/trunk/sites/example/site/aii-config.tpl site/aii-config.tpl] if the variable is not defined. This template is executed quite late in the configuration, after completion of all the base OS configuration. Thus it is possible to refer to most of the variables built during the configuration, in particular those reflecting the OS flavour (version and arch) used, the file systems configured...

All AII options may be customized through variables. For the complete list, look at the man pages ''aii-pxelinux(8)'' and ''aii-ks(8)'' or the source code [source:templates/trunk/standard/quattor/aii/ks/config.tpl standard/quattor/aii/ks/config.tpl] and [source:templates/trunk/standard/quattor/aii/pxelinux/config.tpl standard/quattor/aii/pxelinux/config.tpl].

A typical AII configuration is:
{{{
# AII specific parameters
variable AII_OSINSTALL_SRV ?= "quattorsrv.lal.in2p3.fr";
# variable AII_ACKSRV ?= "quattorsrv.lal.in2p3.fr";
variable AII_OSINSTALL_ROOT = '/packages/os';
}}}

=== Controlling disks used by Kickstart ===

Among all the AII-related variables available, a few control the disk where the GRUB boot loader is written, which disks can be used by Kickstart during installation, and which disks must have their existing partition table cleared.
In particular:
 * `AII_OSINSTALL_BOOTDISK_ORDER`: list of disk names (without `/dev`) where the GRUB boot loader is written.
 * `AII_OSINSTALL_OPTION_CLEARPART`: list of disks (without `/dev`) whose existing partition table must be cleared by Kickstart.
   * ''Note that, although the variable name suggests it configures the Kickstart `clearpart` option, no `clearpart` option is set in the Kickstart configuration file produced by `aii-shellfe`. The relevant actions are done in the `%pre` script instead.''
 * `AII_OSINSTALL_IGNOREDISKS`: list of disks (without `/dev`) that Kickstart should not touch in any way.
 * `AII_OSINSTALL_CLEARPART_BOOT_ONLY`: if false and nothing is explicitly specified, all disks managed by Quattor (present in `/system/blockdevices/physical_devs`) have their existing partition table cleared. If true, only disks having property `boot` set to `true` in the hardware configuration (`/hardware/harddisks`) are cleared. Default is `false`.

The default values for these variables should be appropriate as long as the hardware description of the machine is correct. In particular, all the disks present in the hardware configuration are added either to the list of disks whose existing partitions must be cleared or to the list of disks to ignore. When a machine has several disks, it is possible to set the property `boot` to `true` for the disk entry in `/hardware/harddisks` corresponding to the system disk (if there is only one disk, it is assumed to be the system disk).

Note that, when using `AII_OSINSTALL_OPTION_CLEARPART` and `AII_OSINSTALL_IGNOREDISKS`, the final list for each kind of disk may differ from what was specified, in order to take into account the disks present in the configuration but not listed in the variables. Nevertheless, the final configuration should reflect what was explicitly specified (a disk will not be moved to the other list). A check is also made that a given disk belongs to only one list.
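
As a minimal sketch of the variables described above, assume a machine with two disks named `sda` and `sdb` (hypothetical names), where `sda` is the system disk and `sdb` holds data that must survive a reinstallation. The site AII template could then contain something like the following. The values are illustrative only, and in most cases the defaults derived from the hardware description should be sufficient.
{{{
# Hypothetical example: sda is the system disk, sdb must be left untouched.
# Disk names are given without the /dev prefix.

# Write the GRUB boot loader to sda.
variable AII_OSINSTALL_BOOTDISK_ORDER ?= list('sda');

# Clear the existing partition table of sda only.
variable AII_OSINSTALL_OPTION_CLEARPART ?= list('sda');

# Kickstart must not touch sdb in any way.
variable AII_OSINSTALL_IGNOREDISKS ?= list('sdb');
}}}
Each disk appears in only one of the two lists, which is what the consistency check mentioned above expects.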
== Kickstart Site-Specific Installation Actions ==

AII allows site-specific actions to be defined in a flexible way at various points in Kickstart/Anaconda. This is done through `AII hooks`. Writing and configuring hooks is not yet documented...

== Configuration of Filesystems: The Recommended Way ==

QWG Templates provide a generic template to configure file systems on a typical system with 1 or 2 disks. It is based on template [source:templates/trunk/standard/filesystem/config.tpl standard/filesystem/config.tpl] that may be customized according to your site needs with a limited set of variables.

In addition to this generic template to define disk layouts, QWG templates provide 4 examples of how to use it:
 * [source:templates/trunk/sites/example/site/filesystems/glite.tpl site/filesystems/glite.tpl]: defines a default system disk layout for a typical grid system running gLite, using LVM for everything except `/boot`, `/` and `swap`. This template also provides some variables to further refine this default layout without redefining everything.
 * [source:templates/trunk/sites/example/site/filesystems/ce_nfs_server.tpl site/filesystems/ce_nfs_server.tpl]: adds 2 file systems to the basic layout provided by the previous template, one for home directories and one for VO software areas. It is intended to be used on a machine configured as an NFS server to serve these file systems to WNs (and possibly the CE if it is not the NFS server). It illustrates how to configure the basic layout template without redefining everything.
 * [source:templates/trunk/sites/example/site/filesystems/extended.tpl site/filesystems/extended.tpl]: defines a default system disk layout for a typical grid system running gLite, using logical partitions instead of LVM for everything except `/boot`, `/` and `swap`. This illustrates how to customize the default layout. It is an alternative to [source:templates/trunk/sites/example/site/filesystems/glite.tpl site/filesystems/glite.tpl]. This template also provides some variables to further refine this default layout without redefining everything.
 * [source:templates/trunk/sites/example/site/filesystems/sw_raid.tpl site/filesystems/sw_raid.tpl]: defines a default system disk layout for a typical grid system running gLite, using software raid (md devices) instead of LVM for everything except `swap`. This illustrates how to customize the default layout. It is an alternative to [source:templates/trunk/sites/example/site/filesystems/glite.tpl site/filesystems/glite.tpl]. This template also provides some variables to further refine this default layout without redefining everything.

These templates are just provided as examples of how to use [source:templates/trunk/standard/filesystem/config.tpl standard/filesystem/config.tpl], which can also handle many other layouts. You can build your own from scratch, by mixing the examples provided (there is no restriction to use only LVM, logical partitions or raid volumes in one layout; this was done just for clarity of the examples), or by using other types of block devices like hardware raid.
The basic idea is to declare in the base layout templates (like [source:templates/trunk/sites/example/site/filesystems/glite.tpl site/filesystems/glite.tpl], [source:templates/trunk/sites/example/site/filesystems/extended.tpl site/filesystems/extended.tpl], [source:templates/trunk/sites/example/site/filesystems/sw_raid.tpl site/filesystems/sw_raid.tpl]) all the possible file systems that can be found on a system, most of them with a zero size, meaning they will not be created unless the variable used to define their size is explicitly set to a non-zero value in another template that includes the basic layout (as [source:templates/trunk/sites/example/site/filesystems/ce_nfs_server.tpl site/filesystems/ce_nfs_server.tpl] does with [source:templates/trunk/sites/example/site/filesystems/glite.tpl site/filesystems/glite.tpl]). The main effort is to maintain the basic layout, but it has the advantage of providing a central point for defining any potential configuration, avoiding a lot of duplication.

To use these templates, you need to define the following 2 variables in your node profile or one of your site-specific templates (typically [source:templates/trunk/clusters/example-3.1/site/cluster_info.tpl site/cluster_info.tpl]):
 * `FILESYSTEM_CONFIG_SITE`: define this variable to be [source:templates/trunk/standard/filesystem/config.tpl filesystem/config.tpl]. This is the default in gLite templates.
 * `FILESYSTEM_LAYOUT_CONFIG_SITE`: define this variable to the name of a template that customizes [source:templates/trunk/sites/example/site/filesystems/glite.tpl sites/example/site/filesystems/glite.tpl] by defining the appropriate variables. gLite templates define it to be [source:templates/trunk/sites/example/site/filesystems/glite.tpl site/filesystems/glite.tpl] by default.
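
For illustration, a site template selecting the example NFS server layout could look like the following sketch. It assumes that template names are the paths given above without the `.tpl` extension and relative to the load path, which may differ at your site.
{{{
# Generic file system configuration template
# (already the default in gLite templates).
variable FILESYSTEM_CONFIG_SITE ?= 'filesystem/config';

# Layout template to use: here the example NFS server layout,
# which itself customizes site/filesystems/glite.
variable FILESYSTEM_LAYOUT_CONFIG_SITE ?= 'site/filesystems/ce_nfs_server';
}}}
Because `?=` only assigns a value when the variable is still undefined, these lines must be evaluated before the gLite templates apply their own defaults.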
=== Customizing File System List ===

The responsibility of the layout template (like the provided examples [source:templates/trunk/sites/example/site/filesystems/glite.tpl site/filesystems/glite.tpl], [source:templates/trunk/sites/example/site/filesystems/extended.tpl site/filesystems/extended.tpl], [source:templates/trunk/sites/example/site/filesystems/sw_raid.tpl site/filesystems/sw_raid.tpl]) is to build a variable `DISK_VOLUME_PARAMS`, an nlist that contains one entry per file system or block device (the key in the nlist is just an arbitrary identifier used for cross-referencing entries). Both are described the same way in the same variable: each has a set of attributes declared in an nlist. Possible attributes vary depending on whether the entry is a file system or a block device, and on the block device type (partition, software raid, hardware raid, LVM). An entry is considered a file system if it has a `mountpoint` attribute defined.

When the layout template is executed, `DISK_VOLUME_PARAMS` already exists with some typical default entries. For this reason, the layout template doesn't create the variable from scratch with function `nlist()` but uses the function `filesystem_mod()` to update it. This is also why, in the examples ([source:templates/trunk/sites/example/site/filesystems/glite.tpl site/filesystems/glite.tpl], [source:templates/trunk/sites/example/site/filesystems/extended.tpl site/filesystems/extended.tpl], [source:templates/trunk/sites/example/site/filesystems/sw_raid.tpl site/filesystems/sw_raid.tpl]), file system entries like `root` and `usr` don't have a `mountpoint` attribute defined in the layout template. The defaults are defined in the file system [source:templates/trunk/standard/filesystem/config.tpl configuration template]. Look at the `DISK_VOLUME_PARAMS` definition in it if you need to check what the default entries and attributes are.
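
To make the shape of these entries more concrete, here is a rough sketch. The variable `EXAMPLE_VOLUME_ENTRIES`, the keys `home` and `scratch`, and the sizes are all hypothetical; only the attribute names `mountpoint`, `size` and `volgroup` come from this page. `GB` is assumed to be defined by the standard templates, as in the examples further down. Real entries usually need further attributes, so check the `DISK_VOLUME_PARAMS` defaults in the configuration template for the actual list.
{{{
# Hypothetical entries, shown only to illustrate the shape of DISK_VOLUME_PARAMS.
# In a real layout template they would be merged into DISK_VOLUME_PARAMS with
# the update function mentioned above, not assigned to a new variable.
variable EXAMPLE_VOLUME_ENTRIES = nlist(
    # 'home' is an arbitrary key. The entry has a mountpoint,
    # so it is considered a file system.
    'home', nlist('mountpoint', '/home',
                  'size', 10*GB,
                  'volgroup', 'vg.01'),
    # Zero size: this file system is declared but will not be created
    # unless another template sets a non-zero size.
    'scratch', nlist('mountpoint', '/scratch',
                     'size', 0)
);
}}}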
One important feature of the file system [source:templates/trunk/standard/filesystem/config.tpl configuration template] is that if an entry in `DISK_VOLUME_PARAMS` has a zero size, it is removed from the list of file systems or block devices to create. The template also ensures that an entry with no explicit size that uses an underlying block device with a zero size is removed too. For example, this may be the case for a software raid block device whose size is derived from the partitions it uses rather than explicitly defined. This is also the case for a file system relying on a raid block device.

The defaults defined in the file system [source:templates/trunk/standard/filesystem/config.tpl configuration template] include the creation of an LVM volume group called `vg.01` using the unused part of the system disk (because the default configuration is LVM-based). As a result, if you want an LVM-free file system configuration, you need to update the entry for `vg.01` and set its size to 0. Look at [source:templates/trunk/sites/example/site/filesystems/sw_raid.tpl site/filesystems/sw_raid.tpl] (a pure software raid configuration) for an example. You can also keep an LVM-based configuration but redefine in your layout template the volume group to use for a given file system (using the `volgroup` attribute).

Another important feature demonstrated in some of the examples provided ([source:templates/trunk/sites/example/site/filesystems/extended.tpl site/filesystems/extended.tpl], [source:templates/trunk/sites/example/site/filesystems/sw_raid.tpl site/filesystems/sw_raid.tpl]) is how to control the creation order of partitions on a disk. This is done by defining a list (called `DISK_GLITE_PARTS` in the examples) that is used to determine the number appended to the base device when creating the partition (this can also be used for MD devices).
Look at [source:templates/trunk/sites/example/site/filesystems/sw_raid.tpl site/filesystems/sw_raid.tpl] for an example. ''Note: you will not see this variable in [source:templates/trunk/sites/example/site/filesystems/glite.tpl site/filesystems/glite.tpl] because it relies on defaults provided by the file system [source:templates/trunk/standard/filesystem/config.tpl configuration template].''

Some useful hints to help maintain `DISK_VOLUME_PARAMS`:
 * There is no need to explicitly declare the characteristics of disk partitions or logical volumes. This is why you will not see entries for them in the examples. If an entry uses a disk partition (e.g. a file system) and has a size attribute defined, a block device corresponding to the declared size is created. If the partition number is greater than or equal to 5, an extended partition is automatically added as the fourth partition on the device. This is true in particular for file systems and software raid 1 (md) devices. The same applies to LVM logical volumes, but a file system using a logical volume is required to have a `volgroup` attribute defined.
 * A special value for the size attribute is -1. This means you want to use all the remaining unused space in the underlying block device (e.g. partition or logical volume). The file system [source:templates/trunk/standard/filesystem/config.tpl configuration template] ensures that this entry is created last in the underlying block device and that there are not 2 entries with a size of -1 in the same block device.

''Note: if you run into trouble trying to design your own layout template, it is recommended to enable debugging in the file system [source:templates/trunk/standard/filesystem/config.tpl configuration template]. The output is fairly verbose but self-explanatory and generally helps to quickly locate the problem.
As the output is verbose and because the PAN compiler is highly parallel, which may interleave messages from the compilation of different profiles, it is recommended to compile only one profile when debugging. With SCDB you can easily select a cluster and enable debugging with the following command:''
{{{
ant -Dpan.debug.include=filesystem/config -Dcluster.name=your-cluster-name
}}}

=== Defining Default File System Type ===

The default file system type is `ext3`. A site can customize it by defining variable `FILESYSTEM_DEFAULT_FS_TYPE`.

=== Defining Default Formatting Policy ===

It is possible to define the site defaults for attributes `format` and `preserve`, which are used when reinstalling a system with valid partitions. This is done with variables `FILESYSTEM_DEFAULT_FORMAT` and `FILESYSTEM_DEFAULT_PRESERVE` respectively. They are both `true` by default.

== File Systems and Block Devices : The Gory Details ==

In AII v2, block devices and file systems are declared separately. The sections below give some details about how to configure each of them. Block device entries are used only if referenced by a file system or another block device entry.

''Note: configuring file systems and block devices directly is not recommended, as it may be quite complex to ensure consistency of all the parts involved. If possible, use the generic template to configure file systems. The two methods cannot be mixed.''

=== File Systems ===

File systems are declared as an ordered list, which allows different aspects of a file system to be customized (mount point, mount options, format, ...). Each file system in the list is described as an nlist. One property in the nlist describing a file system is `block_device`: its value is an entry in the structure describing block devices (physical disk, logical volumes, ...). To add or modify a file system on a system, it is recommended to use the `filesystem_mod` function.
An example is:
{{{
# Declare /boot, swap and / on partitions, and /usr on a logical volume.
"/system/filesystems" = filesystem_mod(
    list(
        nlist ("block_device", "partitions/" + DISK_PART_BOOT,
               "mountpoint", "/boot",
               "format", true,
               "mount", true,
               "preserve", false,
               "type", "ext2"),
        nlist ("block_device", "partitions/" + DISK_PART_SWAP,
               "format", true,
               "mount", true,
               "preserve", false,
               "type", "swap",
               "mountpoint", "swap"),
        nlist ("block_device", "partitions/" + DISK_PART_ROOT,
               "format", true,
               "preserve", false,
               "mount", true,
               "type", "ext3",
               "mountpoint", "/"),
        nlist ("block_device", "logical_volumes/usrvol",
               "format", true,
               "preserve", false,
               "mount", true,
               "type", "ext3",
               "mountpoint", "/usr")
    )
);
}}}

=== Block Devices ===

Block devices define the logical and physical devices used by file systems. They can be logical devices, HW or SW raid devices, physical disks... A block device can be made of other block devices; for example, a logical device is made of one or more physical devices. Unlike file systems, which are described with an ordered list, block devices are defined as an nlist. At installation time, they are processed (partition creation, logical volume creation...) in the order of the file systems that use them.

Several functions are available to help in the declaration of block devices, in particular:
 * `partitions_add(PHYS_DISK,partition_nlist)`: adds entries efficiently to the `/system/blockdevices/partitions` nlist.
 * `lvs_add(VOL_GROUP,logvol_nlist)`: adds entries efficiently to the `/system/blockdevices/logical_volumes` nlist.
An example of block device definition is:
{{{
# Physical disk with an msdos partition table
"/system/blockdevices/physical_devs" = npush (
    DISK_BOOT_DEV, nlist ("label", "msdos")
);
# Partitions on the boot disk; -1 means "use the remaining space"
"/system/blockdevices/partitions" = partitions_add (
    DISK_BOOT_DEV,
    nlist (DISK_PART_BOOT, 64*MB,
           DISK_PART_SWAP, 4*GB,
           DISK_PART_ROOT, 1*GB,
           DISK_PART_LOGPARTS, -1)
);
# Volume group built on the last partition
"/system/blockdevices/volume_groups" = npush (
    "vg0", nlist ("device_list", list ("partitions/" + DISK_PART_LOGPARTS))
);
# Logical volumes in vg0
"/system/blockdevices/logical_volumes" = lvs_add (
    "vg0",
    nlist ("usrvol", 5*GB,
           "homevol", 512*MB,
           "tmpvol", 1*GB,
           "varvol", -1)
);
}}}
Look at [source:templates/trunk/sites/example/site/filesystems/glite.tpl site/filesystems/glite.tpl] for more details.

== Troubleshooting ==

=== Installation failure during %pre script ===

The so-called `%pre` script is a script used early in the installation process by the standard RH/SL/CentOS installer (Anaconda). This script is created by Quattor to handle the disk partitioning based on the disk configuration described in the node profile. If the installer reports an error during this script (generally no reason is given), this means an error occurred during the disk partitioning or the file system initialization. It is generally necessary to go to the alternate consoles to do the troubleshooting. 2 important causes of problems are:
 * Lack of `msdos` label: the model for file systems and block devices allows partition labels other than '''msdos'''. It is even possible not to use partition tables at all on non-system disks. However, Anaconda doesn't like this and will stop your installation if it finds disks with non-msdos labels. For this reason, AII will only create file systems that are fully enclosed in msdos-labeled disks. This covers the partition beneath a file system, all members of a given software RAID, or whatever combination you can think of. File systems on other partition tables must be created after the node is installed, for instance with ncm-filesystems.
   Use the second alternate console to check the labels on your disk.
 * LVM volumes are not destroyed: when reinstalling a system which was previously installed with an LVM partition using the same disk partitioning, the old LVM information (which is on-disk) is reused and the logical volumes are not recreated. One possible workaround is to use the second alternate console to destroy the volume group or reformat the LVM partition as something different to clear the previous information.

=== Error creating Fetch object for ... ===

Should you get this message when running `aii-shellfe`, it indicates that `/etc/ccm.conf` is missing. This may happen if your Quattor server is not managed with Quattor. To fix the problem, create one with the following contents:
{{{
debug 0
force 0
cache_root /var/lib/ccm
get_timeout 30
lock_retries 3
lock_wait 30
retrieve_retries 3
retrieve_wait 30
world_readable 1
}}}

=== Tainted mode warnings (Insecure dependency...) ===

These messages indicate that you have clearly insecure hooks running, and that a fix is needed. AII runs in "tainted" mode, meaning that all input must be sanitized. In user hooks you'll usually find warnings when attempting to open a file whose name is given in the profile or when running a command, for instance:
{{{
my $filename = $config->getElement (SOME_PATH)->getValue ();
open (FH, ">$filename");
}}}
will issue a warning, meaning that `$filename` must be sanitized. This is a sanitized version:
{{{
my $filename = $config->getElement (SOME_PATH)->getValue ();
if ($filename =~ m{^(/.+)$}) {
    $filename = $1;
} else {
    throw_error ("Expected an absolute path on $filename");
    return ();
}
open (FH, ">$filename");
}}}
Note that the above example assumes you expect an absolute path. If you expected something different (e.g. a path under /osinstall/ks), fix your regular expression accordingly. The same applies when you run commands:
{{{
my $param = $config->getElement (SOME_OTHER_PATH)->getValue ();
# $param is tainted!!!
system ("ls", "$param");
}}}
will fail, so you'll have to specify exactly what you are expecting:
{{{
my $param = $config->getElement (SOME_OTHER_PATH)->getValue ();
# I expected just a bunch of flags!!
if ($param =~ m{^(-[-=\w]+)$}) {
    $param = $1;
} else {
    throw_error ("Unexpected flags passed to the command");
    return ();
}
system ("ls", $param);
}}}
When you get a warning, it points to the line where the insecure data is used, but please fix it at the place where the insecure data is received. This will greatly reduce your code and effort. You'll find more information on tainted mode in the {{{perlsec}}} man page.

=== Tweaking KS parameters for a specific OS version === #OSSpecificKS

From time to time, the parameters required by Kickstart may change, with either the introduction of new required options or the deprecation of others. This was for example the case in SL/RHEL6. This can generally be handled by creating a version-specific template for this OS version.
 * If the tweaking applies to all minor versions of this major version, this is done by:
   1. Going into the directory containing the AII KS plugin templates. In QWG templates, this is `cfg/standard/quattor/aii/ks`.
   1. Creating a directory `variants` under this directory if it doesn't exist yet.
   1. Putting the necessary configuration in this directory, in a template whose name is the distribution/major OS version, e.g. `sl6.tpl`.
 * If the specific configuration applies only to one specific OS version, the necessary configuration can be put in template `config/quattor/ks.tpl` in the OS-specific templates.

When there is a configuration specific to one OS version, the configuration specific to the related major OS version is included first, if it exists.

The OS-specific templates related to KS configuration included in a given profile can be displayed by [/wiki/Doc/SCDB/Usage#pancdebug enabling Pan debug] information in template `quattor/aii/ks`.