Using Yum as an SPMA backend

Introduction

This document describes the internals of the integration of Yum and SPMA. User documentation will be provided on GitHub Pages.

The main priority is to make the transition as smooth as possible for users. In the first stage, all changes are on the client side. This means no changes to any Pan schemas. In the second phase, some schemas may add fields for the user's benefit, but they should remain backwards-compatible.

Whatever changes we make to the SPMA schema or internals must be ported to AII, which is going to be time-consuming.

What ncm-spma does

ncm-spma essentially writes two files: /etc/spma.conf and /var/lib/spma-target.cf. Afterwards, it may need to transfer control to the spma program, which actually deals with the packages.

What spma does

The spma program reads its configuration file and the list of desired packages, potentially manipulates some URLs, and passes control to SPM::RPMPPkgr.

Current problems

First of all, there is the obvious lack of dependency resolution. To avoid any dependency problems in Pan, we install way too much.

Also, there are bugs in some RPM bindings that limit the size of our transactions.

Proposal

We'll fork off ncm-spma completely. Updating the system will require running ncm-ncd --co spma, as is currently done. The new ncm-spma will:

  1. Use /software/repositories to populate /etc/yum.repos.d.
    • In a future iteration, all fields allowed in a Yum repository will be allowed.
  2. For packages with a fixed repository, include them in the includepkgs entry of the relevant repository.
  3. For packages with no fixed repository, include them in the includepkgs entry of /etc/main.conf.
  4. Compute the set difference between rpm -qa and /software/packages.
  5. Execute yum shell to:
    1. Remove the difference (these are outdated packages), unless userpkgs is set.
    2. Install all entries in /software/packages
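
As a rough illustration of steps 1 and 2, a repository entry with its pinned packages might be rendered like this. It is only a sketch in Python, not the component's code: render_repo is a hypothetical helper, the repository name and URL are made up, and the real component would take these values from /software/repositories. Only the option names (name, baseurl, enabled, includepkgs) are standard Yum repository options.

```python
def render_repo(repo_id, base_url, pinned_packages):
    """Return the text of a file for /etc/yum.repos.d (hypothetical helper).

    pinned_packages: names of the packages fixed to this repository.
    """
    lines = [
        "[%s]" % repo_id,
        "name=%s" % repo_id,
        "baseurl=%s" % base_url,
        "enabled=1",
    ]
    if pinned_packages:
        # includepkgs takes a space-separated list of package names or globs.
        lines.append("includepkgs=" + " ".join(sorted(pinned_packages)))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up repository and packages, for illustration only.
    print(render_repo("sl6-base", "http://example.org/sl6/", {"zlib", "bash"}))
```

Sorting the package list keeps the generated file stable between runs, so an unchanged profile does not rewrite the file.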

In future iterations, version and architecture of any given package will be optional.
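
Steps 4 and 5 of the proposal can be sketched as follows. This is an illustration in Python under stated assumptions, not the component's implementation: plan_yum_shell is a hypothetical helper, and it assumes that the output of rpm -qa and the contents of /software/packages have already been normalised into sets of identical package strings. The remove, install and run lines it produces are yum shell commands.

```python
def plan_yum_shell(installed, wanted, userpkgs=False):
    """Build the lines to feed to `yum shell` (hypothetical helper).

    installed: package strings as reported by rpm -qa (assumed normalised).
    wanted:    package strings derived from /software/packages.
    userpkgs:  when True, packages outside the profile are left alone.
    """
    commands = []
    if not userpkgs:
        # Set difference: installed but not in the profile.
        for pkg in sorted(set(installed) - set(wanted)):
            commands.append("remove " + pkg)
    for pkg in sorted(wanted):
        commands.append("install " + pkg)
    # `run` resolves and executes everything queued so far as one transaction.
    commands.append("run")
    return commands


if __name__ == "__main__":
    installed = {"a-1.0-1.x86_64", "b-2.0-1.noarch"}
    wanted = {"a-1.0-1.x86_64", "c-3.0-1.noarch"}
    print("\n".join(plan_yum_shell(installed, wanted)))
```

Queueing removals and installs before a single run keeps the whole change in one Yum transaction, which is the behaviour the proposal describes.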

The spma command will become an alias for ncm-ncd --co spma. The rpmt-py and spma packages will become deprecated.

Users willing to go back to the previous behaviour can just downgrade ncm-spma, and it will just work.

Future work: reverting to a previous state

The only drawback is reverting to a previous state. Imagine that you deployed tag a, which introduced dependency x-1.2. Then a later tag upgrades x to 1.4. For whatever reason, you redeploy a, and all the configuration is replayed. But x is still at 1.4, when we expected 1.2.

To solve this, we'll make use of the Yum history feature. We'll store in a file the deployed tags, and associate them with a transaction ID. If ncm-spma finds that the tag had already been deployed, it will just undo all the subsequent transactions!

The map of tags to transactions can be stored in a JSON file, since it's expected to be small (a few hundred transactions at most). Something like this:

{
    "2012/04/13-00:01:02" : 74,
    "2012/05/06-00:01:02" : 75
}
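
Looking up a tag in such a map and deciding which transactions to undo could look like the sketch below. It is written in Python for illustration only. transactions_to_undo is a hypothetical helper, and it assumes that the ID of the most recent Yum transaction (latest_id) is obtained elsewhere, for example from the Yum history. How each transaction is then undone is left out.

```python
import json


def transactions_to_undo(tag_map_json, tag, latest_id):
    """Return the transaction IDs made after `tag` was deployed, newest first.

    tag_map_json: JSON text mapping deployed tags to Yum transaction IDs.
    latest_id:    ID of the most recent Yum transaction (assumed known).
    """
    tag_map = json.loads(tag_map_json)
    if tag not in tag_map:
        # The tag was never deployed here: nothing to revert.
        return []
    # Everything after the tag's own transaction, undone in reverse order.
    return list(range(latest_id, tag_map[tag], -1))


if __name__ == "__main__":
    tag_map_json = """
    {
        "2012/04/13-00:01:02" : 74,
        "2012/05/06-00:01:02" : 75
    }
    """
    print(transactions_to_undo(tag_map_json, "2012/04/13-00:01:02", 77))
```

Returning the IDs newest first reflects that later transactions have to be undone before earlier ones.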

Question: Which portion of the profile contains these tags? At UGent, we use /system/quattorid.

Note: This feature requires Yum 3.2.29, which is not shipped (but can be built) on SL5.

Gabor disputes the need for this:

I'm not sure if going back to a previous tag should result in rolling back the package versions.

  • If the old tag specified an explicit version, then SPMA will do the right thing
  • If there was no explicit package version, then (re)installing the host with the old tag now would pick up the new package anyway. If you roll back to the previous version of the package, then you get a state that cannot be reproduced.

There may be cases where you really want to roll back the packages, but I think such rollback logic is really the job of a backup system instead of Quattor. If ncm-spma can be told the path to run yum, then it would be possible to create a wrapper around yum that maintains the tag/history ID mapping, without having to integrate that functionality into ncm-spma itself.

Splitting up transactions

So, there is a problem right now with our single-transaction behaviour.

Imagine this dependency graph, where green nodes are in the profile, yellow ones are dependencies that are not specified directly in the profile and red nodes are nodes that shall be removed:

Both smithers and mrburns are specified directly in the profile. But the distribution decides that smithers depends on mrburns, so an updated smithers adds that dependency. At the same time, we decide mrburns is not a critical part of our infrastructure, so we remove it from the profile. This is what we'd like to see:

When the component kicks in, it looks at leaf packages that are not in the profile. These are the candidates to be removed. Since smithers hasn't been updated, it runs this check against a system that looks like this:

[Figure: dependency graph for the example]

Thus, we'll schedule mrburns for removal. Our transaction will thus be:

remove mrburns
distro-sync

Which means this dependency graph:

We cannot update smithers because mrburns will be removed! Currently this requires manual intervention or intermediate deployments. Neither option is acceptable.

Making the component discover this type of dependency with this set of steps would be error-prone. And removing a single package would take hours, literally.

To fix this I'd like to ensure the third situation doesn't ever happen.

Running yum distro-sync at the beginning of the component

The problem is that the third graph is derived from incomplete information. What we'll do, therefore, is run yum distro-sync early in the component. After this step, the component will discover that smithers cannot be installed without mrburns, and it won't try to remove mrburns.
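
The effect of the stale dependency information can be shown with a small sketch of the leaf check. This is an assumption about how the check might be expressed, written in Python for illustration: removal_candidates is a hypothetical helper and the dependency graph is given as a plain dictionary, whereas the real component would query the package database.

```python
def removal_candidates(requires, profile):
    """Return the installed leaf packages that are not in the profile.

    requires: dict mapping each installed package to the set of installed
              packages it depends on.
    profile:  packages listed explicitly in the profile.
    """
    required_by_something = set()
    for deps in requires.values():
        required_by_something.update(deps)
    # A leaf is a package that no other installed package depends on.
    leaves = set(requires) - required_by_something
    return leaves - set(profile)


if __name__ == "__main__":
    profile = {"smithers"}

    # Before distro-sync: the old smithers does not depend on mrburns yet.
    stale = {"smithers": set(), "mrburns": set()}
    print(removal_candidates(stale, profile))

    # After distro-sync: the updated smithers requires mrburns.
    synced = {"smithers": {"mrburns"}, "mrburns": set()}
    print(removal_candidates(synced, profile))
```

With the stale graph mrburns is a leaf outside the profile and gets scheduled for removal. With the graph obtained after distro-sync it is no longer a leaf, so it is kept.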

We'll start again:

and distro-sync will leave our system:

Only now can we install new packages and remove old ones reliably, in a single transaction.

Drawbacks

  1. distro-sync may introduce or update packages that will be removed immediately. This is a small overhead, but I don't see any harm to the system.
  2. The next transaction may fail, leaving the system in an intermediate state. You'd have updated versions of packages that you actually wanted removed, and an error message in your logs. It's possible to recover from this using only Quattor, and it's an easy-to-predict state.

Last modified on Apr 2, 2013, 6:47:56 PM
