CMT feature requests (mainly Atlas-specific patterns)

by V.Garonne
Date: 09.28.2006

Goal of this document

With the LHC about to start up, the aim here is to determine what CMT still lacks for a stable release that is fully operational (e.g. with project management). We therefore concentrate on missing features rather than on optimizing existing ones, and we set priorities under that constraint. All the tickets associated with this feature requests page are available here: http://trac.lal.in2p3.fr/CMT/report/10.

1. Support for tags & macros in the cmt/project.cmt file (DavidQ)

The particular motivation for this came from LCGCMT, which LHCb wishes to install at CERN without an InstallArea, whereas ATLAS uses one. Currently this implies two separate installations, but the ability to do something like:

macro use_strategy "" LHCb "without_installarea" ATLAS "with_installarea"
build_strategy $(use_strategy)

would allow the use of a single installation. There was also a request for support for the "author" keyword.
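As a rough illustration of the behaviour being requested, here is a minimal Python sketch of how a tag-conditional macro could be resolved, assuming the usual CMT form of a default value followed by tag/value pairs. The function `resolve_macro` and its data layout are hypothetical and are not part of CMT.

```python
def resolve_macro(default, tagged_values, active_tags):
    """Return the value attached to the first active tag, else the default.

    tagged_values is an ordered list of (tag, value) pairs, mirroring
    'macro name default tag1 value1 tag2 value2' (assumed layout).
    """
    for tag, value in tagged_values:
        if tag in active_tags:
            return value
    return default


# Hypothetical model of the use_strategy macro from this request.
USE_STRATEGY = [("LHCb", "without_installarea"), ("ATLAS", "with_installarea")]

lhcb_strategy = resolve_macro("", USE_STRATEGY, {"LHCb"})
atlas_strategy = resolve_macro("", USE_STRATEGY, {"ATLAS"})
```

With such a lookup available while reading cmt/project.cmt, a single LCGCMT installation could serve both experiments, each one activating its own tag.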

CMT team's answer

  • See ticket #28 for extending project syntax
  • See ticket #27 for author keyword support within project

2. Support for native_version use statements (DavidQ)

This goes together with 3. The suggestion is for support for

use Foo * -native_version=1.2.3

This would require that Foo declared the Foo_native_version macro to be set to 1.2.3 (in this example).
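The check implied here can be sketched in a few lines of Python. This is only an illustration: `native_version_matches` and the `macros` dictionary are hypothetical stand-ins for whatever CMT would do internally.

```python
def native_version_matches(package, requested, macros):
    """Compare the requested native version with <package>_native_version.

    macros is a plain dict of macro name -> value (assumed representation).
    """
    macro_name = package + "_native_version"
    declared = macros.get(macro_name)
    if declared is None:
        # The glue package did not declare the macro at all.
        raise LookupError("macro %s is not defined" % macro_name)
    return declared == requested


# 'use Foo * -native_version=1.2.3' would be satisfied by this glue package.
ok = native_version_matches("Foo", "1.2.3", {"Foo_native_version": "1.2.3"})
```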

CMT team's answer

This implies overriding the macro Foo_native_version within the glue package. It would probably be useful for quickly testing an external package. See ticket #35.

3. Wildcard or prioritized project dependency support (DavidQ)

At the moment CMT does not support wildcarded project dependencies. This has two significant consequences:

  1. It is not easily possible to test a new version of a project within the context of an existing project hierarchy.

  2. In some cases a project changes faster than the version of e.g. a glue package to an external package. If a package within a higher level project only depends upon the version of the external package, the project needs to be versioned, even though no code changes have taken place.

Take the example of the tdaq-common project. To first order, it only depends upon Boost, for which there is a glue package within the LCGCMT project. The Boost version is stable over periods of many months, but the LCGCMT project version changes much more frequently because other packages (e.g. ROOT) change more frequently. Thus currently tdaq-common would need to be changed and rebuilt at a rate that's at least as high as that for LCGCMT, even though nothing has changed. My proposal was that if a diamond project dependency existed (in our case AtlasConditions depends upon tdaq-common, which depends upon LCGCMT, and AtlasConditions also depends upon AtlasCore, which depends upon Gaudi and then LCGCMT), then as long as one leg of the diamond fully specified the project versions, the other leg could "weaken" the dependencies with some sort of wildcarding. Thus as long as exact versions of AtlasCore, Gaudi and LCGCMT were specified on one branch, then tdaq-common could specify a weak dependency against LCGCMT, and use the native_version package use statement described in 2. above to ensure that the correct version of Boost was used.

I believe a similar scheme could be used to insert an updated project version into an existing project tree for test purposes, although I haven't thought it through yet.
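To make the diamond rule concrete, here is a minimal Python sketch, assuming shell-style wildcards for the "weak" leg. The function name and all version strings are hypothetical; only the project names come from the example above.

```python
from fnmatch import fnmatchcase


def resolve_project_versions(requirements):
    """Resolve (project, version_pattern) pairs gathered from every leg.

    Exact versions pin a project. Wildcarded ("weak") requirements are
    accepted only if some other leg pins a version that they match.
    """
    pinned = {}
    weak = []
    for project, pattern in requirements:
        if any(ch in pattern for ch in "*?["):
            weak.append((project, pattern))
        elif pinned.setdefault(project, pattern) != pattern:
            raise ValueError("conflicting exact versions for " + project)
    for project, pattern in weak:
        if project not in pinned:
            raise ValueError(project + ": no leg pins an exact version")
        if not fnmatchcase(pinned[project], pattern):
            raise ValueError(project + ": pinned version does not match " + pattern)
    return pinned


# Hypothetical versions: one leg pins everything, tdaq-common stays weak.
resolved = resolve_project_versions([
    ("AtlasCore", "2.3.0"),
    ("Gaudi", "v18"),
    ("LCGCMT", "LCGCMT_46"),
    ("LCGCMT", "LCGCMT_*"),  # weak dependency from the tdaq-common leg
])
```

In this sketch tdaq-common never needs a new version when LCGCMT moves, as long as the other leg keeps pinning an LCGCMT version that matches the wildcard.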

CMT team's answer

  • See ticket #29

4. Support for <project> placeholder for patterns (DavidQ)

CMT provides several placeholders for use within patterns, most specifically <package>, which is replaced by the package name at build time. I'd like to see <project> added in the same way. One motivation for this is shown by the existing installed_library pattern. This essentially has:

    macro_append <package>_linkopts " -L$(bin) -l<package> <extras> "

which is anyway incorrect (so needs to be fixed). The correct fix would be something like:

    macro_append <package>_linkopts "" \
                 <project>_without_installarea " -L$(<PACKAGE>ROOT)/$(<package>_tag)" ; \
    macro_append <package>_linkopts            " -l<package> <extras> "

Which would use the library from within the package if no InstallArea is present, and otherwise use the symlinked version from the InstallArea.
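The substitution itself is simple text replacement. A minimal Python sketch follows, in which `<project>` is the proposed placeholder (not something CMT supports today) and `expand_pattern` is a hypothetical helper, not CMT code.

```python
def expand_pattern(template, package, project, extras=""):
    """Expand pattern placeholders by plain, case-sensitive replacement.

    <package>, <PACKAGE> and <extras> appear in the patterns quoted above;
    <project> is the placeholder requested in this item.
    """
    replacements = {
        "<package>": package,
        "<PACKAGE>": package.upper(),
        "<project>": project,
        "<extras>": extras,
    }
    for placeholder, value in replacements.items():
        template = template.replace(placeholder, value)
    return template


line = expand_pattern(
    '<project>_without_installarea " -L$(<PACKAGE>ROOT)/$(<package>_tag)"',
    package="Foo",
    project="AtlasCore",
)
```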

CMT team's answer

  • See ticket #30

5. Simplification of the -I<paths> (DavidQ)

Currently we have a -I<path> entry for every used package, even though all the package include directories are symlinked from the InstallArea. This results in very long compilation commands which are bulky in logfiles and essentially opaque for humans to read. The motivation for the per-package entries was the use case where someone removes a header file from a package and wants to test whether another checked-out package has been updated so that it no longer includes the now missing file. Without the long -I<path> list the header file would still be found within the underlying base release. At the time we made this decision Christian proposed that the obsolete header file should not be removed, but marked (via a pragma statement) to ensure that it would cause clients to fail to compile. I now believe that he was right and that we should adopt this strategy. I note that once the updated package has been incorporated into the base release the now obsolete header file can be deleted if so desired. I played with this a bit some time ago to try to get it to work, but never completed it, so it now needs to be resurrected. Note that this has to be done per release rather than intrinsically by CMT, since old releases will still have the existing symlink structure.

CMT team's answer

This is clearly an Atlas-specific item, which should be resolved within Atlas_policy and the Atlas fragments/patterns, and be well documented for future use.

6. Revisit g77 include path (DavidQ)

There's a horrible hack I put in because g77 has a character limit, not only on the overall command line, but also on the -I<path> options. This hack involves some symlinks in the project to other projects and causes problems with search paths and is a real mess. We should revisit this, perhaps in conjunction with 7. below.

CMT team's answer

No comment.

7. Investigate whether we can use gfortran instead of g77 (DavidQ)

Currently we use gfortran for the Saclay F90 code, and g77 for everything else. I think we should look at using gfortran for everything, since that would solve 6. above, and provide additional uniformity. We would need to ensure that gfortran was shipped with the kit, but that's done already.

CMT team's answer

No comment.

8. cmt co tag (DavidR)

Another thing, not vital but nice to have, which was suggested several times in the past, is to be able to specify a tag with cmt co, like: cmt co Reconstruction/RecExample/RecExCommon-01-02-03
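A minimal Python sketch of how such an argument could be split into a tag and a package path, and mapped onto the existing `cmt co -r <tag> <path>` form used in item 21. The helper names and the assumed tag shape (package name followed by dash-separated numbers) are hypothetical.

```python
import re

# Assumed tag shape: <Package>-NN-NN-NN (package name, then numeric fields).
TAG_RE = re.compile(r"^(?P<package>[A-Za-z_]\w*)-(?P<version>\d+(?:-\d+)*)$")


def parse_checkout_spec(spec):
    """Split 'Offset/Path/Package-01-02-03' into (offset, package, tag)."""
    offset, _, last = spec.rpartition("/")
    match = TAG_RE.match(last)
    if match is None:
        raise ValueError("no tag found in " + repr(spec))
    return offset, match.group("package"), last


def checkout_command(spec):
    """Build the equivalent 'cmt co -r <tag> <path>' argument list."""
    offset, package, tag = parse_checkout_spec(spec)
    path = offset + "/" + package if offset else package
    return ["cmt", "co", "-r", tag, path]


command = checkout_command("Reconstruction/RecExample/RecExCommon-01-02-03")
```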

CMT team's answer

Yes, this is clearly not vital, and it is more straightforward to do with Svn as backend than with CVS.

9. Make CMT a library rather than a static executable (SébastienB)

This surely is a long-term item but I think it would ease the work of people writing scripts (see 10.) or building plugins for CMT.

CMT team's answer

This supposes wrapping cmt with other libraries such as Boost/swig/etc. to allow the data structures to be interrogated directly in python. In the short term, the solution in 10. is more obvious.

10. Have python bindings for CMT (SébastienB)

Athena (and Gaudi, for that matter) would greatly benefit from having official python bindings for CMT. Just have a look at how many python wrappers have been written for CMT in the Atlas CVS repository.

CMT team's answer

Actually, an unofficial python wrapper exists; we will distribute it (follow ticket #31).

11. Introduce the concept of a Release (or some equivalent 'meta-data') (SébastienB)

The central concept in CMT, according to the CMT documentation, is a Package. (Looking at the source code, I'd argue it isn't completely true and would be more inclined to say the central concept in CMT is the 'Use statement', anyway...) I'd like the concept of a Release to be introduced. This would be used when one installs a (set of) project(s) with a frozen set of packages with their versions and dependencies (clients+uses). Interrogating this Release object for package dependencies would be much faster than letting CMT rebuild and recompute the whole set of dependencies each time one does 'make'. I believe this would also speed up the 'cmt show clients CustomerIsTheKing' command.

CMT team's answer

We are thinking of having a snapshot functionality to get an 'image' of a configuration. Ideally this snapshot would be in xml and could be translated (via XSLT) into other formats (Xcode, Visualx.x, and so on): see ticket #33.

Another point is to convert requirements files into xml with a tool.

12. Separate the 'build' environment from the 'user' environment (SébastienB)

Having CMT run in its own environment without touching the user environment would allow a faster turn-around between nightlies/releases (developer-centric) but also ease bug-reporting (user-centric): with the CMT environment described by a single file, there would be much less confusion (while developers are trying to reproduce users' bugs) and mis-configurations would be easier to spot.

CMT team's answer

See 11?

13. Introduce a mechanism to describe runtime dependencies between packages (SébastienB)

Title says it all.

CMT team's answer

Ideally several components could be split:

  • Building environment
    • Variables
    • Building dependencies
    • etc.
  • Runtime environment
    • Variables
    • Runtime dependencies
    • Etc.
  • Identify the common parts

Probably for the version 2 ;)

14. Use checksums rather than timestamps to decide if a file has been modified (SébastienB)

Hopefully, this will reduce re-compilation time.

Additional remarks (AndreiG)

We agreed that computing checksums can't be faster than testing time stamps; the use case Sébastien had in mind is when you inadvertently touch a file and end up with the same content but a different time stamp.

Additional remarks (WimL)

The point of checksums isn't to save time, it's to increase accuracy. If I revert my local sources to an earlier version from CVS, my object files and other build products have a time stamp that is newer, even though what they depend on has changed. Checksums will save you from having to remove all build directories.

Additional remarks (AndreiG)

Accuracy is important, but the compilation time is also important. There is a trade-off between the risk of forgetting to remove build dirs after going back to an earlier version of a package and increasing compilation time for each of the builds. I'd say that going "back in time" happens rarely, and therefore I am willing to accept the inconvenience of removing generated files by hand in such cases instead of increasing compilation time under routine circumstances (which is way too slow as it is).

Additional remarks (WimL)

> I'd say that going "back in time" happens rarely

YMMV. I do it a couple of times per day. Depends on work style, I guess.

> of increasing compilation time under routine circumstances

There's a leverage effect. If I have to recompile my complete work directory b/c of a backtrack in one package, the total increase is enormous, and may surpass the cost of checksums in the routine circumstance. If I lose a day b/c of a subtlety, the scale tips even further in the direction of checksums.

CMT team's answer

Compilation is clearly a place for improvement, but we are rewriting our own patched make version. Have we reached the limits of Make, e.g. because it only deals with timestamps? See ticket #34.
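The two strategies debated in this item can be contrasted with a minimal Python sketch. This is not how CMT or make work internally: the helper names and the idea of keeping one digest "stamp" file per source are assumptions made for illustration.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_build(source, stamp_file):
    """Remember the content of 'source' as it was when it was last built."""
    with open(stamp_file, "w") as handle:
        handle.write(file_digest(source))


def stale_by_checksum(source, stamp_file):
    """True if the source content differs from the recorded digest."""
    try:
        with open(stamp_file) as handle:
            recorded = handle.read().strip()
    except FileNotFoundError:
        return True  # never built
    return recorded != file_digest(source)


def stale_by_timestamp(source, product):
    """What make does: True if the source is newer than the product."""
    return os.path.getmtime(source) > os.path.getmtime(product)
```

An inadvertent touch makes `stale_by_timestamp` report a rebuild while `stale_by_checksum` does not; reverting a source to an older CVS version gives the opposite result. The price is that every source has to be read and hashed on each build.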

15. Profile and optimise CMT (SébastienB)

It would be great to reduce the time to compile packages in Atlas. Surely a chunk of this time comes from the packages themselves (templates, complicated package dependencies, unneeded dependencies and stuff like that). But a simple test on UserAnalysis (which falls exactly in the above caveats):

make -s  63.46s user 23.47s system 70% cpu 2:03.38  total (first build)
make -s  10.83s user  7.67s system 96% cpu   19.083 total (no-op build)
make -s  11.31s user  7.37s system 97% cpu   19.083 total (no-op build)

gives ~10 seconds of user time (~19 seconds elapsed) for a no-op build... And for RecExCommon, which is a pure python package:

make -s  31.83s user 24.17s system 97% cpu 57.578 total (first build)
make -s  10.98s user  8.02s system 99% cpu 19.050 total (no-op build)
make -s  10.65s user  8.08s system 96% cpu 19.384 total (no-op build)

=> still ~10 seconds of user time for a no-op build (so I'd naively conclude it comes from CMT/Atlas macros rather than from the content of the package).

CMT team's answer

See also ticket #34.

16. Have a build-dir location which can be configured (SébastienB)

Hence it would be possible to install all these possibly large .o,.so,.dict files (especially in -dbg mode) under some /tmp/${USER}/build directory without wasting AFS space.

CMT team's answer

Normally this already exists by overriding $BIN. To be checked: see ticket #36.

17. A mechanism to integrate 'plug-ins' (SébastienB)

Plug-ins are hype. I believe the Atlas package structure and the dependencies between packages would benefit from the integration of various Atlas-specific CMT plug-ins. Here I am thinking about checkreq. If checkreq was integrated into the development process of a package (and not 'just' at the nightly step) it would improve the overall quality (or correctness) of the content of cmt/requirements files. One could also think of an Atlas-specific default cmt/requirements file which would be used by the 'cmt create Foo Foo-00-00-00 Path/toFoo' command, with a check that it sticks to Atlas conventions. Another useful plug-in, as we are heading towards gcc-3.4.x, would be a precompiled-header plug-in.

CMT team's answer

We divide plugins/add-ons into two categories:

  1. General plugins
  2. User plugins

A general plugin is, for example, the tbroadcast utility, which is already installed at the CMT level. A user plugin should perhaps be defined at the AtlasLogin level, or we have to look at the existing CMT_USERxxxx mechanism.

18. Produce makefiles structured according to the "Recursive Make Considered Harmful" prescription (AndreiG)

It would be very useful if cmt could produce makefiles structured according to the "Recursive Make Considered Harmful" prescription

[1] http://www.tip.net.au/~millerp/rmch/recu-make-cons-harm.html

This should give a huge speed up compared to "cmt br gmake" for someone who needs to recompile a bunch of checked out packages after modifying some of them.

I imagine the following work model:

  • a developer checks out a set of packages he is working with,
  • issues a cmt br command to build a set of makefiles from the "topmost" package in his set (or from a specially created test package depending on all the checked out packages),
  • and from this point he only needs to run "gmake" from that same location to recompile exactly what needs to be recompiled after making modifications to any of his packages. Or "gmake -j3" :)

The role of the cmt here is to resolve package versions and create appropriate makefiles. It only needs to be run once, unless you change the set of checked out packages, or the nightly you are working with. Gmake run on the master makefile knows all the dependencies, and does the minimal job to compile just what is needed, and also to re-compute dependencies that are maintained in a granular way (one .d file per one .cxx file). Having all the knowledge in one instance of gmake makes sure everything is updated properly - even if you use a -jN gmake option to build in parallel. (I've heard that cmt develops a custom parallel build solution - but gmake is already well debugged, why not use that?)

The global gmake approach [1] is used by ROOT, see $ROOTSYS/README/BUILDSYSTEM (and perhaps by many other projects).

I also used it for a set of projects, and can assert that

1) It works beautifully,

2) It IS easier to implement than it seems at first.

I expect a large speedup because of a proper handling of dependencies that are stored and not recomputed unless needed. Also, not asking cmt to re-resolve all the used package versions on each "gmake" should save us a lot.
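The core of the idea, one process that sees the whole dependency graph and rebuilds only what is affected, can be sketched in Python. This is an illustration only: the package names are hypothetical, the `uses` dictionary stands in for what "cmt show uses" would provide, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def rebuild_order(uses, modified):
    """Return the packages to rebuild, dependencies first.

    uses maps each package to the set of packages it uses; modified is
    the set of packages whose sources changed.
    """
    # Invert the graph: package -> packages that use it (its clients).
    clients = {}
    for package, deps in uses.items():
        clients.setdefault(package, set())
        for dep in deps:
            clients.setdefault(dep, set()).add(package)
    # Everything reachable from a modified package through clients is dirty.
    dirty = set()
    stack = list(modified)
    while stack:
        package = stack.pop()
        if package not in dirty:
            dirty.add(package)
            stack.extend(clients.get(package, ()))
    # One global topological order, filtered down to the dirty packages.
    order = TopologicalSorter(uses).static_order()
    return [package for package in order if package in dirty]


# Hypothetical checked-out packages.
uses = {"Analysis": {"Tracking"}, "Tracking": {"Core"}, "Display": {"Core"}}
todo = rebuild_order(uses, {"Tracking"})
```

Here only Tracking and its client Analysis are rebuilt; Core and Display are left alone, which is the saving expected from a single master makefile.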

CMT team's answer

Interesting link. We should take the time to understand it well and make some trials. The logic for describing the graph already exists (cmt show uses), so we can imagine generating a global makefile. See also ticket #34.

19. Avoid creating archive libraries unless explicitly requested (DavidQ)

Currently CMT creates an archive (.a) library as a stepping stone to creating shared (.so) libraries. This wastes a lot of space, which we are now manually saving by checking whether a .so of the same name sits alongside a .a library, and only copying the latter to AFS if that isn't the case.
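The manual check described above amounts to something like the following Python sketch. The function name is hypothetical and the actual Atlas installation scripts may differ.

```python
from pathlib import Path


def archives_to_copy(lib_dir):
    """List the .a libraries that have no .so of the same name beside them."""
    lib_dir = Path(lib_dir)
    return sorted(
        archive
        for archive in lib_dir.glob("*.a")
        if not archive.with_suffix(".so").exists()
    )
```

Only the archives returned by this function would be copied to AFS; any .a that merely served as a stepping stone to a .so is skipped.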

20. Allow project overrides via CMTPROJECTPATH (DavidQ)

It would be useful to be able to override a project version using CMTPROJECTPATH in the same way as a package version can be overridden in CMTPATH. This would facilitate testing of new versions of projects by putting the new version (e.g. AtlasCore/2.3.0) in a location referenced by an earlier CMTPROJECTPATH component.
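A minimal Python sketch of the requested "first location wins" lookup, assuming a `<location>/<project>/<version>` directory layout as in the AtlasCore/2.3.0 example. The function is hypothetical, and picking the lexically greatest version within a location is a crude simplification.

```python
import os


def select_project(name, cmtprojectpath):
    """Return the project directory found in the earliest path component.

    cmtprojectpath is a search path string using the platform path
    separator. Later components are ignored once the project is found.
    """
    for root in cmtprojectpath.split(os.pathsep):
        project_dir = os.path.join(root, name)
        if not os.path.isdir(project_dir):
            continue
        versions = sorted(
            version
            for version in os.listdir(project_dir)
            if os.path.isdir(os.path.join(project_dir, version))
        )
        if versions:
            # Simplification: take the lexically greatest version here.
            return os.path.join(project_dir, versions[-1])
    return None
```

With a test area placed before the release area in the path, a project version installed in the test area would shadow the released one without touching the release.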

CMT team's answer

  • See ticket #32

21. Pipe-able output of cmt show clients (SébastienB)

It would be great to be able to pipe the output of cmt show clients Pkg into a cmt co -r Client-v1 Path/To/Client. Or at least to have a copy-paste friendly output of cmt show clients. Right now, it looks like this:

# SCT_Digitization SCT_Digitization-00-09-13 ${SomePath}/InnerDetector/InDetDigitization (use version SCT_ConditionsAlgs-*)

So it is a pain to check out this particular version to assess whether upstream changes impacted clients.
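Until the output is pipe-able, a small filter can do the conversion. Here is a minimal Python sketch that relies only on the line format shown above; the function name and the `area_prefix` argument (the release area to strip from the location) are hypothetical.

```python
def client_to_checkout(line, area_prefix):
    """Turn one 'cmt show clients' line into a 'cmt co -r' command line.

    Expected fields after the leading '#': package, tag, location.
    Returns None for lines that do not have at least these three fields.
    """
    fields = line.lstrip("# ").split()
    if len(fields) < 3:
        return None
    package, tag, location = fields[:3]
    if location.startswith(area_prefix):
        location = location[len(area_prefix):]
    offset = location.strip("/")
    path = offset + "/" + package if offset else package
    return "cmt co -r %s %s" % (tag, path)


line = ("# SCT_Digitization SCT_Digitization-00-09-13 "
        "${SomePath}/InnerDetector/InDetDigitization "
        "(use version SCT_ConditionsAlgs-*)")
command = client_to_checkout(line, "${SomePath}")
```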

CMT team's answer

See 11 and ticket #33. We could convert XML into anything.

Last modified on Oct 11, 2006, 8:25:31 PM