Using Git to Administer SCDB

Note: this page explains why and how to use Git with SCDB. Most of the principles also apply to other DVCSs, such as Mercurial, although the implementation details will differ.

Why Git?

Git is one of the new generation of Version Control Systems (VCS) and belongs to the family of Distributed VCS (DVCS). In contrast to SVN, which follows the centralized VCS paradigm, each user has their own copy of the full repository (a clone) instead of a working copy holding only one revision of the central repository. In fact, the DVCS model does not require a central repository at all: having one is just a project decision. Users exchange their contributions through fetch and pull operations.

The DVCS model is very attractive as a complementary tool to SCDB for the following reasons:

  • Having the full history in your local environment makes it easier to work disconnected while preparing changes. Although a Git repository holds the full history, its size is significantly smaller than that of an SVN repository holding the same information.
  • You can work on changes with the advantages of versioning without publishing them to a central repository until they are ready, or simply throw them away if you change your mind.
  • By design, branching and merging are very easy and efficient in a DVCS: this allows you to work on several changes in parallel, committing incremental changes and merging them into the main branch (the one to be deployed) when they are ready.
  • Some DVCSs, like Git, make it easy to pack a set of revisions into a single one. This keeps the history of the main branches cleaner when doing incremental development.

A feature present in many DVCSs is the ability to mirror a central SVN repository and act as a replacement for the SVN working copy. One command allows bidirectional synchronization between SVN and the DVCS repository. In Git, this is provided by the git svn command. Although the Git repository holds the full history of the SVN repository, its size (including the associated working copy) is generally significantly smaller than an SVN working copy of the same repository holding only one revision (30% smaller has been observed on a large repository with 35K revisions).

SCDB and SVN

SCDB and its deployment model are tightly coupled with SVN. In particular, SCDB relies on SVN to ensure that there is only one deployment branch (generally called trunk) and that the working copy used for the deployment (where ant deploy is run) is up to date with this deployment branch. Using a DVCS as a replacement for SVN would make all these checks more difficult (though not impossible), and SCDB would have to implement several features currently provided by SVN.

On the other hand, nothing prevents using Git as a replacement for the SVN working copy when developing changes. With the git svn command, these changes can be committed back to SVN for deployment.

Configuring a Git Mirror of Configuration Database

Git is available for all platforms. The first step is to install it using the standard procedure for your environment. As with SVN, you can use the git help command to get online help for any Git command.

After installing Git, you need to make a clone of your SCDB configuration database. This is typically done with the following command, whose syntax is very close to that of svn checkout:

git svn clone https://svn.example.org/Quattor/trunk scdb

Note that this command can take a long time to run, depending on the length of the SCDB history. Read the git svn help (git help svn) for a suggestion on using an intermediate Git repository to reduce the cloning time when several clones of the SVN repository are needed. You may want to add --username xxx if a specific username must be used to access the SCDB repository.
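
As an illustration of the optional --username argument, the clone step could be wrapped in a small shell function. This is only a sketch: the function name clone_scdb_mirror is made up for this page, and the URL in the usage line is the example one from the command above.

```shell
# Hypothetical helper (not part of SCDB or Git): clone an SCDB SVN
# repository as a Git mirror, optionally with a specific SVN username.
# Usage: clone_scdb_mirror <svn-url> <target-dir> [<svn-username>]
clone_scdb_mirror() {
    if [ "$#" -lt 2 ]; then
        echo "usage: clone_scdb_mirror <svn-url> <target-dir> [<svn-username>]" >&2
        return 2
    fi
    svn_url="$1"
    target_dir="$2"
    svn_user="${3:-}"

    if [ -n "$svn_user" ]; then
        # Pass --username only when a specific SVN account was requested
        git svn clone --username="$svn_user" "$svn_url" "$target_dir"
    else
        git svn clone "$svn_url" "$target_dir"
    fi
}

# Example:
#   clone_scdb_mirror https://svn.example.org/Quattor/trunk scdb
```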

Note: git svn uses the standard (Perl) SVN API and therefore the standard SVN credential cache when asked for authentication. You may want to ensure that an appropriate credential is cached to avoid multiple password prompts. You also need to ensure that no credential is cached for your SVN server that lacks proper access to the configuration database.

The initial branch fetched by git svn clone (generally called master) will become the main branch in your Git repository and should be the only one used for interacting with the SVN repository. See the section about everyday operations for information about restrictions.

Before using the Git repository cloned from SVN, you need to configure .gitignore files to achieve what the svn:ignore property does in SVN. The .gitignore file in the main directory can be generated from the SVN property with the following command:

git svn propget svn:ignore . > .gitignore

This .gitignore file can be safely committed to SVN (it is ignored by SVN).
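
If svn:ignore is also set on subdirectories, the same propget call can be repeated for each of them. The following is a minimal sketch, assuming it is run from the top of the Git mirror; the function name svn_ignore_to_gitignore is hypothetical and the directories are passed explicitly.

```shell
# Hypothetical helper: write a .gitignore in each given directory from
# that directory's svn:ignore property.
# Usage: svn_ignore_to_gitignore <dir> [<dir> ...]
svn_ignore_to_gitignore() {
    for dir in "$@"; do
        # Errors and unset properties are both treated as "nothing to ignore"
        patterns=$(git svn propget svn:ignore "$dir" 2>/dev/null) || patterns=""
        if [ -n "$patterns" ]; then
            printf '%s\n' "$patterns" > "$dir/.gitignore"
            echo "wrote $dir/.gitignore"
        fi
    done
}

# Example:
#   svn_ignore_to_gitignore . cfg src
```

Keep in mind that svn:ignore applies only to the directory it is set on, whereas a .gitignore pattern without a slash also matches in subdirectories, so the generated files may ignore slightly more than SVN did. git svn also provides the show-ignore and create-ignore subcommands, which walk the whole tree; see git help svn.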

The management workflow when using Git becomes:

  • Make your changes and commit them to Git (possibly using a branch). You may want to use git rebase -i to merge several related commits into one.
  • If you used a branch to develop your changes, merge them into master using git merge.
  • Update your master branch with the latest revisions from the SVN repository using git svn rebase. If not done explicitly, this will be done during the commit to SVN, but since it may lead to conflicts that have to be fixed manually, it is generally better to do it as a separate step. In case of conflict, the rebase can be continued (git rebase --continue) or aborted (git rebase --abort).
  • Commit your changes back to SVN using git svn dcommit [--username xxx]. Note that no commit message is passed to this command: git svn dcommit creates one SVN commit for each Git commit, with the same commit message.
  • Update your SVN working copy (you still need one somewhere...) and run ant deploy.
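
The Git side of this workflow (merge, rebase on SVN, dcommit) can be sketched as a single shell function. This is an illustration rather than a supported tool: the name publish_to_svn is hypothetical, the topic branch is assumed to be already committed (and squashed if desired), and the final svn update and ant deploy in the SVN working copy are left out.

```shell
# Hypothetical helper: merge a topic branch into master and publish the
# result to SVN. Run it from inside the Git mirror.
# Usage: publish_to_svn <topic-branch>
publish_to_svn() {
    topic="${1:-}"
    if [ -z "$topic" ]; then
        echo "usage: publish_to_svn <topic-branch>" >&2
        return 2
    fi
    # Each step runs only if the previous one succeeded, so a merge or
    # rebase conflict stops the chain before anything is sent to SVN.
    git checkout master &&
        git merge "$topic" &&
        git svn rebase &&
        git svn dcommit
}

# Example:
#   publish_to_svn fix-dns
```

If git svn rebase stops on a conflict, resolve it, run git rebase --continue, and then run git svn dcommit by hand.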

For the git svn subcommands matching svn commands, refer to the next section.

svn:externals with Git

If you use svn:externals in your configuration database to refer to some common parts or to some of the deployment tools, additional steps are required. Git, like other DVCSs, has no support for external references, but its submodule feature makes it possible to emulate the SVN feature to some extent.

The solution is a bit tricky but is well described at http://kerneltrap.org/mailarchive/git/2007/5/1/245002. The idea is to fetch each external as a separate branch in the Git repository and to create one submodule per external by cloning the repository itself. A few files in each clone are then replaced with links so that they refer to the same files as in the main repository, and the branch checked out in each submodule is changed to the appropriate one. Typical steps involve:

  • Add the SVN remote branches to be tracked for each tool. This is typically done by adding the following sections to .git/config (change the versions to whatever is appropriate):
    [svn-remote "panc-8.2.11"]
            url = https://svn.lal.in2p3.fr/LCG/QWG
            fetch = /External/panc-8.2.11:refs/remotes/svn-panc-8.2.11
    [svn-remote "ant-1.7.1"]
            url = https://svn.lal.in2p3.fr/LCG/QWG
            fetch = /External/apache-ant-1.7.1:refs/remotes/svn-ant-1.7.1
    [svn-remote "saxonb-9.1.0.2J"]
            url = https://svn.lal.in2p3.fr/LCG/QWG
            fetch = /External/saxonb-9.1.0.2J:refs/remotes/svn-saxonb-9.1.
    [svn-remote "scdb-ant-utils-8.0.1"]
            url = https://svn.lal.in2p3.fr/LCG/QWG
            fetch = /External/scdb-ant-utils-8.0.1:refs/remotes/svn-scdb-ant-utils-8.0.1
    [svn-remote "svnkit-1.3.2"]
            url = https://svn.lal.in2p3.fr/LCG/QWG
            fetch = /External/svnkit-1.3.2:refs/remotes/svn-svnkit-1.3.2
    
  • Fetch each remote branch by running the following command once per svn-remote entry. With the configuration above, the commands are:
    git svn fetch ant-1.7.1
    git svn fetch panc-8.2.11
    git svn fetch saxonb-9.1.0.2J
    git svn fetch scdb-ant-utils-8.0.1
    git svn fetch svnkit-1.3.2
    
  • Create a branch for tracking each of the remote branches:
    git branch ant-1.7.1 svn-ant-1.7.1
    git branch panc-8.2.11 svn-panc-8.2.11
    git branch saxonb-9.1.0.2J svn-saxonb-9.1.0.2J
    git branch scdb-ant-utils-8.0.1 svn-scdb-ant-utils-8.0.1
    git branch svnkit-1.3.2 svn-svnkit-1.3.2
    
  • Create the external/ directory with one subdirectory per external, and clone the repository into each of them (without checking it out). Adapt the following commands to your actual configuration:
    mkdir external
    for rep in ant panc saxon svnkit scdb-ant-utils; do mkdir external/$rep; git clone -n -s . external/$rep; done
    
  • Go into each directory and run the following, checking out the appropriate branch created previously:
    cd .git
    for filedir in refs logs info description config; do rm -Rf $filedir; ln -s ../../../.git/$filedir; done
    cd ..
    #Use the appropriate branch name for each subdirectory
    git checkout ant-1.7.1
    
  • Add external to .gitignore in the main directory.

Note: on Windows, the ln -s command results in a file copy, as symlinks don't exist on this system. This means that further updates of these repository clones may require redoing the rm + ln -s steps.
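
The per-external steps above (fetch, branch, relink, checkout) lend themselves to a small script. The following is only a sketch: the function names track_external and link_external are made up for this page, and it assumes that the svn-remote sections and the external/<dir> clones described above already exist and that it is run from the top of the Git mirror.

```shell
# Fetch one svn-remote and create a local branch tracking it.
# Usage: track_external <svn-remote-name>    e.g. track_external ant-1.7.1
track_external() {
    git svn fetch "$1" && git branch "$1" "svn-$1"
}

# Replace the clone's own refs/logs/info/description/config with links
# to the main repository, then check out the branch for this external.
# Usage: link_external <dir-under-external> <branch>
link_external() {
    (
        # The subshell keeps the caller's working directory unchanged
        cd "external/$1/.git" || exit 1
        for filedir in refs logs info description config; do
            rm -Rf "$filedir"
            ln -s "../../../.git/$filedir" "$filedir" || exit 1
        done
        cd .. && git checkout "$2"
    )
}

# Example, for one of the externals configured above:
#   track_external ant-1.7.1
#   link_external ant ant-1.7.1
```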

Using Git for Everyday Operations

After setting up the initial Git repository as a mirror of your SVN configuration database, you can use all Git commands and features, in particular branch and merge.

As mentioned above, the main restriction is that all synchronization with SVN has to be done from the master branch. This is not a strict requirement and there is no way to enforce it. Its purpose is to avoid potential issues with Git rebase operations (history rewriting), which have no equivalent in SVN. For the same reason, you must avoid doing a rebase in the master branch, in particular any rebase that may affect revisions already committed to SVN.
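
Although git svn itself does not enforce this, a local wrapper can at least check the current branch before committing to SVN. The following is a minimal sketch of such a guard; the function name dcommit_from_master is hypothetical, and it assumes that the branch synchronized with SVN is called master.

```shell
# Hypothetical wrapper: refuse to run git svn dcommit unless the
# currently checked-out branch is master.
# Usage: dcommit_from_master [<git svn dcommit options>]
dcommit_from_master() {
    # symbolic-ref fails on a detached HEAD; treat that as "not master"
    current=$(git symbolic-ref HEAD 2>/dev/null) || current=""
    if [ "$current" != "refs/heads/master" ]; then
        echo "refusing to dcommit from '${current:-detached HEAD}': check out master first" >&2
        return 1
    fi
    git svn dcommit "$@"
}

# Example:
#   dcommit_from_master --username xxx
```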

Another restriction is that you should not clone the Git mirror, except for very special purposes: git svn will not work properly on such a clone. Instead create another mirror of the SVN repository using git svn clone.

If you are new to Git, you may want to read the GIT - SVN Crash Course that introduces Git concepts and commands for people familiar with SVN. You need to be aware of the following important differences in the way Git tracks changes compared to SVN:

  • The output of commands like git status is formatted very differently from that of svn status. On the other hand, Git suggests the most sensible commands to use in the current state of the Git working copy.
  • Directories are not tracked per se: they are tracked (including renames) only as part of the tracking of the files they contain. This means that empty directories existing in the SVN repository will not be created in the Git mirror. This should generally not be a problem. Note that this behavior is common to many DVCSs.
  • There is no git cp command, and git mv is basically the same as the standard mv. This comes from the way Git tracks history (based on content changes rather than on files): in fact, Git achieves the same result as svn cp/rm/mv using the standard shell commands.
  • In Git, changes to tracked files are not automatically added to the next commit. For them to be part of the next commit, you need to use git add (with the -A option to add everything) or git commit -a. This is a little disturbing at the beginning, but once you are used to it, it is in fact very handy. git status is very explicit about what will or will not be part of the next commit.
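
The staging behavior described in the last item can be observed in a throwaway repository. The following sketch assumes Git is installed; the function name demo_staging, the file name profile.pan and the demo identity are made up for the example.

```shell
# Hypothetical demo: show that an edit to a tracked file is not part of
# the next commit until it has been staged with git add.
demo_staging() {
    repo=$(mktemp -d) || return 1
    rc=0
    (
        cd "$repo" || exit 1
        git init -q . || exit 1
        echo "first" > profile.pan
        git add profile.pan
        git -c user.name=demo -c user.email=demo@example.org \
            commit -q -m "add profile" || exit 1

        echo "second" >> profile.pan
        # Modified in the working copy but not staged
        echo "edited: [$(git status --porcelain)]"
        git add profile.pan
        # Now staged, so it will be included in the next commit
        echo "staged: [$(git status --porcelain)]"
    ) || rc=$?
    rm -rf "$repo"
    return "$rc"
}

# Example:
#   demo_staging
```

In the short format printed by git status --porcelain, the status letter is in the second column for an unstaged change and in the first column for a staged one.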

Command Line

The following table gives the git svn subcommand matching each of the usual svn commands for interaction with the SVN repository (interaction with the Git repository is done with standard git commands). If you are looking for tutorials on how to use Git with SVN, there are plenty of them on the Internet. Here is a good starting point, giving references to more detailed pages.

SVN            Git
svn checkout   git svn clone
svn update     git svn rebase
svn commit     git svn dcommit

Note: as already mentioned, git svn uses the standard (Perl) SVN API and therefore the standard SVN credential cache when asked for authentication. You may want to ensure that an appropriate credential is cached to avoid multiple password prompts. You also need to ensure that no credential is cached for your SVN server that lacks proper access to the configuration database.

Eclipse

Git integration with Eclipse is still very preliminary at the time of this writing (January 2010). The Git plugin for Eclipse, EGit, is based on a pure-Java implementation of Git called JGit. Check the EGit site for information about how to install it.

The current version of the plugin, v0.6, only implements basic Git operations, such as commit, branch, fetch, pull. It doesn't yet support advanced features like merge, and it doesn't implement the git svn command set. This means that if you want to use Eclipse for your edits with Git as the backing repository, you also need to install the standard Git distribution: you will use standard (command line) Git commands for operations not supported by EGit. There is no problem in managing the same Git repository with both EGit and Git.

Note: Git for Windows provides a basic shell to use Git commands with some usual shell commands.

In v0.6, EGit has a problem with symlinks tracked in the Git repository. It tends to change their type to a plain file at each commit unless you take care to remove this change from the commit. To avoid potential problems, the workaround is to add a .gitignore file in directories containing such symlinks. In the standard SCDB distribution, there is one such symlink: src/utils. Thus you are advised to add (and commit back to SVN) a file src/.gitignore containing the following lines:

# Eclipse plugin for Git (v0.6) tends to corrupt symlinks...
utils

Last modified on Feb 24, 2010, 7:09:46 PM