Disallow changes in Subversion tags

Because Subversion does not have explicit tags, and everything in the repository is just another folder or file (and thus editable), we sometimes have the need to secure or force our repository layout.

Creating a tag is just making a copy of the trunk (or any branch) so you have a snapshot of how the trunk looked at that given time. If you make changes in it, it is not just a snapshot anymore.

The solution for this problem lies in a pre-commit hook, which uses the --copy-info parameter of theĀ svnlook changed command.

The output of a normal svnlook changed looks like this:

$ svnlook changed -t 670-ja /var/svn/repo
A   tags/0.1.1/

Whereas the the command with --copy-info looks like this:

$ svnlook changed -t 670-ja --copy-info /var/svn/repo
A + tags/0.1.1/
    (from trunk/:r6911)

The + at position 3 indicates a copy, so in combination with the tags/ string, you can block commits to tags without this +.

Keeping track of your config files

I know and use Subversion properties (and the keyword substitution) for quite a while now, but never used all of them and mostly stayed with the Id keyword.

This results in a substituted string like this:

$Id: ProductController.php 227 2010-04-28 08:25:32Z jachim $

Because my colleague Arno and myself do a lot of server maintenance and configuration, we ended up maintaining a lot of configuration files in a dedicated repostitory. The big problem here is the fact that you need discipline to check in the changes in Subversion and exporting them into place (manually or with a hook).

In order to help us pointing out which files are in the repository and where, we’ve added 2 keywords in every file:

// $HeadURL$
// $Id$

Which gets nicely transformed into usefull information on the servers’ filesystem:

// $HeadURL: http://server/trunk/dnsserver/var/named/chroot/etc/zones/dmz.zones $
// $Id: dmz.zones 1889 2010-05-31 12:26:20Z jachim $

Deleting Subversion repository files (for real)

Keeping files and directories in the repository is one of the key principles of Subversion, so once you’ve committed something, it’s there for ever. You can delete files, but they still exist somewhere in the repository, so you can go back in time.

But there is always that time where you’ve (accidentally) committed a password file, a directory full of hi-res images, or some other contents you don’t want other people to see that you want to get rid off. That’s where the hard part starts…

After searching the internet and checking the Subversion FAQ it looks quite hard, but with some guidance, you’ll find out it’s not.

Finding the problems

First you have to do a (complete) checkout of the repository you want to clean:

svn co http://svn.apache.org/repos/asf/ asf

Now you can start to locate the problems and delete the files/directories (not svn delete!):

rm -Rf subversion/trunk/tools/buildbot;
rm -Rf subversion/trunk/README;
rm -Rf subversion/trunk/build;

When you’re done delete files and directories, you can generate a list of ‘missing’ files.

Checking your files:

svn status
!      subversion/trunk/tools/buildbot
!      subversion/trunk/README
!      subversion/trunk/build

Generating that list (outside the working copy):

svn status | sed s/"!      "// > ../filter.txt

Fixing the problems

Now you have a nice list of files to delete (make sure it includes the parent directories, right to the root), you should login on the server hosting the repository.

We first want to make sure there is a backup:

svnadmin dump file:///var/svn/asf > ~/backup_svn/asf.dump

Now we can use that backup file as the input of file for the svndumpfilter command. In combination with the filter list we’ve generated on the client, we can create a filtered dump version:

svndumpfilter exclude `cat filter.txt` < ~/backup_svn/asf.dump > asf_filtered.dump

To load that file back in the repository, we should ‘delete’ the original repository. (The httpd commands are just to make sure no one commits while processing the changes).

/etc/init.d/httpd stop;
mv /var/svn/asf ~/backup_svn/asf;
svnadmin create --fs-type fsfs /var/svn/asf;
svnadmin load /var/svn/asf &lt; asf_filtered.dump;
/etc/init.d/httpd start;

Please note that directories and command line options can be different, but the outcome should be the same.

Now we have the same repository, without the (accidentally) committed files/directories!

New problems

After the filtering, it is possible that complete revisions are empty. It is possible to skip empty revisions, but then all revisions are renumbered, and that could be problematic for other software (e.g. Trac).