Disallow changes in Subversion tags

Because Subversion does not have explicit tags, and everything in the repository is just another folder or file (and thus editable), we sometimes need to protect or enforce our repository layout.

Creating a tag is just making a copy of the trunk (or any branch) so you have a snapshot of how the trunk looked at that given time. If you make changes in it, it is not just a snapshot anymore.

The solution to this problem is a pre-commit hook that uses the --copy-info parameter of the svnlook changed command.

The output of a normal svnlook changed looks like this:

$ svnlook changed -t 670-ja /var/svn/repo
A   tags/0.1.1/

Whereas the command with --copy-info looks like this:

$ svnlook changed -t 670-ja --copy-info /var/svn/repo
A + tags/0.1.1/
    (from trunk/:r6911)

The + at position 3 indicates a copy, so a pre-commit hook can reject any change whose path starts with tags/ but lacks this +.
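As a rough sketch, a pre-commit hook along these lines could do the check. It assumes tags live in a top-level tags/ directory. The function name check_tag_changes and the error message are made up for this example, and creating the tags/ directory itself is deliberately left allowed.

```shell
#!/bin/sh
# Sketch of a pre-commit hook that rejects changes inside tags/ unless they are copies.
# Assumption: tags live under a top-level tags/ directory.

# Reads `svnlook changed --copy-info` output on stdin.
# Returns 0 if the commit may proceed, 1 if it touches a tag without a copy.
check_tag_changes() {
    while IFS= read -r line; do
        case "$line" in
            "    (from "*) continue ;;   # copy-source line, not a changed path
        esac
        copy_flag=$(printf '%s\n' "$line" | cut -c3)   # '+' when the path is a copy
        path=$(printf '%s\n' "$line" | cut -c5-)       # path starts at column 5
        case "$path" in
            tags/?*)
                [ "$copy_flag" = "+" ] || return 1
                ;;
        esac
    done
    return 0
}

# Subversion calls the hook with the repository path and the transaction name.
if [ "$#" -eq 2 ]; then
    if ! svnlook changed -t "$2" --copy-info "$1" | check_tag_changes; then
        echo "Tags are read-only: only copies into tags/ are allowed." >&2
        exit 1
    fi
fi
```

Note that this also blocks deleting a tag, and that it still lets someone copy a file into an existing tag. Tighten the pattern if that matters for your layout.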

Keeping track of your config files

I have known and used Subversion properties (and keyword substitution) for quite a while now, but I never used all of them and mostly stuck with the Id keyword.

This results in a substituted string like this:

$Id: ProductController.php 227 2010-04-28 08:25:32Z jachim $

Because my colleague Arno and I do a lot of server maintenance and configuration, we ended up maintaining a lot of configuration files in a dedicated repository. The big problem here is that you need discipline to check the changes in to Subversion and export them into place (manually or with a hook).

To help us see which files are in the repository and where, we’ve added 2 keywords to every file:

// $HeadURL$
// $Id$

Which get nicely transformed into useful information on the servers’ filesystem:

// $HeadURL: http://server/trunk/dnsserver/var/named/chroot/etc/zones/dmz.zones $
// $Id: dmz.zones 1889 2010-05-31 12:26:20Z jachim $
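Keep in mind that Subversion only expands these keywords on files that have the svn:keywords property set. The snippet below is a small sketch of how the expanded keyword can be used on a server to look up where a file lives in the repository. The helper name head_url is made up for this example.

```shell
# Keywords are only expanded when the property is set on the file, e.g.:
#   svn propset svn:keywords "HeadURL Id" dmz.zones

# Hypothetical helper: print the repository URL recorded in a file's expanded
# $HeadURL$ keyword. Returns non-zero when the keyword is missing or unexpanded.
head_url() {
    sed -n 's/.*\$HeadURL: \(.*\) \$.*/\1/p' "$1" | grep .
}
```

Calling head_url on a deployed config file then tells you whether it is under version control and where to find it.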

Deleting Subversion repository files (for real)

Keeping files and directories in the repository is one of the key principles of Subversion, so once you’ve committed something, it’s there for ever. You can delete files, but they still exist somewhere in the repository, so you can go back in time.

But there is always that time when you’ve (accidentally) committed a password file, a directory full of hi-res images, or some other content you don’t want other people to see, and you want to get rid of it. That’s where the hard part starts…

After searching the internet and checking the Subversion FAQ it looks quite hard, but with some guidance, you’ll find out it’s not.

Finding the problems

First you have to do a (complete) checkout of the repository you want to clean:

svn co http://svn.apache.org/repos/asf/ asf

Now you can start to locate the problems and delete the files/directories (not svn delete!):

rm -Rf subversion/trunk/tools/buildbot;
rm -Rf subversion/trunk/README;
rm -Rf subversion/trunk/build;

When you’re done deleting files and directories, you can generate a list of ‘missing’ files.

Checking your files:

svn status
!      subversion/trunk/tools/buildbot
!      subversion/trunk/README
!      subversion/trunk/build

Generating that list (outside the working copy):

svn status | grep '^!' | sed 's/^! *//' > ../filter.txt

Fixing the problems

Now that you have a nice list of files to delete (make sure it includes the parent directories, right up to the root), you should log in to the server hosting the repository.

We first want to make sure there is a backup:

svnadmin dump /var/svn/asf > ~/backup_svn/asf.dump

Now we can use that backup file as the input file for the svndumpfilter command. In combination with the filter list we’ve generated on the client, we can create a filtered dump version:

svndumpfilter exclude `cat filter.txt` < ~/backup_svn/asf.dump > asf_filtered.dump

To load that file back in the repository, we should ‘delete’ the original repository. (The httpd commands are just to make sure no one commits while processing the changes).

/etc/init.d/httpd stop;
mv /var/svn/asf ~/backup_svn/asf;
svnadmin create --fs-type fsfs /var/svn/asf;
svnadmin load /var/svn/asf < asf_filtered.dump;
/etc/init.d/httpd start;

Please note that directories and command line options can be different, but the outcome should be the same.

Now we have the same repository, without the (accidentally) committed files/directories!

New problems

After the filtering, it is possible that complete revisions are empty. It is possible to skip empty revisions, but then all revisions are renumbered, and that could be problematic for other software (e.g. Trac).

Hostnames in Logwatch reports

Where I work, we have a lot of servers to maintain, and only 2 server admins (me and my colleague). We use Nagios to keep us informed about the server status and Logwatch to analyze the server logs on a daily basis.

Each server hosts a lot of subdomains/vhosts, and these virtual hosts all write to their own logs (blog.jachim.be_acces_log, www.jachim.be_error_log, etc…).

The log entries look like this:

- - [10/Nov/2009:09:55:41 +0100] "GET /a/i/red_cube.png HTTP/1.0" 200 190
- - [10/Nov/2009:09:55:41 +0100] "GET /a/i/search/search_icon.gif HTTP/1.0" 200 428
- - [10/Nov/2009:09:55:41 +0100] "GET /index.php HTTP/1.0" 200 6541

When Logwatch merges all the httpd log files, the host information (in the log filename) is lost, resulting in Logwatch reports like this:

Requests with error response codes
    401 Unauthorized
       /: 4 Time(s)
       /a/i/blue_cube.png: 1 Time(s)
       /favicon.ico: 2 Time(s)
       /wp/login: 2 Time(s)

We actually want reports like this:

Requests with error response codes
    401 Unauthorized
       www.jachim.be/: 4 Time(s)
       jachim.be/a/i/blue_cube.png: 1 Time(s)
       blog.jachim.be/favicon.ico: 2 Time(s)
       blog.jachim.be/wp/login: 2 Time(s)

Now we have all the information we want and are able to fix possible problems much more easily.

Because this is not possible in Logwatch (see mailing list), I’ve added it to the Apache logs.

I’ve added a new log format named logwatch in httpd.conf:

LogFormat "%h %l %u %t \"%m %{Host}i%U%q %H\" %>s %b" logwatch

Now the new format is available and can be used in the Virtual Host:

CustomLog logs/www.jachim.be-access_log logwatch
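To check by hand that the new format carries the host name, something like the following sketch can tally requests per host and path for a given status code. It assumes log lines written in the logwatch format above, with no spaces in the %u field. The function name count_status is made up for this example.

```shell
# Count requests per "host/path" for one HTTP status code.
# Reads access log lines in the logwatch format on stdin.
# With whitespace splitting, field 7 is %{Host}i%U%q and field 9 is %>s.
count_status() {
    awk -v status="$1" '
        $9 == status { count[$7]++ }
        END { for (url in count) printf "%s: %d Time(s)\n", url, count[url] }
    ' | sort
}
```

For example, count_status 401 < logs/www.jachim.be-access_log lists the paths that returned 401, each prefixed with the host name.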


My personal home server – part 2


In the previous post I was terribly wrong about the router type. It turned out to be a Linksys (by Cisco): my brother hooked me up with an ‘old’ WRT54GL he had lying in the basement.

The cool thing about these routers is that they have built-in DynDNS support. (If you don’t know what it is, Wikipedia has a good article about it). What it basically does is point a host name to a dynamic IP address. ISPs don’t like this, but you’re doing nothing wrong with it.

With the router in place, my network is secured and I’ve added some port forwards to my CentOS server.

I’ve added some accounts for some of my friends and opened SSH for them through the internet. Success!

My personal home server – part 1


A month ago I moved to my new house (yay) and I’d promised Joggink I would set up a home server we could use to play with.

Several weeks later, I’ve managed (as in: finally had time, rather than: it was complex) to do a complete install of CentOS on our ‘server’.

Now I’m waiting on my router (some D-Link, I forgot the type) to complete the access to the internet and make sure it’s secure!

Bear with me 😉