Using reStructuredText to generate an ERD

Recently I had to create an ERD for a project, and I didn’t want to use a binary (and usually proprietary) format. The reasons are that I don’t want to bind myself to a specific program, that I want to be able to use the data in different places and formats, and most importantly that I want to version control it (and diff it as text).

Besides Visio, Google Draw, Omnigraffle, etc., there are newer players like Skipper, which can be used to easily generate Doctrine models, but these are still a bit too limited for me.

Because I like to be verbose about the decisions I make when designing models and how they fit together, I usually start with a basic text file listing entities and their properties. Those text files evolved into Markdown files, and recently I started using reStructuredText (.rst) files, because it is a very powerful format (and it has a lot of parsers, which will help me achieve the goals I stated at the beginning of this blog post).

A simple file could start like this:

Entity-Relationship Diagram
===========================

This file is used to describe all entities, their properties and how they relate.

User
----

A user entity is used to...

Fields
++++++

:username: The username a user will use to log in to his/her account
:email: ...

Group
-----

A group of users can have specific roles in the application.

Fields
++++++

:name: The group name.
       This should be...
:roles: A list of roles

Using this format, we can add a lot of background information and annotations, while keeping a structured format we can use later.

A lot of viewers (GitHub included) already render reStructuredText quite well, including references within the document. If your document gets large, these links can be very handy:

...

Group
-----

A group of User_ entities can have specific roles in the application.

...

Now that we use references (the User_ format), we can take it one step further and start listing the relations of the entities:

...

User
----

A user entity is used to...

Fields
++++++

:username: The username a user will use to log in to his/her account
:email: ...

Relations
+++++++++

:`Group`_: A user can be part of one group.

...

Because we are using the reference notation again, we can already click the links to navigate between related entities.

Now that we have a base document, we can try to generate diagrams of it.

DocUtils

To easily read the information from the RST files, I recommend using DocUtils to generate an XML version of the source file.

rst2xml.py source.rst > ERD.xml
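
The same conversion can also be done from Python instead of the command line. The sketch below assumes the docutils package is installed and uses its `publish_string` function with the XML writer; the helper name `rst_to_xml` and the sample text are made up for illustration.

```python
# Small sample in the shape of the ERD file described in this post.
SAMPLE_RST = """\
Entity-Relationship Diagram
===========================

User
----

A user entity.

Group
-----

A group of User_ entities.
"""


def rst_to_xml(source: str) -> str:
    """Return the Docutils XML representation of an rST string."""
    # Third-party dependency (pip install docutils), imported lazily so the
    # rest of a script still loads when it is missing.
    from docutils.core import publish_string

    return publish_string(
        source,
        writer_name="xml",
        # "unicode" asks Docutils for a str instead of encoded bytes.
        settings_overrides={"output_encoding": "unicode"},
    )
```

Calling `rst_to_xml(SAMPLE_RST)` should give an XML string with one `section` element per entity, which is the structure the later steps rely on.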

Graphviz DOT files

The Graphviz DOT language is a powerful format for generating graphs from a text (dot) file.

digraph graphname {
    a -> b -> c;
    b -> d;
}

The code above will generate the following graph:

A directed graph (source: wikipedia.org)

Putting it all together

Now we can write some code to parse the XML file and generate a dot file out of it.
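
As a rough idea of what such code could look like, here is a minimal sketch using only Python's standard library. It assumes every top-level `section` in the XML is an entity and that its relations are the references inside the field names of a nested "Relations" section. `SAMPLE_XML` is a hand-written approximation of the Docutils output, and `erd_to_dot` and `dot_id` are illustrative names, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hand-written approximation of the XML that Docutils emits for the
# User/Group document above (real output carries more attributes).
SAMPLE_XML = """\
<document>
  <title>Entity-Relationship Diagram</title>
  <section ids="user" names="user">
    <title>User</title>
    <section ids="relations" names="relations">
      <title>Relations</title>
      <field_list>
        <field>
          <field_name><reference name="Group" refid="group">Group</reference></field_name>
          <field_body><paragraph>A user can be part of one group.</paragraph></field_body>
        </field>
      </field_list>
    </section>
  </section>
  <section ids="group" names="group">
    <title>Group</title>
  </section>
</document>
"""


def dot_id(name: str) -> str:
    """Quote a name so it is a valid DOT identifier."""
    return '"' + name.replace('"', '\\"') + '"'


def erd_to_dot(root: ET.Element, graph_name: str = "erd") -> str:
    """Build a DOT digraph with one node per entity and one edge per relation."""
    lines = [f"digraph {graph_name} {{", "    node [shape=box];"]
    for entity in root.findall("section"):
        name = entity.findtext("title", default="")
        lines.append(f"    {dot_id(name)};")
        for sub in entity.findall("section"):
            # Only the "Relations" subsection contributes edges.
            if sub.findtext("title") != "Relations":
                continue
            for ref in sub.iterfind("field_list/field/field_name/reference"):
                lines.append(f"    {dot_id(name)} -> {dot_id(ref.text)};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(erd_to_dot(ET.fromstring(SAMPLE_XML)))
```

For a real file, the root element could come from `ET.parse("ERD.xml").getroot()` instead of the sample string. For the sample, the result contains the edge `"User" -> "Group"`.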

When we pipe the output to Graphviz (for example its dot command), we have our final result!
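
The piping step can be scripted as well. This sketch assumes the Graphviz `dot` executable is installed and on the PATH; `render_dot` is a hypothetical helper name, and the output format is simply passed through to the `-T` option.

```python
import shutil
import subprocess


def render_dot(dot_source: str, fmt: str = "png") -> bytes:
    """Pipe DOT text through the Graphviz `dot` command and return its output."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz `dot` executable not found on PATH")
    result = subprocess.run(
        ["dot", f"-T{fmt}"],               # e.g. -Tpng or -Tsvg
        input=dot_source.encode("utf-8"),  # DOT text goes in on stdin
        stdout=subprocess.PIPE,            # rendered image comes back on stdout
        check=True,                        # raise if dot reports an error
    )
    return result.stdout
```

The returned bytes can then be written to a file, for example with `open("ERD.png", "wb")`.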

Win!

This image can get quite large, but Graphviz will find a way to position all the entities.

Some notes/thoughts/improvements:

  • Use `Field Lists`_ for fields.
  • Always include “one” or “multiple” in the relation descriptions.
  • Use {timestamp}, {string} notations in the field descriptions to indicate the type.
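
If field descriptions carry a `{type}` marker as suggested in the last note, the type could be pulled out with a small regular expression. This is only a sketch: the helper name `split_type` is made up, and it assumes a single marker made of word characters somewhere in the description.

```python
import re

# Matches markers such as {timestamp} or {string}.
TYPE_PATTERN = re.compile(r"\{(\w+)\}")


def split_type(description: str):
    """Return (type, description without the marker); type is None if absent."""
    match = TYPE_PATTERN.search(description)
    if match is None:
        return None, description.strip()
    # Cut the marker out and collapse the whitespace left behind.
    cleaned = description[:match.start()] + description[match.end():]
    return match.group(1), " ".join(cleaned.split())


if __name__ == "__main__":
    print(split_type("{string} The group name."))
```

The extracted type could then be shown next to the field name in the generated diagram.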