Using Mercurial Subrepositories

Code reuse is important. As developers, we don’t want to keep reinventing the wheel over and over again. We should leverage code that we’ve already written, and use open source libraries and frameworks where appropriate.

Having the source code of a library that your project depends on is very beneficial.  You can browse through it, debug into it, and make changes to it.  But what is the most efficient way to store and track a piece of code that is shared across dozens of projects? Should you simply copy and paste the library’s source code into a lib folder within each project’s repository? Or should you store the library in its own repository and reference it externally?

The copy/paste method makes propagating changes a real nightmare, and the external reference approach has its own drawbacks:

  • How is this extra step communicated?  Is it in a wiki document somewhere?
  • Should you write a script to fetch the dependencies, or rely on some build tool?
  • What version of the library should you clone?
  • Where on your disk does the library need to live?
  • Does the build server know about all this?

That’s a lot of question marks. They’re not intractable issues by any stretch, but you do have to think about them. You don’t want to have to think about plumbing; you want to write code and get things done.

Subrepositories to the rescue

Subrepositories let you treat a collection of repositories as a group. For example, when you clone a repo, Mercurial will recursively clone all of its subrepositories as well, so the developer (or build server) doesn’t need to know about the dependencies — the source control system handles it all.

When you create (or update) a subrepository, Mercurial takes a snapshot of the subrepo’s state and stores it in the parent repository’s .hgsubstate file. This means that multiple projects can point to a single shared subrepository, yet each one can independently decide which revision of the shared repository to rely on.

Further, the subrepository can be anywhere — on your local disk, in Kiln, on a co-worker’s machine, etc. Heck, it can even be a Subversion repository!

Let’s see an example that illustrates exactly how to use them.

A Tale of Two Résumés: A Case Study in Subrepositories

Fog Creek is always looking to hire great people. In fact, I’m proud to announce our two most recent hires — we just poached Darth Vader from Microsoft and Cthulhu from Facebook (sorry, Zuck).

Whenever someone new joins the team, one of the first things we have them do is create a personal résumé site that showcases their skills. I provide them with a standard HTML template and they fill in all the content. Each team member has their own repository in Kiln.

Here’s what the project looks like right now:

Repository Layout

Vader and Cthulhu have very similar sites at this point — both contain a single static HTML page, CSS, and some images. There aren’t any external dependencies.

Have a look inside Cthulhu’s repo:

File structure

The résumés look nice and clean, albeit plain:

Vader's vanilla CV

Creating a subrepository

I’ve been spending my days on Hacker News reading about how all the JavaScript ninjas are writing jQuery plugins to enhance their websites. So, naturally, I wrote a jQuery plugin that everyone can use to add awesome text shadows to their résumés. The plugin is aptly named awesomejs.

I checked my code into Kiln so my team can grab it.

Adding the awesomejs repository

Now, don’t tell Cthulhu, but Vader is my favorite co-worker. His ability to dominate a galaxy is truly unmatched. Vader happens to be out of the office at the moment. While he’s gone, I’m going to make his résumé just a little more awesome.

First I’ll clone Vader’s repo down to my machine.

c:\code>hg clone https://rob.kilnhg.com/Repo/Subrepo-Demo/Websites/vader
destination directory: vader
adding changesets
adding manifests
adding file changes
added 1 changesets with 4 changes to 4 files
updating to branch default
4 files updated, 0 files merged, 0 files removed, 0 files unresolved

Then, from within the vader repo, I’ll clone awesomejs so that it becomes a nested subdirectory.

c:\code>cd vader

c:\code\vader>hg clone https://rob.kilnhg.com/Repo/Subrepo-Demo/Libraries/awesomejs
destination directory: awesomejs
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
updating to branch default
1 files updated, 0 files merged, 0 files removed, 0 files unresolved

Now I have two nested Mercurial repositories, but they are not linked in any way and I can’t operate on them in tandem…yet.

To achieve full-subrepo goodness I have to create the .hgsub file in the vader repository’s root directory. .hgsub is a plain text file with one line per subrepository. On the left-hand side of the equals sign is the local directory name where the subrepo will reside. On the right-hand side is the location the subrepo is cloned from.

c:\code\vader>echo awesomejs = https://rob.kilnhg.com/Repo/Subrepo-Demo/Libraries/awesomejs > .hgsub
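
To make the format concrete, here’s a minimal Python sketch that reads .hgsub text into a mapping from local directory to source. The parse_hgsub function is my own illustrative helper, not part of Mercurial, and it only handles the simple "path = source" lines described above.

```python
def parse_hgsub(text):
    """Return {local_path: source} for each 'path = source' line of .hgsub text.

    Illustrative helper only (not a Mercurial API); it ignores anything
    that is not a plain 'path = source' line.
    """
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first '=' only, so a source URL containing '=' stays whole.
        path, separator, source = line.partition("=")
        if not separator:
            continue  # not a 'path = source' line
        mapping[path.strip()] = source.strip()
    return mapping


# The single line written by the echo command above.
hgsub_text = "awesomejs = https://rob.kilnhg.com/Repo/Subrepo-Demo/Libraries/awesomejs\n"
subrepos = parse_hgsub(hgsub_text)

for local_path, source in subrepos.items():
    print(f"{local_path} is cloned from {source}")
```
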

I save the file, add it, and commit it just like I would any other file. I can tell by the output that Mercurial knows I’m making a subrepo when I commit .hgsub. Further, it automatically creates a .hgsubstate file where it records a snapshot of the subrepo’s state.

c:\code\vader>hg add
adding .hgsub

c:\code\vader>hg commit -m "adding subrepository awesomejs"
committing subrepository awesomejs

With my new subrepo in place, I’m ready to share my changes, so, from within the vader repo, I push my changeset to Kiln.

c:\code\vader>hg push
pushing to https://rob.kilnhg.com/Repo/Subrepo-Demo/Websites/vader
pushing subrepo awesomejs to https://rob.kilnhg.com/Repo/Subrepo-Demo/Libraries/awesomejs
searching for changes
searching for changes
no changes found
searching for changes
searching for changes
remote: kiln: successfully pushed one changeset

Notice how issuing hg push from the parent repository automatically pushed the subrepo as well (though, in our case, there happened not to be any changes in the subrepo). Mercurial will automatically push all subrepositories when the parent repository is being pushed. This ensures new subrepository changes are available when referenced by top-level repositories.

Note: not all Mercurial commands will automatically recurse into subrepos. For instance, hg status does not recurse unless the -S option is specified, and hg pull will not act on subrepositories at all. Type hg help subrepos for more on this.

Now, if I look at the Vader website in Kiln I can see my new changeset. And if I switch to the file browser I can see a little visual cue next to awesomejs to indicate that there’s a subrepo present.

Subrepo indicator icon

If I click on the awesomejs folder, I can see that I don’t have a copy of the awesomejs files within the vader repository; rather, I have a link to the Subrepo Demo -> Libraries -> awesomejs repository. What’s more, Kiln knows exactly which revision of awesomejs to show me based on the data in .hgsubstate.

Clicking on the subrepo

If I were to peer inside the .hgsubstate file in the vader repo, I would see that it simply contains the changeset hash of the revision that it expects awesomejs to be locked on:

.hgsubstate file
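
In text form, each line of .hgsubstate is a full 40-character changeset hash, a space, and the subrepo’s path. Here’s a small Python sketch that reads that format. The parse_hgsubstate helper is hypothetical, and the hash below is a made-up placeholder, not the real awesomejs revision.

```python
def parse_hgsubstate(text):
    """Return {subrepo_path: changeset_hash} from .hgsubstate text.

    Illustrative helper only (not a Mercurial API).
    """
    pinned = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Each line is '<changeset hash> <subrepo path>'; split on the first
        # space only, so a path that itself contains spaces stays intact.
        revision, path = line.split(" ", 1)
        pinned[path] = revision
    return pinned


# Placeholder hash for illustration; a real file holds the actual revision.
PLACEHOLDER_HASH = "0123456789abcdef0123456789abcdef01234567"
state = parse_hgsubstate(f"{PLACEHOLDER_HASH} awesomejs\n")

for path, revision in state.items():
    print(f"{path} is locked at {revision}")
```
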

Applying some awesomeness

Now that I have my subrepo relationship established, I’ll actually make use of the plugin. I will edit index.html and, right above the closing </body> tag, add references to jQuery and awesomejs, and call the plugin.

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.5.1/jquery.js"></script>
<script src="awesomejs/awesome.js"></script>
<script>
    // make all the h1 and dt tags awesome.
    $(document).ready(function () {
        $('h1, dt').makeAwesome();
    });
</script>

Much better:

Awesome Vader!

Cthulhu’s turn

Cthulhu decided that not only is he going to use awesomejs for his own résumé, he’s also going to contribute some changes to the plugin to make it better. He noticed that I hard-coded the color magenta right into awesome.js and thought it might be better if it were passed as a parameter to the plugin instead. This way, each engineer can use their own personalized color scheme.

To make this change, first Cthulhu clones his repo and repeats the same steps I did to create the subrepository (i.e., hg clone awesomejs, create .hgsub, hg add, hg commit). Then, within the subrepository folder, Cthulhu makes his changes to awesomejs. Lastly, Cthulhu updates index.html within the cthulhu repository to reference and call the plugin. He likes cyan.

Cthulhu's CV

Now he’s ready to commit the changes in both repositories. Note, however, that when you call hg status from within the parent repo, you won’t see your changes to any subrepos. If you want to see what’s changed in a subrepo, you’ll need to use hg status -S instead. Here’s what hg status -S looks like at this point:

c:\code\cthulhu>hg status -S
M awesomejs\awesome.js
M index.html

IMPORTANT POINT! When he commits, he must do so from within the parent repository; otherwise, Mercurial will NOT update the parent’s .hgsubstate. As a result, when the next developer comes along and clones the Cthulhu website, they’d get the hard-coded magenta version of awesomejs with it, and that’s not what we want.

First he commits the change to awesomejs:

c:\code\cthulhu>hg commit -m "make the color a parameter instead of hard-coding it to 'magenta'" awesomejs
committing subrepository awesomejs

Then the change to index.html in cthulhu:

c:\code\cthulhu>hg commit -m "make use of the awesomejs plugin.  I like cyan." index.html

And here’s the log:

c:\code\cthulhu>hg log
changeset:   3:90ccac8e4a0b
tag:         tip
user:        Rob Sobers
date:        Thu Mar 17 14:42:26 2011 -0400
summary:     make use of the awesomejs plugin.  I like cyan.

changeset:   2:71e8c78d1f59
user:        Rob Sobers
date:        Thu Mar 17 14:42:04 2011 -0400
summary:     make the color a parameter instead of hard-coding it to 'magenta'

changeset:   1:7f699cdf3208
user:        Rob Sobers
date:        Thu Mar 17 14:24:43 2011 -0400
summary:     adding awesomejs subrepository

changeset:   0:3517279c43fd
user:        Rob Sobers
date:        Tue Mar 15 23:45:15 2011 -0400
summary:     new resume site

Cthulhu can now push these changes up to Kiln (remember, push is recursive) and carry on…right?

Wait a minute! Cthulhu changed the signature of the makeAwesome function without regard for anyone else! What is that going to do to Vader’s site? Vader isn’t passing a color name, so his shadow color will be undefined, which I’m pretty sure isn’t a valid CSS color. Vader is expecting his drop shadows to be magenta.

No worries! The change Cthulhu made to the subrepo is available to Vader, but not automatically forced on him. Here’s what’s in Vader’s subrepo:

C:\code\vader\awesomejs>hg tip
changeset:   0:d5e13195590d
tag:         tip
user:        Rob Sobers
date:        Tue Mar 15 23:37:48 2011 -0400
summary:     creating new jQuery plugin

And here’s the incoming changeset, should he choose to pull it:

C:\code\vader\awesomejs>hg in

comparing with https://rob.kilnhg.com/Repo/Subrepo-Demo/Libraries/awesomejs
searching for changes
changeset:   1:e281bb75e574
tag:         tip
user:        Rob Sobers
date:        Thu Mar 17 14:42:04 2011 -0400
summary:     make the color a parameter instead of hard-coding it to 'magenta'

When Vader is ready, he can pull the above change, update, and commit the parent repo to record the new version in .hgsubstate.
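
To see why nothing breaks in the meantime, here’s a toy Python model of the pinning. It is only a sketch: it assumes awesomejs has a simple linear history, it uses the short hashes from the hg tip and hg in output above, and the revisions_behind and upgrade helpers are hypothetical names of my own, not Mercurial commands.

```python
# History of the shared awesomejs repository, oldest first
# (short hashes taken from the hg tip / hg in output above).
AWESOMEJS_HISTORY = ["d5e13195590d", "e281bb75e574"]

# The awesomejs revision each parent repository has recorded.
pinned = {
    "vader": "d5e13195590d",    # still the hard-coded magenta version
    "cthulhu": "e281bb75e574",  # the version that takes a color parameter
}


def revisions_behind(history, pinned_revision):
    """Count the revisions that are newer than the pinned one (linear history assumed)."""
    return len(history) - 1 - history.index(pinned_revision)


def upgrade(pinned, project, history):
    """Model 'pull, update, commit the parent': re-pin one project to the newest revision."""
    updated = dict(pinned)  # copy, so the other projects' pins are untouched
    updated[project] = history[-1]
    return updated


for project, revision in pinned.items():
    behind = revisions_behind(AWESOMEJS_HISTORY, revision)
    print(f"{project}: pinned at {revision}, {behind} revision(s) behind")

# Only when Vader opts in does his pin move; Cthulhu's is unaffected.
after_upgrade = upgrade(pinned, "vader", AWESOMEJS_HISTORY)
```

The point of the model is that Cthulhu’s push adds a revision to the shared history but changes nobody’s pin; each project moves forward only when its owner commits a new pin in the parent repository.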

And that’s the beauty of subrepositories — they give you the benefit of working on a single shared repository across multiple projects while letting you lock each project at a specific version.

All of the code from this tutorial is available for you to play with in a public Kiln repo: https://rob.kilnhg.com

Other resources:
  1. http://nerdwords.blogspot.com/2010/10/understanding-mercurial-subrepositories.html
  2. http://kiln.stackexchange.com/questions/2685/how-does-fog-creek-or-other-users-use-sub-repositories
  3. http://kiln.stackexchange.com/questions/1066/problems-using-sub-repositories-subrepos
  4. http://mercurial.aragost.com/kick-start/en/subrepositories