I came across a problem recently. I have a project called xBlip which I’ve described before – it’s an iPhone client for a Polish Twitter-like service Blip. This project has a backend part which I keep in a subdirectory “ObjectiveBlip” and which I’ve tried to keep as separate from the rest as possible, with the intention that it might be one day extracted as a separate project.
Now I got the idea that I could write a desktop application for Mac that does the same – and of course I could reuse that backend part for that. I would also like to create a separate project on Github with just the backend, so that theoretically someone might use it in future for some purpose.
But this means that I would be maintaining three separate copies of the same code, which I’d have to keep in sync somehow. So the question is, how to do this best?
There are a few ways in Git to share code between projects (for example, git submodules) – but most of them are intended only for one-way communication, i.e. downloading updates to a library maintained by someone else into your project. Here, I want two-way communication: I could extend the backend code while working on either the iPhone or the Mac application (working directly on the backend-only project wouldn’t usually make sense), and then propagate the changes to the other two projects. I also don’t want the solution to be inconvenient for people who download the project, as is the case with git submodules: their directories are initially empty, and you have to update them manually after you download the main code.
So I started looking for a way to pull this off – and I found a script called “git subtree” which seems to do exactly what I need (confusingly, there’s another plugin also called git subtree which is completely unrelated to the first one…). It took me some time (and a few emails to the author, Avery Pennarun) to figure out how to use it, so I thought I’d post a tutorial here in case anyone has a similar situation.
So, here’s what we need to do… (grab a coffee, it’s going to be long):
We have one project – xBlip – with the backend code in ObjectiveBlip/ and the UI code in other subdirectories. We want to make a second project with just the backend code, and a third one with a Mac application which reuses it, and set up a way to sync the changes between these three.
There are (at least) two ways to extract the backend project from xBlip: I can either use git subtree to extract ObjectiveBlip’s whole history, or I can copy the files manually. If I chose the first option, I’d run:
git subtree split -P ObjectiveBlip -b export
This would create a new ‘export’ branch in my repo, containing only the commits and changes that had anything to do with the ObjectiveBlip directory, and ignoring anything that happened outside it. That way, the new project would have some kind of history from the beginning. Then, I would create a new repository out of that specific branch (I learned that trick from Avery):
cd ~/Projects
mkdir ObjectiveBlip
cd ObjectiveBlip
git init
git fetch ../xblip export
git checkout -b master FETCH_HEAD
This looks weird because normally when you create a new repo (git init), the first thing you do is make the initial commit. Here, we instead fetch existing commits from an existing repo, and only commits from a specific branch (“export”), and then we manually create a master branch out of the fetched commits.
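The split-and-fetch sequence above can be tried safely on a throwaway repository first. The following is only a sketch: the temporary directory, the `app` and `backend` repository names, the file names, and the demo identity are all made up for illustration, and it assumes that `git subtree` is installed alongside git.

```shell
#!/bin/sh
# Throwaway demo; every path, file name, and identity below is illustrative.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A stand-in for xBlip: one backend directory, one UI directory.
git init -q app
cd app
git symbolic-ref HEAD refs/heads/master
mkdir ObjectiveBlip UI
echo "backend v1" > ObjectiveBlip/api.txt
echo "ui v1" > UI/view.txt
git add .
git commit -q -m "initial import"
echo "ui v2" > UI/view.txt
git commit -q -am "UI-only change"
echo "backend v2" > ObjectiveBlip/api.txt
git commit -q -am "backend-only change"

# Keep only the history that touched ObjectiveBlip/.
git subtree split -P ObjectiveBlip -b export >/dev/null

# Seed a fresh repository from that branch instead of making an initial commit.
cd "$work"
git init -q backend
cd backend
git fetch -q ../app export
git checkout -q -b master FETCH_HEAD

split_commits=$(git rev-list --count HEAD)
root_files=$(git ls-files)
echo "commits: $split_commits, files at root: $root_files"
```

In this sketch the UI-only commit is dropped, so the new repository ends up with two commits, and the backend file sits at the root rather than under ObjectiveBlip/.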
I’ve decided not to do that; the reason is that the commits that would form ObjectiveBlip’s history weren’t created with this separate project in mind – they were done as a part of coding on xBlip. And while it’s possible to extract only the relevant information with git subtree, the commits just wouldn’t always make sense. It would all be a bit artificial.
So instead I extracted the files manually and created a fresh project with no history:
cd ~/Projects
cp -R xblip/ObjectiveBlip .
cd ObjectiveBlip
git init
git add .
git commit -m "extracted ObjectiveBlip from iPhone xBlip"
git remote add origin git@github.com:mackuba/ObjectiveBlip.git
git push origin master
Adding ObjectiveBlip back to xBlip as a subproject
In order to move commits around between projects, I need to have ObjectiveBlip repo added as a remote in both application projects. I will then see all ObjectiveBlip commits in a separate branch (objblip/master), and I will decide how to copy commits between that branch and the master branch.
cd ~/Projects/xblip
git remote add objblip git@github.com:mackuba/ObjectiveBlip.git
git fetch objblip
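To see what the fetch actually gives you, here is a small sketch using two local throwaway repositories in place of the real ones (the `lib` and `app` directory names, the files, and the demo identity are assumptions made for the example). It checks that objblip/master points at the backend's latest commit and that it shares no history with master yet.

```shell
#!/bin/sh
# Throwaway demo; repository and file names are illustrative only.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the external backend repository.
git init -q "$work/lib"
cd "$work/lib"
git symbolic-ref HEAD refs/heads/master
echo "backend" > api.txt
git add .
git commit -q -m "initial backend"

# Stand-in for the application repository.
git init -q "$work/app"
cd "$work/app"
git symbolic-ref HEAD refs/heads/master
echo "ui" > view.txt
git add .
git commit -q -m "initial UI"

# Same two steps as above, pointed at the local stand-in.
git remote add objblip "$work/lib"
git fetch -q objblip

lib_tip=$(git -C "$work/lib" rev-parse HEAD)
tracked_tip=$(git rev-parse objblip/master)

# merge-base exits non-zero when the two branches have no common ancestor.
if git merge-base master objblip/master >/dev/null
then shared_history=yes
else shared_history=no
fi
echo "objblip/master at $tracked_tip, shared history with master: $shared_history"
```

The two timelines stay unconnected until a git subtree command joins them.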
The graph in GitX looks like this at this point:
(I cheated a bit and did some tricks involving commit --amend in order to force GitX to draw the graph this way – the author didn’t really foresee such a configuration, and GitX kind of freaks out sometimes when you work with git subtree, and shows long and messy lines, or even lines that break and continue somewhere else…)
To add ObjectiveBlip into the master branch as a subproject, I need to delete the existing files first:
git rm -r ObjectiveBlip
git commit -m "removed ObjectiveBlip files"
Now, to join the subproject I need to use git subtree add with the option --prefix ObjectiveBlip (or -P ObjectiveBlip). There are actually two ways to do that; I can do it either with an additional option --squash, or without it. Squash means that the subproject commits that you add into your main project are merged into one.
Let’s try the version without squash first:
git subtree add -P ObjectiveBlip -m "readded ObjectiveBlip as a subproject" \
  objblip/master
If you don’t use squash, the commits will be kept intact, so both application projects will contain a complete history, commit by commit, of the changes in ObjectiveBlip code. They will form a separate timeline parallel to your main one, but it will be connected to your main timeline at the points of merges, so if you look at a one-dimensional commit list (e.g. Github “commits” page), it will show the backend commits mixed with frontend commits. What’s worse, any commit you make to the subproject while working on the application’s master and backport to the other timeline (and I certainly will be making commits this way, because it’s easier to develop the backend if I can constantly test it in the actual app), will appear two times on the “commits” page – once in the main timeline, and once in the backend timeline.
I know, you probably didn’t understand any of this. Maybe this graph will clear things up:
This is the state of the xBlip repository after a few commits made in the xBlip repo and in the ObjectiveBlip repo.
The left vertical line is the main (master) timeline, which contains the normal code of my project, with ObjectiveBlip in a subdirectory. The right vertical line is ObjectiveBlip’s timeline, which contains its files at the root of the project, and none of the UI code. Note that this isn’t really a direct git merge, and you can’t use the plain git merge command to make the joins, or bad things will happen. You have to use git subtree to “translate” the commits for you.
Note also that the commit named “added foo to ObjectiveBlip” appears twice, once in the original version, and once in the “translated” version, and both versions will be visible on the “commits” page on Github. I could prevent that if I used ‘git rebase’ to delete the commit in the left timeline after I copied it to the right timeline, but that’s one extra thing I’d have to remember…
Merging and splitting
After the initial merge with git subtree add, for subsequent merges you use git subtree merge (for any command, you need to remember to use the prefix option to tell it the location of the subdirectory). If you make any commits to the subproject inside master, you can use git subtree split to backport the commits to the right timeline; pass it a --branch option with the name of a branch to be created or updated to point to the newest commit, and then push it to the external repository. Note that it’s usually better to keep the changes to files inside the subproject’s directory and to files outside it in separate commits, e.g. make a commit “added foo to ObjectiveBlip” and then separately “added FooController in the UI”, even if you worked on both parts simultaneously.
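The whole round trip (add, merge, split, push) can be rehearsed on local throwaway repositories before touching real ones. This is a sketch under assumptions: the `lib`, `lib.git`, and `app` names, the files, and the demo identity are invented for the example, a local bare repository stands in for the external one, and `git subtree` is assumed to be installed.

```shell
#!/bin/sh
# Throwaway demo; repository, file, and branch names are illustrative only.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# External backend: a working copy plus a bare repository to push to.
git init -q "$work/lib"
cd "$work/lib"
git symbolic-ref HEAD refs/heads/master
echo "backend v1" > api.txt
git add .
git commit -q -m "initial backend"
git clone -q --bare . "$work/lib.git"
git remote add origin "$work/lib.git"

# Application that embeds the backend under ObjectiveBlip/.
git init -q "$work/app"
cd "$work/app"
git symbolic-ref HEAD refs/heads/master
mkdir UI
echo "ui v1" > UI/view.txt
git add .
git commit -q -m "initial UI"
git remote add objblip "$work/lib.git"
git fetch -q objblip
git subtree add -P ObjectiveBlip -m "added ObjectiveBlip as a subproject" \
  objblip/master >/dev/null 2>&1

# A change lands in the external repo and is merged into the app.
cd "$work/lib"
echo "readme" > README
git add .
git commit -q -m "added readme"
git push -q origin master
cd "$work/app"
git fetch -q objblip
git subtree merge -P ObjectiveBlip -m "merged changes in ObjectiveBlip" \
  objblip/master >/dev/null 2>&1

# Backend and UI changes made inside the app, kept in separate commits.
echo "foo" > ObjectiveBlip/foo.txt
git add .
git commit -q -m "added foo to ObjectiveBlip"
echo "ui v2" > UI/view.txt
git commit -q -am "added FooController in the UI"

# Backport: only the commit that touched ObjectiveBlip/ gets translated.
git subtree split -P ObjectiveBlip -b backport >/dev/null 2>&1
git push -q objblip backport:master

upstream_count=$(git --git-dir="$work/lib.git" rev-list --count master)
upstream_tip=$(git --git-dir="$work/lib.git" log -1 --format=%s master)
echo "$upstream_count commits upstream, tip: $upstream_tip"
```

The UI commit never reaches the external repository; only the three backend commits do, with the backported one on top.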
Here’s a list of commands that were used to create the graph above:
# add the subproject - creates the first merge point,
# adds an ObjectiveBlip/ subdirectory
git subtree add -P ObjectiveBlip -m "readded ObjectiveBlip as a subproject" \
  objblip/master

# after we create the commit "added readme"
# in external ObjectiveBlip repo:
git fetch objblip
git subtree merge -P ObjectiveBlip -m "merged changes in ObjectiveBlip" \
  objblip/master

# after we make the changes to both UI and the backend while working
# in xblip master, we backport the relevant commit to the timeline on
# the right, and push it to objblip repo; note that the second commit
# is ignored, as it contains only changes unrelated to ObjectiveBlip
git subtree split -P ObjectiveBlip -b backport
git push objblip backport:master

# after we update readme in ObjectiveBlip repo:
git fetch objblip
git subtree merge -P ObjectiveBlip -m "merged changes in ObjectiveBlip" \
  objblip/master
I’ve decided to use the version with --squash instead. If you use squash, you will actually have 3 timelines (!) in your application repo… First will be the master, second – the subproject one, and the third one will be the squashed one. What’s important is that the squashed timeline will be merged with the master timeline, but the original subproject timeline will be kept completely separate, and you don’t even have to push it to Github with your application project.
Again, a graph will (hopefully) clear this up a bit:
The left vertical line is the master, the right one contains the squashed commits. The subproject timeline – the one that is used to make pushes and fetches from the external repository – will appear separately from the rest, either on top, or at the bottom. You only need it locally and you don’t need to push it to the ‘origin’ repo.
Here are the commands used this time:
git subtree add -P ObjectiveBlip --squash \
  -m "readded ObjectiveBlip as a subproject" objblip/master
...
git fetch objblip
git subtree merge -P ObjectiveBlip --squash \
  -m "merged changes from ObjectiveBlip" objblip/master
...
# note: for split, you don't pass --squash
# (there's currently no way to squash the backported commits)
git subtree split -P ObjectiveBlip -b backport
git push objblip backport:master
...
git fetch objblip
git subtree merge -P ObjectiveBlip --squash \
  -m "merged changes from ObjectiveBlip" objblip/master
There are two practical differences between the two strategies in how your commit timelines will look:
- with squash, there will always be only one commit per merge in the right (squashed) timeline; this may be good or bad, depending on what you expect, but I think most of the time you probably won’t need every single commit from the subproject to appear in your timeline
- with squash, commits backported from master to subproject will not appear a second time in the right timeline, because they will be a part of one of the squashed commits
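The first difference is easy to check by counting the commits reachable from master after the initial add in each mode. This is a sketch on throwaway repositories: the `lib`, `app_full`, and `app_squash` names, the `make_app` helper, and the demo identity are made up for the example, and `git subtree` is assumed to be installed.

```shell
#!/bin/sh
# Throwaway demo; repository names and the make_app helper are illustrative.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Backend repository with three commits.
git init -q "$work/lib"
cd "$work/lib"
git symbolic-ref HEAD refs/heads/master
for n in 1 2 3
do
  echo "backend v$n" > api.txt
  git add .
  git commit -q -m "backend change $n"
done

# Create an application repo with one commit and the backend fetched,
# and leave the shell inside it.
make_app () {
  git init -q "$work/$1"
  cd "$work/$1"
  git symbolic-ref HEAD refs/heads/master
  echo "ui" > view.txt
  git add .
  git commit -q -m "initial UI"
  git remote add objblip "$work/lib"
  git fetch -q objblip
}

# Without --squash: all backend commits become part of master's history.
make_app app_full
git subtree add -P ObjectiveBlip -m "added ObjectiveBlip" \
  objblip/master >/dev/null 2>&1
full_count=$(git rev-list --count master)

# With --squash: the backend arrives as a single squashed commit.
make_app app_squash
git subtree add -P ObjectiveBlip --squash -m "added ObjectiveBlip" \
  objblip/master >/dev/null 2>&1
squash_count=$(git rev-list --count master)

echo "without --squash: $full_count commits, with --squash: $squash_count commits"
```

Without squash, master contains the UI commit, the three backend commits, and the merge; with squash, it contains the UI commit, one squashed commit, and the merge.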
I believe you can use either of the approaches depending on how you want your commit graph to look. But please pick one at the beginning and stick to it, or bad things will happen…