A walk in the forest of worktrees

Lately, I’ve been working on migrating Doctrine ORM to PHP 8 syntax. To that end, I’ve been using Rector, an automated refactoring tool. It comes with a set of rules called LevelSetList::UP_TO_PHP_81 which makes sure you use the most modern syntax to do something as long as it is supported on PHP 8.1. UP_TO_PHP_81 is equivalent to UP_TO_PHP_80 + PHP-8.1-specific rules. UP_TO_PHP_80 is equivalent to UP_TO_PHP_74 + PHP-8.0-specific rules. UP_TO_PHP_74 is equivalent to… you get the idea, and maybe also marvel at how satisfying this looks. 🤓

I’m doing that work on the 3.0.x branch because we don’t currently plan to drop support for PHP 7.1 on the 2.* branches. While I’m at it, I’m taking this as an opportunity to add type declarations where not already present.
Adding type declarations is very often a breaking change, especially when working on methods that are not private, and classes that are not final,which explains why that was not already done everywhere on lower branches.

Thankfully, we have plenty of phpdoc comments that Rector can use to infer the correct type declarations to add. Here is how such changes might look:

- /**
-  * @param string $foo
-  *
-  * @return int
-  */
- public function doStuff($foo);
+ public function doStuff(string $foo): int;

doctrine/orm is a lot of code though, so I’m trying to work in bite-sized pull requests. First, because it would be awful to review all these changes at once, but also because the phpdoc we have can be imprecise, or plain wrong (we still have to work through our PHPStan and Psalm baselines). Having inaccurate phpdoc might be fine from PHPUnit‘s point of view, but having inaccurate type declarations isn’t, so I need to fix these by hand afterwards.

I like to repeat that we’ve never been closer to releasing doctrine/orm 3.0 than we are today, an information that you can share widely because it’s always true. While that’s a fact you can hardly deny, it is still good to backport any of the fixes and improvements to the 2.x series: it makes difference between
branches smaller, which in turn makes merging up from 2.x to 3.x easier, but also lets the users benefit from those fixes and improvements earlier.

Usually, what I do is pick a class or namespace, apply Rector on it, then review the changes. If I spot phpdoc that is wrong, I fix that on the patch branch (currently 2.13.x). If I spot phpdoc that is correct but a bit vague, I make it more precise on the minor branch (currently 2.14.x). After the PR is merged, I merge 2.13.x up into 2.14.x, and 2.14.x up in 3.0.x, and I try running Rector again, this time with correct phpdoc.

Contributing the right thing to the right branch

That migration is a good use case for the git subcommand I want to introduce to you today, because I need to change branches often. It’s already not unusal to have to do so when maintaining a library, but it’s exacerbated here. Sorry for the Tom Jones earworm.

In this particular case, here is the problem I am facing: imagine you have 10 fixes or improvements to backport from one branch to the others, and that you discover them progressively. How would you proceed? Would you stash changes, switch branches, run composer update for good measure, make your change, commit, then switch back every time? Or would you maybe try to remember several things you need to do, and try to do them all at once? Either solution sounds pretty bad.

The subcommand that can save you from this is git worktree.
It allows you to have ✨several✨ worktrees at once for a single repository.

Creating a throwaway worktree with branch 2.14.x can be done like so:

$ git worktree add /tmp/throwaway 2.14.x

The operation is instant, and in the case of doctrine/orm, there are only 2 steps to be ready to work on that new branch:

$ cd /tmp/throwaway
$ composer update

But I do not want to create throwaway worktrees… I find the idea of having permanent worktrees very appealing: they are like starting points for new branches I want to create. Each one has its own vendor directory, with the right dependencies.

Also, I prefer to have them neatly grouped in a single directory. I could have a normal repository, and then add worktrees inside, but then git would consider the worktrees themselves as new directories that need to be put under version control. To avoid having that main worktree, you can use a bare repository:

$ git clone --bare git@github.com:doctrine/orm.git doctrine-orm.git

You will end up with a directory called doctrine-orm.git, and the contents of that directory will be what you usually find in the .git directory If you use git log, you will see the history of the default branch, which the current HEAD points to (2.13.x in our case).

Doctrine uses a consistent branching model on all of its repositories:

  • 🐛 bugfixes go to the patch branch;
  • 💡 new features, deprecations, improvement go to the minor branch;
  • 💥 breaking changes go to the major branch.

At first, I named directories after branches, but when 2.12.x went unmaintained, I no longer had a use for the corresponding directory, and realized I should have one directory per branch type instead. Here is how to create that workforest 🌳🌳🌳:

mkdir ../doctrine-orm
git worktree add ../doctrine-orm/patch 2.13.x
git worktree add ../doctrine-orm/minor 2.14.x
git worktree add ../doctrine-orm/major 3.0.x

After that, you should end up with something like this

doctrine-orm
├── major # a full 3.0.x doctrine/orm is inside 💥
├── minor # a full 2.14.x doctrine/orm is inside 💡
└── patch # a full 2.13.x doctrine/orm is inside 🐛
doctrine-orm.git # Looks just like a regular .git directory
├── config
├── HEAD
├── hooks
├── objects
├── packed-refs
├── refs
└── worktrees # contains administrative files for your worktrees

Note that I could have created the worktrees directly inside doctrine-orm.git, but I don’t want to, I find that messy.

When inside doctrine-orm/*, git still knows where the repository is stored thanks to a .git file in each worktree. Yes, in this case it’s just a one-line file, with a pointer to a directory that holds administrative data for that worktree.

$ cat .git
gitdir: /path/to/doctrine-orm.git/worktrees/minor

When trying this at first, it broke custom git hooks I had. That’s because when using worktrees, Git will store the administrative data that is common to all three worktrees in the usual directory, but will put worktree-specific administrative data in another directory (here: /path/to/doctrine-orm.git/worktrees/minor). Making the distinction between the git directory specific to a worktree and the git directory shared by all worktrees helped me fix my hooks. It is possible to figure them out from inside a worktree:

$ git rev-parse --git-dir
/path/to/doctrine-orm.git/worktrees/major
$ git rev-parse --git-common-dir
/path/to/doctrine-orm.git

When directly inside doctrine-orm, nothing special happens, and you will get the usual error message when trying to issue a git command.

$ git status
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

When inside a worktree, you can know the status of all worktrees:

$ git worktree list
/path/to/dev/doctrine-orm.git    (bare)
/path/to/dev/doctrine-orm/major  e9f3a43f3 [php8-migration-persisters]
/path/to/dev/doctrine-orm/minor  cc9e456ed [10238--lockMode]
/path/to/dev/doctrine-orm/patch  28cb24b3c [psalm-5-fixes]

What does this unlock?

  1. Switching branches becomes as easy as switching directories. No need to commit or stash anything.
  2. I do not need to run composer update all the time when switching from one worktree to another.
  3. If I want to, I can even compare and edit the same file on two different branches at once. 🤯

I like that git has such hidden gems, like git worktree or git bisect, that are little known and rarely needed, but are still killer features when you do need them.

How to become a good maintainer

There are many ways you can contribute to Open Source: reporting issues, answering support questions, contributing docs, tests or code, and maintaining packages. I’ve been experiencing the last one for a few years, thanks to Thomas Rabaix, of Sonata Project fame.

I got involved in Sonata soon after picking it as the go-to tool for administration interface in my former company. I had been experiencing some bugs, and the code was simple enough for me to change it, and make a PR to contribute the changes back. At some point though, I remember getting a bit frustrated because my PRs were piling up. I looked at the list of PRs and I thought that mine were never going to be reviewed given the list of open PRs. So I proceeded to give reviews to as many PRs as I could, so that they were easier to understand, merge or decline. Then Sullivan Senechal joined the project and gave it, among other things, a much needed vision on versioning
Some time after that, when the project’s health was starting to get better, I got merge permissions too. Since then, I have been a maintainer and I have learnt some things about that job that I think are worth sharing.

The bus factor

This one is of tremendous importance: just like your production systems should avoid Single Points Of Failure, you should avoid them in an open-source organization. Many people should be able to merge pull requests, release new versions, or tweak repository settings. Of course, you should not trust just anyone, but if you notice you get regular contributions from the same person, you should definitely ask them if they would be interested in joining your team. Some might first say they are not worthy, and later on turn out to be great additions to your team. All will feel honored with your trust, and will often contribute even more than they already did. Always be looking for new team members even if you are already many, it puts less pressure on everyone, and allows better decisions to be taken. Some
members will sometimes need to step away from the project for some time, and that should not be an issue for anyone.

Managing releases

Composer is in my opinion the single best thing that happened to the PHP community the last few years, and, because of it, I think the approach people have towards version numbers shifted from marketing to something way more technical: backwards-compatibility. One of the most crucial responsabilities you can have as a maintainer is to make sure patch/minor releases do not contain any BC-breaks. Instead, you will often have to ask contributors to deprecate part of the API of your lib, while recommending newways of using your package.
The issue with the BC layers that are introduced is that they are often ugly, and can make your code harder to understand, and to contribute to. To avoid this, you should release new major versions of your packages as often as necessary, and remove deprecated APIs when doing so.
The corollary of this is that major versions should be boring, ideally just about removing old, ugly things, while minor versions should be the ones to continuously introduce new features.

This “release often” mantra goes for minor and patch versions too:
– The longer you wait for releasing major versions, the more crust you accumulate;
– the longer you wait for releasing minor versions, the bigger wave of bug
reports following them;
– and finally, the longer you wait for releasing patch versions, the longer your users suffer from bugs.

Our policy for Sonata releases is to do the patch and minor releases on demand: people can ask anyone of us for a release and it will be notified in a dedicated Slack channel that release managers monitor.
For major releases, we do not release as much as I would like, and I hope we can improve on that.

Making contributions easy

As a maintainer, you should make sure that the build is as stable as possible. Contributing a typo fix and getting a red build is terrible experience for contributors. To avoid that, you should never ever merge unless the build is green, and most tools will allow you to prevent that with some settings. You should also try to reduce uncertainties by using appropriate version constraints, and pinning versions for software you do not install through Composer. Do not commit your composer.lock though, that would mean your users would discover bugs before you get a chance to. Do add build jobs to test your software with future versions of your dependencies, and a job to test them with the lowest possible versions. Also when everything else fails, remove the tests that fail, even if they only fail sometimes for no apparent reason.
There is nothing worse than flaky tests.
If the tests fail in the CI, your contributors should be able to run them locally easily. Ideally, they should be able to just run make test and look
at the result.
If that fails, your contributors should be able to figure things out by reading the CONTRIBUTING.md and/or README.md files.

Reviewing code

Getting fast feedback on pull requests is always nice, and one of the most important feedback you can provide is that you do not understand the pull request. Require good commit messages, documentation for new features, and unit tests demonstrating how to use new APIs, or confirming that an issue is indeed fixed.
To avoid pestering contributors with style issues, automate away the style review, there are lots of SaaSes for this. Sullivan even wrote one and let us use it on Sonata.
If you can, integrate static analyzers like phpstan or psalm in your CI pipeline, it should detect some non-obvious bugs, or inaccurate phpdoc. With this out of the way, you should be able to focus on the big picture, and tell your contributor as soon as possible if their patch can be merged or not. Doing so as early as possible can avoid a lot of frustration.

I hope these tips will help you if you plan to become a maintainer, which you absolutely should if you want to grow as a developer: it will give you experience in areas that are crucial for deploying projects flawlessly but where you might not be able to train often.