Sometimes reading the source control history is akin to realising just a bit too late that you’ve run out of toilet paper; deep sadness followed by utter despair. What should be an enlightening and educational read about the changes to your code turns into frustration and confusion.

Why

It’s all about readability and understandability.

Riveting messages like these can be seen far too often:

fixed
latest
Various fixes
Updated
wip

It’s impossible to tell what has changed or why with such messages. In a few weeks time the original committer wont be able to tell either. They will look at the diff but still not be entirely sure why that went in or what it does (without a decent mental effort).

Similar to writing code, we write commit messages for others to read and understand. And as much as writing clean code takes practice and conscious effort so does writing good commit messages. The extra time spent writing them will pay dividends later though.

Commit messages are part of the project maintainability as they make it easier to track what’s happening in the project. They also make it easier to find changes that may have introduced new bugs.

Peter Hutterer goes as far as saying:

… a commit message shows whether a developer is a good collaborator.

In short, commit messages serve a distinct purpose: they inform your co-workers, and your future self, about what has changed and why it changed.

How

Searching the Internet for tips on writing good commit messages yields so many results that you’d thought it be common practice to do so! Chris Beams has written a very popular post, as have Tim Pope. The Git Documentation have a section about styling as well.

While those posts mention Git specifically, the idea of clean commit messages are universal for all source control systems, be it Git, SVN, Perforce, CVS, or any of the others out there.

As much as there are coding standards, there should be commit message standards. And as with the coding standards it doesn’t matter so much what they are as long as they are followed by all in the team. However there are some general structures that are sound to follow, and I wholeheartedly agree with the above posters about this form:

Short summary of change at most 50-ish character

Followed by an optional paragraph that gives more details about the
what and why of the change. This text should be limited 72 characters.

- It may include bullet points
- That highlight specific things

That, I think, should form the basis of all commit messages. The ending punctuation mark on the subject line is optional, otherwise the message should be spelled and capitalized correctly. Any dressings (such as linking to bug tracker, etc) should be fitted in to that structure.

What form should the subject line have? Well, there’s a fierce debate about that, almost to the scale of the great Editor War1 - indicative mood vs imperative mood. It’s basically the difference between

Removed unused variables (indicative mood)

and

Remove unused variables (imperative mood)

The Linux Kernel2, for instance, requires all messages to be in imperative mood with the reasoning of

… you are giving orders to the codebase to change its behaviour.

While I prefer the imperative mood as well, it’s sweating the small stuff (like where to place the curly braces) and it’s more important to just have a consistent style that’s clear and easy to read.

Commits should as far as possible be a single atomic commit, related only to the specific fix or change being made. It’s very common to see something like “Fix enemy jump” and when you look at the changes there’s 8 files changed, 7 of them completely separate from the actual change mentioned in the commit message; some probably only changed indentation or whitespace. That makes reviewing a change that much harder.

Example - state what and why

Let’s consider this example:

Fix for enemies not finding closest area

On the surface that seems like a good commit message. But is it really though? What has actually changed? We know why something was changed - the enemies couldn’t find the closest area - but we’d have to look at the diff to figure what was changed. However, to avoid mental overload a diff should only really be used to tell us how something changed; we shouldn’t have to infer the what and why from it.

If we instead change the message to something like:

Fix closest area distance calculation

The enemies would sometimes never find the closest area because the
distance returned never extended beyond the current navigation layer.

Here we can clearly see what has changed, and why it was changed. Now, if we’re interested, we can look at the diff to see how it was changed.

Example - atomic changes

Consider this (contrived) commit:

Fix enemy jump bug

assets/characters/enemy.ma
engine/physics/character_controller.cpp
engine/physics/character_controller.h
game/enemy/enemy_jump.cpp
game/enemy/enemy_jump.h

A warning bell should off not only for the subpar commit message, nor the fact that we’re changing engine and game code in one commit, but we’re also changing art assets! You often see changes like this when the committer is fixing and wanting to close a bug in a bug tracker system. From that system’s point of view this change make sense - it’s a single commit that fixes the bug.

However, from a readability and maintainability point of view this is not so ideal. Imagine weeks later someone is looking at the history for character_controller.cpp and they come across “Fix enemy jump bug”. That’s odd. A character controller doesn’t really have anything to do with enemies jumping.

A more reasonable series of commits could be:

Fix jump rig

The knee joint wasn't rotating properly.

assets/characters/enemy.ma
Fix uninitialised variable

engine/physics/character_controller.cpp
engine/physics/character_controller.h
Change jump input button

Cross is the traditional button for jumping.

game/enemy/enemy_jump.cpp
game/enemy/enemy_jump.h

Yes, it’s more work. No, you can’t tie a bug ticket number to a specific singular commit. But for tracking project changes, and helping your co-workers understand those changes this is a winner.

It’s also a lot easier to revert individual changes that may have broken the build, and doing selective merging of code / features to other branches.

That’s not to say you can never group files together. Sometimes that’s warranted, but care and thought should go into that before doing so - don’t do it just because it’s convenient and easy.

A good question to ask is: how difficult would be for someone who doesn’t know this code to make sense of this commit?

Summary

Commit messages are much like code - written for others to make sense of. They are essential for tracking changes to a project, and the overall maintainability of said project. It’s important that they are clear, concise, and descriptive of the changes going in.

  • A commit message should answer what and why, a diff should answer the how
  • Where possible aim for atomic commits
  • Don’t let commit messages just happen, be deliberate - have a commit guide
  • Pick a commit style for your project and stick with it
  • You write commit messages for others and your future self

  1. The war between Vi and Emacs ↩︎

  2. An interesting read, a similar guide document should exist for every repository ↩︎