Tuesday, November 8, 2011

Helping the kernel workflow redux

[edit: this is now used in the wild]

 The goal is still to give the kernel developers and its users a better way to validate the authenticity of changes that eventually land on Linus's tree.

The "signed commit" mechanism discussed in a previous post may be useful in some workflows, but not necessarily so in an environment where you would push a commit out, and then decide that the commit is worth including in the upstream history after a long while. If you forgot to sign the commit when pushed it out but otherwise the commit is in good shape, it feels a bit dirty that you would have to either amend it or cap it with a signed empty commit.

The latest round after a lengthy discussion across three mailing lists is to allow the integrator to run "git pull" against a signed tag, e.g.

$ git pull git://.../rusty.git/ rusty-for-linus

When 'rusty-for-linus' is a tag, the above syntax does not work with the current git (and it won't change in the upcoming 1.7.8 as we are deep in the pre-release feature freeze period), but you can instead say 'tags/rusty-for-linus' to do the same thing.

When recording the merge result of such a pull that names a tag, Git will open an editor and ask the integrator to give a merge commit message. So far, 'git merge' never asked for commit log message to be edited, and histories of many projects, especially when 'merge.log' configuration variable is not enabled, are littered with one-liner messages, such as "Merge from origin" that does not tell anything useful - why was this merge made, what changes were brought in, etc. That is going to change as well, as a side effect of this topic.

The integrator will see the following in the editor when recording such a merge:
  • The one-liner merge title (e.g 'Merge tag rusty-for-linus of git://.../rusty.git/');
  • The message in the tag object (either annotated or signed). This is where the contributor tells the integrator what the purpose of the work contained in the history is, and helps the integrator describe the merge better;
  • The output of GPG verification of the signed tag object being merged. This is primarily to help the integrator validate the tag before he or she concludes the pull by making a commit, and is prefixed by '#', so that it will be stripped away when the message is actually recorded; and
  • The usual "merge summary log", if 'merge.log' is enabled.
The contents of the signed tag is also recorded in the header field of the resulting commit object, so that anybody can later retrieve it from the history and validate the signature. The signed tag that was pulled is not stored in the integrator's repository, nor pushed out to the integrator's publishing point.

The primary reason the new mechanism records this information inside the commit instead of leaving the tag around is for convenience. Recent kernel history contains about 400 merges by Linus within 3 months (4 to 5 pulls per day), and that counts only the pulls by Linus. To make the whole merge fabric more trustworthy, the integration made by his lieutenants by pulling from their sub-lieutenants need to be made verifyable the same way, which would (1) make the number of signed tags even larger and (2) make it more likely somebody in the foodchain gets lazy and refuses to push out the signed tags after he or she used them for their own verification.


No comments: