How To Use Git

Documents

git is a Distributed Version Control System (DVCS).

This is mainly a random collection of my personal notes about git.

Online Git Book

Composing Git Commit Messages

The git commit message normally consists of the following:

  • The Commit Subject Line is a short one-line summary of the change
  • The Commit Body is a description of the change
  • An optional Signed-off-by line

A single blank line separates the commit subject from the body.
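
As a rough sketch of that layout, the commands below build a commit message with a subject, a blank line, and a body in a scratch repository. The file name, identity, and message text are made up for illustration; only the structure matters.

```shell
# Scratch repository so the example cannot touch real work.
cd "$(mktemp -d)" && git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "hello" >greeting.txt
git add greeting.txt

# Subject line, one blank line, then the body (read from standard input).
git commit -q -F - <<'EOF'
greeting: add initial greeting file

The project had no greeting text at all, so new users saw an
empty directory.  Start with a single plain-text file.
EOF

git log -1 --format=%s    # prints only the subject line
git log -1 --format=%b    # prints only the body
```

Git treats everything up to the first blank line as the subject, which is why the two format placeholders can split the message apart again.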

The following descriptions were written by Junio Hamano, the Git project maintainer, on the git mailing list.

The Commit Subject Line

The subject is primarily to help people who look at
shortlog (or "gitk") to get the overview of recent changes,
or in general "changes within a given range".

Readers are most interested in what areas are affected
(e.g. the command from the end-user's point of view, or the
internal implementation) and what the nature of the change
was (e.g. bugfix vs enhancement).  To help them, the
Subject: line summarizes "what the change is about".

Your Subject: line is _perfect_.  It identifies the area as
"git-commit" instead of "builtin-commit.c", because it is
not about fixing internal implementation of that file, but
about the end-user experience interacting with the command.
It also makes it clear that it is a fix by saying that we
failed to exit with non-zero status code upon some failure.
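
To see what a shortlog reader actually gets, here is a small sketch in a scratch repository. Both subjects are invented for illustration (they are not real commits from the Git project); the first names the command as the user sees it, the second names an internal file.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# --allow-empty keeps the sketch short; real commits carry changes.
git commit -q --allow-empty -m "git-commit: exit non-zero when the editor fails"
git commit -q --allow-empty -m "builtin-commit.c: split option parsing into a helper"

# Name a revision explicitly: without one, shortlog reads a log from
# standard input when it is not attached to a terminal.
git shortlog HEAD
git log --oneline
```

Only the subject lines appear in either listing, so the area prefix and the nature of the change have to fit there.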

The Commit Body

The body of the commit log message is primarily to help
people who look at this particular commit 6 months down the
road to see why things got there that way.  

Reason behind the logic in the code _after_ the change can
be left in in-code comments.  The reason behind the change
itself (why the logic behind the code _before_ the change
was faulty or insufficient, and the logic behind the new
code is better) is not captured well in such a comment (and
we do not want to clutter the code comments with a long "in
ancient versions we used to do this but then we updated it
to do that but now we do it this way instead." --- I made
that mistake earlier and I suspect some of the older source
files still have them).

The commit log message should describe why the change was
needed (e.g. "The earlier code assumed X because it knew Y
won't happen, but that is not the case anymore since commit
Z, so this code stops relying on that assumption and
implements the logic this way instead"), why the proposed
implementation was thought to be the best one to choose
(e.g. "We alternatively could do W and it may have some
performance edge, but this way the code is simpler and in
my benchmark with real life data I did not see significant
gain from the added complexity").

How the code was changed in this commit does not need to be
described; that can be seen in "git show $this_commit"
output easily.

Signed-Off-By Line

What 'Signed-off-by' means is project specific.

In addition, at the end of the body, there is expected to be
your S-o-b: line, so it will never be "1-liner".
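
A minimal sketch of adding the line automatically, assuming a scratch repository and a placeholder identity: the -s (--signoff) option of git commit appends a Signed-off-by line built from the committer name and email.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "hello" >greeting.txt
git add greeting.txt

# -s appends "Signed-off-by: <name> <email>" after the message text.
git commit -q -s -m "greeting: add initial greeting file"

git log -1 --format=%B    # subject, blank line, Signed-off-by line
```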

Dealing with Conflicts

If you end up doing a merge with conflicts, the default commit message will contain something like:

Conflicts:

      filename_goes_here.extension

Either describe how you resolved the conflict or remove these lines.

Preventing Whitespace Breakage in Commits

If you enable the pre-commit hook in the repository, git will not allow you to commit code that has whitespace breakage.

Enable the hook as follows:

chmod a+x .git/hooks/pre-commit
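
Newer Git versions install the sample hook as .git/hooks/pre-commit.sample, so there may be no pre-commit file to make executable. The sketch below writes a minimal stand-in hook instead. It is not the sample hook that ships with Git; it only assumes that git diff --cached --check exits non-zero when the staged changes contain whitespace errors. The file name and identity are placeholders.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"
git commit -q --allow-empty -m "Initial commit"

# Minimal stand-in hook: reject the commit when the staged changes
# contain whitespace errors.
mkdir -p .git/hooks
cat >.git/hooks/pre-commit <<'EOF'
#!/bin/sh
exec git diff --cached --check
EOF
chmod a+x .git/hooks/pre-commit

# Stage a line with a trailing space and try to commit it.
printf 'trailing space \n' >notes.txt
git add notes.txt
if git commit -q -m "notes: add first note"; then
    hook_result=accepted
else
    hook_result=rejected
fi
echo "commit was $hook_result"
```

With the hook in place the commit should be refused, and git prints the offending file and line.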

Fixing Whitespace in Code

Here is a simple recipe to fix up whitespace breakage in code you are about to commit.

If git diff --check HEAD shows whitespace problems, you can fix them with git wsfix. You need to set up the git wsfix command once, as in the example below.

git config --global alias.wsfix '!git diff HEAD >P.diff && git reset --hard HEAD && git apply --index --whitespace=fix P.diff && rm -f P.diff'
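
The steps the alias chains together can be tried by hand in a scratch repository, as sketched below with a made-up file. Note that git reset --hard discards the working tree changes, so the recipe depends on P.diff having been written first, and untracked files are not part of the diff.

```shell
cd "$(mktemp -d)" && git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "line one" >notes.txt
git add notes.txt
git commit -q -m "notes: add first line"

# Introduce trailing whitespace in an uncommitted change.
printf 'line two   \n' >>notes.txt
git diff --check HEAD || echo "whitespace problems found"

# The same steps the wsfix alias runs.
git diff HEAD >P.diff &&
git reset --hard HEAD &&
git apply --index --whitespace=fix P.diff &&
rm -f P.diff

git diff --check HEAD && echo "whitespace is clean"
```

After the last step the change is staged again, this time without the trailing spaces.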

Submodules

What is a submodule and why would you use it?

Why use detached heads in submodules?

The following excerpt is from a post by Junio on the git mailing list.

Eyvind Bernhardsen <eyvind-git@orakel.ntnu.no> writes:

> One solution that occurred to me was to have a branch in each
> submodule for every main module and branch.  A branch name would be
> provided for each submodule in .gitmodules, used by "submodule push"
> but not "submodule update".  In this case, if the push to the branch
> fails, the main module branch is probably behind too.
>
> This seemed like a good idea, but it's racy.

It's not just racy, but I think it's wrong to limit to _one_ branch in
each submodule.

A submodule is an independent project on its own.

Suppose the commit DAG in the submodule looked like this:

                 o---o
                /     \
     --o---o---o---o---o---X---o---Z
            \                 /
             o---o---o---o---o---o
                  \     /
                   o---o

and the superproject points at commit X. You may need to tweak the
submodule to make it work better with the change you are making to the
superproject.

You have two choices:

 (1) update to some "stable" branch head that is descendant of X first,
     and make sure it works with the superproject.  Then develop on top of
     it, and bind the tip of such development trail to the superproject:

                 o---o
                /     \
     --o---o---o---o---o---X---o---Z---o---o---Y (your changes are Z..Y)
            \                 /
             o---o---o---o---o---o
                  \     /
                   o---o

I think this is what you are suggesting.  But the superproject may not be
ready to use the submodule with the history from the lower side branch
merged in.  You would

 (2) fork off of X and develop; bind the tip of such development trail to
     the superproject.  IOW, you make the submodule DAG like this, and
     then "git add" commit Y in superproject.

                 o---o       o---o---Y (your changes)
                /     \     /
     --o---o---o---o---o---X---o---Z
            \                 /
             o---o---o---o---o---o
                  \     /
                   o---o

Sometimes forked branches need to be maintained until it proves stable
(and then your "tip" Y may be merged back to the tip of a public branch
Z).  So you would at least need to allow a set of topic branches in
submodules that corresponds to a single lineage of superproject history.

Then when both Z (with the changes from the lower side branch) and Y (your
changes) prove stable, the submodule project may decide to make a merge
between Y and Z.

                 o---o       o---o---Y (your changes)
                /     \     /         \
     --o---o---o---o---o---X---o---Z---W
            \                 /
             o---o---o---o---o---o
                  \     /
                   o---o

The superproject may decide to "git add" the result of such a merge, but
that decision is done separately (and obviously after such a merge is
made).

If everybody involved in the superproject forked from whatever happened to
be bound to the superproject, however, the submodule will have
uncontrolled number of unmerged "tips".  For the submodule to stay viable
as an independent project, some management has to be done to clean up this
mess.  To manage the forked development inside submodule properly, I do
not think you can autocruise from the toplevel superproject and leave the
random branches or unnamed commits unmerged.

That is why I suggested to:

 * Leave the HEAD in submodule detached, if you are not working in it;

 * Have a project policy regarding the use of branches in the submodule.
   When you need to work on submodule, first switch to the branch (the
   policy may allow/encourage you to create a new topic branch here), and
   commit to it.

 * The policy should also say when these forked branches should be merged;
   keep them tidy by following that policy.

And by pushing from submodule and then in toplevel, you will never have
"superproject names a commit unreachable from any of the branch tips of
submodules" problem.  Nor there is any raciness issue -- only after you
push out the submodule successfully, you push out the toplevel (and if the
former fails, you may need to redo the toplevel commit, but that happens
before you push it out so you can afford to rebase or amend).
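
The push order described above can be sketched end to end with local scratch repositories. Everything here is hypothetical: lib.git and super.git stand in for the published submodule and superproject, topic is an arbitrary branch name, and the g helper only supplies a placeholder identity and permits local-path submodule clones, which newer Git versions refuse by default.

```shell
work=$(mktemp -d) && cd "$work"

# Helper: git with a placeholder identity and local-path submodules allowed.
g() {
    git -c user.name="Example Author" -c user.email="author@example.com" \
        -c protocol.file.allow=always "$@"
}

# Hypothetical upstream for the submodule.
g init -q lib-src
(cd lib-src && echo v1 >lib.txt && g add lib.txt &&
    g commit -q -m "lib: initial version")
g clone -q --bare lib-src lib.git

# Hypothetical upstream for the superproject, with lib bound as a submodule.
g init -q super-src
(cd super-src && echo demo >README && g add README &&
    g commit -q -m "Initial commit" &&
    g submodule --quiet add "$work/lib.git" lib &&
    g commit -q -m "Add lib as a submodule")
g clone -q --bare super-src super.git

# Working clone: after the update, HEAD in the submodule is detached.
g clone -q super.git super
cd super
g submodule --quiet update --init
cd lib

# Switch to a branch before working in the submodule.
g checkout -q -b topic
echo v2 >lib.txt
g commit -q -am "lib: bump to v2"

# Push the submodule first ...
g push -q origin topic
# ... and only then record and push the new commit in the superproject.
cd ..
g add lib
g commit -q -m "Bind lib to the tip of topic"
g push -q origin HEAD

# The commit named by the superproject is reachable from a submodule branch.
g -C "$work/lib.git" branch --contains "$(g -C lib rev-parse HEAD)"
```

If the submodule push had failed, the superproject commit would still be local and could be amended before anything was published.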

How to get the status of all submodules

The following excerpt is from Johan Herland <johan@herland.net> on the git mailing list:

git submodule foreach "git status; true"

If the above is too cumbersome to type, one can easily wrap an alias 
around it:

git config alias.substatus 'submodule foreach "git status; true"'
git substatus

References

  • Git for Computer Scientists: shows basic git internal data structure concepts
  • Git Wiki

Author: Bernt Hansen