Git rebase: A gentle introduction

Understanding Git Rebase

Git rebase is easily one of the most misunderstood commands in Git. Beginners often mess up their working tree by using it the wrong way, or struggle to see the difference between git rebase and git merge. In this article, we use graphical representations to explain in detail what actually happens, and what does not happen, during a git rebase. It is written as a discussion of rebasing rather than a simple how-to tutorial.


What is a rebase?

Rebase in literal terms means to change the base of an object. Git rebase does the same with git branches – it changes the base of the branch.

But what exactly is the base of a branch, and why would we change it? We answer these questions in the next few sections of this article.

What is the base of a branch?

The base of a branch is the commit at which it diverged from its parent branch. It can also be defined as the latest commit that the branch and its parent branch have in common. Here is a simple visualization to understand it.

Git Base 1
Fig 1: A visual representation of a git commit-graph showing the base of the feature branch in blue.

Feature 1 is the commit from which the feature branch deviated from the master branch. So it is the base commit of the feature branch.

With git rebase we want to change this base to another commit, which means that the Feature 2 commit will no longer point to Feature 1 but to some other commit.
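The base commit can also be looked up from the command line. The sketch below builds a throwaway repository and asks git for the latest commit that master and feature have in common, using git merge-base. The commit names (m1, f1, and so on), the make_commit helper and the temporary directory are assumptions made up for this illustration; they are not the commits shown in Fig 1.

```shell
#!/bin/sh
# Scratch repository; branch contents and commit names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example"
git config user.email "example@example.com"

# Helper (made up for this sketch): one small file per commit.
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit m1
make_commit m2
fork_point=$(git rev-parse HEAD)   # feature is created at this commit
git checkout -q -b feature
make_commit f1
make_commit f2
git checkout -q master
make_commit m3                     # master moves on after the fork

# The base of feature is the latest commit it shares with master.
base=$(git merge-base master feature)
echo "base of feature: $(git log -1 --format='%h %s' "$base")"
```

Here the reported base is the commit at which feature was created, even though master has gained a commit since then.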

What does git rebase do?

Before we go into the details of the command, here is a quick look at the syntax of the git command:

git rebase <target branch>

This command rebases the current branch onto the target branch, so the current branch is now based on the last commit of the target branch. Consequently, the commit-graph looks as if the branch never separated from the target branch and its commits were made on top of the new base all along. This gives the commit-graph a linear look.
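As a rough illustration of the command in use, the following sketch rebases a feature branch onto master in a throwaway repository and then checks that the result is linear. The repository layout, the commit names and the make_commit helper are assumptions invented for the example.

```shell
#!/bin/sh
# Scratch repository; commit names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit m1
git checkout -q -b feature
make_commit f1
make_commit f2
git checkout -q master
make_commit m2            # master and feature have now diverged

# Rebase the current branch (feature) onto the target branch (master).
git checkout -q feature
git rebase -q master

# After the rebase, feature sits directly on top of master.
ahead=$(git rev-list --count master..feature)            # commits only on feature
merges=$(git rev-list --count --merges master..feature)  # merge commits among them
git log --graph --oneline feature
```

The final git log --graph call prints the history of feature as a single straight line, with the two feature commits on top of the master commits.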

The command syntax is not the most important part; what matters is how the command works. If you mess up a git rebase, you can end up with unwanted commits and a complex commit graph instead of the clean, linear one you wanted. So let us understand git rebase in a little more detail.

As we discussed at the beginning of the article, git rebase changes the base of the branch. But this statement is only partially true.

These are the things that we might expect git rebase to do:

  • Change the base of the branch to the latest commit on the target branch.
  • Keep the commits in the current branch unaltered.

If git rebase followed this principle, the commit graph from Fig 1 would look like this after the command git rebase master:

Git Rebase Imagine
Fig 2: An imaginary commit-graph in which the original commits are moved to the target branch as they are. Compared with the commit-graph in Fig 1, it looks as if the branch never diverged from master.

But there is a small catch we are missing here:

  • Moving the commits like this would rewrite the existing history of the branch. A commit's SHA1 is computed from its contents, including the ID of its parent commit, so a commit cannot be given a new parent and still remain the same commit. Any command that deletes or rewrites git history is a potential source of problems.
  • If the commits on the feature branch are dated before the newer commits on the target branch, the rebased branch would contain commits that are older (with respect to time) than the commit they are based on, which makes the history look chronologically inconsistent.
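One way to see why a commit cannot simply be pointed at a different parent is to look at the raw commit object: the parent's ID is stored inside it. The sketch below prints a commit object in a throwaway repository; the commit names and the make_commit helper are assumptions for illustration only.

```shell
#!/bin/sh
# Scratch repository; commit names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit m1
make_commit m2

# Print the raw commit object: tree, parent, author, committer, message.
git cat-file -p HEAD

# The "parent" line is part of the object that the commit ID is computed from.
recorded_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
echo "parent recorded in the commit: $recorded_parent"
```

Because the parent line is part of the hashed content, giving a commit a different parent necessarily produces a commit with a different ID.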

To avoid these problems, what git rebase actually does is:

  • Relocates the base of the branch to the tip of the target branch, so feature is rebuilt starting from the latest commit on master.
  • Makes new duplicate commits on the branch. These commits contain the same code changes as the original commits but have different SHA1 values.
  • Leaves the older commits in the repository in a detached state: no branch points to them any more, but they are not deleted immediately.
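The behaviour described in the list above can be checked in a throwaway repository. This sketch records the tip of feature before and after a rebase and compares the two commits. The commit names, the make_commit helper and the variable names are assumptions for this example; git patch-id is used here only as a convenient way to compare the code changes of two commits.

```shell
#!/bin/sh
# Scratch repository; commit names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit m1
git checkout -q -b feature
make_commit f1
make_commit f2
git checkout -q master
make_commit m2
git checkout -q feature

old_tip=$(git rev-parse feature)     # tip of feature before the rebase
git rebase -q master
new_tip=$(git rev-parse feature)     # tip of feature after the rebase

# Same message and same code change, but a different commit ID.
old_msg=$(git log -1 --format=%s "$old_tip")
new_msg=$(git log -1 --format=%s "$new_tip")
old_patch=$(git show "$old_tip" | git patch-id | cut -d' ' -f1)
new_patch=$(git show "$new_tip" | git patch-id | cut -d' ' -f1)

# The original commit still exists, but no branch contains it any more.
old_type=$(git cat-file -t "$old_tip")
holders=$(git branch --contains "$old_tip")

# The branch's reflog still remembers where feature pointed before the rebase.
previous=$(git rev-parse 'feature@{1}')

echo "old tip: $old_tip ($old_msg)"
echo "new tip: $new_tip ($new_msg)"
```

The old tip can still be inspected by its ID or through the reflog entry feature@{1}, which is what makes it possible to undo a rebase that went wrong.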

So let us finally visualize what actually happens in a git rebase:

Git Rebase Actual
Fig 3: The commit-graph after we rebase the feature branch onto master. Notice how the two new commits have the same messages as the originals (they are duplicates) but different SHA1 values, meaning they are different commits. The original commits are still present in the commit-graph as detached nodes.

When do I use git rebase?

Git rebase is mainly used to get a clean, linear commit-graph, unlike the more tangled graph that git merge creates. Here is the tradeoff between the two:

  • Git merge preserves commit history, but at the cost of an extra commit and a more convoluted commit graph.
  • Git rebase keeps the number of commits the same, but at the cost of rewriting commit history.

So every branch that you could merge can also be rebased and vice versa, but it is recommended not to rebase public/shared branches, because other people may have based their work on the original commits that the rebase replaces. When using rebase, keep in mind the tradeoff between preserving history and having a clean commit-graph.
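To make the tradeoff concrete, the sketch below integrates the same feature work twice in a throwaway repository, once with git merge and once with git rebase, and counts the resulting commits. The branch names merged and rebased, the commit names and the make_commit helper are assumptions made up for this comparison.

```shell
#!/bin/sh
# Scratch repository; branch and commit names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit m1
git checkout -q -b feature
make_commit f1
make_commit f2
git checkout -q master
make_commit m2

# Variant 1: merge feature into a copy of master (adds a merge commit).
git checkout -q -b merged master
git merge -q --no-edit feature

# Variant 2: rebase a copy of feature onto master (no merge commit).
git checkout -q -b rebased feature
git rebase -q master

merged_total=$(git rev-list --count merged)
merged_merges=$(git rev-list --count --merges merged)
rebased_total=$(git rev-list --count rebased)
rebased_merges=$(git rev-list --count --merges rebased)

echo "merged:  $merged_total commits, $merged_merges merge commit(s)"
echo "rebased: $rebased_total commits, $rebased_merges merge commit(s)"
```

In this scratch repository the merged branch ends up with one more commit than the rebased one, namely the merge commit, while the original feature branch keeps its original commits untouched.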

Conclusion

This brings us to the end of the article on git rebase. The graphical representations of the git commands were made with Visualizing Git. Stay tuned for more such articles on git and other open-source programs.