How does one effectively combine Ruby with Git for source control?
24/Nov 2010
This guest post is by Erik Andrejko, a software developer living in San Francisco who spends his days working on web applications and solving data mining and machine learning problems integrating Ruby, Clojure and Java. When not spending time with his family, he blogs about web development and user experience at railsillustrated.com and can found on Twitter at @eandrejko.
Introduction
Source control is one of the primary tools of a developer’s toolbox and can be one of the most powerful. Git is one of the most popular source control systems used by Ruby developers and one of the most powerful available. Following good Ruby development practices, such as good use of unit tests, and disciplined code organization allow one to easily make use of some of most powerful features of Git. With good habits, Ruby and Git are a potent combination.
This article assumes some basic familiarity with Git concepts and commands and discusses effective use of Git with Ruby projects. Git can take some time to learn well, a good introduction can be found at gitref.org and the canonical comprehensive reference can be found at Pro Git.
Use Feature Branches
Git is remarkably capable of creating separate development branches and later merging those branches. A good developer can use this to their advantage by using feature branches. A feature branch is a branch of the codebase used to implement a single feature.
To create a feature branch in Git just create a branch:
git checkout -b feature
Once you get into the habit of using feature branches it is not uncommon to have many of them simultaneously, so it is helpful to name the feature branch with some descriptive name.
Doing your development work on a feature branch provides many benefits:
- You can develop the feature over time and switch back to the master branch at any time to have a known good version of the code.
- You can abandon the feature at any time without having to carefully revert a partial implementation.
- You can change your commits on the feature branch before committing them to the master branch.
When the feature is complete you can move the commits to the master
branch with either a git merge
:
git checkout master
git merge feature
or alternatively with a git rebase
, depending on how you prefer the
history to be maintained:
git checkout master
git rebase feature
When doing a merge, an additional commit will be created on the master
branch which will show that two branches have merged at that point. Thus
the history will forever reflect the existence of your feature branch.
Some developers prefer to avoid having these merge commits in the
history. To avoid these additional merge commits use git rebase
instead, which will move the commits from the feature branch onto the
master branch without creating an additional commit keeping the history
of the master branch linear.
Rewrite to Keep Your Commits Clean
When using a feature branch it is generally a good practice to take the opportunity to rewrite the commits before placing them on the master branch. Generally, changing commits is often discouraged in Git and many Git guides advise against changing history by rewriting commits. This is good advice, as rewriting commits on a shared branch will cause headaches for other developers. However, on your private feature branch you can change any commits you like before they become published on the master branch.
To rewrite your commits, from the feature
branch use git rebase
with
the --interactive
flag:
git rebase --interactive master
which will start an interactive rebase. Using an interactive rebase will enable you to:
- Merge commits together (known as squashing).
- Change the order of the commits.
- Change the content of commits and their messages.
Another important task before publishing your feature branch is to ensure that all the unit tests pass after each commit. As we shall see later, this will make it easier to use another powerful feature of Git.
Ruby and Git play well together
There is one disadvantage to using feature branches instead of just placing all commits on master: merge conflicts. Generally Git is very adept of avoiding merge conflicts but some conflicts are unavoidable.
For example, suppose that on the feature branch we add the has_coupon?
method to the User
class:
class User
def name
@name
end
def has_coupon?
!coupons.empty?
end
end
If on the master branch, another developer were to also add a method to
User
after the name
method there will be a merge conflict when
merging or rebasing the feature branch onto the master branch. This type
of merge conflict often occurs when classes get large and are often
changed. o prevent this kind of conflict use Ruby modules to organize
the methods in the User
class.
On the feature branch change the User
object as follows:
class User
include CouponMethods
def name
@name
end
end
and add a new file coupon_methods.rb
containing the new methods:
module CouponMethods
def has_coupon?
!coupons.empty?
end
end
Since we have added a new file with we will likely avoid any merge
conflict if another new methods are added to User
in the master
branch. Of course, if two developers both add include
statements to
the top of the User
class there will again be a merge conflict.
However fixing a conflict which involves only a single line is easier
than a conflict involving many lines of method definitions.
Ruby is flexible enough to use a pattern that guarantees no merge
conflicts. If we change coupon_methods.rb
to:
module CouponMethods
def has_coupon?
!coupons.empty?
end
end
User.send(:include, CouponMethods)
then there cannot be a merge conflict as no two developers will change
the same file. Generally, it may not be a good practice to extend the
User
object in this way. It does illustrate that Ruby can be made to
play very well with Git when the need arises.
Using submodules to extend classes in a unobtrustive way is particularly useful when modifying third-party code. In these cases avoiding merge conflicts is particularly important as it is usually more difficult to resolve the conflict when the other developer is not readily available to help or when test coverage is lacking. By modifying third-party code in Ruby ouside of the original source files you can avoid the need to resolve any merge conflicts when updating that third-party code. Typically in these cases a good suite of units tests testing both the original third part code and the modified code is essential to ensure that the behavior of the updated third-party code with your changes stays the same.
When Something Goes Wrong
Bugs are inevitable and Git has several features designed to help
debugging. When a bug is discovered, the first place to look to find the
commit that introduces the bug is to use git log
to look for modified
files and search for the obvious candidates.
The git log --stat
command will show each commit message together with
a list of files modified by that commit.
git log --stat
If you want to isolate the output of git log --stat
to a particular
subdirectory or file, you can provide a path:
git log --stat lib/
It is preferable to avoid adding bugs in the first place and Git has
some tools to help. In Ruby it is common to use descriptive method names
in place of comments in the source code. Commit messages provide an
alternative source of comments for each source file. To see the commits
which are responsible for the current state of every line of code in a
particular file use the git blame
command.
git blame lib/user.rb
The first column of the output will be the SHA of the commit. To view
the commit message for this commit pass the SHA to git log
which will
show the commit message at the top of the log:
git log 1d35a63e
Often, the commit message will provide very useful information about the intent behind a particular line of code.
Sometimes the source of a bug is not found in the expected place. In
this case finding the commit which introduces a bug can be difficult.
Fortunately git provides a tool which can be incredibly useful and is
guaranteed to find the commit which introduced the bug: git bisect
.
The git bisect
command is designed to perform a binary search of the
commit history to find the first ‘bad’ commit. To start a bisect,
checkout the commit which is known to contain the bug, mark this as
‘bad’, check out any commit known not to have the bug and mark this as
‘good’:
git checkout known_bad_commit
git bisect start
git bisect bad
git checkout known_good_commit
git bisect good
After marking a commit as ‘good’, Git will start a binary search of the
commit history between the ‘good’ and ‘bad’ commit by checking out a
series of commits. After each checkout, mark the commit as ‘good’ or
‘bad’ with the git bisect
command. Eventually Git will report the
first ‘bad’ commit along with its commit message. Once completing the
bisect you can restore Git to its original state by running
git bisect reset
. Alternatively if you make a mistake marking a
particular commit or otherwise want to start over you can also use
git bisect reset
.
Bisecting a long-range of commits can be time-consuming and fortunately
this process can be automated. If you are using good testing practices,
you will likely start after identifying the bug by adding a failing
test. You can use this failing test to run git bisect automatically
using git bisect run
:
git checkout known_bad_commit
git bisect start
git bisect bad
git checkout know_good_commit
git bisect good
git bisect run ruby test/unit/unit_test.rb
Using git bisect run
will checkout a series of commits and run
ruby test/unit/unit_test.rb
after each checkout. If the test passes
the commit will be marked ‘good’, if it fails the commit will be marked
‘bad’. Assuming that you simply edit test/unit/unit_test.rb
you may
run into trouble if any of the intermediate commits also change
test/unit/unit_test.rb
as Git may not be able to resolve the conflict.
In this case, you can temporarily add the failing test to a new file
before using git bisect run
.
It is should now be clear why it is important to keep the unit tests
running after each commit. If not, using git bisect run
may find the
wrong commit.
Any program can be used with git bisect run
, not just a Ruby test. For
some bugs it may not be possible to add a failing test because not
enough information about the source of the bug is known. In this case,
as long as the bug can be detected by some automated means you can still
use git bisect run
to find it. To use any Ruby program with
git bisect run
use the exit
method to pass the correct state back to
Git. f the test passes: use exit(0)
to exit the Ruby program and
indicate to git bisect run
the commit is ‘good’, if the test fails use
exit(1)
to exit the Ruby program and indicate to git bisect run
that
the commit is bad.
Git bisect is one of the best features of Git and has been used to find
some difficult to trace bugs. Even better, by automating the search with
git bisect run
, no matter where the bug was introduced, it will be
found.
I hope you found this article valuable. Feel free to ask questions and give feedback in the comments section of this post. Thanks!
If you are new to Git then the Using Git & GitHub eBook would be useful to you.