Working with Git

By Nathan Donaldson in Development Other on July 05, 2010


During a brief slow period on a Friday afternoon I started pondering how much work I actually do, and whether that would even be useful to know. All our code is stored in a version control system (git), so in a way all of the data for measuring the quantity of work is readily available. After a little investigation I found that it’s quite easy to pull a list of commits from git showing the lines added and deleted per file:

git log --oneline --numstat
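For illustration, here is a minimal sketch of tallying that output in Ruby. The method name parse_numstat and the sample log are made up for the example. It assumes each numstat line is tab-separated as added, deleted, path, and that binary files are reported with "-" in place of the counts.

```ruby
# Sketch only: sums added/deleted lines per file from the text printed by
# `git log --oneline --numstat`. SAMPLE_LOG is invented data, not real output
# from any repository.
NUMSTAT_LINE = /\A(\d+|-)\t(\d+|-)\t(.+)\z/

def parse_numstat(log_text)
  totals = Hash.new { |hash, path| hash[path] = { added: 0, deleted: 0 } }
  log_text.each_line do |line|
    match = NUMSTAT_LINE.match(line.chomp)
    next unless match        # skip commit summary lines and blank lines
    added, deleted, path = match.captures
    next if added == "-"     # binary files carry no line counts
    totals[path][:added] += added.to_i
    totals[path][:deleted] += deleted.to_i
  end
  totals
end

SAMPLE_LOG = <<~LOG
  a1b2c3d Add user model
  10\t2\tapp/models/user.rb
  3\t0\tapp/views/users/show.haml
  e4f5a6b Tidy user model
  1\t4\tapp/models/user.rb
  -\t-\tpublic/images/logo.png
LOG

parse_numstat(SAMPLE_LOG).each do |path, counts|
  puts "#{path}\t+#{counts[:added]}\t-#{counts[:deleted]}"
end
```

Against a real repository the input could come from something like `git log --oneline --numstat --author=you@example.com`, where the address is a placeholder.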

I’ve committed a lot of code that I didn’t write, such as plugins and the Rails framework itself. So one quick and dirty Ruby script later, I had a list of every unique file in every repository that I’ve committed to. It was pretty easy to go through that list and build an exclusion list. I then broke out Ruport to aggregate everything by extension. That gave me the following table:

I’ve cleaned this up a little and collapsed some alternative extensions down.
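The exclude-then-aggregate step can be sketched in plain Ruby (not Ruport). Everything named here is hypothetical: the EXCLUDES patterns, the EXTENSION_ALIASES map used to collapse alternative extensions, the totals_by_extension method and the sample data.

```ruby
# Sketch only: drops excluded paths, then sums line counts per file extension.
# The patterns, aliases and sample figures below are invented for illustration.
EXCLUDES = [%r{\Avendor/}, %r{\Apublic/javascripts/jquery}].freeze
EXTENSION_ALIASES = { ".rhtml" => ".erb", ".htm" => ".html" }.freeze

def totals_by_extension(file_totals, excludes: EXCLUDES, aliases: EXTENSION_ALIASES)
  result = Hash.new { |hash, ext| hash[ext] = { added: 0, deleted: 0 } }
  file_totals.each do |path, counts|
    next if excludes.any? { |pattern| pattern.match?(path) }
    ext = File.extname(path)
    ext = "(none)" if ext.empty?   # e.g. Rakefile
    ext = aliases.fetch(ext, ext)  # collapse alternative extensions
    result[ext][:added] += counts[:added]
    result[ext][:deleted] += counts[:deleted]
  end
  result
end

SAMPLE_FILES = {
  "app/models/user.rb"         => { added: 11,  deleted: 6 },
  "app/views/users/show.rhtml" => { added: 3,   deleted: 0 },
  "app/views/users/edit.erb"   => { added: 5,   deleted: 1 },
  "vendor/plugins/foo/init.rb" => { added: 200, deleted: 0 },
  "Rakefile"                   => { added: 2,   deleted: 2 }
}.freeze

totals_by_extension(SAMPLE_FILES).each do |ext, counts|
  puts "#{ext}\t+#{counts[:added]}\t-#{counts[:deleted]}"
end
```

Regular expressions keep the exclusion list short, since one pattern such as `\Avendor/` covers a whole directory of third-party code.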

Commits per week

Just over 110,000 lines added and 50,000 deleted, of which about 100,000 are to Ruby files. Now I’m not claiming to have written all those lines myself; any change to any part of a line counts towards the total. All this does is illustrate the general balance of work that I do. There have been roughly two lines added for every line deleted. This year has seen a lot of refactoring work, so it’ll be interesting to run the same exercise next year and see if the results are similar (of course git holds historical data, but we only started using it about 18 months ago, and previously had everything stored in Subversion).

It’s interesting to see that the proportion of additions to deletions is much higher in view (rhtml/haml) files than in Ruby code. This could suggest that the way things look is changed much more than the way things work.

Now if only there was a way to measure the quality of work. (Actually there are tools; metric_fu is a good starting point and we use it a lot at Boost. However, that’s going a little too far for this post.)

Another interesting bit of data I extracted from git is the number of commits I’ve done per week over the last 52 weeks.
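One way to get such a count, sketched under assumptions: the input is one commit date per line in YYYY-MM-DD form, as `git log --since="52 weeks ago" --date=short --format=%ad` would print, and weeks are grouped by ISO week number. The commits_per_week method and the sample dates are made up for the example.

```ruby
require "date"

# Sketch only: counts commits per ISO week from one YYYY-MM-DD date per line.
# SAMPLE_DATES is invented data, not taken from a real repository.
def commits_per_week(date_lines)
  counts = Hash.new(0)
  date_lines.each_line do |line|
    text = line.strip
    next if text.empty?
    week = Date.iso8601(text).strftime("%G-W%V") # ISO year and week, e.g. 2010-W26
    counts[week] += 1
  end
  counts.sort.to_h
end

SAMPLE_DATES = <<~DATES
  2010-07-02
  2010-06-30
  2010-06-28
  2010-06-25
  2010-06-21
DATES

commits_per_week(SAMPLE_DATES).each do |week, count|
  puts "#{week}\t#{count}"
end
```

Weeks with no commits simply do not appear in the result, so a chart built from it would need those gaps filled with zeros.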

I’ve posted my script as a GitHub gist. You can run it by setting the @repositories array to a list of git repositories, @author to your email address and @excludes to a list of regular expressions for excluding files. Run the script as ruby gitcount.rb. If it is run with the argument “files” it will list individual files, which makes it easier to build the exclude list.