Progress through the years, and Ruby awesomeness

We’ve all noticed that when you look back at code you wrote years ago, you realize how much you have progressed (or how much you sucked in the past, depending on your perspective), so it’s a fair assumption that in the future you will look back at the code you are working on now and feel ashamed. Perhaps this becomes logarithmically less of an issue as time goes on, but I still experience it.

I want to share what I think is a good example of this. Ruby code is particularly special in this respect, since there are always tons of ways to write code that does exactly the same thing.

Back in 2007 when I started msn-pecan, I wanted a script to create a ChangeLog, not because I thought it was needed, but mainly to shut up the people that said the git log wasn’t good enough, and that you still needed to update the ChangeLog in every commit. Few of those people remain today, but my script is still here.

The objective was to output something like this:

== Release 0.0.2 ==

2007-11-30	Felipe Contreras 
	* Add readme.
	* Update copyright stuff.
	* Fix to allow aliases with special characters.

== Release 0.0.1 ==

2007-11-29	Felipe Contreras 
	* Fix warnings.
	* Add display name support.

2007-11-28	Felipe Contreras 
	* Add .gitignore file.
	* Update info.
	* Fix compilation.
	* Stuff required to compile.
	* Initial import.

For some reason I was forced to look at the script again:

#!/usr/bin/env ruby

require "stringio"

revs=`git rev-list master`
refs=`git for-each-ref --format="%(objectname) %(refname)" refs/tags`

commits = revs.map { |e| e.chomp }

tags = {}

list = {}
list_order = []
$tag = ""

refs.each do |l|
  tag, name = l.chomp.split(" ")
  name = name.slice(/refs\/tags\/v(.*)/, 1)
  tags[tag] = name
end

commits.each do |e|
  commit = StringIO.new(`git show --pretty="format:%an ::%ai::%s\n%b" --quiet #{e}`)
  author, date, subject = commit.readline.chomp.split("::")
  date = date.split(" ")[0]
  if tags[e]
    $tag = tags[e]
    list_order << {:id => $tag}
  end
  id = "#{date}\t#{author}"
  uid = "#{date}\t#{author}\t#{$tag}"
  if not list[uid]
    list[uid] = []
    list_order << {:id => id, :value => list[uid]}
  end
  list[uid] << subject
end

list_order.each do |i|
  id = i[:id]
  value = i[:value]

  if value
    puts "#{id}"
    puts value.map { |e| "\t* #{e}" }.join("\n")
    puts "\n"
  else
    puts "== Release #{id} ==\n\n"
  end
end

If you are familiar with Ruby, you will see that there are tons of things that don’t make sense there: unnecessary complexity, simpler idioms that go unused, and so on. So I decided to clean it up:

#!/usr/bin/env ruby

require 'date'

tags = {}
tag = nil
commits = []

class Commit
  attr_reader :author, :date, :tag

  def initialize(id, subject, author, date, tag)
    @id = id
    @author = author
    @date = Date.parse(date)
    @subject = subject
    @tag = tag[1..-1] if tag
  end

  def to_s
    return @subject
  end

end

open("|git for-each-ref --format='%(objectname) %(refname:short)' refs/tags").each do |l|
  commit, id = l.chomp.split(" ")
  tags[commit] = id
end

open("|git log --oneline --format='%H%x00%an %x00%ai%x00%s'").each do |l|
  commit, author, date, subject = l.chomp.split("\0")
  tag = tags[commit] if tags[commit]
  commits << Commit.new(commit, subject, author, date, tag)
end

commits.group_by(&:tag).each do |tag, commits|
  puts "== Release #{tag} ==\n\n"
  commits.group_by(&:date).sort.reverse.each do |date, commits|
    commits.group_by(&:author).each do |author, commits|
      puts "#{date}\t#{author}\n"
      puts commits.map { |e| "\t* #{e}" }.join("\n")
      puts "\n"
    end
  end
end

So we went from code that took more than 10s to complete to code that barely takes 100ms. It’s also much easier to read, smaller, and actually more correct (there was a bug in sorted dates). Not bad!

I was particularly surprised by the method group_by, which takes any Enumerable and does what the name says:

[
  {:id => 0, :name => 'foo'},
  {:id => 0, :name => 'bar'},
  {:id => 1, :name => 'roo'} ].group_by { |e| e[:id] }
=> {
  0=>[{:id=>0, :name=>"foo"}, {:id=>0, :name=>"bar"}],
  1=>[{:id=>1, :name=>"roo"}] }

And then, in the functional tradition, you can sort, reverse, etc. the result.
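To make the chaining concrete, here is a minimal sketch of grouping by date and then sorting newest first. The Entry struct and the sample data are made up for illustration; they only stand in for the parsed commits of the script.

```ruby
require 'date'

# Made-up sample data standing in for parsed commits.
Entry = Struct.new(:date, :subject)

entries = [
  Entry.new(Date.new(2007, 11, 28), 'Initial import.'),
  Entry.new(Date.new(2007, 11, 30), 'Add readme.'),
  Entry.new(Date.new(2007, 11, 28), 'Fix compilation.'),
  Entry.new(Date.new(2007, 11, 29), 'Fix warnings.')
]

# group_by returns a Hash; sort turns it into [key, values] pairs ordered
# by key, and reverse puts the newest date first.
newest_first = entries.group_by(&:date).sort.reverse

newest_first.each do |date, group|
  puts "#{date}: #{group.map(&:subject).join(' ')}"
end
```

Because the keys are real Date objects rather than strings, the sort is chronological, and within each group the entries keep the order they had in the input.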

Moreover, instead of capturing the full output of a command with `git command`, it’s much better to read it through a pipe with open("|git command"); it’s something obvious, but a bit tedious to do in other languages.
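The same idea can be sketched with IO.popen, which reads from the command through a pipe as well; as far as I know, newer Ruby releases discourage the leading-pipe form of open in favor of it. The helper name each_output_line is hypothetical and not part of the script above.

```ruby
# Hypothetical helper (not from the original script): runs a command and
# yields its output one line at a time instead of buffering all of it.
def each_output_line(*command)
  # The array form of IO.popen bypasses the shell, so arguments need no quoting.
  IO.popen(command) do |io|
    io.each_line { |line| yield line.chomp }
  end
end

# Usage inside a git repository (left as a comment so the sketch runs anywhere):
#   each_output_line('git', 'log', '--format=%H') { |sha| puts sha }
```

The block form of IO.popen also closes the pipe when the block finishes, so there is nothing to clean up by hand.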

The rest is git magick–I’ve been progressing on that front as well :)

Did I mention I love Ruby?
