Escaping The Tar Pit: Introducing Skunk v0.3.2 at RubyConf 2019
This year I had the honor to speak at RubyConf in Nashville.
It was my second time attending the conference and my first time as a speaker. I presented skunk, a gem that calculates the SkunkScore of a module or set of modules.
Since its inception, skunk has changed quite a bit based on real usage in our productized service for Rails upgrades. As a matter of fact, the night before my talk I realized there was a BIG error in our formula. Here is a description of the problem and the solution.
I was doing some tests the night before my talk and noticed unexpected results with Skunk v0.3.0. In my example, I wanted to show that simplifying a messy piece of code would improve the skunk score average.
The problem was that no matter how much I improved the code, the skunk score average only went up. The metric was lying to me. (And I was like WTF?!?!)
This is the project that I used for my example: https://github.com/fastruby/gggiiifff.com. It is an open source Roda application that consumes the Giphy API to find the GIFs you are looking for.
When I ran skunk on this project, I got this summary:

```
SkunkScore Total: 0.71
Modules Analysed: 6
SkunkScore Average: 0.11833333333333333
Worst SkunkScore: 0.71 (models/query.rb)
```
In my test, I was trying to reduce complexity for the most complicated file (models/query.rb) by removing code from that file. (Yes, I know that if I remove those lines the application stops working, but it was a change to prove a concept.)
In theory, removing methods from a module should improve its skunk score, because then there is less code to maintain. (Read more here to find out how it works.)
In practice, I was seeing the SkunkScore get worse with every change:

```
Base branch (master) average skunk score: 62.771
Feature branch (skunk-v-0-3-0) average skunk score: 62.791
Score: 62.79
```
The formula for AnalysedModule#skunk_score in Skunk v0.3.0 looks like this:

```ruby
def skunk_score
  return churn_times_cost.round(2) if coverage == PERFECT_COVERAGE

  (churn_times_cost * (PERFECT_COVERAGE - coverage.to_i)).round(2)
end

# @return [Integer]
def churn_times_cost
  safe_churn = churn.positive? ? churn : 1
  @churn_times_cost ||= safe_churn * cost
end
```
The problem with this is that churn should not count against you when you are trying to improve the skunk score: churn carries too much weight in the calculation. If we were to keep churn in the formula, we would have to make that value weigh much less.
An Example: Reducing Complexity
It might be easier to see this with an example. Let's say that we have a module with a cost of 1000 and code coverage of 50%. Now let's make 7 changes to the module, reducing the cost by 100 with each change.
This is what the skunk_score over time looks like:
Every change, even though it reduces the cost by 100, actually makes the skunk score worse.
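We can reproduce this with a small script. It is a sketch, not skunk's actual code: the `skunk_score_v030` helper is a hypothetical, standalone version of the v0.3.0 formula above, and it assumes that each change is one commit, so churn goes up by 1 every time cost goes down by 100:

```ruby
PERFECT_COVERAGE = 100

# Standalone sketch of the Skunk v0.3.0 formula: churn multiplies cost.
def skunk_score_v030(churn:, cost:, coverage:)
  safe_churn = churn.positive? ? churn : 1
  churn_times_cost = safe_churn * cost
  return churn_times_cost.round(2) if coverage == PERFECT_COVERAGE

  (churn_times_cost * (PERFECT_COVERAGE - coverage.to_i)).round(2)
end

# 7 changes: each one lowers cost by 100 and, being a commit, bumps churn by 1.
# Coverage stays at 50% the whole time.
scores = (0..7).map do |change|
  skunk_score_v030(churn: 1 + change, cost: 1000 - 100 * change, coverage: 50)
end

p scores
# => [50000, 90000, 120000, 140000, 150000, 150000, 140000, 120000]
```

Even after all 7 changes, the score (120000) is still far worse than where we started (50000), because the growing churn factor overwhelms the shrinking cost.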
So, the night before my RubyConf talk, I decided it would be best to remove churn from the skunk score formula.
This is what AnalysedModule#skunk_score looks like in Skunk v0.3.1:

```ruby
def skunk_score
  return cost.round(2) if coverage == PERFECT_COVERAGE

  (cost * (PERFECT_COVERAGE - coverage.to_i)).round(2)
end
```
It is pretty straightforward: there is only one penalty factor, which is relative to the lack of code coverage in the module.
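To see the penalty in action, here is a quick check of the new formula with hypothetical numbers (a cost of 120, first at 75% and then at 100% coverage):

```ruby
PERFECT_COVERAGE = 100

# Standalone sketch of the Skunk v0.3.1 formula: cost, penalized by the
# percentage points of missing coverage.
def skunk_score(cost:, coverage:)
  return cost.round(2) if coverage == PERFECT_COVERAGE

  (cost * (PERFECT_COVERAGE - coverage.to_i)).round(2)
end

puts skunk_score(cost: 120, coverage: 75)  # 120 * (100 - 75) = 3000
puts skunk_score(cost: 120, coverage: 100) # perfect coverage: just the cost, 120
```

With perfect coverage the skunk score collapses to the module's cost, so the only way it gets worse is if the code itself gets more complex.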
Using this new formula and the same example, this is what the skunk_score over time looks like:
Now that looks more like what I expected!
Code Coverage Improvements
What does it look like if we make 4 changes to increase code coverage by 10% in every change?
If we are using version 0.3.0, this is what the skunk_score looks like:
If we are using version 0.3.1, this is what the skunk_score looks like:
Now the skunk score average makes more sense. 🤓
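This scenario can also be simulated with a small sketch. Both helpers below are hypothetical, standalone versions of the formulas shown earlier, and the simulation assumes each coverage improvement is one commit, so churn goes up by 1 under v0.3.0:

```ruby
PERFECT_COVERAGE = 100

# Sketch of the Skunk v0.3.0 formula: churn multiplies cost.
def skunk_score_v030(churn:, cost:, coverage:)
  safe_churn = churn.positive? ? churn : 1
  churn_times_cost = safe_churn * cost
  return churn_times_cost.round(2) if coverage == PERFECT_COVERAGE

  (churn_times_cost * (PERFECT_COVERAGE - coverage.to_i)).round(2)
end

# Sketch of the Skunk v0.3.1 formula: churn removed.
def skunk_score_v031(cost:, coverage:)
  return cost.round(2) if coverage == PERFECT_COVERAGE

  (cost * (PERFECT_COVERAGE - coverage.to_i)).round(2)
end

# 4 changes, each raising coverage by 10% (50% -> 90%). Cost stays at 1000;
# under v0.3.0 each change is a commit, so churn climbs as well.
v030_scores = (0..4).map { |i| skunk_score_v030(churn: 1 + i, cost: 1000, coverage: 50 + 10 * i) }
v031_scores = (0..4).map { |i| skunk_score_v031(cost: 1000, coverage: 50 + 10 * i) }

p v030_scores # => [50000, 80000, 90000, 80000, 50000] -- climbs before it falls
p v031_scores # => [50000, 40000, 30000, 20000, 10000] -- falls with every change
```

Under v0.3.0, the first two coverage improvements actually make the score worse; under v0.3.1, every improvement moves the score in the right direction.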
Fortunately, I managed to fix my slides in time for my talk! You can watch my presentation at RubyConf here:
I believe there is a lot of value in comparing skunk scores between one branch and another. Getting out of the tar pit is no easy feat. You want to make sure that you're always making changes that get you one step closer to a truly easy-to-maintain project.
In the roadmap for Skunk, I'm really excited about a feature that will keep track of the skunk score average over time. This metric can be our compass in this adventure out of the tar pit. If you see the average going up over time, you will know there is something wrong with your team.
In order to keep improving the tool, I plan to run it using well-established, open source, Ruby and Rails codebases. Which skunk score do you want to see first? Let me know in the comments below! ❤️
The presentation I gave at SolidusConf was based on Skunk v0.3.0, so there is a mistake in this presentation:
I'm sorry about that, @SolidusConf and attendees! The good news is that it is a small mistake. It doesn't change much about the core of my talk.
Here are some of the resources that I used for this article:
- Skunk v0.3.0: https://github.com/fastruby/skunk/tree/v0.3.0
- Skunk v0.3.1: https://github.com/fastruby/skunk/tree/v0.3.1
- gggiiifff.com repository using Skunk v0.3.0: https://github.com/fastruby/gggiiifff.com/tree/skunk-v-0-3-0
- gggiiifff.com repository using Skunk v0.3.2: https://github.com/fastruby/gggiiifff.com/tree/skunk-v-0-3-2
- Skunk Score Playground (Spreadsheet): https://docs.google.com/spreadsheets/d/1yQ2pI5J9XhpI4G884ndIH1NR0g-JCjoitpXS4yaZGFA/edit?usp=sharing