Every time I refactor some code, I try to decrease the total number of lines. Every time I have at least 2 identical pieces of code, I try to extract them into a method. Somehow, every time I do so, I think the total code will decrease by 10 or 20 lines, only to find out the number of lines is pretty much unchanged. Why does that happen?
Let's say you have some code. Let's say some parts of that code are identical. You would like to take those lines of code, put them into a single function and call that function instead. Let's call this process "code compression". Some people call it "extract method", but "compressing code" is actually more generic, since you might also be deleting useless code. For this post though, "code compression" and "extract method" basically mean the same thing.
What happens when you compress code? Ideally you would like the total number of lines to decrease, so you have less to maintain. But you also have to add some lines, right? You have the function definition, maybe a line to close the function block, maybe some comments, some empty lines, and maybe the code is not trivial to compress, so you have to slightly edit it. These are only some of the factors that fight against you when you try to compress code, so I'll call them "waste".
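So when is it actually good to compress code? Well, if you have $n$ identical pieces of code, each $l$ lines long, and extracting them into a function adds $w$ lines of waste, then you want

$$(n - 1)(l - 1) \ge w + 1$$

As long as this inequality holds, compressing will not increase the total number of lines, and when it holds strictly you'll actually reduce code.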
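As a quick way to try the rule on real numbers, here is a minimal Python sketch. It assumes the inequality has the form $(n - 1)(l - 1) \ge w + 1$, which is consistent with the worked numbers later in this post. The function name and the parameter names `n`, `l` and `w` are mine, chosen for illustration.

```python
def extraction_pays_off(n: int, l: int, w: int) -> bool:
    """Return True when extracting a function does not increase the line count.

    n: number of places where the identical code appears (assumed name)
    l: length of each identical piece, in lines (assumed name)
    w: lines of "waste" added by the new function (assumed name)
    """
    return (n - 1) * (l - 1) >= w + 1


print(extraction_pays_off(n=6, l=2, w=4))  # True  (5 * 1 >= 5)
print(extraction_pays_off(n=5, l=2, w=4))  # False (4 * 1 <  5)
```

A result of `True` includes the break-even case, where the total number of lines stays exactly the same.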
Let's see where the formula actually comes from.
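Again, we have $n$ identical pieces of code, each $l$ lines long, and $w$ lines of waste for the new function.
Before compressing it we have $n \cdot l$ lines. After compressing it we have $n + l + w$ lines: one line for each of the $n$ calls, $l$ lines for the function body, and $w$ lines of waste.
We want the code to decrease in lines, or at least to stay the same, so we want

$$n + l + w \le n \cdot l$$

With some manipulations we have

$$(n - 1)(l - 1) \ge w + 1$$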
Which is the formula from before.
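For anyone who wants the "some manipulations" spelled out, here is one way to get there. It assumes each call to the new function takes exactly one line, and uses $n$ for the number of places, $l$ for the lines per piece and $w$ for the waste (symbol names are my choice).

```latex
\begin{align*}
n + l + w &\le n l \\
w &\le n l - n - l \\
w + 1 &\le n l - n - l + 1 \\
w + 1 &\le (n - 1)(l - 1)
\end{align*}
```

The only trick is adding 1 to both sides so that the right-hand side factors.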
Ok, so now we have a formula, but what is it actually telling us?
Let's try plugging in some numbers.
Say we want to compress some pieces of code, each 2 lines long.
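When I write code, a new function adds 4 lines of waste (2 empty lines for readability, 1 for the function declaration and block start, and 1 for the block end).
How many places do I need to compress to decrease the total number of lines? In other words, with $l = 2$ lines per piece and $w = 4$ lines of waste, what is the smallest number of places $n$ that satisfies $(n - 1)(l - 1) \ge w + 1$? Substituting gives $(n - 1) \cdot 1 \ge 5$, so $n \ge 6$.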
So I would need to call that function at least 6 times before it's actually worth it.
This is... strangely high, right?
Let's see if it's correct.
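Lines of code before compression = $6 \cdot 2 = 12$. Lines of code after compression = $6 + 2 + 4 = 12$. So 6 calls is exactly the break-even point, and only from the seventh call onward does the total actually shrink.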
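The same check can be done by counting lines directly. This is a small sketch that assumes each call to the extracted function takes one line; the function names are hypothetical.

```python
def lines_before(n: int, l: int) -> int:
    # n copies of the same l-line piece
    return n * l


def lines_after(n: int, l: int, w: int) -> int:
    # n one-line calls + the l-line function body + w lines of waste
    return n + l + w


# 2-line pieces with 4 lines of waste, as in the example above
for n in (5, 6, 7):
    print(n, lines_before(n, 2), lines_after(n, 2, 4))
# 5 10 11
# 6 12 12
# 7 14 13
```

With 5 places the extraction costs a line, with 6 it breaks even, and with 7 it saves one.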
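Now the first realization. What if the piece of code I want to compress is 6 lines long? Well, the formula is symmetric, so I can switch the number of places and the number of lines: a 6-line piece of code only needs to appear in 2 places to break even.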
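What about single-line functions? Well, here's the problem: you will always increase the number of lines. With $n$ places, $l = 1$ line per piece and $w$ lines of waste, the condition $(n - 1)(l - 1) \ge w + 1$ becomes $0 \ge w + 1$, that is $w \le -1$.
Which is impossible since we know $w \ge 0$: waste can't be negative.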
So every time you have a function with 1 line of code, you are increasing the number of lines of the whole program. Maybe it's not such a good idea to have many one-liners, right?
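To experiment with other sizes, here is a rough Python sketch that solves $(n - 1)(l - 1) \ge w + 1$ for the smallest number of places $n$. The function name `min_call_sites` and the parameter names are hypothetical. It returns `None` for one-liners, since no number of calls can ever pay those back under this model.

```python
import math
from typing import Optional


def min_call_sites(l: int, w: int) -> Optional[int]:
    """Smallest n with (n - 1) * (l - 1) >= w + 1, or None if no n works.

    l: lines in the piece of code to extract (assumed name)
    w: lines of waste added by the new function (assumed name)
    """
    if l <= 1:
        # The left side is 0 for one-liners, and w + 1 is always positive.
        return None
    # Rearranged: n >= 1 + (w + 1) / (l - 1), rounded up to a whole number.
    return 1 + math.ceil((w + 1) / (l - 1))


print(min_call_sites(l=2, w=4))  # 6
print(min_call_sites(l=6, w=4))  # 2
print(min_call_sites(l=1, w=4))  # None
```

The first two results show the symmetry: 2-line pieces need 6 places, and 6-line pieces need 2.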
I actually suggest trying this formula for your specific code. How many lines of waste do you create when you add a function? How many times do you have to call a 2-line function before the number of lines decreases? What can you change about your writing to improve the situation? You might find some really interesting results.