Skip to content

Code Quality Principles

dwightguth edited this page Mar 13, 2014 · 2 revisions

There seems to be some disagreement with respect to what constitutes "clean" and "high quality" code. I would like to clarify my own beliefs which form the foundation of my code reviews. To this end, I have created the following set of foundational principles to elucidate what I believe good code is.

1. Code is designed to be read and changed

What this means is simple, but its implications are profound and long-reaching. Ultimately, in this one sentence, I have captured the single highest priority we must hold for what constitutes good code. Good code is easy to read and easy to change. Obviously functional correctness is vital, but good code achieves functional correctness according to sound design principles. Most bad code is not functionally wrong, simply designed improperly. This type of flaw eventually leads to functional incorrectness because of unmaintainability and sloppy design, but ultimately the functional incorrectness is usually a secondary result of improperly implemented functionally correct code evolving over time.

All the following principles eventually flow logically from this one foundation, once we properly consider this first principle and operationalize it.

2. Good code is modular

This is the primary way that we as developers act to achieve the goal that good code is easy to change. A modular change is one which does not duplicate information. Consider the following two code samples:

public Foo(Integer bar) { this.bar = bar; }

...

new Foo(null);
new Foo(1);
new Foo(null);

versus:

public Foo() { 
    this(null);
}
public Foo(Integer bar) { this.bar = bar; }

...

new Foo();
new Foo(1);
new Foo();

These two code samples are functionally equivalent. A Foo can have an integer, or it can not have an integer. Viewed only from the perspective of functional correctness, one might make arguments in favor of either one of these over the other.

However, considered through the principle that Good code is easy to change, the second design is clearly advantageous over the first. Imagine that bar is a 0-based index of a word in a string, and null represents the entire string together. Then one day we decide we want the entire string to be represented by the number -.

With the first of these two code fragments, doing this requires the following diff:

-new Foo(null);
+new Foo(-1);
new Foo(1);
-new Foo(null);
+new Foo(-1);

With the second of these two code fragments, the /second/ of these two code changes becomes only 1 line of code:

-    this(null);
+    this(-1);

Viewed in this context, the second is clearly cleaner code than the first. Obviously this is a toy example, but it represents a real issue in code. Imagine if we really wanted to make a change to a default value that was passed a hundred different times in a hundred different boilerplate method invocations? A disaster, that takes forever to write and is hard to code review. If we assume that code should be easy to read and easy to change, then suddenly our code fails both of these tests.

3. Good code is self-explanatory

This is the primary way that we as developers act to achieve the goal that good code is easy to read. Obviously some comments are fundamentally necessary, and commenting code is a great way of making it easier to understand. But not all code that is hard to read should be explained with a comment. Sometimes the question to ask is, why is the code hard to read?

Examples where this occurs include bad variable names and indecipherable expressions. To understand the first of these, consider the following two examples:

private static final int I = 5;

...

if (foo.getDate().isAfter(bar.plusDays(I) && baz) { 
   //do something
}

Here we have some code that compares two dates, foo and bar. But what the hell is it doing? I have no idea! It runs the body of the if clause if foo is more than 5 days after bar, and baz is true, but why? What's going on? I can't tell.

Now read the code again with better variable names:

private static final int DAYS_IN_DEFAULT_TO_EMAIL = 5;

if (today.getDate().isAfter(dateLastPaymentWasDue.plusDays(DAYS_IN_DEFAULT_TO_EMAIL)) && customerIsUnpaid) {
  //do something
}

Suddenly I can tell exactly what the developer was thinking when they wrote this: they wanted to send an email five days after a late payment. Anyone reading this can understand that, no comment necessary.

The second problem is just as bad. Consider the case where we are trying to construct an AST:

public Node(List<Node> children, String val) {
   this.children = children;
   this.val = val;
}

...

new Node(Arrays.asList(new Node(new ArrayList<>(), "foo"), new Node(new ArrayList<>(), "bar")), "baz");

You can probably read this AST, because it's still quite simple, but it's difficult for two reasons. The first is that we have a bunch of boilerplate array construction code in this expression making it harder to see the signal for the noise. And the second is because there is code which is nonlocal in nature. "baz" pertains to the root node, but is at the end of the statement, completely separate from the rest of the code that constructs this node. If we had hundreds of nested subtrees in this AST, it would be very hard for me to place each string with the node it corresponds to.

Now we change the API just slightly, and watch what happens:

public Node(String val, Node... children) {
    this.val = val;
    this.children = Arrays.asList(children);
}

new Node("baz", new Node("foo"), new Node("bar"));

The code got shorter, easier to read, and more modular. Clearly a profound advantage.

4. Good code takes time and requires ownership

This final piece is an addendum which explains some priorities that are important if we want to be able to achieve goals 1-3. Most significant of these is, don't be lazy. "It would be more work" or "we haven't done it that way before" is almost never a valid reason not to do something. If a change would require a large-scale refactoring or redesign in order to be effective, and therefore would delay the accomplishment of the primary task by a figure of several weeks or months, this is less true. But if you can take only a couple extra days to achieve cleaner code, don't wait. Do it as part of the commit that writes the feature.

More significantly, do it even if it requires making a modification to someone else's code. Code review exists in order to help us in this case. Make the change to the other person's code, and submit it for code review. Make sure to tag the owner of the code so they know to review it (for example @dwightguth). But don't be afraid to make changes outside the immediate scope of your change if it's needed for your code to be clean. If there's a problem with the change and its larger implications, that's what the code review process is for. A good owner of code accepts patches to their code by others when it is clearly needful.

One final note here as a warning, however: making a change to someone else's code to generalize or add a feature that is needed by your own code makes sense however. Making changes to someone else's code that introduce elements of your own domain into code properly treating another domain does not. This is a violation of principle #2: Good code is modular. Ultimately you need to work with the owner of the code you depend on to figure out the right general way to provide the functionality you need to make your own code work, without bogging other parts of the codebase down in details they should not be privy to. But as long as the changes you make to core libraries are general and reusable, go for it!

References

Please note I have not had the opportunity to read or look over these texts. I reserve the right to disagree with the principles therein if I am able to construct a meaningful example which demonstrates the reason why I believe my own approach is superior. However, if you want to read these books and follow their principles, I am perfectly willing to engage in discussions about specific portions of these books as part of a disagreement about design in a code review.