*(This document is the result of a series of discussions, some online and
some in person, held between Robert "Uncle Bob" Martin and John Ousterhout between
September 2024 and February 2025. If you would like to comment on anything
in this discussion, we recommend that you do so on the [Google group
associated with APOSD](https://groups.google.com/g/software-design-book))*

## Introductions

**JOHN:**

Hi (Uncle) Bob! You and I have each written books on software design.
We agree on some things, but there are some pretty big differences of
opinion between my recent book *A Philosophy of Software Design*
(hereafter "APOSD") and your classic book *Clean Code*. Thanks for
agreeing to discuss those differences here.

**UB:**

My pleasure, John. Before we begin, let me say that I've carefully read through your book and found it very enjoyable and full of valuable insights. There are some things I disagree with you on, such as TDD and Abstraction-First incrementalism, but overall I enjoyed it a lot.

**JOHN:**

I'd like to discuss three topics with you: method length, comments,
and test-driven development. But before getting into these,
let's start by comparing overall philosophies. When you hear about a
new idea related to software design, how do you decide whether or not
to endorse that idea?

I'll go first. For me, the fundamental goal of software design is
to make it easy to understand and modify the system. I use the term
"complexity" to refer to things that make it hard to understand and
modify a system. The most important contributors
to complexity relate to information:

* How much information must a developer have in their head in order to carry out a task?
* How accessible and obvious is the information that the developer needs?

The more information a developer needs to have, the harder it will be
for them to work on the system. Things get even worse if the required
information isn't obvious. The worst case is when there is a crucial
piece of information hidden in some far-away piece of code
that the developer has never heard of.

When I'm evaluating an idea related to software design, I ask whether
it will reduce complexity. This usually means either reducing the amount
of information a developer has to know, or making the required information
more obvious.

Now over to you: are there general principles that you use when deciding
which ideas to endorse?

**UB:**

I agree with your approach. A discipline or technique should make the job of programmers easier. I would add that the programmer we want to help most is not the author. The programmer whose job we want to make easier is the programmer who must read and understand the code written by others (or by themself a week later). Programmers spend far more hours reading code than writing code, so the activity we want to ease is that of reading.

## Method Length

**JOHN:**

Our first area of disagreement is method length.
On page 34 of *Clean Code* you say "The first rule of functions is that
they should be small. The second rule of functions is that
*they should be smaller than that*." Later on, you say "Functions
should hardly ever be 20 lines long" and suggest that functions
should be "just two, three, or four lines long". On page 35, you
say "Blocks within `if` statements, `else` statements, `while` statements,
and so on should be one line long. Probably that line should be a function
call." I couldn't find anything in *Clean Code* to suggest that a function
could ever be too short.

I agree that dividing up code into relatively small units ("modular design")
is one of the most important ways to reduce the amount of information a
programmer has to keep in their mind at once. The idea, of course, is to take a
complex chunk of functionality and encapsulate it in a separate method
with a simple interface. Developers can then harness the functionality
of the method (or read code that invokes the method) without learning
the details of how the method is implemented; they only need to learn its
interface. The best methods are those that provide a lot of functionality
but have a very simple interface: they replace a large cognitive load
(reading the detailed implementation) with a much smaller
cognitive load (learning the interface). I call these methods "deep".
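
To make this concrete, here is a minimal sketch of my own; the names and the
file-reading example are invented for illustration and are not code from either
book:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class DepthSketch {
  // Deep: a simple interface (a path in, the text out) hides opening,
  // buffered reading, character decoding, and cleanup.
  static String readWholeFile(Path file) throws IOException {
    StringBuilder text = new StringBuilder();
    try (BufferedReader reader =
             Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      String line;
      while ((line = reader.readLine()) != null)
        text.append(line).append('\n');
    }
    return text.toString();
  }

  // Shallow: the interface carries nearly as much information as the body,
  // so callers save almost nothing by not reading the implementation.
  static void appendLine(StringBuilder text, String line) {
    text.append(line).append('\n');
  }
}
```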

However, like most ideas in software design, decomposition can be taken too far.
As methods get smaller and smaller there is less and less
benefit to further subdivision.
The amount of functionality hidden behind each interface
drops, while the interfaces often become more complex.
I call these interfaces "shallow": they don't help much in terms of
reducing what the programmer needs to know. Eventually, the point is
reached where someone using the method needs
to understand every aspect of its implementation. Such methods
are usually pointless.

Another problem with decomposing too far is that it tends to
result in *entanglement*. Two methods
are entangled (or "conjoined" in APOSD terminology) if, in order to
understand how one of them works internally, you also need to read the
code of the other. If you've ever found yourself flipping back and forth
between the implementations of two methods as you read code, that's a
red flag that the methods might be entangled. Entangled methods
are hard to read because the information you need to have in your head
at once isn't all in the same place. Entangled methods can usually
be improved by combining them so that all the code is in one place.
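
Here is a minimal, invented sketch of what entanglement looks like (the class,
the names, and the shared `cursor` field are all hypothetical):

```java
class Tokenizer {
  private final String input;
  private int cursor = 0;  // silently shared by both methods below

  Tokenizer(String input) {
    this.input = input;
  }

  // Cannot be understood on its own: it only works because skipSpaces()
  // has advanced the shared cursor past any leading blanks.
  String readToken() {
    skipSpaces();
    int start = cursor;
    while (cursor < input.length() && input.charAt(cursor) != ' ')
      cursor++;
    return input.substring(start, cursor);
  }

  // Cannot be understood on its own either: its entire purpose is the hidden
  // side effect on cursor that readToken() relies on.
  private void skipSpaces() {
    while (cursor < input.length() && input.charAt(cursor) == ' ')
      cursor++;
  }
}
```

Reading either method forces you to read the other, which is exactly the
back-and-forth flipping described above; inlining `skipSpaces` would put all
the related code in one place.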

The advice in *Clean Code* on method length is so extreme that it encourages
programmers to create teeny-tiny methods that suffer from both shallow
interfaces and entanglement. Setting arbitrary numerical limits such
as 2-4 lines in a method and a single line in the body of an
`if` or `while` statement exacerbates this problem.

**UB:**

While I do strongly recommend very short functions, I don't think it's fair to say that the book sets arbitrary numerical limits. The 2-4 line functions that you referred to on page 34 were part of the _Sparkle_ applet that Kent Beck and I wrote together in 1999 as an exercise for learning TDD. I thought it was remarkable that most of the functions in that applet were 2-4 lines long because it was a Swing program; and Swing programs tend to have very long methods.

As for setting limits, on page 13 I make clear that although the recommendations in the book have worked well for me and the other authors, they might not work for everyone. I claimed no final authority, nor even any absolute "rightness". They are offered for consideration.

**JOHN:**

I think these problems will be easiest to understand if we look at
specific code examples. But before we do that, let me ask you, Bob:
do you believe that it's possible for code to be over-decomposed, or
is smaller always better? And, if you believe that over-decomposition
is possible, how do you recognize when it has occurred?

**UB:**

It is certainly possible to over-decompose code. Here's an example:

```java
void doSomething() {doTheThing();} // over-decomposed.
```

The strategy that I use for deciding how far to take decomposition is the old rule that a method should do "*One Thing*". If I can *meaningfully* extract one method from another, then the original method did more than one thing. "Meaningfully" means that the extracted functionality can be given a descriptive name; and that it does less than the original method.

**JOHN:**

Unfortunately the One Thing approach will lead to over-decomposition:

1. The term "one thing" is vague and easy to abuse. For example, if a method has two lines of code, isn't it doing two things?

2. You haven't provided any useful guardrails to prevent over-decomposition. The example you gave is too extreme to be useful, and the "can it be named" qualification doesn't help: anything can be named.

3. The One Thing approach is simply wrong in many cases. If two things are closely related, it might well make sense to implement them in a single method. For example, any thread-safe method will first have to acquire a lock, then carry out its function. These are two "things", but they belong in the same method.

**UB:**

Let me tackle the last thing first. You suggested that acquiring the lock and performing the critical section should be together in the same method. However, I would be tempted to separate the locking from the critical section.

```java
void concurrentOperation() {
  lock();
  criticalSection();
  unlock();
}
```

This decouples the critical section from the lock and allows it to be called at times when locking isn't necessary (e.g. in single thread mode) or when a lock has already been set by someone else.

Now, on to the "ease of abuse" argument. I don't consider that to be a significant concern. `If` statements are easy to abuse. `Switch` statements are easy to abuse. Assignment statements are easy to abuse. The fact that something is easy to abuse does not mean that it should be avoided or suppressed. It simply means people should take appropriate care. There will always be this thing called: _judgment_.

So when faced with this snippet of code in a larger method:

```java
...
amountOwed=0;
totalPoints=0;
...
```

It would be poor judgement to extract them as follows, because the extraction is not meaningful. The implementation is not more deeply detailed than the interface.

```java
void clearAmountOwed() {
  amountOwed=0;
}

void clearTotalPoints() {
  totalPoints=0;
}
```

However, it may be good judgement to extract them as follows because the interface is abstract, and the implementation has deeper detail.

```java
void clearTotals() {
  amountOwed=0;
  totalPoints=0;
}
```

The latter has a nice descriptive name that is abstract enough to be meaningful without being redundant. And the two lines together are strongly related so as to qualify for doing _one thing_: initialization.

**JOHN:**

Of course anything can be abused. But the best approaches to design
encourage people to do things the right way and discourage abuse.
Unfortunately, the One Thing Rule encourages abuse for the reasons I
gave above.

And of course software designers will need to use judgment: it isn't
possible to provide precise recipes for software design.
But good judgment requires principles and guidance. The
*Clean Code* arguments about decomposition, including the One Thing
Rule, are one-sided. They give strong, concrete, quantitative
advice about when to chop things up, with virtually no guidance for
how to tell you've gone too far. All I could find is a 2-sentence
example on page 36 about Listing 3-3 (which is pretty trivial),
buried in the middle of exhortations to "chop, chop, chop".

One of the reasons I use the deep/shallow characterization is that it
captures both sides of the tradeoff; it will tell you when a decomposition
is good and also when decomposition makes things worse.

**UB:**

You make a good point that I don't talk much, in the book, about how to make the judgement call. Back in 2008 my concern was breaking the habit of the very large functions that were common in those early days of the web. I have been more balanced in the 2nd ed.

Still, if I must err, I'd rather err on the side of decomposition. There is value in considering, and visualizing decompositions. They can always be inlined if we judge them to have gone too far.

**JOHN:**

Coming back to your `clearTotals` example:

* The `clearTotals` method seems to contradict the One Thing Rule: the
variables `amountOwed` and `totalPoints` don't seem particularly related, so
initializing them both is doing two things, no? You say that both
statements are performing initialization, which makes it just one thing
(initialization). Does that mean it would also be okay to have a single
method that initializes two completely independent objects with nothing in
common? I suspect not. It feels like you are struggling to create a clean
framework for applying the One Thing Rule; that makes me think it isn't
a good rule.
* Without seeing more context I'm skeptical that the `clearTotals`
method makes sense.

**UB:**

I hope you agree that between these two examples, the former is a bit better.

```java
public String makeStatement() {
  clearTotals();
  return makeHeader() + makeRentalDetails() + makeFooter();
}
```

---

```java
public String makeStatement() {
  amountOwed=0;
  totalPoints=0;
  return makeHeader() + makeRentalDetails() + makeFooter();
}
```

**JOHN:**

Well, actually, no. The second example is completely clear and obvious:
I don't see anything to be gained by splitting it up.

**SPOCK (a.k.a UB):**

Fascinating.

**JOHN:**

I think it will be easier to clarify our differences if we consider
a nontrivial code example. Let's look at the `PrimeGenerator` class from
*Clean Code*, which is Listing 10-8 on pages 145-146. This Java class
generates the first N prime numbers:

```java
package literatePrimes;

import java.util.ArrayList;

public class PrimeGenerator {
  private static int[] primes;
  private static ArrayList<Integer> multiplesOfPrimeFactors;

  protected static int[] generate(int n) {
    primes = new int[n];
    multiplesOfPrimeFactors = new ArrayList<Integer>();
    set2AsFirstPrime();
    checkOddNumbersForSubsequentPrimes();
    return primes;
  }

  private static void set2AsFirstPrime() {
    primes[0] = 2;
    multiplesOfPrimeFactors.add(2);
  }

  private static void checkOddNumbersForSubsequentPrimes() {
    int primeIndex = 1;
    for (int candidate = 3; primeIndex < primes.length; candidate += 2) {
      if (isPrime(candidate))
        primes[primeIndex++] = candidate;
    }
  }

  private static boolean isPrime(int candidate) {
    if (isLeastRelevantMultipleOfLargerPrimeFactor(candidate)) {
      multiplesOfPrimeFactors.add(candidate);
      return false;
    }
    return isNotMultipleOfAnyPreviousPrimeFactor(candidate);
  }

  private static boolean isLeastRelevantMultipleOfLargerPrimeFactor(int candidate) {
    int nextLargerPrimeFactor = primes[multiplesOfPrimeFactors.size()];
    int leastRelevantMultiple = nextLargerPrimeFactor * nextLargerPrimeFactor;
    return candidate == leastRelevantMultiple;
  }

  private static boolean isNotMultipleOfAnyPreviousPrimeFactor(int candidate) {
    for (int n = 1; n < multiplesOfPrimeFactors.size(); n++) {
      if (isMultipleOfNthPrimeFactor(candidate, n))
        return false;
    }
    return true;
  }

  private static boolean isMultipleOfNthPrimeFactor(int candidate, int n) {
    return candidate == smallestOddNthMultipleNotLessThanCandidate(candidate, n);
  }

  private static int smallestOddNthMultipleNotLessThanCandidate(int candidate, int n) {
    int multiple = multiplesOfPrimeFactors.get(n);
    while (multiple < candidate)
      multiple += 2 * primes[n];
    multiplesOfPrimeFactors.set(n, multiple);
    return multiple;
  }
}
```

Before we dive into this code, I'd encourage everyone reading
this article to take time to read over the code and draw your own conclusions
about it. Did you find the code easy to understand? If so, why? If not, what
makes it complex?

Also, Bob, can you confirm that you stand by this code (i.e. the code
properly exemplifies the design philosophy of *Clean Code* and this
is the way you believe the code should appear if it were used in
production)?

**UB:**

Ah, yes. The `PrimeGenerator`. This code comes from the 1984 paper on [*Literate Programming*](https://www.cs.tufts.edu/~nr/cs257/archive/literate-programming/01-knuth-lp.pdf) written by Donald Knuth. The program was originally written in Pascal, and was automatically generated by Knuth's WEB system into a single very large method, which I translated into Java.

Of course this code was never meant for production. Both Knuth and I used it as a pedagogical example. In *Clean Code* it appears in a chapter named *Classes*. The lesson of the chapter is that a very large method will often contain many different sections of code that are better decomposed into independent classes.

In the chapter I extracted three classes from that function: `PrimePrinter`, `RowColumnPagePrinter` and `PrimeGenerator`.

One of those extracted classes was the `PrimeGenerator`. It had the following code (which I did not publish in the book.) The variable names and the overall structure are Knuth's.

```java
import java.util.ArrayList;

public class PrimeGenerator {
  protected static int[] generate(int n) {
    int[] p = new int[n];
    ArrayList<Integer> mult = new ArrayList<Integer>();
    p[0] = 2;
    mult.add(2);
    int k = 1;
    for (int j = 3; k < p.length; j += 2) {
      boolean jprime = false;
      int ord = mult.size();
      int square = p[ord] * p[ord];
      if (j == square) {
        mult.add(j);
      } else {
        jprime = true;
        for (int mi = 1; mi < ord; mi++) {
          int m = mult.get(mi);
          while (m < j)
            m += 2 * p[mi];
          mult.set(mi, m);
          if (j == m) {
            jprime = false;
            break;
          }
        }
      }
      if (jprime)
        p[k++] = j;
    }
    return p;
  }
}
```

Even though I was done with the lesson of the chapter, I didn't want to leave that method looking so outdated. So I cleaned it up a bit as an afterthought. My goal was not to describe how to generate prime numbers. I wanted my readers to see how large methods that violate the Single Responsibility Principle can be broken down into a few smaller, well-named classes containing a few smaller, well-named methods.

**JOHN:**

Thanks for the background. Even though the details of that code weren't
the main point of the chapter, presumably the code represents what you think
is the "right" and "cleanest" way to do things, given the algorithm at hand.
And that's where I disagree.

There are many design problems with `PrimeGenerator`, but for now I'll
focus on method length. The code is chopped up so much (8 teeny-tiny methods)
that it's difficult to read. For starters, consider the
`isNotMultipleOfAnyPreviousPrimeFactor` method. This method invokes
`isMultipleOfNthPrimeFactor`, which invokes
`smallestOddNthMultipleNotLessThanCandidate`. These methods are shallow
and entangled:
in order to understand
`isNot...` you have to read the other two
methods and load all of that code into your mind at once. For example,
`isNot...` has side effects (it modifies `multiplesOfPrimeFactors`) but
you can't see that unless you read all three methods.

**UB:**

I think you have a point. Eighteen years ago, when I was in the throes of this refactoring, the names and structure made perfect sense to me. They make sense to me now, too -- but that's because I once again understand the algorithm. When I returned to the algorithm for the first time a few days ago, I struggled with the names and structure. Once I understood the algorithm the names and structure made perfect sense.

**JOHN:**

Those names are problematic even for someone who understands the algorithm;
we'll talk about them a bit later, when discussing comments. And, if code
no longer makes sense to the writer when the writer returns to the code later,
that means the code is problematic. The fact that code can eventually
be understood (with great pain and suffering) does not excuse its entanglement.

**UB:**

Would that we had such a crystal ball that we could help our future selves avoid such "_great pain and suffering_". ;-)

**JOHN:**

There is no need for a crystal ball. The problems with `PrimeGenerator` are
pretty obvious, such as the entanglement and interface complexity; maybe you
were surprised that it is hard to understand, but I am not. Said another
way, if you are unable to predict whether your code will be easy to
understand, there are problems with your design methodology.

**UB:**

Fair enough. I will say, however, that I had equal "_pain and suffering_" interpreting your rewrite (below). So, apparently, neither of our methodologies was sufficient to rescue our readers from such struggles.

**JOHN:**

Going back to my introductory remarks about complexity, splitting up
`isNot...` into three methods doesn't reduce the amount of information
you have to keep in your mind. It just spreads it out, so it isn't as
obvious that you need to read all three methods together. And, it's harder
to see the overall structure of the code because it's split up: readers have
to flip back and forth between the methods, effectively reconstructing a
monolithic version in their minds. Because the pieces are all related,
this code will be easiest to understand if it's all together in one place.

**UB:**

I disagree. Here is `isNotMultipleOfAnyPreviousPrimeFactor`.

```java
private static boolean isNotMultipleOfAnyPreviousPrimeFactor(int candidate) {
  for (int n = 1; n < multiplesOfPrimeFactors.size(); n++) {
    if (isMultipleOfNthPrimeFactor(candidate, n))
      return false;
  }
  return true;
}
```

If you trust the `isMultipleOfNthPrimeFactor` method, then this method stands alone quite nicely. I mean, we loop through all n previous primes and see if the candidate is a multiple. That's pretty straightforward.

Now it would be fair to ask the question how we determine whether the candidate is a multiple, and in that case you'd want to inspect the `isMultiple...` method.

**JOHN:**

This code does appear to be simple and obvious.
Unfortunately, this appearance is deceiving.
If a reader trusts the name `isMultipleOfNthPrimeFactor` (which suggests
a predicate with no side effects) and doesn't bother to read its code, they
will not realize that it has side effects, and that the side effects
create a constraint on the `candidate` argument to `isNot...`
(it must be monotonically non-decreasing from invocation
to invocation). To understand these behaviors, you have to
read both `isMultiple...` and `smallestOdd...`. The current decomposition
hides this important information from the reader.

If there is one thing more likely to result in bugs than not understanding code,
it's thinking you understand it when you don't.

**UB:**

That's a valid concern. However, it is tempered by the fact that the functions are presented in the order they are called. Thus we can expect that the reader has already seen the main loop and understands that `candidate` increases by two each iteration.

The side effect buried down in `smallestOddNth...` is a bit more problematic. Now that you've pointed it out I don't like it much. Still, that side effect should not confound the basic understanding of `isNot...`.

In general, if you trust the names of the methods being called then understanding the caller does not require understanding the callee. For example:

```java
for (Employee e : employees)
  if (e.shouldPayToday())
    e.pay();
```

This would not be made more understandable if we replaced those two method calls with their implementations. Such a replacement would simply obscure the intent.

**JOHN:**

This example works because the called methods are relatively independent of
the parent. Unfortunately that is not the case for `isNot...`.

In fact, `isNot...` is not only entangled with the methods it calls, it's also
entangled with its callers. `isNot...` only works if it is invoked in
a loop where `candidate` increases monotonically. To convince yourself
that it works, you have to find the code that invokes `isNot...` and
make sure that `candidate` never decreases from one call to the next.
Separating `isNot...` from the loop that invokes it makes it harder
for readers to convince themselves that it works.

**UB:**

Which, as I said before, is why the methods are ordered the way they are. I expect that by the time you get to `isNot...` you've already read `checkOddNumbersForSubsequentPrimes` and know that `candidate` increases by twos.

**JOHN:**

Let's discuss this briefly, because it's another area where I
disagree with *Clean Code*. If methods are entangled, there is no
clever ordering of the method definitions that will fix the problem.

In this particular situation two other methods intervene between the
loop in `checkOdd...` and `isNot...`, so readers will have forgotten
the loop context before they get to `isNot...`. Furthermore, the actual
code that creates a dependency on the loop isn't in `isNot...`: it's in
`smallestOdd...`, which is even farther away from `checkOdd...`.

**UB:**

I sincerely doubt anyone is going to forget that `candidate` is being increased by twos. It's a pretty obvious way to avoid waste.

**JOHN:**

In my opening remarks I talked about how it's important to reduce the
amount of information people have to keep in their minds at once.
In this situation, readers have to remember that loop while they read
four intervening methods that are mostly unrelated to the loop. You apparently think
this will be easy and natural (I disagree). But it's even worse than
that. There is no indication which parts of `checkOdd...` will be important
later on, so the only safe approach is to remember *everything*, from *every*
method, until you have encountered every other method that could possibly
descend from it. And, to make the connection between the pieces, readers
must also reconstruct the call graph to notice that, even through
4 layers of method call, the code in `smallestOdd...` places constraints
on the loop in `checkOdd...`. This is an unreasonable cognitive burden to
place on readers.

If two pieces of code are tightly related, the solution is to bring
them together. Separating the pieces, even in physically adjacent methods,
makes the code harder to understand.

To me, all of the methods in `PrimeGenerator` are entangled: in order to
understand the class I had to load all of them into my mind
at once. I was constantly flipping back and forth between the methods
as I read the code. This is a red flag indicating
that the code has been over-decomposed.

Bob, can you help me understand why you divided the code into such
tiny methods?
Is there some benefit to having so many methods that I have missed?

**UB:**

I think you and I are just going to disagree on this. In general, I believe in the principle of small well-named methods and the separation of concerns. Generally speaking, if you can break a large method into several well-named smaller methods with different concerns, and by doing so expose their interfaces and the high-level functional decomposition, then that's a good thing.

* Looping over the odd numbers is one concern.
* Determining primality is another.
* Marking off the multiples of primes is yet another.

It seems to me that separating and naming those concerns helps to expose the way the algorithm works -- even at the expense of some entanglement.

In your solution, which we are soon to see below, you break the algorithm up in a similar way. However, instead of separating the concerns into functions, you separate them into sections with comments above them.

You mentioned that in my solution readers will have to keep the loop context in mind while reading the other functions. I suggest that in your solution, readers will have to keep the loop context in mind while reading your explanatory comments. They may have to "flip back and forth" between the sections in order to establish their understanding.

Now perhaps you are concerned that in my solution the "flipping" is a longer distance (in lines) than in yours. I'm not sure that's a significant point since they all fit on the same screen (at least they do on my screen) and the landmarks are pretty obvious.

### Method Length Summary

**JOHN:**

It sounds like it's time to wrap up this section. Is this a reasonable
summary of where we agree and disagree?

* We agree that modular design is a good thing.

* We agree that it is possible to over-decompose, and that *Clean Code 1st ed.*
doesn't provide much guidance on how to recognize over-decomposition.

* We disagree on how far to decompose: you recommend decomposing
code into much smaller units than I do. You believe that
the additional decomposition you recommend makes code easier to
understand; I believe that it goes too far and actually makes code
more difficult to understand.

* You believe that the One Thing Rule, applied with judgment, will
lead to appropriate decompositions. I believe it lacks guardrails
and will lead to over-decomposition.

* We agree that the internal decomposition of `PrimeGenerator` into
methods is problematic. You point out that your main goal in writing
`PrimeGenerator` was to show how to decompose into classes, not
so much how to decompose a class internally into methods.

* Entanglement between methods in a class doesn't bother you
as much as it bothers me. You believe that the benefits of decomposing
methods can compensate for problems caused by entanglement.
I believe they can't: when decomposed methods are entangled,
they are harder to read than if they were not decomposed, and this
defeats the whole purpose of decomposition.

* You believe that ordering the methods in a class can help to
compensate for entanglement between methods; I don't.

**UB:**

I think this is a fair assessment of our agreements and disagreements. We both value decomposition,
and we both avoid entanglement; but we disagree on the relative weighting of those two values.

## Comments

**JOHN:**

Let's move on to the second area of disagreement: comments. In my opinion,
the *Clean Code* approach to commenting results in code with
inadequate documentation, which increases the cost of software development.
I'm sure you disagree, so let's discuss.

Here is what *Clean Code* says about comments (page 54):

> The proper use of comments is to compensate for our failure to express
ourselves in code. Note that I use the word failure. I meant it.
Comments are always failures. We must have them because we cannot always
figure out how to express ourselves without them, but their use is not
a cause for celebration... Every time you write a comment, you should
grimace and feel the failure of your ability of expression.

I have to be honest: I was horrified when I first read this text, and it
still makes me cringe. This stigmatizes writing comments. Junior developers
will think "if I write comments, people may think I've failed, so the
safest thing is to write no comments."

**UB:**

That chapter begins with these words:
>*Nothing can be quite so helpful as a well placed comment.*

It goes on to say that comments are a *necessary* evil.

The only way a reader could infer that they should write no comments is if they hadn't actually read the chapter. The chapter walks through a series of comments, some bad, some good.

**JOHN:**

*Clean Code* focuses a lot more on the "evil" aspects of comments than the
"necessary" aspects. The sentence you quoted above is followed by two
sentences criticizing comments. Chapter 4 spends 4 pages talking about good
comments, followed by 15 pages talking about bad comments. There are snubs
like "the only truly good comment is the comment you found a way
not to write". And "Comments are always failures" is so catchy
that it's the one thing readers are most likely to remember from the
chapter.

**UB:**

The difference in page count is because there are just a few ways to write good comments, and so many more ways to write bad ones.

**JOHN:**

I disagree; this illustrates your bias against comments. If you look at
Chapter 13 of APOSD, it finds a lot more
constructive ways to use comments than *Clean Code*. And if you compare
the tone of Chapter 13 of APOSD with Chapter 4 of *Clean Code*, the hostility
of *Clean Code* towards comments becomes pretty clear.

**UB:**

I'll leave you to balance that last comment with the initial statement, and the final example, in the _Comments_ chapter. They do not communicate "hostility".

I'm not hostile to comments in general. I _am_ very hostile to gratuitous comments.

You and I likely both survived through a time when comments were absolutely necessary. In the '70s and '80s I was an assembly language programmer. I also wrote a bit of FORTRAN. Programs in those languages that had no comments were impenetrable.

As a result it became conventional wisdom to write comments by default. And, indeed, computer science students were taught to write comments uncritically. Comments became _pure good_.

In _Clean Code_ I decided to fight that mindset. Comments can be _really bad_ as well as good.

**JOHN:**

I don't agree that comments are less necessary today than they were
40 years ago.

Comments are crucially important and add enormous value to software.
The problem is that there is a lot of important information that simply
cannot be expressed in code. By adding comments to fill in this missing
information, developers can make code dramatically easier to read.
This is not a "failure of their ability to express themselves", as you
put it.

**UB:**

It's very true that there is important information that is not, or cannot be, expressed in code. That's a failure. A failure of our languages, or of our ability to use them to express ourselves. In every case a comment is a failure of our ability to use our languages to express our intent.

And we fail at that very frequently, and so comments are a necessary evil -- or, if you prefer, _an unfortunate necessity_. If we had the perfect programming language (TM) we would never write another comment.

**JOHN:**

I don't agree that a perfect programming language would
eliminate the need for comments. Comments and code serve very different
purposes, so it's not obvious to me that we should use the same
language for both. In my experience, English works quite well
as a language for comments.
Why do you feel that information about a program should
be expressed entirely in code, rather than using a combination of code
and English?

**UB:**

I bemoan the fact that we must sometimes use a human language instead of a programming language. Human languages are imprecise and full of ambiguities. Using a human language to describe something as precise as a program is very hard, and fraught with many opportunities for error and inadvertent misinformation.

**JOHN:**

I agree that English isn't always as precise as code, but it can still be
used in precise ways and comments typically don't need the same
degree of precision as code.
Comments often contain qualitative information such
as *why* something is being done, or the overall idea of something.
English works better for these than code because it is a more
expressive language.
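
For example (an invented snippet, not from either book), a short English
comment can record the *why* that the code alone cannot express:

```java
// Exponential backoff: the (hypothetical) upstream service rate-limits bursts,
// so immediate retries almost always fail; doubling the delay on each attempt
// keeps us under its limit without adding a configuration dependency.
static int retryDelayMillis(int attempt) {
  return 100 << Math.min(attempt, 6);  // 100 ms, 200 ms, ..., capped at 6.4 s
}
```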

**UB:**

I have no argument with that statement.

**JOHN:**

Are you concerned that comments will be incorrect or
misleading and that this will slow down software development?
I often hear people complain about stale comments (usually as an excuse
for writing no comments at all) but
I have not found them to be a significant problem
over my career. Incorrect comments do happen, but I don't encounter them
very often and when I do, they rarely cost me much time. In contrast, I waste
*enormous* amounts of time because of inadequate documentation; it's not
unusual for me to spend 50-80% of my development time wading through
code to figure out things that would be obvious if the code was properly
commented.

**UB:**

You and I have had some very different experiences.

I have certainly been helped by well-placed comments. I have also, just as certainly, (and within this very document) been distracted and confused by a comment that was incorrect, misplaced, gratuitous, or otherwise just plain bad.

**JOHN:**

I invite everyone reading this article to ask yourself the following questions:

* How much does your software development speed suffer because of
incorrect comments?
* How much does your software development speed suffer because of
missing comments?

For me the cost of missing comments is easily 10-100x the cost of incorrect
comments. That is why I cringe when I see things in *Clean Code* that
discourage people from writing comments.

Let's consider the `PrimeGenerator` class. There is not a single comment
in that code; does this seem appropriate to you?

**UB:**

I think it was appropriate for the purpose for which I wrote it. It was an adjunct to the lesson that very large methods can be broken down into smaller classes containing smaller methods. Adding lots of explanatory comments would have detracted from that point.

In general, however, the commenting style I used in Listing 4-8 is more appropriate. That listing, at the very end of the *Comments* chapter, describes yet another `PrimeGenerator` with a slightly different algorithm, and a better set of comments.

**JOHN:**

I disagree that adding comments would have distracted from your point,
and I think Listing 4-8 is also woefully undercommented.
But let's not argue about either of those issues. Instead, let's discuss
what comments the `PrimeGenerator` code *should* have if it were used in production.
I will make some suggestions, and you can agree or disagree.

For starters, let's discuss your use of megasyllabic names like
`isLeastRelevantMultipleOfLargerPrimeFactor`. My understanding is that
you advocate using names like this instead of using shorter names
augmented with descriptive comments: you're effectively moving the
comments into code. To me, this approach is problematic:

* Long names are awkward. Developers effectively have to retype
the documentation for a method every time they invoke it, and the long
names waste horizontal space and trigger line wraps in the code. The names are
also awkward to read: my mind wants to parse every syllable every time
I read it, which slows me down. Notice that both you and I resorted to
abbreviating names in this discussion: that's an indication that
the long names are awkward and unhelpful.
* The names are hard to parse and don't convey information as effectively
as a comment.
When students read `PrimeGenerator` one of the first things they
complain about is the long names (students can't make sense of them).
For example, the name above is
vague and cryptic: what does "least relevant" mean, and what is a
"larger prime factor"? Even with a complete understanding of the code in
the method, it's hard for me to make sense of the name. If this name
is going to eliminate the need for a comment, it needs to be even longer.

In my opinion, the traditional approach of using shorter names with
descriptive comments is more convenient and conveys the required information
more effectively. What advantage is there in the approach you advocate?

**UB:**

"_Megasyllabic_": Great word!

I like my method names to be sentence fragments that fit nicely with keywords and assignment statements. It makes the code a bit more natural to read.

```java
if (isTooHot)
  cooler.turnOn();
```

I also follow a simple rule about the length of names. The larger the scope of a method, the shorter its name should be and vice versa -- the shorter the scope the longer the name. The private methods I extracted in this case live in very small scopes, and so have longish names. Methods like this are typically called from only one place, so there is no burden on the programmer to remember a long name for another call.

**JOHN:**

Names like `isTooHot` are totally fine by me.
My concern is about names like `isLeastRelevantMultipleOfLargerPrimeFactor`.

It's interesting that as methods get smaller and narrower, you recommend
longer names.
What this says to me is that the interfaces for those functions are
more complex, so it takes more words to describe them. This provides
supporting evidence for
my assertion a while back that the more you split up a method,
the shallower the resulting methods will be.

**UB:**

It's not the functions that get smaller, it's the scope that gets smaller. A private function has a smaller scope than the public function that calls it. A function called by that private function has an even smaller scope. As we descend in scope, we also descend in situational detail. Describing such detail often requires a long name, or a long comment. I prefer to use a name.

As for long names being hard to parse, that's a matter of practice. Code is full of things that take practice to get used to.

**JOHN:**

I don't accept this. Code may be full of things that take practice to get used
to, but that doesn't excuse it.
Approaches that require more practice are worse than
those that require less.
If it's going to take a lot of work to get comfortable with the long names
then there had better be some compensating benefit; so far I'm not seeing any.
And I don't see any reason to believe that practice will make those names
easier to digest.

In addition, your comment above violates one of my fundamental rules, which
is "complexity is in the eye of the reader". If you write code that someone
else thinks is complicated, then you must accept that the code is probably
complicated (unless you think the reader is completely incompetent). It
is not OK to make excuses or suggest that it is really the reader's problem
("you just don't have enough practice"). I'm going to have to live by this
same rule a bit later in our discussion.

**UB:**

Fair enough. As for the meaning of "leastRelevant", that's a much larger problem that you and I will encounter shortly. It has to do with the intimacy that the author has with the solution, and the reader's lack of that intimacy.

**JOHN:**

You still haven't answered my question: why is it better to use super-long names
rather than shorter names augmented with descriptive comments?

**UB:**

It's a matter of preference for me. I prefer long names to comments. I don't trust comments to be maintained, nor do I trust that they will be read. Have you ever noticed that many IDEs paint comments in light grey so that they can be easily ignored? It's harder to ignore a name than a comment.

(BTW, I have my IDE paint comments in bright fire-engine red)

**JOHN:**

I don't see why a monster name is more likely to be "maintained" than
a comment, and I don't agree that IDEs encourage people to ignore
comments (this is your bias coming out again). My current IDE (VSCode)
doesn't use a lighter color for comments.
My previous one (NetBeans) did, but the color scheme didn't hide the comments; it
distinguished them from the code in a way that made both code and comments
easier to read.

Now that we've discussed the specific issue of comments vs. long method
names, let's talk about comments in general. I think there are two major reasons
why comments are needed. The first reason for comments is abstraction.
Simply put, without comments there is no way to have abstraction or modularity.

Abstraction is one of the most important components of good software design.
I define an abstraction as "a simplified way of thinking about something
that omits unimportant details." The most obvious example of an abstraction
is a method. It should be possible to use a method without reading its code.
The way we achieve this is by writing a header comment that describes
the method's *interface* (all the information someone needs in order
to invoke the method). If the method is well-designed, the interface will be
much simpler than the code of the method (it omits implementation details),
so the comments reduce the amount of information people must have in
their heads.

**UB:**

Long ago, in a 1995 book, I defined abstraction as:
>*The amplification of the essential and the elimination of the irrelevant.*

I certainly agree that abstraction is of importance to good software design. I also agree that well-placed comments can enhance the ability of readers to understand the abstractions we are attempting to employ. I disagree that comments are the _only_, or even the _best_, way to understand those abstractions. But sometimes they are the only option.

But consider:

```java
addSongToLibrary(String title, String[] authors, int durationInSeconds);
```

This seems like a very nice abstraction to me, and I cannot imagine how a comment might improve it.

**JOHN:**

Our definitions of abstraction are very similar; that's good to see.
However, the `addSongToLibrary` declaration is not (yet) a good abstraction
because it omits information
that is essential. In order to use `addSongToLibrary`, developers
need answers to the following questions:

* Is there any expected format for an author string, such as "LastName, FirstName"?
* Are the authors expected to be in alphabetical order? If not, is the order
significant in some other way?
* What happens if there is already a song in the library with the given title
but different authors? Is it replaced with the new one, or will the library
keep multiple songs with the same title?
* How is the library stored (e.g. is it entirely in memory? saved on disk?)?
If this information is documented somewhere else, such as the
overall class documentation, then it need not be repeated here.

Thus `addSongToLibrary` needs quite a few comments.
Sometimes the signature of a method (names and types of the method, its
arguments, and its return value) contains all the information
needed to use it, but this is pretty rare. Just skim through the documentation
for your favorite library package: in how many cases could you understand how
to use a method with only its signature?
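
For illustration, here is one way the declaration could be commented so that it
answers those questions; the specific answers are assumptions I've invented for
this example, not behavior specified by either book:

```java
/**
 * Adds a song to the library. If the library already contains a song with the
 * same title but different authors, both entries are kept. (These answers are
 * invented for illustration.)
 *
 * @param title              Title of the song.
 * @param authors            One entry per credited author, in the credit order
 *                           used on the recording (not alphabetical), each
 *                           formatted as "LastName, FirstName".
 * @param durationInSeconds  Total playing time; must be greater than zero.
 */
void addSongToLibrary(String title, String[] authors, int durationInSeconds);
```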

**UB:**

Yes, there are times when the signature of a method is an incomplete abstraction and a comment
is required. This is especially true when the interface is part of a public API, or an API intended
for use by a separate team of developers. Within a single development team, however, long descriptive
comments on interfaces are often more of an impediment than a help. The team has intimate knowledge of the
internals of the system, and will generally be able to understand an interface simply from its
signature.

**JOHN:**

In one of our in-person discussions you argued that interface comments
are unnecessary because when a group of developers is working on a body
of code they can collectively keep the entire code "loaded" in their
minds, so comments are unnecessary: if you have a question, just ask the
person who is familiar with that code. This creates a huge cognitive load
to keep all that code mentally loaded, and it's hard for me to imagine
that it would actually work. Maybe your memory is better than mine, but I
find that I quickly forget code that I wrote just a few weeks ago. In
a project of any size, I think your approach would result in developers
spending large amounts of time reading code to re-derive the interfaces,
and probably making mistakes along the way. Spending a few minutes to
document the interfaces would save time, reduce cognitive load, and
reduce bugs.

**UB:**

I think that certain interfaces need comments, even if they are private to the team. But I think it is more often the case that the team is familiar enough with the system that well named methods and arguments are sufficient.

**JOHN:**

Let's consider a specific example from `PrimeGenerator`: the `isMultipleOfNthPrimeFactor`
method. When someone reading the code encounters the call to `isMultiple...`
in `isNot...` they need to understand enough about how `isMultiple...` works
in order to see how it fits into the code of `isNot...`.
The method name does not fully document the interface, so if there
is no header comment then readers will have to read the code of `isMultiple`.
This will force readers to load more information into their
heads, which makes it harder to work in the code.

Here is my first attempt at a header comment for `isMultiple`:

```java
/**
* Returns true if candidate is a multiple of primes[n], false otherwise.
* May modify multiplesOfPrimeFactors[n].
* @param candidate
* Number being tested for primality; must be at least as
* large as any value passed to this method in the past.
* @param n
* Selects a prime number to test against; must be
* <= multiplesOfPrimeFactors.size().
*/
```

What do you think of this?

**UB:**

I think it's accurate. I wouldn't delete it if I encountered it. I don't think it should be a javadoc.

The first sentence is redundant with the name `isMultipleOfNthPrimeFactor` and so could be deleted. The warning of the side effect is useful.

**JOHN:**

I agree that the first sentence is largely redundant with the name,
and I debated with myself about whether to keep it. I decided to keep it
because I think it is a bit more precise than the name; it's also easier
to read. You propose to eliminate the redundancy between the comment and
the method name by dropping the comment; I would eliminate the redundancy by
shortening the method name.

By the way, you complained earlier about comments being less precise than
code, but in this case the comment is *more* precise (the method
name can't include text like `primes[n]`).

**UB:**

Fair enough. There are times when precision is better expressed in a comment.

Continuing with my critique of your comment above: The name `candidate` is synonymous with "Number being tested for primality".

In the end, however, all the words in a comment are just going to have to sit in my brain
until I understand why they are there. I'm also going to have to worry if
they are accurate. So I'm going to have to read the code to understand and
validate the comment.

**JOHN:**

Whoah. That loud sound you just heard was my jaw hitting the floor.
Help me understand this a bit better: approximately what
fraction of comments that you encounter in practice are you willing to
trust without reading the code to verify them?

**UB:**

I look at every comment as potential misinformation. At best they are a way to crosscheck the author's intent against the code. The amount of credence I give to a comment depends a lot on how easy they make that crosscheck. When I read a comment that does not cause me to crosscheck, then I consider it to be of no value. When I see a comment that causes me to crosscheck, and when that crosscheck turns out to be valuable, then that's a really good comment.

Another way to say this is that the best comments tell me something surprising and verifiable about the code. The worst are those that waste my time telling me something obvious, or incorrect.

**JOHN:**

It sounds like your answer is 0%: you don't trust any comment unless it has
been verified against the code. This makes no sense to me. As I said above, the vast
majority of comments are correct. It's not hard to write comments; the students
in my software design class are doing this pretty well within a few weeks.
It's also not hard to keep comments up to date as code evolves. Your refusal
to trust comments is another sign of your irrational bias against comments.

Refusing to trust comments incurs a very high cost. In order to understand
how to invoke a method, you will have to read all of the code of that method;
if the method invokes other methods, you will
also have to read them, and the methods they invoke, recursively. This is
an enormous amount of work in comparison to reading (and trusting) a
simple interface comment like the one I wrote above.

If you choose not to write an interface comment for methods, then you
leave the interface of that method undefined. Even if someone reads the
code of the method, they won't be able to tell which parts of the
implementation are expected to remain the same and which parts may
change (there is no way to specify this "contract" in code). This will
result in misunderstanding and more bugs.

**UB:**

Well, I guess I've just been burned more than you have. I've gone down too many false comment induced rabbit holes, and wasted too much time on worthless word salads.

Of course my trust in comments is not a binary thing. I read them if they are there; but
I don't implicitly trust them. The more gratuitous I feel the author was, or the less adept at English the author is, the less I trust the comments.

As I said above, our IDEs tend to paint comments in an ignorable color. I have my IDE paint comments in bright fire engine red because when I write a comment I intend for it to be read.

By the same token I use long names as a substitute for comments because I intend for those long names to be read; and it is very hard for a programmer to ignore names.

**JOHN:**

I mentioned earlier that there are two general reasons why comments are
needed. So far we've been discussing the first reason (abstraction).
The second general reason for comments is for important information
that is not obvious from the code. The algorithm in `PrimeGenerator`
is very non-obvious, so quite a few comments are needed to help readers
understand what is going on and why. Most of the algorithm's complexity
arises because it is designed to compute primes efficiently:

* The algorithm goes out of its way to avoid divisions, which were quite
expensive when Knuth wrote his original version (they aren't that expensive
nowadays).

* The first multiple for each new prime number is computed by squaring the
prime, rather than multiplying it by 3. This is mysterious: why is it safe
to skip the intervening odd multiples? Furthermore, it might seem that this
optimization only has a small impact on performance, but in fact it makes an
*enormous* difference (orders of magnitude). Using the square has the
side effect that when
testing a candidate, only primes up to the square root of the
candidate are tested. If 3x were used as the initial multiple, primes
within a factor of 3 of the candidate would be tested; that's a *lot*
more tests.
This implication of using the square is so non-obvious that I only realized
it while preparing material for this discussion; it never occurred to me in
the many times I have discussed the code with students.

Neither of these issues is obvious from the code; without
comments, readers are left to figure them out on their own. The students
in my class are generally unable to figure out either of them in the
30 minutes I give them, but I think that comments would have
allowed them to understand in a few minutes. Going back to my
introductory remarks, this is an example where information is important,
so it needs to be made available.
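
For concreteness, here is one possible comment for the second issue, attached
to the squaring line from the listing above (the wording is mine, written for
this discussion; it does not appear in *Clean Code*):

```java
// The first multiple recorded for a new prime p is p*p rather than 3*p: any
// smaller multiple of p has a prime factor less than p, so it is already
// rejected when the candidate is tested against that smaller prime. A useful
// consequence: a candidate c is only ever tested against primes <= sqrt(c),
// which is what makes this algorithm fast.
int leastRelevantMultiple = nextLargerPrimeFactor * nextLargerPrimeFactor;
```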

Do you agree that there should be comments to explain each of these
two issues?

**UB:**

I agree that the algorithm is subtle. Setting the first prime multiple as the square of the prime was deeply mysterious at first. I had to go on an hour-long bike ride to understand it.

Would a comment help? Perhaps. However, my guess is that no one who has been reading our conversation has been helped by it, because you and I are now too intimate with the solution. You and I can talk about that solution using words that fit into that intimacy; but our readers likely do not yet enjoy that fit.

One solution is to paint a picture -- being worth a thousand words. Here's my attempt.

```
X
1111111111111111111111111
1111122222333334444455555666667777788888999990000011111222223333344444
35791357913579135791357913579135791357913579135791357913579135791357913579
!!! !! !! ! !! ! !! ! ! !! ! !! ! ! ! ! !! !! !
3 |||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-||-
5 |||||||||||-||||-||||-||||-||||-||||-||||-||||-||||-||||-||||-
7 |||||||||||||||||||||||-||||||-||||||-||||||-||||||-||||||-||||||-
11 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||-||||||||||-
13 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
...
113||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
```

I expect that our readers will have to stare at this for some time, and also look at the code. But then there will be a _click_ in their brains and they'll say "Ohhh! Yes! I see it now!"

**JOHN:**

I found this diagram very hard to understand.
It begs for supplemental English text to explain the ideas being
presented. Even the syntax is non-obvious: what does
`1111111111111111111111111` mean?

Maybe we have a fundamental difference of philosophy here. I get the sense
that you are happy to give readers a few clues and leave it to them to put
the clues together. Perhaps you don't mind if people have to stare at something
for a while to figure it out? I don't agree with this approach: it results
in wasted time, misunderstandings, and bugs.
I think software should be totally *obvious*, where readers don't need to
be clever or "stare at this for some time" to figure things out.
Suffering followed by catharsis is great for Greek tragedies, but not
for reading code. Every question
a reader might have should be naturally answered, either in the code or
in comments. Key ideas and important conclusions should be stated explicitly,
not left for the reader to deduce. Ideally, even if a reader is in a hurry
and doesn't read the code very carefully, their first guesses about how
things work (and why) should be correct. To me, that's clean code.

**UB:**

I don't disagree with your sentiment. Good clean code should be as easy as possible to understand. I want to give my readers as many clues as possible so that the code is intuitive to read.

That's the goal. As we are about to see, that can be a tough goal to achieve.

**JOHN:**

In that case, do you still stand by the "picture" you painted above? It doesn't
seem consistent with what you just said. And if you really wanted to give
your readers as many clues as possible, you'd include a lot more comments.

**UB:**

I stand by the picture as far as its accuracy is concerned. And I think it
makes a good crosscheck. I have no illusions that it is easy to understand.

This algorithm is challenging and will require work to comprehend. I finally
understood it when I drew this picture in my mind while on that bike ride. When I got home I drew it for real and presented it in hopes that it might help
someone willing to do the work to understand it.

### Comments Summary

**JOHN:**

Let's wrap up this section of the discussion. Here is my summary of
where we agree and disagree.

* Our overall views of comments are fundamentally different. I see more
value in comments than you do, and I believe that they play a fundamental
and irreplaceable role in system design. You agree that there are places
where comments are necessary, but that comments don't always make it
easier to understand code, so you see far fewer places where comments are
needed.

* I would probably write 5-10x more lines of comments for a given piece of
code than you would.

* I believe that missing comments are a much greater cause of lost
productivity than erroneous or unhelpful comments;
you believe that comments are a net negative, as generally practiced:
bad comments cost more time than good comments save.

* You view it as problematic that comments are written in English
rather than a programming language. I don't see this as particularly
problematic and think that in many cases English works better.

* You recommend that developers should take information that I would
represent as comments and recast it into code if at all possible. One
example of this is super-long method names. I believe that super-long names
are awkward and hard to understand, and that it would be better to use
shorter names supplemented with comments.

* I believe that it is not possible to define interfaces and create
abstractions without a lot of comments. You agree for public APIs, but see little need to comment
interfaces that are internal to the team.

* You are unwilling to trust comments until you have read code to
verify them. I generally trust comments; by doing so, I don't need to read
as much code as you do. You think this exposes me to too much risk.

* We agree that implementation code only needs comments when the code is
nonobvious. Although neither of us argues for a large number of implementation
comments, I'm more likely to see value in them than you do.

Overall, we struggled to find areas of agreement on this topic.

**UB:**

This is a fair assessment of our individual positions, which I assume are based on our
different individual experiences. Over the years I have found the vast majority
of comments, as generally practiced in the industry, to be unhelpful. You seem to have found more
help in the comments you have encountered.

## John's Rewrite of PrimeGenerator

**JOHN:**

I mentioned that I ask the students in my software design class to rewrite
`PrimeGenerator` to fix all of its design problems. Here is my rewrite
(note: this was written before we began our discussion; given what I
have learned during the discussion, I would now change several of the
comments, but I have left this in its original form):

```java
package literatePrimes;

import java.util.ArrayList;

public class PrimeGenerator2 {

    /**
     * Computes the first prime numbers; the return value contains the
     * computed primes, in increasing order of size.
     * @param n
     *      How many prime numbers to compute.
     */
    public static int[] generate(int n) {
        int[] primes = new int[n];

        // Used to test efficiently (without division) whether a candidate
        // is a multiple of a previously-encountered prime number. Each entry
        // here contains an odd multiple of the corresponding entry in
        // primes. Entries increase monotonically.
        int[] multiples = new int[n];

        // Index of the last value in multiples that we need to consider
        // when testing candidates (all elements after this are greater
        // than our current candidate, so they don't need to be considered).
        int lastMultiple = 0;

        // Number of valid entries in primes.
        int primesFound = 1;

        primes[0] = 2;
        multiples[0] = 4;

        // Each iteration through this loop considers one candidate; skip
        // the even numbers, since they can't be prime.
        candidates: for (int candidate = 3; primesFound < n; candidate += 2) {
            if (candidate >= multiples[lastMultiple]) {
                lastMultiple++;
            }

            // Each iteration of this loop tests the candidate against one
            // potential prime factor. Skip the first factor (2) since we
            // only consider odd candidates.
            for (int i = 1; i <= lastMultiple; i++) {
                while (multiples[i] < candidate) {
                    multiples[i] += 2*primes[i];
                }
                if (multiples[i] == candidate) {
                    continue candidates;
                }
            }
            primes[primesFound] = candidate;

            // Start with the prime's square here, rather than 3x the prime.
            // This saves time and is safe because all of the intervening
            // multiples will be detected by smaller prime numbers. As an
            // example, consider the prime 7: the value in multiples will
            // start at 49; 21 will be ruled out as a multiple of 3, and
            // 35 will be ruled out as a multiple of 5, so 49 is the first
            // multiple that won't be ruled out by a smaller prime.
            multiples[primesFound] = candidate*candidate;
            primesFound++;
        }
        return primes;
    }
}
```

Everyone can read this and decide for themselves whether they think
it is easier to understand than the original. I'd like to mention a
couple of overall things:

* There is only one method. I didn't subdivide it because I felt the method already divides naturally into pieces that are distinct and understandable. It didn't seem to me that pulling out methods would improve readability significantly. When students rewrite the code, they typically have 2 or 3 methods, and those are usually OK too.
* There are a *lot* of comments. It's extremely rare for me to write code with this density of comments. Most methods I write have no comments in the body, just a header comment describing the interface. But this code is subtle and tricky, so it needs a lot of comments to make the subtleties clear to readers. The long length of some of the comments is a red flag indicating that I struggled to find a clear and simple explanation for the code. Even with all the additional explanatory material this version is a bit shorter than the original (65 lines vs. 70).

**UB:**

I presume this is a complete rewrite. My guess is that you worked to understand the algorithm from *Clean Code* and then wrote this from scratch. If that's so, then fair enough.

In _Clean Code_ I *refactored* Knuth's algorithm in order to give it a little structure. That's not the same as a complete rewrite.

Having said that, your version is much better than either Knuth's or mine.

I wrote that chapter 18 years ago, so it's been a long time since I saw and understood this algorithm. When I first saw your challenge I thought: "Oh, I can figure out my own code!" But, no. I could see all the moving parts, but I could not figure out why those moving parts generated a list of prime numbers.

So then I looked at your code. I had the same problem. I could see all the moving parts, all with comments, but I still could not figure out why those moving parts generated a list of prime numbers.

Figuring that out required a lot of staring at the ceiling, closing my eyes, visualizing, and riding my bike.

Among the problems I had were the comments you wrote. Let's take them one at a time.

```java
/**
 * Computes the first prime numbers; the return value contains the
 * computed primes, in increasing order of size.
 * @param n
 *      How many prime numbers to compute.
 */
public static int[] generate(int n) {
```

It seems to me that this would be better as:

```java
public static int[] generateNPrimeNumbers(int n) {
```

or if you must:

```java
//Return the first n prime numbers
public static int[] generate(int n) {
```

I'm not opposed to Javadocs as a rule, but I write them only when absolutely necessary. I also have an aversion to descriptions and `@param` statements that are perfectly obvious from the method signature.

The next comment cost me a good 20 minutes of puzzling things out.

```java
// Used to test efficiently (without division) whether a candidate
// is a multiple of a previously-encountered prime number. Each entry
// here contains an odd multiple of the corresponding entry in
// primes. Entries increase monotonically.
```

First of all I'm not sure why the "division" statement is necessary. I'm old school, so I expect that everyone knows to avoid division in inner loops if it can be avoided. But maybe I'm wrong about that...

Also, the *Sieve of Eratosthenes* does not do division, and is a lot easier to understand *and explain* than this algorithm. So why this particular algorithm? I think Knuth was trying to save _memory_ -- and in 1982 saving memory was important. This algorithm uses a lot less memory than the sieve.
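
For reference, here is a minimal sketch of the sieve I have in mind -- the classic textbook version, not Knuth's code or anything from either of our solutions:

```java
// Classic Sieve of Eratosthenes: no division, but it needs one boolean per
// number up to the limit, which is why it costs far more memory than
// tracking a single "next multiple" per prime the way this algorithm does.
static boolean[] sieveUpTo(int limit) {
    boolean[] composite = new boolean[limit + 1];
    for (int p = 2; (long) p * p <= limit; p++) {
        if (!composite[p]) {
            for (int m = p * p; m <= limit; m += p) {
                composite[m] = true;
            }
        }
    }
    return composite;   // composite[i] == false means i is prime (for i >= 2)
}
```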

Then came the phrase: `Each entry here contains an odd multiple...`. I looked at that, and then at the code, and I saw: `multiples[0] = 4;`.

"That's not odd" I said to myself. "So maybe he meant even."

So then I looked down and saw: `multiples[i] += 2*primes[i];`

"That's adding an even number!" I said to myself. "I'm pretty sure he meant to say 'even' instead of 'odd'."

I hadn't yet worked out what the `multiples` array was. So I thought it was perfectly reasonable that it would have even numbers in it, and that your comment was simply an understandable word transposition. After all, there's no compiler for comments so they suffer from the kinds of mistakes that humans often make with words.

It was only when I got to `multiples[primesFound] = candidate*candidate;` that I started to question things. If the `candidate` is prime, shouldn't `prime*prime` be odd in every case beyond 2? I had to do the math in my head to prove that. (2n+1)(2n+1) = 4n^2+4n+1 ... Yeah, that's odd.

OK, so the `multiples` array is full of odd multiples, except for the first element, since it will be multiples of 2.

So perhaps that comment should be:

```java
// multiples of corresponding prime.
```

Or perhaps we should change the name of the array to something like `primeMultiples` and drop the comment altogether.

Moving on to the next comment:

```java
// Each iteration of this loop tests the candidate against one
// potential prime factor. Skip the first factor (2) since we
// only consider odd candidates.
```

That doesn't make a lot of sense. The code it's talking about is:

```java
for (int i = 1; i <= lastMultiple; i++) {
    while (multiples[i] < candidate) {
```

The `multiples` array, as we have now learned, is an array of *multiples* of prime numbers. This loop is not testing the candidate against prime *factors*, it's testing it against the current prime _multiples_.

Fortunately for me the third or fourth time I read this comment I realized that you really meant to use the word "multiples". But the only way for me to know that was to understand the algorithm. And when I understand the algorithm, why do I need the comment?

That left me with one final question. What the deuce was the reason behind:

```java
multiples[primesFound] = candidate*candidate;
```

Why the square? That makes no sense. So I changed it to:

```java
multiples[primesFound] = candidate;
```

And it worked just fine. So this must be an optimization of some kind.

Your comment to explain this is:

```java
// Start with the prime's square here, rather than 3x the prime.
// This saves time and is safe because all of the intervening
// multiples will be detected by smaller prime numbers. As an
// example, consider the prime 7: the value in multiples will
// start at 49; 21 will be ruled out as a multiple of 3, and
// 35 will be ruled out as a multiple of 5, so 49 is the first
// multiple that won't be ruled out by a smaller prime.
```

The first few times I read this it made no sense to me at all. It was just a jumble of numbers.

I stared at the ceiling, and closed my eyes to visualize. I couldn't see it. So I went on a long contemplative bike ride during which I realized that the prime multiples of 2 will at one point contain 2\*3 and then 2\*5. So the `multiples` array will at some point contain multiples of primes *larger* than the prime they represent. _And it clicked!_

Suddenly it all made sense. I realized that the `multiples` array was the equivalent of the array of booleans we use in the *Sieve of Eratosthenes* -- but with a really interesting twist. If you were to do the sieve on a whiteboard, you _could_ erase every number less than the candidate, and only cross out the numbers that were the next multiples of all the previous primes.

That explanation makes perfect sense to me -- now, but I'd be willing to bet that those who are reading it are puzzling over it. The idea is just hard to explain.

Finally, I went back to your comment and could see what you were saying.

### A Tale of Two Programmers

The bottom line here is that you and I both fell into the same trap. I refactored that old algorithm 18 years ago, and I thought all those method and variable names would make my intent clear -- *because I understood that algorithm*.

You wrote that code a while back and decorated it with comments that you thought would explain your intent -- *because you understood that algorithm*.

But my names didn't help me 18 years later. They didn't help you, or your students either. And your comments didn't help me.

We were inside the box trying to communicate to those who stood outside and could not see what we saw.

The bottom line is that it is very difficult to explain something to someone who is not intimate with the details you are trying to explain. Often our explanations make sense only after the reader has worked out the details for themself.

**JOHN:**

There's a lot of stuff in your discussion above, but I think it all boils down
to one thing: you don't like the comments that I wrote. As I mentioned earlier,
complexity is in the eye of the reader: if you say that my comments were
confusing or didn't help you to understand the code, then I have to take that
seriously.

At the same time, you have made it clear that you don't see much value in
comments in general. Your preference is to have essentially no
comments for this code (or any code). You argue above that there is simply nothing that
comments can do to make the code easier to understand; the only way to
understand the code is to read the code. That is a cop-out.

**UB:**

Sorry to interrupt you, but I think you are overstating my position. I certainly never said that comments can never be helpful. Sometimes, of course, they are. What I said was that I only trust them if the code validates them. Sometimes a comment will make that validation a lot easier.

**JOHN:**

You keep saying that you sometimes find use for comments, but the reality
is that "sometimes" almost never occurs in your code. We'll see this when
we look at your revision of my code.

Now back to my point. In order to
write our various versions of the code, you and I had to accumulate a lot of
knowledge about the algorithm, such as why it's OK for the first multiple
of a prime to be its square. Unfortunately, not all of that knowledge can
be represented in the code. It is our professional responsibility to do
the best we can to convey
that knowledge in comments, so that readers do not
have to reconstruct it over and over. Even if the resulting comments are
imperfect, they will make the code easier to understand.

If a situation like this occurred in real life I would work with
you and others to improve my comments. For example, I would ask you
questions to get a better sense of
why the "squared prime" comment didn't seem to help you:
* Are there things in the comment that are misleading or confusing?
* Is there some important piece of information you acquired on your
bike ride that suddenly made things clear?

I would also show the comment to a few other people to get their takes
on it. Then I would rework the comment to improve it.

Given your fundamental disbelief in comments, I think it's likely that
you would still see no value in the comment, even after my reworking.
In this case I would show the comment to other people, particularly those
who have a more positive view of comments in general, and get
their input. As long as the comment is not misleading and at least a few
people found it helpful, I would retain it.

Now let me discuss two specific comments that you objected to. The
first comment was the one for the `multiples` variable:

```java
// Used to test efficiently (without division) whether a candidate
// is a multiple of a previously-encountered prime number. Each entry
// here contains an odd multiple of the corresponding entry in
// primes. Entries increase monotonically.
```

There is a bug in this comment that you exposed (the first entry is not odd);
good catch! You then argued that most of the information in the comment
is unnecessary and proposed this as an alternative:

```java
// multiples of corresponding prime.
```

You have left out too much useful information here. For example, I don't think
it is safe to assume that readers will figure out that the motivation is
avoiding divisions. It's always better to state these assumptions and
motivations clearly so that there will be no confusion. And I think it's
helpful for readers to know that these entries never decrease.
I would simply fix the bug, leaving all of the information intact:

```java
// Used to test efficiently (without division) whether a candidate
// is a multiple of a previously-encountered prime number. Each entry
// (except the first, which is never used) contains an odd multiple of
// the corresponding entry in primes. Entries increase monotonically.
```
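
For readers who haven't seen this optimization before, the alternative that the comment is warding off is a straightforward trial-division test, roughly like the following (a hypothetical sketch, not part of either of our versions):

```java
// Trial-division check: one modulo (i.e. division) per previously-found
// prime for every candidate. The multiples array exists to avoid this cost.
private static boolean isMultipleOfEarlierPrime(int candidate, int[] primes,
        int primesFound) {
    for (int i = 1; i < primesFound; i++) {
        if (candidate % primes[i] == 0) {
            return true;
        }
    }
    return false;
}
```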

The second comment was this one, for the `for` loop:

```java
// Each iteration of this loop tests the candidate against one
// potential prime factor. Skip the first factor (2) since we
// only consider odd candidates.
```

You objected to this comment because the code of the loop doesn't actually
test the candidate against the prime factor; it tests it against a multiple.
When I write implementation comments like this, my goal is not to restate
the code; comments like that don't usually provide much value. The goal here was
to say *what* the code is doing in a logical sense, not *how* it does it.
In that sense, the comment is correct.

However, if a comment causes confusion in the reader, then it is not a
good comment. Thus, I would rewrite this comment to make it clear that
it describes the abstract function of the code, not its
precise behavior:

```java
// Each iteration of this loop considers one existing prime, ruling
// out the candidate if it is a multiple of that prime. Skip the
// first prime (2) since we only consider odd candidates.
```

To conclude, I agree with your assertion "it is very difficult to explain
something to someone who is not intimate with the details you are trying
to explain." And yet, it is our responsibility as programmers to do exactly
that.

**UB:**

I'm glad we agree. We also agree about getting others to review the code and make recommendations on the code _and_ the comments.

## Bob's Rewrite of PrimeGenerator2

**UB:**

When I saw your solution, and after I gained a good understanding of it, I refactored it just a bit. I loaded it into my IDE, wrote some simple tests, and extracted a few simple methods.

I also got rid of that *awful* labeled `continue` statement. And I added 3 to the primes list so that I could mark the first element of `primeMultiples` as *irrelevant* and give it a value of -1. (I think I was still reeling from the even/odd confusion.)

I like this because the implementation of the `generateFirstNPrimes` method describes the moving parts in a way that hints at what is going on. It's easy to read that implementation and get a glimpse of the mechanism. I'm not at all sure that the comment helps.

I think it is just the reality of this algorithm that the effort required to properly explain it, and the effort required for anyone else to read and understand that explanation, is roughly equivalent to the effort needed to read the code and go on a bike ride.

```java
package literatePrimes;

public class PrimeGenerator3 {
    private static int[] primes;
    private static int[] primeMultiples;
    private static int lastRelevantMultiple;
    private static int primesFound;
    private static int candidate;

    // Lovely little algorithm that finds primes by predicting
    // the next composite number and skipping over it. That prediction
    // consists of a set of prime multiples that are continuously
    // increased to keep pace with the candidate.

    public static int[] generateFirstNPrimes(int n) {
        initializeTheGenerator(n);

        for (candidate = 5; primesFound < n; candidate += 2) {
            increaseEachPrimeMultipleToOrBeyondCandidate();
            if (candidateIsNotOneOfThePrimeMultiples()) {
                registerTheCandidateAsPrime();
            }
        }
        return primes;
    }

    private static void initializeTheGenerator(int n) {
        primes = new int[n];
        primeMultiples = new int[n];
        lastRelevantMultiple = 1;

        // prime the pump. (Sorry, couldn't resist.)
        primesFound = 2;
        primes[0] = 2;
        primes[1] = 3;

        primeMultiples[0] = -1; // irrelevant
        primeMultiples[1] = 9;
    }

    private static void increaseEachPrimeMultipleToOrBeyondCandidate() {
        if (candidate >= primeMultiples[lastRelevantMultiple])
            lastRelevantMultiple++;

        for (int i = 1; i <= lastRelevantMultiple; i++)
            while (primeMultiples[i] < candidate)
                primeMultiples[i] += 2 * primes[i];
    }

    private static boolean candidateIsNotOneOfThePrimeMultiples() {
        for (int i = 1; i <= lastRelevantMultiple; i++)
            if (primeMultiples[i] == candidate)
                return false;
        return true;
    }

    private static void registerTheCandidateAsPrime() {
        primes[primesFound] = candidate;
        primeMultiples[primesFound] = candidate * candidate;
        primesFound++;
    }
}
```

**JOHN:**

This version is a considerable improvement over the version in *Clean Code*.
Reducing the number of methods made the code easier to read and resulted
in cleaner interfaces. If it were properly commented, I think this version
would be about as easy to read as my version (the additional methods you
created didn't particularly help, but they didn't hurt either). I suspect
that if we polled readers, some would like your version better and some
would prefer mine.

Unfortunately, this revision of the code creates a serious performance
regression: I measured a factor of 3-4x slowdown compared to either
of the earlier revisions. The problem is that you changed the processing of a
particular candidate from a single loop to two loops (the `increaseEach...` and
`candidateIsNot...` methods). In the loop from earlier revisions, and in
the `candidateIsNot`
method, the loop aborts once the candidate is disqualified (and
most candidates are quickly eliminated). However,
`increaseEach...` must examine every entry in `primeMultiples`.
This results in 5-10x as many loop iterations and a 3-4x overall slowdown.

Given that the whole reason for the current algorithm (and its complexity)
is to maximize performance, this slowdown is unacceptable. The two
methods must be combined.

I think what happened here is that you were so focused on something
that isn't actually all that important (creating the tiniest possible methods)
that you dropped the ball on other issues that really are important.
We have now seen this twice. In the original version of `PrimeGenerator`
you were so determined to make tiny methods that you didn't notice that the
code was becoming incomprehensible. In this version you were so eager to
chop up my single method that you didn't notice that you were blowing up the
performance.

I don't think this was just an unfortunate combination of oversights.
One of the most important things
in software design is to identify what is important and focus on that;
if you focus on things that are unimportant, you're likely to mess up the
things that are important.

The code in your revision is still under-commented. You believe
that there is no meaningful way for comments to assist the reader in
understanding the code. I think this stems from your general disbelief in
the value of comments; you are quick to throw in the towel.
This algorithm is unusually difficult to explain,
but I still believe that comments can help. For example, I believe you
must make some attempt to help readers understand why the first multiple
for a prime is the square of the prime. You have taken a lot of time to
develop your understanding of this; surely there must be some way to convey
that understanding to others? If you had included that information in
your original version of the code you could have saved yourself that long
bike ride.
Giving up on this is an abdication of professional responsibility.

The few comments that you included in your revision are of little value.
The first comment is too cryptic to provide much help: I can't
make any sense of the phrase "predicting the next composite number and
skipping over it" even though I completely understand the code it purports
to explain. One of the comments is just a joke; I was surprised to see
this, given your opposition to extraneous comments.

Clearly you and I live in different universes when it comes to comments.

Finally, I don't understand why you are offended by the labeled `continue`
statement in my code. This is a clean and elegant solution to the problem
of escaping from nested loops. I wish more languages
had this feature; the alternative is awkward code where you set a variable,
then exit one level of loop, then check the variable and exit the next
level.
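
To make the comparison concrete, here is roughly what the body of my outer loop would look like without the labeled `continue` (a sketch only; I would not actually write it this way):

```java
// Flag-based escape from the nested loop, replacing "continue candidates":
boolean isComposite = false;
for (int i = 1; i <= lastMultiple; i++) {
    while (multiples[i] < candidate) {
        multiples[i] += 2*primes[i];
    }
    if (multiples[i] == candidate) {
        isComposite = true;
        break;                  // exit the inner loop...
    }
}
if (isComposite) {
    continue;                   // ...then test the flag to skip this candidate
}
primes[primesFound] = candidate;
```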

**UB:**

Good catch! I would have caught that too had I thought to profile the solution. You are right that separating the two loops added some unnecessary iteration. I found a nice way to solve that problem without using the horrible `continue`. My updated version is now faster than yours! A million primes in 440ms as opposed to yours which takes 561ms. ;-) Below are just the changes.

```java
public static int[] generateFirstNPrimes(int n) {
    initializeTheGenerator(n);

    for (candidate = 5; primesFound < n; candidate += 2)
        if (candidateIsPrime())
            registerTheCandidateAsPrime();

    return primes;
}

private static boolean candidateIsPrime() {
    if (candidate >= primeMultiples[lastRelevantMultiple])
        lastRelevantMultiple++;

    for (int i = 1; i <= lastRelevantMultiple; i++) {
        while (primeMultiples[i] < candidate)
            primeMultiples[i] += 2 * primes[i];
        if (primeMultiples[i] == candidate)
            return false;
    }
    return true;
}
```
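
A rough timing harness along these lines is enough to reproduce that kind of comparison (just a sketch; the absolute numbers will of course vary by machine and JVM):

```java
// Crude timing sketch: warm up the JIT, then time one full run.
public static void main(String[] args) {
    for (int i = 0; i < 5; i++)
        generateFirstNPrimes(1_000_000);            // warm-up runs

    long start = System.nanoTime();
    int[] primes = generateFirstNPrimes(1_000_000);
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    System.out.println("last prime: " + primes[primes.length - 1]
            + ", elapsed: " + elapsedMs + " ms");
}
```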

**JOHN:**

Yep, that fixes the problem. I note that you are now down to 4 methods,
from 8 in the *Clean Code* version.

## Test-Driven Development

**JOHN:**

Let's move on to our third area of disagreement, which is Test-Driven
Development. I am a huge fan of unit testing. I believe that unit tests are
an indispensable part of the software development process and pay for
themselves over and over. I think we agree on this.

However, I am not a fan of Test-Driven Development (TDD), which dictates
that tests must be written before code and that code must be written
and tested in tiny increments. This approach has serious problems
without any compensating advantages that I have been able to identify.

**UB:**

As I said at the start I have carefully read _A Philosophy of Software Design_. I found it to be full of worthwhile insights, and I strongly agree with most of the points you make.

So I was surprised to find, on page 157, that you wrote a very short, dismissive, pejorative, and inaccurate section on _Test Driven Development_. Sorry for all the adjectives, but I think that's a fair characterization. So my goal, here, is to correct the misconceptions that led you to write the following:

>"Test-driven development is an approach to software development where programmers write unit tests before they write code. When creating a new class, the developer first writes unit tests for the class, based on its expected behavior. None of these tests pass, since there is no code for the class. Then the developer works through the tests one at a time, writing enough code for that test to pass. When all of the tests pass, the class is finished."

This is just wrong. TDD is quite considerably different from what you describe. I describe it using three laws.

1. You are not allowed to write any production code until you have first written a unit test that fails because that code does not exist.

2. You are not allowed to write more of a unit test than is sufficient to fail, and failing to compile is failing.

3. You are not allowed to write more production code than is sufficient to make the currently failing test pass.

A little thought will convince you that these three laws will lock you into a cycle that is just a few seconds long. You'll write a line or two of a test that will fail, you'll write a line or two of production code that will pass, around and around every few seconds.

A second layer of TDD is the Red-Green-Refactor loop. This loop is several minutes long. It consists of a few cycles of the three laws, followed by a period of reflection and refactoring. During that reflection we pull back from the intimacy of the quick cycle and look at the design of the code we've just written. Is it clean? Is it well-structured? Is there a better approach? Does it match the design we are pursuing? If not, should it?
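
To make that rhythm concrete, a single pass through the laws might look like this -- an illustrative fragment in the spirit of my Bowling Game example, assuming JUnit and hypothetical names:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class BowlingGameTest {
    // Laws 1 and 2: write just enough test to fail -- this doesn't even
    // compile until a Game class with roll() and score() exists.
    @Test
    void gutterGameScoresZero() {
        Game game = new Game();
        for (int i = 0; i < 20; i++)
            game.roll(0);
        assertEquals(0, game.score());
    }
}

// Law 3: write only enough production code to make that test pass.
class Game {
    void roll(int pins) {}
    int score() { return 0; }
}
```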

**JOHN:**

Oops! I plead "guilty as charged" to inaccurately describing TDD.
I will fix this in the next revision of APOSD. That said, your definition
of TDD does not change my concerns.

Let's discuss the potential advantages and disadvantages
of TDD; then readers can decide for themselves whether they think TDD is a
good idea overall.

Before we start that discussion, let me clarify the approach I prefer as an
alternative to TDD. In your online videos you describe the alternative to
TDD as one where a developer writes the code, gets it fully working
(presumably with manual tests), then goes back and writes the unit tests.
You argue that this approach would be terrible: developers
lose interest once they think code is working, so they wouldn't actually
write the tests. I agree with you completely. However, this isn't the only
alternative to TDD.

The approach I prefer is one where the developer works in somewhat
larger units than in TDD, perhaps a few methods or a class. The developer
first writes some code (anywhere from a few tens of lines to a few hundred
lines), then writes unit tests for that code. As with TDD, the
code isn't considered to be "working" until it has comprehensive unit
tests.

**UB:**

How about if we call this technique "bundling" for purposes of this
document? This is the term I use in _Clean Code 2d ed._

**JOHN:**

Fine by me.

The reason for working in larger units is to encourage design
thinking, so that a developer can think about a collection of related
tasks and do a bit of planning to come up with a good overall design
where the pieces fit together well.
Of course the initial design ideas will have flaws and refactoring
will still be necessary, but the goal is to center the development
process around design, not tests.

To start our discussion, can you make a list of the advantages you
think that TDD provides over the approach I just described?

**UB:**
The advantages I usually attribute to TDD are:

* Very little need for debugging. After all, if you just saw everything working a minute or two ago, there's not much to debug.

* A stream of reliable low level documentation, in the form of very small and isolated unit tests. Those tests describe the low level structure and operation of every facet of the system. If you want to know how to do something in the system, there are tests that will show you how.

* A less coupled design which results from the fact that every small part of the system must be designed to be testable, and testability requires decoupling.

* A suite of tests that you trust with your life, and therefore supports fearless refactoring.

However, you asked me which of these advantages TDD might have over _your_ preferred method. That depends on how big you make those larger units you described. The important thing to me is to keep the cycle time short, and to prevent entanglements that block testability.

It seems to me that working in small units, and then immediately writing after-the-fact tests, can give you all the above advantages, so long as you are very careful to test every aspect of the code you just wrote. I think a disciplined programmer could effectively work that way. Indeed, I think such a programmer would produce code that I could not distinguish from code written by another programmer following TDD.

Above you suggested that the point of bundling is to encourage design. I think encouraging design is a very good thing. My question for you is: Why do you think that TDD does not encourage design? My own experience is that design comes from strategic thought, which is independent of the tactical behavior of either TDD or Bundling. Design is taking one step back from the code and envisioning structures that address a larger set of constraints and needs.

Once you have that vision in your head it seems to me bundling and TDD will yield similar results.

**JOHN:**

First, let me address the four advantages you listed for TDD:

* Very little need for debugging? I think any form of unit testing can
reduce debugging work, but not for the reason you
suggested. The benefit comes because unit tests expose bugs earlier
and in an environment where they are easier to track down. A
relatively simple bug to fix in development can be very painful to
track down in production. I'm not convinced by your argument that
there's less debugging because "you just saw everything working a
minute ago": it's easy to make a tiny change that exposes a really
gnarly bug that has existed for a long time but hasn't yet been
triggered. Hard-to-debug problems arise from the accumulated complexity
of the system, not from the size of the code increments.

>**UB:** True. However, when the cycles are very short then the cause
of even the gnarliest of bugs has the best chance of being tracked down.
The shorter the cycles, the better the chances.

>**JOHN:** This is only true up to a point. I think you believe
that making units smaller and smaller continues to provide benefits,
with almost no limit to how small they can get. I think that there
is a point of diminishing returns, where making things even smaller
no longer helps and actually starts to hurt. We saw this disagreement
over method length, and I think we're seeing it again here.

* Low level documentation? I disagree: unit tests are a poor form
of documentation. Comments are a much more
effective form of documentation, and you can put them right next to the
relevant code. Trying to learn a method's
interface by reading a bunch of unit tests seems much more difficult
than just reading a couple of sentences of English text.

>**UB:** Nowadays it's very easy to find the tests for
a function by using the "where-used" feature of the IDE. As for comments
being better, if that were true then no one would publish example code.

* A less coupled design? Possibly, but I haven't experienced this myself.
It's not clear to me that designing for testability will produce the
best design.

>**UB:** Generally the decoupling arises because the test requires a mock
of some kind. Mocks tend to force abstractions that might otherwise not exist.
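>
>A tiny sketch of what I mean (hypothetical names, not from any real project): to unit-test an order processor without a live payment service, we are pushed to invent an abstraction, and the production code then depends on that abstraction rather than on the concrete service.
>
>```java
>// The abstraction that the test forces into existence:
>interface PaymentGateway {
>    boolean charge(int cents);
>}
>
>// A test double that stands in for the real gateway:
>class AlwaysApprovesGateway implements PaymentGateway {
>    public boolean charge(int cents) { return true; }
>}
>```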

>**JOHN:** In my experience, mocking virtually never changes interfaces;
it just provides replacements for existing (typically immovable)
interfaces.

>**UB:** Our experiences differ.

* Enabling fearless refactoring? BINGO! This is where almost all of the
benefits from unit testing come from, and it is a really, really big deal.

>**UB:** Agreed.

I agree with your conclusion that TDD and bundling are about the
same in terms of providing these benefits.

Now let me explain why I think TDD is likely to result in bad designs.
The fundamental problem with TDD is that it forces developers to work
too tactically, in
units of development that are too small; it discourages design
thinking. With TDD the basic unit of
development is one test: first the test is written, then the code to
make that test pass. However, the natural units for design are larger
than this: a class or method, for example. These units
correspond to multiple test cases. If a developer thinks only about
the next test, they are only considering part of a design problem at
any given time. It's hard to design something well if you don't think
about the whole design problem at once. TDD explicitly
prohibits developers from writing more code than is needed to pass
the current test; this discourages the kind of strategic thinking needed
for good design.

TDD does not provide adequate guidance to encourage design. You mentioned
the Red-Green-Refactor loop, which recommends refactoring after each step,
but there's almost no guidance for refactoring. How should developers
decide when and what to refactor? This seems to be left purely to their
own judgment. For example, if I am writing a method that requires
multiple iterations of the TDD loop, should I refactor after every iteration
(which sounds pretty tedious) or wait until after several iterations so that
I can look at a bigger chunk of code when refactoring and hence be more
strategic? Without guidance, it will be tempting for developers to keep
putting off refactoring.

TDD is similar to the One Thing Rule we discussed earlier in that it is
biased: it provides very strong and clear instructions pushing developers
in one direction (in this case, acting tactically) with only vague
guidance in the other direction (designing more strategically). As a result,
developers are likely to err on the side of being too tactical.

TDD guarantees that developers will initially write bad code. If you start
writing code without thinking about the whole design problem, the first code
you write will almost certainly be wrong. Design only
happens after a bunch of bad code has accumulated.
I watched your video on TDD, and
you repeatedly wrote the wrong code, then fixed it later. If the developer
refactors conscientiously (as you did) they can still end up with good
code, but this works against human nature. With TDD, that bad code will
actually work (there are tests to prove it!) and it's human nature not
to want to change something that
works. If the code I'm developing is nontrivial, I will probably have to
accumulate a lot of bad code with TDD before I have enough code in front
of me to understand what the design should have been.
It will be very difficult for me to force myself to throw away
all that work.

It's easy for a developer to believe they are doing TDD correctly while
working entirely tactically, layering on hack after hack with an
occasional minor refactor, without ever thinking about the overall design.

I believe that the bundling approach is superior to TDD because it focuses
the development process around design: design first, then code, then write
unit tests. Of course, refactoring will still be
required: it's almost never possible to get the design right the first time.
But starting with design will reduce the amount of bad code you write and
get you to a good design sooner. It is possible to produce equally good
designs with TDD; it's just harder and requires a lot more discipline.

**UB:**

I'll address your points one at a time.

* I haven't found that the scale of TDD is so tactical that it discourages thinking. Every programmer, regardless of their testing discipline, writes code one line at a time. That's immensely tactical and yet does not discourage design. So why would one test at a time discourage it?

* The literature on TDD strongly discourages delaying refactoring, and it strongly encourages thinking about design. Both are integral parts of the discipline.

* We all write bad code at the start. The discipline of TDD gives us the opportunity, and the safety, to continuously clean it. Design insights arise from those kinds of cleaning activities. The discipline of refactoring allows bad designs to be transformed, one step at a time, into better designs.

* It's not clear to me why the act of writing tests late is a better design choice. There's nothing in TDD that prevents me from thinking through a design long before I write the very first tested code.

**JOHN:**

You say there is nothing about TDD that stops developers from thinking ahead
about design. This is only partly true. Under TDD I can think ahead, but I
can't actually write my ideas down in the form of code, since that would
violate TDD Rule 1. This is a significant discouragement.

You claim that "thinking about design is strongly encouraged" in TDD,
but I haven't seen this in your discussions of TDD. I watched your
video example of using TDD
for computing bowling scores, and design is never even mentioned after the
first minute or two (ironically, one of the conclusions of this
example is that the brief initial design turned out to be
useless). There is no suggestion of thinking ahead in the video;
it's all about cleaning up messes after the fact.
In all of the TDD materials you have shown me, I have not seen any
warnings about the dangers of becoming so tactical with TDD that
design never occurs (perhaps you don't even view this as a serious risk?).

**UB:**

I usually use an abbreviated form of UML to capture my early design decisions. I have no objection to capturing them in pseudocode, or even real code. However, I would not commit any such pre-written code. I would likely hold it in a text file, and consult it while following the TDD cycle. I might feel safe enough to copy and paste from the text file into my IDE in order to make a failing test pass.

The Bowling game is an example of how wildly our initial design decisions can deviate from our eventual solutions. It's true that introductory videos often do not expose the depth of a discipline.

**JOHN:**

As I was watching your TDD video for the second time, you said something
that jumped out at me:
>Humans consider things that come first to be important and things that
come at the end to be less important and somehow optional; that's
why they are at the end, so we can leave them out if we have to.

This captures perfectly my concern about TDD. TDD insists that tests must
come first, and design, if it happens at all, comes at the end, after
code is working. I believe that good design is the most important
thing, so it must be the top priority. I don't consider tests optional,
but delaying them is safer than delaying design. Writing tests isn't particularly
difficult; the most important thing is having the discipline to do it.
Getting a good design is really hard, even if you are very disciplined;
that's why it needs to be the center of attention.

**UB:**

TDD is a coding discipline. Of course design comes before coding -- I don't know anyone who thinks otherwise. Even the Bowling Game video made that point. But, as we saw in the Bowling Game video, sometimes the code will take you in a very different direction.

That difference doesn't imply that the design shouldn't have been done. It just implies that designs are speculative and may not always survive reality.

As Eisenhower once said:
>β€œIn preparing for battle I have always found that plans are useless, but planning is indispensable.”

**JOHN:**

You ask why writing tests later is a better design choice. It isn't.
The benefit of the bundled approach doesn't come from writing tests later;
it comes from doing design sooner. Writing tests (a bit) later is a
consequence of this choice. The tests are still written pretty early-on
with the bundled approach, so I don't think the delay causes significant
problems.

**UB:**

I think we simply disagree that TDD discourages design. The practice of TDD does not discourage me from design, because I value design. I would suggest that those who do not value design will not design, no matter what discipline they practice.

**JOHN:**

You claim that the problems I worry about with TDD simply don't happen in
practice. Unfortunately I have heard contrary claims from senior
developers that I trust. They complain about horrible code produced by
TDD-based teams, and they believe that the problems were caused by TDD.
Of course horrible code can be produced with any design approach.
And maybe those teams didn't implement TDD properly, or maybe those
cases were outliers.
But the problems reported to me line up exactly with what I would
expect to happen, given the tactical nature of TDD.

**UB:**

My experience differs. I've worked on many projects where TDD has been used
effectively and profitably. I'm sure the senior developers that you trust are telling you the truth about their experience. Having never seen TDD lead to such bad outcomes myself, I sincerely doubt that the blame can be traced to TDD.

**JOHN:**

You ask me to trust your extensive experience with
TDD, and I admit that I have no personal experience with TDD.
On the other hand, I have a lot of experience with tactical programming,
and I know that it rarely ends well.
TDD is one of the most extreme forms of tactical programming I've
encountered.
In general, if "making it work" is the #1 priority, instead of
"developing a clean design", code turns to spaghetti.
I don't see enough safeguards in your approach to TDD
to prevent the disaster scenarios; I don't even see a clear
recognition of the risk.

Overall, TDD is in a bad place on the risk-reward spectrum. In comparison
to the bundling approach, the downside risks for poor code quality in TDD
are huge, and I don't see enough upside reward (if any) to compensate.

**UB:**

All I can say to that is that your opinion is based on a number of false impressions and speculations, and not upon direct experience.

**JOHN:**

Now let me ask you a couple of questions.

First, at a microscopic level, why on earth does TDD prohibit developers
from writing more code than needed to pass the current test? How does
enforcing myopia make systems better?

>**UB:**
The goal of the discipline is to make sure that everything is tested.
One good way to do that is to refuse to write any code unless it is to make a failing test pass. Also, working in such short cycles provides insights into
the way the code is working. Those insights often lead to better design decisions.

>**JOHN:**
I agree that seeing code (partially) working can provide insights. But
surely that benefit can be had without such a severe restriction on
how developers think?

Second, at a broader level, do you think TDD is likely to produce better
designs than approaches that are more design-centric, such as the bundling
approach I described? If so, can you explain why?

>**UB:**
My guess is that someone adept at bundling, and someone adept at TDD would produce very similar designs, with very similar test coverage. I would also venture to guess that the TDDer would be somewhat more productive than the bundler if for no reason other than that the TDDer finds and fixes problems earlier than the bundler.

>**JOHN:**
I think that the bundling approach will result in a better design because
it actually focuses on design, rather than focusing on tests and hoping
that a good design will magically emerge. I think it's really hard to argue
that the best way to achieve one thing is to focus your attention on
something else. And the bundling approach will
make progress faster because the early thinking about design will reduce the
amount of bad code you end up having to throw away under TDD. Overall, I'd
argue that the best-case outcomes for the two approaches will
be about the same, but average and (especially) worst-case outcomes will
be far worse for TDD.

**JOHN:**

I don't think we're going to resolve our disagreements on TDD.
To do that, we'd need empirical data about the frequency of good and bad
outcomes from TDD. Unfortunately I'm not aware of any such data.
Thus, readers will have to decide for themselves whether the potential
benefits of TDD outweigh the risks.

For anyone who chooses to use TDD, I urge you to do so with extreme
caution. Your primary goal must not be just working code, but rather a
clean design that will allow you to develop quickly in the future.
TDD will not lead you naturally to the best design, so you will need
to do significant and continuous refactoring to avoid spaghetti code.
Ask yourself repeatedly "suppose that I knew everything I know now when
I first started on this project; would I have chosen the current
structure for the code?" When the answer is no (which will happen
frequently) stop and refactor. Recognize that TDD will cause you to
write more bad code than you may be used to, so
you must be prepared to throw out and rewrite more than you are used to.
Take time to plan ahead and think about the overall design, rather than
just making the next test work.
If you do all of these things diligently, I think it is possible to
mitigate the risks of TDD and produce well-designed code.

**UB:**

Let's just say that I agree with all that advice, but disagree with your assertion that TDD might be the cause of bad code.

### TDD Summary

**JOHN:**

Here is my attempt to summarize our thoughts on Test-Driven Development:

* We agree that unit tests are an essential element in software development.
They allow developers to make significant changes to a system without fear
of breaking something.

* We agree that it is possible to use TDD to produce systems with good designs.

* I believe that TDD discourages good design and can easily lead to very bad
code. You do not believe that TDD discourages good
design and don't see much of a risk of bad code.

* I believe that there are better approaches than TDD for producing good
unit test suites, such as the "bundling" approach discussed above. You agree
that bundling can produce outcomes just as good as TDD but think it may lead to
somewhat less test coverage.

* I believe that TDD and bundling have similar best-case outcomes, but that
the average and worst-case outcomes will be much worse for TDD. You disagree
and believe that, if anything, TDD may produce marginally better outcomes
than bundling. You also think that preference and personality are larger factors in
making the choice between the two.

**UB:**

This is a fair summary of our discussion. We seem to disagree over the best application
of discipline. I prefer a disciplined approach that keeps the code covered by tests
written first in very short cycles. You prefer a disciplined approach of writing relatively longer
bundles of code and then writing tests for those bundles. We disagree on the risks and rewards of
these two disciplines.

## Closing Remarks

**JOHN:**

First, I'd like to thank you for tolerating (and responding to) the arguments
I have made about some of the key ideas in *Clean Code*. I hope this
discussion will provide food for thought for readers.

We have covered a lot of topics and subtopics in this discussion, but
I think that most of my concerns result from two general errors made
by *Clean Code*: failure to focus on what is important, and failure to
balance design tradeoffs.

In software design (and probably in any design environment) it is essential
to identify the things that really matter and focus on those. If you
focus your attention on things that are unimportant you are
unlikely to achieve the things that really are important.
Unfortunately, *Clean Code* repeatedly focuses on things that don't really
matter, such as:

* Dividing ten-line methods into five-line methods and dividing five-line methods
into two- or three-line methods.
* Eliminating the use of comments written in English.
* Writing tests before code and making the basic unit of development a
test rather than an abstraction.

None of these provides significant value, and we have seen how they
distract from producing the best possible designs.

Conversely, *Clean Code* fundamentally undervalues comments, which are
essential and irreplaceable. This
comes at a huge cost. Without interface comments the specifications for
interfaces are incomplete. This is guaranteed to result in confusion and bugs.
Without implementation comments, readers are forced to rederive knowledge
and intentions that were in the mind of the original developer. This wastes
time and leads to more bugs.

In my opening remarks I said that systems become complex when important
information is not accessible and obvious to developers. By refusing to
write comments, you are hiding important information that you have and
that others need.

The second general error in *Clean Code* has to do with balance. Design
represents a balance between competing concerns. Almost any design idea
becomes a bad thing if taken to the extreme. However, *Clean Code*
repeatedly gives very strong advice in one direction without correspondingly
strong advice in the other direction or any meaningful guidance about how
to recognize when you have gone too far. For example, making methods
shorter is often a good thing, but the *Clean Code* position is so one-sided
and extreme that readers are likely to chop things up too much. We saw
in the `PrimeGenerator` example how this resulted in code that was
nearly incomprehensible. Similarly, the *Clean Code* position on TDD is
one-sided, failing to
recognize any possible weakness and encouraging readers to take this to
a tactical extreme where design is completely squeezed out of the development
process.

**UB:**

John, I'd like to thank you for participating in this project. This was a lot of fun for me. I love disagreement and debate with smart people. I also think that we share far more values than separate us.

For my part I'll just say that I have given due consideration to the points you've made, and while I disagree with your conclusions above, I have integrated several of your better ideas, as well as this entire document, into the second edition of _Clean Code_.

Thanks again, and give my best to your students.