Ivan Kusalic - home page

Testing Confusion

A few hours ago I had an intense but respectful disagreement with a fellow team member. We just couldn’t agree on how testing should be done. While eye-opening, the discussion was quite frustrating, so I’d like to explore this topic a bit more.

Disclaimer: I do not feel comfortable writing about this topic – I’m only starting to vaguely comprehend what testing is all about. That said, all the more reason to write about it.

This is going to be an opinionated post, but I’d really like to hear your thoughts to improve/correct my, quite possibly, erroneous ways.

Relevant background and motivation

For the discussion to make sense, here’s a short background about our interest in testing.


I’ve been struggling to understand how to test properly for quite some time now. Everybody seems to be doing it. People mostly agree that it should be done. But the topic is not simple: it requires discipline and a desire to improve, while offering many pitfalls along the way. It seems like people are not really talking about the same things.

Should you do TDD, BDD or something else? Test before code or after? Minimal improvement steps or implementing the full unit of functionality? How much to test on which level of abstraction?

I can’t fully answer those questions; I’m only gradually starting to form my own opinion. The one thing I know for sure: this is an important topic and I want to understand it properly.

Colleague / company

As I mentioned before, there is this guy – let’s call him Bob – who’s on my team. Bob and I get along well. We like to pair, have similar thoughts on code quality, like the same level of complexity, dislike the same code smells and so on. All in all, we really work well together. I enjoy it and I think Bob does as well.

The only thing that bothers me is that something is amiss when we are writing tests together. Let me correct that: something is also amiss when I’m writing tests for other projects in our code base. Or when I’m reading them. Hm…

For me, the tests we write feel wrong. They seem to do too much. They seem too fragile. Is code coverage really all there is? Do you really have to touch them all when you refactor something?

Are all those tests really wrong? (scary thought)

Meet the ThoughtWorks guy1. Since some of us had expressed doubts about the way testing is currently being done, he’s going to help us understand testing a bit better. There’s only one “problem”: he’s not going to force his opinion on us. He’ll guide us and offer his advice, but we have to embrace it for ourselves.

So there was a presentation with a broad overview of how to test things meaningfully. We learned a lot there. Or at least I hope we did. There is this thing called the Ice-Cream Cone that you should avoid. We learned that what we called integration tests are actually acceptance tests, and that we do not have any integration tests at all. And much more. I really liked it.

So here we are, Bob and I, trying to wrap our heads around this problem, and for once test something in this code base that will make us both happy.

Task at hand

Meet the task at hand: we need to change the temporary directories users (as in Linux users) are using, from the perspective of some Python libraries.

Not taking TDD into consideration, we first wrote the implementation (at least we know how to do that together). Something along these lines:

function to be tested
import os
import tempfile

def change_cache_directory(path='tmp'):
    d = os.path.expanduser(os.path.join('~', path))
    if not os.path.isdir(d):
        os.makedirs(d)  # create the cache directory if it doesn't exist yet
    if not os.access(d, os.W_OK):
        raise IOError("Temp directory {0} not writeable".format(d))
    tempfile.tempdir = d

Just a tidbit of information: there is already an existing class that is going to call this in its constructor (don’t ask). The class has 100% C0 coverage.

So far, so good.

Finally – tests

At this point Bob automatically started to write the tests. A few mocks here and there, verify that each function is called with the exact parameters and ta-da, the function is tested. Oh, and add to all the existing tests some asserts that the constructor calls this function. Done. Phew, that was not really that hard. Only 3 or so minutes passed.
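To make the disagreement concrete, here’s roughly what such a mock-heavy unit test looks like. This is my reconstruction, not our actual test – the paths, names and test class are made up:

```python
import os
import tempfile
import unittest
from unittest import mock


# the function under test, repeated here to keep the example self-contained
def change_cache_directory(path='tmp'):
    d = os.path.expanduser(os.path.join('~', path))
    if not os.path.isdir(d):
        os.makedirs(d)
    if not os.access(d, os.W_OK):
        raise IOError("Temp directory {0} not writeable".format(d))
    tempfile.tempdir = d


class ChangeCacheDirectoryTest(unittest.TestCase):
    def tearDown(self):
        tempfile.tempdir = None  # don't leak the fake path into other tests

    @mock.patch('os.access', return_value=True)
    @mock.patch('os.path.isdir', return_value=True)
    @mock.patch('os.path.expanduser', return_value='/home/bob/tmp')
    def test_sets_tempdir(self, expanduser, isdir, access):
        change_cache_directory('tmp')
        # verify every collaborator was called with the exact arguments
        expanduser.assert_called_once_with(os.path.join('~', 'tmp'))
        isdir.assert_called_once_with('/home/bob/tmp')
        access.assert_called_once_with('/home/bob/tmp', os.W_OK)
        self.assertEqual(tempfile.tempdir, '/home/bob/tmp')
```

Every os call is mocked and every argument is pinned down – which is exactly the part that started to bother me.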

At this point I raise the issue: should this function be unit-tested or integration-tested? We somehow agree that integration tests make more sense.

Now we are about to write our first integration test. Not really sure where to start. We refactored the function to remove the hard-coded directory path and take the new cache directory as an argument.

Since the integration tests are being extracted to a separate file, there is no need for the os module mock anymore. Or is there? And here we are, entering the discussion that lasted for the next hour or so and ended in agreeing to disagree.
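For reference, the refactored function probably looked something like this (a sketch – the exact signature and docstring are my reconstruction):

```python
import os
import tempfile


def change_cache_directory(cache_dir):
    """Point Python's tempfile module at cache_dir, supplied by the caller."""
    if not os.path.isdir(cache_dir):
        os.makedirs(cache_dir)  # create the cache directory if needed
    if not os.access(cache_dir, os.W_OK):
        raise IOError("Temp directory {0} not writeable".format(cache_dir))
    tempfile.tempdir = cache_dir
```

The caller – the constructor mentioned above – now decides where the cache lives, which also makes the function easier to exercise from a test.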

Disagreement – summary

In short: I liked basically all of the ThoughtWorks guy’s points. They just made sense to me. All the previous reading I had done seemed to support them. I’ve even compiled the points into a short summary gist.

Bob agreed with some of the points in theory, but for him they do not make sense in practice, in a real and imperfect world.

In the end we decided to make a list of things we disagree on:

  • What to mock? Minimal possible set? Are mocks evil?
  • What to assert? Minimal consequences vs everything used?
    • Nice idea vs practical idea?
  • Single responsibility vs line count?
  • What is single responsibility of the test?
    • Spanning one vs more functions?
  • Assert everything necessary only once vs in multiple tests?
  • Assert unused mocks?
    • @assert_unused_mocks

We realised we do not agree about testing even at the most basic level.

All my points boil down to a test that should:

  • make minimal set up (given)
  • execute the code (when)
  • verify only direct consequences (then)
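Applied to our function, a test in that style might look like this (a sketch, assuming the refactored signature that takes the cache directory as an argument):

```python
import os
import tempfile
import unittest


# the function under test, repeated here so the example is self-contained
def change_cache_directory(cache_dir):
    if not os.path.isdir(cache_dir):
        os.makedirs(cache_dir)
    if not os.access(cache_dir, os.W_OK):
        raise IOError("Temp directory {0} not writeable".format(cache_dir))
    tempfile.tempdir = cache_dir


class ChangeCacheDirectoryIntegrationTest(unittest.TestCase):
    def tearDown(self):
        tempfile.tempdir = None  # don't leak state into other tests

    def test_tempfiles_land_in_the_new_directory(self):
        # given: a real writeable directory on disk, no mocks
        cache_dir = os.path.join(tempfile.mkdtemp(), 'cache')

        # when: execute the code
        change_cache_directory(cache_dir)

        # then: verify only the direct consequence
        with tempfile.NamedTemporaryFile() as f:
            self.assertEqual(os.path.dirname(f.name), cache_dir)
```

Notice there is nothing here that breaks if the internals of the function change, as long as temp files end up in the right place.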

On the other hand, he wants to assert everything all the time. We have a change in the constructor? Add direct asserts to all the tests that instantiate the object. If this affects 7 tests, not a problem, change all 7. Ok, if there were 20+ tests, maybe it’s not necessary to assert it in all of them. Maybe.
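Concretely, Bob’s style means every test that instantiates the object also asserts on the constructor’s collaborators. A sketch with hypothetical names (the class and the stub are made up for illustration):

```python
import unittest
from unittest import mock


def change_cache_directory(path='tmp'):
    """Stub standing in for the real implementation shown earlier."""
    pass


class CacheUser(object):
    """Stand-in for the existing class whose constructor calls the function."""
    def __init__(self):
        change_cache_directory('tmp')


class CacheUserTest(unittest.TestCase):
    def test_constructor_changes_cache_directory(self):
        # in this style, every test that instantiates CacheUser
        # would grow an assert like the one below
        with mock.patch(__name__ + '.change_cache_directory') as change_dir:
            CacheUser()
        change_dir.assert_called_once_with('tmp')
```

Multiply that assert across every test touching the constructor and you get the brittleness I’m worried about – or, in Bob’s view, the safety net.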

I really feel the need to oppose this. In my opinion, this makes the tests brittle and impedes refactoring.

In his opinion this makes refactoring safer.

No conclusion in sight

In the end we couldn’t agree. That was really shocking to me. I just couldn’t convince him, no matter which angle I tried. (He couldn’t convince me either.) And usually we just agree…

I guess that’s just proof that there is more to this topic than meets the eye.

What are your thoughts and experiences with testing? What does it mean to test something properly? Send me an email.

  1. Yes, we have a ThoughtWorks guy (really great guy, BTW) – you should get one as well