I’m Pretty Sure Most of Us Are Wrong about XCTestCase tearDown…

October 18, 2016

Since 2001, I’ve relied on an understanding of test execution flow in xUnit frameworks. But somewhere along the way, my understanding of the XCTestCase life cycle got messed up. I picked up an assumption that’s just wrong.

At best, it’s an assumption that can bloat our test runs. At worst, it can wreak havoc.

When is an XCTestCase deallocated?

This is the big question. I had it right for a long time, but then I lost it.

You see, when Objective-C moved away from manual memory management, we no longer had to release our objects. ARC is magical. Except for retain cycles, we got used to thinking that objects disappeared on their own.

Maybe I got lazy at that point. But I think it affected others, too. When people started writing test cases in Swift, I saw a common question:

“Why bother with setUp and tearDown?”

Why bother, indeed? For example, when creating your test fixture, what’s wrong with this way of making the System Under Test (SUT):

import XCTest

class ThingTests: XCTestCase {
    let sut = Thing()

    func testSomethingOnSUT() {}
    func testSomethingElseOnSUT() {}
}

Isn’t this more “Swifty”?

setUp and tearDown still matter

I resisted this trend. No, I said, create your System Under Test in setUp, and release it in tearDown. Like this:

class ThingTests: XCTestCase {
    var sut: Thing!

    override func setUp() {
        super.setUp()
        sut = Thing()   // create a fresh SUT for each test
    }

    override func tearDown() {
        sut = nil       // release the SUT before the next test runs
        super.tearDown()
    }

    func testSomethingOnSUT() {}
    func testSomethingElseOnSUT() {}
}

Why? I wanted to improve reporting of object life cycle problems. I’ve written xUnit frameworks before. So I knew that the control flow went something like this:

  1. try
  2. Call setUp
  3. Execute the test method
  4. If any assertions were not satisfied, report a test error
  5. Call tearDown
  6. catch and report any problem as a test failure

I want the creation and destruction of the SUT to fall within the try-catch scope. That way, if an exception is thrown, XCTest will report it as a failure of a specific test.

Even if it’s a crash, I can guarantee that the test log will show which test had started but not completed.
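To make that control flow concrete, here is a minimal sketch in plain Swift. It is not XCTest's implementation: `MiniTestCase`, `BrokenFixtureTests`, `FixtureError`, and `runTest` are hypothetical names invented for illustration, and thrown Swift errors stand in for the exceptions a real framework would catch.

```swift
// Hypothetical stand-in for a test case class; not XCTest's real type.
class MiniTestCase {
    func setUp() throws {}
    func tearDown() throws {}
}

struct FixtureError: Error {}

// A test case whose fixture creation fails inside setUp.
final class BrokenFixtureTests: MiniTestCase {
    override func setUp() throws { throw FixtureError() }
}

// Runs one test inside do/catch, so any error is reported against that test.
// Simplification: a real framework would still call tearDown after a failure.
func runTest(named name: String, on testCase: MiniTestCase,
             body: () throws -> Void) -> String {
    do {
        try testCase.setUp()
        try body()
        try testCase.tearDown()
        return "\(name): passed"
    } catch {
        return "\(name): failed"
    }
}
```

Because setUp runs inside the do block, a fixture that blows up is pinned on the test that was running: `runTest(named: "testA", on: BrokenFixtureTests(), body: {})` returns "testA: failed" instead of aborting the whole run.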

I thought I had it nailed. My results were good. But my understanding was incomplete…

No really, when is an XCTestCase deallocated?

Then one day, Benjamin Encz startled me with this tweet:

Wait, what? The XCTestCase is never really deallocated?

I ran an experiment and confirmed Benjamin’s observations. Wow, I thought, Apple screwed up XCTest in a big way.

…Until I realized I was wrong. I’d forgotten how xUnit frameworks are usually written. The question of deallocation was masking a different question:

When is an XCTestCase allocated?

How xUnit works

The xUnit frameworks share a common architecture:

  • Tests are composed into suites.
  • There may be a way to manually create individual tests, and group them into suites, but…
  • A TestCase class is an easier way to go. It contains methods annotated (in some way) as tests. Each test method becomes a single test, within a suite that corresponds to the TestCase.

XCTest queries the runtime for all subclasses of XCTestCase. For each such class, it queries for all methods which

  • have no arguments,
  • return no results, and
  • have names starting with the prefix “test”.

The assumption I mistakenly picked up was that to run a test,

  1. The specific XCTestCase subclass was instantiated
  2. The specific test method was invoked (in the try-catch scope)
  3. The XCTestCase was destroyed

This is wrong. I haven’t read the XCTest source, but if it’s anything like the older xUnit frameworks, this is what really happens:

  1. XCTest queries the runtime for XCTestCase / test method combinations.
  2. For each combination, a new test case is instantiated, associated with that specific test method.
  3. These are all aggregated into a collection.
  4. XCTest iterates through this collection of test cases.

In other words, it builds up the entire set of XCTestCase instances before running a single test.
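A small simulation can show the ordering. This is a sketch in plain Swift, assuming XCTest behaves like the older xUnit frameworks described above; `EventLog`, `MiniThingTests`, and `runSuite` are hypothetical names, and `Thing` here is a stand-in that only records when it is created and destroyed.

```swift
// Collects events so the order of creation and execution is visible.
final class EventLog { var entries: [String] = [] }

// Stand-in for the System Under Test.
final class Thing {
    private let log: EventLog
    init(log: EventLog) {
        self.log = log
        log.entries.append("Thing.init")
    }
    deinit { log.entries.append("Thing.deinit") }
}

// Hypothetical miniature of a test case: one instance per test method,
// with the SUT created as part of the instance's initialization.
final class MiniThingTests {
    let name: String
    let log: EventLog
    let sut: Thing
    init(name: String, log: EventLog) {
        self.name = name
        self.log = log
        self.sut = Thing(log: log)
    }
    func run() { log.entries.append("run \(name)") }
}

func runSuite() -> [String] {
    let log = EventLog()
    // Build the whole collection of test cases first...
    let suite = ["testA", "testB"].map { MiniThingTests(name: $0, log: log) }
    // ...and only then iterate over it.
    for test in suite { test.run() }
    return log.entries
}
```

The first four entries returned by `runSuite()` are "Thing.init", "Thing.init", "run testA", "run testB". Both SUTs exist before the first test runs.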

Danger Zone!

setUp and tearDown were invented because the entire collection of test cases is created up front. They provide the hooks to manage object life cycle in tests.

This has two important implications:

  1. Anything automatically created as part of the XCTestCase’s initialization will exist too soon.
  2. Anything not released in the tearDown will continue to exist, even while other tests run.

Think of any object that alters global state, and shudder.

Let’s say you have an object that does some method swizzling when it’s created. It restores the original method when it’s deallocated. If you created this object in setUp but didn’t release it in tearDown, it will continue to exist. So the method swizzling done for one test will inadvertently affect all remaining tests!
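Here is a hedged sketch of that scenario in plain Swift. Real method swizzling needs the Objective-C runtime, so a simple `GlobalState` object stands in for the swizzled method; `GlobalState`, `Patcher`, `MiniCase`, and `greetingSeenByNextTest` are all hypothetical names made up for this example.

```swift
// Stand-in for process-wide state (think: a swizzled method).
final class GlobalState { var greeting = "original" }

// Patches the state when created, restores it when deallocated.
final class Patcher {
    private let state: GlobalState
    private let saved: String
    init(state: GlobalState) {
        self.state = state
        saved = state.greeting
        state.greeting = "patched"
    }
    deinit { state.greeting = saved }
}

// Miniature test case that holds the patcher as part of its fixture.
final class MiniCase {
    var patcher: Patcher?
    func setUp(_ state: GlobalState) { patcher = Patcher(state: state) }
    func tearDown() { patcher = nil }   // releasing triggers the restore
}

// Returns the greeting that a later test would observe.
func greetingSeenByNextTest(releaseInTearDown: Bool) -> String {
    let state = GlobalState()
    let first = MiniCase()
    first.setUp(state)
    if releaseInTearDown { first.tearDown() }
    let seen = state.greeting           // what the next test would read
    // Keep the first case alive, as the framework's collection would.
    withExtendedLifetime(first) {}
    return seen
}
```

With `releaseInTearDown: true` the next test sees "original". With `false` it sees "patched", because the first test case, and the patcher it still holds, outlives its own test.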

Conclusion

Let’s boil this down to a rule of thumb:

Every object you create in setUp should be destroyed in tearDown.

Remember, that’s not the only thing to do in tearDown. It’s there so we can reset things back to a clean state. What counts as global state? It includes:

  • Method swizzling
  • File system
  • User defaults
  • Database
  • Keychain
  • Really, any sort of Dependency Injection via Ambient Context

These must all be cleaned up.
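As one example from that list, here is a sketch of cleaning up user defaults. It assumes the test uses its own defaults suite rather than the standard one; `SettingsTests`, the suite name, and the "flag" key are hypothetical.

```swift
import XCTest

final class SettingsTests: XCTestCase {
    // Hypothetical suite name, used only by these tests.
    private let suiteName = "SettingsTests"
    private var defaults: UserDefaults!

    override func setUp() {
        super.setUp()
        defaults = UserDefaults(suiteName: suiteName)
    }

    override func tearDown() {
        // Wipe whatever the test stored, then release the object.
        defaults.removePersistentDomain(forName: suiteName)
        defaults = nil
        super.tearDown()
    }

    func testStoresFlag() {
        defaults.set(true, forKey: "flag")
        XCTAssertTrue(defaults.bool(forKey: "flag"))
    }
}
```

Using a dedicated suite keeps the cleanup narrow: tearDown removes only what these tests wrote, and nothing in the app's standard defaults is touched.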

That’s also a frightening list of global state! Unit tests will do better by avoiding them if possible.

Have you been missing anything in tearDown? Let us know in the comments below, so we can all learn from each other.

Jon Reid


I've been practicing Test Driven Development (TDD) since 2001. Learn more on my About page.

8 responses to I’m Pretty Sure Most of Us Are Wrong about XCTestCase tearDown…

  1. So you did have it nailed, but didn’t understand why. Is that what you are saying?

    For an explanation, see
    http://stackoverflow.com/questions/21038375/what-is-the-purpose-of-xctestcases-setup-method#21040654
    ……………………
    There are actually two setUp methods:

    + (void)setUp;
    - (void)setUp;

    The class method (+ (void)setUp) is only run once during the entire test run.

    The instance method (- (void)setUp) is the one in the default template; it’s run before every single test.
    ……………

    • Nenad, it’s never been a question that -setUp is called once before each test. The important question is how long does something live in the test fixture?

      The only purpose of the class method variations +setUp and +tearDown is to optimize speed. A typical case is setting up and tearing down a database to be reused across tests in the suite. But note the risk, that it creates global reused state.

      So don’t use the class methods without measuring for time improvement. And you should never use them for unit tests.

  2. Wow… this will help me track memory leaks (retain cycles) directly from unit tests.

    Thank you Jon

    • Be careful: this won’t reveal retain cycles automatically. But it does help. To check for retain cycles, I set a breakpoint on dealloc or deinit of the SUT. When the SUT is released in tearDown, the breakpoint should be hit. If not, there’s a retain cycle. It could be a problem in either test code or production code.
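A different technique from the breakpoint described above is to hold a weak reference and check that it becomes nil once the last strong reference is gone. The sketch below is plain Swift; `Owner`, `Child`, and `survivesRelease` are hypothetical names for illustration.

```swift
final class Owner { var child: Child? }
final class Child { var owner: Owner? }   // strong back-reference: cycle risk

// Creates an object, drops the only strong reference to it,
// and reports whether it is still alive afterwards.
func survivesRelease(_ make: () -> AnyObject) -> Bool {
    weak var probe: AnyObject?
    do {
        let sut = make()   // plays the role of setUp
        probe = sut
    }                      // plays the role of tearDown: sut is released
    return probe != nil    // still alive means something else retains it
}
```

A plain `Owner()` is deallocated, so `survivesRelease { Owner() }` is false. An `Owner` and `Child` that point at each other keep each other alive, so the same check returns true.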

  3. This is how it works with Swift Corelibs XCTest as well:

    https://github.com/apple/swift-corelibs-xctest

  4. I think it might be worth mentioning that

    + (void) setUp
    + (void) tearDown

    and the initialization idiom you cite as the anti-pattern at the start of this piece do have an important role: shared test fixtures.

    Sometimes, you may have a test that requires a lot of work in setUp. That’s a code smell, and it suggests that you’re not actually writing a unit test, but once in a while you can’t avoid it. Some examples:

    – testing objects instantiated from a nib
    – testing objects (e.g. images) conveniently initialized from a file
    – lifecycle testing of compound objects (e.g. does closing your main window actually release everything)
    – testing objects that need to do a lot of work to initialize themselves (for example, objects that calculate the first 9000 digits of pi in their init method)

    In cases like these, it may be better to share a single test object across several different tests, and require each test to restore the object to its standard state.
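A shared fixture along those lines might look like the sketch below. `ExpensiveThing` and its `reset()` method are hypothetical stand-ins for an object that is costly to build. The class-level setUp and tearDown run once for the whole class, while the instance-level tearDown restores the standard state after each test.

```swift
import XCTest

// Hypothetical object that is expensive to create.
final class ExpensiveThing {
    var counter = 0
    func reset() { counter = 0 }
}

final class ExpensiveThingTests: XCTestCase {
    // Shared by every test in this class: deliberately global, reused state.
    private static var shared: ExpensiveThing!

    override class func setUp() {
        super.setUp()
        shared = ExpensiveThing()          // built once for the class
    }

    override class func tearDown() {
        shared = nil                       // released after the last test
        super.tearDown()
    }

    override func tearDown() {
        ExpensiveThingTests.shared.reset() // each test restores the state
        super.tearDown()
    }

    func testIncrement() {
        ExpensiveThingTests.shared.counter += 1
        XCTAssertEqual(ExpensiveThingTests.shared.counter, 1)
    }

    func testStartsAtZero() {
        XCTAssertEqual(ExpensiveThingTests.shared.counter, 0)
    }
}
```

Both tests pass in either order only because every test resets the shared object. Forget the reset once and the tests become order-dependent, which is the risk of shared fixtures.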