How to Safely Parse JSON into Swift Immutable Models with TDD 

 April 11, 2017

by Jon Reid


How can we unit test JSON parsing, handling every possible error? Can we generate immutable models? And for Swift, how can we keep our Response Models free of optionals?

Of course, there are many JSON parsing libraries out there. Plug one in, define all fields as non-optional, and you’re good to go! …Until your app crashes, because something was different in the actual JSON data.

Unlikely? “The backend team would never do that to us”? I’ve had a released app crash because the backend folks changed one field from a string to an integer. I’ve seen app development and QA forced to pause because a commit assumed all fields were non-optional. (It crashed on the missing field, because hey, Swift.)

So let’s look at a pattern that will help us

  • Handle required types
  • Avoid optionals
  • Deliver immutable models

Even if you never plan to do your own parsing, we’ll learn things along the way about design and testing.

Note: These days, I just use Decodable. But read on if you want to learn about construction builders.

Problems with All-in-One Unit Tests

I often see unit tests like this for JSON parsing:

func test_parseJSON() {
    let json = // Read JSON saved in external file
    let actual = Model.parse(json)
    let expected = Model(
        // Define each field, including each subcomponent,
        // to match the JSON
    )
    XCTAssertEqual(actual, expected)
}

This may be a fine acceptance test. But we can do better for unit testing. Here are the things I find problematic:

  1. Large, often externalized input. I discuss this at the very beginning of my JSON parsing screencast.
  2. The need to define Equatables all the way down. I discuss this in Let’s Stop Overusing Swift Equatables in Unit Tests.
  3. Hunting for mismatches. When a mismatch occurs, the assertion fails. But the failure message will not tell us which field was wrong. We’ll have to go hunting because the feedback is too coarse. A good unit test gives precise feedback.
  4. No way to grow a solution. One of the benefits of TDD is emergent design, where you gradually grow and refine a solution. An all-in-one test isn’t friendly toward emergent design. It all works, or it doesn’t.
  5. Difficult to test errors. We should test one error scenario at a time. But this is tricky with large input. We’ll end up copying and pasting the good input, then mutating one small piece of it. It’s the worst kind of duplication: “Everything’s the same, except for this one bit.”

These problems all point to the same thing: the test is too big. How can we break it into smaller parts?

Construction Builder to the Rescue

Disclosure: The book links below are affiliate links. If you buy anything, I earn a commission, at no extra cost to you.

“I want to combine individual parts into an immutable composite.” This leads us to the Construction Builder pattern. Despite the similar name, this is not the same as the GoF Builder pattern. Construction Builder comes from Domain Specific Languages by Martin Fowler.

Here’s the one-line description of the pattern:

Incrementally create an immutable object with a builder that stores constructor arguments in fields.

In a Construction Builder, each field can be set independently of the other fields.

This is the clue we need. We can first parse a flat structure into a set of fields. “Parse this field. Then that other field.” The tests can focus on one field at a time. And we’ll have tests to drive handling of unexpected JSON types.

Finally, everything is assembled into an immutable object. The build step can confirm that we have every field required.

A Simple, Flat JSON example

Here, ResponseModel has two immutable fields. We define a corresponding ResponseBuilder, but its fields are mutable and optional.

The parse method decides how to set each field. It will:

  • Set the field if the dictionary provides the desired type.
  • (Optional) Attempt to coerce the input into the desired type.

The build method creates the ResponseModel. It will:

  • Determine if it has all required fields. If not, either return nil or raise an exception.
  • Call the initializer. It can also do further conversion from the JSON-like fields to other data types.

Simple Parse

Let’s say the first field is numeric. Here’s how I’d parse it in Swift:

struct ResponseBuilder {
    var field1: Int?

    mutating func parse(dictionary dict: [String: Any]) {
        field1 = dict["field1"] as? Int
    }
}

It’s straightforward: If there is a dictionary entry, and it can be downcast to an Int, do so.

In Objective-C, things are wordier. Having TDD’d the need for fields of different types, here’s part of what I refactored to:

static id requireType(id object, Class type)
{
    if (![object isKindOfClass:type])
        return nil;
    return object;
}

NSNumber *QCORequireNumber(id object)
{
    return requireType(object, [NSNumber class]);
}

@implementation QCOResponseBuilder
- (void)parseDictionary:(NSDictionary *)dict
{
    self.field1 = QCORequireNumber(dict[@"field1"]);
}
@end

We can potentially extend QCORequireNumber to handle the string-to-number conversion. Remember my story of the shipping app that started crashing due to a JSON type change? After that, I began adding forward compatibility with defensive string-to-number and number-to-string converters.
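To make the idea of defensive converters concrete, here is a minimal Swift sketch. The helper names requireInt and requireString are hypothetical and do not come from the original project; they only illustrate accepting either a number or a numeric string for the same field.

```swift
// Hypothetical helpers for illustration; not part of the original project.

/// Returns an Int if the value is an Int, or a String holding a plain integer.
func requireInt(_ value: Any?) -> Int? {
    if let number = value as? Int {
        return number
    }
    if let text = value as? String {
        return Int(text)  // nil when the string is not a plain integer
    }
    return nil
}

/// Returns a String if the value is a String, or an Int rendered as text.
func requireString(_ value: Any?) -> String? {
    if let text = value as? String {
        return text
    }
    if let number = value as? Int {
        return String(number)
    }
    return nil
}

// The backend sent field1 as a string and field2 as a number.
let lenientInput: [String: Any] = ["field1": "42", "field2": 7]
let lenientField1 = requireInt(lenientInput["field1"])
let lenientField2 = requireString(lenientInput["field2"])
```

With helpers like these, a type change on the backend degrades into a successful conversion or a nil field, rather than a crash.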

Simple Build

Once parsing finishes, the build method can check to see if it has all required fields. In this Swift example, we return nil if anything is missing. But if all is well, we call the initializer:

func build() -> ResponseModel? {
    guard let field1 = field1,
          let field2 = field2 else {
        return nil
    }
    return ResponseModel(field1: field1, field2: field2)
}

Here’s the same thing in Objective-C, with a slight twist. field1 is an NSNumber in the builder, so a nil value means “we didn’t get valid input for field1.” But in the response model, it’s easier to use C types instead of NSNumber. We’ll have build handle that final conversion:

- (QCOResponseModel *)build
{
    if (!self.field1 || !self.field2)
        return nil;
    return [[QCOResponseModel alloc]
        initWithField1:self.field1.integerValue field2:self.field2];
}
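Putting the Swift parse and build steps together, a complete flat example might look like the sketch below. The type of field2 is never stated, so String is an assumption made only so the code compiles on its own.

```swift
// Immutable model: every field is a non-optional `let`.
struct ResponseModel {
    let field1: Int
    let field2: String  // assumption: the type of field2 is not given
}

// Builder: a mutable, optional mirror of the model's fields.
struct ResponseBuilder {
    var field1: Int?
    var field2: String?

    // Capture each field only if the dictionary provides the desired type.
    mutating func parse(dictionary dict: [String: Any]) {
        field1 = dict["field1"] as? Int
        field2 = dict["field2"] as? String
    }

    // Create the model only when every required field was captured.
    func build() -> ResponseModel? {
        guard let field1 = field1,
              let field2 = field2 else {
            return nil
        }
        return ResponseModel(field1: field1, field2: field2)
    }
}

// Well-formed input produces a model.
var completeBuilder = ResponseBuilder()
completeBuilder.parse(dictionary: ["field1": 1, "field2": "two"])
let completeModel = completeBuilder.build()

// A wrong type for field1 leaves it nil, so build() returns nil.
var wrongTypeBuilder = ResponseBuilder()
wrongTypeBuilder.parse(dictionary: ["field1": "1", "field2": "two"])
let rejectedModel = wrongTypeBuilder.build()
```

Optionals live only in the builder. Code that receives a ResponseModel never has to unwrap anything.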

What About Nested JSON Data?

So far, we’ve only looked at a flat example. How can we use Construction Builders to parse JSON for more complex data?

The MarvelBrowser TDD projects call the Marvel API for information about comic characters. Here’s an example response:

{
    "code": 200,
    "status": "Ok",
    "data": {
        "offset": 20,
        "total": 22,
        "results": [
            { "name": "Cyclops" },
            { "name": "Phoenix" }
        ]
    }
}

The data section represents paging. There are 22 total results, and we’re skipping over the first 20. There are other fields, but I’m only showing a subset. And in the array of results for each character, I’m only showing the character name.

For each nested data structure, we define nested builders:

Let’s focus for now on the object at the top center. In CharactersSliceResponseBuilder, offset and total each have their own properties for incremental construction. But what is the type of results? It’s an array… of what?

It turns out to be an array of builders for the next level:

let offset: Int?
let total: Int?
let results: [CharacterResponseBuilder]?

Tests to Drive Nested Data

While writing the Swift version, I discovered that I could simplify the code by changing the parse methods to initializers.

Then, I needed tests to drive handling of nested data. First, I wanted to handle an array with a single item:

func test_createWithOneResult_shouldCaptureOneCharacter() {
    let dict: [String: Any] = ["results": [
            ["name": "ONE"],
    ]]
    let sut = CharactersSliceResponseBuilder(dictionary: dict)
    XCTAssertEqual(sut.results?.count, 1)
    XCTAssertEqual(sut.results?[0].name, "ONE")
}

With TDD, the production code to pass this test has no loops. We tell ourselves, “Forget looping for now. Let’s solve this little bit first.”

The second test has two items. This drives us to iterate the array, not just take the first element.

func test_createWithTwoResults_shouldCaptureTwoCharacters() {
    let dict: [String: Any] = ["results": [
            ["name": "ONE"],
            ["name": "TWO"],
    ]]
    let sut = CharactersSliceResponseBuilder(dictionary: dict)
    XCTAssertEqual(sut.results?.count, 2)
    XCTAssertEqual(sut.results?[0].name, "ONE")
    XCTAssertEqual(sut.results?[1].name, "TWO")
}

Once this second test passes, we can delete the first test. It was a “scaffolding” test, and is no longer needed.

If the input isn’t an array, the captured array should be nil:

func test_createWithNonArrayResult_shouldCaptureNothing() {
    let dict: [String: Any] = ["results": ["name": "BOGUS"]]
    let sut = CharactersSliceResponseBuilder(dictionary: dict)
    XCTAssertNil(sut.results)
}

Finally, what if the input is an array, but not everything inside is a dictionary? Let’s ignore non-dictionary entries:

func test_createWithTwoResultsButFirstNotDictionary_shouldCaptureValidSecondResult() {
    let dict: [String: Any] = ["results": [
            "DUMMY", // any non-dictionary entry
            ["name": "TWO"],
    ] as [Any]]
    let sut = CharactersSliceResponseBuilder(dictionary: dict)
    XCTAssertEqual(sut.results?.count, 1)
    XCTAssertEqual(sut.results?[0].name, "TWO")
}

I encourage you to try TDDing the initializer. Do one test at a time, writing the simplest code that passes. Then see what you can refactor. Remember to keep everything green during refactoring!
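For comparison after you have tried it yourself, here is one possible shape for initializers that satisfy the tests above. It is a sketch, not necessarily the code in MarvelBrowser. The builder and property names come from the tests, and the body is an assumption.

```swift
struct CharacterResponseBuilder {
    let name: String?

    init(dictionary dict: [String: Any]) {
        name = dict["name"] as? String
    }
}

struct CharactersSliceResponseBuilder {
    let offset: Int?
    let total: Int?
    let results: [CharacterResponseBuilder]?

    init(dictionary dict: [String: Any]) {
        offset = dict["offset"] as? Int
        total = dict["total"] as? Int
        if let array = dict["results"] as? [Any] {
            // Keep only the entries that are dictionaries,
            // wrapping each one in a sub-builder.
            results = array
                .compactMap { $0 as? [String: Any] }
                .map { CharacterResponseBuilder(dictionary: $0) }
        } else {
            // Missing, or not an array: capture nothing.
            results = nil
        }
    }
}

let twoResults = CharactersSliceResponseBuilder(dictionary: [
    "results": [["name": "ONE"], ["name": "TWO"]],
])
let nonArray = CharactersSliceResponseBuilder(dictionary: [
    "results": ["name": "BOGUS"],
])
let mixedEntries = CharactersSliceResponseBuilder(dictionary: [
    "results": ["DUMMY", ["name": "TWO"]] as [Any],
])
```

Because every property is assigned exactly once in the initializer, the builder's fields can be declared with let even though they are optional.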

Building a Response Model with Nested Data

What does the build method look like when we have an array of builders? Basically, we call build on each sub-builder, and put the results into an array:

func build() -> CharactersSliceResponseModel? {
    guard let offset = offset,
          let total = total else {
        return nil
    }
    return CharactersSliceResponseModel(offset: offset, total: total, characters: buildCharacters())
}

private func buildCharacters() -> [CharacterResponseModel] {
    // compactMap drops any sub-builder whose build() returns nil
    return results?.compactMap { $0.build() } ?? []
}

I arrived at this production code thanks to various tests which investigate CharactersSliceResponseModel. The tests focus on the characters array. Now the initializer requires offset and total, but the tests don’t care about those fields. Specifying them over and over in the tests is a problem, because

  • The repetition is noise. It hides the important input, namely the character names.
  • What happens when we add more required fields? I don’t want to have to change every single test.

To solve this, I extracted a test helper:

private func addRequiredFields(to dict: [String: Any]) -> [String: Any] {
    var dictPlusData = dict
    dictPlusData["offset"] = 0
    dictPlusData["total"] = 0
    return dictPlusData
}

Here’s an example of a test that uses it:

func test_build_withRequiredFieldsButNoResults_shouldHaveNoCharacters() {
    let dict = addRequiredFields(to: [:])
    let sut = CharactersSliceResponseBuilder(dictionary: dict)
    let response = sut.build()
    XCTAssertEqual(response?.characters.count, 0)
}

I’d still have to change a few tests if I add a new required field. But thanks to the helper, not that many.

Final JSON Edge Cases to Consider

Here’s that diagram again of the relationship between the Construction Builders and the Response Models:

I want to guard against a few more edge cases:

  • What if the FetchCharactersResponseBuilder receives no code?
  • What if it receives a success code of 200, but has no data?

These both indicate a problem in the data provided by the backend service. So what I’ll do is pretend I received a 500 code for “Internal Server Error”.

For Objective-C, the outermost response model contains the code, the status, and the slice. So the edge cases above can be handled by the FetchCharactersResponseBuilder.

But there’s one last edge case: What if the JSON is malformed and can’t be parsed?

The outermost Construction Builder is initialized from a dictionary. The JSON is already parsed before then. Let’s initiate the parsing from a standalone function:

QCOFetchCharactersResponseModel *QCOParseFetchCharactersJSONData(NSData *jsonData)
{
    id object = [NSJSONSerialization JSONObjectWithData:jsonData options:0 error:NULL];
    // initWithDictionary: is an assumed initializer name
    QCOFetchCharactersResponseBuilder *builder = [[QCOFetchCharactersResponseBuilder alloc]
            initWithDictionary:object];
    return [builder build];
}

If the JSON parsing fails, object will be nil. But a builder will still be created. When we call build on it, there will be no code set because there was nothing to parse. The end result, validated by a test, is that we’ll get a 500 error code:

- (void)test_parseWithMalformedJSON_shouldReturnCode500
{
    NSString *json = @"{\"cod";
    NSData *jsonData = [json dataUsingEncoding:NSUTF8StringEncoding];
    QCOFetchCharactersResponseModel *response = QCOParseFetchCharactersJSONData(jsonData);
    assertThat(@(response.code), is(@500));
}

But it’s best to package the response differently in Swift. We don’t want downstream code to check whether the code is 200 or not. We simply want success or failure. (And for a successful response, we really don’t care about the status message.) So the Swift version of the FetchCharactersResponseModel is

typealias FetchCharactersResponseModel = Result<CharactersSliceResponseModel>

The FetchCharactersResponseBuilder handles one edge case. When there is no data, we’ll return a “Bad data” error:

func build() -> FetchCharactersResponseModel {
    return data?.build()
                .flatMap { .success($0) } ?? .failure("Bad data")
}

It presumes that the code has already been confirmed to be 200. So the standalone function takes care of:

  • JSON parsing,
  • checking that the result is a dictionary, and
  • looking for code and status.
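The three responsibilities listed above can be sketched as a single Swift function. Everything named here is hypothetical: ParseResult stands in for the one-parameter Result type, CharactersSlice stands in for CharactersSliceResponseModel, and the failure messages are invented. To stay self-contained, the sketch reads offset and total inline where the real code would hand the data dictionary to a builder.

```swift
import Foundation

// Hypothetical stand-ins so the sketch compiles on its own.
enum ParseResult<T> {
    case success(T)
    case failure(String)
}

struct CharactersSlice {
    let offset: Int
    let total: Int
}

func parseFetchCharacters(jsonData: Data) -> ParseResult<CharactersSlice> {
    // JSON parsing, and checking that the result is a dictionary.
    guard let object = try? JSONSerialization.jsonObject(with: jsonData, options: []),
          let dict = object as? [String: Any] else {
        return .failure("Bad JSON")
    }
    // Looking for code and status.
    guard let code = dict["code"] as? Int else {
        return .failure("Missing code")
    }
    guard code == 200 else {
        return .failure(dict["status"] as? String ?? "Code \(code)")
    }
    // The code is confirmed to be 200, so the data section can be built.
    guard let data = dict["data"] as? [String: Any],
          let offset = data["offset"] as? Int,
          let total = data["total"] as? Int else {
        return .failure("Bad data")
    }
    return .success(CharactersSlice(offset: offset, total: total))
}

// Malformed JSON, like the Objective-C test input, yields a failure.
let malformedResult = parseFetchCharacters(jsonData: Data("{\"cod".utf8))

// A well-formed response yields the slice.
let goodJSON = "{\"code\": 200, \"status\": \"Ok\", \"data\": {\"offset\": 20, \"total\": 22}}"
let goodResult = parseFetchCharacters(jsonData: Data(goodJSON.utf8))
```

Downstream code then switches over success or failure and never inspects the numeric code.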


By using the Construction Builder pattern, we separated parsing (incremental construction) from building (an immutable Response Model).

Even if you avoid all this by using a JSON library, I hope this exercise has illustrated some TDD principles:

  • If something is too big to handle with small tests, break the problem into smaller parts. Write tests against the intermediate results.
  • There is no TDD rule saying, “Don’t do any design.” We need just enough design to proceed.
  • If there’s repeated test input that is necessary but irrelevant, look for a way to isolate it.

Have any questions or feedback? Please share your thoughts in the comments below!


Jon Reid

About the author

Programming was fun when I was a kid. But working in Silicon Valley, I saw poor code lead to fear, with real human costs. Looking for ways to make my life better, I learned about Extreme Programming, including unit testing, test-driven development (TDD), and refactoring. Programming became fun again! I've now been doing TDD in Apple environments for 20 years. I'm committed to software crafting as a discipline, hoping we can all reach greater effectiveness and joy. Now a coach with Industrial Logic!

  • 1. How about mentioning the “JSON Schema” standard as a tool to “safely parse JSON”?

    2. The crashes can be eliminated if the edge cases are generated automatically. There is a tool called “Fuzzer” that does exactly this.

    Using the mentioned tools in your tests should be enough to write a “crash-free” and safe JSON parser. So does the class breakdown really matter a lot?

    • Alexander,

      #1: The schema is a good starting point. But if the schema contains any errors, we’ll still have failures. Also, I still remember the time the schema was changed without notice.

      #2: This looks like a great tool!

      Note that everything I share, such as my class design, is an example of one way of solving things. It’s not prescriptive. The first part of any design pattern is setting a proper context to determine if the pattern will help in a particular situation.

      Mainly, my point is “if the larger problem is difficult to TDD, break it into smaller problems.”

  • Great post, Jon. Thanks for sharing!

    I have seen the same problem you described many times: an engineer changes something in the backend and an app starts crashing. In my experience, JSON libraries minimize but don’t solve the problem.

    I really like following disciplines and patterns that help us reduce human error, and TDD has been a great ally. I normally use Parsers to do this job, but I’m keen to try the Construction Builder pattern. (They look very alike. I’m curious to find the key differences.)

    The responsibility to think of all edge cases still lies with us, which is problematic since we can forget some. It would be great to defer this responsibility to machines, which are far better than we are at generating and processing data.

    Have you tried generative tests? It’s also a great tool to improve TDD and help find edge cases, since the system is responsible for generating the values used in the tests, not the developer. Additionally, it randomly replaces values in a mutation testing style, creating far more input data to stress the system than a developer would ever be able to.

    Sadly, I can’t find any Swift library for that yet and I can’t see Apple making this a first-class framework in the platform.

    • > Sadly, I can’t find any Swift library for that yet and I can’t see Apple making this a first-class framework in the platform.

      Except for Fuzzer there’s also a tool called SwiftCheck https://github.com/typelift/SwiftCheck
      which is inspired by Haskell QuickCheck tool

      it’s not primarily for json testing but it works as you’ve described by generating values in defined constraints.

        • Ok, I’m afraid I’ll take it back.

          I really dislike using libraries that encourage the use of custom operators such as:

          infix operator <?> : SwiftCheckLabelPrecedence
          infix operator ^&&^ : SwiftCheckLogicalPrecedence
          infix operator ^||^ : SwiftCheckLogicalPrecedence

          and SwiftCheck makes heavy use of them.

          In my opinion, it makes code hard to read and maintain due to its extraneous vocabulary. Especially when working in a team.

  • Thank you for the interesting post!

    I have a few questions regarding JSON parsing and its testing.

    What if we don’t crash the app when the JSON schema in the response is not the one we expect?
    If you receive a single entity, you might get nil or an error for it.
    If you receive a list of entities, you can just ignore the entities that you weren’t able to parse (of course it depends on your goal, but still).

    A lot of json parsing libraries provide clear error of what has happened while parsing.
    We can always see exception/error message and easily understand the problem.
    To see it nicely in test, we can have custom assertion/matcher or just simple try-catch that will print out error, so we don’t have to test every possible case in json schema.

    Also, as was mentioned above, there’s a nice “JSON Schema” standard and it seems to be descriptive and quite clear.

  • > Have you tried generative tests?
    > Additionally, it randomly replaces values in a mutation testing style, creating far more input data to stress the system than a developer would ever be able to.

    @Caio_Zullo, according to your description, it looks exactly like “Fuzzer” tool from my comment above. It’s in objective-c but it works with swift just fine. I’m using it in my current project.
