Re: What is a "unit test"?

A response to Sean Conner's post asking the question: What is a "unit test"?

Original Post

Like Sean, a major portion of my developer career has been in the realm of C. However, I have released production quality code in languages from Assembly, C++, C#, F#, JavaScript, Python, Rust, Tcl/Tk and Racket. All of these follow different paradigms: procedural, object-oriented, functional; statically and dynamically typed. With how different they all are, you'd expect Unit Testing to be just as complex and nuanced. When I was doing low level systems development, I was building industrial control communications systems that could not have failures or downtime. For that reason the QA/UAT process was very long and arduous. In the field, beta testing took months. From the software side I wrote two types of tests: Unit Tests and System Tests.

I think to best understand what a unit test really is, we need to start with what a system test is. Whether you are doing Waterfall or some form of Agile, the software you are designing has requirements. Those requirements tend to be written in the form of "As a user I expect to be able to do X and get Y result". As we all know, the code involved in fulfilling that requirement will most likely not be a single line. It will span multiple modules, classes, functions or even services/applications. When we want to test this scenario we go through the happy path, then identify the failure cases and what we expect the graceful way of handling those failures to be. In some cases the failure scenarios are many. So many that setting up all the system tests to verify them could be cumbersome.

We don't write our solutions to requirements as a single, massive, monolithic method. Regardless of the language you're using, there is some sort of modularization and code reuse mechanism built in. Some of these modules are standalone blocks of code, while others are compositions of other modules. If you've ever done Functional Programming the standard solution consists of creating extremely small, to the point functions that do one very specific thing and nothing else. You then compose these small functions into larger functions until you eventually get the full application. Let us think about how you'd write tests for these types of designs.

Let's say you are writing an application that makes multiple API calls and generates an output that combines them all. The failure cases are similar: if any given API fails, the error is reported rather than the combined data, but each error has its own unique output. To break this code down we might create three functions: one that fetches, one that handles the success case, and one that handles the failure case:


def fetch_api(url):
  res = request_url(url)
  if res.success:
    return res.data
  else:
    return res.error_info

def handle_success(data):
  modified = do_something_with_raw_data(data)
  more_modified = do_more_stuff_with_data(modified)
  return more_modified

def handle_failure(err):
  if err.code == 1:
    return do_stuff_for_err_1(err)
  elif err.code == 2:
    return do_stuff_for_err_2(err)
  elif err.code == 3:
    return do_stuff_for_err_3(err)
  elif err.code == 4:
    return do_stuff_for_err_4(err)
  else:
    return unknown_error()

def process_many_apis(apis):

  all_success = []

  for api in apis:
    res = fetch_api(api)
    if isinstance(res, Error):
      return handle_failure(res)
    else:
      all_success.append(handle_success(res))

  return all_success

When writing your tests you could go directly to the method you'll be using, `process_many_apis()`, and write a test for a success scenario plus one for each situation where you get error 1, 2, 3 or 4. This would mean you need to set up an API that returns all those error cases, or you need to mock up the APIs. It becomes a large task to have full coverage of the system. We don't write our code in this huge monolithic manner, so why would we want to write our tests like this?

Instead, if we break the tests down to the individual subcomponents and spread the coverage across those, our overall system test can be as simple as one happy path and a single failure path. Our API fetcher can have two tests, one returning success and one a single failure. Our success handler has just the one test, and our failure handler would have five. While this may be more tests overall, creating a test case for handling a failure means mocking up the error object being passed in, not creating an entire API that actually fails. We can look at the code and recognize that if the coverage of both handlers and the fetcher is good, the expected output from the system is pretty much guaranteed.
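As a minimal sketch of the fetcher's two tests: `Response` and the canned `request_url` below are stand-ins I've invented for whatever HTTP layer the real code would use. The point is that neither test needs a real API; we swap in a fake transport by simple reassignment, the same idea that mocking libraries like unittest.mock automate.

```python
from dataclasses import dataclass

# Invented stand-in for whatever response type the real HTTP layer returns.
@dataclass
class Response:
    success: bool
    data: str = ""
    error_info: str = ""

def request_url(url):
    # The real implementation would hit the network; unit tests never reach it.
    raise NotImplementedError

def fetch_api(url):
    res = request_url(url)
    if res.success:
        return res.data
    else:
        return res.error_info

# Unit test 1: the success path, with a canned successful response.
request_url = lambda url: Response(success=True, data="payload")
success_result = fetch_api("http://example.com/a")
assert success_result == "payload"

# Unit test 2: a single failure path, again just a canned response.
request_url = lambda url: Response(success=False, error_info="timeout")
failure_result = fetch_api("http://example.com/a")
assert failure_result == "timeout"
```

Neither test cares what `handle_success()` or `handle_failure()` do with the result; the fetcher is verified in complete isolation.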

While this example is extremely simple, I'm sure we've all had real world projects where the edge cases were on the order of tens or hundreds. Writing a system test for every single one of those, even if you generate them programmatically, would just be ridiculous. This is what defines "unit tests": breaking the testing down so that it is easy to create and consume, so that the system tests no longer require extensive coverage. If you can test at the lowest level with the most coverage, each level of composition above that can have reduced coverage.
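The failure handler's five tests illustrate the same point. The `Err` class and the trivial `do_stuff_for_err_N` bodies below are placeholders I've made up; what matters is that each test just hands `handle_failure()` a mocked error object instead of provoking a real API failure:

```python
from dataclasses import dataclass

# Invented minimal error object; the real one would carry more fields.
@dataclass
class Err:
    code: int

# Placeholder bodies -- the real ones would build error-specific output.
def do_stuff_for_err_1(err): return "handled-1"
def do_stuff_for_err_2(err): return "handled-2"
def do_stuff_for_err_3(err): return "handled-3"
def do_stuff_for_err_4(err): return "handled-4"
def unknown_error():         return "unknown"

def handle_failure(err):
    if err.code == 1:
        return do_stuff_for_err_1(err)
    elif err.code == 2:
        return do_stuff_for_err_2(err)
    elif err.code == 3:
        return do_stuff_for_err_3(err)
    elif err.code == 4:
        return do_stuff_for_err_4(err)
    else:
        return unknown_error()

# Five unit tests: four known codes plus the unknown-error fallback.
assert handle_failure(Err(1)) == "handled-1"
assert handle_failure(Err(2)) == "handled-2"
assert handle_failure(Err(3)) == "handled-3"
assert handle_failure(Err(4)) == "handled-4"
assert handle_failure(Err(42)) == "unknown"
```

Each test is one line of setup and one assertion; the dispatch logic gets full coverage without a single real network call.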

So to answer Sean's question, a unit test is that which requires the least amount of work to set up but is able to reduce the need for coverage of the larger system. Whatever you want to consider a "unit" is up to you and the language you're using.

$ published: 2022-10-12 18:54 $

$ tags: programming $

-- CC-BY-4.0 jecxjo 2022-10-12

back