2017, Jan 23 - Dimitri Merejkowsky License: CC By 4.0
Well, here is a good question! Here's what I have to say:
1. I don't know
2. It depends
Those are, by the way, two very valid answers you can give to any question. It's OK to not have a universal answer to everything ;)
Still, I guess you want me to elaborate a tad more, so here goes ...
Note: If you don't know TDD at all, you can read the introduction here[1]
I'll start by describing my own experience with testing in general, and TDD more specifically. Hopefully it will help you understand how I came to the above answer. It's a long story, so if you prefer, you can jump directly to the so what? section.
The story begins during my first "real" job. I was working in a team that had already written quite a few lines of `C++` code. We were using `CMake` and had several `git` repositories.
So I wrote a command-line tool in Python named `qibuild` that would:
The idea was to abstract away the nasty details of cross-platform `C++` compilation, so that developers could concentrate on how to implement the algorithms and features they were thinking about, without having to care about "low-level" details such as the build system.
The tool quickly became widely used by the members of the team, because the command line API was nice and easy to remember.
```
$ cd workspace
$ qibuild configure
$ qibuild make
$ qibuild install /path/to/dest
```
It also began to be used on the build farms, both for continuous integration and release scripts.
Soon, I had to add new features to the tool, but without breaking the workflow of my fellow developers.
I decided to advise my co-workers to *not* use the latest commit on the `master` branch, as they did for the rest of the company's source code, and instead, I started to make frequent releases.
So instead of running `git pull`, they could just use: `pip install -U qibuild` and get the latest stable release.
Testing was complicated: the code base was already quite large, and the safest way to make sure I did not break anything was to re-compile everything from scratch (that alone took something like 15 minutes), and then perform a few basic checks such as:
My first idea was to write a bunch of "example" code.
Instead of having to compile hundreds of source code files spread across several projects, I could use two projects with just two or three source files:
```
test
  world
    CMakeLists.txt
    world.h
    world.cpp
  hello
    CMakeLists.txt
    main.cpp
```
The `world` project contained source code for a shared library (`libworld.so`), and the `hello` project contained source code for an executable (`hello-bin`) that used `libworld.so`.
Compiling `world` and `hello` from scratch just took a few seconds, so testing manually was doable.
But I was not very good at testing manually. Quite often I forgot to test some corner cases, and so many bugs were introduced without me noticing.
So I decided to start writing automated tests.
The tests looked like:
```python
import os
import shutil


class ConfigureTestCase(QiBuildTestCase):
    def setUp(self):
        pass

    def test_configure(self):
        self.run(["qibuild", "configure", "hello"])

    def test_build(self):
        # We need to configure before we can build:
        self.run(["qibuild", "configure", "hello"])
        self.run(["qibuild", "build", "hello"])

    def test_install(self):
        # We need to configure and build before we can install:
        self.run(["qibuild", "configure", "hello"])
        self.run(["qibuild", "build", "hello"])
        self.run(["qibuild", "install", "hello", self.test_dest])
        # do something with self.test_dest

    def tearDown(self):
        # Clean the build directories:
        super().clean_build()
        # Clean the destination directory for install testing:
        if os.path.exists(self.test_dest):
            shutil.rmtree(self.test_dest)
```
A few things to note here:
At the time, that's all the tests I had.
That meant I could refactor without fearing regressions too much, but I still had to run *the entire test suite* to be a little more confident about any change I had just made.
I also started measuring test coverage and was unhappy with the results (60%, if I recall correctly).
I also noticed that even though I was very careful, every release I made had some serious regressions, and so members of my team became reluctant to upgrade.
Code was clearly becoming cleaner, but this was not a good enough reason for them to upgrade.
[Image: a light at the end of a tunnel]
The decision to try TDD came from several sources. I'm not sure which was the decisive one at the time, but here are a few of them:
So there, I started using TDD for all the new developments, and I kept doing that for several years.
2: https://www.youtube.com/watch?v=YX3iRjKj7C0
3: https://www.destroyallsoftware.com/screencasts/catalog
4: https://destroyallsoftware-talks.s3.amazonaws.com/boundaries.mp4
Coverage went up, tests became more reliable and useful, regressions became more and more uncommon, adding new features became simpler and easier, and overall everyone was happy with the tool.
For the curious, here's what the tests looked like:
```python
import subprocess

import qibuild.find


def test_running_after_install(qibuild_action, tmpdir):
    qibuild_action.add_test_project("world")
    qibuild_action.add_test_project("hello")
    qibuild_action("configure", "hello")
    qibuild_action("make", "hello")
    qibuild_action("install", "hello", tmpdir)
    # Find the installed binary and check that it runs:
    hello = qibuild.find.find_bin(tmpdir, "hello")
    subprocess.run([hello])
```
So, now I had finally solved that xkcd puzzle[5]:
All I had to do was to write tests first, and everything would be OK!
But no-one around me believed me.
A few of them tried, but they gave up soon.
Many of them were working with "legacy" code and just making sure the *old* tests still pass was challenging enough.
I tried to tell them about the classic beginner's mistakes[6] but it did not work.
6: http://blog.cleancoder.com/uncle-bob/2016/03/19/GivingUpOnTDD.html
But I was right! I had seen the light! It did not matter if I did not manage to convince anyone, I was right, and they were wrong.
I began realizing how little I knew about testing thanks to this very blog.
You can read more about this in "My Thoughts on: 'Why Most Unit Testing is Waste'"[7].
7: My Thoughts on: 'Why Most Unit Testing is Waste'
It was then I understood that maybe things were not that simple.
I wrote two more articles to remind myself that my thoughts on testing were not completely black and white:
0012-is-line-coverage-meaningless.gmi
0014-when-tdd-fails.gmi
This happened after I took a new job at another company.
People there were ready to try TDD, and I was lucky enough to be there when two new projects started.
Surely this time, people willing to try would have no excuse (no legacy code this time!), and I even gave a talk to the whole team about TDD.
But nope, it did not go as I expected:
Maybe TDD was working for me just because:
One day, I wondered how hard it would be to implement a wiki from scratch.
The basic stuff seemed easy enough:
I decided to write the server in `Go`.
This was new to me, because it was the first time I was working with a language with such a short compilation time.
I found myself forgetting to type `go build` before restarting the server, so I wrote this short script:
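```python
# dev.py
import subprocess

print(":: Starting loop")
print("> Will stop as soon as the build fails")
print("> To restart the server, press CTRL-C")
while True:
    # Rebuild the server; check_call raises if the build fails,
    # which stops the loop:
    cmd = ["go", "build", "server.go"]
    subprocess.check_call(cmd)
    try:
        # Run the server until it is interrupted with CTRL-C:
        cmd = ["./server"]
        subprocess.check_call(cmd)
    except KeyboardInterrupt:
        pass
```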
Here's how the script works, assuming the `dev.py` script is running and you are in a state where the server is running with the latest version of the source code:
Most of the time (especially when you get better at mastering the `Go` language), you get the new version of code running very shortly after saving the `.go` source file you were working on.
After a while, I started having the server generate `HTML` forms, and I found myself filling the same form and hitting the `submit` button over and over again.
So I started automating, using `py.test` and `selenium`.
First, I wrote a fixture so that the server source code would always get built before the tests ran:
```python
import subprocess

import pytest


@pytest.fixture(autouse=True, scope="session")
def build_and_run():
    # Build the server once per test session, then start it in the background:
    subprocess.check_call(["go", "build"])
    process = subprocess.Popen(["./server"])
    yield process
    # Stop the server once all the tests have run:
    process.kill()
```
Then I wrote my own `browser` fixture, using the "facade" design pattern to hide the `selenium` API:
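```python
from selenium import webdriver
from selenium.webdriver.common.by import By


class Browser():
    def __init__(self):
        self._driver = webdriver.Chrome()

    def click_button(self, button_id):
        # Look the button up by its `id` attribute:
        button = self._driver.find_element(By.ID, button_id)
        button.click()

    def read(self, path):
        # Strip the leading slash so that read("/foo") does not produce "//foo":
        full_url = "http://localhost:1234/%s" % path.lstrip("/")
        self._driver.get(full_url)
        return self._driver.page_source
```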
This allowed me to write things like this:
```python
def test_edit_foo(browser):
    browser.read("/foo/edit")
    browser.fill_text("input-area", "Hello, world")
    browser.click_button("submit-button")
    assert "Hello, world" in browser.read("/foo")
```
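The test above calls `browser.fill_text`, which the `Browser` class shown earlier does not define. Here is a rough sketch of how such a method could fit into the facade. This is not the original code: `fill_text`'s body, the `driver` and `base_url` parameters, and the `FakeDriver` and `FakeElement` stand-ins are all assumptions made so the example can run without a real web browser. With selenium you would pass in `webdriver.Chrome()` instead of the fake.

```python
class FakeElement:
    """Hypothetical stand-in for a selenium WebElement."""

    def __init__(self):
        self.text = ""
        self.clicked = False

    def clear(self):
        self.text = ""

    def send_keys(self, text):
        self.text += text

    def click(self):
        self.clicked = True


class FakeDriver:
    """Hypothetical stand-in for webdriver.Chrome(), so no browser is needed."""

    def __init__(self):
        self.visited = []
        self.elements = {}
        self.page_source = ""

    def get(self, url):
        self.visited.append(url)

    def find_element(self, by, value):
        # Same (by, value) signature as selenium's find_element.
        return self.elements.setdefault(value, FakeElement())


class Browser:
    """Facade: tests talk to this class, never to the driver directly."""

    def __init__(self, driver, base_url="http://localhost:1234"):
        self._driver = driver
        self._base_url = base_url

    def read(self, path):
        self._driver.get(self._base_url + "/" + path.lstrip("/"))
        return self._driver.page_source

    def fill_text(self, element_id, text):
        # "id" is the value of selenium's By.ID constant.
        element = self._driver.find_element("id", element_id)
        element.clear()
        element.send_keys(text)

    def click_button(self, button_id):
        self._driver.find_element("id", button_id).click()


# Drive the facade the same way the test does, against the fake driver:
driver = FakeDriver()
browser = Browser(driver)
browser.read("/foo/edit")
browser.fill_text("input-area", "Hello, world")
browser.click_button("submit-button")
```

Passing the driver in from the outside is a design choice of this sketch: it keeps the facade small and lets you exercise it without starting Chrome.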
That turned out to be a very nice experience.
I could:
Then again, the feedback loop was very short. I could edit the `HTML` to add the proper `id` attribute, and then re-run the tests to check if the generated HTML looked good in a web browser.
So, was I writing tests before or after? And did it matter?
Around the same time, I watched *Writing Software*, the keynote David Heinemeier Hansson gave at RailsConf 2014.
You can watch the talk on YouTube[8].
8: https://youtu.be/9LfmrkyP81M
In it, David talks about TDD, but it's only a small fraction of his talk, and I highly recommend you listen to everything he has to say and not focus on the most controversial parts.
Anyway, the talk gave me a lot to think about.
For me, TDD worked really well for several years for one of the projects I've been working on.
I liked the fast feedback loop, the fact I could refactor with confidence thanks to well-written and fast tests, and how I could just type one command and have an answer to the eternal question: "Did I just break something?".
But when I started working on a Web application written in `Go`, it turned out I could get the same feeling and the same kind of loop without doing TDD at all. All it took was a 10-line Python script.
To say it differently, I think I was too obsessed with the *what* and the *how* and forgot about the *why*, something I feel happens far too often when it comes to new technologies.
Let me explain:
For me, the *what* of TDD is just one sentence: "Write your tests first, *before* the production code".
The *how* is: "Follow the rules: it's *red*, *green*, *refactor*".
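To make the cycle concrete, here is a minimal sketch. The `slugify` function and its tests are made up for illustration; they don't come from any of the projects mentioned in this post. The tests are written first and fail (*red*), then the simplest implementation makes them pass (*green*), and after that the code can be cleaned up while the tests keep passing (*refactor*).

```python
import re


def slugify(title):
    """Turn a page title into a URL-friendly slug (hypothetical example)."""
    # Green: the simplest code that makes the tests below pass.
    # Refactor: once the tests pass, this body can be reworked safely.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


# Red: in TDD these tests exist before `slugify` does, so they fail at first.
def test_slugify_lowercases_and_joins_words():
    assert slugify("Hello World") == "hello-world"


def test_slugify_drops_punctuation():
    assert slugify("Is TDD worth it?") == "is-tdd-worth-it"
```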
Those are very easy to remember and explain, but I think I did not manage to convince anyone to try because they did not really care about the *what* and the *how*. They wanted to know *why*.
I kept telling them they should always write their tests first, that they should try and stick to it for several weeks before "getting" it, TDD not being something you could "learn in one week-end".
But that's not what they wanted to hear: they wanted to know *why* TDD was worth trying, and the answer "Because it worked really well for me and my project" was not good enough.
So here goes: why do we do TDD?
I think TDD is just a *framework*. It's a set of tools, rules and conventions you can use to write better tests and better production code.
But what are "good tests" and "good production code"?
So, write your tests first, write them last, or don't write any, but please remember what really matters: your code will be written once, and read many times.
Also, you are not paid to write code[9]. What matters is that the features that need to be implemented are done and that the bugs are fixed.
9: http://bravenewgeek.com/you-are-not-paid-to-write-code/
Thanks for reading!