Sunday, 17 February 2008

Test Driven Development, part 2. Why TDD?

In my first post on Test Driven Development I introduced TDD as a development methodology and showed, without going into deep detail, how TDD captures high-level software requirements and breaks them into low-level unit tests that are defined and written before any production code.

Why use the TDD methodology?

In this post I want to explain why TDD is a good way of writing software and what benefits the methodology brings to software projects where requirements change continuously. If you are not familiar with TDD, it may be a good idea to go back and read the first part of this series.

It goes without saying that testing is necessary for any software development effort. Testing can be carried out in a variety of ways, so why is TDD better than the other approaches? The simple answer is that because TDD makes you think of your tests first, all your development effort is focused on passing those tests. And since each test is a representation of a software requirement, the entire development process drives towards meeting the clear requirements specified up front. Instead of the traditional approach, where testing happens after development in a kind of "let's test to see if we got it right" mentality, TDD changes the testing mentality into "now that we have our tests, we had better get this right."
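To make the test-first order concrete, here is a minimal sketch using NUnit. All the names are invented for illustration: under TDD the test is written first (and fails, since Order does not yet exist), and the class beneath it is the minimal production code written afterwards to make the test pass.

```csharp
using System.Collections.Generic;
using NUnit.Framework;

// Hypothetical requirement: "an order's total is the sum of its line prices".
// This test is written before Order exists; it is the requirement, executable.
[TestFixture]
public class OrderTests
{
    [Test]
    public void TotalIsSumOfLinePrices()
    {
        Order order = new Order();
        order.AddLine(10.00m);
        order.AddLine(4.50m);
        Assert.AreEqual(14.50m, order.Total);
    }
}

// The minimal production code written afterwards to make the test pass.
public class Order
{
    private readonly List<decimal> linePrices = new List<decimal>();

    public void AddLine(decimal price)
    {
        linePrices.Add(price);
    }

    public decimal Total
    {
        get
        {
            decimal sum = 0m;
            foreach (decimal price in linePrices) sum += price;
            return sum;
        }
    }
}
```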

A further benefit of the up-front testing approach of TDD is that every unit of code that is written already has a test. This means that as the software grows, so does the number of tests, and there is no need to go through an exercise to create tests once the development phase is over. The tests are defined and written when necessary as new parts of the software are developed or as requirements are added or changed.

Additionally, the tests written during a TDD project can be run at any time to verify the integrity of the code. This is an absolutely invaluable aspect of TDD. Having a click-and-run verification process at your fingertips means that changes to the software (requirements changes or refactoring) can happen quickly and safely. Whether you change a single line of code, rewrite a method, or rewrite an entire class, a single run of your tests will verify whether your code still functions as it should. The bigger the software project, and the more developers changing and adding code on the project, the more valuable this aspect of TDD becomes. Just imagine an application with 25 libraries and ten developers working for three months to complete the project. Every single day each developer will commit changes at least once, on average. Over roughly sixty working days that adds up to some 600 opportunities for introducing bugs into the central code base. With TDD and automated testing, any bug introduced into the code base will be flagged at the first test run and can be fixed when it is introduced, rather than weeks hence when nobody can figure out where it came from.
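As a sketch of that safety net (PriceCalculator and its rounding rule are hypothetical), a test like the one below pins down current behaviour. If another developer later "improves" the rounding, the very next test run flags the change instead of letting it slip into production.

```csharp
using NUnit.Framework;

// A regression guard: this test documents the agreed rounding behaviour.
// Any later change to ApplyDiscount that alters the result fails the build.
[TestFixture]
public class PriceCalculatorTests
{
    [Test]
    public void DiscountIsRoundedToTwoDecimalPlaces()
    {
        PriceCalculator calc = new PriceCalculator();
        // 10% off 10.74 is 9.666, which rounds to 9.67.
        Assert.AreEqual(9.67m, calc.ApplyDiscount(10.74m, 0.10m));
    }
}

public class PriceCalculator
{
    public decimal ApplyDiscount(decimal price, decimal rate)
    {
        return System.Math.Round(price * (1 - rate), 2);
    }
}
```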

More subtle benefits of TDD emerge from the step-by-step process of test and code creation. This will be covered in much greater detail in a later post on how TDD is practiced, but it can be said here that the relatively strict guidelines of the TDD methodology encourage brevity and clear, logical encapsulation, and discourage bloated methods. TDD forces you to write testable code, and following the guidelines will yield better, more readable, and more economical code. A common problem with non-TDD approaches is that the code ends up being very hard to test in an automated manner because of the way it is written. With TDD this problem rarely arises.
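To illustrate what "testable by construction" means (every name here is hypothetical), consider that a test written first cannot reach a real database, so the dependency is forced out into an interface that the test can replace with a stub:

```csharp
using NUnit.Framework;

// Because the test came first, TaxReporter takes its rate source as an
// interface rather than querying a database directly.
public interface IRateSource
{
    decimal GetRate(string region);
}

public class TaxReporter
{
    private readonly IRateSource rates;

    public TaxReporter(IRateSource rates)
    {
        this.rates = rates;
    }

    public decimal TaxFor(string region, decimal amount)
    {
        return amount * rates.GetRate(region);
    }
}

[TestFixture]
public class TaxReporterTests
{
    // A stub standing in for the real rate lookup.
    private class StubRateSource : IRateSource
    {
        public decimal GetRate(string region) { return 0.25m; }
    }

    [Test]
    public void AppliesTheRegionalRate()
    {
        TaxReporter reporter = new TaxReporter(new StubRateSource());
        Assert.AreEqual(25m, reporter.TaxFor("NO", 100m));
    }
}
```

The same seam that makes the class testable also makes it easier to reuse and to change, which is where the "better, more economical code" comes from.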

There are some developers who subscribe to unit testing and automated test runs but who do not believe in the TDD methodology. The most common objection I meet goes something like this: "How can you test your tests? If you cannot test your tests, you cannot be sure that your tests are correct. If a test is incorrect, the resulting software will have incorrect behaviour. As such, TDD is an unsafe approach." I am also sometimes told that TDD requires a very mature team of developers in order to function, and there is a lot of doubt as to whether the up-front effort required to write tests pays off in the long run. "It's so time consuming," they say.

These are all valid concerns, but they typically stem from a lack of understanding of the underlying TDD methodology. It is true that you can end up writing tests that are incorrect. I have certainly done that more than once. However, I know that I have written incorrect tests because eventually those incorrect tests were caught out by other tests I created later. Writing an incorrect test is roughly as severe as misunderstanding or omitting a software requirement, and the fallout is determined by how badly wrong the test is. The risk of these things happening is minimised not by creating layer upon layer of tests for your tests, but through clear communication and channels for requirements elicitation and documentation, both prior to development and during the development phase.

I do not believe that TDD requires a "very mature team of developers". Rather, I believe it requires a team of developers with clear vision and great aptitude. A developer's number of years of experience "in the field" does not determine whether he is fit for TDD. It is true that the TDD methodology requires practice and is not learned quickly, but this does not mean that it cannot be taught to or practiced by juniors. On the contrary, TDD teaches very good software development skills and can be an excellent tool in the education of inexperienced developers.

But of course, TDD in the hands of sloppy developers will not yield better software than that team would using any other methodology. TDD is not a silver bullet. Nothing is. If you believe it is you'll end up shooting yourself in the foot.

My answer to the argument that TDD is time consuming usually centers around examples from my own experience. Recently I had to add a single member field to a class, load the value for that field from a database, and ensure that the field was transferred properly from the data access layer to a WCF service layer. It took me between four and five hours to write all the tests and code required. I'll add that the data access layer involves custom data mapping with iBATIS and is a little more involved than you would normally assume, but four to five hours probably still sounds like a lot of effort for what I accomplished. The pay-off for all that work came shortly afterwards, when a change elsewhere in the system caused my newly written tests to fail. And they failed for reasons I had not even thought of, so chances are that without the tests in place this bug would have gone unnoticed for much longer. It is difficult to quantify how much effort would have gone into discovering, hunting down and fixing the bug, but I am pretty sure it would have taken me at least an hour, maybe two. So if two and a half of the total five hours were spent writing my tests, I have already had a significant return on my investment. And the best thing of all is that this investment will keep yielding a return, because the tests are still there and still running.
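I cannot reproduce the actual code here, but a rough sketch of the kind of test involved might look like the following. Every name is invented, and the real mapping went through iBATIS underneath; the point is simply that the test pins down that the new field survives the translation from the domain object into the WCF data contract.

```csharp
using NUnit.Framework;

// Pins down that the newly added field is carried across the mapping
// boundary between the domain model and the service contract.
[TestFixture]
public class CustomerContractMapperTests
{
    [Test]
    public void NewFieldIsCopiedToTheServiceContract()
    {
        Customer domain = new Customer();
        domain.CreditLimit = 5000m; // the newly added member field

        CustomerContract contract = CustomerContractMapper.ToContract(domain);

        Assert.AreEqual(5000m, contract.CreditLimit);
    }
}

public class Customer
{
    public decimal CreditLimit;
}

public class CustomerContract
{
    public decimal CreditLimit;
}

public static class CustomerContractMapper
{
    public static CustomerContract ToContract(Customer c)
    {
        CustomerContract contract = new CustomerContract();
        contract.CreditLimit = c.CreditLimit;
        return contract;
    }
}
```

It was a test of exactly this shape that failed when the unrelated change went in, which is what made the bug visible on the day it was introduced.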

Finally, some people are simply more comfortable writing unit tests after development because this approach is more familiar and in line with the code-first, test-second approach that we all know so well. To this I submit that "test-after is better than no test at all", but also offer two points of reflection on why TDD is still the better alternative:

When you write unit tests after writing the code, you run into a dilemma as soon as a test you’ve just written fails:

  1. If your test fails because you wrote the wrong test for your unit, you will have to change your test to suit your code. This is inherently dangerous, because at this point you have no assurance that your unit actually does what it should. A test retrofitted in this manner can end up offering false proof that a unit behaves correctly when it does not (see the sketch after this list).
  2. If your test fails but is the correct test for your unit, you have to change your unit so that it passes the test. This is exactly what is at the centre of the TDD methodology: your tests come first to provide the benchmark for your units. In this scenario you end up applying TDD principles anyway, and it is better to adhere closely to them and apply them up front.
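Here is a contrived sketch of the first point (Invoicer and its requirement are invented). The method below uses the CLR's default banker's rounding where the requirement, say, demanded round-half-up. A test written after the fact by running the code and copying its output passes happily, yet proves nothing about the requirement:

```csharp
using NUnit.Framework;

public class Invoicer
{
    public decimal RoundTotal(decimal total)
    {
        // Bug: default banker's rounding sends 2.345 down to 2.34,
        // where the (hypothetical) requirement demands 2.35.
        return System.Math.Round(total, 2);
    }
}

[TestFixture]
public class InvoicerTests
{
    [Test]
    public void RetrofittedTestEnshrinesTheBug()
    {
        // Written by observing what the code currently returns: it passes,
        // but it documents the bug as if it were the requirement.
        Assert.AreEqual(2.34m, new Invoicer().RoundTotal(2.345m));
    }
}
```

Had the test been written first, from the requirement rather than from the code, it would have asserted 2.35 and failed immediately.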

In my next post I will write about the how of TDD, going into practical code examples and discussing the step-by-step methodology that yields the results I have described here.
