Monday, 28 September 2009

Literate Programming

A while back I wrote a rather lengthy entry about the significance of strong and clear naming conventions, and I spent some time explaining how well-defined, unambiguously named methods with to-the-point parameter names end up reading like sentences out of a book.

Today a friend pointed me to the Literate Programming website, where the following snippet can be found:

"I believe that the time is ripe for significantly better documentation of programs, and that we can best achieve this by considering programs to be works of literature. Hence, my title: 'Literate Programming.'

Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

The practitioner of literate programming can be regarded as an essayist, whose main concern is with exposition and excellence of style. Such an author, with thesaurus in hand, chooses the names of variables carefully and explains what each variable means. He or she strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that reinforce each other."

When I read this I immediately thought: YES! This is exactly what I've been trying to say. Code should not be a language for a select 'elite' that prides itself on obscurity. It should be something that can be read with ease, even by someone with very little technical understanding. I dare say that if you achieve that, you've achieved maintainable code.

Wednesday, 23 September 2009

Boolean Polymorphism

I think this is a rather good article about the pitfalls of large parameter lists and code maintainability.

Wednesday, 22 July 2009

Factory Assembly Lines, Red Cars, and TDD

For the last five weeks I have been supporting developers on a project that makes full use of automated unit and integration tests, including the use of a mocking framework. The frameworks of choice are NUnit and Moq, which is fairly standard stuff. What makes our little project special, though, is that the developers were all (bar one) completely new to the concepts of unit testing and mocking! This has made for some interesting challenges. All the developers are highly capable and intelligent people, so they've adopted the new principles with amazing speed. But as anyone who's transitioned to TDD will know, the mindset required is very different from that of 'traditional' development. As such, I've had to do a lot of mentoring during the last month or so.

There's nothing quite like having a good example to go by when you're trying to learn something new. That's the case with TDD as well. I also think that in order to learn TDD and the test-first discipline you need something more. What's needed is a school of thought. Perhaps even a dogma. I think that this is far more important than specific examples of how to do specific things, because with the correct line of thinking you'll arrive at good solutions to any of your testing problems. It was with this in mind that I wrote what follows: a rather absurd testing scenario involving a factory assembly line and lots of red cars.

What I want to address now is the creation of tests, and the scope of this discussion includes what a test should test (the scope of the test, if you like), where boundaries should be drawn, and how such decisions should influence your design while you’re coding. For now, imagine that you are responsible for the production line of cars, and that you want to be able to test that a car is painted in the colour that you have specified.

How would you go about testing such a thing? The first thing that springs to mind is perhaps to instruct your assembly line (people and machines) to build you a red car, and then go have a look at the car when it’s built to see if it is, in fact, red. Such an approach would certainly work, but it would be a hell of an expensive test! What if the car came out the wrong colour? Or a different shade of red than you expected? You’d have to keep cranking out cars while making adjustments until you get the colour right, and that’s just not a sensible approach.

Of course, this example is absurd because you would never do that if you owned a car factory – but I’ve chosen this example exactly for that reason; you’re clearly doing too much work just to check that the car turns out with the right colour. Too much work. That’s a key thing to remember.

How could you improve on the process? Well, you might start by skipping the entire “build-me-a-car” bit and focusing on the paint job itself. A car is painted by a big robot with lots of paint guns, and this whole mess takes place at the end when the car has been assembled and all non-metallic surfaces have been carefully masked etc. Since this is where paint is applied we can narrow our testing to this machine. We can put a piece of paper in front of the paint gun and ask it to colour the paper red. If things turn out unexpectedly, we can easily repeat the test. Paper is cheap, and even paint is much cheaper than a car. Not only that, but repeating the test is also much faster. This test is therefore far better than the first test. We’re doing much less work, it’s costing less, and we’ve focused right down on the thing that makes a car red (or any other colour). Focus. That’s another key thing to remember.

Let’s stop and think about focus for a moment. In the first test, if a car turned out burgundy rather than the sports-car red you’d imagined, where would you make your adjustments before testing again? Sure, you may know that it’s the big ol’ paint robot at the end of the line that does the work, but how do you get your instructions to it? Is there a human involved? At what stage of the car assembly do you have to ‘input’ the colour? Right at the start? Does anything happen to your colour instruction while the car is being built? Does one human tell another, or is it written down on paper and later typed into a computer? It should be fairly easy to see that a test with such wide focus (or no focus at all) is prone to all kinds of environmental noise that can affect the output and thus make accurate testing difficult. So focus is important.

Now, we’ve narrowed our focus to the robot that paints the car, and we’ve devised a test that involves the robot spraying paint onto a sheet of paper. This is, in the scheme of things, probably a decent test. But we can do better. Does the paint gun produce the colour? No it doesn’t. The purpose of the paint gun is to deliver paint at the correct pressure and velocity, and to ensure that the nozzle produces a mist of paint with exactly the right droplet size etc. We can certainly use the paint gun to test colour, but since it’s got nothing to do with producing the colour of paint, we should see if we can create a better test. We should narrow our focus again.

If you keep repeating the process of narrowing your focus you’ll eventually end up at the part of the machine that mixes paint to create specific colours. Now the focus seems to be about right (you can probably narrow things down even further, though). The part of the machine that mixes the paint is what’s responsible for the colour that eventually ends up on the car. Arguably, you can take the whole paint element out of it because your paint is likely to be generic and colourless – so you can just focus on the mixing of pigments. But I don’t want to be too pedantic, either.

Now that we’ve got the focus right, we’ve got to get our test right. We’re trying to verify that when we ask for red, red is what we get. But if you think about it, red isn’t very specific either. Is it blood red? Burgundy? Maroon? A bit more on the pink side of the spectrum? To try and test for red isn’t specific enough. And here’s another thing to be mindful of. Be specific. Don’t test for red, but instead test for a specific shade of red. And so your test will start to change shape as well. Rather than testing for a colour you’ll find that you’ll be testing that the paint mixer dispenses the correct amounts of red, green, and blue (the basic components of the RGB colour system) in order to produce a specific shade of a specific colour. Now you’ve got a good test. A valuable test. A test you can write home about.
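To make that a bit more concrete, here is a minimal sketch of what such a test might look like in C#. Everything in it is made up for illustration: PigmentMixer, PigmentRecipe, GetRecipeFor, the shade names, and the pigment amounts are all assumptions of mine, and I've used a plain exception instead of NUnit assertions so that the snippet stands on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical recipe: how many units of each pigment go into one batch.
public struct PigmentRecipe
{
    public readonly int Red;
    public readonly int Green;
    public readonly int Blue;

    public PigmentRecipe(int red, int green, int blue)
    {
        Red = red;
        Green = green;
        Blue = blue;
    }
}

// Hypothetical unit under test: it only turns a named shade into a recipe.
public class PigmentMixer
{
    // Illustrative values only; a real factory would have its own tables.
    private readonly Dictionary<string, PigmentRecipe> recipesByShade =
        new Dictionary<string, PigmentRecipe>
        {
            { "SportsCarRed", new PigmentRecipe(255, 0, 0) },
            { "Burgundy", new PigmentRecipe(128, 0, 32) }
        };

    public PigmentRecipe GetRecipeFor(string shadeName)
    {
        PigmentRecipe recipe;
        if (!recipesByShade.TryGetValue(shadeName, out recipe))
        {
            throw new ArgumentException("Unknown shade: " + shadeName, "shadeName");
        }
        return recipe;
    }
}

public static class PigmentMixerTests
{
    // Focused and specific: one method, one shade, exact expected amounts.
    public static void GetRecipeFor_Burgundy_DispensesExactPigmentAmounts()
    {
        var mixer = new PigmentMixer();

        PigmentRecipe recipe = mixer.GetRecipeFor("Burgundy");

        AssertEqual(128, recipe.Red, "red");
        AssertEqual(0, recipe.Green, "green");
        AssertEqual(32, recipe.Blue, "blue");
    }

    private static void AssertEqual(int expected, int actual, string pigment)
    {
        if (expected != actual)
        {
            throw new Exception(
                "Expected " + expected + " units of " + pigment + " but got " + actual + ".");
        }
    }

    public static void Main()
    {
        GetRecipeFor_Burgundy_DispensesExactPigmentAmounts();
        Console.WriteLine("Test passed.");
    }
}
```

Note that the test never asks "is it red?". It asks for one named shade and checks the exact amount of each pigment, which is what makes a failure easy to diagnose.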

Such a test can be carried out without much work, it’s focused (it involves only part of a sub-system), and it’s very specific (it tests conditions for very specific input). All these things are good. In the context of the car factory you’ve now got a fast, cheap, repeatable, and accurate test that’ll verify that your factory is capable of cranking out cars in every colour of the spectrum.

Well, that’s actually a half truth. Or less than a half truth. You’ve only created a test for mixing pigments. You’ve not tested that the right amount of mixed pigment is added to the correct volume of paint. Nor have you tested that the paint gun actually works like it should and that the paint-robot moves like it should. And there isn’t a single test, yet, for any other aspect of the car assembly line. All this is beyond the scope of this discussion, but suffice to say that if you are a car producer hell-bent on making big bucks you’d apply the same principles as above to creating tests for every single aspect of your production line.

Now let’s try and relate this example back to software. The car assembly line would likely be implemented as a number of applications and/or services that each carry out specific tasks (basic assembly, welding, engine installation etc). The paint-robot might be one such application. The robot’s paint gun would be represented by a class, as would the pigment mixer. And the actual mixing of pigments would probably be represented as a single method on the pigment mixer class. That’s the unit you’d be testing. You wouldn’t write a test that invokes all those applications because it would require too much work and be too difficult, and it would result in an inherently inaccurate test.

So, when you’re writing tests and designing your code you should always make sure that what you’re working on lends itself to testing (it doesn’t require a lot of work), is focused, and is very specific. If what you are testing is part of something much, much larger then ensure that your unit is isolated. You do this by using mock objects for its dependencies. If you can’t use mock objects for whatever reason you should stop and think about why that is and see if you can change your design. Maybe you don’t have to. Maybe you can run your test in a slightly wider context without doing too much work or becoming too unfocused or non-specific. If you can, fine. If you can’t – change your design.
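As a rough illustration of isolating a unit from its dependencies, here is a small C# sketch. The names (IPigmentMixer, PaintRobot, FakePigmentMixer) are hypothetical, and I've written the stand-in by hand to keep the snippet self-contained; a mocking framework such as Moq would generate an equivalent object for you.

```csharp
using System;

// Hypothetical dependency of the paint robot.
public interface IPigmentMixer
{
    string MixShade(string shadeName);
}

// Hypothetical unit under test. It receives its dependency through the
// constructor, which is what makes it possible to isolate.
public class PaintRobot
{
    private readonly IPigmentMixer pigmentMixer;

    public PaintRobot(IPigmentMixer pigmentMixer)
    {
        this.pigmentMixer = pigmentMixer;
    }

    public string Paint(string carId, string shadeName)
    {
        string paint = pigmentMixer.MixShade(shadeName);
        return carId + " painted with " + paint;
    }
}

// Hand-rolled stand-in for the real mixer. It records what it was asked
// for and returns a canned answer, so no real mixing takes place.
public class FakePigmentMixer : IPigmentMixer
{
    public string LastRequestedShade { get; private set; }

    public string MixShade(string shadeName)
    {
        LastRequestedShade = shadeName;
        return "fake " + shadeName + " paint";
    }
}

public static class PaintRobotTests
{
    public static void Paint_AsksTheMixerForTheRequestedShade()
    {
        var fakeMixer = new FakePigmentMixer();
        var robot = new PaintRobot(fakeMixer);

        string result = robot.Paint("car-1", "Burgundy");

        if (fakeMixer.LastRequestedShade != "Burgundy")
        {
            throw new Exception("The robot did not pass the shade on to the mixer.");
        }
        if (result != "car-1 painted with fake Burgundy paint")
        {
            throw new Exception("Unexpected result: " + result);
        }
    }

    public static void Main()
    {
        Paint_AsksTheMixerForTheRequestedShade();
        Console.WriteLine("Test passed.");
    }
}
```

If PaintRobot created its own mixer internally there would be nowhere to plug in the fake, and that is exactly the kind of design you should stop and reconsider.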

Careful thinking about how you’d test the code you are going to write will help you write better systems. As with anything, it requires practice and experience (and you’ll never stop learning) and it’s possible to go terribly wrong – but if every decision you make is the result of careful, informed consideration the likelihood of failure is minimal. Good tests are tests that are focused on one system, and specifically a single function of that one system – and that do very little work in order to test that function. Make the creation of such tests your goal, and you’ll quickly learn how to write clean, robust, and testable code.

Monday, 22 June 2009

PartCover - an alternative to NCover?

I've been looking for a cheap or free tool for measuring NUnit test coverage and just stumbled upon PartCover. I've installed it and fiddled with it and it seems to be an OK tool. It's Open Source, too - which I think is great.

I have to say, though, that the documentation is poor and the application itself is buggy (I experienced a critical exception when trying to save a coverage report to file). Despite this, I'll take a buggy PartCover over NCover which would set me back $650 for a license!

Might write more about this later. Check it out!

Friday, 13 March 2009

Names and the Importance of Semantics

I've taken the below 'post' from some documentation I've recently written for the developers where I'm currently working. It appears to me that, in general, developers (and, indeed, architects) spend far too little time thinking about names and naming conventions. Personally I spend hours thinking about names. Good names are an incredibly important aspect of good software design. Anyway - if you care to read on, this is what I've said on the subject so far:

Aside from writing code that is correct, efficient, and actually works, the most important aspect of software development is, arguably, naming. Choosing good names for namespaces, classes, methods, and properties (even member variables, local variables and parameters) is incredibly important (and therefore also difficult) because of the semantics they convey. A name must be unambiguous and clearly convey the purpose and function of the named component when viewed in isolation, as well as when it is viewed in the context of its root namespace, immediate namespace, class, and so on.

Good type and namespace names leave little room for misunderstandings and mistakes because of ambiguity. Namespaces and types defined in the .NET Framework are very clearly named throughout and the semantics of each name are typically very clear. A good example of this is the System.IO namespace.

The System.IO namespace name is unambiguous because it is short, because the relationship between the components of the namespace ("System" and "IO") is clear, and because each component of the namespace is named well semantically:

"System" implies a logical grouping of functionality that is 'close to the machine' or 'close to the framework'. The "System" namespace, by virtue of its name, is very clearly not a task or application specific namespace.

While it may be argued that the word "System" is ambiguous because it can encapsulate so much functionality (and varied functionality at that), when seen in its context (it is the main root namespace of the entire framework) the name is still very clear.

"IO" is a very old and commonly used abbreviation (for "Input/Output") in computer science and is always associated with data transfer between devices (both internal and peripheral). The immediate association of "IO" is that of file read/write operations, and this is exactly the kind of functionality that the classes in this namespace provide.

System.IO is therefore a really good namespace because the semantics of the first component's name lend meaning to the other. "System" tells the developer that we're dealing with general, non application-specific functionality, and "IO" implies that the functionality is file-operation specific.

The File class within the System.IO namespace is another good example of good naming. Its fully qualified name, System.IO.File, makes it absolutely clear what the class is and what it does - it's very clearly a representation of a file on a file system and provides file operations to the caller.

The methods on the File class are also very well named. A couple of examples illustrate how clear, short names can convey a lot of information when viewed in their proper context:

public static System.IO.FileStream Open(string path, System.IO.FileMode mode)

The method name File.Open is in itself very clear; it is obvious that the Open() method will open a file on a disk. There is sometimes a tendency to be too specific when naming methods - the above method might for example be called OpenFile(). Viewed entirely in isolation, the second method name, OpenFile(), is clearer than Open() - but when viewed in the context of its class, OpenFile() is obviously a poorer name than Open() because the context (the File class) already makes it obvious that we are in fact dealing with files!

public static bool Exists(string path)

This method name is clear because it can be phrased as a question with a simple yes/no (or true/false) answer: "Does this file exist?"

public static void Move(string sourceFileName, string destFileName)

This method name is unambiguous (clearly File.Move() is a method that moves a file on a file system), but pay particular attention to the parameter names; sourceFileName and destFileName leave no doubt about what you are dealing with. If this method's signature is paraphrased into a sentence it would read something like "move this source file to this destination."

The above examples illustrate the importance of context when constructing names in software. A named element must make sense in isolation, but it is also very important that it makes sense contextually. Methods must be named so that the name itself carries the correct semantics, but the method's parameters (and return type) should be named in a manner so as to add further semantic value to the method's signature.
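To show how these names read at the call site, here is a short C# sketch that uses the three File methods discussed above. The FileNamingDemo class, the MoveAndMeasure method, and the scratch file names are my own inventions for the purpose of illustration; only File.Exists, File.Move, and File.Open come from the framework.

```csharp
using System;
using System.IO;

public static class FileNamingDemo
{
    // Moves a file if it exists, then returns the length in bytes of the
    // moved file. Returns -1 when the source file does not exist.
    public static long MoveAndMeasure(string sourceFileName, string destFileName)
    {
        // Reads as a question: "does this file exist?"
        if (!File.Exists(sourceFileName))
        {
            return -1;
        }

        // Reads as a sentence: "move this source file to this destination."
        File.Move(sourceFileName, destFileName);

        // "Open" needs no "File" suffix; the class name already says it.
        using (FileStream stream = File.Open(destFileName, FileMode.Open))
        {
            return stream.Length;
        }
    }

    public static void Main()
    {
        // Scratch files under the temp directory; the names are arbitrary.
        string source = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString() + ".txt");
        string destination = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString() + ".txt");
        File.WriteAllText(source, "hello");

        Console.WriteLine(MoveAndMeasure(source, destination));

        File.Delete(destination);
    }
}
```

Read the body of MoveAndMeasure out loud and it comes close to plain English, which is the whole point: the class supplies the context and the method and parameter names supply the rest.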

Tuesday, 10 March 2009

Orange Mocha Frappucino

This is just a quick note to those of you that I know read this blog - because I know you know Steve. He's started blogging over on Orange Mocha Frappucino. First post is about handling web.config in unit tests. Good one, Steve!

Tuesday, 24 February 2009

James Joyce .NET

Last week, after doing a code review, I coined the term "James Joyce .NET". Though the James Joyce factor isn't particularly .NET specific (I'm sure there are Ruby, C++, Java, and most certainly C developers out there who do things in similarly cryptic manners), I think the term has a nice ring to it. And in fairness to Joyce, I think the code I reviewed was far less readable than Ulysses.

Aside from software that works and "does its thing" in a reliable manner, clients care a lot about code that's maintainable. The client may not know that they care about this, at least not at first - but once you tell them that a change is going to take three weeks to complete "because of this, that, and the other" (because the code you've written isn't maintainable) you bet your sweet ass they start caring. They care not because of the state of your code; they care because of the cost. And rightly so.

There are many key things to consider when writing maintainable code but the subject of my current rant will be readability. Does your code read like a book? Probably not (hey, it's not fiction) - and that's OK. But do your method names and signatures read like sentences? Could you translate the body of your methods into short little textual paragraphs that clearly illustrate what they do? If you answered "no" to both of these questions it's probably time to stop and think a little.

Disclaimer: I'm renowned for harping on about ideals and I've been caught out many a time falling short of my own mark. That said, I stick to my ideals because without them we've got nowhere to go.

To start with, code must be legible. That means you have to choose names (for your classes, methods, parameters, variables) that clearly indicate purpose. Favour readability over brevity when creating names. Second, your names must make semantic sense. A class-method name combination should be completely unambiguous, easy to read, and clearly convey "what you get". Classes and methods should do what it says on the tin. Don't be obscure, don't be ambiguous.

Third: Use the appropriate constructs for the job. Don't embed an algorithm in a property. Don't use private properties. Properties are there to expose aspects of a class, so if you're not going to expose it, use a private member variable instead. Don't use indexers for anything but accessing collection data. Don't use generics unless you need generic code, and then only if generics give you something that a parameter list typed with interfaces or abstract classes (rather than concrete types) doesn't.
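Here is a small C# sketch of what I mean by the first two points. The Order class and its members are hypothetical and exist only to show the shape: internal state sits in a private field, the property exposes something cheap and unsurprising, and the actual calculation lives in a method whose name says that work is being done.

```csharp
using System;
using System.Collections.Generic;

public class Order
{
    // Internal state that is never exposed: a private field, not a private property.
    private readonly List<decimal> linePrices = new List<decimal>();

    // A property exposes a simple aspect of the object and does no real work.
    public int LineCount
    {
        get { return linePrices.Count; }
    }

    public void AddLine(decimal price)
    {
        linePrices.Add(price);
    }

    // The algorithm is a method, so callers can see that a calculation takes place.
    public decimal CalculateTotalIncludingTax(decimal taxRate)
    {
        decimal total = 0m;
        foreach (decimal price in linePrices)
        {
            total += price;
        }
        return total * (1m + taxRate);
    }
}

public static class Program
{
    public static void Main()
    {
        var order = new Order();
        order.AddLine(10m);
        order.AddLine(20m);

        Console.WriteLine(order.LineCount);
        Console.WriteLine(order.CalculateTotalIncludingTax(0.25m));
    }
}
```

Had the total been a property called Total, nothing at the call site would hint that a loop and a tax calculation hide behind it.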

Seriously, think long and hard about any code you write. Just because something works doesn't make it good. And give some thought to those people that come after you. Will they be able to understand what you've done? You might very well be a genius and the best thing since sliced bread but really, what good is that if nobody understands what you've done? Don't obfuscate your code by being lazy, by being too clever, or by using constructs in ways they were not intended to be used.

</ rant>

Wednesday, 18 February 2009

What not to do 5 minutes before a meeting...

I typically don't read the Metro magazine on my way to work, but today I did. And there was a little snippet in there about a YouTube video that was supposed to be very funny. This tickled my curiosity, so when I found myself with a couple of minutes to spare before a meeting this morning, I went to YouTube to watch the clip about David After Dentist.

Bad decision. It took me a good 10 minutes to stop grinning and get my sudden bursts of laughter under control. Not good when someone's called a meeting with you to give you an update on serious matters.

Funny as hell though :-)