Friday, 13 March 2009

Names and the Importance of Semantics

The 'post' below is taken from some documentation I recently wrote for the developers where I'm currently working. It appears to me that, in general, developers (and, indeed, architects) spend far too little time thinking about names and naming conventions. Personally I spend hours thinking about names. Good names are an incredibly important aspect of good software design. Anyway - if you care to read on, this is what I've said on the subject so far:

Aside from writing code that is correct, efficient, and actually works, the most important aspect of software development is, arguably, naming. Choosing good names for namespaces, classes, methods, and properties (even member variables, local variables and parameters) is incredibly important (and also difficult) because of the semantics they convey. A name must be unambiguous and clearly convey the purpose and function of the named component when viewed in isolation, as well as when it is viewed in the context of its root namespace, immediate namespace, class, and so on.

Good type and namespace names leave little room for misunderstandings and mistakes caused by ambiguity. Namespaces and types defined in the .NET Framework are clearly named throughout, and the semantics of each name are typically very clear. A good example of this is the System.IO namespace.

The System.IO namespace name is unambiguous because it is short, because the relationship between the components of the namespace ("System" and "IO") is clear, and because each component of the namespace is named well semantically:

"System" implies a logical grouping of functionality that is 'close to the machine' or 'close to the framework'. The "System" namespace, by virtue of its name, is very clearly not a task or application specific namespace.

While it may be argued that the word "System" is ambiguous because it can encapsulate so much functionality (and varied functionality at that), when seen in its context (it is the main root namespace of the entire framework) the name is still very clear.

"IO" is a very old and commonly used abbreviation (for "Input/Output") in computer science and is generally associated with data transfer between devices (both internal and peripheral). The immediate association of "IO" is that of file read/write operations, and this is exactly the kind of functionality that the classes in this namespace provide.

System.IO is therefore a really good namespace name because the semantics of the first component's name lend meaning to the other. "System" tells the developer that we're dealing with general, non-application-specific functionality, and "IO" implies that the functionality is file-operation specific.

The File class within the System.IO namespace is another example of good naming. Its fully qualified name, System.IO.File, makes it absolutely clear what the class is and what it does - it is very clearly concerned with files on a file system and provides file operations to the caller.

The methods on the File class are also very well named. A couple of examples illustrate how clear, short names can convey a lot of information when viewed in their proper context:

public static System.IO.FileStream Open(string path, System.IO.FileMode mode)

The method name File.Open is in itself very clear; it is obvious that the Open() method will open a file on a disk. There is sometimes a tendency to be too specific when naming methods - the above method might for example be called OpenFile(). Viewed entirely in isolation, the second method name, OpenFile(), is clearer than Open() - but when viewed in the context of its class, OpenFile() is obviously a poorer name than Open() because the context (the File class) already makes it obvious that we are in fact dealing with files!
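To make the comparison concrete, here is a minimal sketch that puts the two names side by side at the call site. VerboseFile and its OpenFile() method are hypothetical, invented only for this comparison; File.Open, FileMode and Path.GetTempFileName are the real framework members.

```csharp
using System;
using System.IO;

// Hypothetical class, invented only for this comparison.
static class VerboseFile
{
    // "File" in the method name repeats what the class name already says.
    public static FileStream OpenFile(string path, FileMode mode)
    {
        return File.Open(path, mode);
    }
}

static class OpenNamingDemo
{
    // Opens the same file through both names; returns true if both streams were readable.
    public static bool Run()
    {
        string path = Path.GetTempFileName(); // creates an empty temporary file
        bool bothReadable;

        // Reads as "file, open": the class supplies the noun.
        using (FileStream stream = File.Open(path, FileMode.Open))
        {
            bothReadable = stream.CanRead;
        }

        // Reads as "verbose file, open file": the noun is stated twice.
        using (FileStream stream = VerboseFile.OpenFile(path, FileMode.Open))
        {
            bothReadable = bothReadable && stream.CanRead;
        }

        File.Delete(path);
        return bothReadable;
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Both calls do exactly the same thing; the only difference is how much of the name is noise once the class name is in view.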

public static bool Exists(string path)

This method name is clear because it can be phrased as a question with a simple yes/no (or true/false) answer: "Does this file exist?"
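A small sketch of how that question reads in calling code. The ExistsDemo class and its Describe() method are assumptions made up for the illustration; only File.Exists, File.Delete and Path.GetTempFileName come from the framework.

```csharp
using System;
using System.IO;

static class ExistsDemo
{
    // Hypothetical helper: reports whether the file at 'path' is present.
    public static string Describe(string path)
    {
        // The condition reads almost like the question itself: "if the file exists..."
        if (File.Exists(path))
        {
            return "found";
        }

        return "missing";
    }

    static void Main()
    {
        string path = Path.GetTempFileName(); // creates an empty temporary file
        Console.WriteLine(Describe(path));

        File.Delete(path);
        Console.WriteLine(Describe(path));
    }
}
```

Because the name is phrased as a yes/no question, the if statement needs no comment to explain what is being tested.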

public static void Move(string sourceFileName, string destFileName)

This method name is unambiguous (clearly File.Move() is a method that moves a file on a file system), but pay particular attention to the parameter names; sourceFileName and destFileName leave no doubt about what you are dealing with. If this method's signature is paraphrased into a sentence it would read something like "move this source file to this destination."
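One way to see the value of those parameter names is to call the method with named arguments (available from C# 4 onwards) and compare it with a vaguely named alternative. In the sketch below, MoveDemo, MoveVague() and its parameters a and b are hypothetical counterexamples; sourceFileName and destFileName are the actual parameter names of File.Move.

```csharp
using System;
using System.IO;

static class MoveDemo
{
    // Hypothetical counterexample: nothing in the signature says which path is the source.
    public static void MoveVague(string a, string b)
    {
        File.Move(a, b);
    }

    // Moves a temporary file and returns true if it ended up at the destination.
    public static bool Run()
    {
        string source = Path.GetTempFileName();   // creates an empty temporary file
        string destination = source + ".moved";   // assumed not to exist yet

        // Named arguments read close to "move this source file to this destination".
        File.Move(sourceFileName: source, destFileName: destination);

        bool moved = !File.Exists(source) && File.Exists(destination);
        File.Delete(destination);
        return moved;
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

With MoveVague(a, b) the caller has to look up the documentation to find out which argument comes first; with the framework's names the signature answers that question by itself.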

The examples above illustrate the importance of context when constructing names in software. A named element must make sense in isolation, but it is also very important that it makes sense contextually. Methods must be named so that the name itself carries the correct semantics, but the method's parameters (and return type) should be named so that they add further semantic value to the method's signature.

1 comment:

BigBrother said...

Oh my God! And here I was thinking you had a life. Shock to the system. That's all this fishy relative can say. :-)