Saturday, August 04, 2007

Get the ball rolling, part 4 (of 4, told ya :-)

The past two weeks have been hectic to say the least, and I had no time to conclude this quadrilogy. Fear not :-), however, because here I am. Anyway,I hope you used the extra time to read comments and answers to the previous posts, especially Zibibbo's comments; his contribution has been extremely valuable.
Indeed, I've left quite a few interesting topics hanging. I'll deal with the small stuff first.

Code [and quality]:
I already mentioned that I'm not always so line-conscious when I write code. Among other things, in real life I would have used assertions more liberally.
For instance, there is an implicit contract between Frame and Game. Frame is assuming that Game won't call its Throw method if the frame IsComplete. Of course, Game is currently respecting the contract. Still, it would be good to make the contract explicit in code. An assertion would do just fine.
I remember someone saying that with TDD, you don't need assertions anymore. I find this concept naive. Assertions like the above would help to locate defects (not just detect them) during development, or to highlight maintenance changes that have violated some internal assumption of existing code. Test cases aren't always the most efficient way to document the system. We have several tools, we should use them wisely, not religiously.

When is visual modeling not helpful?
In my experience, there are times when a model just won't cut it.
The most frequent case is when you're dealing with untried (or even unstable) technology. In this context, the wise thing to do is write code first. Code that is intended to familiarize yourself with the technology, even to break it, so that you know what you can safely do, and what you shouldn't do at all.
I consider this learning phase part of the design itself, because the ultimate result (unless you're just hacking your way through it) is just abstract knowledge, not executable knowledge: the code is probably too crappy to be kept, at least by my standards.
Still, the knowledge you acquired will shape the ultimate design. Indeed, once you possess enough knowledge, you can start a more intentional design process. Here modeling may have a role, or not, depending on the scale of what you're doing.
There are certainly many other cases where starting with a UML model is not the right thing to do. For instance, if you're under severe resource constraints (memory, timing, etc), you better do your homework and understand what you need at the code level before you move up to abstract modeling. If you are unsure about how your system will scale under severe loading, you can still use some modeling techniques (see, for instance, my More on Quantitative Design Methods post), but you need some realistic code to start with, and I'm not talking about UML models anyway.
So, again, the process you choose must be a good fit for your problem, your knowledge, your people. Restraining yourself to a small set of code-centric practices doesn't look that smart.

What about a Bowling Framework?
This is not really something that can be discussed briefly, so I'll have to postpone some more elaborated thoughts for another post. Guess I'll merge them with all the "Form Vs. Function " stuff I've been hinting to all this time.
Leaving the Ball and Lane aside for a while, a small but significant improvement would be to use Factory Method on Game, basically making Game an abstract class. That would allow (e.g.) regular bowling to create 9 Frames and 1 FinalFrame, 3-6-9 bowling to change the 3rd, 6th and 9th frame to instances of a (new) StrikeFrame class, and so on.
Note that this is a simple refactoring of the existing code/design (in this sense, the existing design has been consciously underengineered, to the point where making it better is simple). Inded, there is an important, trivial and at the same time deep reason why this refactoring is easy. I'll cover that while talking about options.
If you try to implement 3-6-9 bowling, you might also find it useful to change
if( frames[ currentFrame ].IsComplete() )
while( frames[ currentFrame ].IsComplete() )

but this is not strictly necessary, depending on your specific implementation.
There are also a few more changes that would make the framework better: for instance, as I mentioned in my previous post, moving MaxBalls from a constant to a virtual function would allow the base class to instantiate the thrownBalls array even if the creational responsibility for frames were moved to the derived class (using Factory Method).
Finally, the good Zibibbo posted also a solution based on a state machine concept. Now, it's pretty obvious that bowling is indeed based on a state machine, but curiously enough, I wouldn't try to base the framework on a generalized state machine. I'll have to save this for another post (again, I suspect, Form Vs. Function), but shortly, in this specific case (and in quite a few more) I'd try to deal with variations by providing a structure where variations can be plugged in, instead of playing the functional abstraction card. More on this another time.

Procedural complexity Vs. Structural complexity
Several people have been (and still are) taught to program procedurally. In some disciplines, like scientific and engineering computing, this is often still considered the way to go. You get your matrix code right :-), and above that, it's naked algorithms all around.
When you program this way (been there, done that) you develop the ability to pour through long functions, calling other functions, calling other functions, and overall "see" what is going on. You learn to understand the dynamics of mutually recursive function. You learn to keep track of side effects on global variables. You learn to pay attention to side effects on the parameters you're passing around. And so on.
In short, you learn to deal with procedural complexity. Procedural complexity is best dealt with at the code level, as you need to master several tiny details that can only be faithfully represented in code.
People who have truly absorbed what objects are about tend to move some of this complexity on the structural side. They use shape to simplify procedures.
While the only shape you can give to procedural code is basically a tree (with the exception of [mutually] recursive functions, and ignoring function pointers), OO software is a different material, which can be shaped in a more complex collaboration graph.
This graph is best dealt with visually, because you're not interested in the tiny details of the small methods, but in getting the collaboration right, the shape right (where "right" is, again, highly contextual), even the holes right (that is, what is not in the diagram can be important as well).
Dealing with structural complexity requires a different set of skills and tools. Note that people can be good at dealing with procedural and with structural complexity; it's just a matter of learning.
As usual, good design is about balance between this two forms of complexity. More on this another time :-).

Balls, Real Options, and some Big Stuff.
Real Options are still cool in project management, and in the past few years software development has taken notice. Unfortunately, most papers on software and real options tend to fall in one of these two categories:
- the mathematically heavy with little practical advice.
- the agile advocacy with the rather narrow view that options are just about waiting.
Now, in a paper I've referenced elsewhere, Avi Kamara put it right in a few words. The problem with the old-fashioned economic theory is the assumption that the investment is an all or nothing, now or never, project. Real options challenge that view, by becoming aware of the costs and benefits of doing or not doing things, build flexibility and take advantage of opportunities over time (italics are quotes from Kamara).
Now, this seems to be like a recipe for agile development. And indeed it is, once we break the (wrong) equation agile = code centric, or agile = YAGNI. Let's look a little deeper.
Options don't come out of nowhere. They came either from the outside, or from the inside. Options coming from the outside are new market opportunities, emerging users's requests or feedback, new technologies, and so on. Of course, sticking to a plan (or a Big Upfront Design) and ignoring those options wouldn't be smart (it's not really a matter of being agile, but to be economically competent or not).
Options come also from the inside. If you build your software upon platform-specific technologies, you won't have an option to expand into a different platform. If you don't build an option for growth inside your software, you just won't have the option to grow. Indeed, it is widely acknowledged in the (good) literature that options have a cost (the option premium). You pay that cost because it provides you with an option. You don't want to make the full investment now (the "all" in "all or nothing") but if you just do nothing, you won't get the option either.
The key, of course, is that the option premium should be small. Also, the exercise cost should be reasonable as well (exercise price, or strike price, is what you pay to actually exercise your option). Note: for those interested in these ideas, the best book I've found so far on real option is "Real options analysis" by Johnathan Mun, Wiley finance series)

Now, we can begin to see the role of careful design in building the right options inside software, and why YAGNI is an oversimplified strategy.
In a previous post I mentioned how a class Lane could be a useful abstraction for a more realistic bowling scorer. The lane would have a sensor in the foul line and in the gutters. This could be useful for some variation of bowling, like Low Ball (look under "Special Games"). So, should I invest into a Lane class I don't really need right now, because it might be useful in the future? Wait, there is more :-).
If you consider 5 pin Bowling, you'll see that different pins are awarded different scores. Yeap. You can no longer assume "number of pins = score". If you think about it, there is just no natural place in the XP episode code to put this kind of knowledge. They went for then "nothing" side of the investment. Bad luck? Well guys, that's just YAGNI in the real world. Zero premium, but potentially high exercise cost.
Now, consider this: a well-placed, under-engineered class can be seen as an option with a very low premium and very low exercise cost. Consider my Ball class. As it stands now, it does precious nothing (therefore, the premium cost was small). However, the Ball class can easily be turned into a full-fledged calculator of the Ball score. It could easily talk to a Lane class if that's useful - the power of modularity allows the rest of my design to blissfully ignore the way Ball gets to know the score. I might even have a hierarchy of Ball classes for different games (well, most likely, I would).
Of course, there would be some small changes here and there. I would have to break the Game interface, that takes a score and builds a Ball. I used that trick to keep the interface compatible with the XP episode code and harvest their test cases, so I don't really mind :-). I would also have to turn the public hitPins member into something else, but here is the power of names: they make it easier to replace a concept, especially in a refactoring-aware IDE.
Bottom line: by introducing a seemingly useless Ball class, I paid a tiny premium cost to secure myself a very low exercise cost, in case I needed to support different kinds of bowling. I didn't go for the "all" investment (a full-blown bowling framework), but I didn't go for the "nothing" either. That's applied real option theory; no babbling about being agile will get you there. The right amount of under-engineering, just like the right amount of over-engineering, comes only from reasoning, not from blindly applying oversimplified concepts like YAGNI.
One more thing about options: in the Bowling Framework section, I mentioned that refactoring my design by applying Factory Method to Game was a trivial refactoring. The reason it was trivial, of course, is that Game is a named entity. You find it, refactor it, and it's done. Unnamed entities (like literals) are more troublesome. Here is the scoop: primitive types are almost like unnamed entities. If you keep your score into an integer, and you have integers everywhere (indexes, counters, whatever), you can't easily refactor your code, because not any integer is a score. That was the idea behind Information Hiding. Some people got it, some didn't. Choose your teacher wisely :-).

A few conclusions
As I said, there are definitely times when modeling makes little sense. There are also times when it's really useful. Along the same lines, there are times when requirements are relatively unknown, or when nobody can define what "success" really means before they see it. There are also times when requirements are largely well-known and (dare I say it!) largely stable.
If you were asked to implement an automatic scoring system for 10 pin bowling and 3-6-9 bowling and 9 pin bowling, would you really start coding the way they did in the XP episode, and then slowly change it to something better?
Would you go YAGNI, even though you perfectly know that you ARE gonna need extensibility? Or would you rather spend a few more minutes on the drawing board, and come up with something like I did?
Do you really believe you can release a working 3-6-9 and a working 9-pin faster using the XP episode code than mine? Is it smart to behave as if requirements were unknown when they're, in fact, largely known? Is it smart not to exploit knowledge you already have? What do you consider more agile, that is, adaptive to the environment?
Now, just to provoke a reaction from a friend of mine (who, most likely, is somewhere having fun and not reading my blog anyway :-). You might remember Jurassik Park, and how Malcom explained chaos theory to Ellie by dropping some water on her hand (if you don't, here is the script, look for "38" inside; and no, I'm not a fan :-).
The drops took completely different paths, because of small scale changes (although not really small scale for a drop, but anyway). Some people contend that requirements are just as chaotic, and that minor changes in the environment will have a gigantic impact on your code anyway, so why bother with upfront design.
Well, I could contend that my upfront design led to code better equipped to deal with change than the XP episode did, but I won't. Because you know, the drops did take a completely different path because of small scale changes. But a very predictable thing happened: they both went down. Gravity didn't behave chaotically.
Good designers learn to see the underlying order inside chaos, or more exactly, predominant forces that won't be affected by environmental conditions. It's not a matter of reading a crystal ball. It's a matter of learning, experience, and well, talent. Of course, I'm not saying that there are always predominant, stable forces; just that, in several cases, there are. Context is the key. Just as ignoring gravity ain't safe, ignoring context ain't agile.


Fulvio said...

Sulle asserzioni e TDD: sembra quasi ovvio che i test non bastano soprattutto se sono test di unità. il rispetto dei contracts sarebbe megllio verificato in test di integrazione, ma soprattutto in quelli di sistema. e comunque non si terranno in conto possibili estensioni future non comprese nei test che le asserzioni possono verificare. almeno la penso così!

Sul visual modeling: mi hai convinto già tanto tempo fa, niente da dire :D

P Vs S: aspetto il post Form Vs Function che pare essere succulento già dalle premesse :D

Real Options: mi hai fatto riconsiderare le classi che sembrano perfettamente struct che di solito evito come la peste. In un paio di occasioni avrebbero potuto essere come la tua ball. Dovrò studiare un po' queste real options.

"Good designers learn to see the underlying order inside chaos, or more exactly, predominant forces that won't be affected by environmental conditions" questa me la stampo e la metto davanti al letto in modo da leggerla ogni mattina :D

Giorgio said...

I agree that some forces won't change as frequently as the others: they are probably implicitly assumed as stable in some Agile approaches. For example, in DDD we assume that we can compute most of the logic in objects that are kept in memory; in mathematics we assume that our operations will only scale if we write giant matrix and procedural code.
However, I have a different reason for ignoring requirements: choosing a subset of them and using them to guide my first model (usually in code) that I will expand later.
For example, in university I was writing an encrypted peer-to-peer file sharing system, and I didn't know where to start. But considering only the requirement for searching files, without encryption and a single server "got the ball rolling" for me. When I had built a client-server protocol and a multi-threaded server, I picked up the next features. Considering more than a few requirements was premature for me; of course the ability to choose the right subset for starting out would be handy.

Carlo.Pescio said...

Giorgio: I understand your point, and in a sense, it's very similar to Beck's idea of a "stepping stone" in Responsive Design (I left a long commentary on RD here: )

Of course, the dynamics of research projects and of most commercial software development tend to be different. This is not to say that we cannot use your approach - I actually do in many cases, usually with some strategic thinking behind it, like:

- I can "modularize awya" the decision. I may not want to deal with a portion of the requirement / design space right now, but I know how to modularize that decision away. I make the decision "invariant" for the rest of the system (see This is the ideal case, especially if I'm confident enough that I can work on that later.

- I have a backup plan. For instance, I'm consulting in a project right now, and although we know how to deal with a few requirements, we don't like the implications of doing so the "standard" way, as it conflicts with the cloud-based nature of the project. I've chosen to ignore the issue in release 0 and just exclude those features, as we learn more about what we can do. But it's not like we're just hoping for the best - worst case, we'll do it the standard way.

- In almost every other case, in a real-world, money-at-stake project, I would suggest starting with the most difficult part ( You don't want to spend a lot of time painting yourself into a corner. Fail fast if you have to fail.

That said, unlike in Hacker News, where failure is like "cool, welcome to the entrepreneur club" etc, in the cold world out there failure is often associated with bankrupcy :-) and serious financial issues for everyone involved. The point of the "conclusions" paragraph was to keep context in mind. I'll still consider that valid after 5 years :-).