Thursday, April 29, 2010

Notes on Software Design, Chapter 0: What?

As an author, I've learnt that the introduction is best written last, when the text has unfolded and you really know what your paper is about.
I'm far from that stage with my Notes on Software Design, but I see things clearly enough now that I can write a reasonable introduction, hence Chapter 0, coming after Chapter 5.

It all starts with the realization that software is just a material, and software design (at any level) is an act meant to give shape (or form) to the material. Software, however, is not a physical material, so we can't borrow from traditional disciplines to seek guidance. Well, sort of.

What do we know about physical materials?
I'm not trying to write an essay on materials science, so I'll focus on a few central issues here.

Materials have well-defined properties. Examples of properties are electrical conductivity, coefficient of thermal expansion, hygroscopy, and so on.

Properties enrich our language and make reasoning more effective (Donald Norman would say that properties "make us smart"). Arguments based on well-defined properties are more robust and don't lead to endless debates. "This material has high hygroscopy, so keep it away from moisture". Period. No debate.

Properties allow us to focus: I need to sustain high compression. Therefore, I need high compressive strength. Etc.

Properties allow us to select the best materials. Elasticity is required if you want your material to come back to its original shape after stress. Actually, if you know the stress, you can pick the best material, based on a set of criteria defined by the target product. An important criterion, of course, would be cost.

Properties provide guidance while shaping the material. In fact, we even have manufacturing properties, like castability (see the Wikipedia page on material properties).

Now, properties are well-defined when they are based on replicable, experimental observations. Most often, they are based on both a quantitative measurement process and on a physical theory of matter.

The quantitative side is often based on forces. Compressibility is based on Pressure (and Volume). So we can link everything back to a few well-understood forces.

The theoretical side is usually based on a model of matter. Physical materials are subject to well-known laws, like Newtonian physics (at the right scale). The model of matter itself (e.g. the notion of the crystal structure of metals) is helpful when we try to understand why materials behave in a certain way under some specific force.

Moving up (as I said, I'm not trying to write a comprehensive essay on physical materials), we have construction principles and patterns. This is encoded knowledge, prescribing what we should or should not do, or describing how to do something. The bright side is that, in the modern world, we can trace back most principles and patterns to a set of well-defined forces and properties.

Finally, we have tools. Finite-Element Analysis, for instance, can be used to investigate the mechanical, thermal, electromagnetic, etc., properties of a large structure.

What do we know about software as a material?

Not much, I'm afraid. Our knowledge is basically articulated in:

- Principles
- Patterns and Blueprints
- Methods
- Metrics
- Ilities

There is no shortage of principles. This interesting page, for instance, lists quite a few. Many are ill-defined and redundant, but a working knowledge of most principles is considered necessary for any good programmer / designer.

We have patterns - I know I don't have to say more about this. We also have reference architectures, which are not quite the same as patterns or pattern languages, but close enough.

We have methods. Test Driven Design, for instance, is a method, not a principle.

We have metrics. Metrics seem to be very close to properties. Halstead defined a concept of "volume" for software, based on information theory. The well-known Chidamber and Kemerer suite defined a number of property-like concepts like Depth of Inheritance Tree, Number of Children, and so on. Metrics have the nice property of being easy to measure (most of the time). They have the dubious property of being very remotely connected with design reasoning. In fact, most literature on metrics is about proving they're not meaningless, mostly by showing some correlation with bug density or change density or something more connected with software development.
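To see just how easy these are to measure, here is a minimal sketch in Python. It assumes the usual textbook form of Halstead's volume, V = N * log2(n), with N the total number of operator and operand occurrences and n the number of distinct ones. The function names, the pre-tokenized input, and the tiny Shape hierarchy are all hypothetical, made up for illustration.

```python
import math


def halstead_volume(operators, operands):
    """Halstead volume V = N * log2(n) from pre-tokenized code.

    N is the total count of operator and operand occurrences,
    n is the number of distinct operators plus distinct operands.
    """
    length = len(operators) + len(operands)                # N
    vocabulary = len(set(operators)) + len(set(operands))  # n
    return length * math.log2(vocabulary) if vocabulary else 0.0


def depth_of_inheritance(cls):
    """Longest path from cls up to the root, not counting `object`."""
    bases = [b for b in cls.__bases__ if b is not object]
    if not bases:
        return 0
    return 1 + max(depth_of_inheritance(b) for b in bases)


def number_of_children(cls):
    """Number of immediate subclasses of cls."""
    return len(cls.__subclasses__())


# Hypothetical hierarchy, only here to have something to measure.
class Shape: pass
class Polygon(Shape): pass
class Circle(Shape): pass
class Square(Polygon): pass


# Tokens of the hypothetical statement `a = b + b`.
volume = halstead_volume(["=", "+"], ["a", "b", "b"])
print(volume, depth_of_inheritance(Square), number_of_children(Shape))
```

A dozen lines give us three numbers. What none of them says is whether Square should have been a subclass of Polygon in the first place, which is exactly the gap between measuring and design reasoning.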

We have -ilities, like reliability, scalability, and so on. These are often ill-defined, hard to measure properties of the final product. More akin to defining a "safe car" than to defining the properties of an alloy.
Some authors have contributed more ility-like properties. Robert Martin, for instance, talks about Rigidity, Fragility, Viscosity and so on, but they are mostly based on metaphorical reasoning, not on a solid theory. They're also partially overlapping, and not precisely defined.

What don't we know about software as a material?

We don't have a theory of forces. Lacking forces, it's basically impossible to come up with meaningful properties. Compressibility can't be defined when you don't even know about compression (pressure).

We don't have true properties. Properties should extend from language design to library design to application design. Properties should encompass paradigms. Properties should be based on a sensible theory of what software design is about, not on the mere fact that we can measure something or come up with a nice formula. Properties must be perceived as useful by software designers.

We don't have a way to model forces and properties, basically because we don't know jack about them.

Some perspective
I live in an old building (and I like it :-). When you look at the walls, however, you can almost hear the architect thinking "every problem can be solved by making some wall thicker". Modern buildings are designed with completely different techniques. The next-generation green buildings will make the most out of our knowledge of construction materials.

I also live in the software world. You probably know the adage "All problems in computer science can be solved by another level of indirection" (David Wheeler). Why is that? What is, exactly, a level of indirection? What is indirection, by the way? Why is it useful? What is the underlying theory, what is a reasonable measure? What is the real difference between indirection in data and in control flow? Pick your favorite book on software design, and look up the answer if you can find it :-).
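As a purely illustrative sketch (one possible reading of the distinction, not an answer to the questions above), here is indirection in data next to indirection in control flow, in Python. The `settings` and `handlers` tables and both functions are hypothetical names invented for the example.

```python
# Indirection in data: the caller holds a key, not the value itself.
# The value is looked up through the table every time it is needed.
settings = {"timeout": 30}


def read_timeout():
    return settings["timeout"]


# Indirection in control flow: the caller holds a name for the code to run,
# not the code itself. The target is looked up at call time.
handlers = {"greet": lambda name: f"hello, {name}"}


def dispatch(action, arg):
    return handlers[action](arg)


# In both cases the target can change without touching the caller.
settings["timeout"] = 60
handlers["greet"] = lambda name: f"bye, {name}"
print(read_timeout(), dispatch("greet", "Ann"))
```

Both callers survive the change untouched, yet the sketch says nothing about why, how much it costs, or how to measure it.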

So what?
Well, in the end, this is what I'm aiming for. A theory of software forces. A set of useful properties of software as a material. This is what all this stuff is about. I haven't worked out everything yet. Actually, I've changed my mind on several things I've written so far (hey, it's a blog, not a book :-). But it's slowly coming together.

Oh, by the way. I know quite a few people that won't feel good about the above. They want software to be an art. They want software to be "about humans".
Let me state this clearly: I'm not trying to pursue some sort of "deskilling", whereby any fool could put together great software by applying some sort of magic process. I don't believe in deskilling. Actually, I believe in upskilling. I also understand the idea of software development as a craft, and even as an art. I know the poetry of code, so to speak :-). I rely on intuition and tacit knowledge every single day.
Still, no amount of craftsmanship will prevent a cold, thin glass from breaking when force is applied. I want to know why, and how to shape it anyway, and I want a better way to say that, to teach that. I want to reach a deeper understanding upon which better, more ambitious systems can be built.

Too much? Most likely :-), but hey, what is life without a noble aspiration?


Unknown said...

It seems that you are aiming for something yet undiscovered by others, or maybe not set in a proper form. In this post, though, you're comparing software with materials. But what if software is a really independent/incomparable thing? Unrelated to everything else?

There can equally be forces, but as you said maybe no properties. Are forces enough to be used for comparison? I know, I know, I'll wait for the following chapters. B)))

From Christopher Alexander on, software has been compared to building, with its design, rules, and again forces: I see, though, that even "negative" forces are very different. I mean, the house of straw of the first little pig can be destroyed by rain, wind, fire, but how many lines of PHP or C code have you seen that are still doing their dirty job?

Carlo Pescio said...

I take your comment as a chance to state some of my points better.

Of course, software is not a physical material. I do not expect to find the same kind of properties that we can find in physical materials. Forces are also different. That does not break the metaphor.

Software is encoded knowledge. Still, we shape software. We can implement the same function using totally different forms. A different form gives the final product different -ilities (testability, extendibility, etc). I use the "material" metaphor (much like Donald Schon did in his observation of professionals, see also my paper "Listen to Your Tools and Materials") not because I want to compare software construction to building construction, but because it's the best metaphor I've found so far. Leaving science aside, even a craftsman or an artist is usually shaping a material, and a deep understanding of your materials (even at the tacit level) makes you better at shaping them. It's not very different from code. Even as I'm writing, I'm shaping a material.

Consider a piece of software making a choice. Suppose that choice can be implemented in (at least) two ways: using a switch/case or polymorphism. The function is the same. The shape is different. You say that software may not have properties. So what is the difference? Something that we can't define precisely? A vague "extendibility" property? How would that compare to dynamic loading and name lookup, a table-based implementation, or with an AOP-based solution? Are we condemned to stay forever in the dark, debating over design choices without any useful theory? I don't think so.

There is a force - let's say change, although I'll be using a different term later on - and the two shapes are reacting differently under that force. They react differently because they have different properties. It's pretty simple. The problem is defining the forces and the properties precisely. I don't need high formality. I just need a sound theory with practical applications.
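The two shapes can be sketched in Python as follows. This is a minimal illustration under stated assumptions: an if/elif chain stands in for switch/case, and the area computation, `area_by_tag`, `Figure` and its subclasses are hypothetical names chosen for the example, not taken from any real system.

```python
import math
from dataclasses import dataclass


# Shape 1: the choice is an explicit conditional on a type tag
# (an if/elif chain standing in for switch/case).
def area_by_tag(kind, size):
    if kind == "circle":
        return math.pi * size ** 2
    elif kind == "square":
        return size ** 2
    raise ValueError(f"unknown shape kind: {kind}")


# Shape 2: the choice is delegated to polymorphic dispatch.
class Figure:
    def area(self):
        raise NotImplementedError


@dataclass
class Circle(Figure):
    radius: float

    def area(self):
        return math.pi * self.radius ** 2


@dataclass
class Square(Figure):
    side: float

    def area(self):
        return self.side ** 2


# The force: a new kind of figure arrives.
# The polymorphic shape absorbs it by adding a class; existing code is untouched.
@dataclass
class Triangle(Figure):
    base: float
    height: float

    def area(self):
        return 0.5 * self.base * self.height


print(Square(2).area(), Triangle(4, 3).area())
# The conditional shape does not know triangles until area_by_tag itself is
# edited: area_by_tag("triangle", 1) raises ValueError.
```

Same function, different reaction to the same change: one shape grows by addition, the other by modification. Turn the force around (a new operation on every figure instead of a new kind) and the reaction flips, which is why a vague "extendibility" is not enough to describe the difference.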

Note: I never speak of "negative" forces. Forces are forces. Forces have consequences. We may or may not like the consequences. "Dirty code", for instance, is the typical ill-defined concept that leads to endless debates.
When something "dirty" works just fine, perhaps we haven't properly understood what dirty is about: grease may look dirty, but without grease most heavy-duty machinery would simply jam. Or it may be just "dirty" like "unmaintainable" (another ill-defined concept) but still working (waiting for the wind of change to bring it down). Or ...

It just looks like alchemy before chemistry :-). I know most people like it this way; actually, they want it to be this way, because there are so many opportunities within confusion. And alchemy is so romantic! Still, chemistry proved its value over and over.

And yes, I'm aiming for something yet to be discovered. After all, software design research is dead. But that would be a long story :-).

cyrille said...

I love this series on physics of software!

I just came across a post that compares software with the properties of elasticity, plasticity and fracture of metals, thought you might enjoy that other smart people think about that too:

Carlo Pescio said...

Cyrille: excellent pointer!

Funny thing is, over time I've been pondering quite a bit on technical debt, which is a simple metaphor but (unfortunately) would not stand up to scrutiny from an economist. There have been several attempts to improve/fix the concept, which (unfortunately again) break its simplicity. In my effort, I ended up with a better model based on real options, but nowhere near as intuitive as the simple notion of debt.

Still, it didn't occur to me to reconsider technical debt under the "physics of software" perspective. Although I'm not sure that the parallel with deformation is the best suited, the idea is indeed excellent!

As I move further in my exploration of the subject, I'll certainly come back to this.
