Monday, July 02, 2012

Life without Stupid Objects, Episode 1


So, this is not, strictly speaking, the follow-up to the previous post. Along the way, I realized I was using a certain style in the little code I wanted to show, that it wasn't the style most people use, and that it would be distracting to explain that style while trying to communicate a much more important concept. 

So this post is about persistence and a way of writing repositories. Or it is about avoiding objects with no methods and mappers between stupid objects. Or it is about layered architectures and what constitutes a good layer, and why we shouldn't pretend we have a layered architecture when we don't. Or it is about applied physics of software, understanding our material (software) and what we can really do with it. Or why we should avoid guru-driven programming. You choose; either way, I hope you'll find it interesting.


"Line of Business" applications
Consider the usual business application: you insert, search, edit, and delete a variety of business entities, usually storing those entities in some kind of database. There is some logic kicking in at specific times, from simple validation logic to complex calculations and workflow actions. Sometimes, data is transformed; very commonly, it is summarized in reports. In many cases, it is persisted in a form which is not really different from what you see on the screen. Needless to say, there is also a user interface to do all that, and today, it's very common to go for a web application.

Let me leave the user interface aside; it's a large topic, and would (will) deserve some separate discussion. What is a reasonable shape (architecture) for all the rest? This is not a new question, and you can find many answers, picking from the various patterns in Fowler et al.'s Patterns of Enterprise Application Architecture, from the architectural suggestions in Evans's DDD book, from the reference architectures coming with specific technologies (like the now old EJBs) or toolsets (e.g. O/R mappers), etc.

The sheer volume of knowledge is so big that we face a problem of depth vs. breadth. Either we cover a lot of alternatives, but at a rather superficial level (as in Fowler's book), or we choose a particular style and go down deep, possibly providing an implementation as well.

At this time, I'm more interested in discussing forces and how they shape different solutions. I'll be focusing on a rather popular style, not because it's necessarily the best, but because it's a good example of how stupid objects may arise out of good intentions, and how we can cure that. In this sense, this post is more like an exercise in "applied physics of software" than an attempt to discuss all the nuts and bolts of every possible choice. In the end we'll see some code as well, because some properties are better appreciated through a few lines of code than through endless babbling.

Forces, layers and decisions
Of course, most of you guys already "know" how to do this. Sure, you may all have different architectural preferences, but there is a strong tendency to go for this overall structure:
- a database of some kind (often SQL, sometimes NoSQL)
- a "persistence layer"
- a "business layer"
- (possibly) an "application layer"
- (possibly) a "service layer"
- (a "gui layer", which I'll ignore)

Where do those layers come from? What is a layer, exactly, anyhow? It isn't just a random grouping of classes / functions / modules, right? What is the purpose, definition, and role of a layer in a software system? You could head for "Pattern Oriented Software Architecture, Vol. 1: A System of Patterns" (Buschmann et al.), chapter 2, go to the Layers pattern, and you won't find an exact definition of a layer :-), but you'll find its responsibilities: "Provides services used by Layer J+1. Delegates subtasks to Layer J-1". Hmmm. Well, ok.

Layers, by nature, are a grouping mechanism, whereby we grow software by stacking abstractions "vertically", while grouping similar software abstractions "horizontally". Why do we give that shape to our software? In the physical world, a shape is the result of applying a force (this is an interesting insight from D'Arcy Thompson, to which I'll return at some point), like compression or shear. In the software world, a shape is always the result of applying a decision.

Many decisions are taken without even thinking: for instance, we decide that data must persist between sessions, even if we turn power off. That's sort of a given and people don't even realize they're choosing something.

Some decisions are technological in nature:  we decide that we want to use a SQL database. We decide that we will use Oracle. We decide that we won't use stored procedures. Etc.

Some decisions are business related: we decide that we want to support multiple authentication techniques. We decide that we will greet the user with a "happy birthday" message if he logs in at the right time, etc.

Decisions are unstable. Some more than others (this will largely be the subject of my next post). Now, in a 40-year-old paper you can't afford not to read (On the Criteria To Be Used in Decomposing Systems into Modules; yes, that's forty years, kids) David Parnas said: "We propose instead that one begins with a list of difficult design decisions or design decisions which are likely to change. Each module is then designed to hide such a decision from the others."

So what is a good definition of a layer? A layer is a group of abstractions, hiding one or more internal decisions plus all the decisions made inside the underlying layers. Therefore, a layered system is "good" when you can actually change a decision made in one layer, and none of the layers stacked on top of it needs to change. In order for that to happen, a layer should avoid replicating/exposing the responsibilities of the underlying layer.

Let me give a proper example. In the quintessential layered architecture (hardware abstraction in operating systems), at some point there is a portion of code dealing with virtual memory, and with how you manipulate CPU-specific things to make that happen. This is not exposed to the upper layer (it could be, but more as a backdoor). Instead, we get a new abstraction (virtual memory). Languages like C# and Java effectively add another layer, basically hiding the concept of addressing altogether (still exposed in C and C++). This is, in a sense, the best layering: the concept disappears, and we don't even speak of a "memory layer".

Contrast this with a business layer built on top of a persistence layer, and yet exposing/replicating the persistence responsibilities (say, CRUD) for the application layer to call. This is not an ideal layered architecture (sorry), because your layers are becoming bigger as you travel up, and changes made in a lower layer may end up being reflected in an upper layer.

In practice, it's quite rare to find a problem (forcefield) which lends itself well to true layering. In many cases, layers are brittle, that is, most of the decisions made inside those layers can't actually be changed without breaking all the layers above. For instance, if you reach for the Buschmann book again, you'll find an example of layered design for a chess game, explained as follows (from bottom to top):

- Elementary units of the game, such as a bishop
- Basic moves, such as castling
- Medium-term tactics, such as the Sicilian defense
- Overall game strategies

In practice, you have very little latitude for change in each layer: you can safely change only minor implementation details. Suppose you change a "major" decision ("business rule"), like the basic moves of the game: you allow some previously illegal move, or vice versa. That requires a change in layer 2. However, that change would immediately invalidate all the upper layers: layering is not so effective here.

It is also worth observing that layers, although conceptually simple, are extremely constraining, as they basically define a single dimension upon which you can stack your abstractions, and therefore will push you toward a design where you have to identify a "principal decomposition" (contrast this with the goal of aspect orientation to free you from the tyranny of the principal decomposition).

In most cases, developers choose technology as the principal decomposition in Line of Business systems. I want to stress the fact that it is a choice (a decision), and it is not inherent in the concept of layers or in any design paradigm, but it's still the most popular choice.

Why technology?
Once you see a layer as a set of artifacts that are somehow "squeezed together" by some force, you can easily see that technology is only one of the forces acting on your artifacts.

Sure, you have a force saying "let's squeeze together all these classes, as they know they're dealing with SQL", the rationale being that if you change to NoSQL, you want to "change only one layer" (sooner or later, I'll come back to the concept of entanglement and explain this stuff better).

Yet, you have at least another (orthogonal) force, pushing all the technologies together, and layering the system according to the domain instead, the rationale being that if you add a field to one of your abstractions (like a title, so you can say "hello Dr. Something, happy birthday"), you only have to change one abstraction. When you choose technology as your principal decomposition, this decision is spread among layers instead, giving rise to an unfavorable entanglement field among many "distant" abstractions (I consider layers to be somehow distant in the artifact space).

There is also another force. I'm sure many of you have felt it, but frequently decided to look the other way :-). It's a force pushing the logic toward the data, which gave rise over time to things like stored procedures, still considered dirty, non-portable hacks by many. And yet, as you move toward a distributed, NoSQL solution heavily based on map-reduce, you'll see your (distributed) data attracting more logic, with nodes offering higher-level services, effectively subverting some of your layers. But we can't talk about these things in public.

So, in the end, people ignore all that and favor technology as the principal decomposition. There are many valid, and many invalid, reasons to do so. I won't even try to enumerate them, as reason #1 is "all the gurus are telling us this is the right decomposition", and you can't argue with that. Still, trying to list your reasons and pondering the validity of each one is an interesting venture in self-awareness :-). For instance, division of labor and technical specialization seems like a good reason, but it is not without consequences.

Note that, in the typical line-of-business application, the distinction between the "application layer" (or "use case layer", as some call it) and the domain layer is not necessarily driven by technology. Actually, that line is blurry, and again, it's mostly accepted as reasonable because the gurus tell us so, not because it is technically bulletproof. The typical definition is that the domain is about things that "are always valid", independently of the application / use cases, while the application layer realizes the use cases by delegating to the domain layer.
Makes sense, except when you look again, and discover that "always valid" makes no sense whatsoever. For instance, my User class offers a date of birth field (and possibly an IsBirthday() method) because my application wants to greet the user. Were I just interested (e.g.) in geolocating my user, that function wouldn't be provided; that field wouldn't even be there. So much for being part of the domain, then.
In the end, many people bring in technology again as a criterion to separate the application layer from the domain layer, by stating that domain objects should not know about persistence (which de facto alters the stack, but they don't say so, because the gurus never said so). This brings us to the next big question.

Acceptable relationships between layers
Some relationships are basically imposed by function (which here determines form). Some are free for us to choose. Of course, choosing some relationship is akin to choosing a stacking order, or to abandoning the stacking concept in favor of a more flexible dependency net (but hey, you can keep drawing it as if it were a stack - nobody is actually checking these things anyway :-).

- nobody should know the UI. That's the generally accepted idea. It's a bit preposterous, given that in many cases the back-end exists only to support the UI, but you can't challenge the notion in guru-driven development. It doesn't matter that, since the UI is a web app and bandwidth is limited, you need to expose paging at the service level, and probably push it down to the database level (using your SQL dialect of choice) to keep performance reasonable. You can still pretend that your stack does not depend on the UI, if you keep adding enough layers. So, let's pretend nobody knows about the UI.

- the service layer, if distinct from the application layer, has to know the application layer. That's sort of a necessity, because when you split the two, it's to allocate in the service layer all the knowledge (decisions) related to the "delivery mechanism", as Martin calls it, that is, knowledge of your services being exposed as SOAP, JSON, RPC or REST style, etc. But in the end you have to invoke the damn use case / application layer, so the service knows the application.

- is the persistence layer allowed to know the domain layer? Hmmm. This would be a violation of the concept of layers. The persistence layer should only know about one decision (the persistence technology). It should also hide the underlying layers. Unfortunately, to hide the domain layer, it should also expose the domain functions, because the application layer wants those functions. That's not going to work. Of course, if we give up the idea of layers, new shapes are possible. But the gurus want layers, so I won't explore the other shapes.

- at this point, the application layer has to know the domain layer. That's again sort of a given, because the application layer is thin and is just somewhat "orchestrating" the domain layer.

- is the domain layer allowed to know the persistence layer? This has been, and in some sense still is, at the center of a long-winded debate.
Proponents of techniques like Active Record and Lazy Fetching naturally think it's ok for a domain object to call into the persistence layer (perhaps without knowing so, through some ORM-generated code) when need arises.
Critics have many arrows to shoot. In practice, I've often observed horrible performance in systems where access to the persistence layer is basically out of control. I'm not the only one to have observed this, of course. Some tried / are still trying to solve the problem by making the whole thing even more complicated: for instance, they aggressively cache / prefetch objects, which requires a proper implementation of an identity map, etc. Others have gone with a more simplified approach: the domain layer is forbidden from talking to the persistence layer, and it's the application layer which is responsible for mediating between the two. As I said, this actually brings persistence and domain to the same level in the stack, but nobody wants to say so.

In the end, the intended shape of many LoB systems is one of these:


Note that the force that is keeping the application layer and the domain layer apart is reusability (real or imaginary) of the domain layer in case 1, and a more technological separation in case 2.

Thou shall have no stupid objects
Or perhaps thou shall. At some point people realize that we can't easily obtain the shapes above by using intelligent objects.

What is coming out of the service layer, once the details of delivery (SOAP, JSON, whatever) have been removed? Not domain objects, as the service layer has to talk only to the application layer.

What is going into / coming out of the persistence layer? Not domain objects, as the persistence layer can't talk to the domain layer.

What is going back to the service layer? Once again, not domain objects, because the service layer can only talk to the application layer. Not service-aware classes either, as the application layer can't create service-level objects.

So, most naturally, people tend to do the obvious thing: they add more classes, totally devoid of behavior, because damn, we have already allocated all the possible behavior in those !@#$ layers. Those classes go by many names, like DTO (Data Transfer Object), or even Entity in some Microsoft-oriented shops (not to be confused with Entities in DDD, of course, just like the Value Object in DDD should not be confused with stupid DTO-like things). Some (myself included) at some point see the sadness of writing those fake classes and try to adopt hashmap-like things, or dynamic objects in .NET, or whatever. Not a quantum leap anyway, and fiercely opposed by some people.
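To make the drudgery concrete, here is the kind of code this architecture breeds; a minimal sketch of mine, with hypothetical names (UserDto, UserMapper), not taken from any specific codebase:

```csharp
using System;

// a minimal domain class, standing in for the User discussed later in the post
class User
{
    public int Key { get; set; }
    public DateTime DateOfBirth { get; set; }

    public bool IsBirthday()
    {
        DateTime today = DateTime.Today;
        return today.Month == DateOfBirth.Month && today.Day == DateOfBirth.Day;
    }
}

// the "stupid object": same fields as User, zero behavior
class UserDto
{
    public int Key { get; set; }
    public DateTime DateOfBirth { get; set; }
}

// and its inseparable companion, the hand-written mapper
static class UserMapper
{
    public static UserDto ToDto(User u)
    {
        return new UserDto { Key = u.Key, DateOfBirth = u.DateOfBirth };
    }
}
```

Every field added to User must now be added to UserDto, and to UserMapper as well: three artifacts entangled on the same decision.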

The funny thing is that this kind of architecture is often sanctioned by gurus.  You can read what the ubiquitous Robert Martin has to say by starting from here, although it's much clearer if you watch the entire video (entertaining, albeit a bit slow) here. The slides in the first post can get you up to speed quickly (if you can fill the gaps; btw, the ".key" file is actually a ".zip" file).

Martin is actually right when he says that (ideally, at least) the "delivery mechanism" (aka the service layer) should be considered a detail. However, note how quickly he falls into the idea that you need stupid objects to carry around information. Indeed, at some point he declares that "it is ok for objects to simply hold some data". Hmm, ok, if you say so. Except that NO, IT'S NOT OK. Just because you're doing it, or know no better way to get what you want, does not make it ok.

Guru-driven design is dangerous. Gurus have to know best, so if they can't do something, that something is irrelevant; if something they do has consequences, those consequences are ok. For instance, a guru can claim that in DDD it is ok to have duplication (entanglement) between two bounded contexts, but not within a bounded context. Now, that's plain wrong, but people will accept it because you're a guru, there is no theory of forces to prove you wrong, and unlike in the physical world, the material won't react in a powerful way when you're wrong (try to negate the existence of gravity by jumping out the window instead :-). Of course, the guru could instead say that it's just the best trade-off he knows, but that's less guru-ish than ideal.

The actual danger is that people will stop trying. After all, the gurus say it's ok. So there is no need to explore new materials (AOP, a mixin revival, structural conformance, whatever), or new ways of shaping traditional materials (as I'll do in a moment). It's very much like having a guru saying that it's ok for a bulletproof window to be totally opaque, because he doesn't know how to make bulletproof glass. Except that I don't like the idea of people spending their life in darkness. So, sorry, it's not ok.

In this corner... the mapper!
Of course, when you condemn smart people to a never-ending pile of layers, all entangled on the same domain concepts, passing around stupid data structures, they can't just pretend not to see it. They won't challenge your architecture, because you're the guru, but they'll try to avoid some drudgery by writing smart code.

For various reasons, the only techniques with a dim chance of success here are reflection and generative techniques, and at some point everyone comes up with a concept of Mapper (emphasis on -er): something that can somehow find "equivalent" fields in similar objects and clone them. Given enough motivation, these things get beefed up and turned into full-fledged libraries, for popular joy.

Introducing "mappers" (hopefully semi-automatic classes) provides some relief for coders, but it is only a coping technique. For instance, run-time friction kicks in anyway. In a very interesting dissertation (Analyzing Large-Scale Object-Oriented Software to Find and Remove Runtime Bloat), Guoqing Xu found that a lot of inefficiency stems exactly from long "copy chains". Does it ring a bell?
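For the record, the core of such a mapper usually boils down to a few lines of reflection. A minimal sketch of mine (no caching, no type conversion, no nested objects, all things a real library must handle):

```csharp
using System;
using System.Reflection;

static class Mapper
{
    // copies every readable source property into a writable
    // target property with the same name and type
    public static void Map(object source, object target)
    {
        foreach (PropertyInfo sp in source.GetType().GetProperties())
        {
            if (!sp.CanRead)
                continue;
            PropertyInfo tp = target.GetType().GetProperty(sp.Name);
            if (tp != null && tp.CanWrite && tp.PropertyType == sp.PropertyType)
                tp.SetValue(target, sp.GetValue(source, null), null);
        }
    }
}
```

The runtime bloat Xu measured comes exactly from here: every hop across a layer boundary triggers another copy chain like this one.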

Still, I don't want to challenge the established architectures, so here is the modest contribution of this long post: to show you guys a possible way to keep those layers but let go of stupid objects (and hence of mappers). I don't claim originality: it's quite possible that many of you are already using this style. I came up with it a few years ago, mostly while trying to mediate between the desire of some of my clients to go with the "standard" layered architecture and my disgust for stupid objects. I haven't seen this style adopted and pushed to its real potential elsewhere, but it's entirely possible that it's commonly adopted by some and that I just had no chance to see it used. It's based on the semi-layered architecture where the domain layer can't see the persistence layer, but it is a very general idea.

On truly understanding your material
An important concept in the physics of software is that we have run-time things (like objects, with their memory footprint, and actual function calls, with parameters being passed and CPU cycles eaten) and artifacts, like source files, where we describe those things, their structure, their behavior.

In practice, we expect some run-time properties and some artifact-side properties, but traditional literature has never taken the time to explain where we want something to happen. For instance, where do we want layering to happen? Where do we want isolation of change? We want that on the artifact side, of course. When a decision changes, we want to change only one folder of source files (a layer). We may also want to reuse an artifact elsewhere, as a source file, a compiled library, or an executable. We actually don't care about run-time separation at all (to some extent - I'll add something about this in the "Critique" section).

Still, people (including gurus) tend to get trapped into a linear form of thinking: if artifact A (say, an Application-level class) can't talk to artifact P (say, a Persistence-level class) in terms of class D (say, a domain-object class), then I have to introduce another class, and therefore a new run-time object has to be created and then "mapped".

In fact, the emergence of stupid DTOs as a consequence of separation of artifacts is largely due to an underlying (wrong) assumption: that artifact separation requires run-time separation. This, however, is far from true, although it may require different programming techniques, depending on your language. From another perspective, the idea may suggest useful concepts we may want to be supported by our programming languages - this is akin to Materials science, and the idea of creating a new material based on the forces it has to withstand; contrast this with how many languages are created, based on notions of "purity" instead of a sound theory of forces.

Application talking to Persistence without DTOs
Say that you have an Application object, and this object has somehow obtained a Domain object (we'll see how in a moment). Say that you want to persist the Domain object, but unfortunately, the Persistence layer can't know about your Domain object in a layered architecture. Is there anything better than creating a DTO by basically cloning the Domain object (and then claiming that it's ok to have stupid objects)?

Well, if you think about it, while it's not ok for an object to exist only for the purpose of containing data, a Domain object may well offer an interface to read some of that data. After all, when we accepted technology as a principal decomposition, we also accepted some kind of entanglement on domain concepts, and that interface is there exactly to represent the R-entanglement we got.

Suppose I've got a User class (a domain concept) with some domain logic, like telling me whether or not today is the user's birthday. I could easily do this (example in C#, but without the I prefix on interfaces, so it looks like Java :-):

// the R-entanglement interface
interface UserData
{
  int Key { get; }
  DateTime DateOfBirth { get; }
}

// the Domain/Business object
class User : UserData
{
  // UserData implementation
  public int Key { get; private set; }
  public DateTime DateOfBirth { get; private set; }

  // some business logic
  public bool IsBirthday()
  {
    DateTime today = DateTime.Today;
    return today.Month == DateOfBirth.Month && today.Day == DateOfBirth.Day;
  }
}

// the Persistence layer, using the Repository pattern
class UserRepository
{
  public void Store( UserData u )
  {
    // I just want to read the data and issue an insert
  }
}

Well, that was easy. Just because I don't want my Repository to depend on my Domain object doesn't mean I have to clone the data before I pass it in. I can just use an interface (which does not need to live in the Domain layer).
Things get much harder the other way around, though, and I guess this is where people turn back to DTOs:

class UserRepository
{
  public UserData Get( int key )
  {
    // ????
  }
}

Of course, it's not enough to add "set" accessors to the UserData properties. We do have to do that, but the real issue is that my repository has to create the object: you can't instantiate an interface, and you don't want to create a Domain object inside the repository (because the repository is not supposed to know the Domain object). Note that even creation alone (without usage) would break the layer shape, as it would introduce a dependency between your artifacts.
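As for those "set" accessors, one possible shape (my assumption, not the only option) is to keep UserData read-only and derive a writable variant for the repository side, using C#'s member hiding:

```csharp
using System;

// read-only view: what most consumers of the data need
interface UserData
{
    int Key { get; }
    DateTime DateOfBirth { get; }
}

// writable variant: what a repository needs in order to fill the object;
// "new" re-declares each property, adding a setter
interface MutableUserData : UserData
{
    new int Key { get; set; }
    new DateTime DateOfBirth { get; set; }
}

// the domain object satisfies both views with ordinary auto-properties
class User : MutableUserData
{
    public int Key { get; set; }
    public DateTime DateOfBirth { get; set; }

    public bool IsBirthday()
    {
        DateTime today = DateTime.Today;
        return today.Month == DateOfBirth.Month && today.Day == DateOfBirth.Day;
    }
}
```

Code holding a UserData reference can read but not write; a repository asks for MutableUserData instead. This takes care of writing, but not of creation, which is the harder part.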

Now you need to know your material well enough to solve this problem, as there is no language-independent solution. I'll give you the C# solution; in C++, you can basically do the same thing (you don't even need the interface, as templates check conformance structurally). In Java you can't, so you have to bring in the heavy artillery, but most likely we're gonna do that in C# as well (more in a minute).

class UserRepository
{
  public T Get< T >( int key ) where T : UserData, new()
  {
    // ... get data from the database
    T u = new T();
    // fill u with data (needs "set" methods in UserData)
    return u;
  }
}

and usage in the application layer would just be:

class MyWonderfulUseCase
{
  public void DoSomethingUseful()
  {
    // ...
    User u = userRepository.Get< User >( key );
    bool b = u.IsBirthday(); // u is a full-fledged domain object :-)
    // ...
  }
}

Look ma, no stupid objects. No mappers. No unwanted dependencies between layers (as artifacts). No friction. No hashmaps. All type safe. No casts either.

What was that?
Forget the language for a moment. All I did was separate the artifacts without separating the instances (run-time things). I did that through interfaces + generics, because in C# and C++, generics / templates are powerful enough for this sort of thing. But generally speaking, what I want to do is:

- define an interface to access data (or two, if you want read/write separation).

- communicate between layers in terms of that interface (removing the need for a DTO).

- define a mechanism for the "lower layer" to create objects in the "upper layer" without having to know their classes. In this case, I used generic programming.

Generally speaking, you may want to use an Inversion of Control container instead. Besides working in Java as well, the IoC will be helpful in a variety of cases (like handling inheritance) where generics just won't work. In this case, the code above would become something like this (using Unity in C#):

class UserRepository
{
  public UserData Get( int key )
  {
    // ... get data from the database
    UserData u = container.Resolve< UserData >();
    // fill u with data (needs "set" methods)
    return u;
  }
}

There is another reason to switch to an IoC: the generic version is a drag when your objects contain other objects, which contain other objects, and they all come up with a single query, perhaps through a stored procedure. The IoC has no problem with that. The generic version requires you to specify all the types in the instantiation, which is not so nice (but not an insurmountable problem in many cases).

Note, however, that by removing the generic parameter, I have abstracted things a little too much: I wanted a User, but I'm getting back a UserData. So my application code would become:

      User u = userRepository.Get( key ) as User;
      bool b = u.IsBirthday(); // u is a full-fledged domain object :-)

which may not look like a cast, but is a cast anyway. This can be fixed to some extent if you add another interface on top of your domain class (which could still be useful for mocking) and combine the generic version above with the IoC. You'll still have a cast, but only in one place (the generic function).
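Here is a sketch of that combination, under the assumptions above; a toy Container stands in for your IoC of choice (a real one like Unity would do the same job), and UserData is writable for brevity:

```csharp
using System;
using System.Collections.Generic;

interface UserData
{
    int Key { get; set; }
}

class User : UserData
{
    public int Key { get; set; }
    public bool IsBirthday() { /* ... */ return false; }
}

// toy IoC container: maps an interface type to a factory function
static class Container
{
    static readonly Dictionary<Type, Func<object>> factories =
        new Dictionary<Type, Func<object>>();

    public static void Register<I>(Func<object> factory)
    {
        factories[typeof(I)] = factory;
    }

    public static I Resolve<I>()
    {
        return (I)factories[typeof(I)]();
    }
}

class UserRepository
{
    // generics preserve the caller's type; the IoC keeps the repository
    // ignorant of the concrete class; the single remaining cast lives here
    public T Get<T>(int key) where T : UserData
    {
        UserData u = Container.Resolve<UserData>();
        u.Key = key; // ... fill with data from the database
        return (T)u;
    }
}
```

The application layer still writes `User u = userRepository.Get<User>(key);`, with no visible cast.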

Of course, the gurus will readily tell you that your architecture should not depend on an IoC. That's a rather narrow view of things. I don't need an IoC. I could go with a factory. I also don't need polymorphism. I could go with function pointers. Except I'm using polymorphism, thank you. I'll also be happy to use a language-provided mechanism once you get me a decent one :-). In fact, the idea that most programming languages don't readily support decoupling instance creation from class name at the artifact level is appalling, and it's just another sign of why we need a physics of software. The IoC is a superstructure. I want an infrastructure.

In practice, moving outside this limited case, it would be even more beneficial to have mixins, so that we can mix behaviors into an object. I need run-time mixins though. This is a long story and perhaps I'll say something about it in the future, probably using the only widely adopted language I know where I can express those things (Javascript: ain't that ironic? :-). The design space out there is far less constrained than the usual gurus may lead you to believe. Don't fall into habit. Think think think.

What about the other "mappers"?
Or about the other stupid objects? If you look at Martin's example, for instance, he has a ResponseModel class (actually many: at least one per service) carrying only data. In practice, we tend to fall into one of these cases:

- that class is basically cloning the interface of some business object, and there is nothing smart to add, because the infrastructure will take care of the rest (for instance, in a .NET WebApi service, the JSON serializer [sic] is smart enough to send back my stuff without me adding behavior). Therefore, pass back the xxxData interface instead, or the read-only version if you like (you may recognize this as similar to storing something by passing a read-only interface, as above; after all, why should these two things be different at all?)

- there is significant behaviour that could be allocated there, but on the service side. For instance, the class represents an error, and you want to convert it into an HTTP status plus some JSON, and you need custom code. It's exactly the same issue we have between Persistence and Domain, except that now it is the Application layer which has enough information (the error, the message) to build the object, but can't create it, because its class lives across the border. Just use the same trick: get a writable interface from your IoC (or whatever), fill it with data, pass the thing to the Service layer, which will happily cast it to an intelligent object. Here I think a cast is unavoidable (and no, I'm not weeping in fear).
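A sketch of that trick for the error case (all the names, WritableErrorData and ErrorResponse, are hypothetical; an IoC would normally hand out the instance I create directly below):

```csharp
// the writable R-entanglement interface, visible to the Application layer
interface WritableErrorData
{
    int Code { get; set; }
    string Message { get; set; }
}

// lives in the Service layer: an intelligent object that knows
// how to render itself for the delivery mechanism
class ErrorResponse : WritableErrorData
{
    public int Code { get; set; }
    public string Message { get; set; }

    public int HttpStatus()
    {
        // error codes map straight to HTTP here, for brevity
        return (Code >= 100 && Code < 600) ? Code : 500;
    }

    public string ToJson()
    {
        return "{\"code\":" + Code + ",\"message\":\"" + Message + "\"}";
    }
}

// Application layer: fills the data through the interface,
// without knowing the concrete service-level class
static class SomeUseCase
{
    public static WritableErrorData Fail(WritableErrorData e)
    {
        e.Code = 404;
        e.Message = "no such user";
        return e;
    }
}

// Service layer: casts back to its own intelligent object
// (the one unavoidable cast) and renders it
static class SomeService
{
    public static string Render(WritableErrorData e)
    {
        ErrorResponse r = (ErrorResponse)e;
        return r.HttpStatus() + " " + r.ToJson();
    }
}
```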

How you pass data from the service layer to the application layer is left as an exercise for the reader :-). In practice, this depends a lot on your technology stack, how far it is already trying to go (model binding), and how well it plays with interfaces, IoCs, and the like. If it doesn't, I would say it's more a limitation of a specific implementation than a fault of the general principle. Of course, some guru will tell you that your Model is not your Domain, and that you need just another layer. Oh well. Where is YAGNI when I need it?

Critique
In practice, I've heard three distinct critiques (plus one):

- it requires an IoC. Funny enough, this is usually coming from people already using an IoC :-), but for other, more guru-sanctioned things like instantiating a logging service (which they never ever change). As I said, I don't need an IoC. I need a flexible way of separating creation from static knowledge of a class name. The IoC is just a reasonable way to get that.

- I may then call business logic inside the persistence layer (as I'm creating a business object). Yes, but we'll spot you as you'll have a dependency on the business package. Sure, you can try to defeat the checking through reflection. But then, you can always load an assembly at run-time using reflection as well. As good Bjarne used to say, we want to protect ourselves from Murphy, not from Machiavelli.

- I don't mind writing all those structures and mappers. It's simple code. It's relaxing. Now, this may sound stupid, and I can honestly say that I felt like saying something very impolite when I first heard this argument. In retrospect, I understand the feeling. Technically, it's work, but it's rote work, so you can turn off your brain for a while, and nobody can say anything because hey, somebody's gotta write that code. In a high-stress environment, having something brainless to do can give you a legitimate break. That said, I don't like the argument, because it's corrupting the design to cope with a problem in a very different area (management). It's not my style, sorry.

- a small variant to the above is that I'm leaving some choices open (like, should I return a readable interface to the service layer, or create a service-layer instance from the application layer using an IoC). What's worse, I'm not mandating a style: you may want to use one thing or the other, depending on context. But that requires you to think, which is the opposite of the brainless coding above. Sorry. I'm into #braindrivendesign. There is no way around that :-).
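By the way, the "flexible way of separating creation from static knowledge of a class name" from the first critique needs very little machinery. A toy sketch (all names hypothetical; this is the bare minimum, not a real IoC container, which would add lifetimes, wiring, configuration, and so on):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// The minimum needed: create an instance without naming its class at the
// call site. The binding lives in one place (the composition root).
class Resolver {
    private final Map<Class<?>, Supplier<?>> bindings = new HashMap<>();

    <T> void bind(Class<T> iface, Supplier<? extends T> factory) {
        bindings.put(iface, factory);
    }

    @SuppressWarnings("unchecked")
    <T> T resolve(Class<T> iface) {
        return (T) bindings.get(iface).get();
    }
}

// Illustrative data interface and domain class.
interface UserData {
    String getName();
    void setName(String name);
}

class User implements UserData {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

class ResolverDemo {
    public static void main(String[] args) {
        Resolver ioc = new Resolver();
        ioc.bind(UserData.class, User::new);      // only here is User named

        UserData u = ioc.resolve(UserData.class); // persistence code never is
        u.setName("Alice");
        System.out.println(u.getName());
    }
}
```

Calling `ioc.resolve(UserData.class)` costs about the same keystrokes as `new`, which is the point: the difference is where the class name is known, not how much code you write.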

What about testing? Right, somebody's gotta play the T card at some point (this is the "plus one"). But actually, this technique plays very well with all your testing needs, as everything is interface-based and can be easily mocked. You don't need stupid objects to be test-friendly.
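For instance (names purely illustrative), when the code under test depends only on the data interface, a one-line lambda can stand in for the whole domain object:

```java
// Illustrative read-only interface; a single abstract method, so a lambda
// can act as a trivial hand-rolled mock.
interface UserData {
    String getName();
}

// Something persistence-ish that depends only on the interface.
class RowWriter {
    static String toRow(UserData u) {
        // Illustrative only -- real code would use parameterized queries.
        return "INSERT INTO users (name) VALUES ('" + u.getName() + "')";
    }
}

class MockDemo {
    public static void main(String[] args) {
        UserData mock = () -> "test-user";  // the entire "mocking framework"
        System.out.println(RowWriter.toRow(mock));
    }
}
```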

However, I'd like this post to contribute to Brain-Driven Design, not to Guru-Driven Design. So, don't do this just because I said so. Just remember it can be done; you may not have tried this shape before: give it a try when you see a good fit, and see what happens. If, instead, you find yourself thinking "no no no no" and spending all your mental energy trying to find faults in this technique, and not a single moment pondering what's good, well, as I said above, it could be a nice step toward self-awareness to stop and think why (I did this exercise many times in my life :-).

The end
Honestly, I don't think the layered shape is a good fit for LoB applications. As I said, very few systems can be built with a truly layered shape, and most often, there is a significant difference between what is told/drawn and what is done (code). The most natural shape for OO systems is a network of services, and a very interesting thing would be to redefine the database as a provider of services, not merely data. This would also be more aligned with the needs of distributed technologies and map/reduce (and therefore make your [ex]persistence layer more robust in the face of technology changes). It's actually easier than you think, and sometimes all we need is a small change in our queries. I'll touch on this in my next post, provided I don't forget :-).

Still, even with the heavy constraints of a semi-layered architecture, we can avoid stupid objects without losing separation of concerns (which is an artifact thing, not a run-time thing). Note that I didn't push (as some have done over the years) any presentation responsibility into the business objects (yes, a RenderInXml method is still a presentation responsibility). I've kept the same separation of concerns we have in traditional, DTO-based systems.

In the end, although I hope you guys enjoyed this post, I wrote all this just so that next time I'll be able to write code where domain objects come freely out of repositories, without being accused of breaking separation of concerns. So, next time I'll go back to the forcefield, and see what it is telling us about our unstable User class :-).

If you've read this far, you should follow me on twitter.

33 comments:

xpmatteo said...

Hello Carlo,

I dig the technique for decoupling the domain objects as a bag of attributes from the domain object as a domain object. I agree about not wanting DTO's and other "simple data structures."

But I don't appreciate the effort that goes in making the repository not know about the actual class it should create. It seems to me that if you switch to the hexagonal architecture, many of the headaches that you argue about disappear.

In the hexagonal architecture, I don't mind my persistence layer knowing about the domain objects; I really don't want the reverse to happen. A concrete example:

// in the domain package we have:

class User { ... }

interface UserRepository {
    User find(Long id);
}

// in the persistence package we have:

class MysqlUserRepository implements UserRepository {
    User find(Long id) {
        User user = new User();
        // retrieve attributes ...
        return user;
    }
}


What's wrong with this style in your opinion? Or let me say it better -- this looks much simpler than what you describe, as you don't need any magicks to instantiate a User without knowing what its concrete class is. Why didn't you use this style with the customer you were talking about?

Carlo Pescio said...

(the answer is too long for Blogger so I had to split it into 2 comments :-)

But I don't appreciate the effort that goes in making the repository not know about the actual class it should create.
---
I'll add to this in the end

It seems to me that if you switch to the hexagonal architecture, many of the headaches that you argue about disappear.
---
:-)))))) if you put it like this, however, it sounds very much like the "guru driven design" syndrome I was talking about, right? It's just a different guru, claiming that something is not a problem, and thereby making it disappear : ).

In the hexagonal architecture, I don't mind my persistency layer to know about the domain objects;
--
however, some forces are there whether you mind or not :-). This is the ultimate goal of all my babbling - that people stop seeing things as someone tells them they are, and start to see the forces as they actually are.

What's wrong with this style in your opinion? [...]
Why didn't you use this style with the customer you were talking about?

---
Part of the answer was indeed provided when I said “while trying to mediate between the desire of some of my clients to go with the "standard" layered architecture and my disgust for stupid objects”.

Those teams were deeply convinced of the virtues of the layered architectures, and even moving to this style was a painful process. In your case, you “break” perfect layering even more.

Now, just to be clear: I’ve used the style that you suggest – actually, I’m using it in a system I’m building at this time, although I don’t call it a “hexagonal architecture”. Still, it’s not without consequences, and the guys pushing for a more layered architecture had a pretty good understanding of the shortcomings of doing as you say.

As usual, context is the key. They had a relatively complex database structure – I’d say between 300 and 500 tables, with many relationships. They expected significant concurrency and a relatively large number of records – not huge, but some tables could easily go over 50 million rows even when periodically purged. They had quite a few stored procedures (which I think were more than justified). Most reasonably, they wanted to test, performance test, and stress the persistence layer as soon as possible, and independently of a domain layer.

In this light, you can easily see that your style favors a “domain first” approach, where you want to mock the repository and return full-fledged business objects, but the layered style supports a “database first” approach, which was much closer to their design approach, their internal division of labor (not that I supported that, but you don’t break habits in a snap), etc.

They also believed they may need to “map” the result of a single query into different domain objects (say, within two different bounded contexts) and indeed one of them proposed a polymorphic mapper :-) as well. Note that in my style, I can easily accomplish that by having a different IoC context for distinct bounded contexts – a nice symmetry, in the end. As you might guess, however, in the end there was no need for this flexibility, at least as far as I’ve seen the code. But it’s just another limit of the “hexagonal” solution.

They also had “fake” reasons, like “then the persistence layer can call business methods”, which yes, it’s true, and unlike with the style I proposed, can happen because of Murphy, not Machiavelli. I don’t consider this a major point.

Carlo Pescio said...

Finally, they had a more philosophical stance, that is, that persistence is a “low level” service and as such cannot depend on “higher level” concepts like domain objects.
Honestly, it makes a damn lot of sense, and this is part of the problem with basing design on principles, gurus, etc. A different perception is just an inch away, but you have to change your position to change your perspective, and most people are very reluctant to do so. (this is the point where I should repeat that this is why we need a physics of software :-)))

If you ask me, anyway, the biggest shortcoming of having the repository statically bound to domain objects is indeed the impossibility of mocking domain objects and testing persistence alone.
Note that as soon as you bring in the opportunity for mocking (by introducing an interface) your forcefield is altered to a perfect equivalent of mine; you face the same issues, and may want to adopt the same solution.

This brings me back to your original observation about the cost: the real cost, as far as I’m concerned, is the creation of those xxxData interfaces. I don’t see a real difference in effort between writing “new” and writing “Container.Resolve”, although as I said I would welcome a more infrastructural solution.

Those interfaces, however, are needed if you want to mock your domain objects and test persistence alone, or mock the domain and test the service layer alone, etc. So yes, you can save some effort if you accept losing a few opportunities (and strict adherence to layering, which I don’t value so much). Or you may want those opportunities. In that case, I like the idea that I can have them without stupid objects + mappers.

Also, note that when you want to mock the xxxData interfaces in a trivial way, it’s relatively easy to generate the code at run-time, so there is not much effort in that either.

In the end, however, it’s fine both ways for my purpose. I just want to get domain objects out of my repositories. That’s enough for my next post :-).

xpmatteo said...

This brings me back to your original observation about the cost: the real cost, as far as I’m concerned, is the creation of those xxxData interfaces. I don’t see a real difference in effort between writing “new” and writing “Container.Resolve”, although as I said I would welcome a more infrastructural solution.


I disagree; calling "Container.Resolve" implies an "IoC" framework, and that has a lot of consequences. Calling "new" is much simpler. All else being equal, I much prefer not to have to use an IoC. That is, I prefer a style of architecture that does not "force" me or "push" me towards needing an IoC. Btw, wouldn't passing a factory be also an option for making your style work in Java?

But this is my preference; of course context is king and if the team is perfectly at ease with, say, Spring, I would go with the flow. Likewise, I agree that the style that I propose pushes towards doing objects first, persistence later; that's largely the point for me :-)

Other minor points: you can test persistence alone in the hexagonal style (and this is another reason I like it). There is a nice example in a chapter of Growing Object-Oriented Software. And if you have different bounded contexts, then you can define different repositories, just like you can define different "IoC" contexts.

In the end, I found the hexagonal arch. works better than the usual layered approach for doing business apps in TDD; I say this because I tried Active Record too :-) But I'm no guru, just a guy with a few data points :-)

Kevin Berridge said...

It sounds like you intend to touch on this in your next post, but even so, I'd like to ask just to raise the issue...

In your response to xpmatteo's comment, you said you just wanted to get the domain objects out of the repositories. Why? You've made it clear that you don't buy into the relatively arbitrary self-imposed limitations of layering, so you must be responding to some other pressure. What pressure makes you want to remove the dependency on the domain objects from your persistence layer?

I'd also like to hear more of your thoughts on "dumb objects." I hate mapping layers, which are required when you introduce DTO type classes. But it's not the DTOs I hate, it's the mapping. In fact, ever since watching Rich Hickey's "Simple Made Easy" talk and reading David West's "Object Thinking" I've been attracted to the idea of dumb data classes passed into and out of the persistence and UI layers, but "wrapped" by behavioral objects. This eliminates the need for DTOs and mapping layers completely by focusing on data.

If I was writing in a functional language, I may be able to simply deal with data, but in C# data == classes. Do you see any important drawbacks to a style of passing data classes between your layers, with behavioral objects at the center responsible for the business logic? (This may be similar to "hexagonal architecture" and also has similarities with the "Interactors" discussed by Bob Martin.)

Really enjoying your blog! Thanks!

Manuel said...

Hi Kevin, here are my two cents.


"In your response to xpmatteo's comment, you said you just wanted to get the domain objects out of the repositories. Why? You've made it clear that you don't buy into the relatively arbitrary self-imposed limitations of layering, so you must be responding to some other pressure. What pressure makes you want to remove the dependency on the domain objects from your persistence layer?"

---

I think the goal of the article was showing how to realize a layered architecture without exposing domain objects to the persistence layer (which would break the layers).


"I'd also like to hear more of your thoughts on "dumb objects." I hate mapping layers, which are required when you introduce DTO type classes. But it's not the DTOs I hate, it's the mapping. In fact, ever since watching Rich Hickey's "Simple Made Easy" talk and reading David West's "Object Thinking" I've been attracted to the idea of dumb data classes passed into and out of the persistence and UI layers, but "wrapped" by behavioral objects. This eliminates the need for DTOs and mapping layers completely by focusing on data."

---

The problem here is more a perspective/mindset one I think. As far as I understand object thinking, it's all about fractal design and messages between intelligent objects (that are like little virtual machines). The concept of "dumb object" seems to me in contrast with this vision. I don't say that it's absolutely wrong, but IMHO it's wrong in the context of object thinking and design.

Unknown said...

I need to read this post a couple more times; it's a lot to process before having any comment! Anyhow, from time to time you've mentioned some features we can't find in programming languages that may help to directly support your physics of software (such as open classes, and this time a mechanism to separate creation from knowledge of a class name). I see that a language is not just a set of features, but if you had to list a basic set of features for a programming language, what would you include?

Anonymous said...

Carlo, my expectations had been met :-)

Business applications are not my field, so please forgive me if I missed something in your post.

I understand this post is about making a true layered architecture without using stupid objects (I also guess a layered architecture wouldn't be your first choice, am I wrong?)

What I don't like very much in your solution is the interface UserData: isn't it a "dumb" interface that provides only getters and setters? Even its name rings a bell: UserData seems to me an "implementation name". Why is it acceptable in this context?

Thank you so much
Daniele

Carlo Pescio said...

[part 1]Matteo,
Generally speaking, I’m not sure you got the spirit of this post (could well be my fault). My spirit:
- Decisions act as forces, including layering decisions and decisions about acceptable dependencies between layers.
- Even within the highly constraining forcefield defined by a layered architecture with persistence at the bottom, as proposed by many, we can find a way (at least two, actually) to avoid stupid objects and mappers.
- Technique #1 is very cheap and only based on generics; technique #2 will benefit from an IoC, but as I said, you can get by with factories.

It seems like you just want to win an argument about which architecture is better instead, with the usual rhetoric that goes with that. As I told you elsewhere, there can be no progress in design as a discipline if we keep talking / thinking this way. Anyway, trying to get back to the spirit of forces, choices, etc:

calling "Container.Resolve" implies a "IoC" framework and that has a lot of consequences. Calling "new" is much simpler
--
You know, I don’t like design by fear. The IoC has a lot of consequences :-). C’mon. It’s not much different than the C programmer claiming that OOP has a lot of consequences. Maybe you’re thinking about some huge, invasive framework, and we know you don’t like frameworks. But it doesn’t have to be like that. You’re also ignoring the option of using generics. And you’re also ignoring the main point: that the team really wanted persistence to be at the bottom, like many are suggesting. They would have just rejected your idea, and probably used stupid objects and mappers.

Btw, wouldn't passing a factory be also an option for making your style work in Java?
--
Of course, I also sort of said so. Also, using a function pointer table is a way to make it work in C. Perhaps macros will work too :-).

Likewise, I agree that the style that I propose pushes towards doing objects first, persistence later; that's largely the point for me :-)
--
Actually, within the context of data-intensive applications, I tend to use a very balanced approach. Proper database design needs a lot of attention, and may easily influence the “upper layers”. But once again, you seem to think that I’m suggesting, proposing, endorsing a specific layering. I’m not. I’m just talking about forces, about choices, about possibilities. I’m trying to open new possibilities even within highly constrained environments. You’re trying to remove possibilities and sell your preferred architecture. In this sense, you’re trying to engage me in a battle that I’m not interested in fighting.

Other minor points: you can test persistence alone in the hexagonal style (and this is another reason I like it). There is a nice example in a chapter of Growing Object-Oriented Software.
--
Hmm. Let’s leave aside the fact that the “nice example” is based on a wonderful OO design where you have classes like DatabaseCleaner, JPATransactor, EntityManager, EntityManagerFactory (C’mon), CustomerBuilder, etc etc. I said: “they wanted to test, performance test, and stress the persistence layer as soon as possible, and independently of a domain layer”, and your nice example seems to be testing persistence using domain objects (where is that Customer class coming from?)
You see, if you stop trying to win an argument and step back to my spirit of looking for forces, you’ll quickly understand that if your persistence layer interface is based on domain objects, you can’t test your persistence layer without domain objects :-), and that once you try to do that, as I said, you’ll find it useful to have an interface for the domain object (the data access part, at least) so that you can mock it, and that will bring you to the same forcefield as the other guys. Forces are there despite what your hexagonal guru says.

Carlo Pescio said...

[part 2]
And if you have different bounded context, then you can define different repositories, just like you can define different "IoC" contexts.
---
Are you serious? If you leave aside your desire to win an argument, are you seriously suggesting that instead of configuring an IoC context (something like 1-2 lines of code or a reference to a different configuration file) it would be better to replicate the entire repository logic, then probably refactor to a polymorphic solution out of disgust, etc?
Again, moving back to the decisions / forces perspective, you should see that here we have a force, a rejection force. Having two different domain objects (in different bounded contexts) is trying to pull apart things down to the persistence layer. You’re just giving up, and duplicating the repository. Then perhaps you’ll try to counterbalance that with implementation inheritance. The “abstract mapper” guy I mentioned was trying to handle the same force with a smaller-scale entity (the mapper), as duplicating a repository seems like a rather big deal. I’m trying to keep that force within the center in which it was born: separation of the domain objects, and nothing more, by adding the polymorphic interface in there (and exploiting the IoC/factory).
I can only hope that, if you stop trying to show me that your hexagonal guru knows better, you’ll see the inner dance in each of those choices, how they try to balance forces in different ways, subject to different constraints, within slightly different forcefields set up by previous decisions. That’s the spirit of this blog. Or you can keep pushing your hexagonal thing; it’s your choice :-).

Carlo Pescio said...

[part 1] Hello Kevin,
I guess my phrasing was a bit unfortunate. What I meant was “I want my repositories to give me back domain objects” (not stupid objects that I then have to map into smart objects). Still, your question gives me a chance to state my point better:
- forces are shaping our materials, and decisions are forces, including layering and dependency decisions
- strict layering with persistence at the bottom is setting up a very strong/constraining force field
- even within that forcefield, we can get rid of stupid objects by understanding the nature of software, and that separation of artifacts (which is where we deal with concerns) does not require separation of instances (which would require copy/mapping)

As far as I’m concerned, I try not to set up an overly constraining force field in my projects. I’ve been working on many projects though, and teams have preferences (and icons, and gurus, mantras, etc :-), the layered style is very popular, persistence at the bottom too, and I understand its merits (as I’ve explained to Matteo) without feeling a need to support it.

About the dumb objects + functions thing: ever since a good guy (thanks Vic) put a reference to my blog on the InfoQ page, that talk has been a constant source of visitors, so I should just be thankful :-). I also like Rich’s humor :-) and the ideas he brought into Datomic. Still, that talk is mostly an excellent exercise in rhetoric. There, I said it :-). I guess that criticizing Martin, Hickey, and a few more gurus in a single post is either breaking a world record or committing social suicide :-).

Consider for a moment Martin’s talk. He’s an excellent speaker, and he builds a very convincing argument that just like a blueprint reveals the function of the building, so should our architecture. Then he moves, almost unnoticed, to say that the folder structure should reveal the function. And so on. Thanks to his speaking ability, people won’t spend a minute thinking that:
- the floor blueprint is just one of the many artifacts in building design; some might not be so immediately connected with function.
- software is a material where form and function are more disconnected (Gabriel).
- software architecture is more about the internal users (programmers) than end users (Gabriel again, this time on patterns). End users mostly see the UI side.
- software architecture and folder structure are two quite different things :-)
- etc.

Carlo Pescio said...

[part 2] Rich’s talk, honestly, is just another nice rhetoric exercise. He starts with a vested interest in the functional paradigm. He proposes an appealing concept: that things “doing one thing” are simpler than those “complecting things”, and that we should aim for simplicity. And he moves from there to prove that (guess what) the functional paradigm, immutable objects, etc. are inherently superior. As in many rhetoric exercises, the final revelation was already nicely contained in the initial set-up.

We have seen this style in every field over the centuries. At some point, “physics” was based on earth, wind, fire, and aether (with philosophy and rhetoric to back it up). Medicine was based on the theory of bodily humors. Etc. And we had fierce battles between schools. Then something happened: people started to actually look at things :-), do experiments, build different kinds of theories, etc.

Getting back to stupid objects + functions. Instead of proposing concepts based on abstract notions of “purity” or artificial separation concocted in our minds, I’ve chosen to look at what is actually happening to software. And what is actually happening is that, despite what gurus say, some things tend to change simultaneously, and some don’t. And that you can try to keep things apart (like data and functions), but those things will still keep changing simultaneously. And therefore I would say that we should not seek arbitrary separation forced upon us by the gurus of functional programming, but we should seek separation along the natural lines: what tends to change together should stay together, and what is not changing together should be kept distant.

Combining his words and my words, I propose that if you look at things as they are, not as you wish they were, some things are naturally entangled (complected), and that a mere separation into data + functions is not going to disentangle the two, and will just give you two distant yet entangled artifacts. Still, I also propose that we should avoid entangling (complecting) things that are not naturally entangled, and that we should try different forms and different materials (because they react differently to forces), make sure we understand the difference between entanglement in the run-time space and in the artifact space, etc.

I also don’t much like the idea that we can make honest progress by scrapping everything we have learnt so far, like many are trying to do now with OOP, just as happened with other ideas before. Indeed, my insistence on understanding forces, reactions, etc. stems exactly from a desire to understand what makes things work (or not), regardless of the paradigm (and of course, within the huge variance of people). Of course, creating a chasm is a much more effective way to become popular :-).

Carlo Pescio said...

Manuel: right on the spot :-)

Carlo Pescio said...

Fulvio, you have this habit of asking questions that would take 4 books to answer :-). I’ll take this chance, however, to expand a little on my answer to Kevin. I understand the fascination with “pure” and “minimalistic” languages, say the LISP family. I’ve been through that too, many years ago, although I remember coming to LISP with great expectations but in the end finding ML much more inspiring. At some point in my life, after the usual CS exposure to functional programming, concurrent programming, CCS, CSP, etc., I actually designed and implemented a small concurrent language, inspired by the minimalistic CSP approach. We were still in the 80s, and I made that work on single-tasking MS-DOS by [ab]using interrupts, of course. That was a long time ago. I no longer consider that kind of language interesting for practical applications, although minimalism is still interesting inside a teaching environment.

Following the idea of design as shaping a material, my ideal material should provide many different properties, most likely coming from support for different paradigms. While a minimalistic language would push me to create everything out of glass, I’d like to choose between steel, glass, concrete, marble, wood, etc, depending on what I’m building, the surrounding forces, etc.

“Supporting” the concepts of the physics of software requires, however, that we provide ways to express and to break entanglement (this is interestingly similar to what Coplien and Liping said about providing ways to create and break symmetry). A direct expression of entanglement is the ability to say that I want something to happen when an object changes, or when it is created or deleted. Several patterns (like, of course, the observer) emerged directly from the lack of native support for that, and wrapping everything into Observable things is ugly and only works with change, not creation, etc. Providing ways to break entanglement means, in my current understanding of things, mostly breaking entanglement on the artifact side while maintaining it on the run-time side. So, for instance, mixins allow that, AOP allows that, OOP allows that, generics allow that, reflection allows that, etc. etc., but they all do it in a rather ad-hoc way, as no one has ever considered the problem as I just stated it. Again, many patterns (like all the creational patterns) and tools (the infamous IoC) are born directly out of the lack of native support for this kind of thing.
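(A bare sketch of that classic workaround, with purely illustrative names – note how the wrapper only covers change, not creation or deletion:)

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hand-wired entanglement: "when this value changes, do that".
// The Observable wrapper is the tax we pay for missing native support.
class ObservableValue<T> {
    private T value;
    private final List<Consumer<T>> observers = new ArrayList<>();

    void observe(Consumer<T> obs) { observers.add(obs); }

    void set(T newValue) {
        value = newValue;
        for (Consumer<T> obs : observers) obs.accept(newValue);
    }

    T get() { return value; }
}

class ObserverDemo {
    public static void main(String[] args) {
        ObservableValue<String> v = new ObservableValue<>();
        List<String> log = new ArrayList<>();
        v.observe(log::add);   // entangle: log follows v's changes
        v.set("hello");
        System.out.println(log);
    }
}
```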

I should add that I do not consider “purity” as an interesting design concept because I’m interested in real-world languages, not in teaching devices, and pure languages have a bad record when they make contact with the real world. For instance, one may start with the “pure” notion of functions and statelessness (with “pure” meaning “I didn’t get the idea from the real world but from math, so it must be pure”), and then realize that unfortunately the world is stateful, come up with notions like monads and the like and take great intellectual pride in that, as those without proper CS education will fail to master his pure concepts. Having proper CS education :-), I still see that and think “this is just artificial complexity coming from negation of reality”.

Ok, I guess I pissed off enough people for one post, so I’ll just stop here :-).

Carlo Pescio said...

Daniele:
I understand this post is about making a true layered architecture without using stupid objects (I also guess a layered architecture wouldn't be your first choice, am I wrong?)
--
Right

What I don't like very much in your solution is the interface UserData: isn't it a "dumb" interface that provides only getters and setters?
--
Yes, it is. Sure, there could be the usual validation behind that, as the real implementation is inside a business object. But as I said, once we choose technology as a layering dimension, we accept the idea that layers will be entangled on the structural aspect of the problem domain. That interface is there to remind us of that. It could be split in two (read/write), but there is no way around it, unless you resort to dynamic objects and make it implicit.

Even its name makes me ring a bell: UserData seems to me an "implementation name". Why it's acceptable in this context?
--
I would actually welcome a better name, but the idea is that it represents the structural / data part of the object. UserStructure doesn’t look much better, does it? :-) Unfortunately, many names are “taken”, that is, have already been used with a different meaning in the database-oriented system literature. Considering that in many cases (but not always) we have a corresponding database table, and that before being a table it could have been an Entity in the E/R model, it would make sense to call it UserEntity (which is why, I guess, this name was chosen in some Microsoft-friendly circles for DTOs). It would cause a little confusion (also in the DDD circles), so I’m a bit wary of suggesting that naming. In one project, the team chose Plain as a prefix, so we have PlainUser (data) vs User (business). In another project the team pushed for a naming convention so ugly that I can’t even repeat it here. Bottom line: I’m totally open to suggestions :-)

Unknown said...

Carlo, I know :D! But, even though it might seem so, I never ask for a deep insight, just some good starting point for thinking!

My understanding is that you want to split a class (as in having the same class declared in multiple files, just not to confuse the C++ guys) while its instances are the composition of the entire declaration, right? A sort of partial classes on steroids. Or maybe the ability to give different views of the same class without resorting to inheritance or interface implementation. Or have you thought about a different concept from classes to represent artifacts?

About the observer: is it right to think about AOP, not as a solution, but as a concept close to your idea?

About decoupling instances creation from knowledge of the class name, I feel lost :D Is it about having a sort of product trader in the language?

I'm still holding my questions about the intent of this post, I need to focus some things better :D

Manuel said...


Following the idea of design as shaping a material, my ideal material should provide many different properties, most likely coming from support for different paradigms. While a minimalistic language would push me to create everything out of glass, I’d like to choose between steel, glass, concrete, marble, wood, etc, depending on what I’m building, the surrounding forces, etc.

---

Carlo, I'm curious about this one (among other things, but for those I have to think through them first).

What do you think about the Scala language? It's gaining traction in the industry and it's multi-paradigm. To be fair, even Common Lisp is multi-paradigm, but this is another story.

xpmatteo said...

Hello Carlo,

What I'm trying to do is understand what you mean. There's a big gap between what you know and what I know. My reaction to your post is that the elegant solution(s) that you use to avoid stupid objects in a layered architecture are not needed in my preferred style of architecture; I just wondered if you agree on this.

But this argument is largely off topic wrt your post, and it certainly would take too many words in writing, while I'm sure you could explain away my doubts in a few minutes if we met face to face. So forgive me if I insisted.

The big picture I get from your post is that the "obvious" layered architecture has many nonobvious implications that most programmers either ignore or "fix" by writing tons of boilerplate objects. And that if you're careful, you can manage to write meaningful objects even within these limitations. And understanding this technique is prerequisite for your next post.

What I'm really curious about, then, is the follow-up to your previous post; one design problem I care about is how to deal with instability in the attributes of domain objects. I wonder if you're going to talk about this next, and I'm curious about what you're going to say :-)

Stefano said...

Have you ever considered, at some point in this Physics of Software series, delving deeper into the implications your theory has for language design? You don't have to go as far as designing a full-blown programming language of course, but it would be interesting if you could explain in more detail how you would improve on existing languages, and maybe illustrate that with pseudo-code snippets and a few real-world examples. Apart from it making an interesting reading, there's always the chance that someone, someday, somewhere might actually implement your ideas in a real language. But if you don't get those ideas out first, that's bound never to happen...

Carlo Pescio said...

Fulvio: I'm not inventing much on that side, as the concept of mixin has been around for a long time, and more recently has also been explored in the context of AOP. In a true multi-paradigm language, I'd also like to mix in things at run-time, much like you can do (for instance) in javascript. While the traditional "confrontational" approach between language schools tends to see static and dynamic typing as adversaries, I simply see them as different materials, with different properties, to be used in different contexts (even inside the same application - usually not toy applications).

(observer etc) The interception mechanism underlying most AOP implementations could be used, of course, to implement */*-entanglement on the run-time side. However, in a language truly inspired by the physics of software, this should be more like a first-class concept. Although I haven't given this any serious thought (as I'm still struggling with concepts at what I consider a pre-verbal level), I would probably start with a review of data-flow languages here, because data-flow is a direct expression of some forms of entanglement.

Instance/class name: yeah, sort of, or an operator new on steroids if you want to look at the mechanics of it. From the physics of software perspective, while polymorphism allows one to break some forms of entanglement on the artifact side, the need to hard-code class names has always been a strong limit - that's why we have "creational patterns", or more exactly "creational problems" as those patterns are, in most cases, coping techniques.
This is one of those cases where I see a strong similarity with Coplien/Liping's idea of symmetry.
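As a rough sketch of what a product trader looks like in Java (all names invented here): creation is driven by a runtime key, and class names appear only at registration time, never in client code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

interface Shape { String describe(); }

class Circle implements Shape { public String describe() { return "circle"; } }
class Square implements Shape { public String describe() { return "square"; } }

// Product trader: the client asks for a product by key, so the
// entanglement with concrete class names is broken on the artifact side.
class ShapeTrader {
    private final Map<String, Supplier<Shape>> creators = new HashMap<>();

    void register(String key, Supplier<Shape> creator) {
        creators.put(key, creator);
    }

    Shape create(String key) {
        Supplier<Shape> c = creators.get(key);
        if (c == null) throw new IllegalArgumentException("unknown key: " + key);
        return c.get();
    }
}
```

Client code asks the trader for "circle" and never mentions Circle; the registration calls are the only place where class names are hard-coded.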

Carlo Pescio said...

Manuel: I can't say much about Scala other than it looks interesting on paper, because I've never used it myself. It surely has quite a few interesting features, and of course is lacking a lot of other features that I consider interesting as well in a truly multi-paradigmatic sense.

On the JVM side, I have spent quite some time playing with AspectJ, but then again, more into an exploratory mode, to probe the new limits of the design space. When you move back to professional programming, many other issues arise, and I tend to be a bit more conservative with languages and tools, while probably bold on design choices.

Carlo Pescio said...

Matteo:

the elegant solution(s) that you use to avoid stupid objects in a layered architecture are not needed in my preferred style of architecture; I just wondered if you agree on this.
--
sure, with the consequences I've outlined. I want to stress the fact that I'm not pushing an architecture, I just want to make assumptions, choices and consequences explicit.


And understanding this technique is prerequisite for your next post.
--
Actually, my next post will simply sort of assume that you can ask your repository and get back domain objects. In your style, that's not an issue to begin with. In other cases, that's regarded as an issue. I just wanted to turn that into a non-issue even for those who want a strong layering. I also wanted to talk about stupid objects and mapping. Funny enough, in the last few days I've been consulting on a project where we settled for an architecture quite similar to this, then another mapping issue arose on the presentation layer, thanks to some short-sighted design in Microsoft MVC. I'm glad to say that we overcame the issue, ending up with a more elegant, flexible, faster, and mapping-free solution. Of course, we had to try, to spend time, we had to want it. This is where people usually stop. The stupid object, the mapping, is so much easier, brainless, it's like a siren song to most :-).

What I'm really curious about then is the followup to your previous post; one design problem I feel about is how to deal with instability in the attributes of domain objects. I wonder if you're going to talk about this next and I'm curious about what your going to say :-)
--
as I said to Fulvio, the conclusion might actually be rather anticlimactic. I just want to share with you guys something I "see" in the force field of those unstable objects, and take the chance to talk about stability / instability in a larger context. Don't expect too much :-)

Carlo Pescio said...

Stefano: absolutely, that was on my "table of contents" from the very beginning. I think it would also be interesting to "disassemble" current design principles, and see how they can be explained and easily extended once we understand what they're saying in terms of forces. For instance, the open/closed principle is usually expressed in terms of classes and polymorphism, but it doesn't have to be. Another chapter should deal with known patterns, and how they reach some balance between forces.
Then, in practice, I'm juggling a number of things, and this stuff is as unpopular as ever :-), and before we get to that level, I should at least explain entanglement, spin, isolation etc, so yeah, it's going to take a while :-(

Unknown said...

Ok now I have a few points on the intent of the article :)

My understanding is that a layer, by definition, is an infrastructure (as you discussed back in time). My main experience with this kind of architecture is with J2EE, but actually there is nothing there which enforces layering, and it's too easy to bypass a layer to reach your aim (it happened quite often in my experience).

But my intuition is that layers should isolate services which present themselves as cohesive entities. Such as virtual memory, as in your example. A "Business" layer doesn't seem as cohesive as virtual memory, does it?

The asymmetry between separating artifacts and runtime entities is intriguing. My UML skills may be rusty (it's been 4 years or so since I've seen a single UML diagram, besides on your blog), but I can't remember a way of visually representing this kind of relationship between artifacts and runtime objects. Have you found a good one?

I'm realizing (very slowly) all of the entanglement stuff. But until you get to the "code" level, there's always something that escapes me. Maybe it's just a matter of experience, but I didn't get the creation problem until you showed some code. Do you manage to spot entanglement before you code? Which "tools" are you using?

Unknown said...

It's just a consideration, not a real comment! Thinking about mixins, I remembered the metacode proposal by D. Vandevoorde for the C++ language. It seems to me that that kind of proposal could serve as a basis for many of the issues you raised...

Carlo Pescio said...

Fulvio:

But my intuition is that layers should isolate services which present themselves as cohesive entities. Such as virtual memory, as in your example. A "Business" layer doesn't seem so cohesive as virtual memory, does it?
---
Right, it does not. Which seems to suggest that layering is not a natural shape here...


but I can't remember of a way for visually representing this kind of relationships between artifacts and runtime objects. Have you found a good one?
--
Not really, but I haven't been looking either :-)


Do you manage to spot entanglement before you code? Which "tools" are you using?
--
sure (actually code is sometimes hiding things in a sea of details). I honestly don't know how - I just see it. I wish I could be more helpful - maybe one day I'll figure that out, but at this stage, you guys have to learn to see on your own :-)

stomi said...

Hello,

What if a domain object (User) needs some external collaborator passed as a constructor argument, for example? I'm not sure how much responsibility a domain object should have, but it may happen that it requires other collaborators.

I would be interested in hearing your opinion about hybrid objects, which have both lots of accessors and behavior. The User looks like one of them to me, and it just feels wrong to me. It can happen that we need to change the behavior in some scenarios, but without changing the data. But separating this behavior into other objects means less encapsulation.

Carlo Pescio said...

Stomi: a partial answer to your question will be at the core of the next episode.

Generally speaking, OO is relatively ok with collaborators. It's not so ok with having stupid parties involved in a collaboration :-).

In many cases (I had this conversation just today) the problem seems to be that some classes (like user) tend to be seen at the center of many tasks, where in fact the only thing we need to know is the user identity. Allocating all those behaviors to the user is a perfect way to end up with a bloated class. Exposing identity, on the other hand, is not a huge breach of encapsulation.

More on this stuff next time :-)

Anonymous said...

Carlo,

maybe I'm missing something (I don't have much experience with this kind of application), but instead of using templates, can't I simply use polymorphism like this?

class UserRepository
{
    public void FillUser( int key, UserData user )
    {
        // fill user with data (needs "set" methods in UserData)
    }
}

class MyWonderfulUseCase
{
    public void DoSomethingUseful()
    {
        // ...
        User u = new User();
        userRepository.FillUser( key, u );
        bool b = u.isBirthday(); // u is a full-fledged domain object :-)
        // ...
    }
}

thanks a lot
Daniele

Carlo Pescio said...

Daniele: yes, but that technique does not scale very well, for instance, when you consider this simple variation:

class UserRepository
{
    public List<UserData> GetUsers( some condition here )
    {
        // ...
    }
}

because you return an unbound number of objects. You can pass a prototype object to clone, but it's less than ideal.

If you see the different solutions as points in a decision space, then:

- passing a mutable object works well for single-object scenarios

- generic programming works well [also] for multiple-object scenarios, yet tends to break for deeply nested objects

- IoC works well [also] for deeply nested objects

- none is really perfect when you want to choose the actual implementation among sister classes, based on some condition. Some IoC containers are more helpful than others here, or a Product Trader is needed (most people use the term Factory when they're in fact implementing a product trader).
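A minimal Java sketch of the generic-programming option (invented names; the two rows stand in for a real data source): the repository creates as many instances as it needs through a supplied factory, so it can return an unbound number of domain objects without hard-coding their class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

interface UserData {
    void setName(String name);
    String getName();
}

// Domain class: implements the structural interface, carries behavior too.
class User implements UserData {
    private String name;
    public void setName(String name) { this.name = name; }
    public String getName() { return name; }
}

class UserRepository {
    // Stand-in for the database: two rows, each carrying a name.
    private final List<String> rows = List.of("Ada", "Grace");

    // The factory (e.g. User::new) decouples the repository from the
    // concrete class while still letting it build full domain objects.
    <T extends UserData> List<T> getUsers(Supplier<T> factory) {
        List<T> result = new ArrayList<>();
        for (String row : rows) {
            T user = factory.get();   // one fresh instance per row
            user.setName(row);        // fill it through the structural interface
            result.add(user);
        }
        return result;
    }
}
```

The single-object FillUser technique cannot do this, because the repository (not the caller) decides how many objects to build.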

Unknown said...

What is your opinion about reflection-based persistence infrastructure? Generally, you could create a repository that accepts "object" (in languages that have them, use generics to specify this), and then have the persistence layer analyze the data based on reflection + metadata mapping.

My question hints at whether you would consider Hibernate/JPA-style persistence sufficient to avoid a dependency from persistence to the domain layer (which we want to avoid).

Carlo Pescio said...

Benjamin: sure, a purely reflection-based persistence layer (based on attributes/annotations in code or on external mapping artifacts) does not constitute a coupling between persistence and domain.

Strictly speaking, if you do it with attributes or annotations in domain classes, it qualifies as the opposite (we put some persistence concerns into the domain).
It can be harmless, and I've done it quite a few times, but those annotations often reveal (for instance) a bias toward relational mapping, percolated into the domain layer.

Overall, I'm not using Hibernate-like frameworks much. They come with a lot of assumptions and consequences, and they're rarely a good fit for the kind of projects I'm involved with. They also tend to constrain my architectural choices, which I don't like much. Your mileage may vary :-)
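For readers unfamiliar with the mechanics, here is a toy sketch of reflection-based mapping (all names invented; a real ORM would take column names from external metadata rather than reusing field names): the domain class carries no persistence annotations and imports nothing from the mapper.

```java
import java.lang.reflect.Field;
import java.util.Map;

// Plain domain class: no persistence annotations, no dependency on the mapper.
class User {
    private String name;
    private int age;
    public String getName() { return name; }
    public int getAge() { return age; }
}

// The mapper fills any object from a column -> value map via reflection;
// here field names double as column names, standing in for real metadata.
class ReflectionMapper {
    static <T> T fill(T target, Map<String, Object> row) throws Exception {
        for (Map.Entry<String, Object> e : row.entrySet()) {
            Field f = target.getClass().getDeclaredField(e.getKey());
            f.setAccessible(true);   // bypass encapsulation, as ORMs do
            f.set(target, e.getValue());
        }
        return target;
    }
}
```

The coupling arrow points from the mapper to the domain, not the other way around; the price is that the mapper quietly bypasses encapsulation, which is exactly the kind of assumption such frameworks bring along.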

Unknown said...

I agree that annotations in code constitute a coupling, since they import the mapping layer's code. So mapping with XML files is preferred here.

Could you maybe elaborate briefly on how they constrain architecture choices? Are you writing your persistence code manually then, and implementing "custom ORMs" for every use case?