Data Directed Programming – A Useful Pattern

A common problem in software development is how to deal with multiple but generally similar objects that differ in some way requiring handling, processing and application behavior specific to their subtype.  After going through a free MIT lecture series taught by  Hal Abelson and  Gerald Jay Sussman on the Structure and Interpretation of Computer Programs, I came away learning a very useful software design pattern that I have used in multiple projects now to deal with this situation.

In the lectures, it is sometimes called “Data Directed Programming” or “Dispatch on Type”, which is a way of handling multiple objects that may be related and the same in some general sense, but differ slightly requiring targeted processing and handling for them.  Think of cars like Hondas, Toyotas and Fords;  They are all cars and similar in a general sense – they all have brakes, transmissions and tires etc., but you might need to take a specific make to a specialized mechanic to perform certain repairs, because the specifics (i.e. the implementation) of those components can differ requiring special knowledge in order to work on them.

In software development you will at some point inevitably run into a situation where you are dealing with objects of a general type that require different processing based on their subtype.  This usually leads to all sorts of conditional gymnastics and if/else branches that can become complicated and difficult to understand and maintain.  That’s where this pattern can come in to save the day, keep your code clean and get you out of that mess.

Data Directed Programming – The Basic Pattern:

  • Keep a table or dictionary of types that map to procedures.
  • When you call a general procedure that would apply to a series of related objects with different subtypes, use the type as a lookup to find the corresponding behavior.

Let’s try to make the pattern concrete with an example (forgive me if it’s a little contrived, but it will get the point across and you will recognize when to use this pattern in the real world).  Imagine we have a chat app and we want to process messages that have the same overall shape, but can fall into subtypes such as Private, Broadcast or Group typed messages which differ in a relatively small way and require different handling.

We want to have a single re-usable procedure (or function) that handles message processing – one that we can call when a message comes in.  Let’s see how this looks without using Data Directed Programming (we’ll use vanilla JavaScript for the examples):

// General reusable message processing procedure
const onMessageSent = (message) => {
  // check if message subtype is private and handle that specific processing:
  if (message.type === 'private') {
    ...private message specific processing
  } else if (message.type === 'broadcast') {
    ...broadcast message specific processing
  } else if (message.type === 'group') {
    ...group message specific processing
  } else { 
    throw Error('Message type not recognized.');
  }
};

Note the multiple conditional branches we need to handle the different subtypes of messages appropriately.  It’s not horrible in this contrived example, but it can quickly get out of hand in real world situations and is generally just a bad idea to have complex if/else chains if they can be avoided.  This is where our pattern helps out to eliminate that.

If we make a table and use the message subtype as a primary ID key, we can just pass in the subtype and use the procedure that maps to it directly without having to make conditional branches in our code.

// Make a table, i.e. a dictionary, with keys of message subtypes that map to specialized procedures for that type:
const processMessage = {
  private: (msg) => ...private msg processing,
  broadcast: (msg) => ...broadcast msg processing,
  group: (msg) => ...group msg processing
};

// Our general message processing handler uses the dictionary and looks up the processing we want by subtype, then calling the processing procedure:
const onMessageSent = (message) => {  
  processMessage[message.type](message);
};

As you can see, this is quite a bit cleaner and we’ve gotten rid of all those conditional branches that can be difficult to read and understand.  In my opinion,  good software design and clean code can be based on this one question: Will someone who has never seen this code before be able to understand it quickly and easily?

I can’t stress enough how useful this pattern has been in many of the applications and products I’ve worked on.  This is a common scenario that too often ends up with complicated conditional branching. Obviously you don’t want to overuse any one pattern, but I have found it works well when this common situation arises.

Further Resources: