Why I love WebSockets

When I was in school, passing notes took some effort. First, you needed to find a piece of paper that was large enough for your message, but small enough that it could be folded into the requisite football shape. Next, you had to write something. Anything smaller than several sentences just wasn’t worth the overhead, so you had to write about half a page’s worth of stuff, or draw a picture large and detailed enough to make it worth it. After that, you set about the process of folding your note into the aforementioned form. Finally, you had to negotiate with your neighbor to get the note from your desk to its final destination. All that was just to send the message. On the receiving side, the note was unfolded and read. Then your counterpart would go about constructing a response, refolding the note, and negotiating the return trip.

[Image: The first version of the Internet. Photo credit: http://www.flickr.com/photos/kmorely]

Passing notes in class was a task that required effort, skill and time. You sent a message and you waited for a response. If you thought of something new that you really needed to say, you had to wait until the response came back. At that point, you could alter your original message or add new content. While your note was in transit or being read and replied to on the other end, you had no control. You were at the mercy of the medium over which you were forced to communicate. Note passing simply isn’t designed to allow for short, quick, asynchronous communication.

Nowadays, kids just text each other on their smartphones. They send messages quickly and easily without having to invest in all that overhead. After a bit of upfront work to get someone’s phone number, the back-channel classroom chatter flows freely. That is, until someone forgets to silence their phone and the teacher confiscates everything with a battery.

Just as the methods of slacking off in school have evolved, so have methods of communicating over the Web. HTTP is the note passing of the Internet. It works well enough for most communications, and when the message is large enough, the overhead is minimal. However, it is less efficient for smaller messages. The headers included by browsers these days can easily outweigh the message body.
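To put rough numbers on that, here is a small PHP sketch that measures a request. The header values are made up for illustration (the host, path, and cookie are hypothetical), not captured from a real browser, so treat the result as a ballpark rather than a measurement.

```php
<?php
// A tiny message body, the kind of thing you might poll a server with.
$body = 'Are you there?';

// Illustrative request headers. Every value here is an assumption;
// real browsers often send more than this.
$headers = [
    'POST /status HTTP/1.1',
    'Host: game.example.com',
    'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36',
    'Accept: application/json, text/plain, */*',
    'Accept-Language: en-US,en;q=0.9',
    'Accept-Encoding: gzip, deflate',
    'Content-Type: text/plain; charset=UTF-8',
    'Content-Length: ' . strlen($body),
    'Cookie: session=0123456789abcdef0123456789abcdef',
    'Connection: keep-alive',
];

// Header lines are separated by CRLF, and a blank line ends the header block.
$headerBytes = strlen(implode("\r\n", $headers) . "\r\n\r\n");
$bodyBytes = strlen($body);

printf("headers: %d bytes, body: %d bytes\n", $headerBytes, $bodyBytes);
```

Even with this modest set of headers, the envelope is many times larger than the note inside it.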

Also, just like note passing, HTTP is synchronous. The client sends a request and waits until the server responds. If there is something new to be said, a new request is initiated. If the server has something to add, it has to wait until it is asked. It can’t simply send a message when it is ready.

WebSockets are the smartphone to HTTP’s notes. They let you send information quickly and easily. Why go through all that folding when you can simply send a text to say “idk, wut do u think we should do?” Why use 1K of headers when all you want to know is, “Did someone else kill my rat?” Better yet, why ask at all? Why not have the server tell you when the rat has been killed by someone else?

WebSockets are made for small messages. They are made for asynchronous communications. They are made for the types of applications users expect these days. That’s why I like WebSockets so much. They let me communicate without overhead or rigorous process. I can write an application that is free from request/response pairs. I can write an application that responds as quickly as my users can act. I can write the applications that I like to write.
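For a sense of how small the per-message cost is, the sketch below computes the framing overhead that the WebSocket protocol (RFC 6455) adds to a single message. The function name wsFrameOverhead is my own, and the calculation ignores the one-time HTTP handshake that opens the connection.

```php
<?php
/**
 * Returns the number of framing bytes added to one unfragmented WebSocket message.
 *
 * Hypothetical helper for illustration. Per RFC 6455, every frame starts with
 * 2 bytes, longer payloads need an extended length field, and frames sent from
 * client to server carry a 4 byte masking key.
 */
function wsFrameOverhead(int $payloadLength, bool $masked): int
{
    // Two bytes are always present: flags/opcode, then mask bit and short length.
    $overhead = 2;

    if ($payloadLength > 65535) {
        // Payloads above 65535 bytes use a 64 bit extended length.
        $overhead += 8;
    } elseif ($payloadLength > 125) {
        // Payloads from 126 to 65535 bytes use a 16 bit extended length.
        $overhead += 2;
    }

    if ($masked) {
        // Client to server frames include a 4 byte masking key.
        $overhead += 4;
    }

    return $overhead;
}

$message = 'idk, wut do u think we should do?';
printf(
    "payload: %d bytes, client frame overhead: %d bytes, server frame overhead: %d bytes\n",
    strlen($message),
    wsFrameOverhead(strlen($message), true),
    wsFrameOverhead(strlen($message), false)
);
```

For a short text like this one, the framing costs a handful of bytes in either direction, which is why small messages are such a good fit.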

D is for Documentation

Code is the way in which humans tell computers what to do. Lots of effort has gone into making code easier for humans to read in the form of high-level languages like Java or C++ and scripting languages like PHP, Ruby, and Python. Despite mankind’s best efforts, writing code is still clearly an exercise in talking to computers. It has not evolved to the point where talking to a computer is as easy and natural as talking to other people.

That’s why documentation is so important. Programming languages are just a translation of a developer’s intent into something a computer can execute. The code may show the computer what you intended for it to do, but the context is lost when another developer comes back to it later. Computers don’t know what to do with context. If they did, the days of Skynet would already be upon us. Humans can process context and it makes the process of dissecting and understanding a computer program much easier.

I find it both sad and hilarious when I see a speaker evangelizing code without comments. Invariably, the speaker shows a slide with five lines of code and spends ten minutes explaining its form and function. Even the simplest and most contrived examples from some of the foremost experts in the field require context and explanation.

When a bug decides to show itself at three in the morning, in code that someone else wrote, context and intent are two very powerful tools. When bugs are found, the question “What was this supposed to do?” is more common than “What is this thing doing?” Figuring out what it is doing is easier when you have good log data to go on. Knowing what it was supposed to do is something only the original developer can tell you.

If you aren’t aware of the concept of Test Driven Development, I strongly recommend you dig into it. In summary, tests are written before the code to ensure that the code matches the business requirements. I would like to propose a complementary development driver: Documentation Driven Development. By writing out the code as comments first, you can ensure that the context of the development process will be captured. For example, I start writing code with a docblock like this:

/**
 * Returns the array of AMQP arguments for the given queue.
 *
 * Depending on the configuration available, we may have one or more arguments which
 * need to be sent to RabbitMQ when the queue is declared. These arguments could be
 * things like high availability configurations.
 *
 * If something in getInstance() is failing, check here first. Trying to declare a
 * queue with a set of arguments that does not match the arguments which were used
 * the first time the queue was declared most likely will not work. Check the config
 * for AMQP and make sure that the arguments have not been changed since the queue
 * was originally created. The easiest way to reset them is to kill off the queue
 * and try to recreate it based on the new config.
 *
 * @param string $name The name of the queue which will be used as a key in configs.
 *
 * @return array The array of arguments from the config.
 */

Next I dive into the method body itself:

private static function _getQueueArgs($name)
{
    // Start with nothing.

    // We may need to set some configuration arguments.

    // Check for queue specific args first and then try defaults. We will log where we 
    // found the data.

    // Return the args we found.
}

After that, I layer in the actual code:

/**
 * Returns the array of AMQP arguments for the given queue.
 *
 * Depending on the configuration available, we may have one or more arguments which
 * need to be sent to RabbitMQ when the queue is declared. These arguments could be
 * things like high availability configurations.
 *
 * If something in getInstance() is failing, check here first. Trying to declare a
 * queue with a set of arguments that does not match the arguments which were used
 * the first time the queue was declared most likely will not work. Check the config
 * for AMQP and make sure that the arguments have not been changed since the queue
 * was originally created. The easiest way to reset them is to kill off the queue
 * and try to recreate it based on the new config.
 *
 * @param string $name The name of the queue which will be used as a key in configs.
 *
 * @return array The array of arguments from the config.
 */
private static function _getQueueArgs($name)
{
    static::$logger->trace('Entering ' . __FUNCTION__);

    // Start with nothing.
    $args = array();

    // We may need to set some configuration arguments.
    $cfg = Settings\AMQP::getInstance();

    // Check for queue specific args first and then try defaults. We will log where we 
    // found the data.
    if (array_key_exists($name, $cfg['queue_arguments'])) {
        $args = $cfg['queue_arguments'][$name];
        static::$logger->info('Queue specific args found for ' . $name);
    } elseif (array_key_exists('default', $cfg['queue_arguments'])) {
        $args = $cfg['queue_arguments']['default'];
        static::$logger->info('Default args used for ' . $name);
    }

    // Return the args we found.
    static::$logger->trace('Exiting ' . __FUNCTION__ . ' on success.');
    return $args;
}

The final result is a small method which is well documented and took little, if any, extra time to write.
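The same comments also make it easy to write the tests first. Below is a minimal sketch, assuming the lookup is pulled out into a pure function so it can run without the logger or Settings\AMQP. The names resolveQueueArgs and expectSame, the queue names, and the argument values are all hypothetical.

```php
<?php
/**
 * Hypothetical pure version of the lookup described in the comments:
 * queue specific args first, then the defaults, then nothing.
 */
function resolveQueueArgs(array $queueArguments, string $name): array
{
    if (array_key_exists($name, $queueArguments)) {
        return $queueArguments[$name];
    }
    if (array_key_exists('default', $queueArguments)) {
        return $queueArguments['default'];
    }
    return array();
}

/**
 * Tiny check helper so the example does not depend on a test framework.
 */
function expectSame($expected, $actual, string $label): void
{
    if ($expected !== $actual) {
        throw new RuntimeException('Failed: ' . $label);
    }
    echo 'ok: ' . $label . "\n";
}

// Illustrative config. The queue names and values are assumptions.
$config = array(
    'orders'  => array('x-ha-policy' => 'all'),
    'default' => array('x-message-ttl' => 60000),
);

// Each check mirrors one of the comments in the method body.
expectSame(array('x-ha-policy' => 'all'), resolveQueueArgs($config, 'orders'), 'queue specific args win');
expectSame(array('x-message-ttl' => 60000), resolveQueueArgs($config, 'emails'), 'defaults are the fallback');
expectSame(array(), resolveQueueArgs(array(), 'emails'), 'start with nothing');
```

Writing the checks straight from the comments keeps the tests, the documentation, and the code telling the same story.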

Armed with data from logs, unit tests which ensure functionality, configurations to control execution, isolation switches to lock down features, and contextual information in the form of inline documentation, the process of finding bugs becomes easier. LUCID code communicates as if it were a member of the development team. It does all the things you expect from a coworker. It talks, it makes commitments, it works around problems, and it keeps a record of both what it is doing and why it is doing it.