0x1b – This is how Erlang makes sense, pt. 3

This is the third and final part of a series about Erlang.

Erlang/OTP didn’t come out of thin air. Quite the opposite – seldom do you see a language and a platform where the implementation mirrors the intention so closely.

Lists, Strings, Binaries, Tuples: what is what and why?

A tuple is one strange creature: you cannot append to a tuple, you cannot loop through a tuple – all these things you can do with lists. So why do they exist in the first place?

There is something good about tuples: being of a fixed size, they can be implemented in a super-efficient manner, compact in memory and with constant-time random access at the same time. Lists, on the other hand, offer neither. For efficient encoding, decoding and parsing of messages that carry structured data, tuples are the structure to use.
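To make the difference concrete, here is a minimal sketch. The module name tuple_vs_list and its function names are made up for illustration; element/2, setelement/3 and lists:nth/2 are standard Erlang.

```erlang
-module(tuple_vs_list).
-export([third_of_tuple/1, third_of_list/1, replace_third/2]).

%% element/2 reads a tuple slot by position in constant time.
third_of_tuple(Tuple) when tuple_size(Tuple) >= 3 ->
    element(3, Tuple).

%% lists:nth/2 has to walk the list cell by cell until it reaches
%% the requested position.
third_of_list(List) ->
    lists:nth(3, List).

%% A tuple cannot grow, but setelement/3 returns a new tuple of the
%% same size with one slot replaced. The original is left untouched.
replace_third(Tuple, Value) ->
    setelement(3, Tuple, Value).
```

Both accessors return the same value for {a, b, c} and [a, b, c]; what differs is how much work the runtime does to get there.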

So why use lists in so many other places? Well, for most problems you don’t need constant-time random access, but a structure that can be enumerated and that can grow quickly. And yes, this is your list. And finally, asking why a language stemming from a LISP heritage would have lists is like asking why the milkman is selling this white stuff :)

But even though you can do many things without constant-time random access, there is still an important area where you just cannot work efficiently without it, and this is string processing. Erlang implements strings as lists of integers (one character code per element), and this cannot be done efficiently: to find out whether a pathname ends in, say, “.html” the platform has to scan to the end of the list and can only look once it gets there. Other implementations have to do the same, but they are faster at finding the end: an implementation with ASCIIZ strings (i.e. C-style) looks for the first NUL byte in the data – something that today’s and even yesterday’s CPUs support out of the box – and PASCAL-type strings (i.e. length byte(s) + data) just add the length offset to the start address of the memory buffer.
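The “.html” check can be sketched like this. The module name suffix_demo and the function is_html/1 are hypothetical; lists:suffix/2 is part of the standard library.

```erlang
-module(suffix_demo).
-export([is_html/1]).

%% A string literal such as "index.html" is an ordinary list of
%% character codes, so lists:suffix/2 has to traverse the list
%% before it can compare the last five elements.
is_html(Path) when is_list(Path) ->
    lists:suffix(".html", Path).
```

In the shell, "abc" =:= [97, 98, 99] evaluates to true, which shows that a string really is just a list.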

Where are my objects?

The guys inventing Erlang/OTP wanted one feature, which is “live updates”. That means you can just update the Erlang code of some part, and all new processes that need that piece of code start using the updated code right away. As a Rails developer you may be reminded of Rails’ development mode; but this time it is done in the language and is done right. What sounds like magic is, in fact, pure technology. Unices do the same all the time with dynamically linked applications: when you deploy a new version of, say, libc on your system, all new processes start to use that version, while all the old processes still run with the old version of the library. No reboot required. (By the way, Windows doesn’t do that: a Windows system locks all .DLLs in use and prevents them from being replaced. Hence the need for a reboot when updating.)
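In Erlang itself, a long-running process can also pick up new code. It does so when it makes a fully qualified call (Module:Function) into its own module. Here is a minimal sketch, assuming a hypothetical module named live_counter with a made-up message protocol:

```erlang
-module(live_counter).
-export([start/0, loop/1]).

%% Start a process that keeps a counter in its loop argument.
start() ->
    spawn(?MODULE, loop, [0]).

loop(Count) ->
    receive
        increment ->
            %% Fully qualified call: if a newer version of this module
            %% has been loaded in the meantime, execution continues there.
            ?MODULE:loop(Count + 1);
        {get, From} ->
            From ! {count, Count},
            ?MODULE:loop(Count);
        stop ->
            ok
    end.
```

A plain local call, loop(Count + 1), would keep the process in the version of the module it is already running. Note that loop/1 has to be exported for the fully qualified call to work.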

Now you might ask: what happens to the data? When do you migrate old-version data to new-version data? This is something that you have to do yourself, if you need it. Better not to have incompatible changes to your data at all – the language will not migrate it for you. But the same is true for the code itself. A change from, say,

fun(Opt) ->
    case Opt of
        stop -> do_stop()
    end
end.

to

fun(Opt) ->
    case Opt of
        finish -> do_stop()
    end
end.

could break the application, because callers that still send stop no longer match any clause. Live updates have to rely on code that is compatible with previous versions in terms of both the API and the data.
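One way to keep such a change compatible is to accept the old and the new atom side by side for a while. This is only a sketch: the module name compat_demo and the function handle/1 are invented, and do_stop/0 is a stand-in for whatever the real function would do.

```erlang
-module(compat_demo).
-export([handle/1]).

%% Hypothetical stand-in for the do_stop() used in the snippets above.
do_stop() ->
    stopped.

%% Accept both the old atom (stop) and the new one (finish), so that
%% callers built against either version keep working during an update.
handle(Opt) ->
    case Opt of
        stop   -> do_stop();
        finish -> do_stop()
    end.
```

Once no caller sends stop any more, the old clause can be dropped in a later update.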

What does this have to do with objects? In a narrow sense this is totally unrelated. However, objects in an OOP implementation no longer expose only their public interface. To allow for inheritance, objects must expose their “protected” interfaces (i.e. every method that can be called from a related object) and all methods that can potentially be overridden in a derived class. This massively blows up the size of the application interface. Which makes an incompatible change more likely. Which breaks live updates. Case closed. (And hey: while OOP is nice to have, it is never strictly needed, right?)


Final Disclaimer

Unlike Erlang/OTP, this article does come out of thin air, sort of. I am no authoritative source regarding the history of Erlang/OTP, and I never intended to be one. I do, however, like the intellectual exercise of asking not only the how, but the why, the when, and by whom too.

3 responses to “0x1b – This is how Erlang makes sense, pt. 3”

  1. OO also clings to the concept of mutable internal state.

    All the things that can mutate in a program will together cause a combinatorial explosion of different possible states, and that makes the program’s correctness difficult to reason about. (I’m not talking about formal proofs, only plain old “wtf is this doing?” code-review-like analysis.)

    While it is tempting for the OO-focused mind to ask for Erlang modules to be able to inherit from other modules, override exports and call exports from the super-module, it is not something I (now being quite used to CO as well as OO) ever find important. There are a few reasons for this:

    1) I tend to use external module calls only for side-effect-free functions. I keep side effects in very small parts of the program code, at the top of the call stack. (This is something Haskell enforces with monads: functions not called with a monad can’t perform side effects, nor call other functions that require a monad.)

    2) If I want some generic api, then I can simply export the same things from several modules. See modules ordsets and sets for example. There is very little implementation reuse between those two, only interface reuse. Inheritance doesn’t buy you much there.

    3) When I want side-effects I don’t use modules, I use processes. Process messaging has low coupling. Things that perform side-effects tend to be what you want to make parametric (for example, logging straight to console, or to a file, or to a db, or multiplex to several, etc)

    I find that separation of concerns in CO focuses on separating what processes do, and lets data structures peek into each other without many language barriers. In OO, by contrast, the separation of concerns applies to data structures, and very little to which thread a concern is assigned to. At least this is something I bring with me when I write in OO languages.

  2. Thanks for your nice additions, Christian. And one of the good things about OTP is certainly that you start your processes in no time, which makes breaking functionality out of the one big monolithic application “box” a thing that you can actually do, without spawning resource-hogging real OS processes. I am sorry for the naming ambiguity, though…

  3. Thanks for the interesting introduction.
    The links between the parts 1, 2 and 3 seem to be broken.
