After more than a year of complaining about the syntax, I’m forcing myself to finally sit down and learn some Erlang. Between CouchDB, ejabberd, and all the other interesting projects people are implementing in Erlang, I would be remiss as a systems engineer not to at least pick up the basics.
Unfortunately, I’m still chafing a bit at a number of little annoyances:
- The REPL is basically crippled since you can’t define functions. Being forced to think in terms of compilation units (rather than simple expressions) pisses me off.
- Why oh why do I need to explicitly list the module name in my file header if I’m also bound by the restriction that filenames and module names have to be the same? The old Java package/file path ties were always a big annoyance when I was stuck in that environment.
- For a functional language, there’s an awful lot of syntactic vinegar for basic operations like map and fold. I appreciate having a concise syntax for lambdas, but writing fun my_function/2 smells a bit.
- Records (as syntactic sugar for tuples) are a poor substitute for a real type system. Both tutorial and real-world Erlang code I’ve seen is basically full of tagged tuples, which means you get the verbosity of a strongly-typed language without any of the ability of real type checking to catch errors at compilation time.
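To make the complaints above concrete, here is a minimal sketch of a module that runs into most of them at once. The module name syntax_demo, the point record, and the helper functions are made up for illustration; the file would have to be saved as syntax_demo.erl to compile.

```erlang
%% The module name must be stated here even though the file name
%% (syntax_demo.erl) already has to match it.
-module(syntax_demo).
-export([doubled/1, total/1, record_is_tuple/0]).

%% Hypothetical record, used only to show what records compile down to.
-record(point, {x = 0, y = 0}).

double(X) -> 2 * X.

add(X, Acc) -> X + Acc.

%% Passing a named function to map requires the fun Name/Arity form.
doubled(List) -> lists:map(fun double/1, List).

%% The same goes for fold.
total(List) -> lists:foldl(fun add/2, 0, List).

%% At runtime a record is just a tuple tagged with the record name.
record_is_tuple() -> #point{} =:= {point, 0, 0}.
```

With the module compiled, syntax_demo:doubled([1,2,3]) should give [2,4,6], and record_is_tuple() returns true because the record and the tagged tuple are the same term.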
I want to stick with it long enough to find the real gems underneath all this noise. I mean, if I can sit through extended sessions reading and writing Perl, I should be able to find something to love about Erlang. Furthermore, most of the complaints I make above don’t even apply to mainstream languages (C and Java don’t have a REPL or lambdas, and Ruby and Perl don’t have anything resembling a traditional compiler), so it’s not as if those languages are miraculously better.
I definitely think that learning a new language should make you feel a little bit uncomfortable. Unfortunately, right now Erlang leaves me feeling uncomfortable in all the wrong ways: I understand everything that’s going on with the language, and just don’t like it.
I’m going to keep plugging away for at least a little bit longer, though. Next up: reading the source to ejabberd to (hopefully) get a sense for idiomatic language use in a context where its unique features (lightweight concurrency + distributed computing) are a real advantage.
The interesting thing about Erlang is that once you get past the syntax, it feels like a scripting language. A pretty fast scripting language in the functional style, that happens to compile. I’ve never cared much for types. What I do like about Erlang is that as bad as records are, at least they don’t have methods. Data should be data, and code code, and yes code can be data, but I breathe a sigh of relief every time I remember that none of these lists or tuples or numbers have any behavior.
I don’t disagree with you about wanting clear delineation between active and passive objects, or even liking the agility of a language without compile-time type checking. My point is simply that the overuse of tagged tuples (where the atom in the first tuple position basically serves as a type flag) forces you to repeat yourself a lot, where a proper polymorphic type system (whether via object orientation, or via the Hindley-Milner inferential route) allows you to “say what you mean,” and let the compiler and/or runtime handle the rest.
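A small sketch of the tagged-tuple idiom in question, using made-up shape tags (circle, square, rect) in a hypothetical module area_demo. The atom in the first position is the only "type" information, so every function that handles shapes has to spell the tags out again, and nothing checks them at compile time.

```erlang
-module(area_demo).
-export([area/1]).

%% Each clause dispatches on the atom in the first tuple position.
area({circle, Radius}) -> math:pi() * Radius * Radius;
area({square, Side}) -> Side * Side;
area({rect, Width, Height}) -> Width * Height.
```

A misspelled tag such as {sqaure, 2} compiles without complaint and only fails when the call is made, with a function_clause error.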
Regardless, I think Erlang does feel a lot more like a scripting language, which in the end makes it far less interesting to me. The #1 most interesting feature, Actor-based concurrency, can be trivially implemented in any language with lightweight object serialization and robust TCP/IP support, and robust multiprocessing can be accomplished via a number of methods. Erlang’s bundling of those tools in the language itself does inform a certain design aesthetic, but I don’t know that it outweighs the awkwardness of the language itself.
> The #1 most interesting feature, Actor-based concurrency,
> can be trivially implemented in any language with lightweight
> object serialization and robust TCP/IP support,
…and yet (or exactly because people think this), project after
project runs into a brick wall of poorly implemented concurrency
support. If I had $100 for every time someone has said “how hard
can it be?” only to be proven wrong when reality hits, I wouldn’t
have to work anymore.
(Although I’ve come to believe that many don’t even understand
that it’s the poor concurrency support that causes their
problems, because they haven’t learned to recognize the symptoms,
or don’t even know there’s a better way to do it.)
There are very few implementations of “erlang-style concurrency” that can really compare to the “real thing”. And most of those are hampered by coexisting legacy concurrency models (Scala), zero user base and zero support (Termite) or a staggering learning curve (Mozart). Glasgow Distributed Haskell is an interesting entry, but also lacks user base and commercial appeal, and is less flexible than Erlang in some ways. For now, I’d view it as an alternative for elite programmers only, though.
Messing up concurrency can be a project killer – one of the best there is. At least with Erlang, you get a consistent and fairly safe approach and an active user community where lots of people “get” concurrency and are willing to share their knowledge. For those who have this need and have found a home there, a few warts here and there are nothing. Why, most other langs out there aren’t even starters by comparison.
> and robust multiprocessing can be accomplished via a
> number of methods.
And yet people keep calling, wondering about the rumours of painless multiprocessing, when they’re stuck with a Java or C++ program that doesn’t scale, and the best advice they can get from the experts is basically to rewrite large parts of their app.
Yeah, you can do this in other languages, but in my experience, practically no one ever does in real products. Your experience may differ from mine. I’d love to hear about counter-examples.
@Ulf:
My point is not that there isn’t value to the Erlang approach (there is!) or that there’s a single language runtime out there with all its desirable features (though others have nice features of their own, like the ability to directly call C functions, or use traditional side-effects). There isn’t anything magical about Erlang’s approach, though — it’s a natural design given the strict limitations on the language itself.
By disallowing use of traditional threading primitives, Erlang programs are protected from many of the deadlock and resource-contention issues that make multiprocessing programs so notoriously difficult to implement properly. However, the model also only helps with those problems that a) are embarrassingly parallelizable, and b) can be implemented efficiently using cooperative multitasking (i.e., don’t require preemption).
Erlang has its place in that niche, but its limited exploratory programming support and lack of polymorphism probably mean it won’t be the first tool I reach for when I’m looking to start work on a new project, unless I know from the beginning that massive concurrency is going to be a core component of the application’s requirements.
@lennon:
You certainly have a point. It could be that many of us who have made our living building complex and massively concurrent applications for so long, forget that there is a vast array of problem domains where concurrency is mainly seen to have marginal influence at best.
What I would like to suggest then, is that you explore the idea of Concurrency Oriented Programming as a modeling paradigm in its own right, which offers a powerful alternative to OO. One example of this is the error handling principles in Erlang, where the concurrency features are mainly used for supervision and isolation. YMMV, but so it does with OO and FP in general.
As for your claims about Erlang being weak in exploratory programming, I’d have to take a deeper dive into the alternatives in order to find out whether or not I agree. It could be as with Erlang’s error messages – lovely if you come from C++, appalling if you come from Python.
(Although there is good reason not to waste time pretty-printing error messages in an unattended real-time system, there are also some bits of information missing.)
I certainly agree that there is nothing magical about Erlang. It was a language project driven in an industrial environment, which is unusual, but by no means unique. There is lots of good engineering in there, and tons of domain knowledge. Some of that domain knowledge works in other domains as well – some of it doesn’t.
Uhm, I don’t follow how Erlang would only help with problems that don’t require preemption. Erlang scheduling is preemptive – conceptually so with only one CPU (the current scheduler uses highly predictable reduction counting, but relying on this has always been strongly discouraged), but in the SMP version you have multiple scheduler threads which are preemptively scheduled. Then you have true preemption, with the exception of thread-safety guarantees given by the runtime system and core libraries.
I just want to second what Ulf wrote about implementing actor-based concurrency primitives. It is easy to implement a simple package for most languages, we did a few before Erlang as experiments, but to make a *good*, well-integrated package is extremely difficult. Erlang’s concurrency primitives are not something which has been stuck on later but are an integral part of the language.
Another point not mentioned so far is error-handling. To build a fault-tolerant system you need to be able to detect and handle errors in a controlled way. This is *not* the same as sticking checks for error returns everywhere and handling them locally, which very quickly makes the code illegible and fault-prone. Erlang’s error handling mechanisms are an integral part of the language.
It is this integration between concurrency, error handling and the rest of the language which provides Erlang with much of its power. It is also this integration which is difficult to do.
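As a rough illustration of that integration, here is a minimal sketch in which one process watches another and learns about its death as an ordinary message, instead of checking return codes. The module name monitor_demo, the exit reason boom, and the one-second timeout are all assumptions chosen for the example.

```erlang
-module(monitor_demo).
-export([run/0]).

run() ->
    %% Start a worker that dies immediately, and monitor it atomically.
    {Pid, Ref} = spawn_monitor(fun() -> exit(boom) end),
    receive
        %% The runtime delivers the failure as a plain message.
        {'DOWN', Ref, process, Pid, Reason} ->
            {worker_died, Reason}
    after 1000 ->
        %% Safety net so the caller never blocks forever.
        timeout
    end.
```

The worker itself contains no error-handling code at all; deciding what to do about the failure is left entirely to the process that monitors it.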
The compilation unit of code in Erlang is the module. All functions are part of a module. It is also modules which are the base of code handling. It is therefore not surprising that you cannot define functions in the shell: which module would they be a part of? It is possible to define funs in the shell but they can’t be saved.
And I’m not following the claim that Erlang only helps with embarrassingly parallel problems. ejabberd is not embarrassingly parallel, it’s the exact opposite, it’s fantastically heterogeneous. It might look homogeneous if you look at it from a distance (oh, yeah, route message, route message, route message, how hard is that?) but that would be true of any task. Get up close and into the program and it actually does all kinds of things that interact with other bits of state in every basic multithreading way you can imagine.
Erlang actually shines on these highly heterogeneous tasks; it’s for the embarrassingly parallel tasks, which are easy in any language, that it’s not worth your time to learn Erlang.
It’s worth reading ejabberd’s source if you’re getting into Erlang, because it’s amazing what it accomplishes with such directness and simplicity. You can get so caught up in the code that you never actually notice how there’s no synchronization code. It’s hard for any other language to match that.
(Which makes it a pity that the syntax is so difficult.)
@Robert: fault handling is a part of the language I haven’t really had a chance to explore yet, and one of the things that I still hold out some hope will really “click” and make me stick around.
If I think of Erlang more as a virtualized operating system, rather than a simple programming language, it starts to make more sense. The limitations placed on the language are no more severe than those imposed on kernel-space C code, for example, and the concurrency and error-handling tools are far more sophisticated.
@Jeremy: I’m not sure I follow the transition from “embarrassingly parallelizable” into heterogeneous; parallel tasks can be highly diverse in terms of the units of work done within each process. I’m also not convinced that ejabberd is a great example of a system that uses the language to its full advantage: I count all of 22 uses of ‘spawn’ in the sources, and only nine calls to ‘receive’.
Regardless, my problem is not that the syntax is difficult — I find C++ to be much more “noisy” and difficult to scan. The language simply provides such an impoverished set of abstraction mechanisms that I find myself reading the same code over and over again. Dynamic typing without polymorphism leads to a lot of manual type-checking and case-switch pattern matching.
@Lennon – I think you’ll find that in most production Erlang code spawn and receive are very often abstracted away from the implementation. In the case of ejabberd they make great use of the OTP framework. Try looking for gen_server or gen_fsm in their code base.
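For a sense of what that abstraction looks like, here is a minimal gen_server sketch. It is not taken from ejabberd; the module name counter_server and its increment/value API are invented for the example. Note that neither spawn nor receive appears anywhere in it, because the OTP behaviour owns the process and its message loop.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Client API (hypothetical names).
-export([start_link/0, increment/1, value/1]).
%% gen_server callbacks; the remaining callbacks are optional in
%% current OTP releases and are left out to keep the sketch short.
-export([init/1, handle_call/3, handle_cast/2]).

start_link() -> gen_server:start_link(?MODULE, 0, []).

%% Asynchronous request: returns ok without waiting for the server.
increment(Pid) -> gen_server:cast(Pid, increment).

%% Synchronous request: blocks until the server replies.
value(Pid) -> gen_server:call(Pid, value).

%% The server state is just the current count.
init(Count) -> {ok, Count}.

handle_call(value, _From, Count) -> {reply, Count, Count}.

handle_cast(increment, Count) -> {noreply, Count + 1}.
```

Grepping a code base built this way for spawn or receive will badly undercount how many processes and messages are actually involved.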
Standalone functions in the REPL can be done in the following way:
12> F = fun (A,B) -> A>B end.
#Fun
13> F(2,1).
true
14>
If you’re using Emacs there are functions to generate all the boilerplate for many different types of files, including author/date stuff, and automatically naming the module after the file. If you’re not using Emacs, you probably won’t like Erlang very much in the short or long term.
Agreed, making the arity part of the function name is terrible. (Insofar as the programmer needs to constantly type it. I don’t know enough to know if I’d be opposed to C++-style name rewriting.)
I’ve only been fooling around for a couple of days, but I like the atom-tagged tuple and pattern matching idiom so far.
Jeez, just noticed how old this post is, but I’ve already typed it, so I’m posting it.
Chau