Brian Slesinsky's Weblog

Saturday, 20 Nov 2004

Why most of us won't be inventing little languages

Sergey Dmitriev (CEO of JetBrains) recently wrote an article arguing that it should be easier to write computer programs in domain-specific languages. I have great respect for the developers of IntelliJ IDEA and look forward to seeing new tools from them, but I think his justification for writing these new tools doesn't really hold water.

Sergey seems to believe that the main barrier preventing the invention of new computer languages is that it's too difficult to write language-specific tools. While that's a natural argument for a tools vendor to make, I think it misses something fundamental about language. The real downside of inventing a new language isn't that you have to write new editors, compilers, and many other tools. It's that other people won't understand you.

If you're writing a technical article in a natural language such as English, you can easily invent new technical terms - your word processor doesn't care. But if you're the only one who uses those terms, your writing will be obscure and difficult to read. Anyone interested in reading your papers will have to learn your jargon first. You have a much better chance of success if you write clearly with as little jargon as possible. Or at least use the jargon that's popular in whatever community you belong to, if you're not trying for wider influence.

We do invent new terms all the time, sometimes for good reasons. But this has to be done slowly because the community needs time to learn them, or reject them, as the case may be. Many popular articles are concerned with introducing just one new concept, such as the recent Wired article popularizing the phrase "the long tail" - or Sergey's attempt to popularize "Language Oriented Programming".

Computer languages work similarly; a language becomes more useful when more people understand it. No matter how expressive a language is and how good its compilers are, if your library requires learning a new language, few will use it, and fewer still will be able to maintain it. The difficulty of writing good language-specific IDEs reinforces this tendency, but the tendency already existed back when we were all using emacs and vi.

It's quite possible that languages like Lisp failed to become popular (beyond a small community) partially because they encourage programmers to write domain-specific languages. This has a tendency to fragment the community into groups that have trouble communicating (a "Tower of Babel" effect). It's also self-reinforcing - a language without many libraries attracts people who are predisposed to reinvent the wheel in the first place, so the body of actively-used common code grows very slowly. Much more code is shared in Java, but there are still plenty of evolutionary dead ends - we don't need more things that discourage sharing.

Also, Sergey seems to think that the basic constructs we use for object-oriented programming (classes and methods) are a limitation to be overcome. While sometimes they can be, they are also a strength, just like any other widely used standard of communication. It's typical for Java applications to use dozens of separately developed libraries. This isn't something to take for granted, and one reason it works is that no matter the domain, at the bottom it's all Java method calls. If each of these libraries were written in its own little language, how would we integrate them? APIs need to have something in common, and there are few features of Java that don't have something to do with an API.
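To make the "it's all method calls" point concrete, here is a minimal sketch of my own (it is not taken from Sergey's article). It uses regular expressions, a little language that already exists, through the standard java.util.regex classes. The class name LittleLanguageDemo and the helper extractYears are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LittleLanguageDemo {
    // The regular expression is a program in a separate little language,
    // carried inside an ordinary Java string.
    private static final Pattern ISO_DATE =
            Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Hypothetical helper: returns the four-digit year of every
    // yyyy-mm-dd style date found in the text, in order of appearance.
    static List<String> extractYears(String text) {
        List<String> years = new ArrayList<>();
        Matcher m = ISO_DATE.matcher(text);
        while (m.find()) {
            years.add(m.group(1)); // group 1 is the year
        }
        return years;
    }

    public static void main(String[] args) {
        System.out.println(extractYears("Posted 2004-11-20, updated 2004-12-01."));
    }
}
```

The pattern syntax is its own language, but everything around it (compiling, matching, reading out groups) is plain classes and methods, so the rest of the program doesn't need to know anything about regular expressions to call extractYears.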

I find it somewhat strange, then, that Sergey believes that object-oriented programming's limitations make libraries more difficult to learn. Learning to use the Swing API properly may be difficult, but would it really be easier if the Swing developers invented a new language to write it in? When learning something new, it's very helpful to be able to rely on what you already know - making things more unfamiliar doesn't seem like much of an improvement.

I certainly don't mean to say that the tools JetBrains will write for Language-Oriented Programming will be useless. But they might be used differently than Sergey imagines. Web developers are already maintaining programs that embed many little languages, including XML, HTML, JavaScript, regular expressions, XPath, and SQL. Based on this experience, I expect the population of little languages to increase slowly, and most teams won't invent their own. The most common case for IDE developers will be to write plugins for little languages that already exist, not to provide tools for language inventors.