Programming Language Selection

Greg Black
gjb@gbch.net

Introduction

I make occasional attacks on some of the newer programming languages -- in particular C++, Perl, Tcl and Java -- in various public places and consequently am called upon to explain myself from time to time. This document is an attempt to provide a useful explanation. However, it is far from the whole story, for that would require a book that I don't have the time to write. And it will not satisfy people who are wedded to one of these languages or people who don't have either the resources or the ability to do some additional research for themselves.

In brief, I try to show that at least some of these new languages do not really provide what they promise; and that what they do provide is at a considerable cost that overwhelms the advantages they claim to offer. I consider some counter-examples in the form of new languages that I believe do offer real benefits. And I make a case for better use of the tools that already exist -- rather than constantly pursuing magic bullets.

How am I qualified to have an opinion?

I have worked as a programmer and trained programmers for over twenty years. During that time I have learned many languages, written software in those languages, and also implemented new languages to meet special needs of my clients. I haven't seen it all -- nobody has -- but I have a sound understanding of the programming process and a deep continuing interest in the philosophy and psychology of programming. In short, I believe I'm qualified to offer useful opinions. But they are opinions, albeit well-informed ones. I welcome other opinions and I strive to see the good points in them.

What's wrong with what we have?

As things stand, today's programmer has a wide range of languages to choose from. None of them is perfect for every job, although some have wide utility. However, a programmer who has C for serious stuff and the combination of sh, awk and sed for the quick hacks or the `glue' tasks is pretty well equipped and it could be argued rather strongly that she doesn't need anything beyond those tools.

On the other hand, it can be -- and has often been -- argued that the people who use those languages are not succeeding in providing the level of software quality that is needed today -- the Y2K bug in its multiple incarnations was just the most glaring illustration of the seriousness of that quality problem. If we accept that current practitioners are not producing quality software (and I certainly do accept that), then we need to identify causes in order to have a chance of improving the outcomes. Is the principal (or even a major) cause the inadequacy of the languages that are used to develop this software? I see no evidence for that, at least amongst the `current' languages that I listed above.

What I do see is lots of examples of poor practice, in general `software engineering' terms, amongst many of the programmers working today. This poor practice, although in some ways marginally related to the languages in use, would produce identical results no matter what language was employed -- the problems stem from poor design, inadequate training, unreasonable time pressures, lack of understanding of the essentials of robust software development, and a raft of similar factors. The problem is not the tools; it is the skill levels and work environment of the users of those tools.

Am I suggesting that the existing languages are perfect? Far from it. But, even though I can identify what I see as flaws in my most loved languages, those imperfections are not in any way a real impediment to the development of reliable and correct software, provided they are used by people who are properly skilled in their use and who have the discipline to apply appropriate processes rigorously in their software development environments.

Unless we can demonstrate conclusively that it is impossible to produce satisfactory software with existing tools (when used well), there is no sound argument for abandoning established languages. This is particularly true when you consider the gigantic existing investment in software that is in use right now.

What's wrong with the idea of introducing new and better languages?

There are two fundamental issues here: `new' and `better'. Neither guarantees the other. Each `new' language is inevitably accompanied by so much hype from its proponents that even programmers, who really should know better, are seduced by the easily made but not so readily fulfilled promises of the new magic bullet that is going to solve all their problems. And each wave of hype and quasi-religious fervour is so potent that critics seem to be irrelevant, even sacrilegious, if they dare to find fault.

Part of the process of pushing the new languages involves claims that they are `better' than existing tools. It's easy to make the claims, but much more difficult to produce evidence. In the majority of cases, new languages are `better' only insofar as they provide various features based on currently popular buzzwords or wrap everything up in additional layers of complexity, frequently masquerading as new functionality.

We have been down the path of these enthusiasms before -- anybody who knows even a little history in the software field can give several illustrations -- but still we seem to be reluctant (or even unable) to learn any of the lessons from our past mistakes.

Why not use new language X if it's the right tool?

I'll take an analogy from another domain where religious-style fervour is equally rampant -- motor vehicles. When I go to another city for work, I rent a car. To me, for that purpose, all rental cars are identical and it is not even a matter of interest to know what brand or model I'll end up with. In short, I'm happy to use whatever they can supply. It will do the job and I won't have any difficulty at all in using it, since I know how to drive any car. (This part is the argument for using the languages that we already have at hand -- they do the job and we know how to use them.)

At home, however, since I have a specialised use for my car beyond its role as simple transport, I make a selection of something that I could never get from a rental company -- my car is used regularly on a race track and to be useful for that it needs special suspension and tyres and other modifications; it also needs to be a model that lends itself to easy adaptation to the needs of the track while allowing alternate use as (reasonably) comfortable transport for me and my partner. In this case, because regular cars cannot do the job, I have been willing to go through the pain of selecting and setting up something different. But I have been able to show that it does really do a job that other cars cannot do, and that the cost was not excessive. (And this part is the argument for adopting a new language when we need to go beyond what is possible with what we already have.)

For instance, I could have a car just for the track and make it even better for that purpose, but the cost -- to my way of thinking -- would be too great. I'm only willing to devote so much of my time earning money to pay for this toy because I like to have a life (which means having time to do things with my family) and because I like to feel that I play a useful role in society (which means giving some of my money to worthwhile organisations like Amnesty International and Medecins Sans Frontieres and doing free software development and consulting work for socially useful organisations). In other words, everything needs to be evaluated in context.

As a general rule, the new languages do not allow me to do anything that I can't do with the languages I have been using for years (C, lisp, sh, awk, sed, lex and yacc).

The only area in those languages where I could see an issue that might justify the effort of adopting a new language is for the things that are done in complex shell scripts with lots of awk and sed sub-processes -- not because these things don't work adequately but because there have been occasions when, for efficiency, I have re-written a script in C. So I'm open to a new scripting language that has all the elements of awk and sed built in, especially if it allows efficient access to the operating system and to libraries of C code to do heavy-duty processing.

The stuff I do with lisp will always be done with lisp and there's no call even to consider alternatives for that. That just leaves C to consider.

There are elements of C that I'd be happy to change, if I were starting from scratch. But none of them are significant, by which I mean that there is nothing in C that is so bad that it makes C an unsuitable choice for serious use. It is true that a large number of programmers, despite years of practice, have never actually learned to use C either correctly or well. This is not directly a fault of C, but seems to be a combination of laziness, bad teaching, terrible books on C, and a large degree of inertia -- once particular bad habits have been learned, lots of people cling to them and refuse to consider any change in their ways. (I shall have more to say on this later.)

The great virtue of C is that it's a small and simple language with enough power to do just about anything (except perhaps for a handful of special routines deep inside operating systems which have to be written in assembler). The main argument against C seems to be that it's easy to shoot yourself in the foot with it. That's true, but it's not any kind of argument not to use C -- it's only an argument for good training in C. You can hurt yourself with a sharp knife, but if you really need to cut something, a sharp knife is a far better tool than a blunt one.

Let's return to the car analogy. In addition to my toy and the kids' cars, we have an old Ford truck that I use for hauling junk to the dump and building materials to my house for the never-ending renovations that I seem to be doing. It's a real clunker, with rusty panels, rattling windows, and a tendency to form puddles on the floor in heavy rain. But it does have six new tyres, decent brakes and the other basics are in working order.

Anybody in the family can drive it with ease, even if there's no great rush to get behind the wheel. On the other hand, whenever the less experienced drivers get their hands on the BMW with its heavily modified suspension, huge wheels and tyres, and pinpoint-accurate steering, they find it a real handful -- any driver inputs get instant and significant results. The truck would be a pig on the race track, but the car is at home there -- at the cost of requiring serious skills in the driver. I like that car, and I like C. Each does what it is intended for and does it well in competent hands.

Take a look at the new languages anyway -- maybe they do have merit

Before chasing down every new language in the world, I'm first going to narrow the field a bit by considering what computing platforms are of interest. For everything I do, Unix provides everything I care about and everything my customers care about. Platforms such as the Mac and Amiga are clearly irrelevant jokes now. And Microsoft systems are not worth considering, since they are too unreliable and offer nothing of real value in return for their fragility -- consider that NT-5 or W2K or whatever they end up calling it is far bigger than any known Unix and has more new code than any Unix, but offers less. Therefore, I'll only consider languages that are aimed at Unix -- although portability to other platforms might be considered a virtue.

Given my current tool kit (mentioned above) -- C, lisp, sh, awk, sed, lex and yacc -- what new languages are worth considering? Obviously, C++ is a candidate replacement or companion for C; Tcl/Tk, Perl and Python are obvious candidates in the shell/scripting category; and Java gets a chance, if only in the over-hyped category. I'll consider each of them in turn.

C++

Perhaps the most striking thing about this direct descendant of C -- a small elegant language -- is its size and complexity. These factors are so dramatic that the majority of working programmers that I have met who claim to be proficient in C++ can only understand a subset of the new language -- and actually use a smaller subset. This approach has even been recommended by Bjarne Stroustrup.

Unfortunately, for each person, the subset is different. For instance, the well-known author and language implementor P.J. Plauger has stated that C++ is too big for him to learn, and that he intends to know the library as his special area of expertise. This leads to spectacular difficulties in large projects where the programmers are likely to be drawn from a range of different sources and to have quite different ideas about how to get things done with C++. I have yet to meet anybody who could substantiate a claim to know all of C++ -- but I would not consider employing a C programmer who did not know all of C. The effort to make it easier to build large programs has not, as far as I can see from this, been a success.

The next most remarkable aspect of C++ is the extraordinary amount of time it has taken the standards committees to get something agreed upon. This is not to trivialise the standardisation process -- it's a major undertaking -- but to point out that the difficulties with the C++ standard and its associated components are surprising, given the prior experience in creating the C standard. This drawn-out process is a strong indicator that there is much about C++ that is far from the kind of clean and productive design that would be an essential element of a proper successor to C.

Now it's 2001 and the standard is a few years old, but there are as yet no standard-compliant C++ compilers. And, as far as I can tell, there is no sign that any of the major compilers will have implemented the full standard any time soon. On the other hand, most of the compiler providers are wandering off in their own directions with their own proprietary extensions to the language (as though it's not big enough already). So, as of today, there is no such thing as C++; there are several vendor-specific languages, and it seems improbable that this will change any time soon. This is not a good sign.

What about the other touted advantages of C++? Are abstract data types, data hiding, or `object oriented' features useful? Are they unavailable in C? Are they worth the complexity of C++? In fact, these things can be implemented to some extent in C and without extra complexity; and this is done in those kinds of projects where the benefits seem to be worth the extra work. Of course, C is not `object oriented' -- but the principal proven benefit of OO is compliance with a current buzzword. Once, we had to follow the rules of `structured programming'; now, if it's not `web-something', it must at least be `object oriented'. This is not really more than another fad. It will be replaced, almost certainly by other fads of limited usefulness, in the foreseeable future, no matter how fervently its believers think it won't.

It's certainly true that, in nearly 20 years of regular C programming, I've never come up against a problem that could not be solved elegantly in C or that called out for some feature of C++. It is also true that some problems have initially been expressed in terms that seemed to call for C++ -- but, once properly analysed, it has been apparent that there was a good C idiom to accomplish the real task.

In my experience, C++ programs take longer to write, longer to compile, longer to debug, and longer to load than functionally equivalent C programs. They are also more likely to be thrown away without ever being used. It may be that there are good enough C++ programmers out there who can provide contrary evidence, but these people are in a minority that is so small as to be insignificant. I can relate one instance where a group of 30 experienced C++ programmers worked on a project for six months. They ended up with about 50,000 lines of code and a program that had never looked like working. I was commissioned to `fix' it. I refused to do it in C++ and the need was so urgent that the customer capitulated to my insistence on a fresh start in C. I wrote 5,000 lines of C in two weeks that did the job so well that the program has now been in daily use by a few hundred people for the past seven years and has never been changed. Sure, I'm better at C than those guys were at C++ -- but they were handicapped unfairly by their tool. I see no benefits in C++ that in any way compensate for the down side.

As a final anecdote, I refer to some writings by Bjarne Stroustrup where he has purported to show the superiority of C++ over C, with example code written in both languages. I'm not going to comment on the C++, but I can say that the C was the kind of sloppy wrong-headed stuff that you expect from a first-year student and could never have been considered a serious implementation of the software in question. I suspect that, if the C had been written by a C programmer, the claimed advantages of C++ would have evaporated.

Perl

I avoided Perl for some time, partly because it seemed not to offer any real benefits and partly because it did not compile on the platform I wanted to test it on. I was rather put off by Larry Wall's rude claim that I was wrong about it not building on a Unix platform -- I had the platform and I knew it didn't build, but Larry knew better. However, once it became clear that, hype or not, Perl was being adopted by lots of serious people (and was even starting to find its way into system scripts), it became equally clear that I'd have to look at it.

I started with the Wall/Schwartz book in mid-1992. The Preface gave an interesting, and mostly reasonable, explanation of what the language was all about and the ways it might be of benefit. Not surprisingly, it glossed over the ways in which Perl might be less attractive; and the tone of the comparative discussions with other languages, in which it was claimed that -- even if you could somehow manage to do things in those languages -- it would be ugly and difficult and clunky, seemed to be a case of protesting too much. Once I got to the language, however, these quibbles vanished because I found I was looking at a real disaster of a language.

I've had some experience with bad programming languages -- I've had to use some of them that you've almost certainly heard of, and I've written several myself which you most probably have not met. Largely through the experience of implementing these other languages, usually to some set of customer requirements, I feel that I can recognise a language that I don't want to know about. Perl was one of those languages.

However, because of its popularity and because I had the book, I read the whole thing from cover to cover. I even wrote a few Perl programs and played with them a bit. But there was no way to escape from the write-once nature of the language or the absurd syntax with its apparent determination to use every gadget from every other language on earth. To put it bluntly, the language is just too ugly to consider seriously. And the majority of Perl programmers seem to treat every little script as a challenge for some obfuscated coding contest.

This does not establish that it's impossible to write decent software in Perl, of course. But it's clear that serious Perl code will always be difficult for humans to parse and that this will significantly harm its maintainability.

Tcl

Having abandoned the idea of using Perl, I turned to Tcl, not least because it had by then already been used to implement some applications that I used -- it seemed that it would be worth learning the language, both so that I could more easily adapt these programs to my needs and so that I could use this apparently useful language. The fact that it came with Tk was an additional incentive, but I'll discuss Tk separately.

My first quick read of Ousterhout's book showed a simple and possibly useful language, but it was clear that it suffered from two serious defects -- difficult syntax resulting from its evaluation model and a weakness in managing anything other than strings. I nevertheless went ahead and implemented three applications in Tcl -- two of them are still in use some years later and give no problems. The other was abandoned once its users started wanting extensions that were just too hard to do in Tcl, at least if the final code was to be maintainable.

Python

Having started my consideration of new scripting languages with Perl and Tcl, I began to think that there had to be a model where the undoubted functionality of Perl could be combined with a sane and clear syntax and a simple and powerful mechanism for extending the language -- Tcl allows extensions, but the method is ugly and complex. This led me to Python and this language seems to be everything I wanted. I now write serious application suites for my customers in Python in a fraction of the time it would take in C and it's trivial to add new functions to the Python core -- in C -- when I need to do something tricky that is too slow or difficult in straight Python. And Python works splendidly with Tk, for those GUI-style applications that everybody thinks they want these days.

In particular, where there is some third-party library with a C API that performs some useful function, it is a simple matter to write a module in C to add to Python's core so that the external library is accessible to simple Python code. It's so easy to do, in fact, that I have found myself reimplementing such modules from scratch just to have them work in exactly the way I like.

Python is not perfect, of course. I particularly dislike its vague approach to objects -- it's hard to know whether you're dealing with an instance of something or a reference to it. And that becomes even more of an issue in extension code written in C. Had I written Python, that part would definitely have been handled differently. However, Python is so useful and the task of writing my own language so large, that I'll make do with what Guido van Rossum has given me.
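To illustrate the point about instances and references, here is a minimal sketch (the variable names are invented for this example and are not taken from any real program). Plain assignment in Python binds a second name to the same object rather than creating a new one, so a change made through one name is visible through the other unless a copy is made explicitly.

```python
import copy

original = ["sh", "awk", "sed"]

# Plain assignment does not copy: both names refer to one list object.
alias = original
alias.append("python")

# An explicit (shallow) copy creates a separate list object.
snapshot = copy.copy(original)
snapshot.append("tcl")

print(alias is original)     # True: the same object under two names
print(snapshot is original)  # False: a distinct object
print(original)              # ['sh', 'awk', 'sed', 'python']
```

Note that `copy.copy` is shallow: it duplicates the list itself but not any mutable objects stored inside it, which is one more place where the instance/reference question can catch the unwary.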

So I have added Python to my toolbox -- not to the exclusion of sh, awk and sed -- but as an adjunct, chiefly to be used for bigger projects. I've been using sh, awk and sed for so long that I can write scripts with those tools in my sleep and be sure they'll be right first time; and I don't want to lose practice with them as I think they'll be an important part of the Unix world for quite some time yet. But, by adding Python to the toolbox, I've given myself a powerful and easily extensible tool that makes writing large programs easy and very quick.
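As a rough illustration of the kind of job that might otherwise be a shell pipeline of sed and awk sub-processes, the following sketch does a sed-style substitution and an awk-style field split and summation in a single Python process. The function name `summarise`, the record layout (a name followed by a number) and the sample data are all assumptions made up for this example.

```python
import re
from collections import defaultdict


def summarise(lines):
    """Total the second field per name, much as `sed | awk` might in a pipeline."""
    totals = defaultdict(int)
    for line in lines:
        # sed-style step: strip trailing comments, like s/#.*$//
        line = re.sub(r"#.*$", "", line)
        # awk-style step: split the record into whitespace-separated fields
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed records
        name, size = fields[0], fields[1]
        if size.isdigit():
            totals[name] += int(size)
    return dict(totals)


# Invented sample records: "name count", with optional comments.
sample = [
    "gjb 120   # first transfer",
    "root 30",
    "",
    "gjb 80",
    "# a comment-only line",
]
print(summarise(sample))  # {'gjb': 200, 'root': 30}
```

For a job this small, the shell version would be just as good; the single-process form only starts to pay off when the script grows or when the per-line work becomes heavy.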

Tk

Tk is effectively the GUI-development part of Tcl. It allows rapid development of X applications and provides all the little widgets and building blocks to get something going in no time. I personally dislike its look and feel -- but it's there, it works, it's easy; other programs use it which makes the look and feel familiar, and it works with other languages besides Tcl. In particular, from my point of view, it works very well with Python. So I use Python to do the work and Tk to do the GUI stuff. Perhaps it's not quite a marriage made in heaven, but it's a pretty good next best thing.

Java

The famous `write once, run anywhere' slogan for Java is attractive. But the similarities to C++, the silly politics between Sun and Microsoft (who are both as bad as each other), the clear failure of the `run anywhere' part of the equation (at least so far), the constantly moving target of an evolving language with a huge user base and no clear design path, and the insufferable hype are all compelling reasons to let the lemmings play with it while I do real work with real tools.

On top of that, the refusal of Sun to submit to standardisation means that the language is a proprietary language and that has its own problems. I won't go into that here beyond stating that, for software to be truly free, the user must have a reasonable guarantee that she can modify it to suit her needs and then rebuild it. Proprietary languages make this goal more difficult. I'm aware that there are now free Java tools and that many people are happy with the present situation. But it troubles me.

Maybe it will evolve into something useful, maybe it will just die. I can't tell -- and if the success of Windows 9x and other junk from that source is any guide then it's clear that neither technical merit nor proven usefulness and robustness are preconditions to success in the weird marketplace that is such a part of the late twentieth century. I can live without Java for now, and probably for ever. I can certainly accomplish anything I might do with Java in other, to my mind better, ways.

Conclusions

This paper is not a detailed analysis of any of the languages it discusses -- its aim is to offer food for thought, based on my experiences with the languages and with the world of programming over many years. At the least, it suggests areas that need further thought and consideration before any of the new languages is adopted.

Above all, it suggests that we need to explore how we can best use the tools we already have before we run after the so-called magic bullets. I don't think there is an easy way to write software. I believe that it takes lots of education, lots of hard work, serious commitment to the process of establishing good practices and of using every element of our programming environment effectively. It involves being open to learning from the experiences of others, and to adopting better practices when they are offered. But it does not involve jumping on every over-hyped bandwagon that comes along, and it will never involve substituting myths and religious beliefs for serious consideration of the real truths about this complex and amazing and increasingly important and pervasive thing called software that we have the privilege to work with.

I welcome feedback -- including alternative viewpoints -- on the ideas presented in this paper. I don't promise to respond to email or to adopt the ideas in such feedback, but I will consider everything that comes in to me. Contact information is at the end of the file.


Availability and updates

The latest version of this document can be obtained from my web site at <http://www.gbch.net/papers/languages.html>.

It used to be available from an email auto-responder, but that service has been discontinued now that the web version is available.

There is a mailing list which carries announcements about updates to this and other documents. See the mailing lists page on my web site to learn about that list.


Permission is granted to make and distribute verbatim copies of this document provided the copyright notice and this permission notice are preserved on all copies, but modifications to the document are not allowed.

Permission is granted to make faithful translations of this document into another language under the same conditions as for verbatim copies, except that translations must carry a notice stating that they are derived works and giving the name of the translator and the date of the translation.


Copyright © 1998-2001 Greg Black -- All Rights Reserved
Questions and comments about this page to webmaster@gbch.net
$Id: languages.html 2.1 2001-11-18 17:05:48+10 gjb Exp $
