on the edge
computers & technology, books & writing, civilisation & society, cars & stuff
Greg Black
If you’re not living life on the edge, you’re taking up too much space.
Wed, 19 May 2010
Is Clojure the Answer, or Assembler?
The ongoing saga of my project of learning new programming languages and eventually getting some real software written with one or more of them has been derailed again, this time by a new(ish) entry in the Lisp family, Clojure. I already knew about it, but had been disinclined to delve into it because of its foundations in Java, a language I really dislike. But I’ve seen a few tempting things about it recently and the stuff I’ve been reading seems to show that you can use it without having to get into real Java. If that’s true, I’m interested. The principal features I need in any language that’s going to engage me are useful tools for managing concurrency, a coherent and not overly verbose syntax to make things easy for the human readers, decent performance and portability to all the free operating system platforms I care about.
Of course, there’s also the business of getting close to the machine (something I think all programmers need to be comfortable with) and for that I’m looking at X86 Assembler. I last did lots of coding in Assembler when the Z80 was king of the hill and 2 MHz was fast. Lots of my small business customers had machines that had no hope of running anything serious unless it was written in Assembler. But I’ve hardly looked at it since the decline of the Z80, so it seemed like time to complement my focus on very high level languages with a bit of low level stuff. I’ve found a few references to modern Assembly Languages and plan to get up to speed a bit with that over the next few months as a counterpoint to my functional languages.
Wed, 07 Apr 2010
Erlang or Haskell?
I recently wrote about my plans to learn some new programming languages and indicated that my thoughts were leaning towards Erlang and Haskell. I’ve now made some progress with what appear to be the most suitable books for me about each language, although I’m yet to write anything beyond the extremely trivial in either. But I am starting to get some initial feelings about which way I want to go, although there are clear concerns with both languages. Nevertheless, based on my reading so far, I think I’ll focus first on Erlang, at least unless I stumble over something that seems like a total impediment to further progress.
For the record, I have several books about both languages, but the two I’m using at the moment are Real World Haskell by Bryan O’Sullivan, John Goerzen and Don Stewart; and Erlang Programming by Francesco Cesarini and Simon Thompson.
Tue, 09 Mar 2010
My Next Programming Languages
I’ve been thinking (and talking) about learning some new programming languages over the past couple of years and it seems like time to make a decision about what to tackle. I’m not talking about learning a language just well enough to be able to poke at some crufty code that needs a tweak; what I’m interested in is learning languages well enough to seriously use them. And that, as Peter Norvig says, takes time. Which means I can’t learn every language out there.
After a lot of reading and thinking, I’ve decided that it has to be a functional language and that brings me to Erlang and Haskell. There are, of course, other candidates, but these two seem to offer the best opportunities for me at present, not least because both languages now have what appear to be good books available. And I like learning, at least in the beginning, from books.
It will be a while before I have anything to report about this plan, as I will first need to fit in my reading and practice until I can get something that I define as interesting ready for testing. At this stage, my tentative plan is to work with one language until I can write something easy but useful like a web server with it and then to reimplement the same thing in the other language and see how the two languages stack up. It might not be a web server, but it will be about that size. And it will be something I don’t need to write (since, to keep with the web server example, I already have mature software handling that task for me), because I want to do this without any time pressure so that I can really delve into it.
And it’s quite possible that I’ll just love whichever language I start with and not bother to even learn the other one. I really hope that I don’t hate them both, but I’m ready for that outcome too.
Fri, 05 Mar 2010
An Update on Mercurial Updates
I recently whined about my inability to discover how to keep up to date with Mercurial. And the fine denizens of the intarwebs came, partly at least, to the rescue. I now know how to grab the latest release and have successfully installed it on the seven different systems I cared about. Even better, I also know how to clone the mercurial-stable repository (and, obviously, how to grab updates when I want them). For the benefit of others who may have the same question:
hg clone http://selenic.com/repo/hg-stable
That will create a clone of the stable branch and put it in a directory called ./hg-stable. Then it’s a matter of doing hg incoming in that directory to see if there are any updates, followed by hg pull and hg update to get them into the working directory tree.
There’s still one little imperfection: I have not yet found any source of announcements about new releases or even important updates. I suppose I’ll survive without that, although it is nice to receive a notice when there’s an important fix available. And I can always set up a cron job to let me know if there are updates to consider.
Anyway, thanks intarwebs, you rock.
Tue, 02 Mar 2010
Mercurial Updates
As revealed recently, I’ve decided to stick with Mercurial as my DVCS. But I’m not really inclined to use the out of date packages that some of my operating systems provide. Then I had an epiphany: it’s a VCS, so I should be able to just use it to keep itself up to date. Software is such wonderful stuff.
Sadly, my Google-fu is not up to the task of finding out how to accomplish this (or else the Mercurial folk don’t support it, but that seems unlikely). Somehow, I haven’t even managed to find out how to subscribe to an RSS feed to tell me about updates. If any Mercurial-using person out there happens to know the secret answers to these puzzles, I’d be grateful for a pointer.
Thu, 18 Feb 2010
Chrome Is Useless For Printing
A few days ago, I announced a decision to try Chrome as a replacement for Firefox. I said that I would keep using it unless it fails to do something that I really want and it has now done just that.
Chrome has no print dialog or preferences and it insists on sending pages to the printer as US Letter pages. My printer is loaded with A4 paper and it knows that. So my printer, quite correctly, refuses to print the output from Chrome. So I googled for an hour or so and discovered that my experience was common and well-known and that, even in version “5.0.307.5 dev”, there’s no solution. How utterly lame.
Since I’ll have to use Firefox whenever I need to print anything, I’ll have to go back to using Firefox. If somebody tells me that Chrome has been fixed, I’ll try it again. But I must say that my general attitude towards Chrome and Google is pretty negative right now. And, even if they give me back the two hours I wasted this afternoon, I’ll still be pretty unimpressed.
Mon, 15 Feb 2010
About Turn on Version Control Systems
Just the other day I wrote about my plan to switch from Mercurial to Bazaar for my version control system. Since then, I’ve had a few days away by the sea and away from computers and email and blogs and all that stuff.
One of the things I was thinking about during my little break was my ongoing problems with procrastination. And a little light went off in my head: changing from an almost-perfect DVCS to a possibly minutely-better DVCS is almost certainly a ploy to avoid getting on with things that actually matter to me.
So I’m going to abandon that plan to switch to Bazaar and I’m going to keep using Mercurial, at least until I find that Mercurial just can’t do what I need. And, of course, I won’t be reporting on the outcome of my experiment with Bazaar.
Wed, 10 Feb 2010
Switch from Firefox to Chrome
I whine a lot about Firefox and it continues not to improve at a satisfactory rate, so I decided to have a look at Chrome. My first few experiments showed me that it was far from ready for prime time on either OS X or Linux, but various people encouraged me to try the developer version instead of the regular one and so I gave that a fly last weekend. And then I announced three days ago on Twitter that I was switching to Chrome on both OS X and Linux.
So far, so good. There are things I don’t particularly like, some of which might change for the better and others of which I’ll obviously have to learn to live with. But, for the most part, I like it better than Firefox. It seems quite a bit faster. And, although it consumes a lot of system resources, it seems to leave me with a system that still allows me to do other things. So far, it hasn’t crashed.
Some elements of its handling of tabs please me a lot, other elements not so much. It did a good job of importing my Firefox settings, although it insisted that I had to shut down Firefox before it would do the import. Under Linux, it seems to have trouble getting access to the sound system, although many YouTube videos are better silent.
The Linux instance I have running has been going for three days. It is using a bit over 2GB of memory, which I think is rather a lot, but I can live with it on my main machine. It has 45 processes. I have 7 windows and 68 tabs open (light use for me, but I’m not doing much with it at this time of the year). Under OS X, it’s much less busy as I just fire it up when I need it and never leave it running for long; the machine is a laptop which is only used when I’m away from home.
The verdict after three days: I’ll keep using it for a while until it either does something dreadful or fails to do something that I really want. It would be really nice to be able to stop whining constantly about my browser.
Thu, 04 Feb 2010
Another Look at Version Control Systems
I’ve been using version control systems for ever (well, back to the days of SCCS anyway). Every few years, I survey the scene to see if there’s something that better fits my current needs. That way I came to use RCS instead of SCCS. Then I found CVS and, after some hesitation, migrated all my RCS repos to CVS. And then I found I hated some of the weaknesses of CVS and migrated back to RCS.
There things stayed until Subversion was ready for real world use. I chose not to migrate old work, but just started using svn for new projects and then for new work on old RCS-managed projects. That went pretty well and served me for some years.
But, as Subversion was hitting its stride, other people were working on distributed revision control systems and I started watching those projects. From time to time, I would spend a few days having a good look at the obvious contenders. A couple of years ago I felt there were a few that were ready to be considered: Git, Mercurial, Darcs, Bazaar all seemed interesting. After some consideration, I chose Mercurial and I have been happy with it. But Bazaar, or bzr as it’s called on the command line, had been a close second in my assessment. Bzr was let down by some performance issues and also appeared to have a few other minor concerns.
Recently, I’ve had another look at the various DVCSes as part of another project and I think there’s very little to choose between Git, Mercurial and Bazaar. It comes down to comfort with the command structure and support for the workflows that you might want to adopt. For me, Git is still too clunky to use: it takes more typing to get the same result. But I think Bazaar has just moved ahead of Mercurial in terms of workflow options and it seems to have caught up in the performance area. So I’m going to use Bazaar for a couple of new projects and I’m also going to convert a couple of active Mercurial projects over to Bazaar.
And, in a few months, I’ll have an opinion about the wisdom of that choice and I’ll write about that in due course. I know I haven’t exactly explained my choice, but that’s deliberate because it really is a fine distinction and I’m pretty certain that Git, Mercurial and Bazaar are all fine systems.
OS X Fails to Please
I’ve been using Apple laptops for a number of years in order to have access to some specific capabilities, but I have always found it hard to come to terms with the limited functionality of OS X as a work environment. Nevertheless, when I acquired my MacBook Pro recently, I decided to just go with the flow and learn to use Snow Leopard as it was meant to be used. And that has worked out quite well for the purposes that I normally use the MacBook for: email, IRC and web browsing while on the road.
But I recently had a reason to use it for my normal work stuff. I had needed to visit a Mac retailer for some minor item and stopped to look at the 27-inch iMac, where I became entranced by the display and, to a lesser extent, by the neat overall package. This led to thoughts of possibly buying one of these things, which in turn led to thoughts of discomfort with OS X. So I decided to try out OS X on a decent-sized display instead of the teensy thing on the 13-inch MacBook.
I hooked the MacBook up to a 24-inch display to see how things might work. This brought me into contact with Apple Fail Number 1: the ability to get stuff onto the display you want it on is a black art and in some cases it’s only possible to start an application, see where it lands and then drag it to the desired display. That was hugely unimpressive, but wasn’t the point of the exercise, so I tried to ignore it while doing my testing. I believe I succeeded in applying my attention to the factors that would be relevant with a single large display running OS X.
To give it a fair go, I used this setup for three days as my desktop environment. But that was as much as I could stomach. Gnome, whether under FreeBSD, or OpenSolaris, or Linux, is just so much better to work with than OS X that it’s really not even a contest.
The upside of this is that I’ve saved $3k that I had put aside for the iMac which I could now partly apply to the bicycle that I’ve been thinking about buying as part of my fitness program. Another upside is that I won’t be constantly chafing against all the annoying little restrictions that Apple impose on their customers. So, although I will slightly regret the decision not to add something shiny to my desk, I think I’m probably more pleased than sad.
Fri, 29 Jan 2010
To Do Lists and Life
In my life as a stellar-class procrastinator, I have evolved a variety of techniques aimed at convincing myself that I’m doing something about my desire to get stuff done. Probably the most well-used of these, apart from straight-out avoidance, has been via the creation of numerous lists of things to do. At its best, this results in lists of lists and lists of lists of lists.
And, since I am a software person, that then results in an occasional burst of time-wasting in search of the ideal software tool for making lists. Mostly, I find a few tools that I haven’t seen before and a few that I have tried but which I hope might have improved just that little bit to make them useful. Always, I spend a day or two playing with the tools I find and sometimes I choose one, only to discard it a week or so later.
Usually, I keep these little excursions into procrastination secret. But today’s find, todoist, has already impressed me as being much better than anything I’ve tried previously, so I’m going to give it a mention here in the hope that it will force me to just get on with ticking off stuff from my new lists (and also to let people who like lists know about a new toy). And, to be honest, the other reason for posting about it is so I can tick off the item about blogging every Friday.
Fri, 11 Dec 2009
Issues With OpenSolaris: The GNU Tools
After my recent post about giving up on OpenSolaris I received a few requests for more information from some people who were prepared to jump through the hoops of contacting me despite the lack of comments on this blog. This is one such followup. I plan more.
For reasons which are probably understood somewhere inside Sun, but which I believe to be at least in part a requirement to support legacy software, OpenSolaris is still delivered with antique, if genuinely Unix, software tools. If an innocent newbie whines about the fact that the supplied awk or tar (to pick just two examples out of many) is unable to handle just about any task that someone might expect in 2009, she will be told that the GNU utilities are available. Not only are they available, but they are available in multiple ways.
To take the case of awk, old awk is /usr/bin/awk, GNU awk is /usr/gnu/bin/awk and that one turns out to be a symlink to /usr/bin/gawk. So you can get GNU awk either by calling it gawk or /usr/gnu/bin/awk, which might tempt you to put /usr/gnu/bin in your $PATH before /usr/bin. If you do that, you might manage for hours or days until, for example, you needed to read a man page. At that point, man will be mysteriously broken, because it depends on the old Solaris versions of some of the formatting tools rather than the GNU versions, but the idiots who built man for current versions of OpenSolaris have apparently forgotten something I thought all Unix people had known for at least the last 25 years: in system tools, you exec full pathnames rather than relying on the user’s $PATH to find the right ones.
One last note for today: despite the fact that “everybody” knows that shell scripts start with #!/bin/sh, in OpenSolaris they start with #!/usr/bin/sh despite the fact that the historical formulation would work. Why? Because at some point, they did away with /bin and moved everything into /usr/bin.
While it’s true that they do provide /bin as a symlink to /usr/bin, gratuitous changes like that really don’t help anybody.
Sane readers will quite possibly feel that this little essay is hardly sufficient reason to abandon OpenSolaris, and I would agree with that. But there is much more and I’ll try to cover some other issues in the near future.
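The point about system tools exec’ing full pathnames can be illustrated with a minimal C++ sketch for POSIX systems. It is not taken from OpenSolaris or from man; the helper name run_absolute is made up for this example. It relies on the documented difference between execv(), which uses the path exactly as given, and execvp(), which searches the caller’s $PATH.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (an assumption for illustration, not real system code):
// run a program by absolute path and return its exit status, or -1 on failure.
// execv() never consults $PATH, so a reordered $PATH cannot change which
// binary runs; execvp() would search $PATH instead.
int run_absolute(const char *path, char *const argv[])
{
    assert(path != nullptr && path[0] == '/'); // insist on a full pathname

    pid_t pid = fork();
    if (pid < 0)
        return -1;          // fork failed
    if (pid == 0) {
        execv(path, argv);  // only returns if the exec itself failed
        _exit(127);         // conventional "could not exec" status
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A tool written this way would call something like run_absolute("/usr/bin/nroff", args) rather than looking up a bare name, so it keeps working whatever the user has put first in $PATH. The nroff path here is only an example of the idea.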
Fri, 20 Nov 2009
Premature Upgrade
You are a technically-competent geek who has been in the sysadmin world for decades. You have many machines under your care. One of your machines is a fax server that sends hundreds of faxes a day. Your operating system is going through the pre-release stages of getting a new major release out which has many new features and many changed features, as you might expect in a change from release 7.2 to 8.0. A release candidate for 8.0 is announced, so you grab it. So far, all is good.
But then you blindly upgrade your one fax server to the release candidate and discover that the completely new (and not at all secret) serial I/O system doesn’t work quite right with your hylafax setup. You already know, from at least 10 years of experience with it, that hylafax is demanding and that issues with the serial hardware or software result in bad things happening.
This is where you are supposed to say, “Oops, silly me. I should have learned not to do that by now. Quick, let’s unwind that to a known working setup real fast before this turns into a disaster.” But no, this person decides to conduct pointless experiments instead of unwinding his mistake. And he also finds spare time to complain to the providers of the free operating system he has relied on for so long.
This just doesn’t make any sense to me. I’d love to think that people could learn from their own mistakes and even from other people’s mistakes, but sometimes that seems like a foolish dream. And yes, I deliberately avoided mentioning names or providing URLs. I’m not interested in having a go at any individual, just using a real current case as a cautionary tale.
Thu, 05 Nov 2009
Firefox Keeps Finding New Ways to Fail
I have been using (and whining about) Firefox since it first appeared. It has improved in many ways since the early days, and I am pleased about that. But it still finds new and astonishing ways of driving me crazy. I have been using an Ubuntu-badged variant with the ridiculous name Shiretoko, and it appears to be based on Firefox-3.5.3 (which I know is not the latest, but it’s pointless going through upgrade pain when I am on the verge of changing a few other elements of my desktop: new motherboard, new memory, and a non-Linux operating system).
The new misfeature in this Firefox is that it constantly appears to freeze, for between 8 and 25 seconds. This is sometimes accompanied by greying out the Firefox windows, which seems to say that Firefox knows that it’s dragging its feet. The only good thing is that, whatever it happens to be doing while it’s doing nothing useful, it doesn’t also elect to bring the rest of the machine to its knees: all other windows and processes are able to operate quite normally while Firefox is thinking. (Which is why I’m able to write this as I wait.)
The machine is a quad-core 64-bit Intel thing with 8 GB of memory, so it should be more than enough to handle simple web browsing. By simple web browsing, I include accessing static web pages on my LAN, which is also afflicted by this bizarre behaviour. A partial list of the activities that provoke this behaviour includes: pressing a key to scroll the page down; clicking on anything; attempting to grab a scroll-bar to do the obvious; typing in an input field; and so on.
This makes net banking a fraught and perilous exercise, as it’s necessary to wait for up to 30 seconds to see if the click you made was actually registered or not—you don’t want to be clicking bank buttons more than once, but it’s not a lot of fun if the bank website times you out near the end of some complex transaction while you waited to see if your click was being processed. It’s almost enough to make me go back to bricks and mortar banks. It’s certainly enough to make me hope I can get some other software installed pretty soon. And it’s certainly sufficiently annoying that I’d be hard-pressed to maintain my normal exemplary politeness if I met a Firefox developer any time soon. Oh, that bit about my normal exemplary politeness was a joke. Just so you know.
Wed, 28 Oct 2009
Two Weeks of Dspam
It’s now two weeks since I set up dspam-3.9.0-BETA1 to handle my home network’s incoming email and it’s time for a review. I began by training dspam with a recent corpus of about 70k spam messages and 10k ham. Then I passed everything through dspam and checked its accuracy.
In my home situation, we can live with some missed spam turning up in our inboxes, but we can’t live with false positives. Dspam made one false positive out of 11,873 messages processed and that was in the first few hours. I’m ready to stop checking for false positives now and have started just dropping the spam on the floor. Over the two weeks, I’ve only seen 17 spams per day out of the 435 that get delivered; and my wife has only seen 4 per day out of the 320 that are delivered for her. I’m calling this a great success and have decided that it’s sufficiently good that I don’t need to implement any other anti-spam measures at all.
The minor downside with the methodology I’m using is that any false positives will never be reported to anybody now that the testing phase is over. If anybody sends me a genuine email that dspam thinks is spam, I won’t see it and the sender won’t get a bounce. I can live with that.
Sat, 10 Oct 2009
Mail Client Software Keeps Getting Clunkier
There was a time, before the WWW, when email client software was clumsy beyond belief; those who remember the original mail command and UUCP bang-path email addresses will know that things improved over a decade or two. Then, with the growth of the Internet, graphical mail user agents (MUAs) appeared. Some of them were better than others, but they all suffered from some irritations. At the same time, the text-based MUAs continued to be developed. Then, just when you might have expected that we were on the threshold of some really good software, things just stopped. I like to blame Microsoft for developments that I don’t like, but I don’t know if that’s fair in this instance and it’s not really important.
One of the early graphical MUAs was Exmh and, despite some clunkiness, it was a pretty useful utility. So much so that I persuaded my wife to use it when she decided to enter the email age. And she has been happy with it for about twelve years. I also used it for a few years, but eventually changed to a text-based MUA as I found myself dealing with ever-increasing quantities of email and discovered that I preferred the speed of the keyboard over the purported convenience of the mouse. And there things stayed for several years.
Recently it became necessary to update my wife’s computer: it was a seven-year-old box running an almost equally old operating system, the hardware was almost on its last legs, and some of the software (e.g., Mozilla-1.x, OpenOffice-1.0) was simply inadequate to handle modern websites and data. And there were also more than a few security vulnerabilities in the operating system. The search for a replacement, which she wanted to be silent, first led to selection of a Sunray thin client workstation.
A number of factors resulted in the abandonment of that plan, but one thing that happened while I was exploring it was the discovery that Exmh, which has not been further developed since version 2.7.2 was released in January 2004, was probably not going to be an option on the intended Solaris platform. That was no surprise, since it’s what happens to older software that doesn’t match the dominant design. So it became necessary to research alternatives that she could live with.
There’s no shortage of choice and I won’t list any of them here. Where there is no choice is in the user interface: yes, there are differences, but they are insignificant against the overall architecture. And all of them, although faster in things like actual message display than Exmh, are much slower and more painful to use. I tested several and reviewed all those I could discover and they were all the same.
Eventually, I set up the one I thought best (for various other technical reasons not relevant to this discussion) and tried to teach my wife how to work with it. This was a disaster. She was already upset about the other changes I was going to force on her for the “upgrade”, but she uses email frequently now for her work and the modern software simply didn’t cut it for her.
Fortunately, the Sunray project died for other reasons and I had to find an alternative. And that machine, an Eee PC that was originally intended for me, runs Ubuntu and still provides Exmh as an optional package. Crisis averted. For now.
Sadly, I see no signs of any of the MUA authors making any effort to make their software more functional. Adding bling is popular, but you’d think these people would use their software and would get frustrated with its clunky behaviour and would therefore want to improve it.
I still hope that something better than anything we have now will arrive in the next three to four years so that, the next time I have to upgrade my wife’s computer, I’ll be able to introduce her to a new MUA that she will be able to learn to like.
Wed, 30 Sep 2009
Another Go At ZFS?
A few days ago, I wrote: “I’m going to install Ubuntu and fuse-zfs on one of my machines …” To my surprise, especially in the absence of comments on this blog, I got quite a bit of feedback about that idea, all of it indicating that it would fall well short of my expectations for ZFS.
What to do? One idea that has occurred to me is to tentatively blame the motherboard/memory in the system that was giving me grief under OpenSolaris. That then allows me to justify buying another motherboard, new memory and video card and having another crack at running OpenSolaris with native ZFS on my workstation. I’ve bought those bits and pieces and plan to install them and experiment in the very near future.
In the meantime, I will continue my planned server setup on a different box that will also run OpenSolaris and ZFS, but that machine won’t be bothered by pesky irritants such as a screen or keyboard. I expect it to behave nicely. As for my desktop box, we’ll have to wait and see. That will also force me to learn to love OS X since my MacBook Pro will be pressed into service on the desktop for a while. Perhaps I’m not too old to learn new tricks …
Fri, 25 Sep 2009
Using Linux and Wanting ZFS
I’ve been using Linux in a limited manner for about four years, meaning that it has been installed on at least one machine that I use fairly regularly over that period. In common with all the other operating systems that I have used, it has its good points and its shortcomings. But, in recent years, the strengths have got stronger while also becoming more important to me and many of the weaknesses have been addressed. However, it has been what I now regard as the failure of my attempts to make friends with Solaris Express and OpenSolaris during some quite intense attempts over the last year that has forced me to look hard at Linux as my main operating system for the near future.
There is just one serious fly in the ointment: the unfortunate fact that none of Linux’s multiplicity of file systems is a match for Sun’s ZFS. At first, I thought ZFS was something that was at least theoretically a good thing, but its unfamiliarity made it seem like something that you could live without. However, in a remarkably short time, ZFS became ridiculously easy to use and that’s when I started to see just how big a step forward it is. I really don’t want to go back to old-style Unix file systems.
Unfortunately, due to the old wrangles over which open source licences are good and which are not, the Linux people don’t feel able to adopt ZFS. I see that btrfs is being developed and that it is hoped that it will bring the features that I love in ZFS to Linux. But btrfs is years away, and I need a file system today.
I’m going to install Ubuntu and fuse-zfs on one of my machines at the start of next week to see how well that combination works. It’s far from ideal: fuse-zfs has been pretty well abandoned, as far as I can tell; and it is well behind the zpool/zfs versions that are now in Solaris.
But if it works well enough I’ll give it a go and then I’ll cross my fingers hoping that Sun might fix the ZFS licence problem as they have finally managed to do with Java.
My other option would be to set up a server running FreeBSD with their implementation of ZFS and to use it as a file store for my Linux desktop machines. I’d rather just run Linux, but we’ll have to wait and see.
Wed, 08 Aug 2007
Code Craft falls down hard
I know it’s not possible to write a big book without having any errors fall through the cracks, and I don’t make a habit of public excoriation of people for things that can be forgiven, but there are unforgivable things. Take Code Craft by Pete Goodliffe, published by No Starch Press as an illustration. Here we have a 580-page tome dedicated to the practice of writing excellent code and on page 13 it has an egregious example of unforgivable content.
Before getting to the details, I would mention that neither the book nor the website gives me any information that I could find in a reasonable amount of time about how to report errata. Had there been such an avenue, I’d have taken it. As it is, this seems the easiest approach.
This is in Chapter 1, On the Defensive, subtitled Defensive Programming Techniques for Robust Code. Under the heading Use Safe Data Structures, he gives the following example of some C++ code:
char *unsafe_copy(const char *source)
{
    char *buffer = new char[10];
    strcpy(buffer, source);
    return buffer;
}
He then gives the correct explanation of the problem with this code when the length of the string in source exceeds 9 characters. After some discussion, he then says it’s easy to avoid this trap by using a so-called “safe operation” and offers this idiotic solution:
char *safer_copy(const char *source)
{
    char *buffer = new char[10];
    strncpy(buffer, source, 10);
    return buffer;
}
In case the reader doesn’t know how the C string library (which is what is being used here, despite the otherwise C++ content) works, let me point out that strncpy is guaranteed not to solve the problem under discussion. The strncpy function will only copy at most the specified number of characters, but — in the critical case where the source string is too long — it will not add the very important NUL-terminator character. And so users of the returned buffer will still fall off the end of it and cause breakage. Every C or C++ programmer who has been paying attention knows what is wrong with the C string library and knows how to use it correctly. So an error of substance like this should simply never have happened. It’s not a typo. It’s not a trivial error. It’s just plain wrong. And there’s no excuse for it. I’m sure the author has many good things to say in this book and many of the sentences I have skimmed certainly do make sense. But stuff like this makes it impossible for me to suggest that it has any place on the budding programmer’s bookshelf. That’s a shame, because we need books that do what this book purports to do. What irritates me most about this is that none of the book’s reviewers spotted this glaring error and none of the online reviews that I found noticed it either. This means that nobody with even a tiny clue has been looking at it.
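For what it is worth, here is a minimal sketch of a copy that stays inside the buffer and always terminates the result. It is plain C rather than the book's C++, so malloc stands in for new, and the names bounded_copy and BUF_SIZE are hypothetical ones chosen for illustration; nothing here comes from the book.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 10  /* same capacity as the book's example; the name is made up */

/*
 * Copy at most BUF_SIZE - 1 characters of source into a freshly allocated
 * buffer and always terminate it. Returns NULL if the allocation fails.
 * The caller owns the result and must free() it.
 */
char *bounded_copy(const char *source)
{
    assert(source != NULL);

    char *buffer = malloc(BUF_SIZE);
    if (buffer == NULL)
        return NULL;

    /* strncpy pads with NULs when source is short, but writes no terminator
       when source has BUF_SIZE - 1 or more characters... */
    strncpy(buffer, source, BUF_SIZE - 1);
    /* ...so terminate explicitly to cover the long-source case. */
    buffer[BUF_SIZE - 1] = '\0';
    return buffer;
}
```

Silent truncation can be a bug in its own right, so a caller that cares should compare strlen(source) with BUF_SIZE - 1 first, or simply allocate strlen(source) + 1 bytes.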
Wed, 04 Jul 2007
Python 3000
About three years ago, I announced my plan to move away from Python for future development work. I returned to that theme twelve months ago in a couple of posts about recent experiences with Python. It seems time to update things now. I have just been reading Guido van Rossum’s Python 3000 Status Update in an attempt to understand what the future holds for Python. Clearly, the Python people have decided to make major changes to Python, such that software written for Python-2.x will need work if it’s to be expected to run on Python-3. Equally clearly, a great deal of work has gone into creating mechanisms to assist programmers with the necessary translations when the time comes and that’s something I applaud. However, I have long been unhappy with Python’s continual introduction of what I see as gratuitous changes and have been looking at alternatives. Now seems like the time to jump ship. My plan now is to do some serious testing with alternative languages so that, when the time comes for me to write some new thing, I will be ready to do it in some non-Python language. This post is just to mark the point where that decision was finally made and to link to the Python 3000 paper that marked the tipping point.
Wed, 30 May 2007
A new approach to spam filtering
About three years ago, I first considered DSPAM as a potential solution to the incoming tide of spam that was drowning me and that was increasingly overwhelming SpamAssassin, my then tool of choice. I wrote a couple of blog entries that discussed my research and included references to papers by Jonathan A. Zdziarski (the author of DSPAM) and Gordon Cormack (who, with Thomas Lynam, wrote an evaluation of anti-spam tools). I also mentioned some of my discussions with both Zdziarski and Cormack and said I would report more when I had more information. Much time has passed and the spam problem has, as we all know, continued to get worse. Last December, having become completely fed up with the worsening performance of SpamAssassin, I decided to install DSPAM for testing. I elected not to bother training it, but allowed it to do its thing and contented myself with informing it of its errors. The downside was that I had to look at every incoming message, whether spam or not, to be sure of the classification. I have examined 82,931 messages in the last five months and I’m amazed at how well DSPAM works. Overall, it has caught 98.92% of all spam and its false positive rate has been 0.02%. Most of the errors were in the first and second months while it was learning. Now, it is catching over 99.2% of spam with a false positive rate below 0.01% and there have been no false positives at all for a couple of months. For my wife, the learning was a little slower because she receives much less total email than me and her legitimate email volume is so small that it’s a bit of a challenge to get enough for training. However, even in her case, the detection rate is up to 98.90% and false positives have also disappeared. I was going to modify qmail to reject messages that were deemed to be spam, but I’ve decided that it’s too much work, given the ickiness of the qmail code and the excellent performance of DSPAM.
I also toyed with the idea of changing MTA, but I have not found an MTA that I would be willing to use that also has the ability to do what I want. I may one day decide to write my own MTA for in-house use, but for now I’m going to stick with qmail and the other modifications I had made to it in the past and—starting right now—I’m going to stop my practice of reviewing incoming spam in case any legitimate email is lurking there. In other words, from now on, if anybody emails us and DSPAM thinks it’s spam, nobody will ever see the message. There will be no bounce, there will be no error message, there will be no sign that the message was lost. But it will be irretrievably lost. I have decided that the time spent on reviewing the spam is not worth the rewards, when the chances of finding a real message seem to be less than one in a million now. This is especially true when it’s also true that anybody who might need to contact us and who we would care to hear from has other methods of doing so. I am really delighted to have got to the point where my spam load consists of hitting ‘S’ once a day to tell DSPAM about something it has missed.
Fri, 20 Oct 2006
Software quality
It’s hard to find examples where the two words in my title belong together. People of all kinds–users of software, software developers, and those who teach the next generation of developers–have been pontificating for decades about both the problems with software and various approaches that might help to solve them. But, as a general rule with almost no exceptions, software still sucks. And it’s getting worse, not better. Anybody who happens to read this already knows that software is a problem since they have to be using quite a bit of software just to be reading a blog–and it’s my contention that just using software is enough to drive you to drink. I’m a software developer, so I am fully aware of the difficulty of creating high quality software. It is indeed difficult to produce software with no bugs and, for most software, it’s probably impractical–or at least not worth the cost. But that’s not to say that the quantity of bugs in the stuff we all have to deal with every day is justifiable. Here are a couple of examples from some of my appliances. I have a DVR. It has a 60G hard disk and a DVD writer. And it doesn’t have to do much. So why does it take 30 seconds to boot up? Why does it have to boot up after making a recording? Why can’t I set it to record something that starts less than five minutes after the previous item? Why can’t it sort titles alphabetically using the same rules as anybody else? To elaborate on the last point, as it’s a classic example, consider the following list of titles:
That’s sorted in the way that any sane person would expect. Getting software to sort it that way is child’s play. The Unix sort program will sort it that way by default. So what does my DVR do? Behold:
I’ve had enough time to study its behaviour now, so I can choose titles for things that it can sort the way I want–but it’s completely crazy that I should ever have been driven to think about this. I don’t want to belabour the point about this one appliance too much, so I’ll limit myself to one other bizarre fault. It has, as I mentioned earlier, a 60G disk. So it can store quite a number of off-air programs for viewing at more convenient times. Well, it could do that if it didn’t have a limit of 7 recording slots. That’s right–seven. The first VCR I ever owned, more than 20 years ago, could be programmed for more than seven recordings. I’m sorry, I lied. I’m going to mention one more thing about this device–because it’s faintly possible that this is a deficiency in the hardware rather than the software (not that I believe that for a minute). The advertising material and the manual for my DVR claim that it can copy between media at accelerated speed. That seems to be a reasonable capability, given what we know about other such equipment, so I expected it to work. But it doesn’t. Copy an hour of video from the hard disk to a DVD and it takes exactly one hour. Copy an hour of video from a DVD to the hard disk and that also takes exactly one hour. And, for another instance that might be hardware but probably isn’t, consider the fact that it makes DVDs that quite likely can’t be read by at least a few of the 8 other DVD readers in the house. Sometimes, to show just how clever it is, it can make a DVD that it can’t read itself. And when that happens, the only way to get control of it again is to remove the power and physically remove the offending disk. That device has a flotilla of other hideous bugs that make it a nightmare. My wife just leaves it to me to drive it. She is pretty smart. Now, let’s turn to my phone. No, let’s not. At least, not beyond mentioning that it’s slow and as buggy as hell too. 
Having once worked as a consultant for one of the mobile phone manufacturers–where my task was to teach their programmers C and to help them with some of the interesting bits of their software–I’m not surprised to see that this phone has software that I wouldn’t spit on. The trouble is that software is ubiquitous in our world. Even if you never touch a thing called a computer, you can’t escape it. So you’d imagine that we–the software creators of the world–would have figured out some of the basics of making software by now. But we simply haven’t come close. And nothing I see in our tertiary institutions makes me think that’s about to change all of a sudden. One of the good things about the world of software is Free Software. (For anybody who doesn’t recognise why those two words were capitalised, have a look at this definition for some insight.) Sadly, the Free Software people are at least as bad as the rest of the software community when it comes to quality. The Free Software crowd have a bunch of silly slogans written by would-be philosophers without much insight such as the famous “Given enough eyeballs, all bugs are shallow”, usually attributed to one of the worst poseurs in the community. The problem is that bad software is much easier to get done than good software. Of course, if you consider the subsequent investment of time by the software authors while they try to address the worst bugs, then the apparent speed of the favoured method seems less of a sure thing. And, heaven forbid, if we considered the time lost by the unfortunate users of this software, then the equation becomes ridiculous. Time spent on whatever process we can find that results in fewer bugs going into the product will be amply rewarded. 
And, as many people are now showing, it is almost certain that taking the time to get it right in the first place will be quicker than rushing bug-infested rubbish out the door–certainly once the developers have had time to become established in this new pattern of work. Just as I avoided listing brand names of my appliances above, I’m not going to single out individual pieces of software for criticism here. But it’s pretty safe to say that any high-profile piece of software is almost certainly riddled with thousands of maddening bugs. And the bigger the software, the worse it will be, for reasons that will be obvious. As a final comment on the sad state of things, I’m going to look at the state of programming languages–again without naming names. I’m not talking about the suitability of our languages for developing great software, although that is indeed an important matter. I’m just talking about the woeful state of the software in the interpreters and compilers themselves. I recently acquired a 64-bit computer, not so much because I needed the extra capabilities it had, but to use as a platform for me to use to weed out any little buglets in my own code that might be exposed by a 64-bit machine. As it happens, I have not found any. But I have been amazed at the number of languages that I wanted to use in my testing that are simply not able to run on a 64-bit platform, despite the fact that 64-bit systems have been around for years. And not to mention all the other applications that are not yet available for my 64-bit platform. This is really sad. And it’s so bad that, in the next few weeks when my operating system comes out with its next release, I’m going to install the 32-bit version on my workstation so that I’ll be able to use all the stuff I want to use. I’ve been working all my life at continuing to improve the way I do things. I will keep doing that. I’m happy to talk with people about ways of improving software. 
And I really think it’s way past time for the software development community to get off its collective butt and to start looking hard at injecting quality into software.
Sat, 15 Jul 2006
Further thoughts on Python
I posted an article recently that took a swipe at the direction of Python development. Had I realised that it would be seen by people on the Python developers list, I’d have phrased things differently; I treat this blog as a private repository of my thoughts about various things and assume it will be mostly read only by people who know me. This post is intended to explain my position a bit more clearly and to take into account some of the responses from the Python developers. I’m going to start by outlining my position and my expectations so that those people who seemed baffled by my stance can have a better opportunity to understand this stuff. Then I’m going to discuss a small fragment of the responses on the mailing list. And I’ll finish up with my thoughts about the future. At the outset, it would be useful to understand that there’s nothing personal in anything I say here; I’m simply stating my take on things. I know I’m far from a typical Python user, and I’m certainly not trying to suggest that everybody should do things my way or even agree with me, although I would hope that people would agree that I’m entitled to hold my own opinions. I’ve been developing software for commercial clients for about 25 years and some of the code I wrote twenty years ago is still in daily use in business environments. I take a lot of care to make my software into a stable and useful tool that allows its users to conduct their normal processes in the way they want. I do not believe that customers should twist themselves in knots to adapt their way of doing things to the software that somebody tosses on their lap. Sometimes, this whole process is straightforward. The customer has a stable hardware platform and I provide stable software and their process undergoes minimal change, and their entire platform is isolated from the big bad world.
In these cases, we might see the same old hardware running for 15 years or more with unchanged operating system and application software. Nothing will go wrong there. In other circumstances, customers run their business processes on systems that are exposed to the Internet and need to keep their operating system and basic utilities up to date in order to avoid exploits. This can result in unexpected updates of things that my software might depend on—such as a new version of Python or, to take another real example, an updated Unix C compiler that introduced a gratuitous change in the format of floating point values that resulted in a database where all values were suddenly multiplied by 4. Since I use a large number of software packages on my systems (over 500 at present), it’s completely impossible for me to keep fully informed about all the evolution that goes on in all of them. I am a contributor to a small number of free software projects and I do take seriously my responsibility to test them. But I just can’t do that for everything I use if I am also going to do my day job. So I have an expectation that my tools won’t introduce gratuitous change into my world. What puzzled me about some of the responses on the Python developers list was that people felt entitled to take a swipe at me for expecting bug free software, despite the fact that I had clearly explained that I was not complaining about a bug—all software has bugs and I understand that they must be fixed when found. My complaint was about a change in behaviour from a function that had no bug in it. Fortunately, Guido van Rossum (the Python benevolent dictator) is a lot smarter than the chief Perl weenie and knows how to read for the real content. He recognised that the issue I was complaining about was something that had bitten him in the past and he requested that it now be fixed. I understand that a fix is scheduled for Python-2.5 when it comes out. 
I understand the desire of the Python community to continue to develop their language. (I think they’re wrong, but I’m in a tiny minority and I have no intention of trying to convert the majority to my opinion.) What I find problematical, however, is their willingness to break working code as part of this process.
I complained the other day about the change in behaviour of
It’s one thing to extend the language and its support libraries. And I have no argument with that at all. And it’s fine to fix actual bugs in the existing code. But making changes that are guaranteed to break existing correct code is just insane, as far as I’m concerned. As another example of inexplicable change, I would mention the change in meaning of the division operator. It doesn’t matter if, in hindsight, you see that it would have been nicer to do something differently—once people are using your language, you have to leave it alone. Or else they will do what I’m going to do. I’m going to lock my customers into Python-2.3 for now and then I’m going to migrate all my Python code to a language that doesn’t go in for this kind of breakage. Ironically, had I used awk for the software in question, I’d have had no problems at all. But Python was new at the time and arguably nicer to write and had two minor but useful features that were missing from awk, so I decided to develop a collection of software using Python. I don’t regret doing that, but it’s definitely time for me to move on now. That’s not just because of the small number of issues that I’ve discussed here, but because of the looming arrival of Python 3000 which sounds like far too dramatic a change for me to want to keep up with it. If I have to deal with that level of change, I’m going to be far better served by choosing a more stable environment for the future work.
Sun, 09 Jul 2006
Python loses the plot
In Python’s early days, I saw it as a fine addition to the programmer’s toolkit: it seemed to offer the good things that Perl offered, but without the gruesome syntax and other Perl perversions, and the Python benevolent dictator and community seemed to have a good plan for the future. As a result, I began developing most new large applications in Python, with the occasional bit of heavy lifting in C as needed. This approach worked well for some years. But then the wheels slowly started falling off, and yesterday’s experience has pushed me to the point where I’ve decided not to use Python for any new development. This leaves me with a dilemma, of course, as I don’t have any suitable candidate for a replacement. So what’s my beef? In a nutshell, it’s gratuitous changes that break code that was once correct when it’s exposed to a newer Python release. This disease has afflicted Python for some time, although I have been lucky enough to have only been bitten once before. Yesterday was my second experience with this kind of breakage and I’m going to make it my last. For those who care, the behaviour of Clearly, I can work around this. But then they’ll break some other standard function that I’ve been using and I’ll have to work around something else. And so on. There is no legitimate excuse for this kind of arbitrary change. It’s impossible to code in such a way that you won’t be bitten, and there’s too much new software coming out every day for developers to have the time to waste reading all the fine print just in case some idiot has broken some standard API. My interim solution is to change the first line of all my scripts
from
Tue, 30 May 2006
Native driver for nVidia NIC saves the day
As distributed, FreeBSD for the AMD64 platform comes with a rather dodgy driver for the on-board nVidia nForce MCP NIC that appears on many motherboards for this CPU. In my case, the symptoms were trillions of device timeouts and weird unresponsiveness under Gnome and bizarre keyboard malfunctions: lost keystrokes and occasional cases of keys repeating hundreds or thousands of times. Some research quickly established that these issues were well-known and that there was a revised version of if_nve.c that was supposed to address these concerns. Unfortunately, simply replacing that file with the updated version resulted in a kernel build failure, as other stuff was required. Since that other stuff was supposed to live in a directory that doesn’t even exist on my machine, I decided to try plan B. Shigeaki Tagahira has developed a FreeBSD native driver, based on the OpenBSD driver. Today, I built that and patched my kernel with his ciphy patch for the Cicada PHY and am pleased to say that the odd behaviour I was seeing now seems to have been cured. This is great news. And yes, this is a bit boring, but I wanted to record the essential data in case I blow away my installation and forget how to rescue it. Aren’t blogs wonderful?
Fri, 26 May 2006
Rewrite it or fix it?
I’ve been reading Adrian’s series of articles on XP with considerable interest. I’ve found it interesting to see how somebody I know has got involved with this approach to software development and I’ve felt that there were lots of good lessons there.
But I was a bit taken aback by something in a recent entry, Framing The XP Principles: “Netscape has all but gone out of business because of one bad technical decision to rewrite their entire browser instead of taking small steps and fixing the existing one.” Now, before I launch into my speculations here, I should point out that I’ve never worked at Netscape and I’ve never worked on any of the Mozilla products, although I have occasionally skimmed parts of their source code. So what follows is purely guesswork, based on what I can see from outside. On the other hand, I have worked on a number of big projects where a decision was made to throw out some existing implementation in favour of a complete rewrite, and I’ve seen such projects succeed. I am satisfied that the old Netscape code base was rubbish. I am also convinced that it would have been insane to attempt to fix it. Where I think Netscape went wrong was in failing to learn anything from the first time through. From where I stand, they seem to have embarked on another gigantic piece of junk without getting hold of the right people or developing a sensible plan. It’s clear to anybody who uses or who has the misfortune to need to build something like the current version of Firefox that this new code is pretty nearly as bad as the old code. Of course, it is better, but not significantly better. It’s much more like the old rubbish than like something that we’d all be proud to be a part of. I have no way of proving my point, of course. And I don’t care that much about the specific case. And I certainly agree that it’s better to refine a code base as a general principle rather than to automatically throw it all away. But I think it’s important to recognise that there are real cases where it is better to discard even a huge code base than to get lost in a vain attempt to “fix” it.
Naturally, if the decision to rewrite is taken, it should only be done if there is a clear commitment to first learn the lessons from the failed implementation and to create a design and a methodology that have a reasonable likelihood of success. Having said that, I now await Adrian’s next instalment with interest.
C is harder than it looks
I recently saw a nice example on a mailing list of the kind of problem dilettantes run into when they play with C. The coder naturally failed to show what he’d done, but described his problem in terms of “what’s wrong with the implementation of sockets in this operating system?” and went on to describe an impossible scenario. This led to a variety of responses from people who like to be seen to be able to help and don’t strongly feel the need to be right. It was pretty obvious that the original coder had made a standard beginner’s mistake with C syntax and had compounded his error by failing to turn on warnings in his compiler. In essence, he claimed that the socket(2) call would return a descriptor that was already in use because, after opening his socket, data written to the descriptor would appear instead on the standard output. Fortunately this was seen by somebody who actually knows C in time to stop too much silly speculation. This guy suggested that the coder must have done something like:
if (sd = socket(AF_INET, SOCK_STREAM, 0) != -1) {
    /* do stuff */
}
That code will always assign the value 1 to sd, except for the rare case where it’s not possible to open a socket (in which case the reader, having been alerted to the error, will now know what value it will get). Had our coder turned on compiler warnings in gcc, he would have been told, “warning: suggest parentheses around assignment used as truth value” which might not have been enough, but would have suggested that he needed to get help. Real C programmers have that extra set of parentheses burned into their fingertips and don’t need to think about them—and, on the rare occasions when they do forget them, know instantly what that warning means. And real C programmers always run their compilers with all warnings turned on. Unfortunately, too many people take a quick look at C, see something quite simple, and decide that they can safely work with it. It is indeed a simple language, but it’s also a demanding language that provides no training wheels for learners. It takes time to really understand it and it takes regular practice to make good use of it. It’s my opinion that the time required is well spent because you then have a very powerful tool at your disposal—but, like all powerful tools, it can hurt the unskilled operator.
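To make the precedence point concrete, here is a small sketch that shows how the compiler reads that line next to the idiom with the extra parentheses. It deliberately avoids the network: fake_socket is a hypothetical stand-in for socket(2) that returns whatever value it is handed, and the function names as_parsed and as_intended are likewise made up for this illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for socket(2): returns the value it is given, so the
   effect of operator precedence can be observed without opening anything. */
static int fake_socket(int result)
{
    assert(result >= -1);  /* a descriptor, or -1 for failure */
    return result;
}

/* What `sd = fake_socket(result) != -1` means to the compiler: != binds
   tighter than =, so sd receives the outcome of the comparison. */
int as_parsed(int result)
{
    int sd = (fake_socket(result) != -1);
    return sd;
}

/* The idiom: the inner parentheses force the assignment to happen first,
   and only then is the stored value compared with -1. */
int as_intended(int result)
{
    int sd;
    if ((sd = fake_socket(result)) == -1)
        return -1;  /* the call failed */
    return sd;      /* the descriptor that was actually returned */
}
```

With a pretend descriptor of 3, as_parsed(3) yields 1 while as_intended(3) yields 3, which is exactly the difference between writing to the standard output and writing to the socket.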
Thu, 27 Apr 2006
Customers can drive you to distraction
Everybody who deals with customers knows that my title is a truism, but sometimes it can be interesting to examine a case. Recently, I told the story of a customer who was in a state because the power went out suddenly and they had been seriously inconvenienced as a result of lying to me about having finally got all their equipment protected by UPSes. Yesterday, they arranged for the electricians to come back to do properly what had been arranged so long ago. As instructed, they shut down everything while the electricians were on the premises. Late this afternoon, they finally rang me to ask what I was doing about getting everything going again. I was waiting to be told the electrical work had been done. They said they had emailed me yesterday. I asked why they had not followed up, or at least checked that the email had left their systems. This upset them, so I let it go, beyond telling them that they knew how to do these things and that they also knew how to check that their ADSL modem was functioning, which it wasn’t. To cut a very long story short, they said they were desperate to have all their machines running now that the UPS stuff was in hand. (I had told them that I’d only sanction one machine running until the work was done, because that one machine was indeed on its UPS.) So I said that we now needed to get a lamp or something else that didn’t draw too much current and go around to check that the new outlets were really connected to the UPS. At this point the customer went ballistic, saying that they’d already spent three days with most of their computers unavailable and they just couldn’t spare the time to do this silly checking. Of course, the sole reason they had a problem was their earlier refusal to follow the agreed plans and get all the machines protected by the various UPSes that sit there in their building for that purpose.
I then said, “OK, if you really want to power up the machines without us first checking the electrical work, all you need to do is to sit down for a minute and write me a quick email stating that you are happy with the electrical work and that you take full responsibility for anything that may happen in future if it turns out that the work was not done right.” “But that’s not fair,” protested the customer. So I explained that what wasn’t fair was the fact that they expected me to drop everything and jump through all kinds of hoops because they refused to follow my advice. I said that they could get the machines running in a couple of minutes if they sent me the email. And then, surprise! They decided that I’d better step one of their staff through the process of testing the wiring after all. This time the job had been done right, so now all their machines are humming away and I can be comfortable that the next power outage won’t result in me having to spend hours solving problems that should never happen. Of course, the owners of the business have gone home in a snit and they will be even more cranky tomorrow when they tell their staff to chase me up about other things that were put on hold while the crisis was fixed—as I told the staff who stayed behind to sort out the UPS testing, tomorrow is my wedding anniversary and I just won’t be willing to take their calls.
Fri, 21 Apr 2006
That went well
Got a call in the car this afternoon, with a customer in a panic and Friday peak hour traffic buzzing about me. Their building managers had announced they’d be “working on the power” over the weekend and gave them twenty minutes to prepare. According to the customer, they then immediately cut the power. (Based on what happened later, I think they probably got the twenty minutes and failed to act promptly.) But not to worry, since they had finally got around to doing the necessary wiring and now all their equipment was protected by their UPS devices. So I commented on their good fortune. And then discovered that they’d lied. Perhaps the wiring was done but they hadn’t moved the machines over, or perhaps I’m going to find out next week that the wiring wasn’t done at all. In any case, the machines were not protected by the UPS boxes that are sitting there humming away in their offices. Since it was Friday and since the building’s power will be on and off over the weekend, I told them to run around the offices immediately and yank the power cords from every bit of equipment and then call me back. I stressed the “immediate” bit several times. When they called back, about fifteen minutes later, they asked if they should keep pulling the cords now that the power was back on. Clearly, “immediate” had a different meaning for them than it has for me. Eventually, as the power went out again, and the machines all went down again in the middle of starting up, they got around to pulling the power cords. I told them we’d sort things out on Monday morning. Fortunately, they’re in the same time zone as me. But they have a shock coming. We’re going to get one machine, their main server, running and we’re going to prove that it’s connected to its UPS. And that’s all we’re going to do. I’m going to explain that none of the others will go back into service until they are connected to a UPS.
And I’ll make it clear that I will be able to verify this and that the deal is absolutely non-negotiable. We have stuffed about for years over this issue and I’m just not going to be the idiot who runs around in a panic because they are too lazy or too pig-headed or too stubborn about what they imagine are their own priorities to manage something so simple and so essential to their own well-being. They are going to be angry with me on Monday. That’s OK with me. But this time I’ve been handed a lever to use to make them do the only sane thing and I’d be remiss if I failed to get something done.
Face Time and Free Stuff
I am frequently asked questions by other software developers about the issues one faces in the process of moving away from a job into the scary world of running one’s own business. Today I saw a nice post by Christopher Hawkins of Cogeian Systems entitled Face Time and Free Stuff. You’ll have to scroll down a bit to get to it, as he seems to have combined two posts in one permalink for some reason. I think he has a lot of good ideas there and I recommend it to all the people who have been asking me questions.
Fri, 24 Mar 2006
Respect must be earned
I offended a customer today by failing to show respect for some really appalling software.
Wed, 01 Mar 2006
Somebody please kill all the Perl weenies
If ever there was a day when it was going to become unambiguously clear to me that all the “Perl programmers” need to be taken out and shot, today was that day. Sure, go ahead and use Perl if it’s the only hammer in your toolbox and you just need to whip up a quick personal script and can’t be bothered learning how to do things properly and haven’t got the time or money to get somebody else to do it properly. But please, all of you, just stop writing big programs in Perl and passing them off as serious software. My specific beef today is with the complete barking idiots who are responsible for SpamAssassin. Not only do they introduce command line incompatibilities for no good reason, but they make it barf over no-longer-supported options in such a way that a message gets delivered containing only the contents of the SMTP MAIL FROM and RCPT TO commands. No headers, no body, and no sign that there’s a problem. How utterly stupid. And then, when the command line is fixed, it barfs because the new version uses some alternate database storage for its Bayes stuff. And this barfage is even more brilliant. It spams the MTA’s logs with lengthy messages, but exits with a success code and this time outputs an entirely empty message. Even a child could work out the appropriate failure modes for such a tool. But not those idiots. After missing out on a chance to find out what was in the 350-odd messages that arrived while I upgraded my workstation, I can tell you that I was mightily pissed off. And it’s nice to see the Apache crowd have indeed brought something else into their fold that seems to match their approach to software. Be a shame if something good had wandered into that hole by mistake.
Sat, 04 Feb 2006
Seven Secrets of Successful Programmers
In a recent blog post, Lars Wirzenius took a swipe at a post entitled Seven Secrets of Successful Programmers by Duncan Merrion. Lars “found it to be simplistic enough to be inane” and I certainly agree with that assessment. And I mostly agree with his other criticisms of the original article. But then Lars goes on to give his own list and that’s always going to lead to people wanting to disagree, which is something I’m about to do. His fourth point is to develop debugging muscles. Lars suggests that this is difficult, since nobody teaches it and that does seem to be true, to some degree at least. But then he says “you’ll be spending most of your time doing it.” And that’s where we diverge. In my view, if you’re a “successful programmer”, then you’ll spend most of your time programming. It’s the unsuccessful programmers who spend most of their time debugging. So, if you’re one of the people who knows your debugger better than your editor, you really need to learn more about the practice of programming, from books like The Practice of Programming by Kernighan and Pike, or Bentley’s Programming Pearls or even a classic like The Elements of Programming Style by Kernighan and Plauger. That’s not intended to be an exhaustive list, of course. You need a more complete grounding than that and you need domain-specific knowledge as well. But, regardless of where you go to learn your craft, you need to develop your skills to a level where you hardly know how to drive the debugger because it’s so long since you last used it.
Sat, 31 Dec 2005
Technical writing is harder than people think
I’m always on the lookout for good technical writing, whether it be in books, magazines or online. I have several motivators for this:
It won’t surprise anybody that I find plenty of technical writing that fails to impress. I don’t write about everything I find that I don’t like—I’d rather be spending my time on things that interest and challenge me. And I have previously written about flawed technical papers and about the stupidity of treating C and C++ as interchangeable. But those items were written nearly 16 months ago, so it seems reasonable to revisit the themes again now. Right now, I’m particularly interested in looking at the technical book publishers—I have a couple of books planned [1] and I’m thinking about how to pitch them to the publishers. So I was pleased to see an article in Linux Journal entitled The Arnold Robbins Book Series: A Review. The review covers the recent Open Source Development Series, edited by Arnold Robbins and published by Prentice Hall. It incorporates an interview with Arnold Robbins and reviews of two books in the series. My interest is in the series as a whole and one of the books, Linux Programming by Example: The Fundamentals, also written by Arnold Robbins. After skimming the interview and the review, I was sufficiently interested to have a look at the book. Although the Prentice Hall link above claims the book is available on Safari Books Online, it was not available via any of my Safari subscriptions. However, Prentice Hall do offer a sample chapter online. Since that chapter covers memory management, one of the three basics that almost all books covering C programming get wrong, I thought I’d have a look. Memory management is not only one of the big three for technical errors, but it is also frequently associated with another of the big three—imagining that it’s possible to write usefully about both C and C++ in the same place. And so it is here. As so often happens, Robbins recommends unwarranted casts in his discussion of malloc(3). 
Oddly enough, he incorporates some sample code from Geoff Collyer that eschews the cast and comments unfavourably on that, rather than learning from it. This is not the place for a detailed explanation of the reasons why a cast is useless and potentially harmful, but it is worth mentioning that in C (as distinct from C++, a quite different and arguably significantly inferior language) use of the cast operator is almost always a sign of programmer ignorance. It is rarely needed in correct code, and the circumstances where it is needed are clearly understood by competent practitioners. Part of the problem is caused by people who dabble in both C and C++ and who have fallen for the strange propaganda from the C++ zealots that C++ is an “improved C”. It’s not. As a wild guess, I’m willing to bet that the reason for the requirement for all the casts in C++ is another consequence of the fact that Stroustrup, despite purporting to write a successor to C, never managed to master that simple and elegant language. (Last time I speculated on something like that, the interested party wrote to me and put me straight; expect a correction here if that happens this time around.) Of course, Arnold Robbins is not the only person who has made this mistake of thinking it’s a good idea to cast every pointer returned by malloc. The mistake was most famously made in the second edition of the C bible, known to all as K&R2. When I first pointed out the error to Dennis Ritchie, he defended the book; but, in the face of my persistent nagging, he re-thought it and eventually decided that the book was wrong. This is now documented in the Errata for The C Programming Language, Second Edition. (Search for the references to malloc on pages 142 and 167 for the details.) Returning to the book, the other concerns I had were that the author spent a lot of time discussing things that he suggested (quite rightly, in general) that you should not do, even with some lengthy and detailed example code.
This seems pointless to me. And he failed to explain reasons for other things he recommended. For instance, he suggests that it’s always a good idea to zero newly allocated memory. (In fact, I don’t agree with the “always” part, but it is often true.) But he fails to explain either of the fundamental reasons why it’s sometimes a good idea. That seems unforgiveable. The final issue from the sample chapter is more difficult to be sure about, because it’s impossible for me to read this book with the same state of mind as its intended audience. But, from my reading, it lacked bite. I think I’d have wanted something that stated its goals clearly, tackled the useful issues head on, explained good practice and equally explained the reasons for the possible pitfalls. This chapter seemed weak in those areas—and the specific failings I described certainly leave me disinclined to recommend the book or the series (although I’d have to look at some of the other books before I came out strongly against the whole series). [1] As a renowned procrastinator, I may well never get around to completing these books; but I like to prepare for things in case I do go on to finish them.
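For readers who want to see the cast-free form of allocation discussed in this post, here is a minimal sketch. The struct item type and the function names items_alloc and items_alloc_zeroed are hypothetical, invented for illustration; none of this is taken from the Robbins book or from K&R2. It shows malloc(3) used without a cast, with the size taken from the pointer itself, and calloc(3) as one way to get zero-filled memory when that is wanted.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record type, used only for this illustration. */
struct item {
    int id;
    int count;
};

/*
 * Allocate an array of n items without casting malloc's result.
 * In C, void * converts implicitly to any object pointer type, and
 * sizeof *p keeps the element size tied to the pointer's own type.
 * The contents are indeterminate until the caller stores into them.
 * Returns NULL on failure or if n * sizeof *p would overflow.
 */
struct item *
items_alloc(size_t n)
{
    struct item *p;

    if (n > SIZE_MAX / sizeof *p)
        return NULL;
    p = malloc(n * sizeof *p);
    return p;
}

/*
 * As above, but the memory is set to all bits zero by calloc, so the
 * int members of every element start out as 0.
 */
struct item *
items_alloc_zeroed(size_t n)
{
    struct item *p;

    p = calloc(n, sizeof *p);
    return p;
}
```

A caller checks the result against NULL before use and releases it with free(3) when done.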
Fri, 04 Nov 2005
Shell horrors
I borrowed my title from Sarah’s story, but I’m not sure I share the angst. Of course, since we don’t get to see the “22 lines of code including if’s and for’s”, it’s impossible to know if there is reason for concern beyond the obvious: what happens to the output file if we bail in the middle because of a bug or unexpected data? As a general matter, wrapping up relatively lengthy script sections inside a loop or a sub-shell and collecting all the output into a single redirection or pipe is common shell programming practice. Whether it’s advisable depends on many factors:
There are more questions, but that list gives the flavour of the ones that would need to be addressed. My own shell horror from yesterday was a script to start a binary that built a loop with:
while /bin/true ; do
    # stuff
done
What’s so bad about that? It’s bad because the installer didn’t bother to discover if there was a /bin/true, but just blindly assumed it. (I will say that the decision by the FreeBSD folk to move true to /usr/bin—which happened many years ago—seems to me to have been an error, but it happened.) And it’s bad because the “standard” idiom is so much better for several reasons:
while : ; do
    # stuff
done
This is faster and neater and it will always work—certainly on any sh that came out in the last 20 years (and that’s got to be good enough). (I’ve run out of time to research just when “:” first appeared, but it’s in at least one 1987 reference next to my desk.)
Tue, 04 Oct 2005
del.icio.us
Well, I’ve finally got with the program and signed up with del.icio.us.
Thu, 30 Jun 2005
Ruby is not the way
When I was a kid, I loved my mother’s engagement ring with its brilliant ruby surrounded by a cloud of tiny diamonds. Ever since, I’ve had a soft spot for rubies. And I was predisposed favourably towards the Ruby language when it appeared, in particular because at least one person I respected wrote some words of praise about it a few years ago. I thought that person was Henry Spencer, but my googling can’t find the article now, so either I misremembered or it was in a print medium that has escaped Google. Recently, I’ve been playing with some software written in Ruby and found that it met my needs rather well, modulo some minor modifications I’d need to make to fit it into my way of working. And I’d also read quite a bit of praise for the language and its “Ruby on Rails” web framework which had made me consider it as a possible tool for some new development I wanted to do. So, as is my way, I went looking for books about Ruby and found two on the web. I’ve now read one and about half of the other one. I’ll probably finish my reading, but I don’t see myself adopting Ruby for my own work and I don’t want to use software written in Ruby either. One of the great crimes of the Perl push was the idiotic mantra “there’s more than one way to do it.” And the Ruby bunnies seem to have taken this to extreme lengths, only to end up with a language that cannot be parsed reliably by human readers. Regardless of practices that people may have adopted in the distant past, in today’s world the single most important thing in software development is simple, clear, obvious code: code that anybody with an appropriate background can just read and understand. Consistency is an important factor in readability, not just in the usual areas of white space and indentation, but in the overall syntax of the language. And a language that encourages variant syntax (or even allows it) is just a menace in terms of reliable code and practical maintenance.
I’m going to give just one example, but there are many similar cases in Ruby. The syntax for a function call is free (except where it might confuse the parser, meaning that the programmer has to know far too much about the internals of the parser for comfort). So you might have a call to the foo function in the common form:
foo(bar, baz)
But you can drop the parentheses if you like and if (insert some complex and rather ridiculous set of rules here). I might have mis-described that slightly, and can’t be bothered checking the syntax now, but the bit about the optional parentheses is correct as is the bit about bizarre rules for when you might need them. And, even if your call is right today, a modification elsewhere in the file might make the code wrong tomorrow. When you think how simple it would have been to declare a single, simple and unambiguous syntax for a function call, this kind of design just makes me weep. Maybe I’ll have to learn to like Python more (or Scheme or Common Lisp or Erlang)…
Wed, 25 May 2005
Firefox still has issues
While gathering the facts for my previous post, I opened a new Firefox window with a few tabs to the articles of interest. Shortly afterwards, my gkrellm monitors started yelling at me about my CPU temperature rising. It had suddenly jumped to 42C (from the 29 to 30 that it’s running now). A quick look at a top display also showed that we were running at about 97% CPU utilisation (where 5% would be normal). Those figures remained until I closed the extra Firefox window and now everything is back to normal. At least things did go back to normal and I was able to leave Firefox running, so this is better than earlier releases, although it could be better. In fairness, I’m not running the latest release today. This is 1.0.2 and I do have 1.0.4 and would normally be using it, but this particular instance of 1.0.2 has been running since 11 April, so I just put it to use out of laziness.
Mon, 07 Mar 2005
Too many ways to do it
As a longstanding member, I generally read the Usenix magazine. The February 2005 issue contains an article entitled Error Handling Patterns in Perl, which I thought might be interesting. Although I don’t use Perl myself (for reasons that I have frequently enumerated and won’t repeat here), I’m quite happy to read about good programming practices, regardless of the language chosen for exposition, because ideas usually translate to other languages as well. If one were to write an article that was intended to show the superiority of one language over another, it would make sense to provide examples in both the languages, unless one was Bjarne Stroustrup writing one of his silly things purporting to show the superiority of C++ over C. But, where the objective is to demonstrate good practice in one language, it would be wise to restrict oneself to examples in that language rather than illustrating one’s lack of knowledge of other languages. In what follows, I’m going to assume the Perl is correct on the basis that the author claims to know Perl, but the correctness of the Perl has no impact on what I’m about to say. To get things started, the author gives an example of what he thinks is the C idiom for opening a file (slightly cleaned up to avoid confusing the issue with extraneous material):
FILE *fp;
fp = fopen("/my/file", "r");
if (fp == NULL)
    return -1;
Then he explains that you can do the same thing, using the same idiom, in Perl. The example code is obvious, so I won’t repeat it here. Then he explains why a simpler idiom might be useful and he gives the following Perl code as an illustration of a better way to do it:
open(my $fh, "/my/file") or die "Cannot open '/my/file'\n";
Fair enough. But it’s just silly to try to make that look better than the C code when anybody who is capable of getting a job as a C programmer would have used the real C idiom below:
if ((fp = fopen("/my/file", "r")) == NULL)
    err(1, "cannot open '/my/file'");
Clearly, that is identical to the Perl example. Why make a fuss about this? Had the point been “don’t do it the way we showed in the first Perl example,” that would have made some kind of sense. But the real point was obscured by the silly pretence of showing something that purported to be better than the C way, making this a choice of language issue rather than simple good practice independent of language. Perhaps it doesn’t really matter all that much, but I find it difficult to take seriously somebody who takes the time to give example code in a language he doesn’t use without making the slightest effort to discover if what he’s saying is plain silly. And that makes me doubtful about the value of the rest of his paper, which might just have some good stuff in it. As it happens, I did read the rest of the paper because I still had coffee in my cup. That’s when I came across some of the real reasons why I hate Perl (and provided my title for this rant). There really are too many ways to do things, and some of them affect you even if you don’t specifically choose to use them, being built in to the language. Nobody is forced to use C’s ternary operator, and many people just don’t use it. Everybody who uses C understands the concepts of true and false as expressed in the language. But Perl has so many ways of “helping” you that it also has a million ways of expressing false, some of which work all the time and some of which don’t. How silly is that?
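The combined open-and-check idiom from this post can be sketched as a small self-contained function. This is an illustration under stated assumptions, not code from the article being discussed: open_or_complain is a hypothetical name, and unlike err(3) (a BSD extension that is not part of ISO C) it reports the failure and returns NULL instead of exiting, so that a caller can decide what to do next.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Open path for reading, doing the call and the check in a single
 * expression. On failure, print a message to stderr and return NULL.
 * Assumption: fopen sets errno on failure, which POSIX requires but
 * ISO C does not guarantee.
 */
FILE *
open_or_complain(const char *path)
{
    FILE *fp;

    if ((fp = fopen(path, "r")) == NULL)
        fprintf(stderr, "cannot open '%s': %s\n", path, strerror(errno));
    return fp;
}
```

The test against NULL is also an instance of C’s single notion of truth: a null pointer or a zero value is false and anything else is true, so the condition could equally be written with the ! operator.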
Mon, 03 Jan 2005
Testing software
Some Humbug bloggers have recently discussed software testing and made some interesting points. I’m keen to play devil’s advocate on some of these issues and will do so, but perhaps not for another week; by then, my study’s grand re-organisation should be finished and it might be possible for me to find a few minutes to put together some sort of case.