Perl vs. Java code

2003-03-17

Rafe Colburn explains why Algol-like languages are far superior to Perl for working on large scale, multi-programmer, long-term projects. I'd go further. If you use an outliner to edit your source code, his multi-line Java example shrinks down to one line, just like his Perl example. If you don't program in an outliner I'm sure you have no idea what I just said. If you do, you're probably chortling and guffawing and pointing at the screen saying "See what I said." [Scripting News]

For what it's worth, both the Perl and Java in the linked article are wrong. The correct solution in both cases is to write a function "dirDepth" or something that takes a path and counts the depth, then call that function. The Java one may be more complicated-looking, but that's OK. The original Blosxom solution unnecessarily ties the program to the UNIX platform, which is the only one you can depend on for '/' to be the path delimiter with no exceptions.

The reason for this is that the code then becomes depth = dirDepth(dir), which is clear in both languages.

Once the intent is clearly labelled by the function name, the code calling the function becomes self-documenting, and the next person to work with the code can much more easily figure out what is going on. This helps them fix bugs in both the caller and the function (for instance, the algorithm being used here requires an absolute path to work right; what happens if someone passes a relative one?) and perhaps helps make it cross-platform.
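To make that concrete, here is a minimal sketch of what such a function might look like. It is written in Python rather than Perl or Java purely for brevity, and it is my own illustration, not the code from either linked article. The name dir_depth (the dirDepth above), the flavour parameter, and the choice to reject relative paths outright are all assumptions.

```python
from pathlib import PurePosixPath, PureWindowsPath


def dir_depth(path, flavour=PurePosixPath):
    """Count the directory levels below the root of an absolute path.

    `flavour` is a hypothetical knob for this sketch: pass PureWindowsPath
    to parse Windows-style paths instead of POSIX ones.
    """
    p = flavour(path)
    if not p.is_absolute():
        # The depth of a relative path depends on where it is resolved from,
        # so this sketch refuses to guess.
        raise ValueError(f"dir_depth needs an absolute path, got {path!r}")
    # parts[0] is the root ("/" or "C:\\"); everything after it is one level.
    return len(p.parts) - 1


depth = dir_depth("/blog/2003/03")                         # 3
same = dir_depth("/blog/2003/03/")                         # 3, trailing slash ignored
windows = dir_depth(r"C:\blog\2003", PureWindowsPath)      # 2
```

The caller still reads as depth = dir_depth(dir), and the relative-path question and the delimiter question now each have exactly one place to be answered.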

This is indeed a great example of why Perl is such a disaster for multi-programmer use, and I strongly disagree with Rafe's last throwaway comment that if he sees Perl code that doesn't use the Perlish idioms, it must have been written by a bad programmer. If you see a significantly sized program written without many Perl idioms, it is more likely that the developer was aware of the pitfalls of using those idioms. I remember seeing one Slashdot poster who was proud that he had written some Perl modules totalling 4000 lines of code. I have one module at work that totals that alone. There's no easy way to break it up, it's a critical part of the system, and I'm leaving in no more than six months. Writing that much code in cutesy Perl idioms would leave me unable to debug it, let alone the next guy who has to fix it, even if it might cut the apparent code size down by a factor of 2 or 3. Writing everything explicitly also lets me avoid problems with implicit variables leaking around, and when I cut and paste a snippet of code into a new location that's a few lines forward or back, I don't have to worry about the implicit variables at all.

Generally it's a good idea to learn the idioms of a language and use them in preference to longer ways of saying the same thing, but Perl is actually an exception to this rule. There are so many idioms, often so many for the same thing, that you simply can't assume that the next person reading the code will understand what you did. Even running through it with a debugger can leave the code opaque because each line does so much (been here!). Perl should automatically turn off implicit variables if the program is big ("use strict" should probably do it), and having tr return the count is a real hack. (Of course, the whole scalar/array dichotomy is a real hack.)

As for an outliner, I totally agree. The only problem I've seen with outlining is that you tend to write the code inline, instead of wrapping it in a function for re-use later. UserLand code has a lot of problems with this; take a random function in Radio and do an "Expand All", and it's likely to be hundreds of lines long. Now, the ability to collapse does indeed render this code comprehensible, so the usual objection to long functions on the basis of comprehensibility doesn't really apply. But if you're not careful in the Frontier environment, you end up with too many useful functions firmly embedded into the scripts when they should have been separate, which is a problem when an external developer like me comes along and says, "Gee, I'd like to have that code snippet there, but it's embedded and I can't get at it." Again, been here several times. (Copy and paste is generally the only solution, but then you're responsible for it; the UserLand updates don't affect your code.)

An interesting re-factoring tool for strongly-structured languages like Python or UserTalk would be to highlight a program node and click a menu option that says "Make This a Function", which automatically looks at all the variables the code uses, wraps them into a call, tries to determine the "return value" or "values", and automatically replaces the code with the right function call. I'll have to remember that.
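As a rough sketch of the analysis such a menu option would need, here is a toy version in Python using the standard ast module. Everything in it is hypothetical: make_function and its used_after argument are names I made up, a real tool would work out from the surrounding code which variables are still needed afterwards, and this version only copes with straight-line statements (no loops, nested functions, or attribute tricks).

```python
import ast
import builtins
import textwrap


def make_function(snippet, name, used_after):
    """Wrap `snippet` in a function; return (function source, replacement call).

    `used_after` is the set of variable names the rest of the program still
    reads after the snippet. This sketch takes it as given.
    """
    snippet = textwrap.dedent(snippet).strip()
    params, assigned = [], []
    for stmt in ast.parse(snippet).body:
        # "x += 1" both reads and writes x, so remember augmented targets.
        aug_targets = {id(n.target) for n in ast.walk(stmt)
                       if isinstance(n, ast.AugAssign)}
        names = sorted((n for n in ast.walk(stmt) if isinstance(n, ast.Name)),
                       key=lambda n: (n.lineno, n.col_offset))
        # A name read before the snippet assigns it must come from outside.
        for n in names:
            is_read = isinstance(n.ctx, ast.Load) or id(n) in aug_targets
            if (is_read and n.id not in assigned and n.id not in params
                    and not hasattr(builtins, n.id)):
                params.append(n.id)
        for n in names:
            if isinstance(n.ctx, ast.Store) and n.id not in assigned:
                assigned.append(n.id)

    # Only values the caller still needs become return values.
    returns = [n for n in assigned if n in used_after]
    lines = [f"def {name}({', '.join(params)}):",
             textwrap.indent(snippet, "    ")]
    call = f"{name}({', '.join(params)})"
    if returns:
        lines.append("    return " + ", ".join(returns))
        call = f"{', '.join(returns)} = {call}"
    return "\n".join(lines), call


source, call = make_function(
    """
    subtotal = price * qty
    tax = subtotal * rate
    total = subtotal + tax
    """,
    "compute_total",
    used_after={"total"},
)
print(source)
print(call)   # total = compute_total(price, qty, rate)
```

The variables read before being assigned become the parameters, and the assigned variables the caller still needs become the return values, which is about all the "Make This a Function" idea requires for simple code.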