Приглашаем посетить
Шолохов (sholohov.lit-info.ru)

Section 6.8.  Iterator Variables

Previous
Table of Contents
Next

6.8. Iterator Variables

Use named lexicals as explicit for loop iterators.

From a readability standpoint, $_ is a terrible name for a variable, especially for an iterator variable. It conveys nothing about the nature or purpose of the values it stores, except that they're currently being iterated in the innermost enclosing loop. For example:

    for (@candidates) {
        if (m/\[ NO \] \z/xms) {
            $_ = reconsider($_);

            $have_reconsidered{lc(  )}++;
        }
        else {
            print "New candidate: $_\n";

            $_ .= accept_or_reject($_);

            $have_reconsidered{lc(  )} = 0;
        }
    }

This piece of code starts off well enough: "For each of these candidates, if it[*] matches a certain pattern...". But things go downhill very quickly from there.

[*] In Perl, the best way to pronounce $_ is usually "it".

On the third line, the call to lc has its argument omitted, so the function defaults to using $_. And the maintainability of the code immediately suffers. Whoever wrote the code obviously knew that lc defaults to $_ in this way; in fact, that's probably part of the reason they used $_ as the loop iterator in the first place. But will future maintainers of the code know about that default behaviour? If not, they'll have to look up lc to check, which makes their job just a little harder. Unnecessarily harder.

The usual reply at this point is that those maintainers should know Perl well enough to know that lc defaults to lowercasing $_. But that's the crumbling edge of a very slippery slope. Which of the following built-in functions also default to $_?


    abs          close           printf        sleep
    chdir        die             require       -t
    chroot       localtime       select        -T

Even if you knew[Section 6.8.  Iterator Variables], and were confident that you knew, are you equally confident that your teammates know?

[Section 6.8.  Iterator Variables] Of the 12 builtins listed, only abs, chroot, require, and -T default to operating on $_.

Taking advantage of the default behaviour of any builtin forces the reader to "fill in the gaps", and therefore makes the resulting code harder to read. And much harder to debug. Saying explicitly what you mean takes only a little extra coding efforteffort that is repaid many times over when the code is being maintained.

Meanwhile, the example gets worse:

    else {
        print "New candidate: $_\n";
        $_ .= accept_or_reject($_);

        $have_reconsidered{lc(  )} = 0;
    }

Because $_ isn't a meaningful name in the context, it's hard to tell at a glance exactly what information is being printed by the first line of the else block. Likewise, it's impossible to tell what's being accepted or rejected by the following line, or what the result is being appended to.

All the problems of maintainability described so far can be eliminated simply by giving the iterator variable a human-readable name:


    for my $name (@candidates) {
        if ($name =~ m/\[ NO \] \z/xms) {
            $name = reconsider($name);

            $have_reconsidered{lc($name)}++;
        }
        else {
            print "New candidate: $name\n";

            $name .= accept_or_reject($name);

            $have_reconsidered{lc($name)} = 0;
        }
    }

In this version it's clear that the name of each candidate in the array is matched against a pattern. If the match succeeds, then the candidate's name has been (case-insensitively) reconsidered one more time. Otherwise, something is appended to their name, and it's noted that the name has not yet been reconsidered.

Now the reader doesn't need to remember default behaviours, because the code can't use them (as the iterator variable is no longer $_). Nor is there any need to mentally keep track of what the iterator variable actually contains, because it now has an explicit name that describes its contents.

Note that the recommendation to avoid $_ applies even if $_ is entirely implicit in the loop, and regardless of the type of loop:

    while (<$bookmarks>) {
        if (m/phoenix|firebird/xms) {
            s/\s* \z/ (see firefox)\n/xms;
        }
        print;
    }

For maintainability, that's still better written as:


    while (my $record = <$bookmarks>) {
        if ($record =~ m/phoenix|firebird/xms) {
            $record =~ s/\s* \z/ (see firefox)\n/xms;
        }

        print $record;
    }

    Previous
    Table of Contents
    Next