4.8 Line Editing


Line editing is not a process of combining statements to fit as many as possible on each line. It is the conscious conversion of lengthy code into something shorter (or, in certain cases, longer) that is at least as clear to you as the original. This can be as simple as replacing an if (! expr) with unless (expr), converting lengthy manual output formatting to use printf(), or something more complicated. Here are some opportunities to look out for when performing line editing.
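As a rough illustration of the two simple cases just mentioned, the sketch below builds the same output twice: once with a negated if and hand-built padding, and once with unless and a format string. The %stock data and the threshold of 10 are made up for the example, and sprintf() stands in for printf() so that the two results can be compared.

```perl
use strict;
use warnings;

my %stock = (apples => 3, pears => 12);   # hypothetical sample data

# Before: negated condition and manual column padding.
my @before;
for my $item (sort keys %stock)
{
  if (!($stock{$item} > 10))
  {
    push @before, $item . (' ' x (10 - length($item))) . $stock{$item};
  }
}

# After: unless plus a format string that does the padding.
my @after;
for my $item (sort keys %stock)
{
  push @after, sprintf("%-10s%d", $item, $stock{$item})
    unless $stock{$item} > 10;
}

print "$_\n" for @after;
```

Both versions produce the same lines; the second states the condition positively and leaves the column arithmetic to the format.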

4.8.1 Needless Repetition

This principle is so fundamental that it is known by an acronym: DRY (Don't Repeat Yourself; see [HUNT00], p. 27). Any repetitive pattern in code should trigger the urge to see if you can elide it. Perl provides a plethora of tools that help in these situations; be especially familiar with:

  • foreach loops and their aliasing property: If you modify the loop variable while it points to a writable variable, that variable will be updated. Try this test:

    
    my $x = 5;
    my @y = 6..8;
    for my $z ($x, @y, 17)
    {
      $z++ if $z < 10;
    }
    print "$x @y\n";
    
    

    and then see what happens if you remove the test if $z < 10.

    This feature of foreach loops even helps you shorten code when there's no loop to speak of, if you have multiple references to a long variable name. Instead of writing:

    
    $music{$artist}{MP3} =~ s#\\#/#g;
    $music{$artist}{MP3} = "$MUSICPATH/$music{$artist}{MP3}";
    
    

    you can say:

    
    for ($music{$artist}{MP3})
    {
      s#\\#/#g;
      $_ = "$MUSICPATH/$_";
    }
    
    
  • map() and how it lets you transform one list into another; the output list doesn't even have to have the same number of elements as the input list.

  • How to write subroutines and how to implement named parameter passing for clarity, for example:

    
    make_sundae(nuts => 'pecan', cherries => 2);
    ...
    sub make_sundae
    {
      my %arg = (%DEFAULT_SUNDAE, @_);
      if ($arg{bananas}) ...
    }
    
    

    Beginners often don't realize that you can pass hashes around as parts of lists and reconstitute them from arrays.
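The map() and named-parameter items in the list above can be sketched as follows. This is only an illustration: the word list, the contents of %DEFAULT_SUNDAE, and the string that make_sundae() returns are assumptions, since the original shows none of them.

```perl
use strict;
use warnings;

my @words = qw(apple fig kiwi);

# More outputs than inputs: each word yields a (word, length) pair.
my %length_of = map { ($_ => length($_)) } @words;

# Fewer outputs than inputs: returning () drops an element.
my @long = map { length($_) > 3 ? uc($_) : () } @words;

# Hypothetical defaults; the original does not show %DEFAULT_SUNDAE.
my %DEFAULT_SUNDAE = (scoops => 2, nuts => 'none', cherries => 1);

sub make_sundae
{
  # Later pairs win, so the caller's arguments override the defaults.
  my %arg = (%DEFAULT_SUNDAE, @_);
  return join ', ', map { $_ . '=' . $arg{$_} } sort keys %arg;
}

my $order = make_sundae(nuts => 'pecan', cherries => 2);

print "@long\n";    # APPLE KIWI
print "$order\n";   # cherries=2, nuts=pecan, scoops=2
```

Because a hash in list context flattens to key/value pairs, putting the defaults first and @_ second is enough to let the caller override any subset of them.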

4.8.2 Too Many Temporary Variables

The less a language intrudes between your understanding of a problem and your expression of it in a program, the better. Temporary variables—also known as synthetic variables—are there just because the program required them. They don't actually have any meaning in the language of the problem. Any variable with "temp" in its name is likely a temporary variable (or a temperature). Here's an example of a temporary variable I just inherited:


if ($!)
{
  my $msg = "$? $!";
  print ERRFILE "$0: $msg\n";
}

Clearly $msg is not used for anything else (because it immediately goes out of scope), so why not save a line:


if ($!)
{
  print ERRFILE "$0: $? $!\n";
}

Now we can see that it could be shortened still further, with either a postfixed if:


print ERRFILE "$0: $? $!\n" if $!;

or a logical and:


$! and print ERRFILE "$0: $? $!\n";

Sometimes you can avoid temporary variables by crafting a humongous statement involving many function calls and have it still make sense. If you can think of the statement as a pipeline, this is a case where Perl's syntactic generosity in making parentheses on function calls optional can be most helpful. Just put line breaks in the right places. A classic example of this is the Schwartzian Transform (see [HALL98], p. 49):[9]

[9] See http://groups.google.com/groups?selm=4b2eag%24odb%40csnews.cs.colorado.edu for the original reference.


@sorted = map  { $_->[0] }
          sort { $a->[1] <=> $b->[1] }
          map  { [ $_, func($_) ] }
               @unsorted;

That would look uglier if parentheses were mandatory:


@sorted = map ({ $_->[0] }
               sort({ $a->[1] <=> $b->[1] }
                    map ({ [ $_, func($_) ] }
                         @unsorted
              )));
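To make the Schwartzian Transform above concrete, here is a minimal runnable sketch. The word list is made up, and func() is a stand-in for whatever expensive key computation you actually need (here it just returns the string length and counts its own calls).

```perl
use strict;
use warnings;

my @unsorted = qw(banana fig pear apple);   # hypothetical input

my $calls = 0;
sub func
{
  $calls++;                 # count how often the key is computed
  return length $_[0];      # stand-in for an expensive computation
}

my @sorted = map  { $_->[0] }                # 3. unwrap the original values
             sort { $a->[1] <=> $b->[1] }    # 2. compare the cached keys
             map  { [ $_, func($_) ] }       # 1. pair each value with its key
                  @unsorted;

print "@sorted\n";   # fig pear apple banana
print "$calls\n";    # 4
```

The key is computed once per element rather than once per comparison, which is the point of the transform; no named temporary array is needed to hold the pairs.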

A pipeline is a natural way to think of the act of using successive functions and operators to transform a list. Here's a really long example for finding particular subdirectories of a directory $dir:


1  opendir DIR, $dir or die "Cannot open $dir: $!";
2  my @kids = map  File::Spec::Link->resolve($_)
3          => grep -d
4          => map  { tr/\n/?/
5                  ? do { warn "Embedded N/L at ? in $_\n"; () }
6                  : $_ }
7             map  "$dir/$_"
8          => grep !/^(?:\.\.?|RCS)$/
9          => readdir DIR;

With some creative indentation, this is quite easy to read. Some people prefer to read it from top to bottom; most people understand it better when read from bottom to top, that being the order of execution:

9  Return the list of files in $dir.
8  Ignore the current and parent directories, and any file or directory called RCS.
7  Form the full path to the file.
4  Ignore files with embedded newlines, producing a warning instead and substituting an empty list at this point in the pipeline.
3  Select just directories.
2  Use a CPAN module to resolve any symbolic links in the filename.[10]

[10] http://search.cpan.org/dist/File-Copy-Link/

Yes, newlines are valid characters in UNIX filenames, but they can give some utilities heartburn; this example was designed to protect those utilities while reporting such filenames.
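If you want to experiment with this kind of pipeline without installing anything, the following is a cut-down sketch that uses only core modules. It is not the author's code: it builds a throwaway directory with File::Temp, uses a lexical directory handle, writes every stage in block form, and leaves out the File::Spec::Link step and the embedded-newline check. The directory and file names are invented.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory with three subdirectories and one plain file.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(src docs RCS))
{
  mkdir "$dir/$name" or die "Cannot create $dir/$name: $!";
}
open my $fh, '>', "$dir/notes.txt" or die "Cannot create notes.txt: $!";
close $fh;

opendir my $dh, $dir or die "Cannot open $dir: $!";
my @kids = grep { -d $_ }                    # keep just directories
           map  { "$dir/$_" }                # form the full path
           grep { !/^(?:\.\.?|RCS)$/ }       # skip ., .., and RCS
           readdir $dh;                      # every entry in $dir
closedir $dh;

print "$_\n" for sort @kids;
```

Read from the bottom up, each stage takes the list produced by the one below it, just as in the longer example.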

4.8.3 Not Enough Temporary Variables

Conversely, writing obfuscated code to avoid creating a temporary variable is no savings, especially if there is some kind of meaningful name you can assign to that variable. For instance:


push @{$person{$ssn}{PHONES}},
  substr($dbh->selectcol_arrayref("SELECT raw_text FROM logs \
                                   WHERE SSN = '$ssn'")->[0],
         $offsets[$WORK_PHONE],
         $offsets[$WORK_PHONE+1] - $offsets[$WORK_PHONE]);

Even creative indentation hasn't helped enough. That doesn't mean that every compound expression needs to be replaced by a temporary variable, nor that statements need to be constrained to fit on single lines. Here's one way of making that code clearer:


{
  my $sql = "SELECT raw_text FROM logs WHERE SSN = '$ssn'";
  my $record = $dbh->selectcol_arrayref($sql)->[0];
  my $offset = $offsets[$WORK_PHONE];
  my $length = $offsets[$WORK_PHONE+1] - $offset;
  push @{$person{$ssn}{PHONES}},
       substr($record, $offset, $length);
}

Notice how the temporary variables are declared in a block that limits their scope.
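The same refactoring can be tried without a database. In the sketch below, the record, the @offsets table, and the key in $ssn are all invented; the record is assumed to be fixed-width, with the work phone occupying the second field.

```perl
use strict;
use warnings;

# Hypothetical fixed-width record: name (10 chars), work phone (12), home phone (12).
my $record     = 'Jane Doe  555-867-5309555-123-4567';
my @offsets    = (0, 10, 22, 34);   # start of each field, plus the end of the record
my $WORK_PHONE = 1;                 # index of the work phone field
my $ssn        = '000-00-0000';     # placeholder key
my %person;

{
  # These names exist only inside the block.
  my $offset = $offsets[$WORK_PHONE];
  my $length = $offsets[$WORK_PHONE+1] - $offset;
  push @{$person{$ssn}{PHONES}},
       substr($record, $offset, $length);
}

print "$person{$ssn}{PHONES}[0]\n";   # 555-867-5309
```

Under use strict, referring to $offset or $length after the closing brace is a compile-time error, so the temporaries cannot leak into the rest of the program.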
