Section 9.3.  Sorting Efficiently

Table of Contents

9.3. Sorting Efficiently

As the Professor tries to maintain the community computing facility (built entirely out of bamboo, coconuts, and pineapples, and powered by a certified Perl-hacking monkey), he continues to discover that people are leaving entirely too much data on the single monkey-powered filesystem and decides to print a list of offenders.

The Professor has written a subroutine called ask_monkey_about( ), which, given a castaway's name, returns the number of pineapples of storage they use. We have to ask the monkey because he's in charge of the pineapples. An initial naive approach to find the offenders from greatest to least might be something like:

my @castaways =
  qw(Gilligan Skipper Professor Ginger Mary_Ann Thurston Lovey);
my @wasters = sort {
  ask_monkey_about($b) <=> ask_monkey_about($a)
} @castaways;

In theory, this would be fine. For the first pair of names (Gilligan and Skipper), we ask the monkey "How many pineapples does Gilligan have?" and "How many pineapples does Skipper have?" We get back two values from the monkey and use them to order Gilligan and Skipper in the final list.

However, at some point, we have to compare the number of pineapples that Gilligan has with another castaway as well. For example, suppose the pair is Ginger and Gilligan. We ask the monkey about Ginger, get a number back, and then ask the monkey about Gilligan...again. This will probably annoy the monkey a bit, since we already asked earlier. But we need to ask for each value two, three, or maybe even four times, just to put the seven values into order.

This can be a problem because it irritates the monkey.

How do we keep the number of monkey requests to a minimum? Well, we can build a table first. We use a map with seven inputs and seven outputs, turning each castaway item into a separate array reference, with each referenced array consisting of the castaway name and the pineapple count reported by the monkey:

my @names_and_pineapples = map {
  [ $_, ask_monkey_about($_) ]
} @castaways;

At this point, we asked the monkey seven questions in a row, but that's the last time we have to talk to the monkey! We now have everything we need to finish the task.

For the next step, we sort the arrayrefs, ordering them by the monkey-returned value:

my @sorted_names_and_pineapples = sort {
  $b->[1] <=> $a->[1];
} @names_and_pineapples;

In this subroutine, $a and $b are still two elements from the list of things to be sorted. When we're sorting numbers, $a and $b are numbers; when we're sorting references, $a and $b are references. We dereference them to get to the corresponding array itself, and pick out item 1 from the array (the monkey's pineapple value). Because $b appears to the left of $a, it'll be a descending sort as well. (We want a descending sort because the Professor wants the first name on the list to be the person who uses the most pineapples.)

We're almost done, but what if we just wanted the top names, rather than the names and pineapple counts? We merely need to perform another map to transform the references back to the original data:

my @names = map $_->[0], @sorted_names_and_pineapples;

Each element of the list ends up in $_, so we'll dereference that to pick out the element 0 of that array, which is just the name.

Now we have a list of names, ordered by their pineapple counts, and the monkey's off our backs, all in three easy steps.

Table of Contents
© 2000- NIV