Exercise: Cleaning Up Input Data

The "blind" substitutions in the preceding example (where a substitution is made but its return value isn't checked) are common when you're trying to cook data. Cooking data means taking data from a user or a file that is not formatted exactly the way you would like and reformatting it. Listing 6.2 shows a routine that converts your weight on the earth to your weight on the moon, which demonstrates this kind of data manipulation.
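As a minimal sketch of the difference between a blind and a checked substitution, the following fragment runs the same pattern both ways. The variable names and sample strings are illustrative assumptions, not part of the listings in this exercise.

```perl
#!/usr/bin/perl -w
use strict;

# Blind substitution: strip a trailing unit if present and ignore the result.
my $blind = "150 lbs";
$blind =~ s/\s*lbs?$//i;

# Checked substitution: s/// returns true only when it replaced something.
my $checked = "150";
my $found_unit = ($checked =~ s/\s*lbs?$//i) ? 1 : 0;

print "blind: ", $blind, "\n";
print "checked: ", $checked, ", unit found: ", $found_unit, "\n";
```

The blind form is enough when the data is acceptable whether or not the pattern matched; the checked form is useful when the program has to act differently depending on what it found.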

Using your text editor, type the program in Listing 6.2 and save it as Moon. Of course, don't type the line numbers. Be sure to make the program executable according to the instructions you learned in Hour 1, "Getting Started with Perl."

When you're done, try running the program by typing the following at a command line:


Moon


or, if you cannot make the program executable,


perl -w Moon


Some sample output is shown in Listing 6.1.

Listing 6.1. Sample Output from Moon

1:  $ perl Moon
2:  Your weight: 150lbs
3:  Your weight on the moon: 25.0005 lbs
4:  $ perl Moon
5:  Your weight: 90 kg
6:  Your weight on the moon: 33.00066 lbs


Listing 6.2. Your Moon Weight

1:   #!/usr/bin/perl -w
2:
3:   print "Your weight:";
4:   $_=<STDIN>;
5:   chomp;
6:   s/^\s+//;   # Remove leading spaces, if any.
7:   if (m/(lbs?|kgs?|kilograms?|pounds?)/i) {
8:       if (s/\s*(kgs?|kilograms?).*//i) {
9:           $_*=2.2;
10:      } else {
11:          s/\s*(lbs?|pounds?).*//i;
12:      }
13:  }
14:  print "Your weight on the moon: ", $_*.16667, " lbs\n";


Line 1: This line contains the path to the interpreter (you can change it so that it's appropriate to your system) and the -w switch. Always have warnings enabled!

Lines 3–5: These lines prompt the user for a weight, assign the input line to $_, and chomp off the trailing newline character. Remember that chomp operates on $_ if no other variable is specified.

Line 6: The pattern /^\s+/ matches whitespace at the beginning of the line. No replacement string is listed, so the portion of $_ matching the pattern is simply removed.

Line 7: If a unit of measurement is found in the user's input, this if block removes the unit and, if the unit is metric, converts the weight to pounds.

Lines 8–9: The pattern /\s*(kgs?|kilograms?)/i matches optional whitespace and then either kg or kilogram (each with an optional s on the end). This means that the unit is removed whether the input reads 90 kg or 90kg (with no space). If the pattern is found and removed, what is left over in $_ is multiplied by 2.2; in other words, it is converted to pounds.

Line 11: Otherwise, lbs or pounds is removed from $_ (along with optional leading whitespace).
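The unit handling described for Lines 7 through 11 can also be sketched as a small subroutine that works on its own variable instead of $_. The name to_pounds is hypothetical and does not appear in Listing 6.2. The sketch puts the /i modifier on both inner substitutions, as the pattern quoted for Lines 8–9 does, so that input such as "90 Kilograms" is also recognized.

```perl
#!/usr/bin/perl -w
use strict;

# to_pounds is a hypothetical helper, not part of Listing 6.2.
sub to_pounds {
    my ($text) = @_;
    $text =~ s/^\s+//;                            # drop leading whitespace
    if ($text =~ s/\s*(kgs?|kilograms?).*//i) {   # metric unit found and removed
        return $text * 2.2;                       # kilograms to pounds
    }
    $text =~ s/\s*(lbs?|pounds?).*//i;            # blind: fine if nothing matches
    return $text;
}

print to_pounds("150lbs"), "\n";
print to_pounds("  90 Kilograms"), "\n";
```

Because the kilogram substitution is tested directly, the outer check for any unit becomes unnecessary in this form: input with no unit at all simply falls through both substitutions unchanged.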

Line 14: The weight in $_, already converted to pounds, is multiplied by .16667 (approximately 1/6) and printed.
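To make the final arithmetic concrete, here is a short sketch of the multiplication on its own, with the result rounded to two decimal places by printf. The name moon_weight, the $MOON_FACTOR variable, and the sample weights are assumptions made for this illustration; Listing 6.2 prints the unrounded product.

```perl
#!/usr/bin/perl -w
use strict;

my $MOON_FACTOR = 0.16667;   # the listing's approximation of 1/6

# moon_weight is a hypothetical name used only for this illustration.
sub moon_weight {
    my ($pounds) = @_;
    return $pounds * $MOON_FACTOR;
}

printf "%.2f lbs\n", moon_weight(120);        # prints "20.00 lbs"
printf "%.2f lbs\n", moon_weight(60 * 2.2);   # 60 kg is about 132 lbs; prints "22.00 lbs"
```

Rounding at print time keeps the stray digits that come from using .16667 instead of an exact 1/6 out of the output.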
