
Reading


You can read from Perl's filehandles in a couple of different ways. The most common method is to use the file input operator, also called the angle operator (<>). To read a filehandle, simply put the filehandle name inside the angle operator and assign the value to a variable:


open(MYFILE, "myfile") || die "Can't open myfile: $!";
$line = <MYFILE>;        # Read one line from the filehandle


The angle operator in a scalar context reads one line of input from the file. When called after the entire file has been read, the angle operator returns the value undef.
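As a minimal sketch of this behavior, the following snippet reads a two-line "file" one line at a time and then reads once more past the end. To keep the example self-contained, the filehandle is opened on a string instead of a file on disk (this in-memory form of open and the sample text are assumptions made for illustration, not part of the original example):

```perl
use strict;
use warnings;

# Hypothetical two-line "file" held in a string so the sketch runs anywhere.
my $text = "first line\nsecond line\n";
open(MYFILE, "<", \$text) || die "Can't open in-memory file: $!";

my $first  = <MYFILE>;   # scalar context: reads "first line\n"
my $second = <MYFILE>;   # reads "second line\n"
my $third  = <MYFILE>;   # nothing left to read, so this is undef

close(MYFILE);

print "Got: $first";
print "Got: $second";
print "End of file reached\n" unless defined $third;
```

The third read does not fail or stop the program; it simply returns undef, which is why the result is tested with defined.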

By the Way

A "line of input" is usually considered to be a text stream until the first end-of-line sequence is found. In Unix, that end-of-line sequence is a newline character (ASCII 10); in DOS and Windows, it's the sequence of carriage return and newline characters (ASCII 13,10). This default end-of-line value can be manipulated by Perl to achieve some interesting results. This topic will be covered in Hour 12, "Using Perl's Command-Line Tools."
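For the curious, here is a small preview, assuming the mechanism involved is Perl's input record separator variable, $/. The sketch temporarily sets $/ to a semicolon so that each "line" ends at a semicolon instead of a newline. The sample data and the in-memory filehandle are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical data: three records separated by semicolons, no newlines.
my $text = "alpha;beta;gamma";
open(MYFILE, "<", \$text) || die "Can't open in-memory file: $!";

my @records;
{
    local $/ = ";";     # inside this block, ";" ends a "line"
    while (defined(my $record = <MYFILE>)) {
        chomp($record); # chomp removes whatever $/ currently holds
        push(@records, $record);
    }
}
close(MYFILE);

print join("|", @records), "\n";
```

Using local restores the normal end-of-line value as soon as the block ends, so the rest of the program still reads ordinary lines.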


To read and print the entire file, you can use the following if MYFILE is an open filehandle:


while (defined($line = <MYFILE>)) {   # $line rather than $a, which sort treats specially
    print $line;
}


As it turns out, a shortcut for reading the filehandle is to use a while loop. If the angle operators are the only elements inside the conditional expression of a while loop, Perl automatically assigns the input line to the special variable $_ (described in Hour 2, "Perl's Building Blocks: Numbers and Strings") and repeats the loop until the input is exhausted:


while (<MYFILE>) {
    print $_;
}


The while loop takes care of assigning the input line to $_ and of checking that the data in the file hasn't been exhausted (a condition called end of file). This magic behavior happens only when the angle operators are the only thing inside the conditional expression of a while loop (or of the equivalent for(;;) loop test); anywhere else, you must assign the line to a variable yourself.
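The following sketch illustrates why the end-of-file check matters. The hypothetical data ends with a line containing only 0 and no trailing newline, a value that Perl considers false. Because the shorthand loop tests whether the line is defined rather than whether it is true, that last line is still processed. The in-memory filehandle is an assumption used to keep the example self-contained:

```perl
use strict;
use warnings;

# Hypothetical data: the last line is "0" with no trailing newline.
my $text = "10\n0";
open(MYFILE, "<", \$text) || die "Can't open in-memory file: $!";

my $count = 0;
while (<MYFILE>) {      # Perl treats this as while (defined($_ = <MYFILE>))
    $count++;
}
close(MYFILE);

print "$count lines read\n";
```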

Watch Out!

Remember that every line of data read in with a filehandle in Perl contains the end-of-line characters in addition to the text from the line. If you want just the text, use chomp on the input line to get rid of the end-of-line characters.
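A quick way to see the difference is to measure a line before and after chomp. This is only a sketch; the one-line sample text and the in-memory filehandle are assumptions made so the snippet runs without a file on disk:

```perl
use strict;
use warnings;

# Hypothetical one-line "file".
my $text = "Perl\n";
open(MYFILE, "<", \$text) || die "Can't open in-memory file: $!";
my $line = <MYFILE>;
close(MYFILE);

my $before = length($line);   # counts the trailing newline too
chomp($line);                 # removes the end-of-line character
my $after  = length($line);   # just the text

print "Before chomp: $before characters; after chomp: $after\n";
```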


In a list context, the angle operators read in the rest of the file and return it as a list. Each line of the file becomes one element of the list or array, as shown here:


open(MYFILE, "novel.txt") || die "$!";
@contents = <MYFILE>;
close(MYFILE);


In the preceding snippet, the remaining data in the filehandle MYFILE is read and assigned to @contents. The first line of the file novel.txt is assigned to the first element in @contents: $contents[0]. The second line is assigned to $contents[1], and so on.
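The word "remaining" matters: a list-context read picks up wherever earlier reads left off. The sketch below reads one line in a scalar context and then the rest in a list context. The three-line sample text and the in-memory filehandle are assumptions used for illustration:

```perl
use strict;
use warnings;

# Hypothetical three-line "novel" held in a string.
my $text = "Chapter 1\nIt was a dark night.\nThe end.\n";
open(MYFILE, "<", \$text) || die "Can't open in-memory file: $!";

my $title = <MYFILE>;   # scalar context: just the first line
my @rest  = <MYFILE>;   # list context: every line that remains

close(MYFILE);

print "Title: $title";
print "Lines remaining: ", scalar(@rest), "\n";
print "First remaining line: $rest[0]";
```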

In most cases, reading an entire file into an array (if it isn't too large) is an easy way for you to deal with the file's data. You can go back and forth through the array, manipulate the array elements, and deal with the array's contents with all the array and scalar operators without worrying because you're actually working with just a copy of the file in the array. Listing 5.1 shows some of the manipulations possible on in-memory files.

Listing 5.1. Reversing a File

1:   #!/usr/bin/perl -w
2:
3:   open(MYFILE, "testfile") || die "opening testfile: $!";
4:   @stuff = <MYFILE>;
5:   close(MYFILE);
6:   # Actually, any manipulation can be done now.
7:   foreach (reverse(@stuff)) {
8:       print scalar(reverse($_));
9:   }


If the file testfile contains the text

I am the very model of
a modern major-general.

the program in Listing 5.1 would produce the output


.lareneg-rojam nredom a
fo ledom yrev eht ma I


Line 1: This line contains the path to the interpreter (change it so that it's appropriate to your system) and the -w switch. Always have warnings enabled!

Line 3: The file testfile is opened with the filehandle MYFILE. If the file doesn't open properly, the die function is run with an error message.

Line 4: The entire contents of testfile are read into the array @stuff.

Line 7: The array @stuff is reversed—the first line becomes the last line, and so on—and the resulting list is traversed by the foreach statement. Each line of the reversed list is assigned to $_ and the body of the foreach loop is executed.

Line 8: Each line (now in $_) is itself reversed, so that it reads right-to-left instead of left-to-right, and printed. The scalar function is needed because print supplies a list context to its arguments, and reverse in a list context reverses the order of a list's elements, not the characters within them; a one-element list would come back unchanged, so nothing would happen to $_. The scalar function forces reverse into a scalar context, where it reverses $_ character by character.
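The two behaviors of reverse can be compared side by side in a short sketch. The sample string and variable names here are chosen only for illustration:

```perl
use strict;
use warnings;

my $line = "major-general";

# List context: reverse flips the order of the list's elements.
# A one-element list looks the same afterward.
my @list_result = reverse($line);

# Scalar context: reverse flips the characters of the string.
my $scalar_result = scalar(reverse($line));

print "List context:   @list_result\n";
print "Scalar context: $scalar_result\n";
```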

As a rule, only small files should be read in their entirety into array variables for manipulation. Reading a very large file into memory, although allowed, might cause Perl to use all the available memory on your system.

If you ever exceed Perl's memory by reading too large a file into memory, or do anything else to exceed your system's memory, Perl displays the following error message:


Out of memory!


and your program terminates. If this happens when you are reading an entire file into memory at once, you should probably consider processing the file one line at a time.
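As a rough sketch of the line-at-a-time approach, the following loop counts the lines in a file and remembers the longest one while holding only a single line in memory at any moment. The sample text, the variable names, and the in-memory filehandle are assumptions for illustration; with a real file, only the open call would change:

```perl
use strict;
use warnings;

# Hypothetical data standing in for a large file.
my $text = "short\na much longer line\nmid-size\n";
open(MYFILE, "<", \$text) || die "Can't open in-memory file: $!";

my $count   = 0;
my $longest = "";
while (defined(my $line = <MYFILE>)) {
    chomp($line);
    $count++;
    # Keep only the longest line seen so far, not the whole file.
    $longest = $line if length($line) > length($longest);
}
close(MYFILE);

print "$count lines; the longest is \"$longest\"\n";
```

Memory use here depends on the length of the longest line, not on the size of the file.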
