11.1. File Test Operators

Before we start a program that creates a new file, let's make sure the file doesn't already exist so that we don't accidentally overwrite a vital spreadsheet data file or that important birthday calendar. For this, we use the -e file test, testing a filename for existence:

    die "Oops! A file called '$filename' already exists.\n"
        if -e $filename;

We didn't include $! in this die message since we're not reporting that the system refused a request in this case. Here's an example of checking if a file is being kept up to date. In this case, we're testing an already opened filehandle instead of a string file name. Let's say that our program's configuration file should be updated every week or two. (Maybe it's checking for computer viruses.) If the file hasn't been modified in the past 28 days, then something is wrong:

    warn "Config file is looking pretty old!\n"
        if -M CONFIG > 28;

The third example is more complex. Let's say disk space is filling up; rather than buy more disks, we've decided to move any large, useless files to the backup tapes. So let's go through our list of files to see which of them are larger than 100 KB. But even if a file is large, we shouldn't move it to the backup tapes unless it hasn't been accessed in the last 90 days (so we know it's not used too often):
    my @original_files = qw/ fred barney betty wilma pebbles dino bamm-bamm /;
    my @big_old_files;  # The ones we want to put on backup tapes
    foreach my $filename (@original_files) {
        push @big_old_files, $filename
            if -s $filename > 100_000 and -A $filename > 90;
    }

This is the first time that you've seen it, so maybe you noticed that the control variable of the foreach loop is a my variable. That declares it to have the scope of the loop, so this example should work under use strict. Without the my keyword, this would be using the global $filename.

The file tests look like a hyphen and a letter, which is the name of the test, followed by a filename or a filehandle to test. Many of them return a true/false value, but several give something more interesting. See Table 11-1 for the complete list and read the following discussion to learn more about the special cases.
The tests -r, -w, -x, and -o tell if the given attribute is true for the effective user or group ID, which essentially refers to the person who is in charge of running the program. These tests look at the permission bits on the file to see what is permitted. If your system uses Access Control Lists (ACLs), the tests will use those as well. These tests generally tell if the system would try to permit something, but a true result doesn't guarantee that the operation would succeed. For example, -w may be true for a file on a CD-ROM, though you can't write to it, or -x may be true on an empty file, which can't truly be executed.
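
Since a true permission test is only a hint, a cautious program still checks the result of the real operation. Here is a minimal sketch of that idea; the helper name permitted_actions and the filename fred are our own illustrations, not anything built into Perl:

```perl
use strict;
use warnings;

# Hypothetical helper: list which permission tests pass for $path,
# as seen by the effective user running this program.
sub permitted_actions {
    my ($path) = @_;
    my @actions;
    push @actions, 'read'    if -r $path;
    push @actions, 'write'   if -w $path;
    push @actions, 'execute' if -x $path;
    push @actions, 'owned'   if -o $path;
    return @actions;
}

my $path = 'fred';   # hypothetical filename
if (-w $path) {
    # -w only says the permissions allow writing; the open may still fail
    # (read-only media, for example), so we check it as well.
    if (open my $fh, '>>', $path) {
        close $fh;
    } else {
        warn "Permissions allow writing '$path', but open failed: $!\n";
    }
}
```

Opening for append (>>) is used here so that the check doesn't clobber the file's existing contents.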
The -s test does return true if the file is non-empty, but it's a special kind of true. It's the length of the file, measured in bytes, which evaluates as true for a nonzero number.

A Unix filesystem has seven types of items, represented by the seven file tests -f, -d, -l, -S, -p, -b, and -c. Any item should be one of those. If you have a symbolic link pointing to a file, it will report true for both -f and -l. So if you want to know whether something is a symbolic link, you should generally test that first. (You'll learn more about symbolic links in Chapter 12.)
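
Both points can be sketched in a few lines. The helper names item_type and size_report below are hypothetical, chosen only for this illustration. Note that -l comes first, because the other type tests follow a symbolic link and describe its target:

```perl
use strict;
use warnings;

# Hypothetical helper: name the type of item at $path, testing -l first.
sub item_type {
    my ($path) = @_;
    return 'symbolic link'    if -l $path;
    return 'plain file'       if -f $path;
    return 'directory'        if -d $path;
    return 'socket'           if -S $path;
    return 'named pipe'       if -p $path;
    return 'block device'     if -b $path;
    return 'character device' if -c $path;
    return 'missing';   # no test passed, e.g. there is no such file
}

# Hypothetical helper: -s gives the size in bytes, so one result serves
# as both a truth test and a number.
sub size_report {
    my ($path) = @_;
    my $size = -s $path;
    return $size ? "$path holds $size bytes"
                 : "$path is empty or missing";
}

print "The current directory is a ", item_type('.'), "\n";
```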
The age tests, -M, -A, and -C (yes, they're uppercase), return the number of days since the file was last modified, accessed, or had its inode changed. (The inode contains all of the information about the file except for its contents. See the stat system call manpage or a good book on Unix internals for details.) This age value is a full floating-point number, so you might get a value of 2.00001 if a file were modified two days and one second ago. These "days" aren't necessarily the same as a human would count. For example, if it's 1:30 a.m. when you check a file modified at about an hour before midnight, the value of -M for this file would be around 0.1, even though it was modified "yesterday."
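
Because the age is a fractional number of days, it's often friendlier to convert it before showing it to a person. This is a rough sketch; the helper describe_age is a hypothetical name of our own, and switching to hours below one day is an arbitrary choice:

```perl
use strict;
use warnings;

# Hypothetical helper: turn an age in days (as returned by -M, -A, or -C)
# into a short phrase, switching to hours for ages under one day.
sub describe_age {
    my ($days) = @_;
    return 'unknown (no such file?)' unless defined $days;
    return sprintf '%.1f hours', $days * 24 if abs($days) < 1;
    return sprintf '%.1f days', $days;
}

my $age = -M $0;   # age of this program's own file; undef if it can't be found
print "This program was last modified ", describe_age($age), " ago\n";
```

With the values from the text, an age of 0.1 is reported as 2.4 hours, and 2.00001 as 2.0 days.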
When checking the age of a file, you might get a negative value like -1.2, which means that the file's last access timestamp is set almost 29 hours in the future. The zero point on this timescale is the moment your program started running, so that value might mean a long-running program was looking at a file that had just been accessed. Or a timestamp could be set (accidentally or intentionally) to a time in the future.
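
Perl keeps the program's start time, in seconds since the epoch, in the special variable $^T. A long-running program that wants ages measured from the current moment can compute them from stat and time instead. The following is a sketch under that assumption; days_since_modified is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: age in days measured from "now" rather than from
# program start, which matters in a long-running program.
sub days_since_modified {
    my ($path) = @_;
    my $mtime = (stat $path)[9];       # modification time, epoch seconds
    return unless defined $mtime;      # stat failed: no such file, etc.
    return (time - $mtime) / 86_400;   # 86,400 seconds in a day
}

my $from_start = -M $0;                    # measured from $^T
my $from_now   = days_since_modified($0);  # measured from this moment
if (defined $from_start and defined $from_now) {
    printf "From program start: %.5f days; from now: %.5f days\n",
        $from_start, $from_now;
}
```

Another option is to assign a fresh value to $^T (for example, $^T = time;) before running the age tests, which moves the zero point for every later -M, -A, and -C.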
The tests -T and -B determine if a file is text or binary. But people who know a lot about filesystems know there's no bit (at least in Unix-like operating systems) to indicate that a file is a binary or text file, so how can Perl tell? The answer is that Perl cheats: it opens the file, looks at the first few thousand bytes, and makes an educated guess. If it sees a lot of null bytes, unusual control characters, and bytes with the high bit set, then that looks like a binary file. If there's not much weird stuff, then it looks like text. It sometimes guesses wrong. If a text file has a lot of Swedish or French words (which may have characters represented with the high bit set, as some ISO-8859-something variant, or perhaps even a Unicode version), it may fool Perl into declaring it binary. So it's not perfect, but if you need to separate your source code from compiled files, or HTML files from PNGs, these tests should do the trick.

You'd think that -T and -B would always disagree since a text file isn't a binary and vice versa, but there are two special cases where they're in complete agreement. If the file doesn't exist, or can't be read, both are false since it's neither a text file nor a binary. Alternatively, if the file is empty, it's an empty text file and an empty binary file at the same time, so they're both true.

The -t file test returns true if the given filehandle is a TTY, that is, if it's interactive because it's not a simple file or pipe. When -t STDIN returns true, it generally means that you can interactively ask the user questions. If it's false, your program is probably getting input from a file or pipe, rather than a keyboard.

Don't worry if you don't know what some of the other file tests mean; if you've never heard of them, you won't be needing them. But if you're curious, get a good book about programming for Unix. (On non-Unix systems, these tests all try to give results analogous to what they do on Unix, or give undef for an unavailable feature. Usually, you'll be able to guess what they'll do.)

If you omit the filename or filehandle parameter to a file test (that is, if you have -r or just -s), the default operand is the file named in $_. So, to test a list of filenames to see which ones are readable, you type the following:
    foreach (@lots_of_filenames) {
        print "$_ is readable\n" if -r;  # same as -r $_
    }

But if you omit the parameter, be careful that whatever follows the file test doesn't look like it could be a parameter. For example, if you wanted to find out the size of a file in KB rather than in bytes, you might be tempted to divide the result of -s by 1000 (or 1024), like this:

    # The filename is in $_
    my $size_in_K = -s / 1000;  # Oops!

When the Perl parser sees the slash, it doesn't think about division. Since it's looking for the optional operand for -s, it sees what looks like the start of a regular expression in forward slashes. To prevent this confusion, put parentheses around the file test:

    my $size_in_K = (-s) / 1024;  # Uses $_ by default

Explicitly giving a file test a parameter is safer.
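
As a sketch of that safer style, here's one way to collect file sizes in KB with the operand always spelled out. The helper name sizes_in_kb is hypothetical, and we've assumed 1 KB means 1024 bytes:

```perl
use strict;
use warnings;

# Hypothetical helper: map each existing file to its size in KB,
# naming the operand of -s explicitly so nothing after it can be
# mistaken for a parameter.
sub sizes_in_kb {
    my @filenames = @_;
    my %size_in_kb;
    foreach my $filename (@filenames) {
        next unless -e $filename;            # skip files that don't exist
        my $bytes = -s $filename;            # false for an empty file
        $size_in_kb{$filename} = ($bytes || 0) / 1024;
    }
    return %size_in_kb;
}

my %size_in_kb = sizes_in_kb(qw/ fred barney /);   # hypothetical filenames
foreach my $filename (sort keys %size_in_kb) {
    printf "%s: %.1f KB\n", $filename, $size_in_kb{$filename};
}
```

With a named operand, the division is unambiguous, so no extra parentheses are needed around the file test.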