
Getting a Directory Listing

 

The first step in obtaining directory information from your system is to create a directory handle. A directory handle is something like a filehandle, except that instead of a file's contents, you read the contents of a directory through the directory handle. To open a directory handle, you use the opendir function:


opendir dirhandle, directory


In this syntax, dirhandle is the directory handle you want to open and directory is the name of the directory you want to read. If the directory handle cannot be opened (because you don't have permission to read the directory, the directory doesn't exist, or for some other reason), the opendir function returns false. Directory handle names should be constructed like filehandle names, using the rules for variable names outlined in Hour 2, "Perl's Building Blocks: Numbers and Strings," and, like filehandles, they should be all uppercase to avoid conflicts with Perl's keywords. The following is an example:


opendir(TEMPDIR, '/tmp') || die "Cannot open /tmp: $!";
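As a minimal sketch of the failure case described above, the following tries to open a directory that does not exist and reports the reason through $!. The directory name is made up for illustration, and the lexical handle (my $dh) is a choice made for this sketch rather than the bareword style used elsewhere in this hour.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A scratch directory that is removed when the program exits.
my $base = tempdir(CLEANUP => 1);

# Hypothetical name: this subdirectory is never created.
my $missing = "$base/no-such-directory";

# opendir returns false on failure and sets $! to the reason.
my $opened = opendir(my $dh, $missing);
if ($opened) {
    closedir($dh);
    print "Opened $missing\n";
}
else {
    print "Cannot open $missing: $!\n";
}
```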


All the examples in this hour use forward slashes (/) in the Unix style because they are less confusing than the backslashes (\) used by Windows and MS-DOS, and they work just as well with those operating systems as with Unix.

Now that the directory handle is open, you use the readdir function to read it:


readdir dirhandle;


In a scalar context, readdir returns the next entry in the directory, or undef if none are left. In a list context, readdir returns all the (remaining) directory entries. The names returned by readdir include files, directories, and (for Unix) special files; they are returned in no particular order. The directory entries . and .. (representing the current directory and its parent directory) are also returned by readdir. The directory entries returned by readdir do not include the pathname as part of the name returned.

When you're done with the directory handle, you should close it by using closedir:


closedir dirhandle;


The following example shows how to read a directory:


opendir(TEMP, '/tmp') || die "Cannot open /tmp: $!";
@FILES=readdir TEMP;
closedir(TEMP);


In the preceding snippet, the entire directory is read into @FILES. Most of the time, however, you're not interested in the . and .. entries. To read the directory handle and eliminate those entries, you can enter the following:


@FILES=grep(!/^\.\.?$/, readdir TEMP);


The regular expression /^\.\.?$/ matches a name that consists of exactly one or two literal dots, and because the match is negated with !, grep keeps everything else. To get all the files with a particular extension, you use the following:


@FILES=grep(/\.txt$/i, readdir TEMP);
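One detail worth seeing in action: once readdir has been called in a list context, the handle is at the end of the directory, so a second readdir on the same handle returns nothing until the handle is rewound with rewinddir. The sketch below is only an illustration under stated assumptions: it builds a scratch directory with made-up filenames, uses a lexical handle instead of a bareword, and sorts the results because readdir returns names in no particular order.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a throwaway directory with a few empty files (names are made up).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(notes.txt README.TXT data.csv)) {
    open(my $fh, '>', "$dir/$name") || die "Cannot create $dir/$name: $!";
    close($fh);
}

opendir(my $dh, $dir) || die "Cannot open $dir: $!";

# List context: everything at once, minus the . and .. entries.
my @entries = grep { !/^\.\.?$/ } readdir $dh;
@entries = sort @entries;

# The handle is now at the end of the directory; rewind before reading again.
rewinddir($dh);

# Scalar context: one entry per call, keeping only names that end in .txt.
my @text_files;
while (defined(my $name = readdir $dh)) {
    push @text_files, $name if $name =~ /\.txt$/i;
}
@text_files = sort @text_files;

closedir($dh);

print "entries: @entries\n";
print "text files: @text_files\n";
```

The defined() test in the while loop keeps a file that happens to be named 0 from ending the loop early.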


The filenames returned by readdir do not contain the pathname used by opendir. Thus, the following example will probably not work:


opendir(TD, "/tmp") || die "Cannot open /tmp: $!";
while($file = readdir TD) {
    # The following is WRONG
    open(FILEH, $file) || die "Cannot open $file: $!\n";
    # Process the file here...
}
closedir(TD);


Unless you happen to be working in the /tmp directory when you run this code, the open(FILEH, $file) statement will fail. For example, if the file myfile.txt exists in /tmp, readdir returns myfile.txt. When you open myfile.txt, you actually need to open /tmp/myfile.txt using the full pathname. The corrected code is as follows:


opendir(TD, "/tmp") || die "Cannot open /tmp: $!";
while($file=readdir TD) {
    # Right!
    open(FILEH, "/tmp/$file") || die "Cannot open $file: $!\n";
    # Process the file here...
}
closedir(TD);
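The same idea can be wrapped in a small routine so that the directory name is written only once. This is a sketch, not the book's own code: plain_files_in is a hypothetical helper name, the scratch directory and its contents are made up, File::Spec->catfile is used to join the directory and filename portably, and the -f file test is added to skip ., .., and any subdirectories.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: return the full paths of the plain files in $dir.
sub plain_files_in {
    my ($dir) = @_;
    opendir(my $dh, $dir) || die "Cannot open $dir: $!";
    my @paths;
    while (defined(my $name = readdir $dh)) {
        # Put back the directory part that readdir leaves out.
        my $path = File::Spec->catfile($dir, $name);
        # -f is false for ., .., and other directories.
        push @paths, $path if -f $path;
    }
    closedir($dh);
    return @paths;
}

# Scratch directory with one file and one subdirectory (made-up names).
my $dir = tempdir(CLEANUP => 1);
my $sample = File::Spec->catfile($dir, 'myfile.txt');
open(my $out, '>', $sample) || die "Cannot create $sample: $!";
close($out);
mkdir(File::Spec->catdir($dir, 'subdir')) || die "Cannot create subdir: $!";

for my $path (plain_files_in($dir)) {
    open(my $in, '<', $path) || die "Cannot open $path: $!";
    # Process the file here...
    close($in);
    print "opened $path\n";
}
```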


Globbing

The other method of reading the names of files in a directory is called globbing. If you're familiar with the command prompt in MS-DOS, you know that the command dir *.txt prints a directory listing of all the files that end in .txt. In Unix, the globbing (sometimes called wildcard matching) is done by the shell, but ls *.txt has nearly the same result: The files whose names end in .txt are listed.

Perl has an operator for doing just this job; it's called glob. The syntax for glob is


glob pattern


where pattern is the filename pattern you want to match. The pattern can contain directory names and portions of filenames. In addition, the pattern can contain any of the special characters listed in Table 10.1. In a list context, glob returns all the files (and directories) that match the pattern. In a scalar context, the files are returned one at a time each time glob is queried.

Table 10.1. Globbing Patterns

Character  | Matches                                                                       | Example
?          | Single character                                                              | f?d matches fud, fid, fdd, and so on
*          | Any number of characters                                                      | f*d matches fd, fdd, food, filled, and so on
[chars]    | Matches any of chars; this feature is not supported in MacPerl                | f[ou]d matches fod and fud but not fad
{a,b,...}  | Matches either of the strings a or b; this feature is not supported in MacPerl | f*.{txt,doc} matches files that begin with f and that end in either .txt or .doc
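To see the patterns from Table 10.1 at work, the following sketch creates a scratch directory, fills it with made-up filenames chosen to exercise each row, and globs each pattern in turn. The filenames and the %matches hash are assumptions for illustration only; the script changes into the scratch directory so the patterns can stay as short as the ones in the table.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) || die "Cannot chdir to $dir: $!";

# Made-up filenames chosen to exercise each row of the table.
for my $name (qw(fd fad fod fud food f1.txt f2.doc f3.bak)) {
    open(my $fh, '>', $name) || die "Cannot create $name: $!";
    close($fh);
}

# Glob each pattern in a list context and remember what it matched.
my %matches;
for my $pattern ('f?d', 'f*d', 'f[ou]d', 'f*.{txt,doc}') {
    my @found = glob($pattern);
    @found = sort @found;
    $matches{$pattern} = join(' ', @found);
    print "$pattern => $matches{$pattern}\n";
}

# Leave the scratch directory so it can be cleaned up at exit.
chdir($start) || die "Cannot chdir back to $start: $!";
```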


Watch Out!

Glob patterns are not the same as regular expressions.


Watch Out!

Unix fans, please note: Perl's glob operator uses C shell-style file globbing, as opposed to Bourne (or Korn) shell file globbing. This is true on any Unix system in which Perl is installed, regardless of whatever shell you personally use. The Bourne shell globbing and Korn shell globbing are different from the C shell's. They are very similar in some respects (* and ? behave the same) but quite different in others. Beware.


Now check these examples of globbing:


# All of the .h files in /usr/include
my @hfiles=glob('/usr/include/*.h');

# Text or document files that contain 1999
my @curfiles=glob('*1999*.{txt,doc}');

# Printing a numbered list of filenames
$count=1;
while( $name=glob('*') ) {
    print "$count. $name\n";
    $count++;
}


An important difference between glob and opendir/readdir/closedir is that glob returns the pathname used in the pattern, whereas the opendir/readdir/closedir functions do not. For example, glob('/usr/include/*.h') returns '/usr/include' as part of any matches; readdir does not.
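The difference is easy to check side by side. The sketch below is an illustration only: it builds a scratch directory containing a made-up include subdirectory with two header files, then lists the same files once with glob and once with readdir.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) || die "Cannot chdir to $dir: $!";

# A made-up directory with two made-up header files.
mkdir('include') || die "Cannot create include: $!";
for my $name (qw(a.h b.h)) {
    open(my $fh, '>', "include/$name") || die "Cannot create include/$name: $!";
    close($fh);
}

# glob keeps the directory part of the pattern in every match.
my @from_glob = glob('include/*.h');
@from_glob = sort @from_glob;

# readdir returns bare names only.
opendir(my $dh, 'include') || die "Cannot open include: $!";
my @from_readdir = grep { /\.h$/ } readdir $dh;
closedir($dh);
@from_readdir = sort @from_readdir;

print "glob:    @from_glob\n";
print "readdir: @from_readdir\n";

chdir($start) || die "Cannot chdir back to $start: $!";
```

The names from glob can be passed straight to open, whereas the names from readdir need the directory put back in front first.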

So which should you use? It's completely up to you. However, using the opendir/readdir/closedir functions tends to be a much more flexible solution and will be used in most of the examples throughout this book.

Perl offers an alternative way to write pattern globs. Simply placing the pattern inside the angle operator (<>) makes the angle operator behave like glob:


@cfiles = <*.c>;  # All files ending in .c


The syntax that uses the angle operator for globbing is older and can be confusing. In this book, I will continue to use the glob operator instead for clarity.
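A short sketch shows that the two spellings return the same list. The filenames are made up and the scratch directory exists only for the demonstration. One source of the confusion is that the angle operator does something different when the only thing inside it is a simple scalar variable: <$pattern> is treated as a read from a filehandle, not as a glob, which is another reason to spell out glob($pattern).

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) || die "Cannot chdir to $dir: $!";

# Made-up filenames: two C sources and one file that should not match.
for my $name (qw(main.c util.c notes.txt)) {
    open(my $fh, '>', $name) || die "Cannot create $name: $!";
    close($fh);
}

# The angle operator with a pattern inside behaves like glob.
my @angle = <*.c>;
my @named = glob('*.c');
@angle = sort @angle;
@named = sort @named;

print "angle operator: @angle\n";
print "glob operator:  @named\n";

chdir($start) || die "Cannot chdir back to $start: $!";
```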
