7.1. What Are Regular Expressions?

A regular expression, often called a pattern in Perl, is a template that matches or doesn't match a given string. An infinite number of possible text strings exist, and a given pattern divides that infinite set into two groups: the ones that match and the ones that don't. There's never any kinda-sorta-almost-up-to-here wishy-washy matching: either it matches or it doesn't.
A pattern may match one possible string, two or three, a dozen, a hundred, or an infinite number. It may match all strings except for one, except for some, or except for an infinite number. We've referred to regular expressions as being little programs in their own simple programming language. It's a simple language because the programs have one task: to look at a string and say "it matches" or "it doesn't match". That's all they do.
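The all-or-nothing nature of matching can be sketched with a few lines of shell. This is only an illustration: the helper name `classify` and the sample strings are invented for the example, and the pattern `flint.*stone` simply means "flint, followed somewhere later by stone". The sketch assumes a POSIX shell and a standard `grep`.

```shell
#!/bin/sh
# Illustrative pattern: "flint", then any characters, then "stone".
pattern='flint.*stone'

# classify is a hypothetical helper, not part of any standard tool.
# It feeds one string to grep and reports the only two possible verdicts.
classify() {
    # -q suppresses output; grep's exit status alone says match or no match.
    if printf '%s\n' "$1" | grep -q -- "$pattern"; then
        echo "it matches"
    else
        echo "it doesn't match"
    fi
}

classify 'a piece of flint, a stone'    # flint comes before stone
classify 'a stone, then some flint'     # stone comes first, so no match
```

Every string lands in exactly one of the two groups; there is no third, partial verdict.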
One of the places you're likely to have seen regular expressions is in the Unix grep command, which prints out text lines matching a given pattern. For example, if you wanted to see which lines in a given file mention flint and, somewhere later on the same line, stone, you might do something like this with the Unix grep command:

    $ grep 'flint.*stone' chapter*.txt
    chapter3.txt:a piece of flint, a stone which may be used to start a fire by striking
    chapter3.txt:found obsidian, flint, granite, and small stones of basaltic rock, which
    chapter9.txt:a flintlock rifle in poor condition. The sandstone mantle held several

Don't confuse regular expressions with shell filename-matching patterns, called globs. A typical glob is what you use when you type *.pm to the Unix shell to match all filenames that end in .pm. The previous example uses a glob of chapter*.txt. (You may have noticed that you had to quote the pattern to prevent the shell from treating it like a glob.) Though globs use many of the same characters you use in regular expressions, those characters are used in different ways. You'll visit globs in Chapter 12.
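To see how the same characters behave differently in the two notations, here is a small sketch that tests the text chapter*.txt once as a glob and once as a regular expression. The helper names `glob_match` and `regex_match` are made up for this illustration. The sketch assumes a POSIX shell, whose `case` statement uses glob rules, and a standard `grep`, whose `-x` option requires the whole line to match.

```shell
#!/bin/sh
# glob_match and regex_match are hypothetical helpers for this comparison.

# Treat $1 as a glob and test it against the string $2.
glob_match() {
    case "$2" in
        $1) echo yes ;;   # unquoted, so the shell applies glob rules
        *)  echo no ;;
    esac
}

# Treat $1 as a regular expression that must match all of the string $2.
regex_match() {
    if printf '%s\n' "$2" | grep -xq -- "$1"; then
        echo yes
    else
        echo no
    fi
}

# As a glob, * stands for any run of characters and . is a literal dot.
glob_match  'chapter*.txt' 'chapter12.txt'

# As a regular expression, r* means "zero or more r" and . means
# "any one character", so the same text describes different strings.
regex_match 'chapter*.txt' 'chapter12.txt'
regex_match 'chapter*.txt' 'chapteXtxt'

# A regular expression with roughly the glob's meaning looks like this.
regex_match 'chapter.*\.txt' 'chapter12.txt'
```

The first and last calls report a match for chapter12.txt; the second does not, because the regular-expression reading of chapter*.txt has no way to account for the characters between chapter and txt.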