
Searching Scalars


Regular expressions are nice for searching scalars for patterns, but sometimes they're overkill. In Perl, some overhead—but not much—is involved with assembling the pattern and then searching for the pattern within scalars. Also, you can easily make mistakes when writing regular expressions. Perl provides several functions for searching and extracting simple information from scalars.

Searching with index

If you merely want to find one string within another scalar, Perl provides the index function. The syntax for index is as follows:


index string, substring

index string, substring, start_position


The index function starts at the left of string and searches for substring. The index function returns the position at which substring is found, with 0 being the leftmost character. If the substring is not found, index returns –1. The string to be searched can be a string literal, a scalar, or any expression that returns a string value. The substring is not a regular expression; it's just another scalar.

The following are some examples. Remember that you can write Perl's functions and operators with or without parentheses enclosing the arguments.


index "Ring around the rosy", "around";    # Returns 5

index("Pocket full of posies", "ket");     # Returns 3

$a="Ashes, ashes, we all fall down";

index($a, "she");                          # Returns 1

index $a, "they";                          # Returns -1 (not found)

@a=qw(oats     peas      beans);

index join(" ", @a), "peas";               # Returns 5
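
Because a match at the very start of a string returns 0, which Perl treats as false, a found/not-found test is safer when it compares the result against -1 instead of testing it for truth. The following is a minimal sketch of that idea; the helper name contains is hypothetical and not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Perl built-in): true if $substring occurs in $string.
sub contains {
    my ($string, $substring) = @_;
    # Compare against -1: a match at position 0 is valid but false as a boolean.
    return index($string, $substring) != -1;
}

my $rhyme = "Ashes, ashes, we all fall down";
print "Found 'Ashes'\n"   if contains($rhyme, "Ashes");      # match at position 0
print "No 'they' here\n"  unless contains($rhyme, "they");   # index returned -1
```

Writing if (index($rhyme, "Ashes")) instead would skip the first print, because the match is at position 0.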


Optionally, you can give the index function a start position in the string at which to begin searching, as shown in the following example. To start searching at the left, you use a start position of 0.


$reindeer="dasher dancer prancer vixen";

index($reindeer, "da");         # Returns 0

index($reindeer, "da", 1);      # Returns 7


You also can use the index function with a starting position to "walk" through a string and find all the occurrences of a smaller string, as shown here:


$source="One fish, two fish, red fish, blue fish.";

$start=0;

# Use an increasing beginning index, $start, to find all fish

while( ($start=index($source, "fish", $start)) != -1) {

    print "Found a fish at $start\n";

    $start++;

}


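The preceding example slides through $source from left to right, reporting a fish at positions 4, 14, 24, and 35. After each match, $start is moved one character past it so that the next call to index finds the following occurrence.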
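
The same walk can be wrapped in a reusable subroutine that collects the positions instead of printing them. This is only a sketch: the name find_all is hypothetical, and the guard against an empty substring is an added assumption, since an empty substring matches at every position and the loop would never finish.

```perl
use strict;
use warnings;

# Hypothetical helper: return every position at which $substring occurs in $string.
sub find_all {
    my ($string, $substring) = @_;
    my @positions;
    return @positions if $substring eq "";   # empty substring would loop forever
    my $start = 0;
    while (($start = index($string, $substring, $start)) != -1) {
        push @positions, $start;
        $start++;                            # step one character past this match
    }
    return @positions;
}

my @found = find_all("One fish, two fish, red fish, blue fish.", "fish");
print "Found fish at: @found\n";             # prints "Found fish at: 4 14 24 35"
```

Because $start advances by only one character, overlapping matches are reported too; advancing by length($substring) instead would skip them.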

Searching Backward with rindex

The function rindex works the same as index, except that the search starts on the right and works its way left. The syntax is as follows:


rindex string, substring

rindex string, substring, start_position


When the search is exhausted, the rindex function returns –1. The following are some examples:


$a="She loves you yeah, yeah, yeah.";

rindex($a, "yeah");       # Returns 26

rindex($a, "yeah", 25);   # Returns 20


The walk-through loop used with index looks a little different when searching backward with rindex. The start position must begin at (or after) the end of the string, length($source) in the following example, but the loop still finishes when -1 is returned. After each find, $start must be decremented by 1, rather than incremented as it was with index.


$source="One fish, two fish, red fish, blue fish.";

$start=length($source);

while( ($start = rindex($source, "fish", $start)) != -1) {

    print "Found a fish at $start\n";

    $start--;

}
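
As a sketch, the backward walk can also be packaged as a subroutine that returns the positions rightmost first. The name find_all_reverse is hypothetical, and the explicit stop after a match at position 0 is an added precaution so that the loop never calls rindex with a negative start position.

```perl
use strict;
use warnings;

# Hypothetical helper: positions of $substring in $string, rightmost first.
sub find_all_reverse {
    my ($string, $substring) = @_;
    my @positions;
    return @positions if $substring eq "";   # empty substring matches everywhere
    my $start = length($string);
    while (($start = rindex($string, $substring, $start)) != -1) {
        push @positions, $start;
        last if $start == 0;                 # nothing can lie to the left of position 0
        $start--;                            # step one character to the left
    }
    return @positions;
}

my @found = find_all_reverse("One fish, two fish, red fish, blue fish.", "fish");
print "Found fish at: @found\n";             # prints "Found fish at: 35 24 14 4"
```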


Picking Apart Scalars with substr

The substr function is often overlooked and easily forgotten, but it provides a general-purpose method for extracting information from scalars and editing scalars. The syntax of substr is as follows:


substr string, offset

substr string, offset, length


The substr function takes string, starting at position offset, and returns the rest of the string from offset to the end. If length is specified, then length characters are taken—or until the end of the string is found, whichever comes first—as shown in this example:


#Character positions in $a

#   0         10        20        30

$a="I do not like green eggs and ham.";

print substr($a, 25);      # prints "and ham."

print substr($a, 14, 5);   # prints "green"


If the offset specified is negative, substr starts counting from the right. For example, substr($a, -5) returns the last five characters of $a. If the length specified is negative, substr returns from its starting position to the end of the string, less length characters, as in this example:


print substr($a, 5, -10);    # prints "not like green egg"


In the preceding snippet, substr starts at position 5 and returns the rest of the string except the last 10 characters.
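
The search functions and substr combine naturally: rindex locates a delimiter, and substr cuts the string on either side of it. The following sketch splits a simple file name at its last dot. The helper name split_extension and the sample file name are hypothetical, and the sketch assumes a bare file name with no directory part.

```perl
use strict;
use warnings;

# Hypothetical helper: split "name.ext" at the last dot.
# Returns (name, extension); the extension is "" when there is no dot.
sub split_extension {
    my ($filename) = @_;
    my $dot = rindex($filename, ".");
    return ($filename, "") if $dot == -1;     # no dot found
    my $name = substr($filename, 0, $dot);    # everything before the last dot
    my $ext  = substr($filename, $dot + 1);   # everything after the last dot
    return ($name, $ext);
}

my ($name, $ext) = split_extension("archive.tar.gz");
print "$name / $ext\n";                       # prints "archive.tar / gz"
```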

You can also use the substr function on the left side of an assignment expression. When used on the left, substr indicates what characters are to be replaced in a scalar. When substr is used on the left side of an assignment, the first argument must be an assignable value—such as a scalar variable—and not a string literal. The following is an example of editing a string with substr:


$a="countrymen, lend me your wallets";

# Replace first character of $a with "Romans, C"

substr($a, 0, 1)="Romans, C";



# Insert "Friends, " at the beginning of $a

substr($a, 0, 0)="Friends, ";



substr($a, -7, 7)="ears.";         # Replace last 7 characters.
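
Perl's substr also accepts the replacement text as a fourth argument, which performs the same kind of edit without putting substr on the left side of an assignment. The sketch below applies the edits that way and prints the result; the variable names $speech and $old are chosen only for illustration.

```perl
use strict;
use warnings;

my $speech = "countrymen, lend me your wallets";

# Four-argument form: substr(STRING, OFFSET, LENGTH, REPLACEMENT).
# It edits $speech in place and returns the text that was replaced.
my $old = substr($speech, -7, 7, "ears.");
substr($speech, 0, 1, "Friends, Romans, C");

print "$speech\n";          # prints "Friends, Romans, Countrymen, lend me your ears."
print "Replaced: $old\n";   # prints "Replaced: wallets"
```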

