Locking

Imagine that you've written a wonderful Perl program and that the whole world wants to use it. If you're on a Unix or Windows NT machine, or even on a Windows 95 or 98 machine, more than one person might be running your program at the same time. Or, your program may be put onto a Web server, and it's run so frequently that instances of your program overlap.

Now suppose that your program uses a database for its work, such as the text file database just described—but this discussion applies to any kind of database. Look at the following code, which uses functions described in the preceding section:


chomp($newrecord=<STDIN>);          #  Get a new record from the user
@PHONEL=readdata();                 #  Read data into @PHONEL
push(@PHONEL, $newrecord);          #  Put the record into the array
writedata(@PHONEL);                 #  Write out the array


Looks harmless, doesn't it? But if two people run your program at nearly the same time and try adding different records, it's not harmless at all; it's quite buggy. In the following diagram, this particular set of Perl statements is run at nearly the same time, on the same system, by two different people (Person 2 is working slightly behind Person 1). Watch carefully.

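[Figure: timeline of Person 1 and Person 2 running the same statements, with Person 2 one step behind Person 1]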

From Person 1's perspective, the data is read in at step 2, and the new record ("David") is added to @PHONEL in step 3 and written at step 4.

From Person 2's perspective, the data is read in at step 3, the new record ("Joy") is added to @PHONEL in step 4, and @PHONEL is written out in step 5.

Here's the bug: The data read in by Person 2 in step 3 does not contain the record "David." That record hasn't been written yet by Person 1. So Person 2 adds "Joy" to the array @PHONEL, which does not contain "David." At the same time, Person 1 writes a copy of @PHONEL to the database—which does contain "David."

When Person 2's instance of the program finally makes it to step 5, it overwrites the data written by Person 1. The database winds up with "Joy", but not "David"—clearly a bug.
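The lost update can be reproduced in a single script by interleaving the two runs by hand. This is only a sketch: read_records() and write_records() are hypothetical stand-ins for readdata() and writedata() that take a file name, and the temporary file and the "Existing 555-0000" record are assumptions made for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-ins for readdata()/writedata(); they take the file
# name as an argument so the sketch can work on a temporary file.
sub read_records {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    chomp(my @data = <$fh>);
    close($fh);
    return @data;
}

sub write_records {
    my ($file, @data) = @_;
    open(my $fh, '>', $file) or die "Cannot open $file: $!";
    print $fh "$_\n" foreach @data;
    close($fh);
}

my ($tmp_fh, $db) = tempfile(UNLINK => 1);
close($tmp_fh);
write_records($db, "Existing 555-0000");   # made-up starting record

# Step 2: Person 1 reads the data.
my @person1 = read_records($db);

# Step 3: Person 1 adds "David"; Person 2 reads (no "David" on disk yet).
push(@person1, "David");
my @person2 = read_records($db);

# Step 4: Person 1 writes; Person 2 adds "Joy" to the stale copy.
write_records($db, @person1);
push(@person2, "Joy");

# Step 5: Person 2 writes and replaces what Person 1 just wrote.
write_records($db, @person2);

my @final = read_records($db);
print "Final contents: ", join(", ", @final), "\n";
```

After the script runs, the file holds the starting record and "Joy"; the "David" record that Person 1 wrote is gone.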

Watch Out!

The problem is actually worse than you just learned; the preceding explanation is vastly oversimplified. The additional headaches come from the fact that the writedata() function seems to open the file and write the data all in one burst, but it doesn't. Multiprocessing operating systems can stop a program in the middle of writing data, run another program for a moment, and resume the first one later (milliseconds later, but later nonetheless). Two programs can therefore be writing different data to the same file at the same time, which can corrupt your data file or even erase it.


The kind of problem we have just seen has a formal name: it's called a race condition. That is, what gets stored in the file depends on which user "wins" or "loses" the race. Race conditions are difficult to debug in programs, because they come and go depending on how many instances of the program are running at the same time, and because the bugs associated with race conditions aren't always obvious.

Allowing multiple programs to update the same data at the same time is a tricky proposition, but it can be handled using a mechanism called locks. File locks are used to prevent multiple instances of a program from altering a file at the same time.

Locking files poses several problems, but the foremost among them is that different operating systems and different kinds of file systems require different types of locking mechanisms. The next sections describe how to lock files to prevent this kind of disaster.

Locking with Unix and Windows

To lock files under Unix and Windows, you can use Perl's flock function. The flock function provides an advisory locking mechanism. This means that any programs you write that need to access the file must also use flock to make sure that no one else is writing to the file at the same time. However, other programs can still modify the file; this is why the mechanism is called advisory locking, not mandatory locking.

You're familiar with one kind of advisory lock already: a traffic stop light. The signal is there to prevent multiple vehicles from entering the same part of the intersection at the same time. But the signal works only if everyone obeys it. The same holds true for file locks. Every program that can potentially access a file at the same time must use flock to prevent an accident. An advisory lock doesn't keep other processes from accessing the data; it prevents other processes only from obtaining a lock.
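To see what "advisory" means in practice, the following sketch has one filehandle take an exclusive lock while a second filehandle writes to the same file without ever calling flock. The variable names and the temporary file are assumptions for illustration, and the behavior shown is what you would expect on a Unix-like system.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

my ($tmp_fh, $file) = tempfile(UNLINK => 1);
close($tmp_fh);

# A cooperating program: opens the file and takes an exclusive lock.
open(my $polite, '>>', $file) or die "Cannot open $file: $!";
flock($polite, LOCK_EX) or die "Lock failed: $!";

# An uncooperative program: it never calls flock, so the lock held
# above does nothing to stop it from writing.
open(my $rude, '>>', $file) or die "Cannot open $file: $!";
my $wrote = print $rude "written without asking for the lock\n";
close($rude);

close($polite);   # releases the lock

my $size = (-s $file) || 0;
print "Unlocked writer succeeded; file is now $size bytes\n" if $wrote;
```

The lock only matters to programs that ask for it, which is why every program touching the file has to play along.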

The flock function takes two arguments—a filehandle and a lock type—as you can see in the following syntax:


use Fcntl qw(:flock);

flock(FILEHANDLE, lock_type);


The flock function returns true if the lock is successful; it returns false otherwise. Sometimes calling flock causes your program to pause and wait for other locks to clear (more about this in a moment). The use Fcntl qw(:flock) statement allows you to use symbolic names for the lock_type instead of harder-to-remember numbers.
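Because flock reports success or failure through its return value, it can be checked like open. The sketch below is a minimal illustration: append_line() is a hypothetical helper, and the file is a temporary one created just for the example. It opens in append mode, so opening the file does not change its contents before the lock is held.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

my ($tmp_fh, $logfile) = tempfile(UNLINK => 1);
close($tmp_fh);

# Hypothetical helper: append one line while holding an exclusive lock.
sub append_line {
    my ($file, $line) = @_;
    open(my $fh, '>>', $file) or die "Cannot open $file: $!";
    # flock returns false on failure, so test it like any other call.
    flock($fh, LOCK_EX) or die "Lock failed: $!";
    print $fh "$line\n";
    # Closing writes out the data and releases the lock.
    close($fh) or die "Cannot close $file: $!";
    return 1;
}

append_line($logfile, "first entry");
append_line($logfile, "second entry");
```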

There are two kinds of locks: shared and exclusive. Normally, you get a shared lock when you want to read a file and an exclusive lock when you want to write to it. If a process has an exclusive lock on a file, then that's the only lock there is—no other process can have locks at all. But many processes can have shared locks at the same time as long as there are no exclusive locks. In such cases, it's safe to have many processes reading the file at the same time as long as nobody is writing.

Some possible values for lock_type are as follows:

  • LOCK_SH— This value requests a shared lock on the file. If another process has an exclusive lock on the file, then flock pauses until the exclusive lock is cleared before taking out the shared lock on the file.

  • LOCK_EX— This value requests an exclusive lock on a file opened for writing. If other processes have a lock (either shared or exclusive), then flock pauses until those locks are cleared.

  • LOCK_UN— This value releases a lock, but it is rarely needed; simply closing the file writes out any unwritten data and releases the lock. Releasing the lock on a file that is still open can cause data corruption.
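The interplay of shared and exclusive locks can be observed from one script by opening the same file several times. This sketch rests on two assumptions. First, it combines the lock types with LOCK_NB, another Fcntl constant not covered in the list above, so that flock returns false immediately instead of pausing. Second, it relies on separately opened filehandles behaving like separate processes, which is the case where Perl uses the system's native flock (Linux and the BSDs, for example).

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

my ($tmp_fh, $file) = tempfile(UNLINK => 1);
close($tmp_fh);

# Three independent handles stand in for three separate processes.
open(my $reader1, '<',  $file) or die "Cannot open $file: $!";
open(my $reader2, '<',  $file) or die "Cannot open $file: $!";
open(my $writer,  '>>', $file) or die "Cannot open $file: $!";

# Any number of shared locks can be held at the same time.
my $got_sh1 = flock($reader1, LOCK_SH | LOCK_NB);
my $got_sh2 = flock($reader2, LOCK_SH | LOCK_NB);

# An exclusive lock is refused while shared locks are held.
# (Without LOCK_NB, this call would pause here instead.)
my $got_ex_early = flock($writer, LOCK_EX | LOCK_NB);

# Closing the readers releases their locks; now the request succeeds.
close($reader1);
close($reader2);
my $got_ex_late = flock($writer, LOCK_EX | LOCK_NB);
close($writer);

printf "shared: %s, %s; exclusive while shared held: %s; exclusive afterward: %s\n",
    map { $_ ? "granted" : "refused" }
    $got_sh1, $got_sh2, $got_ex_early, $got_ex_late;
```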

Did you Know?

Locks taken with flock are released when you close the file or when your program exits—even if it exits with an error.


Taking out a lock on a file that you're also trying to read or write can be tricky. Problems arise because opening a filehandle and locking the file is at least a two-step process: You have to have the file open before you can lock it. If you open a file with open(FH, ">filename") and then get a lock with flock, the > has already truncated the file before you obtain the lock. Your program may therefore have emptied the file while some other process held a lock on it.
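The hazard is easy to demonstrate. In this sketch (the variable names, the temporary file, and the sample record are all made up for the example), one filehandle holds an exclusive lock while a second one merely opens the file with >. The file is empty before the second handle has asked for any lock at all.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

my ($tmp_fh, $file) = tempfile(UNLINK => 1);
print $tmp_fh "David 555-1212\n";   # made-up record
close($tmp_fh);
my $size_before = (-s $file) || 0;

# First program: holds an exclusive lock while it works with the file.
open(my $holder, '+<', $file) or die "Cannot open $file: $!";
flock($holder, LOCK_EX) or die "Lock failed: $!";

# Second program: the open alone truncates the file, even though
# this program has not requested (let alone obtained) a lock.
open(my $intruder, '>', $file) or die "Cannot open $file: $!";
my $size_after = (-s $file) || 0;

close($intruder);
close($holder);

printf "Size before: %d bytes; after open with '>': %d bytes\n",
    $size_before, $size_after;
```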

Solving this problem involves something called a semaphore file. A semaphore file is just a sacrificial file whose contents aren't important; whoever holds a lock on that file can proceed.

To use a semaphore file, all you need is a filename that can be used as a semaphore and a couple of functions to lock and unlock the semaphore file, as shown in Listing 15.3. This is not a complete program; it is meant to be included as a part of other programs.

Listing 15.3. General-Purpose Locking Functions

use Fcntl qw(:flock);
# Any file name will do for semaphore.
my $semaphore_file="/tmp/sample.sem";

# Function to lock (waits indefinitely)
sub get_lock {
    open(SEM, ">$semaphore_file")
        || die "Cannot create semaphore: $!";
    flock(SEM, LOCK_EX) || die "Lock failed: $!";
}

# Function to unlock
sub release_lock {
    close(SEM);
}


These locking functions can surround any code that you do not want to be run concurrently, even if that code has nothing to do with reading or writing files. For example, this snippet—even if run by several processes at the same time— allows only one process at a time to print a message:


get_lock();      # waits for a lock.
print "Hello, World!\n";
release_lock();  # Let someone else print now...


The get_lock() and release_lock() functions we have just seen will be used throughout the rest of this book for locking files when locks are needed.
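As a further illustration, the same pair of functions can protect a read-modify-write sequence such as incrementing a counter stored in a file. This is a sketch under stated assumptions: bump_counter() and both file names are hypothetical, and every cooperating program would have to use the same semaphore file name for the lock to mean anything.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Spec;

# Hypothetical file names; all cooperating programs must agree on them.
my $tmpdir         = File::Spec->tmpdir;
my $semaphore_file = File::Spec->catfile($tmpdir, "counter_demo.sem");
my $counter_file   = File::Spec->catfile($tmpdir, "counter_demo.txt");

# Function to lock (waits indefinitely)
sub get_lock {
    open(SEM, ">$semaphore_file")
        || die "Cannot create semaphore: $!";
    flock(SEM, LOCK_EX) || die "Lock failed: $!";
}

# Function to unlock
sub release_lock {
    close(SEM);
}

# Read the counter, add one, and write it back, all under the lock.
sub bump_counter {
    get_lock();
    my $count = 0;
    if (open(my $in, '<', $counter_file)) {
        my $line = <$in>;
        $count = $1 if defined $line && $line =~ /^(\d+)/;
        close($in);
    }
    $count++;
    open(my $out, '>', $counter_file)
        or die "Cannot write $counter_file: $!";
    print $out "$count\n";
    close($out);
    release_lock();
    return $count;
}

my @seen = map { bump_counter() } 1 .. 3;
print "Counter values: @seen\n";
```

The counter continues from whatever value the file already holds, so repeated runs keep counting upward. Opening the counter file with > is safe here because the semaphore lock is already held by the time the file is truncated.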

Watch Out!

Waiting for user input (or any other potentially slow event) while holding a lock is not a good idea. All other programs that need that lock will stop and wait for the lock to be released. You should therefore obtain your lock, run your lock-sensitive code, and then release your lock as quickly as possible.
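One way to follow this advice is to collect the input first and take the lock only for the file update. The sketch below assumes a hypothetical add_record() helper, made-up file names, and a made-up record; an in-memory filehandle stands in for STDIN so that the example runs unattended. A real program would use one fixed semaphore path shared by all of its instances, not a per-run temporary directory.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

my $dir            = tempdir(CLEANUP => 1);
my $semaphore_file = "$dir/demo.sem";    # hypothetical names
my $data_file      = "$dir/phone.txt";

sub get_lock {
    open(SEM, ">$semaphore_file")
        || die "Cannot create semaphore: $!";
    flock(SEM, LOCK_EX) || die "Lock failed: $!";
}

sub release_lock {
    close(SEM);
}

# Hypothetical helper: read one record from $input, then store it.
sub add_record {
    my ($input) = @_;

    # Slow part first: wait for the user with no lock held.
    my $newrecord = <$input>;
    return 0 unless defined $newrecord;
    chomp($newrecord);

    # Fast part: hold the lock only for the file update itself.
    get_lock();
    open(my $out, '>>', $data_file)
        or die "Cannot open $data_file: $!";
    print $out "$newrecord\n";
    close($out);
    release_lock();
    return 1;
}

# An in-memory filehandle stands in for STDIN in this sketch.
my $typed = "Joy 555-0100\n";
open(my $fake_stdin, '<', \$typed) or die "Cannot open in-memory file: $!";
my $added = add_record($fake_stdin);
close($fake_stdin);
```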


Reading and Writing with a Lock

Now is a good time to show the text file database readdata() and writedata() functions being used with file locking. To do this, all you'll need is a semaphore file and the get_lock() and release_lock() subroutines from the previous section.

The first part of Listing 15.4 is the locking code from the previous section.

Listing 15.4. Demonstration of Text File I/O with Locking

1:  #!/usr/bin/perl -w
2:  use strict;
3:  use Fcntl qw(:flock);
4: 
5:  my $semaphore_file="/tmp/list154.sem";
6: 
7:  # Function to lock (waits indefinitely)
8:  sub get_lock {
9:      open(SEM, ">$semaphore_file")
10:          || die "Cannot create semaphore: $!";
11:      flock(SEM, LOCK_EX) || die "Lock failed: $!";
12:  }
13: 
14:  # Function to unlock
15:  sub release_lock {
16:      close(SEM);
17:  }
18: 
19:  sub readdata {
20:      open(PH, "phone.txt") || die "Cannot open phone.txt $!";
21:      my(@DATA)=<PH>;
22:      chomp(@DATA);
23:      close(PH); 
24:      return(@DATA);
25:  }
26:  sub writedata {
27:       my(@DATA)=@_;
28:       open(PH, ">phone.txt") || die "Cannot open phone.txt $!";
29:       foreach(@DATA) {
30:           print PH "$_\n";
31:       }
32:       close(PH);   # The semaphore lock is still held here
33:  }
34:  my @PHONEL;
35: 
36:  get_lock();
37:  @PHONEL=readdata();
38:  push(@PHONEL, "Calvin 555-1012");
39:  writedata(@PHONEL);
40:  release_lock();


Most of Listing 15.4 is code you've already seen. The functions get_lock(), release_lock(), readdata(), and writedata() were all outlined earlier in this hour.

The meat of this program starts at line 34. There, a lock is taken out with get_lock(). The file is then read into @PHONEL with readdata(), the data is manipulated, and then written back out to the same file with writed