9.4 Debugging

Debugging is just an extreme case of dynamic analysis. Third-party code can be extremely convoluted (so can your own code, of course, but you don't usually think of it that way because you're familiar with it; you knew it when it was just a subroutine); sometimes you just can't tell how part of the code fits in, or whether it's called at all. The code is laid out in some arrangement that makes no sense; if only you could see where the program would actually go when it was run.

Well, you can, using Perl's built-in debugger. Even though you're not actually trying to find a bug, the code-tracing ability of the debugger is perfect for the job.

This isn't the place for a full treatment of the debugger (you can see more detail in [SCOTT01]), but fortunately you don't need a full treatment; a subset of the commands is enough for what you need to do. (Using the debugger is like getting in a fight; it's usually over very quickly without using many of the fancy moves you trained for.)

You invoke the debugger with the -d command-line flag; either edit the program to add -d to the shebang line, or run the program by invoking Perl explicitly:


% perl -d program argument argument...

Make sure that the perl in your path is the same one in the shebang line of program or you'll go crazy if there are differences between the two perls.

The commands in Table 9-1 will account for nearly all of your use of the debugger:

Table 9-1. Basic Debugger Commands
Command
Description
h / h h
Brief / verbose help (verbose / brief help prior to 5.8.0)
b subroutine
Set breakpoint at first executable statement of subroutine
b line
Set breakpoint for line line
b place condition
Set breakpoint for place (either line or subroutine) but trigger it only when the Perl expression condition is true
C
Continue until end of program or breakpoint
c line
Continue until line line, end of program, or earlier breakpoint
x expression
Examine the value of a variable or expression
N
Execute current statement, skipping over any subroutines called from it
S
Execute next Perl statement, going into a subroutine called from the current statement if necessary
L
List source code from current line
R
Execute statements until return from current subroutine, end of program, or earlier breakpoint
T
Display stack trace
Q
Quit

Armed with these commands, we can go code spelunking. Suppose you are debugging a program containing the following code fragment:


77 for my $url (@url_queue)

78 {

79   my $res = $ua->request($url);

80   summarize($res->content);

81 }

and you know that whenever the program gets to the URL http://www.perlmedic.com/fnord.html something strange happens in the summarize() subroutine. You'd like to check the HTTP::Response object to see if there were any redirects you didn't know about. You start the program under the debugger and type:


DB<1> b 80 $url =~ /fnord/

DB<2> c

The program will run until it has fetched the URL you're interested in, at which point you can examine the response object—here's an example of what it might look like:


DB<3> x $res

0  HTTP::Response=HASH(0x70f340)

   '_content' => '\cF\cN\cO\cR\cD'

   '_headers' => HTTP::Headers=HASH(0x70f790)

      'cache-control' => 'private,max-age=60'

      'client-date' => 'Tue, 01 Jul 2003 01:21:55 GMT'

      'client-peer' => '204.95.83.7:80'

      'client-response-num' => 1

      'connection' => 'close'

      'content-type' => ARRAY(0x23f470)

         0  'text/html'

         1  'text/html; charset=iso-8859-1'

      'date' => 'Tue, 01 Jul 2003 01:21:54 GMT'

      'expires' => 'Tue, 01 Jul 2003 01:22:54 GMT'

      'last-modified' => 'Tue, 01 Jul 2003 01:21:54 GMT'

      'refresh' => 1800

      'server' => 'Apache/1.3.27 (UNIX)'

      'title' => 'This page intentionally left blank'

   '_msg' => 'OK... sort of...'

   '_previous' => HTTP::Response=HASH(0x6af660)

      '_content' => ''

      '_headers' => HTTP::Headers=HASH(0x6af66c)

         'client-date' => 'Tue, 01 Jul 2003 01:21:49 GMT'

         'client-peer' => '204.95.83.7:80'

         'client-response-num' => 1

         'connection' => 'close'

         'content-length' => 0

         'content-type' => 'text/html'

         'date' => 'Tue, 01 Jul 2003 01:21:48 GMT'

         'location' => 'http://www.perlmedic.com/fnord/'

         'server' => 'Apache/1.3.27 (UNIX)'

      '_msg' => 'Moved Temporarily'

      '_protocol' => 'HTTP/1.1'

      '_rc' => 302

      '_request' => HTTP::Request=HASH(0x4a60d0)

         '_content' => ''

         '_headers' => HTTP::Headers=HASH(0x4a60a0)

            'user-agent' => 'libwww-perl/5.69'

         '_method' => 'GET'

         '_uri' => URI::http=SCALAR(0x27e668)

            -> 'http://www.perlmedic.com/fnord.html'

   '_protocol' => 'HTTP/1.1'

   '_rc' => 299

   '_request' => HTTP::Request=HASH(0x623e40)

      '_content' => ''

      '_headers' => HTTP::Headers=HASH(0x68e6f4)

         'user-agent' => 'libwww-perl/5.69'

      '_method' => 'GET'

      '_uri' => URI::http=SCALAR(0x6e55e8)

         -> 'http://www.perlmedic.com/fnord/'

You can see from this that there was indeed a redirect, and find out exactly what was returned. (You can gain a fine control over this procedure for web page fetching with the LWP::Debug module, but the approach I have illustrated will work for any type of object.) Then you can see what happens to the content in summarize() by stepping into the routine:


DB<4> s

main::summarize(/home/peter/bin/url_summarizer):177:

    my $content = shift;

Find out more about how to use the debugger in perldebtut, perldebug, or [SCOTT01].

There are a few special tricks you can use to enhance your use of the debugger. For instance, suppose you're getting a warning, but it's emerging from deep in code that you didn't write and you can't tell where; you'd like the debugger to breakpoint at that place so you can display a stack trace and find out how you got there.

Perl 5.8.0 and later will give you a stack trace anyway if you run a program under the debugger and some code triggers a warning. But suppose you are either running under an earlier perl, or you'd really like to have a debugger prompt at the point the warning was about to happen.

You can combine two advanced features of Perl to do this: pseudo-signal handlers, and programmatic debugger control.

A signal handler is a subroutine you can tell Perl to execute whenever your program receives a signal. For instance, when the user interrupts your program by pressing Control-C, that works by sending an INT signal to your program, which interprets it by default as an instruction to stop executing.

There are two pseudo-signals, called __WARN__ and __DIE__. They aren't real signals, but Perl "generates" them whenever it's told to issue a warning or to die, respectively. You can supply code to be run in those events by inserting a subroutine reference in the %SIG hash (see perlvar) as follows:


$SIG{__WARN__} = sub { print "Ouch, I'm bad" };

(Try it on some code that generates a warning.)

The next piece of the solution is that the debugger can be controlled from within your program; the variable $single in the special package DB determines what Perl does at each statement: 0 means keep going, and 1 or 2 mean give a user prompt.^[1] So setting $DB::single to 1 in a pseudo-signal handler will give us a debugger prompt at just the point we wanted.

^[1] The difference between the two values is that a 1 causes the debugger to act as though the last n or s command the user typed was s, whereas a 2 is equivalent to an n. When you type an empty command in the debugger (just hit Return), it repeats whatever the last n or s command was.

Putting the pieces together, you can start running the program under the debugger and give the commands:


DB<1> $SIG{__WARN__} = sub { warn @_; $DB::single = 1 }

DB<2> c

Now the program will breakpoint where it was about to issue a warning, and you can issue a T command to see a stack trace, examine data, or do anything else you want.^[2] The warning is still printed first.

^[2] Under some circumstances, the breakpoint might not occur at the actual place of warning: The current routine might return if the statement triggering the warning is the last one being executed in that routine.

Unfortunately, no __DIE__ pseudo-signal handler will return control to the debugger (evidently death is considered too pressing an engagement to be interrupted). However, you can get a stack trace by calling the confess() function in the Carp module:


DB<1> use Carp

DB<2> $SIG{__DIE__} = sub { confess (@_) }

The output will look something like this:


DB<3> c

Insufficient privilege to launch preemptive strike at wargames line 

109.

        main::__ANON__[(eval 17)[/usr/lib/perl5/5.6.1/

perl5db.pl:1521]:2]('Insufficient privilege to launch preemptive 

strike at wargames line 109.^J') called at wargames line 121

        main::preemptive('Strike=HASH(0x82069d4)') called at wargames 

line 109

        main::make_strike('ICBM=HASH(0x820692c)') called at wargames 

line 74

        main::icbm('Silo_ND') called at wargames line 32

        main::wmd('ICBM') called at wargames line 22

        main::strike() called at wargames line 11

        main::menu() called at wargames line 5

Debugged program terminated.  Use q to quit or R to restart,

  use O inhibit_exit to avoid stopping after program termination,

  h q, h R or h O to get additional info.

I've often found it amusing that the debugger refers to the program at this point as "debugged."

9.4.1 Debugging Regular Expressions

Not for the fainthearted, if you want to see how a regular expression runs when used in a match or substitution, use the core re pragma with its debug option:


% perl -Mstrict -Mwarnings

use re qw(debug);

$_ = "cats=purr, dog=bark"; 

my %sound = /(\w+)=(\w+)/g;

^D

Compiling REx `(\w+)=(\w+)'

size 15 first at 4

   1: OPEN1(3)

   3:   PLUS(5)

   4:     ALNUM(0)

   5: CLOSE1(7)

   7: EXACT <=>(9)

   9: OPEN2(11)

  11:   PLUS(13)

  12:     ALNUM(0)

  13: CLOSE2(15)

  15: END(0)

floating `=' at 1..2147483647 (checking floating) stclass `ALNUM' plus 

minlen 3 

Guessing start of match, REx `(\w+)=(\w+)' against `cats=purr, 

dog=bark'...

Found floating substr `=' at offset 4...

Does not contradict STCLASS...

Guessed: match at offset 0

Matching REx `(\w+)=(\w+)' against `cats=purr, dog=bark'

  Setting an EVAL scope, savestack=3

   0 <> <cats=purr, d>    |  1:  OPEN1

   0 <> <cats=purr, d>    |  3:  PLUS

                           ALNUM can match 4 times out of 32767...

  Setting an EVAL scope, savestack=3

   4 <cats> <=purr, d>    |  5:    CLOSE1

   4 <cats> <=purr, d>    |  7:    EXACT <=>

   5 <cats=> <purr, d>    |  9:    OPEN2

   5 <cats=> <purr, d>    | 11:    PLUS

                           ALNUM can match 4 times out of 32767...

  Setting an EVAL scope, savestack=3

   9 <=purr> <, dog=b>    | 13:      CLOSE2

   9 <=purr> <, dog=b>    | 15:      END

Match successful!

Guessing start of match, REx `(\w+)=(\w+)' against `, dog=bark'...

Found floating substr `=' at offset 5...

By STCLASS: moving 0 --> 2

Guessed: match at offset 2

Matching REx `(\w+)=(\w+)' against `dog=bark'

  Setting an EVAL scope, savestack=3

  11 <urr, > <dog=bar>    |  1:  OPEN1

  11 <urr, > <dog=bar>    |  3:  PLUS

                           ALNUM can match 3 times out of 32767...

  Setting an EVAL scope, savestack=3

  14 <rr, dog> <=bark>    |  5:    CLOSE1

  14 <rr, dog> <=bark>    |  7:    EXACT <=>

  15 <rr, dog=> <bark>    |  9:    OPEN2

  15 <rr, dog=> <bark>    | 11:    PLUS

                           ALNUM can match 4 times out of 32767...

  Setting an EVAL scope, savestack=3

  19 <rr, dog=bark> <>    | 13:      CLOSE2

  19 <rr, dog=bark> <>    | 15:      END

Match successful!

Freeing REx: `(\w+)=(\w+)'

If you use the debugcolor option instead of debug, you'll get some form of highlighting or coloring in the output that'll make it prettier, if not more understandable.

Table of Contents