9.1 Static Analysis

You'd like to know more about the program you're working on. But it's so long and boring that reading it all is likely to cause premature senility. Fortunately there are some tools that can help you get a handle on your morass of Perl source code.

If you're looking for an integrated development environment (IDE) for Perl, the bad news is that there aren't many; not nearly as many as for languages like Java and C++. One very good reason is that static analysis of Perl is rather unsatisfying for an IDE developer, because so much of the environment of a Perl program is not determined until run time (late binding). Considering that Perl programs can not only create subroutines at run time, but also import new classes and even modify the inheritance hierarchy while they are running, it's not surprising that IDE developers don't find Perl attractive. A class browser would have to be capable of updating itself at run time in order to be reliably accurate, and that implies a positively incestuous relationship between the IDE and the Perl run-time system. Most Perl IDEs bundle a syntax-highlighting editor, an interface to the Perl debugger, and perhaps a browser for Perl documentation. What you'll find most of the top developers doing instead is getting really good at an editor like vi or emacs that lets them switch quickly between multiple buffers and shell command modes.

The good news is that large projects can be developed in Perl without the need for an IDE, as long as the programmers agree to use the same layout style and code to a particular level of sophistication (see Section 4.2.2). If you want an IDE, there are a number of free ones and a very capable commercial tool called Visual Perl from ActiveState Corporation (see Figure 9.1).

Figure 9.1. ActiveState's Visual Perl IDE.

Now I'll list some of the modules that can help tell you things about a program you're trying to analyze.

9.1.1 YAPE::Regex::Explain

Jeff Pinyan wrote this module, which provides plain English explanations of regular expressions. If you've got some legacy code that employs regexes beyond your current comprehension, YAPE::Regex::Explain can give you a useful head start on understanding them: pass each regex to the new() method and call explain() on the resulting object. Here's an example:


% perl -Mstrict -Mwarnings
use YAPE::Regex::Explain;
my $regex = qr/\G([\w-]+).+?\b/;
my $xplnr = YAPE::Regex::Explain->new($regex);
print $xplnr->explain;
^D
The regular expression:

(?-imsx:\G([\w-]+).+?\b)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  \G                       where the last m//g left off
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [\w-]+                   any character of: word characters (a-z,
                             A-Z, 0-9, _), '-' (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  .+?                      any character except \n (1 or more times
                           (matching the least amount possible))
----------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

Note that we need to feed a regex object or a string to the new() method, but either way, if we need to specify regex modifiers, they have to be embedded in the regex using the (?xxx-xxx:...) notation. This leads to a somewhat esoteric discussion (level 7) that you only need heed if you're really interested in how to explain modifiers. Suppose the regex in the original code appears as follows:


while ($line =~ /\G([\w-]+).+?\b/csig)

then to get YAPE::Regex::Explain to explain the modifiers, they have to be embedded like this:


my $regex = qr/(?si:\G([\w-]+).+?\b)/;

Why did I leave the c and g modifiers out? Because they only affect the operation of the m// operator and are not part of the regular expression itself. See perlre for more information on (?xxx-xxx:...), perlop for more information on m//, and especially [FRIEDL02] for the definitive treatment of regular expressions.
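To see what an embedded modifier actually changes, here is a minimal sketch that uses only core Perl and does not call YAPE::Regex::Explain. The helper first_capture() and the sample text are made up for illustration. It runs the regex above with and without the embedded (?si:...) group against a string containing a newline, which is where the s modifier makes a visible difference.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: match $re at the start of $text the way the
# original loop does (with /gc) and return what the first group captured.
sub first_capture {
    my ($re, $text) = @_;
    return $text =~ /$re/gc ? $1 : undef;
}

my $plain    = qr/\G([\w-]+).+?\b/;        # no modifiers
my $embedded = qr/(?si:\G([\w-]+).+?\b)/;  # s and i embedded in the regex

my $text = "Foo-Bar\nbaz";                 # sample input, not from the book

# Without s, "." cannot match the newline, so the first group has to
# give back one character to let ".+?" match something.
print "plain:    ", first_capture($plain, $text), "\n";     # Foo-Ba
print "embedded: ", first_capture($embedded, $text), "\n";  # Foo-Bar
```

The g and c modifiers appear only on the match operator inside the helper, which is consistent with their being properties of the match rather than of the regex.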

9.1.2 Benchmark::Timer

This module by Andrew Ho is like a stopwatch for your program. You can time the duration of specific sections of code and get a report on how much time your program spent in them and how many times it executed them. You label each section with a tag in a call to the start() method, and then call the stop() method when the code you're measuring is done. Multiple sections of code (or the same section of code executed multiple times) can receive the same tag, in which case their times are lumped together. Here's an example program that performs some relatively time-consuming operations (sending GET and HEAD requests for a web page):

Example 9.1. Demonstration of Benchmark::Timer

#!/usr/bin/perl
use strict;
use warnings;

use Benchmark::Timer;
use LWP::Simple;

my $timer = Benchmark::Timer->new;

my $url = "http://www.perlmedic.com/";

$timer->start('head');
head($url);
$timer->stop();

$timer->start('get');
get($url);
$timer->stop();

$timer->start('head');
head($url);
$timer->stop();

$timer->report;

When we run this program, we see:

% ./timertest
2 trials of head (1.192s total), 596.092ms/trial
1 trial of get (177.013ms total)

demonstrating that for some counterintuitive reason, LWP::Simple's head() function takes longer than its get() function.
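The tagging idea is easy to picture with a few lines of core Perl. What follows is only a sketch of the concept using Time::HiRes, not Benchmark::Timer's implementation or API; the names start_section(), stop_section(), and section_report() are invented, and short sleeps stand in for the network requests.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

my (%total, %count);        # elapsed seconds and number of trials per tag
my ($current, $started);    # the section being timed right now

# Hypothetical helpers, loosely mirroring start() and stop().
sub start_section { ($current, $started) = ($_[0], time()) }

sub stop_section {
    $total{$current} += time() - $started;
    $count{$current}++;
}

# One line per tag, with the trials that shared a tag lumped together.
sub section_report {
    return map {
        sprintf("%d trial(s) of %s (%.3fs total)", $count{$_}, $_, $total{$_})
    } sort keys %total;
}

# Stand-ins for the head, get, head sequence in Example 9.1.
for my $tag (qw(head get head)) {
    start_section($tag);
    sleep(0.01);
    stop_section();
}

print "$_\n" for section_report();
```

Because the totals are keyed by tag, the two 'head' sections end up in a single line of the report, which is the lumping behavior described above.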

9.1.3 Benchmark::TimeTick

Here's a simple little module along similar lines that I wrote for this book. Its purpose is to output a list showing exactly how long your program took to reach each of a series of points you mark with a call to its timetick() method. Its source code is in the Appendix; here is its documentation:

NAME

Benchmark::TimeTick - Keep a tally of times at different places in your program

SYNOPSIS


use Benchmark::TimeTick qw(timetick);
# Your code...
timetick("Starting phase three");
# More of your code...

DESCRIPTION

Benchmark::TimeTick provides a quick and convenient way of instrumenting a program to find out how long it took to reach various points. Just use the module and call the timetick() method whenever you want to mark the time at a point in your program. When the program ends, a report will be output giving the times at which each point was reached.

The times will be recorded using Time::HiRes::time() if Time::HiRes is available, otherwise time() will be used. (Because time() has one-second granularity this is unlikely to be useful.)

CONFIGURATION

You can customize the action of Benchmark::TimeTick via the package hash %Benchmark::TimeTick::Opt. Recognized keys are:

suppress_initial If true, do not put an initial entry in the report when the module is loaded.

suppress_final If true, do not put a final entry in the report when the program terminates.

reset_start If true, report all times relative to the time that Benchmark::TimeTick was loaded rather than the actual start of the program.

format_tick_tag If set, should be a reference to a subroutine that will take as input a $tag passed to timetick() and return the actual tag to be used. Can be helpful for applying a lengthy transformation to every tag while keeping the calling code short.

format_report If set, should be a reference to a subroutine that will take as input a list of time ticks for reporting. Each list element will be a reference to an array containing the time and the tag, respectively. The default format_report callback is:


sub { printf("%7.4f %s\n", @$_) for @_ }

suppress_report If true, do not output a report when report() is called; just reset the time tick list instead.

For either suppress_initial or reset_start to be effective, they must be set in BEGIN blocks before this module is used.

METHODS

Benchmark::TimeTick::timetick($tag) Record the time at this point of the program and label it with the string $tag.

Benchmark::TimeTick::report() Output a report (unless suppress_report is set) and reset the time tick list.

Benchmark::TimeTick::end() Add a final time tick (unless suppress_final is set), and output a report. Called by default when the program finishes.

OPTIONAL EXPORTS

Benchmark::TimeTick::timetick() will be exported on demand. This is recommended for the sake of brevity.

EXAMPLE


BEGIN { $Benchmark::TimeTick::Opt{suppress_initial} = 1 }
use Benchmark::TimeTick qw(timetick);
# ... time passes
timetick("Phase 2");
# ... more time passes, program ends

Output from Benchmark::TimeTick:

0.7524 Phase 2
0.8328 Timeticker for testprog finishing
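As an illustration of the format_report option described under CONFIGURATION, here is a sketch of a callback that also shows the gap since the previous tick. It relies only on the documented calling convention (a list of references to arrays holding the time and the tag). The name delta_report() and the sample ticks are invented, and the callback is called by hand so the sketch runs without the module installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A format_report-style callback: it receives a list of [time, tag]
# pairs, prints one line per tick, and returns the lines it printed.
sub delta_report {
    my @lines;
    my $previous;
    for my $tick (@_) {
        my ($time, $tag) = @$tick;
        my $gap = defined $previous ? $time - $previous : 0;
        push @lines, sprintf("%7.4f (+%.4f) %s", $time, $gap, $tag);
        $previous = $time;
    }
    print "$_\n" for @lines;
    return @lines;
}

# To install it, following the CONFIGURATION section:
#   $Benchmark::TimeTick::Opt{format_report} = \&delta_report;

# Called directly here with invented ticks.
delta_report([0.7524, 'Phase 2'], [0.8328, 'Finishing']);
```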

9.1.4 Debug::FaultAutoBT

Stas Bekman wrote this module, which intercepts signals that would otherwise cause a core dump, and attempts to extract and dump a gdb backtrace instead. Use this if you are trying to debug some XS code.

9.1.5 Devel::LeakTrace

This module by Richard Clamp requires Perl 5.6.0 or later, and will report on any storage allocated after the program started but not freed by the time the program went through its END blocks. If it finds any, they are probably due to cyclic references, in which case the cure is Tuomas J. Lukka's WeakRef module (http://search.cpan.org/dist/WeakRef/).
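The kind of leak described here, and the weak-reference cure, can be sketched with core Perl alone. This example uses weaken() from Scalar::Util rather than the WeakRef module named above, and it counts DESTROY calls instead of running Devel::LeakTrace. The Node class and the make_pair() helper are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;    # how many Node objects have been freed so far

{
    package Node;     # hypothetical class, for illustration only
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { $destroyed++ }
}

# Build a parent and a child that refer to each other, let the lexicals
# go out of scope, and return the number of objects that were freed.
sub make_pair {
    my ($weak_backlink) = @_;
    my $before = $destroyed;
    {
        my $parent = Node->new('parent');
        my $child  = Node->new('child');
        $parent->{child} = $child;
        $child->{parent} = $parent;                  # completes the cycle
        weaken($child->{parent}) if $weak_backlink;  # back-link no longer counts
    }
    return $destroyed - $before;
}

print "strong cycle freed:   ", make_pair(0), "\n";   # 0: both objects leak
print "weak back-link freed: ", make_pair(1), "\n";   # 2: both are reclaimed
```

Weakening only the back-link keeps the parent in charge of the child's lifetime while still letting the child find its parent.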

9.1.6 Module::Info

Mattia Barbon maintains this module created by Michael Schwern; it scans the source of a module to provide information about it. Some of that information may not be completely reliable, just because of how infernally difficult it is to parse Perl. It can tell you things like the list of package declarations in the module, the subroutines defined in it, and the subroutines called by it.

9.1.7 Class::Inspector

Adam Kennedy wrote this module, which provides convenience methods for finding out about loaded modules by reflection: their source filenames, subroutines, and methods.
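To give a feel for the kind of reflection involved, here is a sketch in core Perl that looks up a loaded module's source file in %INC and lists the subroutines in its symbol table. It is not Class::Inspector's API or implementation; source_file() and subs_in() are invented names, and File::Basename is just a convenient core module to inspect.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename ();    # any loaded module will do as a subject

# Source file of a loaded module, looked up in %INC.
sub source_file {
    my ($module) = @_;
    (my $key = "$module.pm") =~ s{::}{/}g;
    return $INC{$key};
}

# Names of the subroutines present in a package's symbol table
# (this includes any the package imported from elsewhere).
sub subs_in {
    my ($package) = @_;
    no strict 'refs';
    return sort grep { defined &{"${package}::$_"} } keys %{"${package}::"};
}

print "File::Basename was loaded from ", source_file('File::Basename'), "\n";
print "It defines: ", join(', ', subs_in('File::Basename')), "\n";
```

Because this reads the symbol table at run time, it sees only what has been loaded and defined so far, which is the same late-binding caveat raised at the start of this section.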

9.1.8 B::Xref

This module by Malcolm Beattie was added to the Perl core in version 5.005. It will generate a report that can help you track where variables and subroutines are defined and referenced. Because it's a compiler back end like B::Deparse, it doesn't actually run your code. Invoke it via the O front end:


% perl -MO=Xref program arguments...

and what will appear is the report, showing variables and subroutines and the lines they appeared on, broken down by file.

    Previous Table of Contents Next