Приглашаем посетить
Лесков (leskov.lit-info.ru)

9.3 Finding Inefficient Code

Previous Table of Contents Next

9.3 Finding Inefficient Code

When an application is taking too long to run, one cure starts with finding out how long each part takes to run. Knowing where to expend your optimization efforts is key; if you happen to pick a piece of code that only consumes 1 percent of the total run time, then no matter how efficient you make it you cannot achieve more than a 1 percent increase in speed (roughly 1.01 percent, for the pedants). Optimizing a piece that consumes 90 percent of the run time can lead to a speed-up of up to a factor of 10. However, you should not rely on being able to determine by inspection alone which piece of code is which. People are notorious for being wrong in such assessments and the only thing that separates the experts from the neophytes is that the experts know that they shouldn't trust their guesses. What they use instead is a profiling tool, and there are two available for Perl: Devel::DProf and Devel::SmallProf. Be aware that both of these modules unavoidably introduce inaccuracies into the times they report. Because they involve running additional code for every subroutine or statement that Perl executes (and because both invoke Perl's internal debugging hooks), profiled programs will take longer to execute than their unaltered forms.

9.3.1 Devel::DProf

Devel::DProf is included in the Perl core; its purpose is to show you the execution profile at the level of subroutines. This means that it will show you the number of times those subroutines are called and the time spent executing them. Like Devel::Coverage and Devel::Cover, it is invoked as a debugger plug-in:


% perl -d:DProf program arguments...

It will log timing information to a file tmon.out that can rapidly get extremely large for even a moderately complicated run; monitor the size of the file from another window and terminate the run when you don't like how large it's become. The filename can be changed by setting the environment variable PERL_DPROF_OUT_FILE_NAME.

The tmon.out file is not for human consumption and you need to run the program dprofpp to generate a report. As with any Perl program that's installed with perl, you can use perldoc to view its documentation.

dprofpp will give good output without any arguments; you can give it several options to change its default behavior. By default it shows only the time spent inside the lexical scope of each subroutine, not the cumulative time spent in that subroutine and all subroutines called from it; but that's what you want anyway.

Regrettably there is no way to pipe the DProf output into dprofpp and avoid creating tmon.out, because DProf seek()s to near the beginning of the file periodically.

9.3.2 Devel::SmallProf

Ted Ashton's Devel::SmallProf is being maintained by Salvador Fandi–o and is available from CPAN (http://search.cpan.org/dist/Devel-SmallProf). Its purpose is similar to Devel::DProf, but it can show results at the finer granularity of execution lines (hence, "small"). Ensuring that you have at most one statement per line will maximize the benefit of using this module, so make sure you have deobfuscated and beautified the code first.

Devel::SmallProf generates a file called smallprof.out. It is not possible to cause the output to go elsewhere, but unlike Devel::DProf, the size of that file is not dependent on the length of run time but instead on the amount of source code, because what it contains is all the source code, annotated with the number of times each line was called and how much wall clock time and CPU time was spent executing it.

The smallprof.out file is ordered by source code line number, which isn't terribly helpful; the point of profiling is to find out what code is taking the longest to run. The commands for sorting the file given in the Devel::SmallProf documentation aren't as helpful as one would like, because smallprof.out contains page headers that clutter the output, and the filenames get separated from the source code lines when sorted.

I therefore wrote a program that will sort either by CPU or wall clock time, and output the filename of the source as well. (The truncated line of original source code that was at the end of each smallprof.out line had to be removed to make room for the filename.) It's called smallprofpp and you'll find it in the Appendix.

Here is an example of the first few lines of output from that program, sorted by wall clock time:


% ./smallprofpp | head -5

    count wall tm  cpu time line

     1140 0.999952 0.150000 /usr/lib/perl5/5.6.1/Exporter.pm:57

      545 0.999952 0.050000 substr2:36

        1 0.999952 0.000000 /usr/lib/perl5/site_perl/5.6.1/LWP/Simple.pm:319

        1 0.238549 0.000000 substr2:9

and here by CPU time:


% ./smallprofpp -c | head -5

    count wall tm  cpu time line

      727 0.000000 0.360000 substr2:33

     1140 0.999952 0.150000 /usr/lib/perl5/5.6.1/Exporter.pm:57

      247 0.000000 0.110000 substr2:43

      397 0.000000 0.090000 /usr/lib/perl5/5.6.1/Exporter.pm:55

Note that there is a significant difference between the lines that took the most wall clock time and the ones that took the most cpu time. Generally, you should investigate the highest CPU time consumers first when seeking to improve performance; the excess of wall clock time over CPU time may be due to I/O or other processes getting attention. Improving time spent doing I/O is a very tricky activity, and even determining whether I/O is the reason for the delay is problematic.

9.3.3 Optimization Tips

If your efforts to improve the speed of a Perl program aren't approaching the target, there are a number of strategies that may get you unstuck:

  • Rewrite parts of it in another language. Having identified a section of code that's taking a large proportion of execution time, if it is small enough, consider rewriting it in, say, C. Note that if the code is just performing text manipulation this might not improve the running speed because Perl works like a Complex Instruction Set Computer (CISC); its operators are implemented by highly optimized compiled C code. You can most easily include your C code by using Brian Ingerson's Inline::C (http://search.cpan.org/dist/Inline/).

  • Rewrite all of it in another language. Some applications are just better off that way. I would not write a windowing system in Perl, for instance, nor a guidance system for a cruise missile. If it has to run as fast as possible, C is a better bet. Beware that many people will say "it has to run as fast as possible" when that isn't really the case, however. They usually have a standard in mind and if you can meet it with Perl, they'll be happy even if you could have shaved some time off using another language.

  • Improve the algorithms. Algorithms are an often-neglected but critical component of the education of any programmer. The right algorithm can make the difference between a program finishing in less than a second or exceeding the lifetime of the universe. See [AHO74], [AHO83], [BENTLEY82], and [ORWANT99] for a start.

    Previous Table of Contents Next