|Previous||Table of Contents||Next|
9.3 Finding Inefficient Code
When an application is taking too long to run, one cure starts with finding out how long each part takes to run. Knowing where to expend your optimization efforts is key; if you happen to pick a piece of code that only consumes 1 percent of the total run time, then no matter how efficient you make it you cannot achieve more than a 1 percent increase in speed (roughly 1.01 percent, for the pedants). Optimizing a piece that consumes 90 percent of the run time can lead to a speed-up of up to a factor of 10. However, you should not rely on being able to determine by inspection alone which piece of code is which. People are notorious for being wrong in such assessments and the only thing that separates the experts from the neophytes is that the experts know that they shouldn't trust their guesses. What they use instead is a profiling tool, and there are two available for Perl: Devel::DProf and Devel::SmallProf. Be aware that both of these modules unavoidably introduce inaccuracies into the times they report. Because they involve running additional code for every subroutine or statement that Perl executes (and because both invoke Perl's internal debugging hooks), profiled programs will take longer to execute than their unaltered forms.
Devel::DProf is included in the Perl core; its purpose is to show you the execution profile at the level of subroutines. This means that it will show you the number of times those subroutines are called and the time spent executing them. Like Devel::Coverage and Devel::Cover, it is invoked as a debugger plug-in:
% perl -d:DProf program arguments...
It will log timing information to a file tmon.out that can rapidly get extremely large for even a moderately complicated run; monitor the size of the file from another window and terminate the run when you don't like how large it's become. The filename can be changed by setting the environment variable PERL_DPROF_OUT_FILE_NAME.
The tmon.out file is not for human consumption and you need to run the program dprofpp to generate a report. As with any Perl program that's installed with perl, you can use perldoc to view its documentation.
dprofpp will give good output without any arguments; you can give it several options to change its default behavior. By default it shows only the time spent inside the lexical scope of each subroutine, not the cumulative time spent in that subroutine and all subroutines called from it; but that's what you want anyway.
Regrettably there is no way to pipe the DProf output into dprofpp and avoid creating tmon.out, because DProf seek()s to near the beginning of the file periodically.
Ted Ashton's Devel::SmallProf is being maintained by Salvador Fandi–o and is available from CPAN (http://search.cpan.org/dist/Devel-SmallProf). Its purpose is similar to Devel::DProf, but it can show results at the finer granularity of execution lines (hence, "small"). Ensuring that you have at most one statement per line will maximize the benefit of using this module, so make sure you have deobfuscated and beautified the code first.
Devel::SmallProf generates a file called smallprof.out. It is not possible to cause the output to go elsewhere, but unlike Devel::DProf, the size of that file is not dependent on the length of run time but instead on the amount of source code, because what it contains is all the source code, annotated with the number of times each line was called and how much wall clock time and CPU time was spent executing it.
The smallprof.out file is ordered by source code line number, which isn't terribly helpful; the point of profiling is to find out what code is taking the longest to run. The commands for sorting the file given in the Devel::SmallProf documentation aren't as helpful as one would like, because smallprof.out contains page headers that clutter the output, and the filenames get separated from the source code lines when sorted.
I therefore wrote a program that will sort either by CPU or wall clock time, and output the filename of the source as well. (The truncated line of original source code that was at the end of each smallprof.out line had to be removed to make room for the filename.) It's called smallprofpp and you'll find it in the Appendix.
Here is an example of the first few lines of output from that program, sorted by wall clock time:
% ./smallprofpp | head -5 count wall tm cpu time line 1140 0.999952 0.150000 /usr/lib/perl5/5.6.1/Exporter.pm:57 545 0.999952 0.050000 substr2:36 1 0.999952 0.000000 /usr/lib/perl5/site_perl/5.6.1/LWP/Simple.pm:319 1 0.238549 0.000000 substr2:9
and here by CPU time:
% ./smallprofpp -c | head -5 count wall tm cpu time line 727 0.000000 0.360000 substr2:33 1140 0.999952 0.150000 /usr/lib/perl5/5.6.1/Exporter.pm:57 247 0.000000 0.110000 substr2:43 397 0.000000 0.090000 /usr/lib/perl5/5.6.1/Exporter.pm:55
Note that there is a significant difference between the lines that took the most wall clock time and the ones that took the most cpu time. Generally, you should investigate the highest CPU time consumers first when seeking to improve performance; the excess of wall clock time over CPU time may be due to I/O or other processes getting attention. Improving time spent doing I/O is a very tricky activity, and even determining whether I/O is the reason for the delay is problematic.
9.3.3 Optimization Tips
If your efforts to improve the speed of a Perl program aren't approaching the target, there are a number of strategies that may get you unstuck:
|Previous||Table of Contents||Next|