4.10 Evolution

Imagine a very adaptable organism of indeterminate species responding to changes in its environment. Temperature rising? Evolve a thicker skin. Gravity increasing? Add more legs. Water table rising? Grow fins.

Similarly, our code changes as various demands upon it increase: size of input data, number of concurrent users, and so on. You can think of those demands as happening along different dimensions; in other words, they can vary independently of each other, and any given program may be subject to one or more types of demand.[12]

[12] This is the "ontogeny recapitulates phylogeny" argument applied to software. I've always wanted an excuse to use that phrase in a book.

How far along each of those dimensions you are able to evolve your code before it succumbs to the pressures of change depends on your skills; the further along each dimension you go, the more skillful you need to be.

When you're rewriting legacy code, it may help to bump it along these dimensions one step at a time until you get it to where you want. Space doesn't permit me to go into much detail, so I'll just cover some steps of some dimensions broadly.

4.10.1 Modularity

  1. Flat scripting: One monolithic block of code.

  2. Subroutines: Abstracting common functionality; procedural programming.

  3. Using objects, making libraries of reusable subroutines.

  4. Subclassing existing modules.

  5. Creating application-specific object-oriented modules.

  6. Using tying and overloading (see Section 10.2).
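
As a rough illustration of the last step, here is a minimal sketch of a tied scalar. The UpperCase class and the $shout variable are names I made up for the example; the TIESCALAR, FETCH, and STORE methods are the standard hooks that tie looks for.

```perl
use strict;
use warnings;

# Hypothetical class: a scalar that upper-cases whatever is stored in it.
package UpperCase;

# Called by tie(); returns the object that stands behind the variable.
sub TIESCALAR {
    my ($class, $initial) = @_;
    my $value = defined $initial ? uc($initial) : undef;
    return bless \$value, $class;
}

# Called whenever the tied variable is read.
sub FETCH {
    my ($self) = @_;
    return $$self;
}

# Called whenever the tied variable is assigned to.
sub STORE {
    my ($self, $new) = @_;
    $$self = uc($new);
    return;
}

package main;

tie my $shout, 'UpperCase', 'hello';
print "$shout\n";    # prints "HELLO"
$shout = 'goodbye';
print "$shout\n";    # prints "GOODBYE"
```

The calling code sees an ordinary scalar; all the behavior lives in the class, which is what makes this a step up in modularity.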

4.10.2 Input Interfaces—Complexity

  1. Hand parsing of @ARGV.

  2. Getopt::Std.

  3. Getopt::Long.

  4. Configuration files: see Simon Cozens' module Config::Auto (http://search.cpan.org/dist/Config-Auto/).
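
To give a feel for the Getopt::Long step, here is a minimal sketch. The option names (--verbose, --size, --name) and the parse_args subroutine are hypothetical; GetOptionsFromArray is used instead of GetOptions only so that the parser can be called on any list, not just @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse a list of arguments into a hash of settings.
# The option names here are illustrative only.
sub parse_args {
    my @args = @_;
    my %opt = (verbose => 0, size => 10, name => []);
    GetOptionsFromArray(
        \@args,
        'verbose!' => \$opt{verbose},    # --verbose or --noverbose
        'size=i'   => \$opt{size},       # requires an integer value
        'name=s@'  => $opt{name},        # may be given more than once
    ) or die "Usage: $0 [--verbose] [--size N] [--name STR ...] [file ...]\n";
    $opt{files} = [@args];               # whatever was not an option
    return \%opt;
}
```

A script would typically call this as parse_args(@ARGV) and then consult $opt->{size} and friends instead of picking @ARGV apart by hand.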

4.10.3 Error Handling—Application Size

  1. No error handling at all—"program and pray."

  2. Checking return codes from built-in functions.

  3. Writing subroutines to return distinct error conditions.

  4. Exception handling: using die() and eval() for throwing and catching exceptions.

  5. Throwing objects as exceptions, either with die(), or using a module such as Dave Rolsky's Exception::Class (http://search.cpan.org/dist/Exception-Class/). Graham Barr's Error.pm, maintained by Arun Kumar U (http://search.cpan.org/dist/Error/), provides nice try and catch keywords, but has a memory leak triggered by nested closures. Such usage is rare, but if you don't want to worry about whether you've unwittingly created a nested closure, don't use it.
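
The last two steps can be sketched with nothing but core Perl. The My::Exception class, its fields, and the risky and try_risky subroutines are all hypothetical names for illustration; a module like Exception::Class would generate the class for you.

```perl
use strict;
use warnings;

# Hypothetical exception class; the name and fields are illustrative only.
package My::Exception;

sub new {
    my ($class, %args) = @_;
    return bless { message => $args{message}, code => $args{code} }, $class;
}
sub throw   { my $class = shift; die $class->new(@_) }
sub message { $_[0]{message} }
sub code    { $_[0]{code} }

package main;

use Scalar::Util qw(blessed);

# Hypothetical operation that fails by throwing an object.
sub risky {
    my ($n) = @_;
    My::Exception->throw(message => "negative input: $n", code => 22)
        if $n < 0;
    return sqrt($n);
}

# Call risky() and turn the outcome into a string.
sub try_risky {
    my ($n) = @_;
    my $result;
    my $ok = eval { $result = risky($n); 1 };
    unless ($ok) {
        my $err = $@;
        if (blessed($err) && $err->isa('My::Exception')) {
            return 'caught: ' . $err->message . ' (code ' . $err->code . ')';
        }
        die $err;    # not one of ours, so rethrow it
    }
    return "ok: $result";
}

print try_risky(16), "\n";    # prints "ok: 4"
print try_risky(-1), "\n";    # prints "caught: negative input: -1 (code 22)"
```

Testing the return value of eval, rather than the truth of $@ afterward, is a defensive habit; it avoids trouble when something clears $@ between the failure and the test.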

4.10.4 Logging—Volume and Granularity

  1. What's "logging"? Wait for something to go wrong, or for someone to need evidence of past behavior, then panic.

  2. Ad-hoc print() and warn() statements.

  3. Ad-hoc print() statements directed to specific files, possibly with hand-rolled timestamping, process/host/user identification, and log file rotation.

  4. Mike Schilli's Log::Log4perl (CPAN and http://log4perl.sourceforge.net) is based on Java's Log4j utility, and provides an object-oriented logging capability with severity levels and completely configurable multiplexing of messages to all kinds of destinations. See Section 10.1.4.

  5. Mark Pfeiffer's Log::Dispatch::FileRotate (http://search.cpan.org/dist/Log-Dispatch-FileRotate/) automates log file rotation and provides file locking into the bargain. Can be plugged into Log::Log4perl.
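
For comparison, here is roughly what the hand-rolled third step looks like, using only core modules. The subroutine names and the line format are my own invention for this sketch; Log::Log4perl replaces all of this with configuration.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Sys::Hostname qw(hostname);

# Build one log line: timestamp, host, process ID, severity, message.
# The layout is an arbitrary choice for this sketch.
sub format_log_line {
    my ($level, $message, $time) = @_;
    $time = time unless defined $time;
    my $stamp = strftime('%Y-%m-%d %H:%M:%S', localtime($time));
    return sprintf "%s %s[%d] %s: %s\n",
        $stamp, hostname(), $$, uc($level), $message;
}

# Append one formatted line to the named log file.
sub log_to_file {
    my ($file, $level, $message) = @_;
    open my $fh, '>>', $file or die "Cannot append to $file: $!";
    print {$fh} format_log_line($level, $message);
    close $fh or die "Cannot close $file: $!";
    return;
}
```

Note what is still missing: severity filtering, rotation, and locking. Each of those is another chunk of code to write and debug, which is the argument for moving to the later steps.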

4.10.5 External Information Representation—Extensibility

  1. External data represented in plain text, likely implicitly assumed to be in the ISO Latin-1 encoding, or (especially in some locales) Unicode UTF-8, but with ad-hoc syntax.

  2. Various serialization and configuration file formats, such as those written by the core modules Data::Dumper and Storable, and by CPAN modules for more specific applications such as Config::IniFiles, Config::Auto, DBD::CSV, and so on.

  3. When you want a structured markup format that's easy to read, get Brian Ingerson's YAML (originally Yet Another Markup Language; now, hopping on the already overcrowded recursive acronym bandwagon, YAML Ain't Markup Language) from http://search.cpan.org/dist/YAML/. A number of Perl modules use it as their data interchange format, such as Ingerson's CGI::Kwiki, which can be used for building Wiki sites [LEUF01].[13]

    [13] As a digression, the need for a human-readable markup format was echoed in the compact format of the RELAX-NG specification for XML schemas (http://www.oasis-open.org/committees/relax-ng/compact-20020607.html). Evidently even with so many tools that understand XML, people still want to get their eyes on the raw characters.

  4. For maximum interoperability, XML (see Section 8.3.7). YAML is extraordinarily capable, but when you need to speak to a third party, they're more likely to understand XML than YAML.
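
As a small illustration of the second step, here is a sketch of saving and restoring a data structure with the core Data::Dumper module. The freeze_config and thaw_config names are hypothetical, and the sketch assumes the text being read back was written by your own program.

```perl
use strict;
use warnings;
use Data::Dumper;

# Serialize a data structure as Perl source text of the form "$VAR1 = ...;".
sub freeze_config {
    my ($data) = @_;
    local $Data::Dumper::Indent   = 1;    # compact indentation
    local $Data::Dumper::Sortkeys = 1;    # stable key order
    return Dumper($data);
}

# Read the structure back. A string eval runs arbitrary code, so use this
# only on text you wrote yourself.
sub thaw_config {
    my ($text) = @_;
    my $VAR1;                             # the name Dumper's output assigns to
    eval $text;
    die "Cannot parse config: $@" if $@;
    return $VAR1;
}
```

The text that freeze_config produces is readable and editable by hand, but only Perl can load it; that limitation is what pushes you toward YAML or XML in the later steps.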

4.10.6 External Data Storage—Complexity, Concurrency, and Volume

  1. Plain files—no locking, no structured data.

  2. DBM files written by one of the AnyDBM_File modules (perldoc AnyDBM_File).

  3. DBM files with home-grown locking to avoid concurrent access race conditions.[14]

    [14] http://www.nightflight.com/foldoc-bin/foldoc.cgi?race+condition

  4. Accessing database servers with Tim Bunce's DBI.pm (see Section 8.3.6).

  5. Integrating object classes with databases: Tangram (http://search.cpan.org/dist/Tangram/), Alzabo (http://search.cpan.org/dist/Alzabo/), and so on.
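
Here is a minimal sketch of the third step: a DBM file guarded by a home-grown lock. I have used SDBM_File because it ships with every Perl; the increment_counter name and the convention of locking a separate ".lock" file are assumptions made for this example.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT);
use SDBM_File;

# Increment a named counter in a DBM file while holding an exclusive lock.
# The lock is taken on a separate "$dbfile.lock" file, a convention chosen
# for this sketch, so it is held before the DBM files are even opened.
sub increment_counter {
    my ($dbfile, $key) = @_;

    open my $lock, '>>', "$dbfile.lock"
        or die "Cannot open $dbfile.lock: $!";
    flock($lock, LOCK_EX) or die "Cannot lock $dbfile.lock: $!";

    tie(my %db, 'SDBM_File', $dbfile, O_RDWR | O_CREAT, 0666)
        or die "Cannot tie $dbfile: $!";
    my $count = ++$db{$key};
    untie %db;

    close $lock or die "Cannot close $dbfile.lock: $!";   # releases the lock
    return $count;
}
```

Without the lock, two processes could both read the same value and both write back the same increment. Getting details like this right for every access path is exactly the labor a database server saves you.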

4.10.7 Web Server Processing—Scalability

  1. Processing form inputs and generating content for a browser by hand.

  2. Using Lincoln Stein's CGI.pm to do either or both.

  3. Using Sam Tregar's HTML::Template (http://search.cpan.org/dist/HTML-Template/) or Andy Wardley's Template Toolkit (http://search.cpan.org/dist/Template-Toolkit/) to separate the output appearance from the code.

  4. Using mod_perl on an Apache server for improved performance (see [BEKMAN03]).

  5. Using Dave Rolsky's HTML::Mason (http://search.cpan.org/dist/HTML-Mason/) for advanced templating under mod_perl.

  6. Integrating XML with Matt Sergeant's AxKit (http://search.cpan.org/dist/AxKit/) or extending into vertical applications too numerous to mention.

4.10.8 Event Handling—Application Complexity and Interoperability

  1. Punt. Event handling is not for the fainthearted.

  2. Using Joshua N. Pritikin's Event.pm (http://search.cpan.org/dist/Event/).

  3. Using Rocco Caputo's Perl Object Environment (POE). See CPAN and http://poe.perl.org/. POE can wrap an event loop around just about any kind and number of asynchronous activities, usually without even needing to start a new thread or process. See also Uri Guttman's Stem (http://www.stemsystems.com and CPAN).

4.10.9 Client/Server or Interprocess Communication—Interoperability

  1. Again, punt—beginning programmers don't try this.

  2. Primitive mechanisms, such as leaving messages in files for other processes to find. These usually have race conditions and other bugs.

  3. Using fork(). Handcrafted use of signals, pipes, and semaphores.

  4. Using higher-level protocols for interprocess communication; see, for example, Graham Barr's Net::Cmd (in the Perl core as of version 5.8.0).

  5. Interoperating with other applications using standard remote procedure call (RPC) mechanisms such as XML-RPC [STLAURENT01] and SOAP (both covered in Paul Kulchenko's distribution at http://search.cpan.org/dist/SOAP-Lite/).
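
To show what the handcrafted third step involves, here is a minimal sketch of a parent reading a child's output through a pipe. The run_in_child name is hypothetical, and the sketch assumes a platform with a real fork().

```perl
use strict;
use warnings;
use POSIX ();

# Run a code ref in a child process and return, as a string, whatever
# the code ref returned there. Assumes a platform with a real fork().
sub run_in_child {
    my ($code) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                     # child
        close $reader;
        print {$writer} $code->();
        close $writer;                   # flushes the data into the pipe
        POSIX::_exit(0);                 # skip the parent's END blocks
    }

    close $writer;                       # parent: must close its copy to see EOF
    my $output = do { local $/; <$reader> };
    close $reader;
    waitpid($pid, 0);                    # reap the child
    return $output;
}
```

Even this tiny example has three ways to go wrong: forgetting to close the unused pipe ends, forgetting to reap the child, and letting the child run the parent's cleanup code. Higher-level protocols earn their keep by hiding such details.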

The dimension of robustness deserves space all to itself, and I'll get to it in Section 10.1.
