Section 4.4.  Database Abstraction

Table of Contents

4.4. Database Abstraction

Tangram has given us a way to store and retrieve objects in a database. The other side of the coin is the situation of having an existing database and wanting to get a view of it in terms of Perl objects. This is a very subtle distinction, but an important one. In the case of Tangram (and indeed, Pixie), we didn't really care what the database schema was, because the database was just an incidental way for Tangram to store its stuff. It could create whatever tables and columns it wanted; what we really care about is what the objects look like. In the current case, though, we already have the database; we have a defined schema, and we want the database abstraction tool to work around that and tell us what the objects should look like.

There are several good reasons why you might want to do this. For many people, database abstraction is attractive purely because it avoids having to deal with SQL or the relatively tedious process of interacting with the DBI; but there's a more fundamental reason.

When we fetch some data from the database, in the ordinary DBI model, it then becomes divorced from its original database context. It is no longer live data. We have a hash reference or array reference of datawhen we change elements in that reference, nothing changes in the database at all. We need a separate step to put our changes back. This isn't the paradigm we're used to programming in. We want our data to do something, and data that do something are usually called objectswe want to treat our database rows as objects, with data accessors, instantiation and deletion methods, and so on. We want to map between relational databases and objects, and this is called, naturally, object relational mapping.


SQLite (http://www.hwaci.com/sw/sqlite/) is a self-contained relational database that works on a simple file in the filesystem, and it's getting ever more sophisticated. It's also incredibly fast. Instead of having a separate database daemon that listens for and responds to queries, SQLite takes the DBM approach of providing a C library that acts on the data directly. If you install the DBD::SQLite module from CPAN, you'll have everything you need to use relational databases without the hassle of installing one of the bigger database engines:

    use DBI;
    my $dbh = DBI->connect("dbi:SQLite:dbname=music.db");
    $dbh->do("CREATE TABLE cds ( ... )");

4.4.1. Trivial Mapping

We'll demonstrate some of the principles of an object-relational mapper by creating a very, very simple object-relational mapper that is read-onlyit doesn't allow us to make changes to the database. Then we'll show how to add this functionality, and look at Class::DBI, a very similar mapper that does it all for us.

Before I heard of Class::DBI, I actually implemented something like this in production code. The basic idea looks like this:

    package CD::DBI;
    our $dbh = DBI->connect("dbd:mysql:music");

    sub select {
        my ($class, $sql, @params) = @_;
        my $sth = $dbh->prepare($sql);

        my @objects;
        while (my $obj = $sth->fetchrow_hashref(  )) {
            push @objects, (bless $obj, $class);

    package CD;
    use base 'CD::DBI';

    package CD::Artist;
    use base 'CD::DBI';

    package main;

    my @cds = CD->select("SELECT * FROM cd");

fetchrow_hashref is a very useful DBI method that returns each row as a hash:

        id => 180,
        title => "Inside Out",
        artist => 105,
        publishdate => "1983-03-14"

This looks rather like our CD objects, so we simply bless this into the right class, and all the accessors work as normal. This is actually very close to what we want. There are two things we can improve: artist now returns an ID instead of a CD::Artist object and any changes we make don't get written back to the database.

So, to deal with the first problem, we can modify the artist accessor like so:

    package CD;
    sub artist {
        my $self = shift;
        my ($artist) = CD::Artist->select(
            "SELECT * FROM artist WHERE id = ?",
        return $artist;

This time, we retrieve an individual record from the artist table and bless it into the CD::Artist class. We can write similar accessors for other relationships. For instance, to get all the tracks belonging to a specific CD:

    sub tracks {
        my $self = shift;
        CD::Track->select("SELECT * FROM track WHERE cd = ?",

To make this whole system read-write instead of read-only, we need to update our accessors again, something like this:

    package CD;
    sub title {
        my ($self, $title) = @_;
        if ($title) {
            $CD::DBI::dbh->do("UPDATE cd SET title = ? WHERE id = ?",
                              undef, $title, $self->{id});

But here we're writing a lot of code; the purpose of using automated accessor generators was to avoid going through all this rigmarole. Perhaps there should be a module that generates database-aware accessors . . . .

4.4.2. Class::DBI

By far my favorite of the object-relational mapping modules is Michael Schwern and Tony Bowden's Class::DBI. It is very easy to learn and to set up, highly extensible, and supported by a wide range of auxilliary modules. It is also, not entirely coincidentally, rather like the simple mapper we just created. To set it up, we subclass Class::DBI to create a driver class specific to our database:

    package CD::DBI;
    use base 'Class::DBI';
    _ _PACKAGE_ _->connection("dbi:mysql:musicdb");

We do this so that when we implement the table classes, they all know where they're connecting to. Now let's take the first table, the artist table:

    package CD::Artist;
    use base 'CD::DBI';
    _ _PACKAGE_ _->table("artist");
    _ _PACKAGE_ _->columns(All => qw/artistid name popularity/);

Here we're using our own CD::Artist class and the other classes we will generate, instead of the classes we wrote in the earlier chapter. The interface will be just the same as our original CD::Artist, because Class::DBI uses the same Class::Accessor way of creating accessors.

It also adds a few more methods to the CD::Artist class to help us search for and retrieve database rows:

    my $waits = CD::Artist->search(name => "Tom Waits")->first;
    print $waits->artistid; # 859
    print $waits->popularity; # 634

    my $previous = CD::Artist->retrieve(858);
    print $previous->name; # Tom Petty and the Heartbreakers

    # So how many Toms are there?

    my $toms = CD::Artist->search_like(name => "Tom %")->count;
    print $toms; # 6

    for my $artist ( CD::Artist->retrieve_all ) {
        print $artist->name, ": ", $artist->popularity, "\n";

We can also create a new artist by passing in a hash reference of attributes:

    $buff = CD::Artist->create({
       name => "Buffalo Springfield",
       popularity => 10

Class::DBI automatically creates data accessors for each of the columns of the table; we can update columns in the database by passing arguments to the accessors. Here's a program that uses Mac::AppleScript to ask iTunes for the currently playing artist, and then increments the artist's popularity:

    use Mac::AppleScript qw(RunAppleScript;
    my $current = RunAppleScript(<<AS);
      tell application "iTunes"
        artist of current track
      end tell

    my $artist = CD::Artist->find_or_create({ name => $current });
    $artist->popularity( $artist->popularity(  ) + 1 );

This uses find_or_create to first search for the name, then retrieve the existing row if there is one, or create a new one otherwise. Then we increment the popularitynormally we'd think about race conditions when updating a database like this, but in this case, we know that nothing else is going to be updating the library when the script is run. We explicitly update the row in the table with a call to update. I dislike doing this, so I often tell Class::DBI to do it automatically with autoupdate:

    package CD::Artist
    use base 'MusicDB::DBI';
    _ _PACKAGE_ _->table("artist");
    _ _PACKAGE_ _->columns(All => qw/artistid name popularity/);
    _ _PACKAGE_ _->autoupdate(1);

Now we can dispense with the update callsupdates to accessors are instantly reflected in the database.

Class::DBI often wants me to set up things by hand that the computer should be able to do for me. For instance, I feel I shouldn't have to specify the columns in the table. Thankfully, there are numerous database-specific extensions for Class::DBI on CPAN that know how to interrograte the database for this information:

    package CD::DBI;
    use base 'Class::DBI::mysql';
    _ _PACKAGE_ _->connection("dbi:mysql:musicdb");

    _ _PACKAGE_ _->autoupdate(1);

    package CD::Artist;
    use base 'CD::DBI';
    _ _PACKAGE_ _->set_up_table("artist");

This uses the mysql extension to query the database for the columns in the table.

Once we've set up all our tables, we can start declaring the relationships between them. Relationships

Class::DBI supports several types of database relationships. The two most common are has_a and has_many. It also allows you to use or write plug-in modules to declare other relationship types.

The diagram in Figure 4-1 illustrates the difference between has_a and has_many.

Figure 4-1. has_a versus has_many

We've already seen the use of a has_a relationship between CDs and artistseach CD has_a artist. We've also already written some code to implement a nice Perlish interface to it: when we ask a CD object for its artist, it takes the artist's primary key, finds the row in the artist table with that ID, and returns the appropriate object. However, in Class::DBI, instead of writing our own accessor, we just declare the relationship:

    CD->has_a(artist => "CD::Artist");
    CD::Track->has_a(song => "CD::Song");
    # ...

The nice thing about this is that we can also declare relationships to classes that are not Class::DBI based but that follow the same general pattern: find the column in the database, do something to it, and turn it into an object. For instance, the publishdate column needs to be turned into a Time::Piece object:

    CD->has_a(publishdate => 'Time::Piece',
                  inflate => sub { Time::Piece->strptime(shift, "%Y-%m-%d") },
                  deflate => 'ymd',

As before, we relate a column to a class, but we also specify a subroutine that goes from the data in the database to an object, and a method to go the other way, to serialize the object back into the database.

A has_many relationship is also easy to set up; instead of writing the tracks accessor as we did before, we ask Class::DBI to do it for us:

    CD->has_many(tracks => "CD::Track");

Now, for instance, to dump all the tracks in the database, we can say:

    for my $cd (CD->retrieve_all) {
        print "CD: ".$cd->title."\n";
        print "Artist: ".$cd->artist->name."\n";
        for my $track ($cd->tracks) {
            print "\t".$track->song->name."\n";
        print "\n\n";

For more complex relationships, such as the way an artist is either a person or a group, we can use a plug-in relationship like Class::DBI::Relationship::IsA:

    use Class::DBI::Relationship::IsA;
    CD::Artist->is_a(person       => 'CD::Person');
    CD::Artist->is_a(artistgroup  => 'CD::Artistgroup');

The is_a relationship does the right thing: it inherits the accessors of the class that we're referring to. If we ask a CD::Artist for haircolor, it transforms this into a call to $artist->person->haircolor.

Plug-in relationships for Class::DBI are a relatively new concept, and there are not many on CPAN at the moment. HasVariant allows you to use one column to inflate to more than one kind of object; so, for instance, you could have your $cd->artist return a CD::Person or CD::Artistgroup directly depending on the data in the column. There's also HasManyOrdered, which is similar to has_many but allows you to specify how the results should be returned; we should, for instance, ensure that the tracks returned by $cd->tracks are returned in their track number on the CD. Class::DBI extensions

The other great thing about Class::DBI is that there are so many additional modules that make it easier to use. For instance, in the same way that Class::DBI::mysql asked the database for its rows, you can set up all your classes at once by asking the database for its tables as well. The Class::DBI::Loader module does just this:

    my $loader = Class::DBI::Loader->new(
        dsn => "dbd:mysql:music",
        namespace => "MusicDB"

With our database, this will set up classes called MusicDB::CD, MusicDB::Artist, and so on. All we need to do is set up the reltionships between the classes.

For very simple relationships, Class::DBI::Loader::Relationship can help set these up as well:

    $loader->relationship("a cd has an artist");
    $loader->relationship("a cd has tracks");
    # ...

There's also Class::DBI::DATA::Schema to define database tables from schemas placed in the DATA section of a class, Class::DBI::Plugin::RetrieveAll adds the functionality to easily do a SELECT * with various ordering and restrictions, and we'll meet a few more plug-in classes later in the chapter.

4.4.3. Other Contenders

I've just demonstrated Class::DBI here, but there are many more object-relational mapping tools on CPAN. I believe that Class::DBI has the cleanest and the simplest interface, which makes it ideal for demonstrating the principles of object-relational mapping, but there are those who would contend that this simplicity limits what it can do. Some of the other tools available make different trade-offs between complexity and power.

For instance, one limitation of Class::DBI is the difficulty of creating complex multitable joins that are executed in one SQL statement, letting the database do the work. Class::DBI leaves it to programmers to do this kind of work in Perl or build their own abstracted SQL using Class::DBI hooks and extensions. On the other hand, something like DBIx::SearchBuilder excels at constructing SQL in advance. SearchBuilder is the foundation of the Request Tracker problem tracking system, perhaps one of the most widely deployed and complex enterprise Perl applications; so SearchBuilder is clearly up to the job.

Other modules you should know about include SPOPS and Alzabo, both mature and fully featured relational mappers. There's also interesting work going on in Class::PINT to apply Tangram-style object persistence on top of Class::DBI.

    Table of Contents
    © 2000- NIV