Reading and Writing Excel Spreadsheets

Previous Table of Contents Next

Reading and Writing Excel Spreadsheets

If PDF is the lingua franca of display formatting, then Microsoft Excel is the common tongue of spreadsheet data. Microsoft's ubiquitous Office suite means that nearly every office has some way of reading Microsoft Word documents and exchanging Microsoft Excel spreadsheets.

Most spreadsheet programs can accept a format called Comma Separated Values (CSV). This is simply a table where the values are separated by commas. An example of a CSV table is shown below.

CSV table of Academy Award winners.

Year,Best Picture,Gross,Budget,Best Actor,Best Actress,Best Director 1965,My Fair Lady,72000000,17000000,Rex Harrison,Julie Andrews ,George Cukor 1966,The Sound of Music,163214286,8200000,Lee Marvin,Julie Christie,Robert Wise 1964,Lawrence of Arabia,70000000,15000000,Gregory Peck,Anne Bancroft,David Lean 1960,Ben Hur,90000000,15000000,Charlton Heston,Simone Signoret ,William Wyler 1958,The Bridge on the River Kwai,40000000,3000000,Alec Guinness ,Joanne Woodward,David Lean 1971,Patton,61700000,12000000,George C. Scott,Glenda Jackson ,Franklin Schaffner

You could just hand this table as-is to Excel, but the results are not impressive, as shown in Figure 20.7.

Figure 20.7. Your CSV table imported into Excel.

What's wrong? For starters, the columns are smashed together, the movie grosses are not shown as monetary values, and there are no discernable column headers. It looks primitive.

Using Perl to Create a Spreadsheet

Let's make a spiffier spreadsheet with Perl. To follow this example, you'll have to install the module Spreadsheet::WriteExcel. You do not need Microsoft Excel to use the module.

Your spreadsheet should

  • Have column headers highlighted in bold.

  • Show monetary values as money.

  • Resize the column widths so they display properly.

Listing 20.6 contains a program to do just this.

Listing 20.6. Convert CSV to a Native Excel Spreadsheet

1:  #!/usr/bin/perl -w


3:  use strict;

4:  use Spreadsheet::WriteExcel;


6:  my $workbook = new Spreadsheet::WriteExcel("movies.xls");

7:  my $sheet = $workbook->add_worksheet();


9:  my $moneyFormat = $workbook->add_format();

10: $moneyFormat->set_num_format('$#,##0');

11: my $boldFormat = $workbook->add_format();

12: $boldFormat->set_bold();


14: my $row=0;

15: my @longest=(0,0,0,0,0,0,0);

16: open(MOV, "<movies.csv") || die "Can't open movies.csv: $!";

17: while(<MOV>) {

18:      chomp;

19:         my @fields = split(/,/, $_);

20:         for(my $col = 0; $col < @fields; $col++) {

21:              if ($row == 0) {

22:                      $sheet->write($row, $col, $fields[$col], $boldFormat);

23:              } else {

24:                      if ($col == 2 or $col == 3) {

25:                              $sheet->write($row, $col,

26:                                            $fields[$col], $moneyFormat);

27:                      } else {

28:                              $sheet->write($row, $col, $fields[$col]);

29:                      }

30:              }


32:              if ($longest[$col] < length($fields[$col])) {

33:                      $longest[$col] = length($fields[$col]);

34:              }

35:         }

36:      $row++;

37: }

38: $longest[2]+=3;

39: $longest[3]+=3;

40: for(my $i = 0; $i < @longest; $i++) {

41:      $sheet->set_column($i,$i,$longest[$i]);

42: }


44: close(MOV);

45: $workbook->close();

Line 6: The first step to creating a spreadsheet is calling the Spreadsheet::WriteExcel constructor. This will return a workbook object.

Line 7: A new sheet is created using the existing workbook. You can create as many sheets as needed, but this example needs only one sheet.

Lines 9Ц10: You're going to need three cell formats for this spreadsheet: normal, bold, and money. These two lines create a new format for displaying money.

Lines 11Ц12: This creates a bold format for cells. You'll apply these formats as needed.

Line 14: To write into the spreadsheet, you'll need a row and column. $row stores the current row you're writing.

Line 15: To resize the columns properly, you need to keep track of the longest field seen in each column. This array is initialized to zeroes, and as you process each field you'll edit the value to the new maximum length.

Line 17: Read a line from the CSV store into $_.

Line 19: The fields are split apart by commas and stored into @fields.

Line 20: This loop processes each column value (stored in @fields) one at a time. I'm using a for(;;) loop instead of foreach(array) because you'll need the column number for formatting.

Lines 21Ц22: If you're writing the first row of the sheet, write the value into the sheet using the formatting in $boldFormat.

Lines 24Ц28: Otherwise, write using the format in $moneyFormat (if columns #2 or #3) or no formatting at all.

Lines 32Ц34: This stores the length of the longest field for each column in @longest.

Lines 38Ц39: The monetary column lengths are increased a bit. Add the dollar sign, and a couple of commas need to be accounted for.

Lines 40Ц42: Each column width is adjusted to be the length of the longest value in that column.

Line 45: The workbook is closed. This step is important; otherwise you might wind up with a corrupt Excel file.

The resulting pretty spreadsheet is shown in Figure 20.8.

Figure 20.8. Nicely formatted Excel spreadsheet.

Reading the Spreadsheet

Now that you've created a nice Excel spreadsheet, how would you take changes and put them back into the original CSV file?

Reading the Excel spreadsheet is even easier than writing it, especially because you really don't care about formatting. The Spreadsheet::ParseExcel module enables you to read the spreadsheet as a series of rows and columns, either with or without formatting. You do not need Microsoft Excel to use this module.

Listing 20.7 shows an example to read the movies.xls file that you created with Listing 20.6 (see Figure 20.8). This example only prints the spreadsheet as CSV. If you want to write it back into the original file, feel free to make the necessary modifications.

Listing 20.7. Reading an Excel Spreadsheet

1:  #!/usr/bin/perl -w


3:  use strict;

4:  use Spreadsheet::ParseExcel;


6:  my $excelObj = new Spreadsheet::ParseExcel;

7:  my $workbook = $excelObj->Parse('movies.xls');

8:  my $sheet = $workbook->{Worksheet}[0];


10:  my $row = $sheet->{MinRow};

11:  while($row <= $sheet->{MaxRow}) {

12:      my $col = $sheet->{MinCol};

13:      my @rowval = ();

14:      while($col < $sheet->{MaxCol}) {

15:              my $cell = $sheet->{Cells}[$row][$col];

16:              push(@rowval, $cell->{Val});

17:              $col++;

18:      }

19:      print join(',', @rowval), "\n";

20:      $row++;

21: }

Line 6: This creates a parser object that you'll use to read the spreadsheet.

Previous Table of Contents Next
© 2000- NIV