Dan Brian's XML::SimpleObject starts the tour of object models for XML trees. It takes the structure returned by XML::Parser in tree mode and changes it from a hierarchy of lists into a hierarchy of objects. Each object represents an element and provides methods to access its children. As with XML::Simple, elements are accessed by their names, passed as arguments to the methods.
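To make the "hierarchy of lists" concrete, here is a rough sketch of the kind of structure XML::Parser's Tree style produces, followed by a small walker that prints an outline of it. The structure is written out by hand for a tiny whitespace-free fragment rather than produced by a real parse, and the outline( ) routine is a hypothetical helper that belongs to neither module.

```perl
use strict;
use warnings;

# Hand-written approximation of the Tree-style result for:
#   <ancestor><name>Glook</name><children>
#   <ancestor><name>Glug</name></ancestor></children></ancestor>
# (written without whitespace between tags, so there are no blank text nodes).
# Each element is a (tag, content) pair. The content is an array reference
# whose first item is the attribute hash; text uses the pseudo-tag 0.
my $tree = [
    'ancestor', [ {},
        'name',     [ {}, 0, 'Glook' ],
        'children', [ {},
            'ancestor', [ {},
                'name', [ {}, 0, 'Glug' ],
            ],
        ],
    ],
];

# Hypothetical helper: walk the (tag, content) pairs and build an outline.
sub outline {
    my ( $pairs, $depth ) = @_;
    my @lines;
    for ( my $i = 0; $i < @$pairs; $i += 2 ) {
        my ( $tag, $content ) = @$pairs[ $i, $i + 1 ];
        if ( $tag eq '0' ) {
            # a text node: the content is the string itself
            push @lines, ( '  ' x $depth ) . qq{"$content"};
        }
        else {
            push @lines, ( '  ' x $depth ) . "<$tag>";
            # skip the attribute hash at index 0, then recurse
            my @rest = @$content[ 1 .. $#$content ];
            push @lines, outline( \@rest, $depth + 1 );
        }
    }
    return @lines;
}

my @outline = outline( $tree, 0 );
print "$_\n" for @outline;
```

Picking through index arithmetic like this is exactly the chore that an object layer is meant to hide.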
Let's see how useful this module is. Example 6-5 is a silly datafile representing a genealogical tree. We're going to write a program to parse this file into an object tree and then traverse the tree to print out a text description.
    <ancestry>
      <ancestor><name>Glook the Magnificent</name>
        <children>
          <ancestor><name>Glimshaw the Brave</name></ancestor>
          <ancestor><name>Gelbar the Strong</name></ancestor>
          <ancestor><name>Glurko the Healthy</name>
            <children>
              <ancestor><name>Glurff the Sturdy</name></ancestor>
              <ancestor><name>Glug the Strange</name>
                <children>
                  <ancestor><name>Blug the Insane</name></ancestor>
                  <ancestor><name>Flug the Disturbed</name></ancestor>
                </children>
              </ancestor>
            </children>
          </ancestor>
        </children>
      </ancestor>
    </ancestry>
Example 6-6 is our program. It starts by parsing the file with XML::Parser in tree mode and passing the result to an XML::SimpleObject constructor. Next, we write a routine begat( ) to traverse the tree and output text recursively. At each ancestor, it prints the name. If there are progeny, which we find out by testing whether the child method returns a non-undef value, it descends the tree to process them too.
    use XML::Parser;
    use XML::SimpleObject;

    # parse the data file and build a tree object
    my $file = shift @ARGV;
    my $parser = XML::Parser->new( ErrorContext => 2, Style => "Tree" );
    my $tree = XML::SimpleObject->new( $parser->parsefile( $file ));

    # output a text description
    print "My ancestry starts with ";
    begat( $tree->child( 'ancestry' )->child( 'ancestor' ), '' );

    # describe a generation of ancestry
    sub begat {
        my( $anc, $indent ) = @_;

        # output the ancestor's name
        print $indent . $anc->child( 'name' )->value;

        # if there are children, recurse over them
        if( $anc->child( 'children' ) and $anc->child( 'children' )->children ) {
            print " who begat...\n";
            my @children = $anc->child( 'children' )->children;
            foreach my $child ( @children ) {
                begat( $child, $indent . ' ' );
            }
        } else {
            print "\n";
        }
    }
To prove it works, here's the output. In the program, we added indentation to show the descent through generations:
    My ancestry starts with Glook the Magnificent who begat...
     Glimshaw the Brave
     Gelbar the Strong
     Glurko the Healthy who begat...
      Glurff the Sturdy
      Glug the Strange who begat...
       Blug the Insane
       Flug the Disturbed
We used several different methods to access data in objects. child( ) returns a reference to an XML::SimpleObject object that represents a child of the source node. children( ) returns a list of such references. value( ) looks for a character data node inside the source node and returns its text as a scalar. Passing an argument to child( ) or children( ) restricts the search to nodes with a matching name. For example, child( 'name' ) selects the <name> element from among a set of children. If the search fails, the method returns undef.
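The behavior of these three methods can be imitated in a few lines of plain Perl. The sketch below is only an illustration under stated assumptions: MiniNode is a hypothetical stand-in class, not XML::SimpleObject's actual source, and the choice to have child( ) return the first match when several siblings share a name is an assumption of this sketch. It builds its nodes from a hand-written Tree-style content array, so it runs without either module installed.

```perl
use strict;
use warnings;

# MiniNode is a hypothetical stand-in, not XML::SimpleObject's real code.
package MiniNode;

# Build a node from one element's Tree-style content array:
#   [ \%attributes, tag, content, tag, content, ... ]
sub new {
    my ( $class, $content ) = @_;
    my $self = bless { text => '', kids => [] }, $class;
    for ( my $i = 1; $i < @$content; $i += 2 ) {
        my ( $tag, $value ) = @$content[ $i, $i + 1 ];
        if ( $tag eq '0' ) {
            $self->{text} .= $value;    # character data
        }
        else {
            push @{ $self->{kids} }, [ $tag, $class->new($value) ];
        }
    }
    return $self;
}

# All children, or only those with the given element name.
sub children {
    my ( $self, $name ) = @_;
    my @hits = grep { !defined $name or $_->[0] eq $name } @{ $self->{kids} };
    return map { $_->[1] } @hits;
}

# First matching child (an assumption of this sketch), or undef if none.
sub child {
    my ( $self, $name ) = @_;
    my @hits = $self->children($name);
    return @hits ? $hits[0] : undef;
}

# The node's character data as a plain scalar.
sub value { return $_[0]{text} }

package main;

# Content array for a single <ancestor> element with two offspring.
my $root = MiniNode->new( [ {},
    'name',     [ {}, 0, 'Glug the Strange' ],
    'children', [ {},
        'ancestor', [ {}, 'name', [ {}, 0, 'Blug the Insane' ] ],
        'ancestor', [ {}, 'name', [ {}, 0, 'Flug the Disturbed' ] ],
    ],
] );

my @kids  = $root->child('children')->children('ancestor');
my $first = $root->child('children')->child('ancestor');

print $root->child('name')->value, " has ", scalar(@kids), " children\n";
print "first: ", $first->child('name')->value, "\n";
print "no <spouse> element\n" unless defined $root->child('spouse');
```

Because a failed lookup yields undef, chained calls such as child( 'children' )->children are only safe after checking the first link, which is why begat( ) tests child( 'children' ) before descending.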
This is a good start, but as its name suggests, the module may be a little too simple for some applications. There are limited ways to access nodes, mostly by getting a child or a list of children. Accessing a single element by name breaks down when several siblings share that name: child( ) cannot say which one you mean, so you have to fetch the whole list with children( ) and pick through it yourself.
Unfortunately, this module's objects lack a way to get XML back out, so outputting a document from this structure is not easy. Still, if simplicity is what you want, this module is an easy OO solution to learn and use.
Copyright © 2002 O'Reilly & Associates. All rights reserved.