=head1 NAME
XML::Records - Perlish record-oriented interface to XML
=head1 SYNOPSIS
  use XML::Records;
  my $p=XML::Records->new('data.lst');
  $p->set_records('credit','debit');
  my ($t,$r);
  while ( (($t,$r)=$p->get_record()) && $t) {
    my $amt=$r->{Amount};
    if ($t eq 'debit') {
      ...
    }
  }
=head1 DESCRIPTION
XML::Records provides a simple interface for reading "record-structured"
XML documents, that is, documents in which the immediate children of the
root element form a sequence of identical and independent sub-elements such
as log entries, transactions, etc., each of which consists of "field" child
elements or attributes. XML::Records allows you to access each record as a
simple Perl hash.
=head1 METHODS
=over 4
=item $reader=XML::Records->new(source, [options]);
Creates a new reader object.
I<source> is either a reference to a string containing the XML, the name of
a file containing the XML, or an open IO::Handle or filehandle glob
reference from which the XML can be read.
I<options> can be any options accepted by XML::Parser and
XML::Parser::Expat, as well as two module-specific options:
=over 4
=item I<Latin>
If set to a true value, causes Unicode characters in the range 128-255 to
be returned as ISO-Latin-1 characters rather than UTF-8 characters.
=item I<Catalog>
Specifies the URL of a catalog to use for resolving public identifiers and
remapping system identifiers used in document type declarations or external
entity references. This option requires XML::Catalog to be installed.
=back
=item $reader->set_records(name [,name]*);
Specifies the names of the XML element types that enclose records.
=item ($type,$record)=$reader->get_record([name [,name]*]);
Retrieves the next record from the input, skipping through the XML input
until it encounters a start tag for one of the elements that enclose
records. If arguments are given, they will temporarily replace the set of
record-enclosing elements. The method will return a list consisting of the
name of the record's enclosing element and a reference to a hash whose keys
are the names of the record's child elements ("fields") and whose values
are the fields' contents (if called in scalar context, the return value
will be the hash reference). Both elements of the list will be undef if no
record can be found.
If a field's content is plain text, its value will be that text.
If a field's content contains another element (e.g. a <customer> record
contains an <address> field that in turn contains other fields), its value
will be a reference to another hash containing the sub-record's fields.
If a record includes repeated fields, the hash entry for that field's
name will be a reference to an array of field values.
Attributes of record or sub-record elements are treated as if they were
fields. Attributes of field elements are ignored. Mixed content (fields
with both non-whitespace text and sub-elements) will lead to unpredictable
results.
Records do not actually need to be immediately below the document
root. If a <customers> document consists of a sequence of <customer>
elements which in turn contain <address> elements that include further
elements, then calling get_record with the record type set to "address"
will return the contents of each <address> element.
=back
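The mapping rules described for get_record can be illustrated with a hand-written hash. This is only a sketch, not output captured from the module: the <customer> record, its field names, and its values are all hypothetical, and the hash below was written out by applying the rules as stated.

```perl
use strict;
use warnings;

# Hypothetical input record (invented for illustration):
#
#   <customer id="42">
#     <name lang="en">Ann Lee</name>
#     <phone>555-0100</phone>
#     <phone>555-0101</phone>
#     <address><city>Springfield</city><zip>00000</zip></address>
#   </customer>
#
# Hand-written hash following the documented rules; an actual call to
# get_record() is assumed, not shown, to produce a structure of this shape.
my $record = {
    id      => '42',                          # attribute of the record element
    name    => 'Ann Lee',                     # plain text; the field's "lang" attribute is ignored
    phone   => [ '555-0100', '555-0101' ],    # repeated field becomes an array reference
    address => {                              # nested element becomes a sub-record hash
        city => 'Springfield',
        zip  => '00000',
    },
};

print "customer $record->{id}: $record->{name}\n";
print "second phone: $record->{phone}[1]\n";
print "city: $record->{address}{city}\n";
```

Because a field's value may be a string, a hash reference, or an array reference depending on the input, code that consumes records generally needs to check ref() before dereferencing.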
=head1 EXAMPLE
Print a list of package names from a (rather out-of-date) list of XML
modules:
  #!perl -w
  use strict;
  use XML::Records;
  my $p=XML::Records->new('modules.xml') or die "$!";
  $p->set_records('module');
  while (my $record=$p->get_record()) {
    my $pkg=$record->{package};
    if (ref $pkg eq 'ARRAY') {
      for my $subpkg (@$pkg) {
        print $subpkg->{name},"\n";
      }
    }
    else {
      print $pkg->{name},"\n";
    }
  }
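The array-or-single-value branch in the example can be folded into a small helper. The sketch below assumes a hypothetical function named as_list, which is not part of XML::Records, and runs against hand-written stand-ins for records rather than real parser output, so the record contents are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by XML::Records): return a field's
# values as a list whether the field was absent, occurred once, or repeated.
sub as_list {
    my ($value) = @_;
    return ()      unless defined $value;
    return @$value if ref $value eq 'ARRAY';
    return ($value);
}

# Hand-written stand-ins for the hashes get_record() is described as returning.
my @records = (
    { package => { name => 'XML::Parser' } },                                # single field
    { package => [ { name => 'XML::RAX' }, { name => 'XML::Simple' } ] },    # repeated field
    { },                                                                     # field missing
);

my @names;
for my $record (@records) {
    push @names, map { $_->{name} } as_list($record->{package});
}
print "$_\n" for @names;
```

Normalizing to a list at the point of use keeps the loop body the same for all three cases, at the cost of one extra function call per field.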
=head1 RATIONALE
XML::RAX, which implements the proposed RAX standard for record-oriented
XML access, does most of what XML::Records does, but its interface is not
very Perlish (because RAX is a language-independent interface), and it
cannot cope with fields that have sub-structure (because RAX itself does
not address the issue).
XML::Simple can do everything that XML::Records does, at the expense of
reading the entire document into memory. XML::Records will read the entire
document into a single hash if you set the root element as a record type,
but you're really better off using XML::Simple in that case as it's
optimized for such usage.
=head1 AUTHOR
Eric Bohlman (ebohlman@earthlink.net, ebohlman@omsdev.com)
=head1 COPYRIGHT
Copyright 2001 Eric Bohlman. All rights reserved.
This program is free software; you can use/modify/redistribute it under the
same terms as Perl itself.
=head1 SEE ALSO
XML::Parser
XML::RAX
XML::Simple
XML::Catalog
perl(1).
=cut