NAME
ODF::MailMerge - "Mail Merge" or just substitute tokens in ODF documents
SYNOPSIS
use ODF::lpOD;
use ODF::lpOD_Helper;
use ODF::MailMerge qw/replace_tokens/;
my $doc = odf_get_document("/path/to/file.odt");
my $body = $doc->get_body;
# Simple replacement of '{who}', '{last words}' and '{zzz}'
# everywhere in the document.
my $hash = {
who => "John Brown",
'last words' => [
[color => "#50FFEE", "bold"],
" I deny everything but...the design on my part to free the slaves."
],
zzz => \&callback,
};
replace_tokens($body, $hash);
# Mail-merge:
# 1. Find the prototype table containing the token "{mmproto}".
# 2. Replace tokens in that table using data from a spreadsheet,
# replicating the table as many times as necessary for all rows.
#
my $engine = ODF::MailMerge::Engine->new($body, "{mmproto}");
use Spreadsheet::Edit qw/read_spreadsheet apply %crow/;
read_spreadsheet "/path/to/data.xlsx!Sheet1";
apply {
$engine->add_record(\%crow); # %crow is a tied hash to current row
};
$engine->finish();
$doc->save(target => "/path/to/output.odt");
DESCRIPTION
This tool uses ODF::lpOD and ODF::lpOD_Helper to patch ODF documents. Token strings of the form "{key}" or "{key:modifiers...}" are replaced with values from a hash indexed by "key".
Optional :modifiers within tokens can change the value actually substituted or have side-effects such as removing lines when there is no value to substitute.
A "mail merge" function replicates a template object (e.g. table or section) as many times as needed to plug in values from multiple data records.
THE PARADIGM
First, manually create a prototype ODF document using e.g. LibreOffice, containing static content and {tokens} to be interpolated, formatted as desired. To use "mail merge", create a table or other ODF construct which represents a single entry or record, with {token}s where data values should be plugged in.
Substituted values will have the same formatting as the tokens which were replaced. This is quite powerful.
For example, to generate a multi-column "member directory", create a prototype table with tokens like {Name}, {Address}, etc. using any desired styles; place that table in a Section with the desired number of columns.
When processed, the table will be cloned and appended within it's Section, flowing into successive columns and new pages as needed. The prototype table's properties can be set to prevent breaking entries at column/page boundaries, and control borders, inter-entry spacing, etc.
SIMPLE SUBSTITUTION
$count = replace_tokens($context, $hash);
This function replaces tokens without using the mail-merge mechanism.
$context is the document body or any descendant; $hash maps token names to replacement values as described at "TOKEN REPLACEMENT".
All instances of tokens in $context are replaced if their names exist in %$hash. Token names not in %$hash are left as-is unless the hash contains a '*' wildcard entry.
MAIL MERGE OVERVIEW
The essential "mail merge" capabilities are:
A template of some kind specifies how to display data from one database record, with db field references where field values should be plugged in. That template is copied as many times as there are database records, plugging in specific values for the field references.
One or more fields may have *no* value in a particular record, in which case the containing row, paragraph etc.can be deleted to avoid leaving undesirable blank space. For example a mailing list may allow a secondary addressee line which most of the time is not used.
One or more fields may have *multiple* values. In that case part of the containing row, paragraph, etc. can be replicated to accommodate extra values for the same field. For example a personnel directory may allow each person to have any number of telephone numbers.
MAIL MERGE API
ODF::MailMerge does not care where the data comes from, as long as you can provide a hash table which maps token names to values for a particular record.
The example in the SYNOPSIS reads a spreadsheet using Spreadsheet::Edit, which provides just such a hash via the tied variable "%crow" (current row); this hash maps column titles (among other things) to data values in the row being visited by 'apply'. Therefore tokens {Name} and {Address} would be replaced by appropriate values from the "Name" and "Address" columns.
$engine = ODF::MailMerge::Engine->new($context, $proto_tag);
Create a new mail-merge engine which will replicate the protototype table containing $proto_tag. Currently the proto object must be a table but sections and other ODF text wrappers may be supported later.
$context is usually the document body e.g. $doc->get_body.
$proto_tag is a tag used to locate the prototype object within $context. The tag may appear anywhere within the object and will be deleted (and so has no effect on the final result).
$engine->add_record($hashref);
The prototype object is first cloned and appended to any previous copies.
Then all {key} or {key:modifier...} strings in the clone are replaced by looking up "key" in the specified hash as described at "TOKEN REPLACEMENT" below. An exception occurs if an unhandled token is found.
$engine->finish();
This must be called after the last add_record to clean up. It deletes the prototype, leaving behind only the clones with instantiated values.
TOKEN REPLACEMENT
In the hash you provide, keys are token names without the curly brackets or :modifiers. For example, the key "First Name" would be used for token "{First Name}" or "{First Name:...}" .
The hash key '*' is a wildcard, used if there is no entry for a token name.
Token names may contain internal spaces but leading and trailing spaces around the name (but not inside :modifiers) are ignored. Literal : { or } characters must be backslashed i.e. \: \{ or \}.
A hash value may be:
* "string" - a replacement value string
* [[Style info], "string", ...] - a styled replacement value
* [list of possibly-multiple replacement values]
* CODE ref - a callback (see "CALLBACKS")
[Styled content] values
See ODF::lpOD_Helper for details. In brief, these are refs to arrays containing [style spec] sub-arrays and plain strings, where a [style spec] describes a local style to be applied to the immediately following text string. As used here, the first item must be a [style spec] sub-array.
For example [[color => "red", "bold"], "John Brown"] means substitute "John Brown" in red, bold text, overriding the style of the {token}. Multiple pairs describe adjacent but differently-styled segments.
Styled values are not needed unless you must override the original style of the {token}.
Token :modifiers
:modifiers appended to a token name change the replacement value or have other effects. For example {Address:nb} would be replaced by the value given by $hash->{Address} with all regular spaces replaced by non-breaking spaces.
The standard :modifiers are
:nb - Convert spaces to non-breaking
:unfold - Convert embedded newlines to spaces
:breakmulti - Append newline if the value contains embedded newlines.
:span - (only in a table cell) Span the cell down over cells below
which are empty. To be useful, the content should have
Format->align text->Center so it can float.
:delempty - Delete the containing row, frame, or paragraph if the
token value is empty ("")
:rep_first, :rep_notfirst :rep_mid, :rep_last :rep=<formula>
- See below. Allows advanced control when rows, etc.
are being replicated to accommodate a multi-valued token.
Eliding Empty Lines (:delempty)
This modifier deletes the containing row (frame, paragraph, etc.) if all tokens in the row with the :delemtpy modifier have an empty value ("").
Note that the row, etc. is deleted even if other tokens without :delempty exist in the row and have non-empty values.
Multi-value tokens
If a token has multiple values, then the containing row, frame or paragraph is replcated.
NOTE: Where the following documentation refers to replicating "rows" it means the appropriate ODF object type.
Replicating rows with more than one token
A row is replicated enough times for the token with the most values. Tokens which have fewer values are instantiated in the initial rows and empty values ("") substituted in later rows. For example, given
┌──────────────┬────────────────┬───────────────────────┐
│{Name} │ {Phone} │ {Email} │
└──────────────┴────────────────┴───────────────────────┘
if the {Phone} token had four values and {Email} had two, the result would be four copies of the row, looking like this:
┌──────────────┬────────────────┬───────────────────────┐
│John Hancock │ (415) 555-1212 │ j.hancock@gmail.com │
├──────────────┼────────────────┼───────────────────────┤
│ │ (650) 555-1212 │ j.hancock@hotmail.com │
├──────────────┼────────────────┼───────────────────────┤
│ │ (800) 555-1212 │ │
├──────────────┼────────────────┼───────────────────────┤
│ │ (900) 888-7777 │ │
└──────────────┴────────────────┴───────────────────────┘
Next we'll see how to improve this by by eliminating interior borders;
Three additional template rows, which can have different formatting, may be provided which are used for the first, middle and last rows, as indicating by :rep_first, :rep_mid and :rep_last modifiers coded in any token:
┌──────────────┬────────────────┬───────────────────────┐
│{Name} │ {Phone} │ {Email} │
└──────────────┴────────────────┴───────────────────────┘
┌──────────────┬────────────────┬───────────────────────┐
│{Name} │ {Phone} │ {Email:rep_first} │
╵ ╵ ╵ ╵
╷ ╷ ╷ ╷
│{Name} │ {Phone} │ {Email:rep_mid} │
╵ ╵ ╵ ╵
╷ ╷ ╷ ╷
│{Name} │ {Phone} │ {Email:rep_last} │
└──────────────┴────────────────┴───────────────────────┘
(The extra space between rows is just for illustration to show the absent horizontal borders). The result after substitution:
┌──────────────┬────────────────┬───────────────────────┐
│John Hancock │ (415) 555-1212 │ j.hancock@gmail.com │
│ │ (650) 555-1212 │ j.hancock@hotmail.com │
│ │ (800) 555-1212 │ │
│ │ (900) 888-7777 │ │
└──────────────┴────────────────┴───────────────────────┘
The specialzed template rows, if present, must immediately follow the "main" template row which has no :rep* modifiers.
In this example the "main" template row is not used and is not instantiated in the result. If there was only a single value for each token then the "main" template row would be used and the specialised templates ignored.
:rep=EXPR indicates the template row etc. should be used when the Perl EXPR is true. EXPR may onle reference variables $i (the replicate index, starging with 0) of $N (the total number of replciates). For example :rep=$i==1 is equivalent to :rep_first.
The following five template rows could be used to eliminate interior borders like in the above example, but also alternate colors or other formatting of odd & even rows:
┌───────────────────────────────────────────────────────────┐
│EVEN (first) {Token Name:rep_first} │
╵ ╵
╷ ╷
│EVEN (middle) {Token Name:rep=$i>0 && $i<$N && ($i%2)==0} │
╵ ╵
╷ ╷
│ODD (middle) {Token Name:rep=$i>0 && $i<$N && ($i%2)==1} │
╵ ╵
╷ ╷
│EVEN (last) {Token Name:rep=$i == N && ($i % 2)==0} │
└───────────────────────────────────────────────────────────┘
╷ ╷
│ODD (last) {Token Name:rep=$i == N && ($i % 2)==1} │
└───────────────────────────────────────────────────────────┘
CALLBACKS
If a hash value is a sub reference, the sub is called with args
($token_name, $token, $para, $custom_mods)
$token_name is the just the name e.g. "foo" in "{foo:modifiers...}".
$token is the complete "{tokenname...}" string being replaced
$para is the paragraph containing the token.
$custom_mods is a ref to an array of unrecognized :modifier strings (excluding the ':') found in the token. It is up to your code to do what it wants with them. Note: An exception occurs if unrecognized :modifiers are encountered when a callback is not being used.
The callback's return values indicate whether and how to replace the token. The protocol uses the Hr_* constants exported by ODF::lpOD_Helper:
return(Hr_SUBST, <value>)
<value> may be any of the allowed hash values (except for a callback). If a [list of values] is returned and there is actually more than one value, then the containing row will be replicated as described a "Multi-value tokens".
return(0)
The token is not replaced, but left as-is, and processing continues. This only makes sense if the token will somehow be processed later, for example via a separate call to replace_tokens.
COMPLETE EXAMPLE
A complete example application is included in the ODF::lpOD_Helper distribution, usually installed at .../site_perl/5.xx.yy/auto/share/dist/ODF-MailMerge/examples/.
To display the path in your installation, run
perl -MODF::MailMerge=:all -wE 'say odfmm_example_path'
SEE ALSO
ODF::lpOD_Helper, Sreadsheet::Edit
AUTHOR
Jim Avera (jim.avera at gmail)
LICENSE
CC0 1.0 / Public Domain. However this requires ODF::lpOD to function so as a practical matter you must comply with ODF::lpOD's license.