
OpenOffice::OODoc
The Perl Open OpenDocument Connector


NAME

OpenOffice::OODoc - The Perl Open OpenDocument Connector


SYNOPSIS


        use OpenOffice::OODoc;

                        # get global access to the content of an OOo file
        my $document = ooDocument(file => "MyFile.odt");

                        # select a text element containing a given string
        my $place = $document->selectElementByContent("my search string");

                        # insert a new text element before the selected one
        my $newparagraph = $document->insertParagraph
                        (
                        $place,
                        position        => 'before',
                        text            => 'A new paragraph to be inserted',
                        style           => 'Text body'
                        );

                        # define a new graphic style, to display images
                        # with 20% extra luminance and color inversion
        $document->createImageStyle
                        (
                        "NewImageStyle",
                        properties      =>
                                {
                                'draw:luminance'        => '20%',
                                'draw:color-inversion'  => 'true'
                                }
                        );

                        # import an image from an external file, attach it
                        # to the newly inserted paragraph, to be displayed
                        # using the newly created style
                        # (the path is single-quoted so that the backslashes
                        # are not interpreted as escape sequences)
        $document->createImageElement
                        (
                        "Image1",
                        style           => "NewImageStyle",
                        attachment      => $newparagraph,
                        import          => 'D:\Images\Landscape.jpg'
                        );

                        # save the modified document
        $document->save;


DESCRIPTION

This toolbox is an extensible Perl interface allowing direct read/write operations on files which comply with the OASIS Open Document Format for Office Applications (ODF), i.e. the ISO/IEC 26300:2006 standard.

It provides a high-level, document-oriented language, and isolates the programmer from the details of the file format. It can process different document classes (texts, spreadsheets, presentations, and drawings). It can retrieve or update styles and images, document metadata, as well as text content.

OpenOffice::OODoc is designed for data retrieval and update in existing documents, as well as full document generation.


HOW TO USE THE DOCUMENTATION

The present chapter, then the OpenOffice::OODoc::Intro one, should be read before any attempt to dig in the detailed documentation.

The reference manual is provided in several separate chapters as described below.

The OpenOffice::OODoc documentation, as the API itself, is distributed amongst several manual pages on a thematic and technical basis. The present page is a general foreword only.

Each manual page corresponds to a Perl module, with the exception of OpenOffice::OODoc::Intro. It's strongly recommended to have a look at the Intro, and to read the examples, before any other manual chapter, in order to get a quick and practical view of the big picture. Another possible introductory reading has been published in The Perl Review (issue #3.1, Dec. 2006) http://www.theperlreview.com, while an alternative presentation article, intended for French-reading users, can be downloaded at http://jean.marie.gouarne.online.fr/doc/perl_odf_connector.pdf

The API is object-oriented and, with the exception of the main module (OpenOffice::OODoc itself), each module defines a class. The features of each module are documented in a manual page with the same name. However, because some classes inherit from others, they offer many features that are not documented in their own manual page. The best example is OpenOffice::OODoc::Document: it contains only a few method definitions of its own, but it's the most powerful class, because it inherits from four other classes, so its features are documented in five manual pages. Fortunately, the classes are defined on a functional basis. So, for example, to know the text-related capabilities of a Document object, the user should select the Text manual page before the Document one.
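The effect of this inheritance on where a feature is documented can be pictured with a minimal sketch. The packages and method names below (My::Text, My::Image, My::Document, text_method, image_method) are hypothetical stand-ins invented for the example, not the real classes or methods of the API:

```perl
use strict;
use warnings;

# Hypothetical classes standing in for the real ones.
package My::Text  { sub text_method  { 'from My::Text' } }
package My::Image { sub image_method { 'from My::Image' } }
package My::Document {
    # The document class defines almost nothing itself
    # and gets its features from its parent classes.
    our @ISA = ('My::Text', 'My::Image');
    sub new { bless {}, shift }
}

package main;

my $doc = My::Document->new;
# Both calls succeed although My::Document defines neither method itself.
print $doc->text_method,  "\n";
print $doc->image_method, "\n";
```

In the same way, a method called on a Document object may be described in the manual page of the parent class that defines it.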

The detailed documentation of the API is distributed according to the following list:

OpenOffice::OODoc

The present manual page contains (in the GENERAL FUNCTIONS section below) the description of a small number of miscellaneous functions, dedicated to control some general parameters, to create the main objects of the applications, or to provide the user with some basic utilities.

OpenOffice::OODoc::File

This manual page contains detailed information about the physical access to the OpenOffice.org files. In some simple applications, this page can be ignored without risk.

OpenOffice::OODoc::XPath

This manual page describes the common features provided by the corresponding class and available in every other class with the exception of OpenOffice::OODoc::File. It covers the low-level, XPath-based XML API of OpenOffice::OODoc. It may be necessary for advanced applications, but can be ignored at first. However, the Text, Image, Styles, Document and Meta objects inherit all the features of the XPath object, so this manual page can be useful even if the user doesn't need to work with explicit XPath objects.

OpenOffice::OODoc::Text

This manual page describes all the high level text processing methods and allows the user's program to deal with all the text containers (headers, paragraphs, item lists, tables, and footnotes). It can deal with any text content in any OOo document, and not only in Writer documents (a special mapping allows the programmer to address rows and cells in the same way in spreadsheets as in the tables belonging to other documents).

OpenOffice::OODoc::Image

This manual page describes all the graphics manipulation API, i.e. all the available syntax dedicated to insert or remove images in the documents, and to control the presentation of these images.

OpenOffice::OODoc::Styles

This manual page describes the methods to be used to control the styles of a document, knowing that each page layout, each text element, and each image is displayed or printed according to a style. This part of the documentation can be ignored if the user's programs are strictly content-focused and don't care about the presentation.

OpenOffice::OODoc::Document

This manual page describes some miscellaneous methods that deal simultaneously with text, presentation and/or images. So, in order to discover the capabilities of a "Document" object (created with ooDocument), the user should use the Text, Image, Styles AND Document manual pages. The OpenOffice::OODoc::Document class inherits all the features provided by the other classes with the exceptions of OpenOffice::OODoc::File and OpenOffice::OODoc::Meta.

OpenOffice::OODoc::Meta

This manual page describes all the available methods to be used in order to control the global properties (or "metadata") of a document. Most of these properties are those an end-user can get or set through the "File/Properties" command with the OpenOffice.org desktop software.

OpenOffice::OODoc::Manifest

This manual page describes the manifest management API, knowing that the manifest, in an OpenOffice.org file, contains the list of the file components (or "members") and the media type (or MIME) of each one. The text content, the style definitions, the embedded images, etc. are each stored as a separate "member".


GENERAL FUNCTIONS

odfConnector()


        See ooDocument().

odfContainer()


        See ooFile().

odfDecodeText($ootext)


        Returns the translation of a raw OpenOffice.org (UTF-8) string into
        the local character set. Because the right translation is
        automatically done by the regular text read/write methods of
        OpenOffice::OODoc, this function is useful only if the user's
        application needs to bypass the API.

odfEncodeText($ootext)


        Returns the translation of an application-provided string,
        made of local characters, into an OpenOffice.org (UTF-8) string.
        The given string must comply with the active local encoding (see
        odfLocalEncoding()). Because the right translation is automatically
        done by the regular text read/write methods of OpenOffice::OODoc,
        this function is useful only if the user's application needs to
        bypass the API.
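The kind of transcoding described for odfDecodeText() and odfEncodeText() can be illustrated with the core Encode module alone. This is only a sketch of the idea, not the implementation of these functions; the helper names to_local and to_utf8, and the choice of iso-8859-1 as the local character set, are assumptions made for the example:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical local character set for this example.
my $local_charset = 'iso-8859-1';

# Raw UTF-8 bytes -> bytes in the local character set.
sub to_local {
    my ($utf8_bytes) = @_;
    return encode($local_charset, decode('UTF-8', $utf8_bytes));
}

# Bytes in the local character set -> raw UTF-8 bytes.
sub to_utf8 {
    my ($local_bytes) = @_;
    return encode('UTF-8', decode($local_charset, $local_bytes));
}

# An e-acute is the two bytes C3 A9 in UTF-8
# and the single byte E9 in iso-8859-1.
my $local = to_local("caf\xC3\xA9");
my $utf8  = to_utf8($local);
printf "%d bytes local, %d bytes UTF-8\n", length($local), length($utf8);
```

A round trip through both helpers returns the original bytes as long as every character exists in the local character set.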

odfLocalEncoding([character_set])


        Accessor to get/set the user's local character set
        (see $OpenOffice::OODoc::XPath::LOCAL_CHARSET in the
        OpenOffice::OODoc::XPath man page).

        Example:

                $old_charset = odfLocalEncoding();
                odfLocalEncoding('iso-8859-15');

        If the given argument is an unsupported encoding, an error
        message is produced and the old encoding is preserved. So
        this accessor is safer than a direct update of the
        $OpenOffice::OODoc::XPath::LOCAL_CHARSET variable.

        The default local character set is fixed according to the
        "OODoc/config.xml" file of your local OpenOffice::OODoc installation
        (see odfReadConfig() below), or to "iso-8859-1" if this file is
        missing or doesn't say anything about the local character set. By
        calling odfLocalEncoding() with an argument, the user's programs can
        override this default.

        Note: the user can override this setting for a particular document,
        using the 'local_encoding' property of the document object (see the
        OpenOffice::OODoc::XPath manual page).

        See the Encode::Supported (Perl) documentation for the list
        of supported encodings.
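The "check, then set, otherwise keep the old value" behaviour described above can be sketched with Encode::find_encoding(), which returns nothing for an unknown encoding name. The accessor local_encoding and the variable $local_charset are hypothetical names for this example; the sketch does not claim to reproduce how odfLocalEncoding() is written:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for the module-level default character set.
my $local_charset = 'iso-8859-1';

# Get the current charset, or set a new one if Encode supports it.
sub local_encoding {
    my ($new) = @_;
    if (defined $new) {
        if (Encode::find_encoding($new)) {
            $local_charset = $new;
        } else {
            warn "Unsupported encoding '$new'; keeping '$local_charset'\n";
        }
    }
    return $local_charset;
}

local_encoding('iso-8859-15');        # accepted
local_encoding('no-such-charset');    # rejected with a warning
print local_encoding(), "\n";         # still iso-8859-15
```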

odfPackage()


        See ooFile().

odfReadConfig([filename])


        Creates or resets some variables of the API according to the
        content of an XML configuration file. Without argument, this
        function looks for 'OODoc/config.xml' under the installation
        directory of OpenOffice::OODoc. In any case, the provided file
        must have the same XML structure as the config.xml file included
        in the distribution, that is:

        <?xml version="1.0" encoding="UTF-8"?>
        <config>
            <OpenOffice-OODoc>
                <XPath-LOCAL_CHARSET>my_charset</XPath-LOCAL_CHARSET>
                <Styles-COLORMAP>my_colormap_file</Styles-COLORMAP>
                <File-WORKING_DIRECTORY>my_path</File-WORKING_DIRECTORY>
                <File-DEFAULT_OFFICE_FORMAT>2</File-DEFAULT_OFFICE_FORMAT>
                <INSTALLATION_DATE>my_oo_date</INSTALLATION_DATE>
            </OpenOffice-OODoc>
        </config>

        In the example above, "my_oo_date" should be replaced by a regular
        ISO-8601 date (YYYY-MM-DDThh:mm:ss).

        Elements outside the <OpenOffice-OODoc> element are ignored.

        Any element included in <OpenOffice-OODoc> sets or updates a variable
        with the same name and the given value in the space of the
        OpenOffice::OODoc package. So, for example, an element like

                <strange_thing>a strange value</strange_thing>

        will make a new $OpenOffice::OODoc::strange_thing variable,
        initialized with the string "a strange value", available for any
        program using OpenOffice::OODoc.

        Attributes and sub-elements are ignored.

        Strings with characters larger than 7 bits must be encoded in UTF-8.

        Any '-' character appearing in the name of an element is replaced
        by '::' in the name of the corresponding variable, so, for example,
        the <XPath-LOCAL_CHARSET> element controls the initial value of
        $OpenOffice::OODoc::XPath::LOCAL_CHARSET.

        All the variables defined in this file, as well as the file itself,
        are optional.

        The <INSTALLATION_DATE> element is not used by the API; it's provided
        for information only. It allows the user to get (in OpenOffice format)
        the date of the last installation of OpenOffice::OODoc, through the
        variable $OpenOffice::OODoc::INSTALLATION_DATE. In the default
        config.xml provided with the distribution, this element contains the
        package generation date.

        This function is automatically executed as soon as OpenOffice::OODoc
        is used, if the OODoc/config.xml configuration file exists.
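The naming rule stated above (each '-' in an element name becomes '::' in the variable name, under the OpenOffice::OODoc package) can be written down as a small function. The function name config_variable_name is hypothetical and only computes the name; it is a sketch of the rule, not code taken from the module:

```perl
use strict;
use warnings;

# Map a configuration element name to the fully qualified variable name
# it would control, following the '-' => '::' rule described above.
sub config_variable_name {
    my ($element) = @_;
    (my $suffix = $element) =~ s/-/::/g;
    return '$OpenOffice::OODoc::' . $suffix;
}

for my $element (qw(XPath-LOCAL_CHARSET File-WORKING_DIRECTORY INSTALLATION_DATE)) {
    printf "%-26s => %s\n", "<$element>", config_variable_name($element);
}
```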

odfTemplatePath([path])


        Shortcut for OpenOffice::OODoc::File::templatePath().

        Accessor to get/set an alternative path for the XML template files
        used to create new documents. See the manual page for the
        OpenOffice::OODoc::File module.

odfWorkingDirectory([path])


        Accessor to get/set the working directory to use for temporary
        files. Short-lived temporary files are generated each time the save()
        function (see OpenOffice::OODoc::File) is called. In case of success,
        these files are automatically removed when the call returns, so the
        user can't view them. If something goes wrong during the I/O
        processing, the temporary files remain available for debugging. In any
        case, a working directory is necessary to create or update documents.
        However, OpenOffice::OODoc can be used without an available working
        directory in a read-only application.

        The default working directory depends on the "OODoc/config.xml" file
        of your local OpenOffice::OODoc installation. If this file is missing
        or if it doesn't contain a <File-WORKING_DIRECTORY> element, the
        working directory is "." (i.e. the current working directory of the
        user's application).

        If an argument is given, it replaces the current working
        directory.

        A warning is issued if the (existing or newly set) path is not
        a directory with write permission. After this warning, the user's
        application can run, but any attempted file update or creation
        fails.

        This accessor sets only the default working directory for the
        application. A special, separate working directory can be set
        for each OOo document (see the manual page for OpenOffice::OODoc::File
        for details, if needed).

        CAUTION: an odfWorkingDirectory() call can't change the working
        directory of a previously created File object. So, consider the
        following code sequence:

                my $doc0 = ooDocument(file => 'doc0.odt');
                odfWorkingDirectory('C:\TMP');
                my $doc1 = ooDocument(file => 'doc1.odt');

        In this example, all the write operations related to the $doc0
        document will use the default working directory, while the ones
        related to $doc1 will use "C:\TMP".
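An application that prefers to test a candidate path itself, before handing it to odfWorkingDirectory(), could use a check along the following lines. This is a sketch built on Perl's file test operators; the function name usable_working_directory is hypothetical, and the test performed inside the module may differ:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return true if $path is an existing directory with write permission.
sub usable_working_directory {
    my ($path) = @_;
    return (-d $path && -w _) ? 1 : 0;
}

my $tmp = tempdir(CLEANUP => 1);    # a directory we know is writable
for my $path ($tmp, "$tmp/missing") {
    printf "%s: %s\n", $path,
        usable_working_directory($path) ? 'usable' : 'not usable';
}
```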

ooDocument()


        Shortcut for OpenOffice::OODoc::Document->new().

        This function is the most general document constructor. It creates
        and returns a new Document object. The object can be instantiated
        from an existing OpenOffice.org file, or from OpenOffice-compliant
        XML data previously loaded in memory. With an appropriate "create"
        parameter, it can be used to create a new document from scratch
        as well. The Document class provides methods allowing a lot of
        read/update operations in the text content, the graphics, and the
        presentation. So ooDocument() is the recommended first call to get
        access to a document for further processing.

        See the OpenOffice::OODoc::Document manual page for detailed syntax.

ooDecodeText()


        See odfDecodeText().

ooEncodeText()


        See odfEncodeText().

ooFile()


        Shortcut for OpenOffice::OODoc::File->new().

        This function returns a File object, that is the object representation
        of the physical file containing the text, the images and the style
        definitions of an OpenOffice.org document.

        See the OpenOffice::OODoc::File manual page for detailed syntax.

        See the OpenOffice::OODoc::Intro manual page to learn in which
        situations an application needs, or doesn't need, to deal with
        explicit File objects.

ooImage()


        Shortcut for OpenOffice::OODoc::Image->new().

        This function returns an Image object, which provides a subset of the
        features of the Document object. It can be used in place of
        ooDocument() if the calling application needs some image
        manipulation methods only.

        See the OpenOffice::OODoc::Image manual page for detailed syntax.

ooLocalEncoding()


        See odfLocalEncoding().

ooManifest()


        Shortcut for OpenOffice::OODoc::Manifest->new().

        This function returns a Manifest object, giving access to the
        meta-information of the physical archive containing the document.

ooMeta()


        Shortcut for OpenOffice::OODoc::Meta->new().

        This function returns a Meta object. Such an object represents the
        global properties, or "metadata", of a document. It provides a set of
        accessors allowing the user to get or set some properties such as
        the title, the keywords, the description, the creator, etc.

        See the OpenOffice::OODoc::Meta manual page for details.

ooReadConfig()


        See odfReadConfig().

ooStyles()


        Shortcut for OpenOffice::OODoc::Styles->new().

        This function returns a Styles object, which provides a subset of the
        features of the Document object. It can be used in place of
        ooDocument() if the calling application needs some style/presentation
        manipulation methods only. Note the 's' at the end of 'Styles': this
        object doesn't represent a particular style; it represents a set of
        styles related to a document.

        See the OpenOffice::OODoc::Styles manual page for detailed syntax.

ooTemplatePath()


        See odfTemplatePath().

ooText()


        Shortcut for OpenOffice::OODoc::Text->new().

        This function returns a Text object, which provides a subset of the
        features of the Document object. It can be used in place of
        ooDocument() if the calling application is only text-focused (i.e.
        if it doesn't need to deal with graphics and styles). The processed
        document can contain (and probably contains) graphics and styles,
        but the methods to process them are simply not loaded.

        See the OpenOffice::OODoc::Text manual page for detailed syntax.

ooWorkingDirectory()


        See odfWorkingDirectory().

ooXPath()


        Shortcut for OpenOffice::OODoc::XPath->new().

        This function returns an XPath object, which provides all the
        low-level XML navigation, retrieval, read and write methods of the
        API. The XPath class (in the OpenOffice::OODoc context) is an
        OpenOffice-aware wrapper for the general XML::Twig API. Unless you
        are a very advanced user and you have a particular hack in mind, you
        should never need to explicitly create an XPath object. But you must
        know that every method or property of this class is inherited by the
        Text, Image, Styles, Document and Meta objects. So the knowledge of
        the corresponding manual page could be useful.

        See the OpenOffice::OODoc::XPath manual page for detailed syntax.


AUTHOR/COPYRIGHT

Developer/Maintainer: Jean-Marie Gouarne http://jean.marie.gouarne.online.fr

Contact: jmgdoc@cpan.org

Copyright 2004-2007 by Genicorp, S.A. http://www.genicorp.com

Initial English version of the reference manual by Graeme A. Hunter (graeme.hunter@zen.co.uk)

License:


        - Licence Publique Generale Genicorp v1.0

        - GNU Lesser General Public License v2.1