Alvis::Buffer
Perl extension for buffering utilities for the Alvis pipeline


NAME

Alvis::Buffer - Perl extension for buffering utilities for the Alvis pipeline


SYNOPSIS


 use Alvis::Buffer;

 $Buffer::BUFFER = "/tmp/building.xml";
 $Buffer::verbose++;
 &Buffer::fix() or die "Cannot Buffer::fix";

 $in = new Alvis::Pipeline::Read(host => "harvester.alvis.info",
                                 port => 16716,
                                 spooldir => "/home/alvis/spool");

 while ($xml = $in->read(1)) {
     &clean_wrapping(\$xml);
     &Buffer::add($xml);
     if ( $Buffer::docs>1000 ) {
        $filename = &Buffer::save();
        if ( !$filename ) {
           &Buffer::close();
           die "Cannot Buffer::save";
        }
     }
 }

 $filename = &Buffer::save();
 &Buffer::close();


DESCRIPTION

This module provides a way of buffering Alvis XML into manageable chunks as it is read in from a pipeline (Alvis::Pipeline). Chunks can be controlled by file size or document count, but this is done externally to the module; the module simply provides a function to save the current buffer contents.
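Because the chunking policy lives in the caller, it can help to keep the decision in one place. The following is only a minimal sketch and is not part of Alvis::Buffer: the function name should_save and the limit names max_docs and max_size are hypothetical, and the two counters passed in would be the values of $Buffer::docs and $Buffer::size.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Alvis::Buffer): decide whether the
# buffer should be saved, given the current counters and caller limits.
# A limit that is omitted or undefined is simply not applied.
sub should_save {
    my ($docs, $size, %limit) = @_;
    return 1 if defined $limit{max_docs} && $docs >= $limit{max_docs};
    return 1 if defined $limit{max_size} && $size >= $limit{max_size};
    return 0;
}

# Example: save after 1000 documents or 50 million characters,
# whichever is reached first.
my $decision = should_save(1000, 12_345,
                           max_docs => 1000, max_size => 50_000_000);
print $decision ? "save now\n" : "keep buffering\n";
```

In a read loop this check would sit directly after the call that adds a chunk, with a save performed whenever it returns true.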

Files of collected Alvis XML documents, with appropriate XML header and footer parts, are saved in the relative directory ``xml-add/'' under the numbers 1, 2, 3, ... Each time a batch is stored, the directory is checked to determine which number to use for the latest batch. If ``xml-add/'' is empty, then ``xml/'' is checked instead. Presumably, files in ``xml-add/'' are being processed into ``xml/''.
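The numbering rule described above can be sketched as follows. This is an illustration under stated assumptions, not the module's own code: the helper names max_number_in and next_batch_number are hypothetical, files are assumed to be named N.xml, and a missing directory is treated the same as an empty one.

```perl
use strict;
use warnings;

# Hypothetical helper: return the largest integer N for which
# "$dir/N.xml" exists, or 0 if there is no such file.
sub max_number_in {
    my ($dir) = @_;
    my $max = 0;
    opendir(my $dh, $dir) or return 0;   # a missing directory counts as empty
    while (defined(my $entry = readdir $dh)) {
        next unless $entry =~ /^(\d+)\.xml$/;
        $max = $1 if $1 > $max;
    }
    closedir $dh;
    return $max;
}

# Check the incoming directory first; fall back to the processed
# directory only when the incoming one holds no numbered files.
sub next_batch_number {
    my ($add_dir, $done_dir) = @_;
    my $max = max_number_in($add_dir) || max_number_in($done_dir);
    return $max + 1;
}

my $n = next_batch_number("xml-add", "xml");
print "next file: xml-add/$n.xml\n";
```

Falling back to the processed directory keeps the numbers increasing even after every pending file has been moved out of the incoming directory.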

The implementation is independent of any pipeline, and assumes a number of fixed directories. It also assumes that files are in UTF-8 and that documents are present in elements named <documentRecord>.


FUNCTIONS

fix()


 &Buffer::fix() or die "Cannot Buffer::fix";

Basic initialisation and checking to ensure the output buffer is OK, and to load the current document count and size into memory. Returns 1 if everything is OK, else 0.

If the run stops or aborts while the output buffer is still being built, a restart will safely recover the contents, as long as no data was lost from the file.
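To illustrate what recovering the statistics after a restart might involve, here is a minimal sketch. It is not the module's own code: the function name recount_buffer is hypothetical, and it assumes the buffer file is UTF-8 text in which each document starts with an opening <documentRecord> tag.

```perl
use strict;
use warnings;

# Hypothetical sketch: recompute buffer statistics from an existing
# buffer file. Returns (document count, character count), or an empty
# list if the file cannot be opened.
sub recount_buffer {
    my ($path) = @_;
    open(my $fh, '<:encoding(UTF-8)', $path) or return;
    my $text = do { local $/; <$fh> };   # slurp the whole file
    close $fh;
    $text = '' unless defined $text;
    # Count opening <documentRecord> tags, with or without attributes.
    my $docs = () = $text =~ /<documentRecord[\s>]/g;
    return ($docs, length $text);
}

my ($docs, $size) = recount_buffer("/tmp/building.xml");
if (defined $docs) {
    print "recovered $docs documents, $size characters\n";
} else {
    print "no buffer file to recover\n";
}
```

Decoding the file as UTF-8 before calling length means the size is a count of characters rather than bytes.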

add()


 &Buffer::add($xml);

Adds an XML chunk to the current buffer and updates the current document count and size in memory.

save()


 $filename = &Buffer::save();
 if ( !$filename ) {
     die "Cannot Buffer::save";
 } else {
     print STDERR "New XML file $filename saved\n";
 }

Saves the current buffer into ``xml-add/'' under an appropriate integer name, such as ``xml-add/$N.xml'', where $N is determined at the point of saving as the next integer in sequence. Returns the filename used if everything is OK, else returns undef. This function must be called explicitly, so check the variables $Buffer::docs and $Buffer::size to determine when to call it.

close()


 &Buffer::close();

Close the output buffer.


VARIABLES and PARAMETERS

Global variables are of two kinds. The first kind defines general characteristics of the buffer. These should be set by the user before the functions are used, although reasonable defaults are provided.

BUFFER

name of the output buffer file that collects XML chunks. Don't include the process ID ``$$'' as a randomising string if you want this file to be available during a restart, since a restarted process gets a different ID.

HEADER

text to enter at the front of a sequence of <documentRecord> elements.

FOOTER

text to enter at the end of a sequence of <documentRecord> elements.

verbose

set to a non-zero value if more debugging should be reported to STDERR.

Then there are variables that are read-only and report current buffer statistics.

size

Count of characters in the current buffer.

docs

Count of documents, as determined by occurrences of <documentRecord> elements.


SEE ALSO

Alvis::Pipeline


AUTHOR

Wray Buntine, <buntine@hiit.fi>


COPYRIGHT AND LICENSE

Copyright (C) 2006 by Wray Buntine.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.
