
Biblio::Document::Parser::Utils
utility module for handling International characters and document conversion



NAME

Biblio::Document::Parser::Utils - utility module for handling international characters and document conversion


DESCRIPTION

Biblio::Document::Parser::Utils provides some utility functions for handling international characters and for conversion of documents to plaintext.


SYNOPSIS


        use Biblio::Document::Parser::Utils qw( normalise_multichars );

        print normalise_multichars( $str );


METHODS

$str = normalise_multichars( $str )
Convert multi-character representations of international characters into single UTF-8 characters, e.g. ¨o => ö. These sequences appear in pdftotext output from PDFs generated by pdflatex.
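The idea can be illustrated with a minimal sketch. It is not the module's implementation: compose_accents_sketch and the %COMBINING table are hypothetical names, and only three accents are covered. It assumes the accent arrives as a spacing character directly before its base letter, swaps it for the combining equivalent after the letter, and lets Unicode::Normalize (a core module) compose the pair.

```perl
use strict;
use warnings;
use Unicode::Normalize qw( NFC );

# Spacing accents that may appear *before* the base letter, mapped to
# their combining equivalents. Illustrative subset only (assumption).
my %COMBINING = (
    "\x{00A8}" => "\x{0308}",    # diaeresis
    "\x{00B4}" => "\x{0301}",    # acute accent
    "\x{02C6}" => "\x{0302}",    # circumflex
);

# Hypothetical helper, not part of the module's API.
sub compose_accents_sketch {
    my ($str) = @_;

    # Move the accent behind its base letter as a combining mark ...
    $str =~ s/([\x{00A8}\x{00B4}\x{02C6}])([A-Za-z])/$2$COMBINING{$1}/g;

    # ... then let NFC fuse letter + mark into one precomposed character
    # where Unicode defines one.
    return NFC($str);
}
```

The result is a character string, so a caller that prints it would normally set an output layer first, for example binmode( STDOUT, ':encoding(UTF-8)' ).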

$content = ParaTools::Utils::get_content($location)
This function takes either a filename or a URL as a parameter, and attempts to return the file's content as a single string. A hash of converters is provided in ParaTools/Utils.pm, which should be customised for your system.

For URLs, the file is first downloaded to a temporary directory and then converted; local files are copied straight into the temporary directory. Because a copy is always made, take care when handling very large files.
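The download-or-copy, then convert flow might look roughly like the sketch below. Everything here is an assumption for illustration: get_content_sketch is a hypothetical name, the layout of %CONVERTERS (keyed by file extension, returning a command list) is invented and need not match the hash in ParaTools/Utils.pm, and HTTP::Tiny stands in for whatever the module actually uses to fetch URLs. The pdf entry presumes pdftotext is installed.

```perl
use strict;
use warnings;
use File::Basename qw( basename );
use File::Copy     qw( copy );
use File::Temp     qw( tempdir );
use HTTP::Tiny;

# Hypothetical converter table keyed by lower-case file extension. Each
# entry builds a command that writes plain text to STDOUT. Files without
# an entry are read as plain text.
my %CONVERTERS = (
    pdf => sub { ( 'pdftotext', $_[0], '-' ) },
);

sub get_content_sketch {
    my ($location) = @_;
    my $dir = tempdir( CLEANUP => 1 );

    # Derive a local file name, dropping any URL query string or fragment.
    ( my $name = basename($location) ) =~ s/[?#].*//s;
    my $path = "$dir/" . ( length $name ? $name : 'document' );

    if ( $location =~ m{^https?://}i ) {
        # Remote document: download into the temporary directory.
        my $res = HTTP::Tiny->new->mirror( $location, $path );
        die "Download failed: $res->{status} $res->{reason}\n"
            unless $res->{success};
    }
    else {
        # Local document: copy into the temporary directory.
        copy( $location, $path ) or die "Cannot copy $location: $!\n";
    }

    my ($ext) = $path =~ /\.([^.\/]+)\z/;
    my $converter = $CONVERTERS{ lc( $ext // '' ) };

    my $fh;
    if ($converter) {
        # List-form pipe open avoids passing the path through a shell.
        open( $fh, '-|', $converter->($path) )
            or die "Cannot run converter: $!\n";
    }
    else {
        open( $fh, '<', $path ) or die "Cannot read $path: $!\n";
    }

    local $/;    # slurp mode: read the whole stream at once
    my $content = <$fh>;
    close $fh;
    return $content;
}
```

Note that the sketch holds the whole document in memory as well as on disk, which is one more reason to be cautious with very large files.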

$escaped_url = ParaTools::Utils::url_escape($string)
Converts a string into an encoded URL (e.g. spaces become %20). Takes the unencoded URL as a parameter, and returns the encoded version.
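A percent-encoding routine of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not the module's code: url_escape_sketch is a hypothetical name, the input is assumed to be a complete, fully unencoded URL held as a character string, and the set of characters left untouched (the RFC 3986 unreserved and reserved characters) is a choice made here rather than something the documentation specifies.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module's API.
sub url_escape_sketch {
    my ($string) = @_;

    # Work on UTF-8 octets so non-ASCII characters become %XX sequences.
    utf8::encode($string);

    # Keep unreserved and reserved URL characters; escape everything else,
    # including a literal '%' (the input is assumed to be unencoded).
    $string =~ s{([^A-Za-z0-9\-._~:/?#\[\]\@!\$&'()*+,;=])}
                {sprintf '%%%02X', ord $1}ge;

    return $string;
}
```

Because a literal % is escaped too, calling the function twice on the same string double-encodes it; a caller that may receive already-encoded URLs would need to handle that case separately.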


AUTHOR

Tim Brody <tdb01r@ecs.soton.ac.uk>
Mike Jewell <moj@ecs.soton.ac.uk> (packaging)
