
Biblio::Document::Parser::Standard - document parsing functionality



NAME

Biblio::Document::Parser::Standard - document parsing functionality


SYNOPSIS


  use Biblio::Document::Parser::Standard;

  use Biblio::Document::Parser::Utils;

  # First fetch the document content to parse.

  my $content = Biblio::Document::Parser::Utils::get_content("http://www.foo.com/myfile.pdf");

  my $doc_parser = Biblio::Document::Parser::Standard->new();

  my @references = $doc_parser->parse($content);

  # Print a list of the extracted references.

  foreach(@references) { print "-> $_\n"; }


DESCRIPTION

Biblio::Document::Parser::Standard provides a fairly simple implementation of a system to extract references from documents.

Various styles of reference are supported, including numeric and indented, and documents with two columns are converted into single-column documents prior to parsing. This is a very experimental module, and still contains a few hard-coded constants that can probably be improved upon.
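
To illustrate what a "numeric" reference style looks like in practice, here is a minimal, self-contained sketch. It is not the module's actual implementation: the helper name split_numeric_references and its matching rules are assumptions made purely for illustration. It treats any line starting with "[n]" or "n." as the start of a new reference and joins the following lines onto it.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Biblio::Document::Parser::Standard:
# split a reference section into entries that start with "[n]" or "n.".
sub split_numeric_references {
    my ($text) = @_;
    my @references;
    my $current;

    for my $line (split /\r?\n/, $text) {
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        next if $line eq '';        # skip blank lines

        if ($line =~ /^(?:\[\d+\]|\d+\.)\s+(.*)$/) {
            # A numbered marker starts a new reference.
            push @references, $current if defined $current;
            $current = $1;
        }
        elsif (defined $current) {
            # Any other line continues the current reference.
            $current .= " $line";
        }
    }
    push @references, $current if defined $current;
    return @references;
}

# Made-up sample input, for demonstration only.
my $sample = join "\n",
    '[1] A. Author. First title.',
    '    Journal of Examples, 2001.',
    '[2] B. Author. Second title. 2002.';

print "-> $_\n" for split_numeric_references($sample);
```

A real parser needs more care than this: a continuation line that happens to begin with a number and a full stop would be mistaken for a new reference here, which is the kind of case where hard-coded heuristics come into play.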


METHODS

$parser = Biblio::Document::Parser::Standard->new()
The new() method creates a new parser instance.

@references = $parser->parse($lines, [%options])
The parse() method takes a string as input (see the get_content() function in Biblio::Document::Parser::Utils for a way to obtain this), and returns a list of references in plain text suitable for passing to a CiteParser module.


CHANGES

- 2003/05/13
Removed Perl warnings generated from parse() by adding checks on the regexps


AUTHOR

Mike Jewell <moj@ecs.soton.ac.uk>
Tim Brody <tdb01r@ecs.soton.ac.uk>
