AI::Categorizer::Learner::Weka
Pass-through wrapper to Weka system


NAME

AI::Categorizer::Learner::Weka - Pass-through wrapper to Weka system


SYNOPSIS


  use AI::Categorizer::Learner::Weka;
  use AI::Categorizer::Collection::Files;

  # Here $k is an AI::Categorizer::KnowledgeSet object

  my $nb = AI::Categorizer::Learner::Weka->new(...parameters...);
  $nb->train(knowledge_set => $k);
  $nb->save_state('filename');

  ... time passes ...

  $nb = AI::Categorizer::Learner->restore_state('filename');
  my $c = AI::Categorizer::Collection::Files->new( path => ... );
  while (my $document = $c->next) {
    my $hypothesis = $nb->categorize($document);
    print "Best assigned category: ", $hypothesis->best_category, "\n";
  }


DESCRIPTION

This class doesn't implement any machine learners of its own; it merely passes the data through to the Weka machine learning system (http://www.cs.waikato.ac.nz/~ml/weka/). This gives you access to a collection of machine learning algorithms not otherwise implemented in AI::Categorizer.

Currently this is a simple command-line wrapper that calls java subprocesses. In the future this may be converted to an Inline::Java wrapper for faster running times. However, if you're looking for really great performance, you're probably looking in the wrong place: this Weka wrapper is intended more as a way to try lots of different machine learning methods.
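As a rough illustration of what a command-line wrapper around java subprocesses involves, the sketch below assembles an argument list for a training run. It is not the module's actual code: build_train_command is a hypothetical helper, the file names and the weka.jar location are assumptions, and the parameter names are borrowed from the new() documentation below. The -t (training file) and -d (save model to file) switches are Weka's standard classifier command-line options.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::Categorizer): assemble the argument
# list for a java subprocess that trains a Weka classifier.
sub build_train_command {
    my (%p) = @_;
    my $java       = defined $p{java_path} ? $p{java_path} : 'java';
    my $classifier = defined $p{weka_classifier}
                   ? $p{weka_classifier}
                   : 'weka.classifiers.NaiveBayes';
    my @java_args  = @{ $p{java_args} || [] };
    my @weka_args  = @{ $p{weka_args} || [] };

    # Put weka.jar on the classpath only when a path was supplied.
    push @java_args, '-classpath', $p{weka_path} if defined $p{weka_path};

    return (
        $java, @java_args,
        $classifier, @weka_args,
        '-t', $p{train_file},   # Weka: training data file
        '-d', $p{model_file},   # Weka: where to save the trained model
    );
}

my @cmd = build_train_command(
    java_args  => ['-Xmx256m'],
    weka_path  => '/opt/weka/weka.jar',   # assumed location
    train_file => '/tmp/train.arff',      # assumed file names
    model_file => '/tmp/model.bin',
);

# Passing a list to system() bypasses the shell, so no quoting is needed:
#   system(@cmd) == 0 or die "java failed: $?";
print join(' ', @cmd), "\n";
```

Building the command as a list rather than a single string means that paths containing spaces reach java unchanged.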


METHODS

This class inherits from the AI::Categorizer::Learner class, so all of its methods are available unless explicitly mentioned here.

new()

Creates a new Weka Learner and returns it. In addition to the parameters accepted by the AI::Categorizer::Learner class, the Weka subclass accepts the following parameters:

java_path
Specifies where the java executable can be found on this system. The default is simply java, meaning that it will search your PATH to find java.

java_args
Specifies a list of any additional arguments to give to the java process. Commonly it's necessary to allocate more memory than the default, using an argument like -Xmx130m.

weka_path
Specifies the path to the weka.jar file containing the Weka bytecode. If Weka has been installed somewhere in your java CLASSPATH, you needn't specify a weka_path.

weka_classifier
Specifies the Weka class to use for a categorizer. The default is weka.classifiers.NaiveBayes. Consult your Weka documentation for a list of other classifiers available.

weka_args
Specifies a list of any additional arguments to pass to the Weka classifier class when building the categorizer.

tmpdir
A directory in which temporary files will be written when training the categorizer and categorizing new documents. The default is given by File::Spec->tmpdir.
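The parameters above might be combined as in the following sketch. All values are illustrative assumptions (the java location, heap size, weka.jar path and temporary directory will differ on your system), and passing java_args as an array reference is assumed from its description as a list. The learner is only constructed if the module is installed, so the snippet runs harmlessly elsewhere.

```perl
use strict;
use warnings;

# Illustrative values only; every path and the heap size are assumptions.
my %weka_params = (
    java_path       => '/usr/bin/java',
    java_args       => ['-Xmx256m'],                  # extra JVM options
    weka_path       => '/opt/weka/weka.jar',
    weka_classifier => 'weka.classifiers.NaiveBayes', # the documented default
    tmpdir          => '/tmp',
);

# Construct the learner only if AI::Categorizer is available.
my $learner;
if (eval { require AI::Categorizer::Learner::Weka; 1 }) {
    $learner = AI::Categorizer::Learner::Weka->new(%weka_params);
}
```

Any parameter left out falls back to the default described above, so in the simplest case (java on your PATH and Weka on your CLASSPATH) new() can be called with no Weka-specific parameters at all.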

train(knowledge_set => $k)

Trains the categorizer. This prepares it for later use in categorizing documents. The knowledge_set parameter must provide an object of the class AI::Categorizer::KnowledgeSet (or a subclass thereof), populated with lots of documents and categories. See the AI::Categorizer::KnowledgeSet manpage for the details of how to create such an object.

categorize($document)

Returns an AI::Categorizer::Hypothesis object representing the categorizer's "best guess" about which categories the given document should be assigned to. See the AI::Categorizer::Hypothesis manpage for more details on how to use this object.

save_state($path)

Saves the categorizer for later use. This method is inherited from AI::Categorizer::Storable.


AUTHOR

Ken Williams, ken@mathforum.org


COPYRIGHT

Copyright 2000-2003 Ken Williams. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.


SEE ALSO

AI::Categorizer(3)
