AI::Categorizer::FeatureSelector::ChiSquare
ChiSquare Feature Selection class

NAME

AI::Categorizer::FeatureSelector::ChiSquare - ChiSquare Feature Selection class


SYNOPSIS


 # The recommended way to use this class is to let the KnowledgeSet
 # instantiate it.
 use AI::Categorizer::KnowledgeSetSMART;
 my $ksetCHI = AI::Categorizer::KnowledgeSetSMART->new(
   tfidf_notation    => 'Categorizer',
   feature_selection => 'chi_square', ...other parameters...);

 # However, it is also possible to pass an instance to the KnowledgeSet.
 # The selector must be created through its full package name.
 use AI::Categorizer::KnowledgeSet;
 use AI::Categorizer::FeatureSelector::ChiSquare;
 my $ksetCHI = AI::Categorizer::KnowledgeSet->new(
   feature_selector => AI::Categorizer::FeatureSelector::ChiSquare->new(
     features_kept => 2000, verbose => 1),
   ...other parameters...
   );


DESCRIPTION

Feature selection with the ChiSquare function.


                           N.(AD-CB)^2
  Chi-Square(t,ci) = -----------------------
                     (A+C).(B+D).(A+B).(C+D)

where

   t  = term
   ci = category i
   N  = number of documents in the collection
   A  = number of times t and ci co-occur
   B  = number of times t occurs without ci
   C  = number of times ci occurs without t
   D  = number of times neither ci nor t occurs

For more details, see: Yiming Yang and Jan O. Pedersen, "A Comparative Study on Feature Selection in Text Categorization", in Proceedings of ICML-97, 14th International Conference on Machine Learning, 1997 (available on citeseer.nj.nec.com).
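
As an illustration only, the formula above can be sketched in a few lines of Perl. This is not the module's own code: the function name chi_square, the sample counts, and the variable $features_kept (named after the constructor parameter in the SYNOPSIS) are assumptions made for the example, as is the choice to return 0 when the denominator is zero.

```perl
use strict;
use warnings;

# Chi-square score of a term for one category, using N, A, B, C and D
# as defined in the formula above.
sub chi_square {
    my ($N, $A, $B, $C, $D) = @_;
    my $denominator = ($A + $C) * ($B + $D) * ($A + $B) * ($C + $D);
    # Assumption: score 0 when a marginal total is empty, to avoid
    # dividing by zero.
    return 0 if $denominator == 0;
    return $N * ($A * $D - $C * $B) ** 2 / $denominator;
}

# Hypothetical counts [A, B, C, D] for three terms and one category,
# in a collection of N = 100 documents.
my $N      = 100;
my %counts = (
    goal  => [40, 10, 10, 40],
    match => [30, 20, 20, 30],
    the   => [25, 25, 25, 25],
);

# Score every term.
my %score;
$score{$_} = chi_square($N, @{ $counts{$_} }) for keys %counts;

# Rank terms by descending score (ties broken alphabetically) and keep
# the best ones. $features_kept is a hypothetical cut-off.
my $features_kept = 2;
my @ranked = sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score;
my $last   = $features_kept < @ranked ? $features_kept - 1 : $#ranked;
my @kept   = @ranked[0 .. $last];

printf "%-6s %.2f\n", $_, $score{$_} for @kept;
```

With these counts, "goal" scores 36, "match" scores 4 and "the" scores 0, so the two terms kept are "goal" and "match". A term spread evenly across categories (AD = CB) always scores 0, which is why such terms are the first to be dropped.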


METHODS


AUTHOR

Francois Paradis, paradifr@iro.umontreal.ca, with inspiration from Ken Williams' AI::Categorizer code
