
Search::Circa - a Search Engine / Indexer running with MySQL



This is Search::Circa, a module that provides functions to perform searches on Circa, a WWW search engine running with MySQL. Circa is meant for your own Web site or for a list of sites. It indexes pages the way AltaVista does: it reads and parses each page, adds every URL found in it, and stores URLs and words in MySQL for use at search time.

Circa can be used to index from 100 to 100,000 URLs.


  • Accents are removed both at search time and at indexing time

  • Searches are case-insensitive
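
A minimal sketch of how accent removal and case folding could be applied to a word before it is indexed or searched. The helper name normalize_word is hypothetical and this is not Circa's own code; it only uses the core Unicode::Normalize module.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Normalize qw(NFD);

# Hypothetical helper: fold a word to the form used for both indexing and searching.
sub normalize_word {
    my ($word) = @_;
    my $folded = NFD($word);    # decompose "é" into "e" + combining acute accent
    $folded =~ s/\p{Mn}//g;     # drop the combining marks (the accents)
    return lc $folded;          # fold case
}

binmode STDOUT, ':encoding(UTF-8)';
print normalize_word('Élève'), "\n";    # prints "eleve"
```

Applying the same function on both sides is what makes a search for "eleve" find a page containing "Élève".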

Search::Circa::Search works on the results of Search::Circa::Indexer. Search::Circa::Search is a Perl interface, but this package also includes a PHP client.

Search::Circa is the root class for Search::Circa::Indexer and Search::Circa::Search.


See the Search::Circa::Search manpage and the Search::Circa::Indexer manpage.


  • Search Features
    • Boolean query language support: or (default), and ("+"), not ("-"). Example: perl +faq -cgi returns documents that contain faq, possibly perl, and not cgi.

    • Perl or PHP client

    • Can browse a site by directory / category.

    • Search by different criteria: news, last modified date, language, URL / site.

  • Full text indexing

  • Different weights for the title, keywords, description and the rest of the HTML page can be set in the configuration

  • Inherits features from the LWP suite:
    • Supports the HTTP://, FTP:// and FILE:// protocols (a filesystem can be indexed without talking to a Web server)

    • Full support of the robots exclusion standard (robots.txt). Identifies itself as CircaIndexer/0.1, with a mail address. Requests to the same server are delayed by 8 seconds. "It's not a bug, it's a feature!" This is a basic rule for limiting HTTP server load.

    • Supports HTTP proxies.

  • Builds the index in MySQL

  • Reads HTML and plain text

  • Several kinds of indexing: full, incremental, or only on a particular server.

  • Documents that have not been updated are not reindexed.

  • Every request for a file is made first with an HTTP HEAD request, to obtain information such as validity, last update, size, etc. The size of documents read can be restricted (e.g. do not fetch documents larger than 5 MB), which is useful on low-bandwidth connections or on computers that do not have much memory.

  • HTML templates can be easily customized for your needs.

  • Admin functions are available through a browser interface or the command line.

  • Indexes the different links found in a CGI (everything after name_of_file?)
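
The boolean query language listed under Search Features can be illustrated with a small sketch. The helpers parse_query and matches are hypothetical and are not part of Search::Circa. The sketch also assumes that, when a "+" term is present, plain terms are optional and would only affect ranking.

```perl
use strict;
use warnings;

# Hypothetical helper: split a query into optional, required ("+") and excluded ("-") terms.
sub parse_query {
    my ($query) = @_;
    my %terms = (optional => [], required => [], excluded => []);
    for my $token (split ' ', lc $query) {
        if    ($token =~ /^\+(.+)$/) { push @{ $terms{required} }, $1 }
        elsif ($token =~ /^-(.+)$/)  { push @{ $terms{excluded} }, $1 }
        else                         { push @{ $terms{optional} }, $token }
    }
    return \%terms;
}

# Hypothetical helper: true if the words of a document satisfy the parsed query.
sub matches {
    my ($terms, $words) = @_;
    my %has = map { $_ => 1 } @$words;
    my $excluded_hits = grep { $has{$_} } @{ $terms->{excluded} };
    return 0 if $excluded_hits;
    my $missing = grep { !$has{$_} } @{ $terms->{required} };
    return 0 if $missing;
    # Assumption: with required terms present, optional terms do not filter.
    return 1 if @{ $terms->{required} };
    my $optional_hits = grep { $has{$_} } @{ $terms->{optional} };
    return $optional_hits ? 1 : 0;
}

my $terms = parse_query('perl +faq -cgi');
print matches($terms, [qw(perl faq)]) ? "match\n" : "no match\n";    # match
print matches($terms, [qw(faq cgi)])  ? "match\n" : "no match\n";    # no match
```

In Circa itself the words live in MySQL tables, so the real filtering presumably happens in SQL rather than in Perl.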


Q: Where are the example clients?

A: Look in the demo directory. For the command line, see circa_admin and circa_search; for CGI, take a look in cgi-bin/circa. They are installed with make cgi.

Q: Where are the global parameters for connecting to Circa?

A: Use lib/ file

Q: What is an account in Circa?

A: It's like a project, or a database: a namespace for whatever you want.

Q: How do I get started with the indexer?

A: See the man page of circa_admin.

Q: Did you succeed in using Circa with mod_perl?

A: Yes

Public interface

You use these methods through Search::Circa::Indexer and Search::Circa::Search objects.

connect user, password, database, host
Connect Circa to MySQL. Returns 1 on success, 0 otherwise.
  • user : MySQL user

  • password : MySQL password

  • database : MySQL database

  • host : IP address of the MySQL server

Close the connection to MySQL. This method is called by the DESTROY method of this class.

Get or set the table-name prefix, so that Circa can be used more than once on the same database.

fill_template masque, ref_hash
  • masque : path of the template

  • ref_hash : hash ref of key/value pairs to substitute

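Return the template with its variables replaced. For example, if the template file at $masque contains J'ai <? $age ?> (French for "I am <? $age ?> old"):

 $circa->fill_template($masque, { 'age' => '12 ans' });

will return:

  J'ai 12 ans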
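
A minimal sketch of the <? $name ?> substitution, for illustration only. The function fill_text is a hypothetical stand-in that works on template text instead of a file path, and replacing unknown names with an empty string is an assumption, not documented behaviour of fill_template.

```perl
use strict;
use warnings;

# Hypothetical stand-in for fill_template: works on template text, not on a file path.
sub fill_text {
    my ($text, $vars) = @_;
    # Replace each "<? $name ?>" marker; unknown names become an empty string (assumption).
    $text =~ s{<\?\s*\$(\w+)\s*\?>}{ defined $vars->{$1} ? $vars->{$1} : '' }ge;
    return $text;
}

print fill_text(q{J'ai <? $age ?>}, { age => '12 ans' }), "\n";    # prints "J'ai 12 ans"
```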

fetch_first request
Execute the SQL request on the database and return the first row. In list context, return the full row; otherwise return just the first column.
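
The list/scalar distinction described here is what Perl's wantarray provides. The sketch below uses a hypothetical function, first_row_or_column, which receives the row as arguments instead of fetching it from MySQL; the row values are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in: the row is passed in instead of being fetched from MySQL.
sub first_row_or_column {
    my @row = @_;
    return wantarray ? @row : $row[0];
}

my @full  = first_row_or_column(42, 'http://www.example.com/', 'Example');    # list context
my $first = first_row_or_column(42, 'http://www.example.com/', 'Example');    # scalar context
print scalar(@full), " columns, first is $first\n";    # prints "3 columns, first is 42"
```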

trace level, msg
Print the message msg on standard error if the debug level of the script is higher than level.
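
A minimal sketch of such level-gated tracing. The function trace_msg is hypothetical: it takes the debug level as an explicit argument, and it reads "higher than" as a strict comparison, which is an assumption about the real method.

```perl
use strict;
use warnings;

# Hypothetical stand-in for trace: the debug level is passed explicitly here.
sub trace_msg {
    my ($debug, $level, $msg, $fh) = @_;
    $fh ||= \*STDERR;
    return 0 unless $debug > $level;    # strict "higher than" (assumption)
    print {$fh} "$msg\n";
    return 1;
}

trace_msg(2, 1, 'connected to MySQL');     # printed: 2 is higher than 1
trace_msg(1, 1, 'fetching robots.txt');    # silent: 1 is not higher than 1
```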

prompt message, default_value
Ask on STDIN for a parameter, showing message and default_value, and return the value entered.
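
A minimal sketch of a prompt with a default value. The function ask is hypothetical, and falling back to the default on an empty answer is an assumption about the real method. The input handle is a parameter so the example can run without a terminal.

```perl
use strict;
use warnings;

# Hypothetical stand-in for prompt: the input handle is a parameter so it can be simulated.
sub ask {
    my ($message, $default, $in) = @_;
    $in ||= \*STDIN;
    print "$message [$default]: ";
    my $answer = <$in>;
    $answer = '' unless defined $answer;    # end of input counts as an empty answer
    chomp $answer;
    return length($answer) ? $answer : $default;
}

# Simulate a user who just presses Enter, then one who types a value.
my $input = "\nwww.example.com\n";
open(my $fake, '<', \$input) or die "cannot open in-memory input: $!";
print ask('MySQL host', 'localhost', $fake), "\n";    # falls back to localhost
print ask('MySQL host', 'localhost', $fake), "\n";    # returns www.example.com
```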


the Search::Circa::Indexer manpage, Indexer module

the Search::Circa::Search manpage, Searcher module

the Search::Circa::Annuaire manpage, manages the Circa directory

the Search::Circa::Url manpage, manages the Circa URLs

the Search::Circa::Categorie manpage, manages the Circa categories


$Revision: 1.18 $