sitemapper.pl
script for generating site maps


NAME

sitemapper.pl - script for generating site maps


SYNOPSIS


    sitemapper.pl 

        [ -verbose <debug level> ] 

        [ -help ] 

        [ -doc ] 

        [ -version ]

        [ -depth <depth> ] 

        [ -proxy <proxy URL> ] 

        [ -[no]envproxy ] 

        [ -agent <agent> ]

        [ -authen ] 

        [ -format <html|text|js|xml> ] 

        [ -summary <no. chars> ] 

        [ -title <page title> ] 

        [ -email <e-mail address> ]

        [ -gui ]

        -url <root URL>


DESCRIPTION

sitemapper.pl generates a site map for a given site. It traverses the site starting from the root URL given with the -url option and generates an HTML page consisting of a bulleted list that reflects the structure of the site.

The structure reflects the distance of each listed page from the home page: the first-level bullets are pages accessible directly from the home page, the next level are pages accessible from those pages, and so on. As a result, a page that is also linked from a page higher up may appear higher in the tree than where it logically "belongs".
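The levelling described above can be illustrated with a short sketch. This is not the script's actual code (sitemapper.pl is written in Perl and relies on WWW::Sitemap); it is a minimal Python illustration that assumes a breadth-first traversal, which is one way to obtain "distance from the home page". The `links` dictionary, the `map_site` function and its `max_depth` parameter are hypothetical names standing in for fetched pages and the -depth option.

```python
from collections import deque

def map_site(links, root, max_depth=None):
    """Return {page: depth}, visiting pages breadth-first from root.

    links is a hypothetical in-memory stand-in for fetched pages:
    {url: [urls linked from that page]}.
    """
    depth = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        if max_depth is not None and depth[page] >= max_depth:
            continue  # pages at the depth limit are listed but not expanded
        for target in links.get(page, []):
            if target not in depth:  # the first (shallowest) sighting wins
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

# Hypothetical site: the widget page is linked from the home page
# as well as from its "natural" parent, /products.
links = {
    "/": ["/products", "/about", "/products/widget"],
    "/products": ["/products/widget", "/products/gadget"],
    "/about": [],
}
depths = map_site(links, "/")
for page, level in depths.items():
    print(level, page)
```

In this sketch /products/widget ends up at level 1 because the home page links to it directly, even though it "belongs" under /products; /products/gadget stays at level 2.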

The -format option can be used to specify alternative formats for the site map. Currently the options are html (as described above; the default), js, which uses Jef Pearlman's (jef@mit.edu) Javascript Tree class to display the site map as a collapsible tree, text (plain text), and xml (an XML graph of linkage between pages).
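To make the difference between the html and text formats concrete, here is a rough Python sketch that renders the same tree both ways. The exact markup and text layout produced by sitemapper.pl are not shown in this document, so the output shapes below are assumptions; the `(title, url, children)` tuple and the `render_html` and `render_text` functions are hypothetical.

```python
from html import escape

def render_html(node):
    """Render a (title, url, children) tuple as a nested bulleted list item."""
    title, url, children = node
    item = '<li><a href="%s">%s</a>' % (escape(url, quote=True), escape(title))
    if children:
        item += "<ul>" + "".join(render_html(c) for c in children) + "</ul>"
    return item + "</li>"

def render_text(node, level=0):
    """Render the same tuple as indented plain-text lines."""
    title, url, children = node
    lines = ["  " * level + "%s (%s)" % (title, url)]
    for child in children:
        lines.extend(render_text(child, level + 1))
    return lines

# Hypothetical three-level site.
site = ("Home", "/", [
    ("Products", "/products", [("Widget", "/products/widget", [])]),
    ("About", "/about", []),
])

print("<ul>" + render_html(site) + "</ul>")
print("\n".join(render_text(site)))
```

Both renderers walk the same tree recursively; only the way nesting is expressed (nested `<ul>` elements versus indentation) differs.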


OPTIONS

-depth <depth>

Option to specify the depth of the site map generated. If not specified, a site map of unlimited depth is generated.

-email <e-mail address>

Option to specify the e-mail address which is reported by the robot to the site it gets pages from.

-url <root URL>

Option to specify a root URL to generate a site map for.

-proxy <proxy URL>

Specify an HTTP proxy to use.

-[no]envproxy

If -envproxy is set, the proxy specified by the $http_proxy environment variable will be used (this is the default behaviour). Use -noenvproxy to suppress this. -proxy takes precedence over -envproxy.

-agent <agent>

Allows the user to specify an agent for the robot to pretend to be (e.g. 'Mozilla/4.5'). This can be necessary for sites that use browser sniffing to decide which content to serve.

-format <formatting option>

Option for specifying the format of the site map. Possible values are:

html
Plain old HTML bulleted list.

js
A collapsible DHTML tree, generated using Jef Pearlman's (jef@mit.edu) Javascript Tree class.

text
Plain text.

xml
An XML graph of linkage between pages.

-summary <no. chars>

Automatically extract a summary to display with the title. This will be truncated at the specified number of characters.

-title <page title>

Option to specify a page title for the site map.

-authen

Option to use LWP::AuthenAgent to get HTML pages. This allows the user to type in a username and password for pages that are access controlled.

-gui

Use a Tk GUI to run sitemapper.

-help

Display a short help message to standard output, with a brief description of purpose, and supported command-line switches.

-doc

Display the full documentation for the script, generated from the embedded POD documentation.

-version

Print out the current version number.

-verbose <debug level>

Turn on verbose error messages.


ENVIRONMENT

sitemapper.pl makes use of the $http_proxy environment variable, if it is set.
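The interaction between $http_proxy and the -proxy and -[no]envproxy options described under OPTIONS can be summarised in a small sketch. It only models the documented precedence and does not reproduce the script's Perl code; `resolve_proxy` and its parameters are hypothetical names, and the proxy URLs are placeholders.

```python
import os

def resolve_proxy(proxy=None, envproxy=True, environ=None):
    """Hypothetical helper: choose the proxy URL using the documented precedence."""
    environ = os.environ if environ is None else environ
    if proxy:
        return proxy                      # -proxy takes precedence
    if envproxy:
        return environ.get("http_proxy")  # -envproxy is the default behaviour
    return None                           # -noenvproxy and no -proxy: no proxy

# Placeholder environment used instead of the real one.
env = {"http_proxy": "http://proxy.example.com:8080"}

print(resolve_proxy(environ=env))
print(resolve_proxy(proxy="http://other.example.com:3128", environ=env))
print(resolve_proxy(envproxy=False, environ=env))
```

The three calls correspond to running with no proxy options, with -proxy, and with -noenvproxy respectively.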


PREREQUISITES


    Date::Format

    HTML::Entities

    Getopt::Long

    IO::File

    LWP::AuthenAgent

    LWP::UserAgent

    Pod::Usage

    URI::URL

    WWW::Sitemap


OSNAMES


    hpux 10 PA-RISC1.1 

    linux 2.2.1 ppc-linux 

    linux 2.2.2 i686-linux 

    MSWin32 4.0 MSWin32-x86 

    sunos 4.1.4 sun4-sunos 

    sunos 5.6 sun4-solaris


SEE ALSO

Jef Pearlman's Javascript Tree class (http://developer.netscape.com/docs/examples/dynhtml/tree.html)


BUGS

The Javascript sitemap has only been tested on Netscape 4.05.


AUTHOR

Ave Wrigley <Ave.Wrigley@itn.co.uk>


COPYRIGHT

Copyright (c) 1998 Canon Research Centre Europe. All rights reserved.

This script is free software; you can redistribute it and/or modify it under the same terms as Perl itself.


SCRIPT CATEGORIES

Web
