
4.3.3 Example RootNode configuration

 

Below is an example RootNode configuration. The numbers in parentheses label the entries for the discussion that follows:

          <RootNodes>
    (1)   http://harvest.cs.colorado.edu/ URL=100,MyFilter
    (2)   http://www.cs.colorado.edu/     Host=50 Delay=60
    (3)   gopher://gopher.colorado.edu/   Depth=1
    (4)   file://powell.cs.colorado.edu/homes/hardy Depth=2
    (5)   ftp://ftp.cs.colorado.edu/pub/cs/techreports  Depth=1
    (6)   http://harvest.cs.colorado.edu/~hardy/hotlist.html \
                  Depth=1 Delay=60
    (7)   http://harvest.cs.colorado.edu/~hardy/ \
                  Depth=2 Access=HTTP|FTP
          </RootNodes>
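To make the layout of these entries concrete, here is a minimal Python sketch that splits such a block into a URL and its Key=Value options. It is not Harvest's own parser: the function name parse_rootnodes is hypothetical, the handling of the backslash as a line continuation is inferred from entries (6) and (7), and the leading "(n)" labels are assumed to be annotations and are dropped.

```python
import re


def parse_rootnodes(text):
    """Split a <RootNodes> block into (url, options) pairs (illustrative only)."""
    # Assumption: a trailing backslash continues the entry on the next line.
    joined = re.sub(r"\\[ \t]*\n", " ", text)
    nodes = []
    for line in joined.splitlines():
        line = line.strip()
        # Skip blank lines and the <RootNodes> / </RootNodes> tags.
        if not line or line.startswith("<"):
            continue
        # Assumption: a leading "(n)" is a label for the reader, not configuration.
        line = re.sub(r"^\(\d+\)\s*", "", line)
        url, *fields = line.split()
        options = {}
        for field in fields:
            key, _, value = field.partition("=")
            options[key] = value  # values are kept as raw strings
        nodes.append((url, options))
    return nodes


example = """<RootNodes>
(1)   http://harvest.cs.colorado.edu/ URL=100,MyFilter
(7)   http://harvest.cs.colorado.edu/~hardy/ \\
              Depth=2 Access=HTTP|FTP
</RootNodes>"""

for url, options in parse_rootnodes(example):
    print(url, options)
```

Values such as 100,MyFilter are left unsplit here; separating the count from the filter file name would be one further split on the comma.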

Each RootNode in the example uses a different enumeration configuration:

  1. This RootNode will gather up to 100 documents that pass through the URL name filters contained within the file MyFilter.

  2. This RootNode will gather the documents from up to the first 50 sites it encounters while enumerating the specified URL, with no limit on the Depth of link enumeration. It will also wait 60 seconds between retrievals.

  3. This RootNode will gather only the documents from the top-level menu of the Gopher server at gopher.colorado.edu.

  4. This RootNode will gather all documents that are in the /homes/hardy directory, or that are in any subdirectory of /homes/hardy.

  5. This RootNode will gather only the documents that are in the /pub/cs/techreports directory which, in this case, contains bibliographic files rather than the technical reports themselves.

  6. This RootNode will gather all documents that are within 1 step of the specified RootNode URL, waiting 60 seconds between retrievals. This is a good way to index your hotlist: if the RootNode is an HTML file containing "hotlist" pointers, this enumeration will gather the top-level page of each of those pointers.

  7. This RootNode will gather all documents that are at most 2 steps away from the specified RootNode URL. Furthermore, it will follow and enumerate any HTTP or FTP URLs that it encounters during enumeration.
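The way the URL, Host, and Depth limits interact can be pictured with a small simulation. The following Python sketch walks a toy link graph breadth-first under those three limits. It is an illustration under stated assumptions, not the Gatherer's actual algorithm: the function enumerate_links, the example.com-style URLs, the breadth-first order, and the use of None for "no limit" are all hypothetical, and Depth is counted as in items 6 and 7 (steps away from the RootNode URL). Delay, Access, and URL filters are left out.

```python
from collections import deque
from urllib.parse import urlsplit


def enumerate_links(root, links, url_max=None, host_max=None, depth=None):
    """Walk a toy link graph breadth-first under RootNode-style limits.

    links maps each URL to the URLs it points at. None means "no limit",
    which is a convention of this sketch only.
    """
    gathered = []
    hosts = []
    seen = {root}
    queue = deque([(root, 0)])  # (url, steps away from the root)
    while queue:
        url, steps = queue.popleft()
        if url_max is not None and len(gathered) >= url_max:
            break  # document limit reached
        host = urlsplit(url).hostname
        if host not in hosts:
            if host_max is not None and len(hosts) >= host_max:
                continue  # a new site beyond the host limit: skip it
            hosts.append(host)
        gathered.append(url)
        if depth is not None and steps >= depth:
            continue  # do not follow links past the depth limit
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, steps + 1))
    return gathered


# Hypothetical link graph used only for illustration.
toy_links = {
    "http://a.example.com/": ["http://a.example.com/1", "http://b.example.com/"],
    "http://a.example.com/1": ["http://a.example.com/2"],
    "http://b.example.com/": ["http://c.example.com/"],
}

print(enumerate_links("http://a.example.com/", toy_links, depth=1))
print(enumerate_links("http://a.example.com/", toy_links, host_max=1))
```

With depth=1 the walk stops at the pages linked directly from the root, as in item 6; with host_max=1 it stays on the first site it encounters, however deep the links go.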



Duane Wessels
Wed Jan 31 23:46:21 PST 1996