It's probably a good idea to use the root directory of the archive (where the main pages reside). All index files will be built in this directory. You cannot use the same directory for more than one archive.
WebGlimpse needs to know how to translate local file names into (global) URLs. The answer should have the form http://my.server.name/dir1/dir2, where dir2 is the same as the last directory name given in the first question. For example, if you answered "/usr/foo/wg/archive1" to the first question, and "http://myserver/wgpath/archive1" points to that directory, then that URL should be your answer. Do not add a / at the end! (It's generally a good idea to give the full name of your server; WebGlimpse attempts to translate everything into IP addresses, but that is not foolproof.)
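The mapping this answer sets up can be pictured with a small sketch. It is only an illustration of the idea, not WebGlimpse's own code; the function name local_to_url and the sample file docs/index.html are made up for the example, while the directory and URL are the ones used above.

```python
def local_to_url(local_path, archive_dir, base_url):
    """Map a file under archive_dir to the matching URL under base_url.

    Hypothetical helper for illustration only; WebGlimpse performs its
    own translation internally.
    """
    # Neither answer should end in "/", so normalize both defensively.
    archive_dir = archive_dir.rstrip("/")
    base_url = base_url.rstrip("/")
    inside = local_path == archive_dir or local_path.startswith(archive_dir + "/")
    if not inside:
        raise ValueError(f"{local_path} is not inside {archive_dir}")
    # Keep the part of the path below the archive directory and
    # append it to the base URL.
    relative = local_path[len(archive_dir):]
    return base_url + relative


print(local_to_url("/usr/foo/wg/archive1/docs/index.html",
                   "/usr/foo/wg/archive1",
                   "http://myserver/wgpath/archive1"))
```

Because the remainder of the local path already begins with a /, a trailing / on the base URL would produce a doubled slash, which is one reason the answer should not end in one.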
Name the archive any way you want, but we suggest putting some thought into it: people may want to find you later using this description.
Do you want to build the archive based on traversal from given URLs?
A traversal-based configuration will automatically scan your archive starting from the root URLs (which you will give later) and add all URLs within a certain number of hops (also given later). If you say no, the archive will be built based on its directory structure (similar to glimpseHTTP); the rest of the questions will not be relevant and will be skipped.
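The notion of "all URLs within a certain number of hops" can be sketched as a breadth-first search. This is a minimal illustration over a hand-written link table, not the traversal code WebGlimpse actually runs; the function name, the links dictionary, and the page names are all assumptions made for the example.

```python
from collections import deque


def urls_within_hops(links, roots, max_hops):
    """Return {url: hops} for every URL reachable from the root URLs
    in at most max_hops link-follows (breadth-first search).

    links is a hypothetical stand-in for fetching and parsing pages:
    it maps each URL to the list of URLs that page links to.
    """
    hops = {url: 0 for url in roots}
    queue = deque(roots)
    while queue:
        url = queue.popleft()
        if hops[url] == max_hops:
            continue  # hop limit reached; do not follow links from here
        for target in links.get(url, []):
            if target not in hops:  # first visit is the shortest route
                hops[target] = hops[url] + 1
                queue.append(target)
    return hops


# Made-up link structure: a -> b -> c -> d
links = {
    "http://myserver/a.html": ["http://myserver/b.html"],
    "http://myserver/b.html": ["http://myserver/c.html"],
    "http://myserver/c.html": ["http://myserver/d.html"],
}
found = urls_within_hops(links, ["http://myserver/a.html"], 2)
for url in sorted(found):
    print(found[url], url)
```

With a limit of 2 hops, a.html, b.html, and c.html are collected and d.html is left out, since it is three links away from the root.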
confarc now constructs some files and adds them to the archive directory. The most important ones for you are .wgfilter-index and .wgfilter-box. The first file (.wgfilter-index) provides a way to exclude files from being indexed. Its rules are similar to the way Harvest excludes files from its collection; it works by pattern matching on file names, and the default file is pretty straightforward. The second file (.wgfilter-box) provides a way to keep the WebGlimpse search box from being added to local HTML files, using the same kind of rules. (Obviously, if a file is not indexed, no search box will be added to it.) Standard versions of both files are installed by default. You can change them at any time, and the changes will take effect the next time you index.
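The general idea of excluding files by matching patterns against their names can be sketched as follows. This does not reproduce the actual syntax of .wgfilter-index or .wgfilter-box (look at the default files for that); the pattern list, the function name, and the file names below are hypothetical and only show the principle.

```python
import re

# Hypothetical exclusion patterns, written as Python regular expressions.
# They are NOT copied from a real .wgfilter-index file.
EXCLUDE_PATTERNS = [
    r"\.gif$",      # image files
    r"/private/",   # anything under a "private" directory
]


def is_excluded(file_name, patterns=EXCLUDE_PATTERNS):
    """Return True if file_name matches any of the exclusion patterns."""
    return any(re.search(pattern, file_name) for pattern in patterns)


for name in ["/usr/foo/wg/archive1/index.html",
             "/usr/foo/wg/archive1/logo.gif",
             "/usr/foo/wg/archive1/private/notes.html"]:
    print(name, "excluded" if is_excluded(name) else "kept")
```

In this sketch, a file that matches no pattern is kept, so an empty pattern list excludes nothing.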
Finally comes the moment you've been waiting for: You now enter the root URLs from which WebGlimpse will do the traversal and indexing. WebGlimpse will follow all links (recursively) from the ones you give.
confarc uses two main scripts: makenh (which computes neighborhoods to index) and addsearch (which adds the appropriate search boxes). Each of them can be run separately now or later.
To clean an archive, removing all the stuff that WebGlimpse added, run rmarc. You will only be asked to give the directory of the archive.
Written by Udi Manber
glimpse@cs.arizona.edu