The simplest way to start the Harvest system is to use the RunHarvest command (this is the command you will use if you follow the instructions in the binary Harvest distribution). RunHarvest asks the user a short list of questions about what data to index and related settings, and then creates and runs a Gatherer and Broker with a "stock" (non-customized) set of content extraction and indexing mechanisms. More primitive commands are also available for starting individual Gatherers and Brokers (e.g., if you want to distribute the gathering process). Some commands require that the user set the HARVEST_HOME environment variable to indicate where Harvest is installed. The Harvest startup commands are:
There is no CreateGatherer command, but the RunHarvest command can create a Gatherer, or you can create a Gatherer manually (see Section 4.5.4 or Appendix C). The layout of the installed Harvest directories and programs is discussed in Appendix A.
Among other things, the RunHarvest command asks the user what port numbers to use when running the Gatherer and the Broker. By default, the Gatherer will use port 8500 and the Broker will use the Gatherer port plus 1. The choice of port numbers depends on your particular machine -- you need to choose ports that are not in use by other servers on your machine. You might look at your /etc/services file to see what ports are in use (although this file lists only some servers; other servers use ports without registering that information anywhere). Usually the default port numbers above will not be in use by other processes. Probably the easiest approach is simply to try the default port numbers and see if they work.
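If you would rather check the default ports before answering the RunHarvest prompts, the idea can be sketched in a few lines of Python. This is only an illustrative sketch and not part of Harvest: the function names port_is_free and check_default_ports are hypothetical, and the only facts taken from the text above are the default Gatherer port (8500) and the Broker default of the Gatherer port plus 1. The check simply tries to bind a TCP socket to each port, so it reports the state of the machine at that moment only.

```python
import socket


def port_is_free(port, host=""):
    """Return True if a TCP socket can be bound to `port` right now.

    Hypothetical helper, not a Harvest command. An empty host means
    "all local interfaces".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use", or no permission to bind.
            return False
    return True


def check_default_ports(gatherer_port=8500):
    """Check the Gatherer port and the Broker port (Gatherer port plus 1)."""
    broker_port = gatherer_port + 1
    return {
        "gatherer": (gatherer_port, port_is_free(gatherer_port)),
        "broker": (broker_port, port_is_free(broker_port)),
    }


if __name__ == "__main__":
    for role, (port, free) in check_default_ports().items():
        status = "appears free" if free else "appears to be in use"
        print(f"{role} port {port} {status}")
```

A port that appears free here can still be taken by another server before Harvest starts, so simply trying the defaults, as suggested above, remains a reasonable approach.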
Once you have successfully built a Harvest Gatherer, Broker, or Cache, please register your server(s) with the Harvest Server Registry (HSR) using our registration page. The RunHarvest command will ask you if you'd like to register your servers with the HSR. If you answer yes, then you do not need to use the registration page.
The remainder of this manual provides information for users who wish to customize Harvest or otherwise use it in more sophisticated ways than the default setup produced by installing the system and running RunHarvest.