
ORANGEVALLEY INTRANET SEARCH ENGINE
For a Windows 95/98/NT LAN


Install Information - Print out for future reference.
After unzipping the zip file, run Setup. This installs the spider component onto your server. Start the program from the Windows Start button: Programs, orangevalley spider. You need to tell the spider where the Intranet is located, so select 'Configure, Files to Index'. This brings up a fairly straightforward control panel. Select the root directory of the Intranet, normally c:\inetpub\wwwroot\, and save it. The spider will find all the directories and pages below this point. Double clicking any of the displayed HTM pages transfers it to a list of excluded files. The files in this list will not be indexed. The files that you probably don't want to index are frameset files.

Other files
After you have unzipped the software file, there will be five other files, detailed below.
Place the files called orangevalleysearch.asp, orangevalley.mdb and orangevalleysearch.html into the root directory of your Intranet (normally c:\inetpub\wwwroot\). Orangevalleysearch.html is a small web page that has the search box and button. This calls the .asp file, which reads the database. A file called advobs.inc also needs to be placed in the same directory. I have included this file, but it is also part of Windows, so it should be on your own machine somewhere in the Windows directory. If the search engine is not returning any results, check that all the above files are in the root directory of your Intranet. Additionally, there is a file called msjet35.dll in the downloaded bundle. Place this file into the directory where the spider is installed. Various incompatibilities between Windows versions make it safer to put the file here rather than in the Windows system folder.

The file orangevalley.mdb is an empty database that the spider will fill with pages. The data from each web page is disassembled and stored in the database fields, stripped of the HTML tags.
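
If Python happens to be available on the server (it is not part of this package, so treat that as an assumption), a short script along these lines can confirm that the four files listed above are in the Intranet root. The function name missing_files and the default root path used at the bottom are only illustrative.

```python
from pathlib import Path

# File names are taken from the instructions above.
REQUIRED_FILES = [
    "orangevalleysearch.asp",
    "orangevalley.mdb",
    "orangevalleysearch.html",
    "advobs.inc",
]


def missing_files(root):
    """Return the required file names that are not present in `root`."""
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumed default root; change this to match your own Intranet.
    missing = missing_files(r"c:\inetpub\wwwroot")
    print("All files present" if not missing else "Missing: " + ", ".join(missing))
```

An empty result means all four files were found. It does not check msjet35.dll, which belongs in the spider's own directory.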

Search Logging
For the search logging, create a directory called 'stats' in the root directory of the Intranet. In this directory place a text file called orangevalleysearch.txt. This file can be created with Notepad and will log any searches. If the file does not exist, it won't be created by the software. For the logging to take place, the web server will need to have write permissions enabled. The log file contains some HTML formatting so that it can be viewed in a web browser.
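
As an alternative to creating the directory and file by hand, the setup above could be scripted. This is a minimal sketch, assuming Python is installed on the server and that the Intranet root already exists; prepare_log_file is a made-up helper name. It does not change any web server permissions, so those still need setting separately.

```python
from pathlib import Path


def prepare_log_file(root):
    """Create <root>/stats/orangevalleysearch.txt if it does not exist yet."""
    stats_dir = Path(root) / "stats"
    stats_dir.mkdir(exist_ok=True)  # the root directory itself must already exist
    log_file = stats_dir / "orangevalleysearch.txt"
    log_file.touch(exist_ok=True)  # keeps the contents of an existing log
    return log_file
```

For example, prepare_log_file(r"c:\inetpub\wwwroot") would create the empty log file under the usual root directory.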

Because the logging can generate an error with some older versions of IIS, it is disabled by default. To enable it, load the script file orangevalleysearch.asp into Notepad and find the section near the top that looks like this:

'****************** USER ADJUSTABLE VALUES *************************

Loads of text...

KeepLogs=0

'************** END OF USER ADJUSTABLE VALUES *******************

To enable the search log, change KeepLogs=0 to KeepLogs=1.
Before making any changes, make a copy of the file: if you change any other values, the search script may stop working.
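
The copy-then-edit step can also be done with a small script rather than Notepad. The sketch below assumes Python is available and that the setting appears on a line of its own as KeepLogs=0, as shown above. The function name enable_logging and the .bak backup name are my own choices for illustration, not part of the package.

```python
import re
import shutil
from pathlib import Path


def enable_logging(script_path):
    """Back up the script, then switch KeepLogs=0 to KeepLogs=1.

    Returns True if the setting was changed, False if no KeepLogs=0 line was found.
    """
    script = Path(script_path)
    backup = script.with_name(script.name + ".bak")  # hypothetical backup name
    if not backup.exists():
        shutil.copy2(script, backup)  # never overwrite an earlier backup

    data = script.read_bytes()  # bytes, so the original line endings are kept
    # Only touch a line that consists of KeepLogs=0; leave every other value alone.
    pattern = rb"(?mi)^([ \t]*KeepLogs[ \t]*=[ \t]*)0([ \t]*\r?)$"
    new_data, count = re.subn(pattern, rb"\g<1>1\2", data)
    if count:
        script.write_bytes(new_data)
    return count > 0
```

If anything goes wrong, copying the .bak file back over orangevalleysearch.asp restores the original script.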

Other Information
As mentioned above, the search engine spider places the results of the indexing process into an Access database. If you want to write your own script using the results from the spider, these are the available database fields and the information that they contain.

The following fields are created:

url - The URL of the page, relative to the root directory
data - A plain text version of the page, with all the HTML tags removed
title - The text between the title tags on the web page
description - The text from the meta description tag
category - The category of the web page
filedate - The date and time that the file was last edited or created

The category tag is a meta tag of the format

<META NAME="Category" Cat="enter the site category here">

The search engine script does not yet support this tag, but future versions will probably use it to place search results into categories.
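
To illustrate where the title, description and category fields come from, here is a rough sketch in Python using the standard html.parser module. It is not the spider's own code, which is not shown in this document. The names PageFields and extract_fields are invented, and reading the description from a CONTENT attribute is an assumption based on the usual form of that meta tag.

```python
from html.parser import HTMLParser


class PageFields(HTMLParser):
    """Collect the title, description and category of one HTML page."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "description": "", "category": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive in lower case
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name == "description":
                # Assumption: the description text is in the CONTENT attribute.
                self.fields["description"] = attrs.get("content") or ""
            elif name == "category":
                # The category tag uses a Cat attribute, as in the format above.
                self.fields["category"] = attrs.get("cat") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.fields["title"] += data


def extract_fields(html):
    """Return a dict with the title, description and category of a page."""
    parser = PageFields()
    parser.feed(html)
    parser.close()
    return parser.fields
```

A page without a category tag simply gets an empty category string.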

If the search engine does not return any results:
When you call up the search box, orangevalleysearch.html, you must do so via the web server. This means typing http://your_computer_name/orangevalleysearch.html into the web browser. Your_computer_name is the name of the computer as seen in 'Network Neighbourhood' or 'My Network Places'. If the computer does not have a network card installed (an Intranet could be running on a stand-alone computer), use the PWS or IIS help systems to establish the computer name. Opening the HTML search page by double clicking it in Explorer will display the search box in the web browser, but you won't be going via the web server.

A simple way to tell whether you are going via the web server is that the search query is returned after you hit the search button. If the query space is blank, you are not going via the web server.
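
If you are unsure what address to type, the following Python sketch builds it. It assumes Python is installed, and that the host name reported by socket.gethostname() matches the computer name the web server answers to, which is usually but not always the case. search_page_url is an invented helper name.

```python
import socket


def search_page_url(computer_name=None):
    """Build the address of the search page as served by the web server."""
    # Fall back to this machine's own host name if no name is given.
    name = computer_name or socket.gethostname()
    return "http://" + name + "/orangevalleysearch.html"
```

For example, search_page_url("intranet01") gives the address for a computer named intranet01, which is a made-up name.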

If the spider looks like it has got stuck on the first page, make sure that the database is saved to the correct directory and that you have configured the spider to start indexing from this same directory.

Open the database file with MS Access and check that the file has some data in it. If it's empty, you will need an additional driver. A version with the built-in drivers can be downloaded from here. The default download omits the drivers so that any drivers you already have installed are not overwritten, which can occasionally cause problems with other ODBC services. Please contact me if this is confusing, giving me some information on your system, and I will get back to you.

Under the Configure menu on the spider is an item called Script Tags. This allows the spider to filter out any data held between script tags, which is often script code. This script code can be returned with the search results, which is generally unwanted. By default the spider ignores this data. However, some web sites contain the entire web page between these tags, so with the default option set, nothing will be spidered. In that case, try selecting the other option, Spider text between script tags.
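
The effect of the two Script Tags options can be shown with a small Python sketch. This is only an illustration of the idea using the standard html.parser module, not the spider's actual code. The names TextExtractor, page_text and include_script_text are invented for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Strip HTML tags, optionally keeping the text held between script tags."""

    def __init__(self, include_script_text=False):
        super().__init__()
        self.include_script_text = include_script_text
        self._in_script = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Drop script contents unless the caller asked to keep them.
        if self._in_script and not self.include_script_text:
            return
        self.parts.append(data)


def page_text(html, include_script_text=False):
    """Return the page text with tags removed and whitespace collapsed."""
    parser = TextExtractor(include_script_text)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())
```

With the default setting, a page whose whole content sits between script tags produces an empty string, which matches the 'nothing will be spidered' case described above.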

If you get any other errors, drop me an email with as much information as possible about any error messages displayed. The best way to do this is to send me the error page. It would also be useful to know which version of the operating system you are using.