Swish-e Web Indexing System

Overview

Swish-e (Simple Web Indexing System for Humans - Enhanced) is a free, open-source web indexing tool that can be used to index Stanford web pages. This is a brief primer on using Swish-e with the main Stanford www servers. To learn more about Swish-e, see the Swish-e web site.

Installing Swish-e

Download the files

To use Swish-e you must have a cgi-bin directory. Visit the cgi-bin service web page to learn more and to request cgi-bin service for your group or department directory. You should also be generally familiar with editing files in AFS, including dot files; opening zip files in AFS; and basic HTML for editing several form fields.

Create the directory "swish-e" under your cgi-bin directory, then download the default configuration files. In the examples that follow, we assume that you want to index http://web.stanford.edu/group/example/, mapping to /afs/ir/group/example/. Unzip the files into /afs/ir/group/example/cgi-bin/swish-e/.
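The directory and unzip step can be sketched as a small shell function. This is only a sketch: the archive path you pass in is whatever file you actually downloaded (no archive name is given here), and it assumes the standard unzip command is available on the system you are logged in to.

```shell
# Sketch only: pass the path of the zip file you actually downloaded.
install_swish_files() {
    archive=$1   # path to the downloaded zip file
    dest=$2      # e.g. /afs/ir/group/example/cgi-bin/swish-e

    # Stop early, before creating anything, if the archive is missing.
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    # Create the swish-e directory under cgi-bin, then unpack into it.
    # unzip asks before overwriting files that already exist there.
    mkdir -p "$dest" && unzip "$archive" -d "$dest"
}
```

For the example group you would call it as install_swish_files followed by the path to your downloaded archive and /afs/ir/group/example/cgi-bin/swish-e. Afterwards, ls -a on that directory also lists dot files such as .swishcgi.conf, which a plain ls hides.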

Create the Swish-e database

  1. Swish-e works by first creating a database of your web pages for the search web page to use. To do so, you must edit the included swish.conf file. First set where the database index file goes by changing the line that begins with IndexFile to the proper path. For instance:

    IndexFile /afs/ir/group/example/cgi-bin/swish-e/index.swish-e

  2. By default you will probably want to index everything under a certain directory. Edit the line starting with IndexDir to include that directory. For the example, change it to:

    IndexDir /afs/ir/group/example/WWW/

    Using the file system lets you get around WebAuth restrictions. However, it will not work for dynamically generated content, such as PHP pages that have the bulk of their content generated from a database. For such pages you can instead have Swish-e fetch the pages over the web, so that it sees them as they actually display. This does mean that you are restricted to pages that are not behind WebAuth.
  3. To index pages over the web, use the following two lines in your swish.conf file in place of the IndexDir line above.

    IndexDir /usr/lib/swish-e/spider.pl
    SwishProgParameters default http://web.stanford.edu/group/example/

    Change the URL after "SwishProgParameters default" to the main URL from which indexing of your site should start.

    Once you have finished editing swish.conf, log in to a UNIX timeshare system such as the cardinals and run Swish-e to create the database files.

    • If using the file access method, use:

       swish-e -c /afs/ir/group/example/cgi-bin/swish-e/swish.conf

    • If using the URL access method, you need to tell Swish-e that you are doing so with:

       swish-e -c /afs/ir/group/example/cgi-bin/swish-e/swish.conf -S prog

    Rerun this command every time you want to update the contents of your searches. If you update your web pages, the new content will not appear in search results until you rebuild the index.
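Because the index has to be rebuilt after every content change, it can help to wrap the two commands above in a small script. The following is a minimal sketch, not part of the downloaded archive: the function names are made up for illustration, and the check only confirms that swish.conf contains uncommented IndexFile and IndexDir lines, not that their values are correct.

```shell
# Confirm that swish.conf sets the two directives edited in the steps above.
check_swish_conf() {
    conf=$1
    for directive in IndexFile IndexDir; do
        # Lines that start with "#" do not match, so commented-out
        # directives count as missing.
        if ! grep -Eq "^[[:space:]]*${directive}[[:space:]]+[^[:space:]]" "$conf"; then
            echo "missing ${directive} in ${conf}" >&2
            return 1
        fi
    done
}

# Rebuild the index with either access method.
rebuild_index() {
    conf=$1
    method=${2:-fs}   # "fs" = file access method, "prog" = URL access method
    check_swish_conf "$conf" || return 1
    case "$method" in
        fs)   swish-e -c "$conf" ;;
        prog) swish-e -c "$conf" -S prog ;;
        *)    echo "unknown method: $method (use fs or prog)" >&2
              return 1 ;;
    esac
}
```

For the example group, rebuild_index /afs/ir/group/example/cgi-bin/swish-e/swish.conf rebuilds using the file access method, and adding prog as a second argument uses the URL access method. To spot-check the result from the command line, swish-e can search an index directly with its -f (index file) and -w (search words) options, for instance swish-e -f /afs/ir/group/example/cgi-bin/swish-e/index.swish-e -w stanford.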

Configure the results page

The next step is to configure the page that is displayed when you perform a search. The actual page is swish.cgi, installed from the archive downloaded above. You don't need to change swish.cgi itself, only its config file, .swishcgi.conf. You will need to change the following lines:

title
The title of the page; change 'Search Results' to what you want on the page title.
swish_index
The location of the index file, the same as the IndexFile setting given above. You should only need to change the directory.
path
The directory where all of your config files are placed. This should be the same as swish_index without the trailing index.swish-e.
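Since swish_index must name the same file as IndexFile, a quick consistency check can catch a mismatch before you test in the browser. The sketch below reads the IndexFile path from swish.conf and looks for that exact string anywhere in .swishcgi.conf. It is a rough check under a stated assumption: that the path is written out literally in .swishcgi.conf. If your copy builds the path from variables, the check will report a mismatch even when the setup is correct. The function name is illustrative.

```shell
# Succeeds if the IndexFile path from swish.conf appears literally in
# the CGI config file; fails otherwise.
index_paths_match() {
    swish_conf=$1   # e.g. /afs/ir/group/example/cgi-bin/swish-e/swish.conf
    cgi_conf=$2     # e.g. /afs/ir/group/example/cgi-bin/swish-e/.swishcgi.conf

    # Take the value of the first uncommented IndexFile line.
    index_file=$(awk '$1 == "IndexFile" { print $2; exit }' "$swish_conf")
    if [ -z "$index_file" ]; then
        echo "no IndexFile line in $swish_conf" >&2
        return 1
    fi

    # -F treats the path as a fixed string rather than a pattern.
    grep -qF -- "$index_file" "$cgi_conf"
}
```

A typical use is index_paths_match swish.conf .swishcgi.conf || echo "swish_index does not match IndexFile", run from inside the swish-e directory.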

The default install uses a version of the Stanford Modern theme, included in the archive. You can customize it or make your own version. Both stanfordmodern.tmpl and swish.tmpl are included: the first is used by default, while swish.tmpl is Swish-e's own default template, included as an example. To change the general look, change the "filename" line in .swishcgi.conf to a different template file placed in this directory. University IT does not support template modifications in general, so back up the template before you start modifying it, in case of a problem.

You do need to make two modifications to the template: there are two places with a form field that should point to the CGI script. By default they are both:

http://web.stanford.edu/group/example/cgi-bin/swish-e/swish.cgi

Change this to the actual URL where you have placed the script. If you followed the default directions, that should only involve changing the directory. You can then test that everything works by going directly to that URL and performing a search.

Add a search box

Lastly, you likely will want to embed the search box on your pages for general use. This is all you need for a basic form:

<form id="search" action="http://web.stanford.edu/group/example/cgi-bin/swish-e/swish.cgi" method="get" enctype="application/x-www-form-urlencoded">
<input type="hidden" name="metaname" value="swishdefault" />
<input type="hidden" name="sort" value="swishrank" />
<input type="text" value="Search..." name="query" id="query" alt="Search field" />
<button type="submit" value="Search">Search</button>
</form>

If you are using the Stanford Modern template and want a search box that matches its look, follow this example:

<form id="search" action="http://web.stanford.edu/group/example/cgi-bin/swish-e/swish.cgi" method="get" enctype="application/x-www-form-urlencoded">
<div class="searchbox">
<input type="hidden" name="metaname" value="swishdefault" />
<input type="hidden" name="sort" value="swishrank" />
<input onfocus="this.value=''" type="text" value="Search..." name="query" id="query" alt="Search field" />
<button type="submit" value="Search" class="search_button">Search</button>
</div>
</form>

Note: for both examples, be sure to change the form action to your own swish.cgi.
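Because both forms use method="get", submitting one is equivalent to requesting swish.cgi with the form fields as a query string. The sketch below builds that URL from the field names used in the forms above (query, metaname, sort), so you can check the script without a browser. The function name is illustrative, and no URL-encoding is done, so it only suits a single plain search word.

```shell
# Print the URL that the search form would request for a one-word query.
search_url() {
    base=$1    # URL of your own swish.cgi
    query=$2   # a single plain word; it is not URL-encoded here
    printf '%s?query=%s&metaname=swishdefault&sort=swishrank\n' "$base" "$query"
}
```

For example, curl -s "$(search_url http://web.stanford.edu/group/example/cgi-bin/swish-e/swish.cgi stanford)" fetches the results page for the word "stanford", assuming curl is installed and the URL points at your own script.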

Support

University IT cannot provide in-depth support for Swish-e. However, if you have problems following the instructions above, please submit a HelpSU request. For complete information on Swish-e, see the Swish-e web site.

Last modified December 10, 2015