Guide to Writing CGI Scripts in REXX and Perl

Last Update: November 19, 1995

Introduction

This Guide is aimed at people who wish to write their own WWW executable scripts using WWW's Common Gateway Interface (CGI). Though the main emphasis is on REXX, many examples are also provided in Perl.

There are some simple software libraries to facilitate writing CGI scripts. cgi-lib.rxx is a REXX library, available at SLAC by using the REXX statement
PUTENV('REXXPATH=/afs/slac/www/slac/www/tool/cgi-rexx')
to include the library at execution time, and cgi-lib.pl is a similar library in Perl written by Steve Brenner. NCSA has a very useful set of Perl CGI handler subroutines that are available via anonymous FTP. Another set of Perl CGI Server Side Scripts, written by Brigitte Jellinek, is available under the GNU public license. There is also the source code for www-leland scripts and programs, and a project, CGI.pm, to create a Perl5 CGI Library. Finally, there is the index to Perl programs and libraries associated with the Web by Eric Hood.

For more on Perl and CGI scripts see also the CGI and Perl Tutorial by Alan Richmond. Also see the WWW Virtual Library for more on Server Side CGI information. Carl Cordova has a page with Mac Web Development Resources. Finally there is Yahoo's Common Gateway Interface Information page where you can also find links for support for writing CGI scripts in C and for Macs, Amigas and other platforms.

Since there are security and other risks associated with executing user scripts in a WWW server, the reader may wish to first view a document providing information on a SLAC Security Wrapper for users' CGI scripts. Besides improving security, this wrapper also simplifies the task of writing a CGI script for a beginner.

Before embarking on writing a script, you may also want to check out some rough notes on SLAC Web Utilities Provided by CGI Scripts.

The CGI is an interface for running external programs, or gateways, under an information server. Currently, the supported information servers are HTTP (HyperText Transfer Protocol, the protocol used by the WWW) servers.

Gateway programs are executable programs (e.g. UNIX scripts) which can be run by themselves (though you would not want to, except for debugging purposes). They are made executable so that they can run under various (possibly very different) information servers interchangeably. Gateway programs conforming to this specification can be written in any language that produces an executable file, including REXX or Perl.

Getting the Input to the Script

The input may be sent to the script in several ways, depending on the client's Uniform Resource Locator (URL) or a HyperText Markup Language (HTML) Form. You can review the REXX Code Fragment giving an example of how to read the various forms of input into your script.
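As a rough illustration of where the input comes from, the sketch below is written in Python rather than REXX or Perl, purely as an assumption of convenience. It relies on the standard CGI environment variables REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH; the function name read_cgi_input is hypothetical, and the requests are simulated so that it runs without a Web server.

```python
import io
import os
import sys


def read_cgi_input(environ=os.environ, stdin=sys.stdin):
    """Return the raw, still URL-encoded input passed to a CGI script."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST data arrives on standard input. CONTENT_LENGTH says how much
        # to read, because the server need not send an end-of-file.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        return stdin.read(length)
    # For GET requests the data follows the "?" in the URL, and the server
    # passes it to the script in the QUERY_STRING environment variable.
    return environ.get("QUERY_STRING", "")


# Simulated GET request: the input comes from QUERY_STRING.
print(read_cgi_input({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Les"}))

# Simulated POST request: only CONTENT_LENGTH characters are read from stdin.
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "8"}
print(read_cgi_input(post_env, io.StringIO("name=Les&ignored")))
```

Both calls print name=Les. In a real script the defaults (the process environment and standard input) would be used instead of the simulated values.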

Decoding Forms Input

When you write a Form, each of your input items has a NAME attribute. When the user places data in these items in the Form, that information is encoded into the Form data. The data the user gives to each input item is called its value.

Form data is a stream of name=value pairs separated by the & character. Each name=value pair is URL encoded, i.e. spaces are changed into plus signs and some characters are encoded as a % followed by two hexadecimal digits.

You can review the REXX or the Perl code fragment giving examples of decoding the Form input.
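The decoding described above can be sketched as follows. This is a minimal Python illustration, not the REXX or Perl fragment this Guide links to; the function name decode_form_data and the sample input are hypothetical.

```python
from urllib.parse import unquote_plus


def decode_form_data(raw):
    """Decode 'name=value&name=value' form data into a list of (name, value)."""
    pairs = []
    for field in raw.split("&"):
        if not field:
            continue  # skip empty segments such as a trailing "&"
        # Split on the first "=" only; a field without "=" gets an empty value.
        name, _, value = field.partition("=")
        # unquote_plus turns "+" into a space and "%XX" into the character.
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


print(decode_form_data("name=Les+Cottrell&topic=REXX+%26+Perl"))
# prints [('name', 'Les Cottrell'), ('topic', 'REXX & Perl')]
```

The fields are split on & before decoding, so an encoded %26 inside a value is not mistaken for a separator. Python's standard urllib.parse.parse_qsl does the same job; the loop is spelled out here only to make the steps visible.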

Sending Document Back to Client

CGI programs can return a myriad of document types. They can send back an image to the client, an HTML document, a plaintext document, a PostScript document or perhaps even an audio clip of your bodily functions. They can also return references to other documents (to save space we will ignore this latter case here; more information may be found in NCSA's CGI Primer). The client must know what kind of document you are sending it so that it can present it accordingly. For the client to know this, your CGI program must tell the server what type of document it is returning.

To tell the server what kind of document you are sending back, CGI requires you to place a short header on your output. This header is ASCII text, consisting of lines separated by either linefeeds or carriage returns followed by linefeeds. Your script must output at least two such lines before its data will be sent directly back to the client. These lines are used to indicate the MIME type of the following document.

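Some common MIME types relevant to the WWW, matching the document types mentioned above, are text/html, text/plain, image/gif, application/postscript and audio/basic.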

In order to tell the server your output's content type, the first line of your output should read:
Content-type: type/subtype
where type/subtype is the MIME type and subtype for your output.

Next, you have to send the second line. With the current specification, THE SECOND LINE SHOULD BE BLANK. This means that it should have nothing on it except a linefeed. Once the server receives this line, it knows that you have finished telling it about your output and that the actual output follows. If you skip this line, the server will attempt to parse your output trying to find further information about your request and you will become very unhappy.

You can review a REXX Code Fragment giving an example of handling the Content-type information.

After these two lines have been output, any output to stdout (e.g. a REXX SAY command) will be included in the document sent to the client.
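The two header lines and the document that follows them can be sketched like this. It is a hedged Python illustration rather than the REXX fragment referred to above; the function name build_response is hypothetical.

```python
import sys


def build_response(body, content_type="text/html"):
    """Return a CGI response: Content-type line, blank line, then the document."""
    header = "Content-type: " + content_type + "\n"
    blank_line = "\n"  # the blank second line tells the server the header is over
    return header + blank_line + body


# Everything after the blank line is the document sent to the client.
sys.stdout.write(build_response("<html><body><h1>Hello from CGI</h1></body></html>\n"))
```

Building the whole response as one string before writing it is only a design choice for this sketch; it keeps the header and blank line from being forgotten or written out of order.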

Diagnostics and Reporting Errors

Since stdout is included in the document sent to the client, diagnostics output with the SAY command will appear in the document. This output will need to be consistent with the Content-type: type/subtype mentioned above.

You can review a REXX Code Fragment giving an example of diagnostic reporting.

If errors are encountered (e.g. no input provided, invalid characters found, too many arguments specified, an invalid command requested for execution, invalid syntax in the REXX exec), the script should provide detailed information on what is wrong. It may also be very useful to report the settings of the various WWW Environment Variables.

You can review a REXX Code Fragment giving an example of error reporting and Typical Output Generated from such a code fragment.
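One possible shape for such an error report is sketched below in Python, as an illustration only and not the REXX fragment linked above. The function name error_page and the choice of variables to list are assumptions; the variable names themselves are standard CGI environment variables.

```python
import html
import os

# A selection of standard CGI environment variables worth reporting.
CGI_VARIABLES = ("REQUEST_METHOD", "QUERY_STRING", "CONTENT_LENGTH",
                 "PATH_INFO", "SCRIPT_NAME", "REMOTE_HOST")


def error_page(message, environ=os.environ):
    """Return a complete CGI response describing an error and the environment."""
    lines = [
        "Content-type: text/html",
        "",  # blank line that ends the header
        "<html><head><title>CGI Script Error</title></head><body>",
        "<h1>CGI Script Error</h1>",
        # Escape the message so that user input cannot inject HTML markup.
        "<p>%s</p>" % html.escape(message),
        "<pre>",
    ]
    for name in CGI_VARIABLES:
        value = environ.get(name, "(not set)")
        lines.append("%s = %s" % (name, html.escape(value)))
    lines += ["</pre>", "</body></html>"]
    return "\n".join(lines) + "\n"


# Simulated environment, so the sketch runs without a Web server.
print(error_page("No input provided", {"REQUEST_METHOD": "GET", "QUERY_STRING": ""}), end="")
```

Because the report is itself the document returned to the client, it starts with the same Content-type header and blank line as any other output, and the diagnostic text is written as HTML to stay consistent with that type.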

My First REXX CGI Script

To get your Web server to execute a CGI script you must:

The Web-Master will want to ensure that the Security Aspects of your script have been addressed before adding your script to the Rules file.

Other Sources of Interest

The book HTML & CGI Unleashed has much useful information on writing CGI scripts in C, Perl and REXX.

Also tune into the newsgroup comp.infosystems.www.authoring.cgi, which covers the development of Common Gateway Interface (CGI) scripts as they relate to Web page authoring. Possible subjects include how to handle the results of forms, how to generate images on the fly, and how to put together other interactive Web offerings.

Marc Hedlund keeps a CGI Frequently Asked Questions list (READ THIS FIRST).

The World Wide Web (Frequently Asked Questions, with Answers) answers many, many questions about the World Wide Web in general.

There is also the Yahoo Forms Collection which shows many entries for forms information.

If you are using Perl and you have a general Perl question that isn't really a CGI-specific question, check out the Perl FAQ.

Acknowledgements

Much of the text on the Common Gateway Interface and Forms comes from NCSA documents. Useful information and text were also obtained from The World-Wide Web: How Servers Work, by Mark Handley and Jon Crowcroft, published in ConneXions, February 1995.


Les Cottrell