CGI Programming on the World Wide Web

Chapter 5: Server Side Includes

5.7 Executing CGI Programs

You can use Server Side Includes to embed the results of an entire CGI program into a static HTML document, using the exec cgi directive.

Why would you want to do this? There are many times when you want to display just one piece of dynamic data, such as:

This page has been accessed 4883 times since December 10, 1995. 

Surely, you've seen this type of information in many documents around the Web. Obviously, this information is being generated dynamically (since it changes every time you access the document). We'll show you a few examples of embedded CGI programs using SSI.

User Access Counter

Suppose you have a simple CGI program that keeps track of the number of visitors, called by the exec SSI command in an HTML document:

This page has been accessed <!--#exec cgi="/cgi-bin/counter.pl"--> times.

The idea behind an access counter is simple. A data file on the server contains a count of the number of visitors that have accessed a particular document. Whenever a user visits the document, the SSI command in that document calls a CGI program that reads the numerical value stored in the file, increments it, and writes the new information back to the file and outputs it. Let's look at the program:

#!/usr/local/bin/perl
print "Content-type: text/plain", "\n\n";
$count_file = "/usr/local/bin/httpd_1.4.2/count.txt";
if (open (FILE, "<" . $count_file)) {
        $no_accesses = <FILE>;
        close (FILE);
        if (open (FILE, ">" . $count_file)) {
            $no_accesses++;
            print FILE $no_accesses;
            close (FILE);
            print $no_accesses;
        } else {
            print "[ Can't write to the data file! Counter not incremented! ]", "\n";
        }
} else {
        print "[ Sorry! Can't read from the counter data file ]", "\n";
}
exit (0);

Because the program opens the data file itself, it needs the full path to the file. It first tries to open the file for reading. If the file cannot be opened, an error message is returned. Otherwise, we read one line from the file using the <FILE> notation and store it in the variable $no_accesses. Then the file is closed. This is important because a filehandle opened in read mode cannot be used for writing.

Once that's done, the file is opened again, but this time in write mode, which truncates the existing file to zero length (or creates it if it does not exist). If that's not successful, probably due to permission problems, the program outputs an error message stating that information cannot be written to the file. If there are no problems, we increment the value stored in $no_accesses. This new value is written to the file and printed to standard output.

Notice how this program, like other CGI programs we've covered up to this point, also outputs a Content-type HTTP header. In this case, a text/plain MIME content type is output by the program.

Note that a CGI program called by an SSI directive cannot output anything other than text, because its output is embedded within the HTML or plain-text document that invoked the directive. As a result, it doesn't matter whether you output a content type of text/plain or text/html, as the browser will interpret the data within the scope of the calling document. Needless to say, your CGI program cannot output graphic images or other binary data.

This CGI program is not as sophisticated as it should be. First, if the file does not exist, you will get an error if you open it in read mode. So, you must put some initial value in the file manually, and set permissions on the file so that the CGI program can write to it:

% echo "0" > /usr/local/bin/httpd_1.4.2/count.txt
% chmod 666 /usr/local/bin/httpd_1.4.2/count.txt

These shell commands write an initial value of "0" to the count.txt file, and set the permissions so that all processes can read from and write to the file. Remember, the HTTP server is usually run by a process with minimal privileges (e.g., "nobody" or "www"), so the permissions on the data file have to be set so that this process can read and write to it.

The other major problem with this CGI program is that it does not lock and unlock the counter data file. This is extremely important when you are dealing with concurrent users accessing your document at the same time. A good CGI program must try to lock a data file when in use, and unlock it after it is done with processing. A more advanced CGI program that outputs a graphic counter is presented in Chapter 6, Hypermedia Documents.
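
As a rough illustration of the locking described above, here is one way the counter could be restructured around Perl's built-in flock function. This is a sketch under stated assumptions, not the program from Chapter 6: the subroutine name increment_counter is made up for this example, and the approach assumes a system where flock is supported. The lock is advisory, so it only protects against other processes that also call flock on the same file.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);

# Hypothetical helper: increment the count stored in $count_file while
# holding an exclusive lock, and return the new value. Returns undef if
# the file cannot be opened, locked, or rewritten.
sub increment_counter {
    my ($count_file) = @_;

    # "+<" opens for reading and writing without truncating the file.
    open(my $fh, '+<', $count_file) or return;

    # Block until no other cooperating process holds the lock.
    flock($fh, LOCK_EX) or return;

    my $line  = <$fh>;
    my $count = (defined $line && $line =~ /^(\d+)/) ? $1 : 0;
    $count++;

    # Rewind and empty the file before writing the new value.
    seek($fh, 0, SEEK_SET) or return;
    truncate($fh, 0)       or return;
    print {$fh} $count;

    # Closing the handle flushes the data and releases the lock.
    close($fh) or return;
    return $count;
}

# Same data file path as the unlocked version shown earlier.
my $count_file = "/usr/local/bin/httpd_1.4.2/count.txt";

print "Content-type: text/plain", "\n\n";
my $count = increment_counter($count_file);
print defined $count ? $count : "[ Sorry! Can't update the counter data file ]";
```

Because the file is opened once in read/write mode and the lock is held from the read through the write, two simultaneous visitors cannot both read the same old value. The data file must still exist and be writable by the server process, as described above.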

Random Links

You can use the following CGI program to create a "random" hypertext link. In other words, the link points to a different WWW site every time you reload.

Why would you want to do this? Well, for kicks. It is also useful when the sites are mirrors of each other, so that it doesn't matter which one you refer people to: by changing the link each time, you help spread out the traffic generated from your site.

Place the following line in your HTML document:

<!--#exec cgi="/cgi-bin/random.pl"-->

Here's the program:

#!/usr/local/bin/perl
@URL = ("http://www.ora.com",
        "http://www.digital.com",
        "http://www.ibm.com",
        "http://www.radius.com");
srand (time | $$);

The @URL array (or table) contains a list of the sites that the program will choose from. The srand function seeds the random number generator with a value based on the current time and the process ID ($$). Because the seed differs from one invocation to the next, each run produces a different pseudo-random sequence.

$number_of_URL = scalar (@URL);
$random = int (rand ($number_of_URL));

The $number_of_URL variable contains the number of URLs in the array (four in this case), not the index of the last one. In Perl, arrays are zero-based, meaning that the first element has an index of zero and the last element has an index one less than the count. The rand function returns a random fractional number from 0 up to, but not including, its argument, and int discards the fractional part. In this case, the variable $random will contain a random integer from 0 to 3.

$random_URL = $URL[$random];
print "Content-type: text/html", "\n\n";
print qq|<A HREF="$random_URL">Click here for a random Web site!</A>|, "\n";
exit (0);

A random URL is retrieved from the array and displayed as a hypertext link. Users can simply click on the link to travel to a random location.
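
The index arithmetic used above can be checked with a short standalone script. This is a minimal sketch rather than part of the original program: the %seen hash and the $last_index variable are introduced here only for illustration. It shows that scalar (@URL) yields the element count, that $#URL yields the last index, and that int (rand ($n)) stays within the valid index range.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

my @URL = ("http://www.ora.com",
           "http://www.digital.com",
           "http://www.ibm.com",
           "http://www.radius.com");

my $number_of_URL = scalar(@URL);   # element count: 4
my $last_index    = $#URL;          # index of the last element: 3

# rand($n) returns a fractional value x with 0 <= x < $n, so int() of it
# is always a valid index from 0 to $n - 1.
my %seen;                           # hypothetical tally of drawn indices
for (1 .. 1000) {
    my $random = int(rand($number_of_URL));
    $seen{$random}++;
}

print "count = $number_of_URL, last index = $last_index\n";
print "indices drawn: ", join(", ", sort { $a <=> $b } keys %seen), "\n";
```

The first line of output is always "count = 4, last index = 3". The second line lists whichever indices came up during the 1000 draws; none of them can fall outside 0 to 3.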

Before we finish, let's look at one final example: a CGI program that calculates the number of days until a certain event.

Counting Days Until . . .

Remember we talked about query strings as a way of passing information to a CGI program in Chapter 2? Unfortunately, you cannot pass query information as part of an SSI exec cgi directive. For example, you cannot do the following:

<!--#exec cgi="/cgi-bin/count_days.pl?4/1/96"-->

The server will return an error.[1]

[1] However, a CGI program called by the exec SSI directive from a static HTML document has access to the query string passed to this document. For example, if you access an HTML document in the following manner:

http://some.machine/test.html?name=john
and this document contains an SSI directive, then the CGI program can access the query string ("name=john") by reading the QUERY_STRING environment variable.
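
To illustrate the footnote, the following sketch shows how a CGI program invoked through exec cgi might read the calling document's query string. It is only an example under assumptions: the parse_query subroutine, the "name" field, and the greeting text are hypothetical, and the parser handles only simple key=value pairs.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a query string such as "name=john" into a
# hash of fields. It decodes "+" and %XX escapes; a repeated key keeps
# only its last value.
sub parse_query {
    my ($query) = @_;
    my %fields;
    foreach my $pair (split /&/, defined $query ? $query : '') {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $value = '' unless defined $value;
        for ($key, $value) {
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        }
        $fields{$key} = $value;
    }
    return %fields;
}

# The server passes the calling document's query string in QUERY_STRING.
my %fields = parse_query($ENV{QUERY_STRING});
my $name   = exists $fields{name} ? $fields{name} : 'stranger';

# Escape HTML metacharacters before echoing user-supplied text.
$name =~ s/&/&amp;/g;
$name =~ s/</&lt;/g;
$name =~ s/>/&gt;/g;

print "Content-type: text/html", "\n\n";
print "Hello, $name!";
```

With the example URL above, the embedded output would be "Hello, john!". When the document is requested without a query string, the sketch falls back to "Hello, stranger!".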

However, we can create a regular Perl program (not a CGI program) that takes a date as an argument, and calculates the number of days until/since that date:

<!--#exec cmd="/usr/local/bin/httpd_1.4.2/count_days.pl  4/1/96"-->

In the Perl script, we can access this command-line data (i.e., "4/1/96") through the @ARGV array. Now, the script:

#!/usr/local/bin/perl
require "timelocal.pl";
require "bigint.pl";

The require command makes the functions within these two standard Perl libraries available to our program.

($chosen_date = $ARGV[0]) =~ s/\s*//g;

The variable $chosen_date contains the date passed to this program, minus any whitespace that may have been inserted accidentally.

if ($chosen_date =~ m|^(\d+)/(\d+)/(\d+)$|) {
    ($month, $day, $year) = ($1, $2, $3);

This is another example of a regular expression, or regexp. We use the regexp to make sure that the date passed to the program has the expected shape: three groups of digits separated by slashes, as in mm/dd/yyyy or 4/1/96. The pattern does not check that the numbers form a real calendar date. If the string matches, $month, $day, and $year will contain the separated month, day, and year from the initial date.

    $month -= 1;
    if ($year > 1900) {
        $year -= 1900; 
    }
    $chosen_secs = &timelocal (undef, undef, undef, $day, $month, $year);

We use the timelocal subroutine (notice the & in front) to convert the specified date to the number of seconds since 1970. This subroutine expects month numbers in the range 0-11 and years in the range 00-99, which is why the code subtracts 1 from the month and 1900 from a four-digit year. This conversion makes it easy for us to subtract dates. Remember that this program will not calculate dates correctly if you pass in a date before 1970.

    $seconds_in_day = 60 * 60 * 24;   
    $difference = &bsub ($chosen_secs, time);
    $no_days = &bdiv ($difference, $seconds_in_day);
    $no_days =~ s/^(\+|-)//;

The bsub subroutine subtracts the current time (in seconds since 1970) from the specified time. We use this subroutine because we are dealing with very large numbers, and a regular subtraction will give incorrect results. Then, we call the bdiv subroutine to calculate the number of days until/since the specified date by dividing the previously calculated difference by the number of seconds in a day. The bdiv subroutine prefixes its result with either a "+" or a "-" to indicate a positive or negative value, respectively, so we remove the extra character.

    print $no_days;
    exit(0);

Once we're done with the calculations, we output the calculated value and exit.

} else {
    print " [Error in date format] ";
    exit(1);
}

If the date is not in a valid format, an error message is returned.
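
For readers whose Perl installation does not include the older timelocal.pl and bigint.pl libraries, the same calculation can be sketched with the Time::Local module and ordinary arithmetic. This is an alternative written for illustration, not the author's script: the days_between subroutine is a made-up name, and the sketch assumes that a two-digit year means 19xx, as in the 4/1/96 example.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Hypothetical helper: return the whole number of days between $now
# (seconds since 1970) and a date given as month/day/year, or undef if
# the string is malformed or the values are out of range.
sub days_between {
    my ($date, $now) = @_;
    return unless defined $date;

    (my $chosen_date = $date) =~ s/\s+//g;
    return unless $chosen_date =~ m|^(\d+)/(\d+)/(\d+)$|;
    my ($month, $day, $year) = ($1, $2, $3);

    # Assumption: a two-digit year means 19xx.
    $year += 1900 if $year < 100;

    # timelocal dies on out-of-range values (month 13, for example),
    # so trap the error and report it as a bad date.
    my $chosen_secs = eval { timelocal(0, 0, 0, $day, $month - 1, $year) };
    return unless defined $chosen_secs;

    # Drop the sign and any fractional day, like the original script.
    return int(abs($chosen_secs - $now) / (60 * 60 * 24));
}

my $no_days = days_between($ARGV[0], time);
print defined $no_days ? $no_days : " [Error in date format] ";
```

The script can be called from the same exec cmd directive shown earlier. Because the calculation uses local time, a span that crosses a daylight saving change can come out an hour short of a whole number of days, which int rounds down.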

