
At the Forge

CGI Programming

So you're gathering information from your surfers; what now?

by Reuven M. Lerner

This time, we are going to look at one of the most common things that people want their CGI programs to do, namely save data to files on disk. By the end of the column, we will have accumulated enough tools to produce a simple, but functional guest-book program that will allow visitors to your site to save comments that can be read by others.

For starters, let's look at a simple HTML form that will allow users to send data to our CGI program, which we will call ``entryform.pl'':

<HTML>
<Head>
<Title>Data entry form</Title>
</Head>
<Body>
<H1>Data entry form</H1>
<Form action="/cgi-bin/entryform.pl"
method=POST>
<P>Name: <input type=text name="name" 
value=""></P>
<P>E-mail address: <input type=text 
name="email" value=""></P>
<P>Address: <input type=text 
name="address" value=""></P>
<P>Country: <input type=text 
name="country" value=""></P>
<P>Male <input type=radio name="sex" 
value="male">
Female <input type=radio name="sex" 
value="female"></P>
<input type=submit>
</Form>
</Body>
</HTML>
Of course, an HTML form won't do anything on its own; it needs a CGI program to accept and process its input. Below is a Perl5 program that, if named ``entryform.pl'' and placed in the main ``/cgi-bin'' directory on a web server, should print out the name-value pairs that arrive from the above form:

0    #!/usr/local/bin/perl5

1    # We want to use the CGI module
2    use CGI;

3    # Create a new CGI object
4    my $query = new CGI;

5    # Print an appropriate MIME header
6    print $query->header("text/html");

7    # Print a title for the page
8    print $query->start_html(-title=>"Form contents");

9    # Print all of the name-value pairs
10   print $query->dump();

11   # Finish up the HTML
12   print $query->end_html;
Here's a quick run-down of what each line of code does:

Line 0 tells a Unix box where to find the Perl interpreter. If your copy of Perl is called something else, you need to modify this line.

Without explicitly importing the CGI module in line 2, Perl wouldn't know how to create and use CGI objects. (Trying to use code from a module you haven't imported is guaranteed to confuse Perl and generate error messages.) We then declare $query to be an instance of CGI (line 4).

We then tell the user's browser that our response will be HTML-formatted text, and we do that by using a MIME header. The lack of a MIME header is the most common reason for a 500 error; whenever one of your CGI programs produces one of these, make sure that you aren't trying to print HTML before the header! Note that line 6 is equivalent to saying:

print "Content-type: text/html\n\n";
which also tells the browser to expect text data formatted in HTML. In general, though, I prefer to use the CGI object for readability reasons.

Line 8 creates the basic HTML necessary to begin the document, including giving it the title, ``Form contents''.

Line 10 uses the CGI object's built-in facility for ``dumping'' an HTML form's contents in an easy-to-read format. This allows us to see what value was assigned to each of the elements of the HTML form, which can be invaluable in debugging problematic programs. For now, though, we are just using the CGI ``dump'' method to get ourselves started and confirm that the program works.

Saving the Data to a File

Now that we have proven that our HTML form is sending data to our CGI program, and that our program can send its output back to the user's web browser, let's see what we can do with that data. For starters, let's try to save the data from the form to a file on disk. (This is one of the most common tasks that clients ask me to implement, usually because they want to collect data about their visitors.)

#!/usr/local/bin/perl5
# We want to use the CGI module
use CGI;
# Set the filename to which we want the elements
# saved
my $filename = "/tmp/formcontents";
# Set the character that will separate fields in
# the file
my $separation_character = "\t";
# Create a new CGI object
my $query = new CGI;
# ----------------------------------------------
# Open the file for appending
open (FILE, ">>$filename") || 
	die "Cannot open \"$filename\"!\n";
# Grab the elements of the HTML form
@names = $query->param;
# Iterate through each element from the form,
# writing each element to $filename. Separate
# elements with $separation_character defined
# above.
foreach $index (0 .. $#names)
{
	# Get the input from the appropriate
	# HTML form element
	$input = $query->param($names[$index]);
	# Remove any instances of
	# $separation_character

	$input =~ s/$separation_character//g;
	# Now add the input to the file
	print FILE $input;
	# Don't print the separation character
	# after the final element
	
	print FILE $separation_character if 
		($index < $#names);
}
# Print a newline after this user's entry
print FILE "\n";
# Close the file
close (FILE);
# -----------------------------------------------
# Now thank the user for submitting his
# information
# Print an appropriate MIME header
print $query->header("text/html");
# Print a title for the page
print $query->start_html(-title=>"Thank you");
# Thank the user
print "<P>Thank you for submitting the ";
print "form.</P>\n";
print "<P>Your information has been ";
print "saved to disk.</P>\n";
# Finish up the HTML
print $query->end_html;
The above program is virtually identical to the previous one, except that we have added a section that takes each of the HTML form elements and saves them to a file. Each line in the resulting file corresponds to a single press of the HTML form's ``submit'' button.

The above program separates fields with a TAB character, but we could just as easily have used commas, tildes or the letter ``a''. Remember, though, that someone is eventually going to want to use this data--either by importing it into a database or by splitting it apart with Perl or another programming language. To ensure that the user doesn't mess up our database format, we remove any instances of the separation character in the user's input with Perl's substitution (s///) operator. A bit Draconian, but effective!
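
As a rough illustration of the "splitting it apart with Perl" case, here is a minimal sketch of reading one line of the data file back into named fields. The subroutine name parse_entry and the sample line are made up for this example, and the field order is an assumption: it must match whatever order the CGI program used when it wrote the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed field order; it must match the order in which
# the CGI program wrote the values
my @fields = ("name", "email", "address", "country", "sex");

# Hypothetical helper: split one line of the data file
# into a hash keyed by field name
sub parse_entry
{
    my ($line, $separator) = @_;
    chomp $line;

    # The -1 limit keeps trailing empty fields, so a blank
    # final field does not silently disappear
    my @values = split /\Q$separator\E/, $line, -1;

    my %entry;
    @entry{@fields} = @values;
    return %entry;
}

# A made-up line, as the program might have written it
my %entry = parse_entry(
    "Jane Doe\tjane\@example.com\t1 Main St.\tIsrael\tfemale\n",
    "\t");
print "$entry{name} <$entry{email}>\n";
```

Because the CGI program strips the separator from each value before writing it, every separator found while reading is a genuine field boundary, which is what makes the simple split safe here.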

One of the biggest problems with the above program is that it depends on the HTML form elements always coming in the same order. That is, if you have elements X, Y and Z on an HTML form, will they be placed in @names in the same order as they appear in the form? In alphabetical order? In random order? To be honest, there isn't any way to be sure, since the CGI specifications are silent on the matter. It's possible, then, that one user's form will be submitted in the order (X, Y, Z), while another's will be submitted as (Y, Z, X)--which could cause problems with our data file, in which fields are identified by their position.

A simple fix is to maintain a list of the fields that we expect to receive from the HTML form. This requires a bit more coordination between the program and the form, but given that the same person often works on both, that's a minor concern.

First, we define a list, @fields, near the top of the program. This list contains the names of all of the fields that we expect to receive, in the order that we expect to receive them:

my @fields = ("name", 
              "email", 
              "address",
              "country", 
              "sex");
Next, we change the ``foreach'' loop (which places the field elements in the output file) such that it iterates through the elements of @fields, rather than @names.

foreach $index (0 .. $#fields)
{
  # Get the input from the appropriate HTML form
  # element
  $input = $query->param($fields[$index]);
  # Remove any instances of $separation_character
	
  $input =~ s/$separation_character//g;
  # Now add the input to the file
  print FILE $input;
  # Don't print the separation character after the
  # final element
  print FILE $separation_character if 
	($index < $#fields);
}

Required Fields

What if we want to make sure that users have filled out certain fields? This is particularly important when we are collecting data about visitors to a site, and want to make sure that we receive their names, addresses and other vital data. A simple way to do that is to create a list, @required_fields, in which the required fields are listed:

my @required_fields = ("name",
                       "email",
                       "address");
If you simply want a generic message indicating that one or more required fields haven't been filled out, you can add the following subroutine at the bottom of the program file:

sub missing_field
{
  # Print an appropriate MIME header
  print $query->header("text/html");
  # Print a title for the page
  print $query->start_html(-title=>
  "Missing field(s)");
  # Tell the user what the error is
  print "<P>At least one required ";
  print "field is missing.</P>\n";
  # Finish up the HTML
  print $query->end_html;
   
}
We can then insert the following code into the program itself, just before we open the file--since there isn't any reason to open the file if we are simply going to close it again:

foreach $field (@required_fields)
{
  # Make sure that the field contains more than
  # just whitespace
  if ($query->param($field) !~ m/\w/)
  {
    &missing_field;
    exit;
  }
}
The above code will indeed do the trick, but gives a generic error message. Wouldn't it be better to tell the user which field contains the error? We can do that by modifying missing_field such that it takes an argument, as follows:

sub missing_field
{
  # Get our local variables
  my (@missing_fields) = @_;
  # Print an appropriate MIME header
  print $query->header("text/html");
  # Print a title for the page
  print $query->start_html
  (-title=>"Missing field(s)");
  print "<P>You are missing the following ";
  print "required fields:</P>\n";
  print "<ul>\n";        
  # Iterate through the missing fields, printing
  # them
  foreach $field (@missing_fields)
  {
    print "<li> $field\n";
  }
  print "</ul>\n";        
        
  # Finish up the HTML
  print $query->end_html;
  exit;
}
We then modify the loop that checks for required fields:

foreach $field (@required_fields)
{
  # Add the name of each missing field
  push (@missing_fields, $field) if 
   ($query->param($field) !~ m/\w/);
}

# If @missing_fields contains any elements, then
# invoke the error routine
&missing_field(@missing_fields) 
  if @missing_fields;
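
To see which values the !~ m/\w/ test treats as missing, here is a small stand-alone sketch. The subroutine name is_missing and the sample values are invented for illustration; the defined check is an added assumption, so that a field absent from the form does not trigger a warning.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if a value counts as "missing"
# under the !~ m/\w/ test, that is, if it contains no
# letter, digit or underscore
sub is_missing
{
    my ($value) = @_;

    # A field absent from the form arrives as undef;
    # treat it as empty (an assumption of this sketch)
    $value = "" unless defined $value;

    return $value !~ m/\w/;
}

foreach my $sample ("Reuven", "   ", "", "---", "42")
{
    printf "%-10s %s\n", "\"$sample\"",
        (is_missing($sample) ? "missing" : "present");
}
```

Note that the test is stricter than "more than just whitespace": a value made up entirely of punctuation, such as "---", also counts as missing.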
If we want to get really fancy, we can provide English names for each of the required fields, so that users don't have to suffer through the names we used with the HTML form. We can do that by using associative arrays:

$FULLNAME{"name"} = "your full name";
$FULLNAME{"email"} = "your e-mail address";
$FULLNAME{"address"} = "your mailing address";
Then we modify the foreach loop in &missing_field such that it prints the full name of the missing field, rather than the name associated with it on the HTML form:

# Iterate through the missing fields, printing
# them
foreach $field (@missing_fields)
{
  print "<li> $FULLNAME{$field}\n";
}
print "</ul>\n";

Dying with Style

Remember that die statement we put in our original program? Well, think about what will happen if that part of the program is ever truly invoked--die will produce an error message, which is a good thing. But that message goes to standard error (which most web servers write to their error log) rather than to the browser, and the program exits before it has printed a MIME header. The result is the dreaded ``Server error'' message, which tells the user that something has gone wrong with our script without saying what that something is.

More useful would be a routine that printed the error message to the screen. For example, we could add the following subroutine:

sub error_opening_file
{
    my ($filename) = @_;
    # Print an appropriate MIME header
    print $query->header("text/html");
    # Print a title for the page
    print $query->start_html(-title=>
    "Error opening file");
    # Print the error
    print "<P>Could not open the file ";
    print "\"$filename\".</P>\n";
    # Finish up the HTML
    print $query->end_html;
    exit;
}
And now, we can rewrite the ``open'' statement as follows:

open (FILE, ">>$filename") || 
  &error_opening_file($filename);
You probably don't want to tell your users your program couldn't open a particular file--not only do your users not care, but you don't need to tell them which files you are using. A more user-friendly version of error_opening_file could tell the user that the server is experiencing some trouble, or is undergoing maintenance or give a similar message that doesn't broadcast catastrophe to the world.
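
One possible shape for such a user-friendly version is sketched below. The subroutine name friendly_error_page, the wording of the message and the example path are all made up for illustration. To keep the sketch self-contained it builds the page by hand instead of using the CGI object, and it assumes that the web server sends a CGI program's standard error to its error log.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical replacement for error_opening_file: record
# the details for the webmaster, and return a deliberately
# vague page for the visitor
sub friendly_error_page
{
    my ($filename, $reason) = @_;

    # Assumption: the server sends STDERR to its error log
    print STDERR "Cannot open \"$filename\": $reason\n";

    # Build the complete response, header first
    my $page = "Content-type: text/html\n\n";
    $page .= "<HTML><Head><Title>Temporarily unavailable";
    $page .= "</Title></Head>\n<Body>\n";
    $page .= "<P>Sorry, we are having some trouble ";
    $page .= "right now. Please try again later.</P>\n";
    $page .= "</Body></HTML>\n";

    return $page;
}

# Usage sketch: $! holds the reason the open failed.
# A real CGI program would exit after printing the page.
my $filename = "/nonexistent-directory/formcontents";
if (open (FILE, ">>$filename"))
{
    close (FILE);
}
else
{
    print friendly_error_page($filename, $!);
}
```

The filename and the system's error text reach only the log, so the webmaster can still diagnose the problem while the visitor sees nothing about the server's file layout.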

Bringing It All Together

The final version of the program, with (a) required fields, (b) full-English descriptions of those fields, and (c) a better error message when we cannot open the file, reads as follows:

#!/usr/local/bin/perl5
# We want to use the CGI module
use CGI;
# Set the filename to which we want the elements
# saved
my $filename = "/tmp/formcontents";
# Set the character that will separate fields in
# the file
my $separation_character = "\t";
# In what order do we want to print fields?
my @fields = ("name", 
              "email", 
              "address", 
              "country", 
              "sex");
# Which fields are required?
my @required_fields = ("name", 
                       "email", 
                       "address");
# What is the full name for each required field?
$FULLNAME{"name"} = "your full name";
$FULLNAME{"email"} = "your e-mail address";
$FULLNAME{"address"} = "your mailing address";
# Create a new CGI object
my $query = new CGI;
# ---------------------------------------------
# Make sure that all required fields have arrived
foreach $field (@required_fields)
{
    # Add the name of each missing field
    push (@missing_fields, $field) 
    if ($query->param($field) !~ m/\w/);
}
# If any fields are missing, invoke the error
# routine
&missing_field(@missing_fields) 
  if @missing_fields;
# ---------------------------------------------
# Open the file for appending
open (FILE, ">>$filename") || 
  &error_opening_file($filename);
# Grab the elements of the HTML form
@names = $query->param;
# Iterate through each element from the form,
# writing each element to $filename. Separate
# elements with $separation_character defined
# above.
foreach $index (0 .. $#fields)
{
    # Get the input from the appropriate HTML
    # form element
    $input = $query->param($fields[$index]);
    # Remove any instances of
    # $separation_character
    $input =~ s/$separation_character//g;
    # Now add the input to the file
    print FILE $input;
    # Don't print the separation character after
    # the final element
    print FILE $separation_character if 
    ($index < $#fields);
}
# Print a newline after this user's entry
print FILE "\n";
# Close the file
close (FILE);
# ---------------------------------------------
# Now thank the user for submitting their
# information
# Print an appropriate MIME header
print $query->header("text/html");
# Print a title for the page
print $query->start_html(-title=>"Thank you");
# Thank the user
print "<P>Thank you for submitting ";
print "the form.</P>\n";
print "<P>Your information has been ";
print "saved to disk.</P>\n";
# Finish up the HTML
print $query->end_html;

# ---------------------------------------------
# Subroutines
sub missing_field
{
    # Get our local variables
    my (@missing_fields) = @_;
    # Print an appropriate MIME header
    print $query->header("text/html");
    # Print a title for the page
    print $query->start_html(-title=>
    "Missing field(s)");
    print "<P>You are missing the following ";
    print "required fields:</P>\n";
    print "<ul>\n";
    # Iterate through the missing fields,
    # printing them
    foreach $field (@missing_fields)
    {
	print "<li> $FULLNAME{$field}\n";
    }
    
    print "</ul>\n";
    
    # Finish up the HTML
    print $query->end_html;
    
    exit;
}

sub error_opening_file
{
    my ($filename) = @_;
    # Print an appropriate MIME header
    print $query->header("text/html");
    # Print a title for the page
    print $query->start_html(-title=>
    "Error opening file");
    # Print the error
    print "<P>Could not open the file ";
    print "\"$filename\".</P>\n";
    # Finish up the HTML
    print $query->end_html;
    exit;
}

Creating a Guest-book

One of the most common CGI applications on the Web is a ``guest-book'', which allows visitors to a site to sign in, leaving their names, e-mail addresses and short notes. We can easily construct such a program, using the basic framework seen in the above programs. The only difference between the ``guestbook'' program and the programs we have seen so far is that the guest-book must be formatted in HTML in order for users to be able to read it in their browsers.

Here is a very simple guest-book entry form, which is virtually the same as the form we saw earlier:

<HTML>
<Head>
<Title>Guestbook entry</Title>
</Head>
<Body>
<H1>Guestbook entry</H1>
<Form action="/cgi-bin/guestbook.pl" 
    method=POST>
<P>Name: <input type=text name="name" 
    value=""></P>
<P>E-mail address: <input type=text name="email"
value=""></P>
<input type=submit>
</Form>
</Body>
</HTML>
The following program is the same as the one above, except that it saves data to the file ``guestbook.html'' and formats the data in HTML.

#!/usr/local/bin/perl5
# We want to use the CGI module
use CGI;
# Set the filename to which we want the elements
# saved
my $filename = 
"/home/reuven/Consulting/guestbook.html";
# Set the character that will separate fields in
# the file
my $separation_character = "</P><P>";
# In what order do we want to print fields?
my @fields = ("name", "email");
# Which fields are required?
my @required_fields = ("name", "email");
# What is the full name for each required
# field?
$FULLNAME{"name"} = "your full name";
$FULLNAME{"email"} = "your e-mail address";
# Create a new CGI object
my $query = new CGI;
# ---------------------------------------------
# Make sure that all required fields have arrived
foreach $field (@required_fields)
{
  # Add the name of each missing field
  push (@missing_fields, $field) if 
  ($query->param($field) !~ m/\w/);
}

# If any fields are missing, invoke the error
# routine

&missing_field(@missing_fields) if 
  @missing_fields;
# ----------------------------------------------
# Open the file for appending

open (FILE, ">>$filename") || 
  &error_opening_file($filename);

# Grab the elements of the HTML form
@names = $query->param;

# Iterate through each element from the form,
# writing each element to $filename. Separate
# elements with $separation_character defined
# above.

foreach $index (0 .. $#fields)
{
  # Get the input from the appropriate HTML form
  # element

  $input = $query->param($fields[$index]);

  # Remove any instances of $separation_character

  $input =~ s/$separation_character//g;

  # Now add the input to the file
  print FILE $input;

  # Don't print the separation character after the
  # final element

  print FILE $separation_character if 
  ($index < $#fields);
}

# Print an HTML separator after this user's entry
print FILE "<BR><HR><P>\n\n";

# Close the file
close (FILE);

# -------------------------------------------

# Now thank the user for submitting his
# information

# Print an appropriate MIME header
print $query->header("text/html");

# Print a title for the page
print $query->start_html(-title=>"Thank you");

# Thank the user

print "<P>Thank you for submitting ";
print "the form.</P>\n";

print "<P>Your information has been ";
print "saved to disk.</P>\n";

# Finish up the HTML
print $query->end_html;


# --------------------------------------------
# Subroutines

sub missing_field
{
  # Get our local variables
  my (@missing_fields) = @_;

  # Print an appropriate MIME header
  print $query->header("text/html");

  # Print a title for the page
  print $query->start_html(-title=>
  "Missing field(s)");

  print "<P>You are missing the ";
  print "following required fields:</P>\n";
  print "<ul>\n";

  # Iterate through the missing fields, printing
  # them

  foreach $field (@missing_fields)
  {
    print "<li> $FULLNAME{$field}\n";
  }

  print "</ul>\n";

  # Finish up the HTML
  print $query->end_html;

  exit;
}


sub error_opening_file
{
  my ($filename) = @_;

  # Print an appropriate MIME header
  print $query->header("text/html");

  # Print a title for the page
  print $query->start_html(-title=>
  "Error opening file");

  # Print the error
  print "<P>Could not open the ";
  print "file \"$filename\".</P>\n";

  # Finish up the HTML
  print $query->end_html;

  exit;
}
The above program will take input from the HTML form and save the data in an HTML-formatted file. If that file is accessible from the web server, your users should be able to view others' entries in the guest-book.
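
Because the guest-book file is itself HTML, any < or & that a visitor types will be interpreted as markup by the next reader's browser. A minimal sketch of one way to guard against that follows. The subroutine escape_html is a hypothetical helper, not part of the program above; it would be applied to $input before the value is written to the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace the characters that have
# special meaning in HTML with their entities
sub escape_html
{
    my ($text) = @_;

    # The ampersand must be handled first, so that the
    # entities added below are not escaped a second time
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;

    return $text;
}

print escape_html("<B>Reuven</B> & friends"), "\n";
```

With the input escaped this way, the visitor's text shows up in the guest-book exactly as typed, and the tags that the program itself writes between fields remain the only markup in the file.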

Reuven M. Lerner has been playing with the Web since early 1993, when it seemed like more of a fun toy than the world's Next Great Medium. He currently works from his apartment in Haifa, Israel as an independent Internet and Web consultant. When not working on the Web or informally volunteering with school-age children, he enjoys reading (just about any subject, but especially computers, politics, and philosophy--separately and together), cooking, solving crossword puzzles and hiking. You can reach him at reuven@the-tech.mit.edu or reuven@netvision.net.il.