You might wonder, "Now that I know how CGI works, what programming language can I use?" The answer is simple: you can use whatever language you want, although some languages are better suited to CGI programming than others. Before choosing a language, consider three features: how easily it manipulates text, how well it interfaces with other software libraries and utilities, and how easily it can access environment variables.
Let's look at each of these features in more detail. Most CGI applications manipulate text in one way or another (as you will see throughout this book), so built-in pattern matching is very important. For example, form information is usually "decoded" by splitting the string on certain delimiters.
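As a minimal sketch of that kind of decoding, the following Perl fragment splits a form string on its delimiters. It assumes the usual URL-encoded layout (fields joined by &, names and values joined by =, + standing for a space, and %XX standing for an encoded character). The helper name parse_form and the sample data are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode a URL-encoded form string into a hash of field => value.
# parse_form is a hypothetical helper name, not part of any library.
sub parse_form {
    my ($query) = @_;
    my %form;

    # Fields are separated by "&"; each field has the form "name=value".
    foreach my $pair (split /&/, $query) {
        next unless length $pair;               # skip empty fields
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;      # a field may have no value

        foreach ($name, $value) {
            tr/+/ /;                                # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # "%XX" encodes one character
        }
        $form{$name} = $value;
    }
    return %form;
}

# Made-up sample input, as a form might submit it.
my %form = parse_form('name=Jane+Doe&email=jane%40example.com');
print "$_ = $form{$_}\n" foreach sort keys %form;
```

With the sample string shown, the script prints the email field followed by the name field, each with its decoded value. If a field name repeats, this sketch keeps only the last value.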
The ability of a language to interface with other software, such as databases, is also very important. This greatly enhances the power of the Web by allowing you to write gateways to other information sources, such as database engines or graphic manipulation libraries.
Finally, consider the ease with which the language can access environment variables. These variables constitute the input to the CGI program, and thus are very important.
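In Perl, for instance, environment variables are available through the built-in %ENV hash. The sketch below reads a few of them with a fallback for variables the server did not set. REMOTE_HOST appears later in this chapter; REQUEST_METHOD and QUERY_STRING are other standard CGI variables, and the helper name env_or is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the named environment variable, or a fallback value when the
# server did not set it (or set it to an empty string).
# env_or is a hypothetical helper name used only for this sketch.
sub env_or {
    my ($name, $default) = @_;
    my $value = $ENV{$name};
    return (defined $value && length $value) ? $value : $default;
}

my $method = env_or('REQUEST_METHOD', 'GET');
my $query  = env_or('QUERY_STRING',   '');
my $host   = env_or('REMOTE_HOST',    'unknown');

print "Request method: $method\n";
print "Query string:   $query\n";
print "Remote host:    $host\n";
```

Run from a command line rather than from a server, the script simply reports the fallback values, which makes it easy to try without a Web server.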
Some of the more popular languages for CGI programming include AppleScript, C/C++, C Shell, Perl, Tcl, and Visual Basic. Here is a quick review of the advantages and, in some cases, disadvantages of each one.
Since the advent of System 7.5, AppleScript has been an integral part of the Macintosh operating system (OS). Though AppleScript lacks built-in pattern-matching operators, certain extensions have been written to make it easy to handle various types of data. AppleScript can also interface with other Macintosh applications through AppleEvents. For example, a Mac CGI programmer can write a program that presents a form to the user, decodes the contents of the form, and queries a Microsoft FoxPro database directly through AppleScript.
C and C++ are very popular with programmers, and some use them for CGI programming. These languages are not recommended for the novice programmer; C and C++ impose strict rules for variable and memory declarations and for type checking. In addition, the languages themselves lack built-in database extensions and pattern-matching abilities, although libraries and functions can be written to provide them.
However, C and C++ have a major advantage in that you can compile your CGI application to create a binary executable, which takes up fewer system resources than using interpreters (like Perl or Tcl) to run CGI scripts.
C Shell lacks pattern-matching operators, so other UNIX utilities, such as sed or awk, must be used whenever you want to manipulate string information. However, a software tool called uncgi, written in C, decodes form data and stores the information in shell environment variables, which can be accessed rather easily. Communicating with a database directly is impossible; it can be done only through an external application. Finally, the C Shell has some serious bugs and limitations that make using it a dangerous proposition for the beginner.
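Perl is by far the most widely used language for CGI programming! It contains many powerful features, and is very easy for the novice programmer to learn. The advantages of Perl include simple constructs, powerful regular expressions, easy invocation of shell commands, and extensions for databases and graphics libraries.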
Because of these overwhelming advantages, Perl is the language used for most of the examples throughout this book.
To whet your appetite slightly, here is an example of a CGI Perl program that creates the simple virtual document presented in the Virtual Documents section that appeared earlier in this chapter:
#!/usr/local/bin/perl
print "Content-type: text/plain","\n\n";
print "Welcome to Shishir's WWW Server!", "\n";
$remote_host = $ENV{'REMOTE_HOST'};
print "You are visiting from ", $remote_host, ". ";
$uptime = `/usr/ucb/uptime` ;
($load_average) = ($uptime =~ /average: ([^,]*)/);
print "The load average on this machine is: ", $load_average, ".", "\n";
print "Happy navigating!", "\n";
exit (0);
The first line of the program is very important. When the server invokes the program, this line tells the system to run the Perl interpreter located in /usr/local/bin to execute it.
Simple print statements are used to display information to the standard output. This CGI program outputs a partial HTTP header (the one Content-type header). Since this script generates plain text and not HTML, the content type is text/plain.
Two newlines (\n) are output after the header. This is because HTTP requires a blank line between the header and body. Depending on the platform, you may need to output two carriage-return and newline combinations (\r\n\r\n).
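The header-then-blank-line layout can be sketched as a small helper that takes the line ending as a parameter, so the same code covers both the \n\n and the \r\n\r\n forms. The name cgi_response is hypothetical and exists only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a minimal CGI response: one Content-type header line, a blank
# line, and then the body. The line ending defaults to "\n"; pass "\r\n"
# on platforms that need carriage-return and newline pairs.
# cgi_response is a hypothetical helper name used only for this sketch.
sub cgi_response {
    my ($type, $body, $eol) = @_;
    $eol = "\n" unless defined $eol;

    # The first $eol ends the header line; the second is the blank line.
    return "Content-type: $type" . $eol . $eol . $body;
}

print cgi_response('text/plain', "Welcome to Shishir's WWW Server!\n");
```

Omitting the blank line is a common mistake: the server then has no way to tell where the header ends and the document begins.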
The first print statement after the header is a greeting. The second print statement after the header displays the remote host of the user accessing the server. This information is retrieved from the environment variable REMOTE_HOST.
As you peruse the next bit of code, you will see what looks like a mess! However, it is a combination of very powerful search operators called a regular expression (commonly known as a regexp): the pattern /average: ([^,]*)/. The preceding statement runs the UNIX command uptime and stores its output; the expression then searches that output for the value located between the string "average:" and the next comma.
Finally, the last print statement displays a farewell message, and exit (0) ends the program with a success status.
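To try the pattern without running uptime, here is a small sketch that applies the same regular expression to a sample string. The sample text only imitates the general shape of uptime output, and its figures are made up; the helper name load_average is likewise hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Extract the text between "average: " and the next comma.
# Returns undef when the pattern does not match.
# load_average is a hypothetical helper name used only for this sketch.
sub load_average {
    my ($text) = @_;

    # In list context, a match returns its captured groups, so the
    # parentheses around $load pick up the contents of ([^,]*).
    my ($load) = ($text =~ /average: ([^,]*)/);
    return $load;
}

# Made-up sample in the general shape of uptime output.
my $sample = ' 3:42pm  up 12 days,  4:05,  3 users,  load average: 0.15, 0.20, 0.18';

my $load = load_average($sample);
print "The load average on this machine is: ",
      (defined $load ? $load : 'unavailable'), ".\n";
```

For the sample string, the captured value is 0.15. The character class [^,] matches any character except a comma, so the capture stops just before the first comma that follows "average: ".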
Tcl is gaining popularity as a CGI programming language. It comes with a shell, tclsh, which you can use to execute your scripts. Like Perl, Tcl has simple constructs, but it is a bit more difficult for the novice programmer to learn and use. It also has extensions for databases and graphics libraries, and it supports regular expressions, although it is quite inefficient at compiling these expressions, especially when compared to Perl.
Visual Basic is to Windows what AppleScript is to the Macintosh OS as far as CGI programming is concerned. With Visual Basic, you can communicate with other Windows applications such as databases and spreadsheets. This makes Visual Basic a very powerful tool for developing CGI applications on a PC, and it is very easy to learn. However, Visual Basic lacks powerful string manipulation operators.