CGI Programming on the World Wide Web

Chapter 10: Gateways to Internet Information Servers
 

10.10 Forking/Spawning Child Processes

Before we end this chapter, let's look at a very powerful feature found on the UNIX operating system: concurrent processes.

The cookie server we discussed can accept only one connection at a time, although it will queue up to five connections, which it will handle sequentially, one after the other. Because of the way the server operates--storing information in variables--it cannot be designed to handle multiple connections simultaneously. Let's look at the reason for this.

In UNIX, a process (the parent) can create another process (the child) that executes some given code independently. This is useful for programs that need a long time to finish. For example, if you have a CGI program that needs to calculate a complex equation, search large databases, or delete and clean up a lot of files, you can "spawn" a child process that performs the task, while the parent returns control to the browser. The user then does not have to wait for the task to finish, because the child process runs in the background. Let's look at a simple CGI program:

#!/usr/local/bin/perl
$| = 1;
print "Content-type: text/plain", "\n\n";
print "We are about to create the child!", "\n";
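$pid = fork;
unless (defined $pid) {
        # fork returns an undefined value if no child could be created
        print "Could not create the child process: $!", "\n";
        exit(1);
}
if ($pid) {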
        print <<End_of_Parent;
I am the parent speaking. I have successfully created a child process.
The Process Identification Number (PID) of the child process is: $pid.
The child will be cleaning up all the files in the directory. It might
take a while, but you do not have to wait!
End_of_Parent
 
} else {
        close (STDOUT);
        
        system ("/usr/bin/rm", "-fr", "/tmp/CGI_test", "/var/tmp/CGI");
        exit(0);
}
print "I am the parent again! Now it is time to exit.", "\n";
print "My child process will work on its own! Good Bye!", "\n";
exit(0);

The fork function creates a child process and returns the PID of the child to the parent, and a value of zero to the child; if the process cannot be created, it returns an undefined value. In this example, the first block of code is executed by the parent, while the second block is executed by the child. Note that the child process gets its own copy of all the variables and subroutines that are available to the parent. Any modifications the child makes affect only that copy: they are discarded when the child exits, and they never reach the parent process.
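The copy behavior described above can be seen in a small experiment. This is only a sketch: the subroutine name value_after_child_changes_it is made up for illustration and is not part of the cookie server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # Unbuffered output, so nothing is printed twice after a fork

# Hypothetical helper: fork a child that changes its copy of $value,
# then return the value the parent still holds once the child has exited.
sub value_after_child_changes_it {
    my ($value) = @_;

    my $pid = fork;
    die "Could not fork: $!\n" unless defined $pid;

    if ($pid == 0) {
        # Child: this assignment touches only the child's copy.
        $value = "changed by child";
        exit(0);
    }

    # Parent: wait for the child, then look at our own copy.
    waitpid($pid, 0);
    return $value;
}

print value_after_child_changes_it("set by parent"), "\n";
# prints "set by parent"
```

The parent prints its original value, because the child's assignment happened in a separate process.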

This copy behavior is the main reason why the cookie server cannot handle multiple connections. There are two issues here. The first is that the server is a single process: once the CGI program connects to the server, the server handles requests from that program, and so cannot accept any more connections until the program breaks the connection. The only way to allow multiple simultaneous connections is to fork a process every time there is a connection, so that a new process handles each one.

This leads us to the second issue. If a separate child process handles each connection, then each process has its own variable namespace (along with a copy of the parent's data). If a child process modifies or stores new data in variables, that data is gone once the process terminates, and it never reaches the parent through the variables themselves. Sharing it would require an explicit channel, such as a file or a pipe, which our cookie server, storing its information in variables, does not have. That's why we have only one server that keeps track of the data, one connection at a time.
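The limitation described above applies to ordinary variables. For comparison, here is a minimal sketch of a child sending one line of data back to its parent over a pipe. The subroutine name read_line_from_child and the message text are assumptions made for illustration; the cookie server does not work this way, and a real server would need considerably more bookkeeping.

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;

# Hypothetical helper: fork a child and read back one line that the
# child writes into a pipe.
sub read_line_from_child {
    my ($message) = @_;

    pipe(my $reader, my $writer) or die "Could not create pipe: $!\n";

    my $pid = fork;
    die "Could not fork: $!\n" unless defined $pid;

    if ($pid == 0) {
        # Child: send the data through the pipe, then exit.
        close($reader);
        print $writer "child says: $message\n";
        close($writer);
        exit(0);
    }

    # Parent: close the unused write end, then read the child's line.
    close($writer);
    my $line = <$reader>;
    close($reader);
    waitpid($pid, 0);

    chomp($line) if defined $line;
    return $line;
}

print read_line_from_child("cookie stored"), "\n";
# prints "child says: cookie stored"
```

Note that the parent has to stop and read from the pipe, so this does not by itself give us a server that shares data across many simultaneous connections.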

The system command that we have been using to execute UNIX commands works roughly like this:

unless (fork) {
    exec ("command");
    exit (1);    # Reached only if exec fails
}
wait;

This is essentially equivalent to:

system ("command");
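A slightly fuller sketch of the same fork, exec, and wait sequence follows. The subroutine name run_and_wait is hypothetical, and the sketch leaves out details that the real system function takes care of, such as signal handling in the parent.

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;

# Hypothetical helper: run a command in a child process, wait for it,
# and return its exit status.
sub run_and_wait {
    my @command = @_;

    my $pid = fork;
    die "Could not fork: $!\n" unless defined $pid;

    unless ($pid) {
        # Child: replace this process with the command, bypassing the shell.
        exec { $command[0] } @command;
        # exec returns only if it failed to start the command.
        warn "Could not exec $command[0]: $!\n";
        exit(127);
    }

    # Parent: block until this particular child has finished.
    waitpid($pid, 0);
    return $? >> 8;
}

# $^X is the path of the running Perl interpreter.
my $status = run_and_wait($^X, "-e", "exit 3");
print "The child exited with status $status.\n";
# prints "The child exited with status 3."
```

Using waitpid with the child's PID, rather than a plain wait, makes sure the parent collects the status of the child it just created and not of some other child.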

Basically, the child process--the unless block executes only if the return value from fork is zero--executes the specified command, while the parent waits for it to finish. Here is how we could implement a server that handles multiple connections simultaneously (although this approach will not work for our cookie server):

$SIG{'CHLD'} = "wait_for_child_to_die";
while (1) {
    ( ($ip_name, $ip_address) = &accept_connection (COOKIE, SOCKET) )
        || die "Could not accept connection.", "\n";
    
    if (fork) {
        #
        # Parent Process (do almost nothing here)
        #
    } else {
        #
        # Child Process (do almost everything here)
        #
        exit(0);    # The child must exit, or it would loop back and accept connections too
    }
    &close_connection (COOKIE);    
}
sub wait_for_child_to_die
{
    wait;
}

One important note: if a parent does not wait for a child process to die, the finished child remains in the system's process table as a "zombie" until the parent either waits for it or exits itself.
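When a parent has many children, a handler that calls wait once per signal can miss some of them, because several exits may be reported by a single CHLD signal. A common remedy is to loop over a non-blocking waitpid. The following is a sketch under that assumption; the names reap_children and spawn_and_reap are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # Provides WNOHANG

$| = 1;

my %reaped;    # PID => exit status of each child collected so far

# Collect every child that has already exited, without blocking.
# The loop matters: several exits can be reported by a single signal.
sub reap_children {
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $reaped{$pid} = $? >> 8;
    }
}
$SIG{CHLD} = \&reap_children;

# Hypothetical helper: start $count children that exit at once with
# statuses 1 .. $count, and return those statuses once all are reaped.
sub spawn_and_reap {
    my ($count) = @_;
    %reaped = ();

    my @pids;
    for my $n (1 .. $count) {
        my $pid = fork;
        die "Could not fork: $!\n" unless defined $pid;
        exit($n) if $pid == 0;    # Child: exit immediately
        push @pids, $pid;
    }

    # Parent: nap briefly until the handler has collected every child.
    select(undef, undef, undef, 0.1) while keys(%reaped) < $count;
    return map { $reaped{$_} } @pids;
}

my @statuses = spawn_and_reap(3);
print "Children exited with: @statuses\n";
# prints "Children exited with: 1 2 3"
```

Because waitpid is called with WNOHANG, the handler returns as soon as no more finished children remain, so the parent is never blocked inside it.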

