CGI Programming on the World Wide Web

Chapter 10
Gateways to Internet Information Servers
 

10.3 Socket I/O in Perl

The functions used to set up sockets in Perl have the same names as the corresponding UNIX system functions, but their arguments are slightly different, because Perl works with file handles and packed strings rather than file descriptors and C structures. Let's look at an example that implements a client to the finger server.

Please note that this is not a CGI script, although it would be easy to convert it into one. It is meant to be run from the command line and to be passed one argument, the name of the user you want information about:

% finger_client username[@host]

As you can see, the calling format is identical to that of the UNIX finger command. In fact, this program works in exactly the same way.

#!/usr/local/bin/perl
require "sys/socket.ph";

The Perl header file "sys/socket.ph" contains definitions pertaining to different types of sockets, their addressing schemes, etc. We will look at some of these definitions in a moment.

If this file is not found, you (or the system administrator) need to run the h2ph Perl script, which converts C header files into a format that Perl can understand. Now, let's continue.

chop ($hostname = `/bin/hostname`);
$input = shift (@ARGV);

The current hostname is retrieved with the UNIX hostname command, and chop removes the trailing newline. The input to the script is stored in the $input variable. The shift function removes the first element of an array and returns it.

($username, $remote_host) = split (/@/, $input, 2);

The specified username and remote host are split from the input variable.
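
The effect of the limit argument is easiest to see on sample input. The following is a minimal sketch; the parse_target subroutine is a hypothetical helper introduced only for this illustration and is not part of the finger client.

```perl
use strict;
use warnings;

# Hypothetical helper: split "user@host" into its two parts. The limit
# of 2 keeps any later "@" characters inside the host part.
sub parse_target {
    my ($input) = @_;
    my ($username, $remote_host) = split(/@/, $input, 2);
    return ($username, $remote_host);
}

for my $input ('shishir@acs.bu.edu', 'shishir') {
    my ($username, $remote_host) = parse_target($input);
    printf "%-20s user=%s host=%s\n", $input, $username,
        defined $remote_host ? $remote_host : "(none given)";
}
```

When no "@" is present, the second variable is left undefined, which is what the test in the next step of the script relies on.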

unless ($remote_host) {
    $remote_host = $hostname;
}

If no host is specified, it defaults to the local host.

$service = "finger";

Once you create a socket, it is usually bound (or attached) to a port on the machine. To send a message (or request) to a server, you have to send it to the port the server is listening on. Most common servers (like FTP, Archie, Gopher, HTTP, and Finger) run on well-known ports that are the same on nearly all hosts across the Net. Otherwise, clients on different machines would not be able to reach the servers, because they would not know which port a server is bound to. The /etc/services file lists the ports and the services attached to them.

In this case, we are specifying the service's name, and not the port number. In case you are curious, the finger server runs on port 79. Later on, the getservbyname function converts the service name "finger" to the correct port number.

$socket_template = "S n a4 x8";

This template describes a 16-byte structure that is used to address sockets for interprocess communication on the Internet. The first two bytes (S) hold the numeric code for the Internet address family, in the byte order the local machine uses for short integers. The next two bytes (n) hold the port number you want to connect to, in Internet standard byte order (i.e., big endian: the high byte of the integer is stored in the leftmost byte, and the low byte in the rightmost byte). Bytes five through eight (a4) hold the IP address, and the last eight (x8) contain "\0" characters. We will see this in action soon.
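
To make the layout concrete, here is a small sketch that packs sample values with this template and then unpacks them again. The address family code 2 and the address 127.0.0.1 are assumptions chosen for illustration; the real script takes these values from the header file and from gethostbyname.

```perl
use strict;
use warnings;

my $socket_template = "S n a4 x8";

# Sample values (assumptions for this sketch): family code 2, port 79,
# and the address 127.0.0.1 as four raw bytes.
my $address = pack("C4", 127, 0, 0, 1);
my $packed  = pack($socket_template, 2, 79, $address);

print "length: ", length($packed), " bytes\n";
print "hex:    ", unpack("H*", $packed), "\n";

# The same template recovers the three fields; x8 skips the padding.
my ($family, $port, $raw_address) = unpack($socket_template, $packed);
print "family=$family port=$port address=",
      join(".", unpack("C4", $raw_address)), "\n";
```

Whatever the machine's native byte order, bytes three and four of the result are always 0x00 0x4f, because the n format stores port 79 in big-endian order.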

$tcp = (getprotobyname("tcp"))[2];

Since the finger server communicates over TCP (don't worry about what this means!), we need the numeric code that identifies this protocol. The getprotobyname function returns the name, aliases, and number of the specified protocol. In our case, we store just the third element, as we do not need the others. As a side note, TCP's assigned protocol number is 6, and the corresponding C constant is IPPROTO_TCP (defined in netinet/in.h). AF_NS, from the sys/socket.ph header file, is an address family rather than a protocol number, so it should not be used in place of getprotobyname even on systems where its value happens to be 6.

if ($service !~ /^\d+$/) {
    $service = (getservbyname ($service, "tcp"))[2];
}

If the service specified in the variable is not a numeric value, the getservbyname function uses the /etc/services file to retrieve the port number.
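
The two lookups can be tried on their own. This is a sketch under a few assumptions: the port_for subroutine is a hypothetical wrapper around the test shown above, and the fallback values 6 and 79 (the assigned numbers for TCP and finger) are used only if the local system has no entry for them.

```perl
use strict;
use warnings;

# Protocol number for TCP; fall back to the assigned number 6 if the
# system has no protocols database.
my $tcp = (getprotobyname("tcp"))[2];
$tcp = 6 unless defined $tcp;

# Hypothetical helper: accept either a port number or a service name.
sub port_for {
    my ($service) = @_;
    return $service if $service =~ /^\d+$/;
    my $port = (getservbyname($service, "tcp"))[2];
    return $port;    # undef if the name is not in the services database
}

my $finger_port = port_for("finger");
$finger_port = 79 unless defined $finger_port;    # assigned finger port

print "tcp protocol number: $tcp\n";
print "finger port:         $finger_port\n";
print "numeric input:       ", port_for("8080"), "\n";
```

Accepting a number as well as a name means the same code works for servers that run on non-standard ports.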

$current_address = (gethostbyname ($hostname))[4];
$remote_address  = (gethostbyname ($remote_host))[4];

The gethostbyname function converts a host name into a packed string that represents the network location. This packed string is like a common denominator; it needs to be passed to many functions. If you want to convert this string into the IP address, you have to unpack the string:

@ip_numbers = unpack ("C4", $current_address);
$ip_address = join (".", @ip_numbers);

unless ($remote_address) {
    die "Unknown host: ", $remote_host, "\n";
}

If gethostbyname returns no packed address for the remote host, the host name could not be resolved, and the script terminates.
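
The packed address format can be explored without any name lookup. The sketch below assumes the standard Socket module of Perl 5 (which the script above does not use): its inet_aton function produces the same kind of 4-byte string, and inet_ntoa does the same job as the unpack and join steps.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# A dotted-quad string needs no name lookup; the result has the same
# 4-byte form as (gethostbyname($host))[4].
my $packed = inet_aton("128.197.27.7");
die "Unknown host\n" unless defined $packed;

my $by_hand   = join(".", unpack("C4", $packed));
my $by_module = inet_ntoa($packed);

print "packed length: ", length($packed), "\n";
print "unpack C4:     $by_hand\n";
print "inet_ntoa:     $by_module\n";
```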

$current_port = pack ($socket_template, &AF_INET, 0, $current_address);
$remote_port  = pack ($socket_template, &AF_INET, $service, $remote_address);

These two lines are very important! Using the socket template we discussed earlier, three values (the address family, the port number, and the packed host address) are packed to create the address structures that the bind and connect functions will use. Despite their names, $current_port and $remote_port each hold a complete address, not just a port number. &AF_INET is a subroutine defined in the socket header file that refers to the Internet addressing method (i.e., addresses such as 128.197.27.7). You can also use other addressing schemes for sockets, such as &AF_UNIX, which uses UNIX pathnames to identify sockets that are local to a particular host.
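
A hand-written template depends on how the local system lays out the structure, and some systems (BSD derivatives, for example) arrange the first two bytes differently. As a hedged alternative, the sketch below uses pack_sockaddr_in from Perl 5's standard Socket module, which builds the structure in the local system's own format. The address and port are sample values, not output from the script.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa pack_sockaddr_in unpack_sockaddr_in);

# Sample values: the example address from the text and the finger port.
my $remote_address = inet_aton("128.197.27.7");
my $remote_port    = pack_sockaddr_in(79, $remote_address);

# unpack_sockaddr_in reverses the operation.
my ($port, $address) = unpack_sockaddr_in($remote_port);
print "port=$port address=", inet_ntoa($address), "\n";
```

Note the argument order: the port comes first and the packed address second.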

socket (FINGER, &AF_INET, &SOCK_STREAM, $tcp) || die "Cannot create socket.\n";

The socket function creates a TCP/IP socket called FINGER, which can be used as a file handle (as we will soon see). That is one of the simple beauties of sockets: once you get through the complicated connecting tasks, you can read from and write to them like files.

The &SOCK_STREAM (another subroutine defined in the header file) value indicates that data travels across the socket as a stream of characters. You can also choose the &SOCK_DGRAM paradigm in which data travels in blocks, or datagrams. However, SOCK_STREAM sockets are the easiest to use.

bind (FINGER, $current_port)   || die "Cannot bind to port.\n";
connect (FINGER, $remote_port) || die "Cannot connect to remote port.\n";

The bind statement attaches the FINGER socket to the current address. Because the port was given as 0, the system picks any free local port. Finally, the connect function connects the socket to the server located at the address and port specified by $remote_port. If any of these functions fails, the script terminates.
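
To see which local port the system chooses when 0 is requested, you can ask the socket for its own address after binding. This is a minimal sketch that assumes Perl 5's Socket module for the constants and packing functions, and it binds to INADDR_ANY (any local address) rather than to the address of the local hostname as the script does.

```perl
use strict;
use warnings;
use Socket qw(AF_INET SOCK_STREAM INADDR_ANY pack_sockaddr_in unpack_sockaddr_in);

my $tcp = (getprotobyname("tcp"))[2];
$tcp = 6 unless defined $tcp;    # assigned protocol number for TCP

socket(my $sock, AF_INET, SOCK_STREAM, $tcp)
    or die "Cannot create socket: $!\n";

# Port 0 asks the system to choose any free local port.
bind($sock, pack_sockaddr_in(0, INADDR_ANY))
    or die "Cannot bind: $!\n";

# getsockname returns the packed local address the socket now has.
my ($local_port) = unpack_sockaddr_in(getsockname($sock));
print "system assigned local port $local_port\n";

close($sock);
```

Including $! in the error message reports why a call failed, which is usually more helpful than a fixed string.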

$current_handle = select (FINGER);
$| = 1;
select ($current_handle);

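This group of statements unbuffers output on the socket, so that anything printed to it is sent to the server immediately instead of waiting in a buffer. The select function makes FINGER the default file handle just long enough to set $|, and then restores the previous default.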
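
The effect of unbuffering can be demonstrated without a network connection. The sketch below uses a pipe in place of the socket and the autoflush method from Perl 5's IO::Handle module, which has the same effect as the select sequence. Both substitutions are assumptions made for illustration.

```perl
use strict;
use warnings;
use IO::Handle;

pipe(my $reader, my $writer) or die "Cannot create pipe: $!\n";

# Same effect as the select / $| = 1 / select sequence, applied to $writer.
$writer->autoflush(1);

# With autoflush on, the line is handed to the system right away, so the
# read below finds it waiting. Without it, the read would wait forever
# for data still sitting in the output buffer.
print $writer "shishir\n";
my $line = <$reader>;
print "read back: $line";
```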

print FINGER $username, "\n";

The specified username is sent to the socket. The finger server expects a username only. You can test to see how the finger server works by using telnet to connect to port 79 (where the server resides):

% telnet acs.bu.edu 79
Trying 128.197.152.10 ...
Connected to acs.bu.edu.
Escape character is '^]'.
shishir
.
.
. (information returned by the server for user "shishir")
.
.

To complete our program:

while (<FINGER>) {
    print;
}
close (FINGER);
exit (0);

The while loop simply reads the information output by the server, and displays it. Reading from the socket is just like reading from a file or pipe (except that network errors can occur). Finally, the socket is closed.

If you found the explanation of socket creation confusing, that is OK. You will not have to write code like this. An easier set of functions will be explained shortly.
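
For comparison, here is a hedged sketch of the same client written with IO::Socket::INET, a module in the standard Perl 5 distribution. It is not the library described in the next section, and the finger_query subroutine is a hypothetical name introduced for this example. The module takes care of name lookup, address packing, connecting, and unbuffering.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: send one finger query and return the reply lines.
sub finger_query {
    my ($username, $remote_host) = @_;
    my $socket = IO::Socket::INET->new(
        PeerAddr => $remote_host,
        PeerPort => "finger(79)",    # service name, 79 if the name is unknown
        Proto    => "tcp",
    ) or die "Cannot connect to $remote_host: $@\n";

    # The finger protocol (RFC 1288) ends the query line with CR LF.
    print $socket $username, "\r\n";
    my @reply = <$socket>;
    close($socket);
    return @reply;
}

# Contact a server only when an argument such as user@host is given.
if (@ARGV) {
    my ($username, $remote_host) = split(/@/, shift(@ARGV), 2);
    print finger_query($username, $remote_host || "localhost");
}
```

Run without arguments, the script does nothing, so it can be loaded and checked on a machine that has no finger server to talk to.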

