Contents:
Overview
What Are Sockets?
Socket I/O in Perl
Socket Library
Checking Hypertext (HTTP) Links
Archie
Network News on the Web
Magic Cookies
Maintaining State with a Server
Forking/Spawning Child Processes
You have probably heard of Internet information services such as Archie (which lets you search the contents of FTP sites) and Usenet news (which is served over NNTP). Like the Web itself, these services run as protocols on top of TCP/IP. To make them available over the Web, you can develop CGI applications that act as clients to these other Internet information servers, communicating with them over TCP/IP.
Let's start by looking at how a server functions. Take an electronic mail application as an example (the same principle applies to any other server). Most mail programs save a user's messages in a particular file, typically in the /var/spool/mail directory. When you send mail to someone on a different host, the mail program must find the recipient's mail file on that machine and append the message to it. How exactly does the mail program achieve this, given that it cannot manipulate files on a remote host directly?
The answer to this question is interprocess communication (IPC). A process on the remote host acts as a messenger on behalf of the local mail program. The local process communicates with this remote agent across the network to "deliver" the mail. For this reason, the remote process is called a server (it "services" a request issued to it), and the local process is referred to as a client. The Web works on the same principle: the browser is the client, and it issues a request to an HTTP server, which interprets and executes the request.
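The client/server exchange described above can be sketched in a few lines. The following is only an illustration under stated assumptions, not how any real mail system works: it is written in Python, it runs both processes' roles inside one program on the loopback address, and the names run_server, deliver, and demo, the in-memory mailbox list, and the "OK" reply are all invented for this example.

```python
import socket
import threading


def run_server(listener, mailbox):
    """Server role: accept one client, store its message, acknowledge."""
    conn, _addr = listener.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:          # client has finished sending
                break
            chunks.append(data)
        # Stand-in for appending to the recipient's mail file.
        mailbox.append(b"".join(chunks).decode("utf-8"))
        conn.sendall(b"OK\n")


def deliver(host, port, message):
    """Client role: connect, send the message, return the server's reply."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(message.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)   # tell the server the message is complete
        return sock.recv(1024).decode("utf-8").strip()


def demo():
    """Run one server and one client on the loopback interface."""
    mailbox = []
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))     # port 0: let the system pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(target=run_server, args=(listener, mailbox))
    server.start()
    reply = deliver("127.0.0.1", port, "Hello from the client")
    server.join()
    listener.close()
    return reply, mailbox
```

Calling demo() returns the server's acknowledgment together with the contents of the mailbox. The point to notice is that the client never touches the mailbox itself; it only hands the message to the server, which does the work on its behalf.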
The most important thing to remember here is that the client and the server must "speak the same language." In other words, a particular client is designed to work with a specific server. So, for example, an Archie client cannot communicate with a Web server. But if you know the stream of data expected by a server, and the stream produced as output, you can write a CGI program that communicates with it, as we showed in the previous chapter.
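To see what "speaking the same language" means in practice, consider a toy server that understands exactly one command. This is a hypothetical sketch in Python: the LOOKUP command, the numeric reply codes, and the handle_request function are invented for illustration and do not correspond to Archie, HTTP, or any other real protocol.

```python
def handle_request(line, table):
    """Answer one request line in a made-up protocol.

    The only command this toy server understands is:
        LOOKUP <name>
    Anything else, including a perfectly valid request in some other
    protocol, is rejected.
    """
    parts = line.strip().split(" ", 1)
    if len(parts) == 2 and parts[0] == "LOOKUP":
        name = parts[1]
        if name in table:
            return "200 " + table[name]
        return "404 not found"
    return "500 unrecognized command"


# A client that speaks the server's language gets an answer ...
sites = {"perl": "ftp.example.org"}
good_reply = handle_request("LOOKUP perl", sites)

# ... while a client speaking a different language (here, an HTTP-style
# request line) is not understood, even though the request is well formed.
bad_reply = handle_request("GET / HTTP/1.0", sites)
```

Once you know which lines a server expects and which lines it sends back, writing a client for it is mostly a matter of producing the first and parsing the second.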
One especially useful application we will show in this chapter is one in which you write both the client and the server: a cookie handler, which helps you keep track of data entered across multiple forms.
The programming interface used for network communication depends on the flavor of UNIX. The version of UNIX from AT&T, called System V, provides STREAMS for communicating with processes across a network. The BSD flavor of UNIX, from the University of California at Berkeley, instead implements objects called sockets. In this chapter, we will look only at BSD sockets (also adopted by the PC world), which are by far the most popular way to handle network communication.