The persistent nature of Java servlets makes them ideal for database/web technology. Mr. McDonald takes a look at using servlets with PostgreSQL and JDBC.
by Bruce McDonald
The Common Gateway Interface (CGI) has and continues to be the most commonly used method for creating dynamic and responsive web pages. The main problem with CGI (that stems from the Hyper-Text Transfer Protocol) is that each new client request results in a new instance of the CGI executable being forked by the HTTP daemon. This can lead to considerable resource consumption on web hosting machinery. Many solutions exist to address this problem, most focusing on keeping the executable persistent between client requests. This has the added benefit of holding open costly resources like database and socket connections.
Java is one of the newest kids on the block, but as a C and C++ coder I really like some of the features of this language. Its object model is nice, it is (relatively) portable, and the class libraries available from both Sun and third parties are extensive. Servlets, a rather fanciful name, is the Java answer to the CGI problem. Servlets are Java classes, loaded and kept resident by the HTTP daemon. When the servlet is loaded, it is initialized by a method call; at that point, database connections can be established and held between client requests. In addition to this, there are a number of useful classes which facilitate the more complex server/browser interactions such as cookies. Unfortunately, the Servlet classes from Sun are still in a fluid state and, therefore, code written now may be broken by future releases. This is a fact of programming life and since this is a small application, not much harm is likely.
Since my site has been running the ubiquitous Apache web server, it was an easy choice to stick with this software. The Servlet part was the more tricky decision. I decided to play it safe and get the Apache-Java solution (http://java.apache.org/). This is not a finished product, but it certainly looked acceptable, and I wanted to have access to the source if anything broke. Alternatives include Live Software's JRun and IBM's Servlet Express. JRun is able to work with a number of web servers including Apache and Netscape, and the code is available under a fairly lenient license. Servlet Express is more of an unknown quantity--that is, I did not even evaluate it since it is only available for Apache 1.2.
The source code for JServ is available for download from the Java-Apache site, and the version I used was 0.9.11. Unpack the code using tar into a suitable directory (/usr/local/lib/JServ on my machine) and read the documentation carefully. A Makefile does most of the magic. The Makefile does need to be edited to modify the installation directory--alas, no autoconf. A recent JDK is required--I used Steve Byrnes 1.1.6-5 which I have found to be both fast and stable. I also have the tya 1.1-3i JIT (Just In Time Compiler) which quite transparently super-charges the execution of Java code, but it is not necessary. Make the JServ code to compile the Java source files to a local directory. Included in the JServ package is the requisite Servlet code from Sun. Both the Sun Servlet code (in the form of a jar file) and the JServ code need to be added to your CLASSPATH variable for both development and Apache.
Now it's Apache's turn. Get a nice, recent copy--I have 1.3.1, which I find to be a splendid creation--and unpack the source code into another directory. The magic of building an external Apache module is described in the INSTALL file in the root of the /apache directory. In order to compile the Apache daemon, it is first necessary to ``configure'' the Apache build environment. This is done through the wonderfully arcane autoconf-produced script configure. You merely need to add the following to the ./configure line:
--add-module=/usr/local/lib/jserv/mod_jserv.c.This will produce a Makefile that will copy the mod_jserv.c file to the ./src/modules/extra directory and compile it into Apache. Note that merely copying the mod_jserv.c file into the extra directory will not accomplish anything. In order to make this new module shared (assuming DSO in apache is configured), add the following line to the configure command line:
--enable-shared=jservThis option will produce a .so shared object which is dynamically linked to Apache when necessary.
Obviously some care is required. I cribbed most of the details from the apache.spec file from the apache SRPM; I advise you to get this file and install it. Make the changes in the spec file and rebuild the RPM (rpm -ba apache.spec), then install the new RPM. This applies only to RPM-based systems (Red Hat, SuSE, Caldera, etc).
A number of tags need to be added to your httpd.conf file. This is situated in /etc/httpd/conf in my only slightly modified Red Hat 5.1 machine. Listing 1 is an excerpt from my file. All tags in Listing 1 are covered in the JServ documentation, and the JAVA_COMPILER is really only applicable to systems which have a JIT installed. Notice that all the relevant classpaths are included: there is no need for Swing and other such visual classes to be present. I also set the LD_LIBRARY_PATH to the Java shared objects. If your Java code relies on shared libraries not located in the directories listed in your /etc/ld.so.conf file, then you must add these directories to LD_LIBRARY_PATH. Additionally the JServ error log (ServletErrorLog) is an important configuration item. It is in this file that all JServ-related errors and exceptions are dumped. Keep an eye on the regular httpd ErrorLog file, too. During testing, I run the command tail -f on the JServ error log file.
Your current Apache daemon must now be restarted. How you do this will vary depending on your distribution, but for Red Hat systems simply use the SysV scripts to do the magic:
/etc/rc.d/init.d/httpd stop /etc/rc.d/init.d/httpd startIn general, it should be possible to merely kill all instances of the httpd daemon and then rerun the new executable:
killall httpd /usr/sbin/httpdNow let's get a little servlet to test the current functionality of the JServ-enabled Apache. The Java servlet in Listing 2 illustrates the benefits and simplicity of the servlet model. This code keeps a counter (hits) that is reset to zero when the servlet is initialized. Every time the servlet is invoked by Apache, the counter is displayed and incremented. Notice that no effort is made to keep the servlet and its data persistent.
Every servlet must provide a service method which gets called whenever Apache needs to serve the URL http://roger.dodger.org/servlet/ServletOne. The two arguments passed, HttpServletRequest and HttpSerletResponse, encapsulate the HTTP connection. Information on HTTP-specific data such as cookies can be manipulated with these two objects. The method init is called only once when the class is first loaded (or reloaded) by Apache. This initialization method can be used to set up long-lived and potentially costly resources. From this simple example, it can be seen that the class, once loaded by Apache, remains persistent until it is reloaded. The hits variable is initialized once (in the init method), and is then incremented each time service is called. JServ will reload the class if it has been modified since it was first loaded. Sometimes it is necessary to reload the class if a change is made to the property file; in this case, touch the class file. This happens quite transparently.
Servlets are most often mentioned in the same context as database connectivity, and rightly so. Servlets and their persistent nature make for ideal database/web integration technology. My first forays into this led me quickly to the conclusion that CGI scripts were too slow for anything but large-scale processing with databases--where the initial setup of the database connection matched (or even exceeded) the actual database operations. Since I was familiar with PostgreSQL which came with my Red Hat 5.1 installation, I decided to use that RDBMS for experimentation. I also had some experience with developing Java database client software. The Java Database Connectivity (JDBC) classes in the JDK (Java Development Kit) are easy to use and somewhat portable between databases. The JDBC classes are comprised of database-independent code (the java.sql.* classes, part of the JDK) and database-specific code. The JDBC classes for PostgreSQL are part of the later distributions and are necessary for this application.
All this can be clearly demonstrated by a small JDBC example. In order for this to be a useful example, it is necessary to have a working PostgreSQL installation. Listing 3 is a rather dense example that illustrates a number of JDBC issues. First, the setup of the PostgreSQL JDBC driver (loginJdbc) and the login to the database (loginUrl, loginUser and loginPasswd). The variable loginUrl is often tricky to get right (especially when learning). The last part, in this case ``dbname'', is the name of the database to which the connection is being made. The latter two constants (loginUser and loginPasswd) need to be changed to reflect your database environment. Setting the permissions to the database involves executing the command createuser (as user postgres) and then granting the needed permissions to the user. Once the connection is established, the SQL SELECT statement is executed and the result set is captured by the variable rs. Meta-data for this result set (rsmd) is then used to determine the number of columns returned and to display the column names. The result set is then iterated through, printing each row as strings. This example is very simple-minded and is intended only for SQL SELECT statements because of the result set. Be prepared for exceptions; e.g., an exception is thrown if no columns are returned.
My first guest-book application was a Perl CGI script that kept all the guest entries in a formatted text file that was displayed or added to as the parameters dictated--simple and effective. Unfortunately, I wanted to do more with that information, including linking back to the Apache log files and maybe even handing out cookies to monitor usage. The script, which started small and lean, began to choke on the file processing necessary. It was also a little distressing, since I had a very functional RDBMS installed and ready to use. I studied the alternatives (mod_perl) and decided to go with Servlets. The first step was to design the tables. The most important table holds each guest-book entry on a row.
CREATE TABLE guest_book ( entryid INT NOT NULL, response CHAR(8), name VARCHAR(32), addr VARCHAR(48), email VARCHAR(24), time DATETIME, comment TEXT, PRIMARY KEY (entryid) )Each row in the guest-book table is uniquely identified by the entryid column which has an implicit index. Additionally, the handling of these keys is handled by a key table, a convenience in a small application but almost essential in larger applications.
CREATE TABLE key_table ( id INT NOT NULL, val INT DEFAULT 0, PRIMARY KEY (id) )The key table is able to keep track of any number of keys within the range of the int type, which in my case is [-231,231-1]. In order to get a new key, the current key must be retrieved and then incremented. This must be atomic, i.e., the operation must take place in a transaction. This is illustrated in the getKey method. Ideally, a stored procedure would handle all these details (the key-table concept should not be visible to application developers in an ideal world), but this level of detail with PostgreSQL involves C shared libraries and Database APIs--something with which I did not want to get involved.
An additional problem that needed solving was how to provide servlets with property files in a consistent fashion. Property files provide a convenient way of placing data that would otherwise be hard-coded into the application. This is done by providing a property on the command line to JServ called base.dir which points to a world-writable directory (but is chmod +t so that user's may not stomp on other users files). This is specified in the httpd.conf file in the ServletBinaryArgument tag. In this directory you can store property files which may be loaded by servlets. I am sure this can be done in a smarter fashion.
The main processing entry point is the service method. From here, either the list of entries is displayed (listEntries), a form is displayed (showForm) or a new entry is made (addEntry). The code for the Guest Book application is not shown due to space considerations, but is included in the archive file at ftp://ftp.ssc.com/pub/lj/listings/issue67/3243.tgz.
Listing 4 is the property file used to set up the various parts of the servlet. I moved a number of the HTML header and footer strings into this file along with the JDBC configuration parameters.
The servlet/JDBC/PostgreSQL proved to be a powerful and fast technology. Most of the problems I encountered were configuration problems that required me to carefully read the associated documentation. Unfortunately, technical documentation on Servlets is scarce and I would urge further experimentation. The next step I took with this application was to install Sybase ASE for Linux, a RDBMS with which I am comfortable. I then coded a number of stored procedures that allowed the Servlet to delegate most of its data manipulation to the database, where it rightly belongs. If you are going to replace PostgreSQL with Sybase, it is necessary to get the jConnect JDBC classes from the Sybase web site. Of course, this can be done with PostgreSQL, but learning the details of an RDBMS C API was tangential to the exercise.
Bruce McDonald can be reached at bruce@triphop.dyn.ml.org.
All listings referred to in this article are available by anonymous download in the file ftp://ftp.ssc.com/pub/lj/listings/issue67/3243.tgz.