This document explains how to build a database-backed web site using Apache, the PHP3 module and the PostgreSQL relational database.
by Will Benton
Why would one want to link a database to the web? A better question might be, ``Why wouldn't one want to link a database to the web?'' Static HTML pages are good for many things: documentation, hypertext books, personal pages and other unchanging information. However, static pages present a few problems:
I hope this quick rundown is enough to convince you that you really want to use a database for your dynamic web site. In this article, I will present instructions for installing PHP3 (http://www.php.net/) and PostgreSQL (http://www.postgresql.org/), a little bit of theory, some instructions for using SQL and PHP3, and an example application.
This description assumes you are running Red Hat, but most of these instructions will be applicable to other distributions; these tools are fairly painless to install from source, anyway.
Here's a list of what you'll need to run the example application and develop your own applications:
PostgreSQL uses a different access system than the rest of your system; oddly enough, not even root has access to the database by default. The database system has its own user system and passwords, and postgres is the database administrator account by default. The advantage to the separate access system is that one can create database users who do not have UNIX accounts; this way, the database for your web application can specify access control without creating a potential security hole for your system. To add your web administrator (web) as a database user, use createuser (as root):
# su postgres -c createuser Enter name of user to add ---> web Enter user's postgres ID, or RETURN to use UNIX user id: 542 -> 542 Is user "web" allowed to create databases (y/n) y Is user "web" allowed to add users? (y/n) y createuser: web was successfully addedThen, as web (or whatever account you used), you'll be able to create a database with createdb foo and then try some queries on foo using psql foo.
You'll also need to set up PostgreSQL to accept incoming TCP/IP connections so your PHP3 pages can access it. Fortunately, System V init makes this easy. Simply open the file /etc/rc.d/init.d/postgresql and change the line
su postgres -l -c \ 'usr/bin/postmaster -S -D/var/lib/pgsql'so that it reads
su postgres -l -c \ '/usr/bin/postmaster -S -D/var/lib/pgsql -i'While you're at it, you will probably want to specify a different port from the default (5432) for security reasons. To run the PostgreSQL back end on a different port, merely append a -p port to the above line.
Just about every relational database in the world uses SQL (or some extended version of SQL) as its query language. SQL allows you to define tables, select records based on given criteria, update values in one or many records and delete records. This is just a brief introduction to SQL; for more complete references, see Resources.
To create a table, one uses the CREATE TABLE statement. Its syntax is as follows:
CREATE TABLE tablename (field-1 type-1, ..., field-n type-n)In psql, you will need to end each statement with a semicolon. These semicolons are not part of the SQL language, but rather for the benefit of psql's lexer.
You may also declare fields as NOT NULL, UNIQUE or PRIMARY KEY, or specify a value as DEFAULT to a field. PostgreSQL will create an index on primary key fields. Unfortunately, as of version 6.4, PostgreSQL does not support foreign keys, but at least the parser will not choke on the SQL REFERENCES keyword.
Here's an example, akin to the UNIX password file:
CREATE TABLE passwd (username varchar(8) PRIMARY KEY, -- PRIMARY KEY implies UNIQUE cryptedpass char(13), uid int UNIQUE NOT NULL, gid int NOT NULL, gecos varchar(80), -- the GECOS field (real name, office, etc.) homedir varchar(80), shell varchar(50) DEFAULT '/bin/sh');Note that SQL uses a double-dash to begin comments, which are terminated by a newline.
To insert data into a table, use the INSERT statement:
INSERT INTO tablename (field-1, ..., field-n) VALUES (value-1, ..., value-n/)You needn't specify field names if you are inserting values into every field. Here's an example for the table we just created:
INSERT INTO passwd (username, cryptedpass, uid, gid, gecos, homedir, shell) VALUES ('fred', '37MniLTaiPLaL', 42, 500, 'Fred Mbogo', '/home/fred/', '/bin/sh');Note that SQL uses single quotes for string constants. Any closet Pascal programmers will feel right at home.
The SQL SELECT statement returns records where values meet a certain criteria. Here are some examples of SELECT in action:
SELECT * FROM passwd; -- returns all fields of all records SELECT username FROM passwd; -- returns all usernames SELECT * FROM passwd WHERE username = 'fred'; SELECT * FROM passwd ORDER BY username, shell; SELECT * FROM passwd WHERE homedir LIKE '/home%' -- % is the SQL wildcard character AND shell = '/bin/sh' ORDER BY username; SELECT homedir, projectname FROM passwd, projects -- assuming we have a projects table WHERE -- this will return the home directory of passwd.username = projects.leader; -- each project leader for each project
To change field values in records, use UPDATE:
UPDATE tablename SET field-1 = value-n WHERE qualificationThe WHERE is optional, but if you don't specify a WHERE clause, SQL will update all the records, which is clearly the ``Wrong Thing''.
Let's say Fred Mbogo wants to change his shell. This script will do it:
UPDATE passwd SET shell = '/bin/tcsh' WHERE username = 'fred';
To delete records, simply use DELETE:
DELETE FROM tablename WHERE qualifierJust like UPDATE, the WHERE is optional, but you probably want it anyway. Let's say Fred has offended his sysadmin one too many times:
DELETE FROM passwd WHERE username = 'fred';
The on-line PHP3 manual, http://www.php.net/manual/, is an excellent reference and will be necessary reading before you create your own database web application. Furthermore, it is a database-backed web site and has lots of user comments. Here, we will examine just the most basic PHP3 features.
Here is a simple PHP3 program, which demonstrates some basic features. Note the separate HTML and PHP3 blocks:
<title>Hello, world!</title> <body> <?php echo("Hello, world!\n"); echo("<p>\nWhat a <b>bold</b> move this is!\n"); ?> </body>This program will send the following HTML to the remote browser:
<title>Hello, world!</title> <body> Hello, world! <p> What a <b>bold</b> move this is! </body>A similar program, which takes an argument, would look like this:
<title>Hello, world!</title> <body> <?php echo("Hello, $name!\n"); echo("<p>\nWhat a <b>bold</b> move this is!\n"); ?> </body>You would view this page (assuming you called it hello.php3) like any CGI script: http://yourhost.net/~fred/hello.php3?name=fred. This, of course, assumes you are named Fred and have put this file in your /public_html directory.
PHP3 provides a number of useful functions for connecting to databases; the best place to read up on these is http://www.php.net/manual/ref.pgsql.php3, and we shall examine a few of them.
int pg_connect(host, port, options, tty, dbname);This function returns an integer, the ``connection index'', which you will need for all operations on this connection. If a connection can't be established, it returns zero.
Our example application is an address book that one can access over the Internet. A user logs in with her name and password and is presented with a menu of options, including browsing and searching the address book and adding a new person. For each person in the address book, the database stores an arbitrary number of e-mail addresses, telephone numbers, URLs and postal addresses. This address book also has some nifty features like mailing passwords to new accounts and automatic mailto and href links for e-mail and web addresses.
We have already completed the first step of the design process--deciding what we need our application to do. What remains is defining the data that our application will access in three iterations:
The low-level model we shall be using is the relational model, which has been around since 1970 and is the model used by most commercial relational databases, including Oracle, Sybase and Informix.
Finally, we shall define the ways in which our users can see the data. Once we've completed this, we're done with the hard part and can move on to the tedious part--the implementation.
Since the entity-relationship model is such a high-level model, some of the hairier issues of the low-level model are not yet apparent, and our data model looks quite simple. The main advantage of the E-R model is that it presents a clear picture of the database miniworld--the segment of the real world that we're modeling--and is easily comprehensible to both laypersons and software engineers. Entities are defined as follows:
Now we need to move our high-level model (which people can understand) to a low-level model (which our database management system can understand). This process is quite straightforward, although the database designer still has some options at this point, involving normalization. Normalization (see Resources) is a formal process to quantify and measure the quality of a relational model; as always, the tradeoff is theoretical quality versus performance.
The quick-and-dirty algorithm for moving your data model from the entity-relationship to the relational model is this:
SQL code for the ``finished'' relational model can be found at ftp://ftp.ssc.com/pub/lj/listings/issue67/3475.tgz. All code is released under the GNU GPL.
The remaining design step is deciding what sort of capabilities we wish to grant users to access and update the data. This is perhaps not as much of a problem in our database, but what if we were designing a database of employees? It might cause great discord in the office if everyone knew the salary of the guy who spends every day surfing the Web and taking two-hour lunch breaks; however, they should be able to access his name, department and extension. Likewise, they shouldn't be able to change that information unless they are the department secretary or manager.
We do want some protection on our address book, so that you can type in your grandmother's e-mail address with the peace of mind that a spammer can't get it just by accessing your web server. We also don't want to bother the user with implementation details like unique ID numbers on each record--this should be a user-friendly address book. Therefore, we will allow the following:
We also make note of the constraints which we cannot enforce with views: for example, the consideration that one may view only her own address book. We must implement these constraints in the application program.
Implementation in PHP3 is quite straightforward; many things in the example code speak for themselves, and others are well-commented.
The source code for the example application is intended to be more of a teaching tool than a finished product. It works well, but you would certainly want to add features before making a large-scale service from it. I have released it under the GNU GPL, so feel free to modify my code and share your modifications with others. This code is also on the FTP site shown above.
Will Benton can be reached at wcb@ccil.org