To learn more about building dynamic web sites, Mr. Lerner presents an application for reading the news using Mason and MySQL.
by Reuven M. Lerner
Last month, we took an initial look at Mason, a template system that sits on top of mod_perl and allows us to create fast-executing dynamic web sites built out of small components.
This month, we will look at a simple application built in Mason--a system to display the latest press releases on a corporate site. Of course, such a system could be tailored in a number of ways, including an on-line newspaper or other publication in which information changes on a regular basis. In creating this small site, we will see some of the steps involved in working with Mason.
The core element of our news system will be a relational database. I will use MySQL in these examples, although any relational database system should work fine.
I created a new MySQL database called ``atfnews'' on my MySQL server and assigned privileges so that the user atfnews can connect using the password ``atfpass''. I then created the following two tables:
CREATE TABLE Categories ( category_id SMALLINT UNSIGNED AUTO_INCREMENT, category_name VARCHAR(25) NOT NULL, PRIMARY KEY(category_id), UNIQUE(category_name) ); CREATE TABLE Articles ( article_id MEDIUMINT UNSIGNED AUTO_INCREMENT, category_id SMALLINT UNSIGNED NOT NULL, posting_date TIMESTAMP NOT NULL, headline VARCHAR(30) NOT NULL, body TEXT NOT NULL, PRIMARY KEY(article_id), UNIQUE(category_id, headline) );As you can probably tell from their names, the Categories table contains a list of category ID numbers and names. The Articles table contains several more pieces of information, including an article ID, the category ID into which an article should be placed, the date and time at which the article was posted, the article's headline and its body. We ensure no two articles in a given category have the same headline with a UNIQUE clause at the end of our CREATE TABLE statement.
The posting_date column takes advantage of MySQL's TIMESTAMP data type. This type automatically inserts the time and date of the latest INSERT or UPDATE to a given row. In this way, we can easily determine when news stories were added to the database, without having to enter or keep track of the information ourselves.
In order for our news system to work, we will need to create at least two different sets of components. One set will allow users to enter news items into the database (i.e., perform INSERTs), and the second will make it possible to retrieve items from the database (i.e., perform SELECTs). In a production setting, we would probably want to restrict posting access to a selected number of users. This would be possible with a standard .htaccess file, which allows users to restrict access to individual files or directories, or with a more sophisticated system that stores user information in a database.
One of Mason's strong points is its use of components. Components are actually Perl subroutines, cleverly disguised in the form of HTML files with some Perl thrown in. (Mason's parser performs the underlying magic that turns components into subroutines.) This structure means that repeated functionality can be packaged into one component, then invoked from within other components.
For example, Listing 1 contains a component called ``database-connect.comp''. This component returns a value, rather than producing HTML output. Its purpose is to connect to a database server and return a database handle, typically called $dbh. By centralizing this connection code, we can easily move our site from one server to another, changing only the relevant $host, $user, $password and $database variables as necessary.
Once database-connect.comp has been configured, any component on our system can receive a valid database handle with the following code:
<%init> my $dbh = $m->comp(database-connect.comp); </%init>The above code takes advantage of Mason's object-oriented interface, using the predefined $m object to invoke another component.
By placing the assignment inside of <%init>, we ensure that the component will connect to the database before anything else occurs within the component. However, this also means we are creating a new lexical variable ($dbh) with each invocation of the component.
It would be slightly more elegant to perform the above assignment within a <%once> section, creating $dbh a single time and keeping the value around. However, <%once> sections are executed outside of the Mason component context, meaning they cannot invoke methods on $m. Moreover, <%once> sections are invoked before new Apache child processes are created, which a $dbh object might not like. Thus, it is common to define $dbh in a <%once> section, but to perform the assignment in <%init>:
<%once> my $dbh; </%once> <%init> $dbh = $m->comp(database-connect.comp); </%init>The plain-vanilla mason.pl (or ``handler.pl'', as the Mason documentation describes it) configuration file that comes with the Mason distribution is almost good enough for this system to work. We need to load only Apache::DBI, a wrapper module that works with DBI within the mod_perl environment, ensuring that database connections are created and dropped only as necessary.
In order to load Apache::DBI, we need to put a use Apache::DBI statement in mason.pl, which is loaded with a PerlRequire statement in httpd.conf. In order to save some memory, we insert a PerlModule Apache::DBI line into httpd.conf. This ensures the module is loaded into memory before Apache splits into numerous child processes. The module might still require a fair amount of memory, but at least that memory will be shared among all Apache processes rather than requiring each one to have its own copy.
The first step toward making our news system work is to allow users to create new categories. Each news story in our simple system will be placed in precisely one category, much as each story in a newspaper is printed in only one section.
If we were to use the CGI model for creating a database editor, we would need to create an HTML form, pointing its <Form> action to the URL of a CGI program. That CGI program would then need to retrieve the HTML form elements, connect to the database and perform an INSERT.
With Mason, all this becomes much easier because of the relationship between HTML form elements and variables. We will still need two different components, one that presents the form and another that acts on the form's contents. The first component, add-category-form.html (see Listing 2), is a normal HTML form, with a single text field and a ``submit'' button. The only difference between this form and its non-Mason counterpart is the action attribute in the <Form> tag. In Mason, even a file with an .html suffix is a program and can thus receive input from an HTML form.
The component that handles the input and inserts a new row into the Categories table is called add-category.html (see Listing 3). As is often the case with Mason components, you must first look at the component's final sections (<%once>, <%init> and <%args>) in order to understand what is happening.
In the case of add-category.html, our <%once> section merely defines $dbh, as described above. The <%init> section performs two actions. First, it defines $dbh based on the returned value from ``database-connect.comp''. Once the database connection has been established, the <%init> section goes on to INSERT the user's input into the database. Notice how we use DBI's placeholders, shown here as a question mark in the list of VALUES, to avoid potential problems with quoted strings within our SQL query.
The placeholder is filled in with the value of $new_category_name, a scalar variable defined in <%args>:
<%args> $new_category_name </%args>By defining it there, we indicate that add-category.html must receive an HTML form element new_category_name when it is invoked. We could have given new_category_name a default value; however, this value is crucial to the functioning of add-category.html and must be mandatory.
Depending on whether the SQL INSERT succeeds, the scalar variable $successful_insert is set to true or false. This value is then used in the large if-else statement, to produce HTML that reflects the success or failure of the INSERT shown at the beginning of Listing 3. Notice how $DBI::errstr, the standard DBI error message, is available from within our component.
Once we have added one or more categories, we can begin to insert news items into the system. Unlike add-category-form.html, add-news-form.html (Listing 4) will need to connect to the database and cannot be a simple HTML form. This is because we want to present the user with a <select> list of current categories. In order to create this list dynamically, we will need to connect to the database and perform a simple SELECT. Other than that, the HTML form is relatively straightforward. We will use a table to organize the titles and form elements, but it consists of three basic elements: a headline, the body text and a category <select> list.
I decided to do this in a relatively inefficient (but easy to understand) way, using an SQL ORDER BY clause to retrieve names in alphabetical order. In order to keep track of the two different values (ID and name), I put them into the @categories array:
while ($row_ref = $sth->fetchrow_arrayref) { my ($id, $name) = @$row_ref; push @categories, {id => $id, name => $name}; }We can then iterate through @categories, placing the category ID as the ``value'' attribute (which will be submitted to the add-news.html component), but displaying the name of the category:
<select name="category_id"> % foreach my $category (@categories) { <option value="<% $category->{id} %>"> <% $category->{name} %> % } </select>Listing 5
The component that adds news, add-news.html (Listing 5), is almost identical to add-category.html, except it inserts three values rather than just one: the category ID, the headline and the body of the article. If the submission is successful, we tell the user that the article has now been placed in the database.
While we could retrieve the news directly into a top-level component, it is easier for us to create a generic component that retrieves any number of articles from any category. In this way, we can use this ``get-news.comp'' component in a number of different high-level components, retrieving the number and type of articles that interest us.
Listing 6, get-news.comp, is fairly straightforward, returning a list of articles to the caller. It builds the article list much as we built the category list in add-news-form.html, retrieving each of the articles:
while ($row_ref = $sth->fetchrow_arrayref) { my ($headline, $body, $posting_date) = @$row_ref; push @articles, {headline => $headline, body => $body, posting_date => $posting_date}; } return @articles;We take advantage of MySQL's LIMIT clause to restrict the retrieval to only as many articles as the user is interested in receiving. Also, we retrieve the articles in reverse order of their arrival, so that the article with the latest timestamp will come first. This ensures whenever we retrieve the latest five articles, they will indeed be the newest:
my $sql = "SELECT headline, body, posting_date "; $sql .= "FROM Articles "; $sql .= "WHERE category_id = ?"; $sql .= "ORDER BY posting_date DESC "; $sql .= "LIMIT ?";
get-news.comp returns the latest news into an array. But, of course, users are interested in reading news, not looking at a Perl array. For that reason, we'll define two more top-level components: one to choose the category and number of articles we wish to read and one to display them.
First, we will create a component view-stories-form.html (Listing 7), which lets us choose a category and maximum number of stories to display. This component repeats the paradigm of creating a <select> list from a Perl array. It then invokes view-stories.html (Listing 8), a simple component which does nothing but iterate through the stories returned by get-news.comp, placing them in a nicely formatted page of HTML.
As you can see, the amount of effort and code necessary to create this site was fairly modest. And while this is a relatively simple site, it does work--and it represents one way in which Mason and databases can be used together to create a dynamic site in a minimum amount of time. True, we ended up writing a number of components; but at least two of them are reusable if we decide to expand the site in the future, and thus will reduce the amount of work and debugging we'll have to do at that time.
With a bit more work, we could add personalization to this site, allowing users to read only news that is new to them and in only the categories that interest them.
As I indicated last month, Mason has increasingly become my tool of choice for producing these sorts of web sites, because of the speed and ease with which I can do so. The fact that I can separate tasks into reusable components and the high-speed gains from working within mod_perl are bonuses that make Mason an extremely attractive web-development environment.
Reuven M. Lerner, an Internet and Web consultant, moved to Modi'in, Israel following his November marriage to Shira Friedman-Lerner. He has written a book, Core Perl published by Prentice-Hall. Reuven can be reached at reuven@lerner.co.il. The ATF home page, including archives, source code and discussion forums, may be found at http://www.lerner.co.il/atf/.