
At the Forge

Templates: Separating Programs from Design

Make web site design changes easier by using templates--HTML pages with embedded Perl code.

By Reuven M. Lerner

If you are running a small web site, then you are probably responsible for everything--server administration, the site's content and the CGI programs that produce variable and dynamic content.

If your site is of any significant size, the work is probably divided among a number of people. Indeed, most large sites divide their staffs between the people who are responsible for the site's content and design, and those who are responsible for the infrastructure and technical side of the site.

Such a division undoubtedly makes it easier to administer a site. After all, it is much easier to find someone to write content or to write CGI programs than to do both. In addition, splitting the work according to function allows everyone to do what he does best.

At the same time, such a division makes it difficult for sites to maintain a uniform presentation style. CGI programs produce HTML that must match the style of the rest of the site. This might mean inserting a certain header at the top of each page, using a particular background color or inserting a graphic on the side of each page.

In other words, there are two separate sources for HTML content on a web site: the pages of HTML created by the designers, and the HTML produced by the CGI programs. If a site does not change its style often (or at all), the fact that the HTML comes from two sources does not matter. The designers establish a style for the site, which is then adopted by both designers and programmers for their work.

However, many sites have taken to redesigning every few months, partly due to continually improved technology that allows designers to create more interesting, exciting experiences on their sites. Every time a site's design changes, all of the existing content must be rewritten to fit the new design. Sites that have split their content between designer-written HTML and CGI-generated HTML will find themselves having to convert two types of files with two separate staffs.

For example, let's assume that a site has standardized white-on-blue text. Each time the designers create a new page, they make sure to include a <body> tag of the form:

<body bgcolor="blue" text="white">
In order for the site to have a uniform look, all of the CGI programs on this site must include a similar <body> tag at the top of their output. Here is a basic ``hello, world'' page that demonstrates how to accomplish this:

#!/usr/bin/perl -w

use strict;
use diagnostics;
use CGI; # Available from http://www.perl.com/CPAN

# Create an instance of CGI
my $query = new CGI;

# Send an appropriate MIME header
print $query->header("text/html");

# Begin the HTML, with our colors indicated
print $query->start_html(
	-title => "Hello, world!",
	-bgcolor => "blue",
	-text => "white");

# Send our message
print "<P>Hello, world!</P>\n";

# End the HTML
print $query->end_html;
If this program were run as a CGI program from within a web server, it would produce a short page of HTML on our screens, with the text appearing in white on a blue background. (And yes, we should use hex codes for consistent colors across platforms, but this is just meant to be an easy example.)

After creating an instance of CGI (an object module freely available from CPAN at http://www.perl.com/CPAN), the program sends a MIME header indicating that it will be sending HTML-formatted text to the user's browser. Following that MIME header, it sends a <body> tag, hidden somewhat by the start_html method that takes care of such tag production for us.

Finally, we send our short message, marked up in HTML, and invoke the end_html method, which sends a </body> tag to end the body of the HTML text and an </html> tag to indicate the end of the HTML page.

What happens when the designers decide that white-on-blue text is passé, and that they would rather have a more modern look (along the lines of Wired magazine) with orange text on a green background? It would not be very difficult for the designers to perform a global ``search and replace'' on the <body> tags appearing within the HTML files on the site. Modifying each of the CGI programs on the server is much trickier.

A Simple Solution

One solution is to put all of our design-related variables in a library module that we can import into our programs. Here is an example of such a module called SiteDesign:

#!/usr/bin/perl -w

package SiteDesign;

$background_text = "blue";
$foreground_text = "white";

1;
The above module is named by the package statement. Following that statement, the names of variables and functions are assumed to begin with the string SiteDesign::. To avoid problems with the package names when the variables are referenced from a program, we have left out the normally helpful construct use strict.

Assume that the above code is placed in a file named SiteDesign.pm, and the file is placed in one of the directories listed in the special Perl variable @INC (the list of directories in which Perl looks for modules). Our programs should now be able to include this library with the statement:

use SiteDesign;
In other words, we could rewrite our ``Hello, world'' program as:

#!/usr/bin/perl -w

use strict;
use diagnostics;
use CGI; # Available from http://www.perl.com/CPAN

use SiteDesign;

# Create an instance of CGI
my $query = new CGI;

# Send an appropriate MIME header
print $query->header("text/html");

# Begin the HTML, with our colors indicated
print $query->start_html(
	-title => "Hello, world!",
	-bgcolor => $SiteDesign::background_text,
	-text => $SiteDesign::foreground_text);

# Send our message
print "<P>Hello, world!</P>\n";

# End the HTML
print $query->end_html;
This code is certainly an improvement over the first version of our program, in that the HTML produced by our programs can be changed without having to modify the programs. Existing CGI programs do have to be changed so that they make use of SiteDesign.pm--but you only have to change your existing code once, rather than each time the site's design changes.
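The module above leaves out use strict. If you would rather keep it, one possible variant declares the variables with our (which assumes Perl 5.6 or later) and adds a small function that builds the whole <body> tag. The function name body_tag is our own invention for this sketch, and the package is defined in the same file as the program only so that the example runs on its own; on a real site it would live in SiteDesign.pm. The sketch uses the standard HTML text attribute for the foreground color.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# In practice this package would sit in its own file, SiteDesign.pm
package SiteDesign;

# "our" declares package variables that satisfy "use strict"
our $background_text = "blue";
our $foreground_text = "white";

# Hypothetical helper (not part of the original module):
# build the <body> tag from the design variables
sub body_tag {
    return qq{<body bgcolor="$background_text" text="$foreground_text">};
}

package main;

# A CGI program asks the module for the tag instead of hard-coding colors
my $tag = SiteDesign::body_tag();
print "$tag\n";
```

With a function like this, a change in the color scheme still touches only the module, and the CGI programs never mention individual colors at all.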

This approach is useful in many ways, but it does not solve all of the problems. While we have reduced the amount of work that a site's programmers need to perform each time the designers change their minds, we have not eliminated it entirely. The designers still have to come to the technical staff each time they wish to make such changes.

Furthermore, there is a practical limit to the number of ways in which we can affect our programs' output by setting variables. We could add a variable indicating which image, if any, should be displayed at the top of each page, another variable indicating whether an image should be displayed at the bottom of the page, another variable indicating the font size, yet another for whether the first paragraph should be centered, and so forth, ad infinitum. Sure, it would still be easier to change these variables than to change the output of each CGI program, but this solution does not scale well to a large number of variables. Would you want to be the programmer asked to modify 30 configuration variables each time the site's design was changed?

One possible solution to this problem is to put the variables in a configuration file, similar to the quiz file that we have discussed over the last few months. Such a file, particularly if it were masked by an interface consisting of CGI programs and HTML forms, would allow designers to modify the site's design without having to bother the programmers. However, designers would still have to deal with the large number of configuration variables, as well as understand what they mean. And programmers would still have to write code taking all sorts of styling possibilities into account.
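To make the configuration-file idea concrete, here is a minimal sketch of how a program might read design settings. The file format (one ``name = value'' pair per line, with # starting a comment), the setting names and the function parse_design_config are all assumptions made for this illustration; the settings are read from a string rather than from disk so that the example is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical file contents; a real site would read these from disk
my $config_text = <<'END';
# Site design settings
bgcolor = blue
text    = white
top_image = /images/banner.gif
END

# Turn "name = value" lines into a hash of settings
sub parse_design_config {
    my ($text) = @_;
    my %design;
    foreach my $line (split /\n/, $text) {
        next if $line =~ /^\s*(#|$)/;    # skip comments and blank lines
        if ($line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/) {
            $design{$1} = $2;
        }
    }
    return %design;
}

my %design = parse_design_config($config_text);
print qq{<body bgcolor="$design{bgcolor}" text="$design{text}">\n};
```

Even in this small form, the drawback described above is visible: every new styling decision means another name that designers must learn and programmers must handle.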

In other words, the use of variables to indicate styling is better than nothing at all but is far from a perfect solution. What we would like is a way of creating pages of HTML that can be modified by designers and that can also execute code within those pages.

Templates

Luckily, we can create such hybrid Perl/HTML pages using the Text::Template Perl module, written by Mark-Jason Dominus. This module, available from CPAN at http://www.perl.com/CPAN/, allows us to take such hybrid files, evaluate the Perl parts, leave the pure HTML alone and send the results to the user's web browser. While the template module is identified as beta software and is not guaranteed to be stable, I have been using it for some time and have not encountered any problems. (I wish that I could say that about some of the commercial software that I use.) Although the template module is not designed to work exclusively with HTML pages, it is in this area that I have found it to be highly useful.

Templates are pages of HTML that contain zero or more pieces of Perl code. (Thus, a plain HTML file is also a template, although such files don't do anything special.) The Perl code is contained inside the curly braces that Perl uses to identify blocks within programs. For example, here is one template that displays the time of day as recorded on the server:

<HTML>
<Head>
<Title>Welcome to our site</Title>
</Head>
<Body>
<P>Welcome to our site! The time is now
{localtime;}
</P>
</Body>
</HTML>
At first glance, the above template appears to be HTML and nothing more. If you look within the curly braces ({ }), you will see Perl code hiding there. In this particular case, we have used the Perl function ``localtime'', which returns the time and date in the standard Unix format.

Because the above file looks and acts like HTML--it is HTML, after all, except for the Perl code--we can give it to our designers, who can change the layout in any way they might like. If they wish to insert an image before or after the time, or if they wish to center the time of day, they can do so by using the familiar HTML tags. The site's programmers merely have to stress the importance of not modifying the text contained within the curly braces, which should be off-limits to the designers. By the same token, the site's programmers should modify only the code contained within the curly braces, since that is the portion for which they are responsible.

By using templates, we get the best of both worlds. Pages can contain programs, and thus, can modify their output depending on circumstances, while styling is still determined by the HTML surrounding the blocks of code.

Writing templates admittedly takes a bit of practice, but the principles are easy to understand. As mentioned above, anything within curly braces is considered to be Perl code and is replaced by the results of its evaluation. Thus, the expression:

{ 2 + 2; }
returns 4, and the expression:

{
	$browser = $ENV{"HTTP_USER_AGENT"};
	$outputstring = "<P>You are using \"$browser\" as your browser.</P>\n";
}
returns a string telling the user which browser he is using, bracketed by HTML ``paragraph'' tags.
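The principle of replacing each block with the result of its evaluation can be imitated in a few lines of plain Perl. The following is only a toy sketch of the idea and is not how Text::Template is implemented: the names fill_in_toy, evaluate_block and ToyTemplate are made up for this example, and the simple pattern cannot cope with blocks that themselves contain curly braces (such as $ENV{"HTTP_USER_AGENT"}), which the real module handles.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Evaluate one block of code and return its value. The code runs in its
# own package, without strict, so blocks can share undeclared variables.
sub evaluate_block {
    my ($code) = @_;
    my $value = eval "package ToyTemplate; no strict; $code";
    return "[error in block: $@]" if $@;
    return defined $value ? $value : "";
}

# Replace every {...} block that has no braces inside it
# with the value of the Perl code it contains
sub fill_in_toy {
    my ($text) = @_;
    $text =~ s/\{([^{}]*)\}/evaluate_block($1)/ge;
    return $text;
}

my $page = fill_in_toy('<P>Two plus two is { 2 + 2; }.</P>');
print "$page\n";
```

Everything outside the braces passes through untouched, which is exactly what lets designers treat the file as ordinary HTML.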

It is also possible to make calculations in one block of Perl and to use the results of those calculations in a later block. Thus, we can create the following:

<HTML>
<Head>
<Title>Welcome to our site</Title>
</Head>
{
$time = localtime;
$browser = $ENV{"HTTP_USER_AGENT"};
}
<Body>
<P>Welcome to our site! The time is now
{ $time; } 
</P>
<P>You are using {$browser;} to view our site.</P>
</Body>
</HTML>
In this code, we use the first block of Perl to assign variables needed in the rest of the template. This might seem a bit contrived, but when creating large, complicated templates it can be a great help to set up a number of variables in the first block and then refer to them in subsequent blocks.

If we are not careful when writing blocks of code, we can accidentally insert some extraneous characters into our resulting page of HTML. In the above example, the first block of code assigns values to variables. The code block itself returns the value of $browser, since that was the last variable assignment. In other words, our users see the name of their browser twice--once where the first block sits and again where we expect to see it, in the third Perl block.

In order to avoid such problems, I generally use a variable named $outputstring, which is used solely for the purpose of sending output to the resulting page of HTML. At the beginning of each block, I assign $outputstring the empty string (""), ensuring that it is not tainted by values from previous blocks. The last line of each block is then set to $outputstring;, which evaluates to the value of $outputstring and is sent to the user's browser. In between these two uses of $outputstring I can perform any calculations that I want--and anything that I want to send to the user is simply concatenated onto the current value of $outputstring.
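The effect is easy to reproduce outside any template with Perl's string eval, which likewise yields the value of the last expression evaluated. In this sketch the two blocks are held in ordinary strings, and the values ``noon'' and ``Lynx'' are placeholders chosen for the illustration rather than anything a server would report.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A block that only assigns variables still has a value:
# that of its last expression
my $leaky_block = q{
    $time    = "noon";
    $browser = "Lynx";
};

# The same block written with the $outputstring idiom
my $quiet_block = q{
    $outputstring = "";
    $time    = "noon";
    $browser = "Lynx";
    $outputstring;
};

# Evaluate each block without strict, as template code usually runs
my $leaked = eval "no strict; $leaky_block";
my $quiet  = eval "no strict; $quiet_block";

print "leaky block yields: '$leaked'\n";
print "quiet block yields: '$quiet'\n";
```

The first block yields the browser name even though nothing asked for output; the second yields the empty string, so nothing stray reaches the page.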

Since CGI variables are actually environment variables and child processes inherit environment variables from their parents, we can also access CGI variables from within our templates. We have already seen this in the above examples, when we retrieve $ENV{"HTTP_USER_AGENT"}, which should return the identifying string that web browsers send to web servers along with their document requests.
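Because the identifying string is supplied by the browser, it is prudent to treat it as untrusted text before placing it inside HTML, and to allow for the variable being absent altogether. The sketch below shows one way this might be done; the helper escape_html and the fallback wording are our own choices for the example, not part of any module used in this article.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first, before other entities appear
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Fall back to a placeholder when the variable is not set,
# as happens when the code runs outside a web server
my $browser = defined $ENV{HTTP_USER_AGENT}
    ? $ENV{HTTP_USER_AGENT}
    : "an unidentified browser";

print "<P>You are using ", escape_html($browser), " to view our site.</P>\n";
```

The same few lines could sit inside a template block, with the final string assigned to $outputstring instead of printed.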

Because the code inside templates is full-blown Perl, we can use all of the techniques and code that we ordinarily use, including the use of library modules for databases, centralized libraries of code, and just about anything else available.

Of course, you need to be sure that your code is debugged before releasing it on an unsuspecting public. It is quite embarrassing to create a template and put it out in a public area of your web site, only to discover a bug that causes the entire template to crash. Actually, the template won't crash; the template module is smart enough to catch problems and point them out on the resulting page of HTML. Debugging templates can be tricky, so be sure to allocate additional testing and debugging time whenever you use templates rather than straight CGI programs.

The Template Wrapper

So, how do we turn a hybrid Perl/HTML template into plain HTML to be sent to the user's browser? If users were shown these templates without some sort of translation, they would appear as HTML files with the Perl reproduced verbatim on the user's screen. This display is obviously not desirable.

The key is to have a CGI program, called wrapper.pl, take the name of a template in its query string (i.e., the argument that a CGI program receives following the question mark in the URL). Once it has received the template name, wrapper.pl creates an instance of Text::Template and instructs that module to perform the magic necessary to turn our template into a page of HTML. We can then send the resulting HTML to our user's browser. As far as the user is concerned, the page was and is HTML; he does not know that we have used a template to create our output.

Here is a simple version of wrapper.pl:

#!/usr/bin/perl -w

use strict;
use diagnostics;
use CGI;
use Text::Template;

# Create an instance of CGI
my $query = new CGI;

# Send an appropriate MIME header
print $query->header("text/html");

# Get the name of the template
my $file = "/home/httpd/html/templates/" . $query->param("keywords");

# Create an instance of template
my $template = new Text::Template(-type => 'FILE',
	-source => $file);

# Perform the evaluation, and send the results
# to the user's browser
print $template->fill_in;
This program may appear quite simple, but a great deal of work is hidden within our two calls to Text::Template: first when we create the template object from the file, and then when we invoke fill_in, which asks the template object to evaluate each of the small Perl programs inside the indicated template. Finally, we take the results of that evaluation and send them to the user's browser with the print statement.
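The same two calls can be tried from the command line, without a web server, by handing the module a string instead of a file. The sketch below assumes that Text::Template is installed and uses its documented TYPE, SOURCE and HASH options along with the $Text::Template::ERROR variable; the template text and the variable name are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Template;

# A tiny template held in a string rather than in a file
my $source = '<P>Hello, {$name}! The answer is { 6 * 7 }.</P>' . "\n";

# Create the template object, checking for failure
my $template = Text::Template->new(TYPE => 'STRING', SOURCE => $source)
    or die "Could not construct template: $Text::Template::ERROR";

# Evaluate the blocks; the HASH option supplies $name to the template
my $html = $template->fill_in(HASH => { name => 'world' });
defined $html
    or die "Could not fill in template: $Text::Template::ERROR";

print $html;
```

Checking both return values is a small price to pay, since a misspelled file name in wrapper.pl would otherwise go unreported.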

Hard-coding the directory in which templates are stored not only makes the resulting URLs shorter but also makes your site somewhat more secure, since outsiders will not learn the layout of your file system. It is also a good idea to remove any references to the parent directory (represented with ``..'') in the filename passed to wrapper.pl, so as to avoid turning our program into a convenient way of looking at all of the files on the server's hard disk. One easy way to do this is to replace the original assignment of $file with the following two statements:

# Get the name of the template
my $file = "/home/httpd/html/templates/" . $query->param("keywords");

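# Remove possible security problems; repeat until no "/../" remains,
# since a single pass with /g misses back-to-back occurrences
1 while $file =~ s|/\.\./|/|;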
This will remove attempts to ascend one or more directories, making it more difficult for someone to spy on the contents of our server.
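A stricter alternative is to accept only names that look like plain filenames and to refuse everything else, rather than trying to repair a suspicious name. The following sketch illustrates that approach; the function template_path and the exact set of permitted characters are assumptions made for the example and should be adjusted to the naming conventions of your own site.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $template_dir = "/home/httpd/html/templates";

# Hypothetical helper: return the full path for an acceptable template
# name, or undef if the name looks suspicious
sub template_path {
    my ($name) = @_;
    return undef unless defined $name;
    # Letters, digits, underscores, dots and hyphens only: no slashes
    return undef unless $name =~ /\A[\w.-]+\z/;
    # No leading dot, which also rules out "." and ".."
    return undef if $name =~ /\A\./;
    return "$template_dir/$name";
}

foreach my $name ("test.tmpl", "../../etc/passwd", "..") {
    my $path = template_path($name);
    print "$name => ", (defined $path ? $path : "rejected"), "\n";
}
```

In wrapper.pl, a rejected name would lead to an error page instead of a call to Text::Template.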

Thus, if we have a template named /home/httpd/html/templates/test.tmpl and a site called www.oursite.com, we can view the template in translated form by using the URL http://www.oursite.com/cgi-bin/wrapper.pl?test.tmpl. If we have another template in the same directory named foo.html, we can view it using the URL http://www.oursite.com/cgi-bin/wrapper.pl?foo.html.

One odd note that you should remember when creating templates is the fact that they are effectively served out of the CGI directory on your server (usually called cgi-bin). In all of the above templates, this does not make any difference. If our templates were to incorporate images whose URLs were named relatively (i.e., without a leading slash) rather than absolutely (i.e., with a leading slash), this could cause a problem.

For example, it is quite common for HTML files to be placed in one directory, while the images used by those files are placed in a subdirectory, perhaps named ``images''. In order to create an HTML file with an image inside of it, we could do the following:

<HTML>
<Head>
<Title>Example of image</Title>
</Head>
<Body>
<P>This is a sample Web page, containing an image.</P>
<img src="images/graphic.gif">
</Body>
</HTML>
But if we were to take this same file and feed it through wrapper.pl, the image would no longer appear. That's because the ``images'' subdirectory exists relative to the directory in which the HTML file resides, rather than the directory in which the CGI program resides.

One quick solution to this problem is to use the <base> HTML tag, which tells the browser to resolve relative URLs against a base URL other than the one under which the document was requested. The <base> tag looks like:

<base href="http://www.oursite.com/text/english/">
With this tag in place, our browser will know to load the image in the above template from http://www.oursite.com/text/english/images, regardless of whether the document was loaded from within the CGI directory or the original HTML directory. The problem with this approach is that it makes it more difficult to move files and directories to other places on the site--a trade-off that is often worth making.

One word of warning before I conclude. Normally, access to the CGI directory and to the programs contained within it is restricted to a small set of programmers who can be trusted to write and modify code on your system. With templates, that group is suddenly expanded to include all of the site's designers, who could theoretically modify the code within a template to perform malicious acts. Remember that since templates include code, it is a good idea to restrict access to the directory containing the templates, rather than granting it to everyone on your system.

In short, templates are a useful way to separate the design of a web site from the CGI programs it contains. By using them wisely, you will give everyone more freedom to do what they enjoy, as well as what they do best.

Reuven M. Lerner is an Internet and Web consultant living in Haifa, Israel, who has been using the Web since early 1993. In his spare time, he cooks, reads and volunteers with educational projects in his community. You can reach him at reuven@netvision.net.il.