
cgimodel: CGI Programming Made Easy with Python

Always look on the bright side of life and at a method for debugging CGI programs on the command line.

by Chenna Ramu and Christina Gemuend

The Common Gateway Interface (CGI) is a way to let others from all over the world execute a program that resides on your computer. CGI is dynamic: the program runs anew for each request, so its output can differ every time. You can decorate the CGI output with HTML (HyperText Markup Language). Most of the time, CGI is used as a front end for existing applications. CGI can be easy or complex, depending on the complexity of your project. Most CGI developers know the frustration that comes with debugging CGI programs.

We present a very simple and robust way of doing CGI programming with Python. Debugging your CGI is easy, since you can do it on the command line, and integrating existing applications to work with CGI is just one step.

For our work, we chose Python, an object-oriented scripting language with a clear syntax. It is very easy to use, widely available and is free software.

Our intended audience is both experienced and novice CGI programmers. We will use the words ``function'' and ``method'' interchangeably. Note that CGI can be written in any computer language.

The GET and POST Methods

There are two ways of invoking CGI programs: through a URL with all data included, or by submitting HTML forms.

The two methods defined in HTTP to send your data to the CGI are GET and POST. When the method is GET, the CGI program gets the input from the QUERY_STRING environment variable. When the method is POST, the CGI program gets the input from standard input (STDIN). In both cases, one has to parse the input to obtain the input argument name,value pairs.
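The two input paths described above can be sketched in a few lines. This is only an illustration, not the code from cgimodel.py: the function name read_cgi_input is hypothetical, the sketch is written for Python 3 (unlike the listings in this article), and it keeps only the first value when a name is repeated.

```python
import io
from urllib.parse import parse_qs


def read_cgi_input(environ, stdin):
    """Return the request parameters as a dict of name -> value.

    environ is a mapping such as os.environ; stdin is a text stream.
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the data arrives on standard input, CONTENT_LENGTH long.
        # For plain ASCII form data, characters and bytes coincide.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: everything after the '?' in the URL is in QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    # parse_qs returns a list per name; keep the first value for simplicity.
    return {name: values[0] for name, values in parse_qs(raw).items()}


# Simulate a GET request without a web server.
get_env = {"REQUEST_METHOD": "GET",
           "QUERY_STRING": "fun=DisplayFile&fileName=cgimodel.py"}
print(read_cgi_input(get_env, io.StringIO("")))

# Simulate the same request sent with POST.
body = "fun=DisplayFile&fileName=cgimodel.py"
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
print(read_cgi_input(post_env, io.StringIO(body)))
```

Passing the environment and the input stream as arguments, rather than reading os.environ and sys.stdin directly, is what makes the function easy to try out without a web server.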

CGI may or may not be complicated, but when you have a larger application with many features, you might have problems in testing, debugging, etc. This is true with all software projects. Debugging becomes problematic with CGIs. For example, when the method is GET, you have to set up environment variables QUERY_STRING and REQUEST_METHOD. When the method is POST, you must set up REQUEST_METHOD and CONTENT_LENGTH (number of bytes) to read from standard input (STDIN). Moreover, when your program crashes, it is not visible to your browser--you do not know what happened. The only message you get in this situation is the error report made by the web server.

You can use either of these methods (GET/POST) depending on your need. If you will be sending more data to CGI, use the POST method. When you have less data to be sent to CGI, use GET to put all the data inside the URL. For example, on one line, type:

<A HREF="/cgi/cgimodel.py?fun=DisplayFile&fileName=cgimodel.py">cgimodel</A>
With HTML FORMS (for POST method), the same would be

<FORM METHOD="post" ACTION="/cgi-bin/cgimodel.py">
<INPUT TYPE=hidden name=fun value=DisplayFile>
<INPUT TYPE=hidden name=fileName value=cgimodel.py>
<INPUT TYPE=SUBMIT VALUE="cgimodel">
</FORM>
We all know the difficulties of debugging CGI programs, and each of us has adopted a different style for doing it. Our intention is to build CGI that does not work in the traditional way, but like other programs which work on the command line. This means you can test your CGI the way you test any other program on the command line. When it works on the command line, it is guaranteed to exhibit the same behavior on the Web.

The cgimodel Module

Let us see how we can make life easier with cgimodel, which lets you integrate your existing application in an elegant way without much hassle. Basically it consists of two modules: cgimodel.py (see Listing 1) and cgidisp.py (see Listing 2).

Listing 1. cgimodel.py

Listing 2. cgidisp.py

cgimodel.py is a wrapper around Python's cgi module. It also encapsulates reading from the command line, so there is no real difference between invoking it from HTML FORMS, a URL or the command line.

The CollectArgs function in the cgimodel.py module takes care of collecting arguments including name,value parameters from CGI or the command line. On the UNIX command line, you can supply the name,value parameters like this:

-name1 value1 -name2 value2
or like this:

name1 value1 name2 value2
The same is true for both URL and FORMS.
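To make the two command-line forms concrete, here is a minimal sketch of how such arguments could be collected into a dictionary. It is not the CollectArgs function from Listing 1: the name collect_cli_args is hypothetical, the code is written for Python 3, and it assumes names and values strictly alternate.

```python
def collect_cli_args(argv):
    """Turn ['-name1', 'value1', 'name2', 'value2'] into a dict.

    A leading '-' on a name is optional and is stripped. A trailing
    name with no value after it is ignored.
    """
    params = {}
    # Walk the list two items at a time: a name followed by its value.
    for name, value in zip(argv[0::2], argv[1::2]):
        params[name.lstrip("-")] = value
    return params


# Both spellings produce the same dictionary.
print(collect_cli_args(["-fun", "DisplayFile", "-fileName", "cgimodel.py"]))
print(collect_cli_args(["fun", "DisplayFile", "fileName", "cgimodel.py"]))
```

In a real script the list would come from sys.argv[1:].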

You do not have to modify anything in cgimodel.py. You just have to use it. The main section of cgimodel contains the following lines:

d = Dispatcher() 
parDict = CollectArgs(parDict)
print mime_html
fun=parDict['fun']
if not fun:
   print "usage: cgimodel -fun functionName"
   d.ShowAvailableFunc()
   TraceIt(parDict)
else:
   try:
      d.dispatch(fun,parDict)
   except:
      TraceIt(parDict)
cgimodel.py tries to call the function you have given as an argument to the parameter -fun.

When there is no such function available, it tells you the names of functions that can be called. If there is an exception (because of a syntax error, etc.) in the program, the exception will be traced back and reported. You can use this feature to e-mail the exception to yourself and make your CGI program more stable.
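The idea of catching an exception and reporting the traceback, instead of letting the web server show its generic error page, can be sketched as follows. This is an assumption about the approach, not the TraceIt function from the listing: format_trace, run and broken are hypothetical names, and the code is written for Python 3.

```python
import html
import traceback


def format_trace(par_dict):
    """Return an HTML fragment describing the exception being handled.

    Must be called from inside an 'except' block.
    """
    lines = ["<PRE>", html.escape(traceback.format_exc()), "Parameters:"]
    for name, value in sorted(par_dict.items()):
        lines.append(html.escape("%-30s : %s" % (name, value)))
    lines.append("</PRE>")
    return "\n".join(lines)


def run(func, par_dict):
    """Call func(par_dict); return its result or an error report."""
    try:
        return func(par_dict)
    except Exception:
        return format_trace(par_dict)


def broken(par_dict):
    # Deliberately fails when the divisor is zero.
    return 1 / par_dict["divisor"]


print(run(broken, {"divisor": 0}))
```

The same report string could be written to a log file or mailed, rather than printed to the browser.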

The cgidisp Module

The other module, cgidisp.py, is the one you modify: for your application, add a method to the class Dispatcher that takes one argument, namely parDict. For example, under class Dispatcher, if you define a method like

def cmd_myHello(self,parDict):
   print "<H1>Hello</H1>"
then this function is immediately available to the outside world. You can call it on the command line this way:

cgimodel.py -fun myHello
with URL (GET method)

cgimodel.py?-fun=myHello
and with HTML forms as

<FORM METHOD="post" ACTION="/cgi-bin/cgimodel.py">
<INPUT TYPE=hidden name=fun value=myHello>
<INPUT TYPE=SUBMIT VALUE="Say Hello">
</FORM>
It's that easy!

The dispatch method under the class Dispatcher is called from cgimodel.py with the name of the function to be executed, followed by parDict. Here is the interesting part. After prefixing the function name with the ``cmd_'' string, the dispatch method checks to see if such a function is available with hasattr. The Dispatcher maps the command to the function and executes it. This way, you do not have to use a lookup table to keep track of available functions. There is no extra step to register a new command; you just have to write the function and call it through the command line. The functionality is already there. This kind of pattern is possible with Python, since it is a highly dynamic language.

Please note that when calling the method, we are not using the prefix cmd_ of the method. This is explained later.

The main section of the Dispatcher class contains the following:

class Dispatcher:
   def __init__(self):
      self.debug = None
   def dispatch(self, command,args=None):
      # Map the command name to a method called cmd_<command>
      mname = 'cmd_' + command
      if hasattr(self, mname):
         method = getattr(self, mname)
         if not args:
            return method()
         else:
            return method(args)
      else:
         print "<PRE>"
         self.error(command)
         self.ShowAvailableFunc()
         print "</PRE>"
   def cmd_Hello(self,parDict):
      print " Hello World !"
   def cmd_ShowDict(self,parDict):
      print "<PRE><H1>Debug Info:</H1><HR>"
      for k,v in parDict.items():
         print "%-30s :  %s " %(k,v)
      print "</PRE>"
   def error(self,s):
      print " #<B>Error</B>: Function ( %s ) not available\n " %s
      return
All your parameters are available in the parDict dictionary whether they are input from URL, FORM or command line--there is no difference. You can check for their existence in this way:

if parDict['param']:
   print " yes ", parDict['param']
else:
   print " No "
The None object is returned when there is no parameter, i.e., when you try to access an unspecified parameter.
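A plain Python dictionary raises KeyError for a missing key, so parDict has to behave a little differently for the test above to work. The article does not show how this is done; one way to get that behavior, sketched here for Python 3 with the hypothetical class name ParamDict, is a dict subclass that defines __missing__.

```python
class ParamDict(dict):
    """A dict that yields None instead of raising KeyError."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        return None


par_dict = ParamDict(fun="Hello")

if par_dict["param"]:
    print(" yes ", par_dict["param"])
else:
    print(" No ")
```

Note that looking up a missing key this way does not add it to the dictionary.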

The methods inside the class Dispatcher are of two types: those prefixed by the ``cmd_'' string can be called from outside; the others are internal and not visible outside. For example, the error method cannot be called from CGI, but the methods cmd_Hello and cmd_ShowDict can be called. This convention differentiates between methods for internal use (inside the class Dispatcher) and those for external use (by cgimodel/cgidisp).
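The prefix convention also makes it easy to list what can be called from outside. The sketch below is an assumption about how such a list could be built, not the ShowAvailableFunc method from the listing: the method name available is hypothetical, the code is written for Python 3, and the methods return strings instead of printing so that they are easy to test.

```python
class Dispatcher:
    PREFIX = "cmd_"

    def available(self):
        """Return the externally callable command names, without the prefix."""
        return sorted(
            name[len(self.PREFIX):]
            for name in dir(self)
            if name.startswith(self.PREFIX) and callable(getattr(self, name))
        )

    def dispatch(self, command, par_dict):
        # Only methods named cmd_<command> can be reached from outside.
        method = getattr(self, self.PREFIX + command, None)
        if method is None:
            return "Function ( %s ) not available; try: %s" % (
                command, ", ".join(self.available()))
        return method(par_dict)

    def cmd_Hello(self, par_dict):
        return "Hello World !"

    def error(self, message):
        # No prefix, so this method cannot be reached through dispatch.
        return "Error: " + message


d = Dispatcher()
print(d.available())
print(d.dispatch("Hello", {}))
print(d.dispatch("error", {}))
```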

So, add a ``cmd_'' prefix to the methods you want to use with CGI. For example, cmd_TopPage can be called with

cgimodel.py -fun TopPage
on the command line and

cgimodel.py?-fun=TopPage
will be the corresponding URL. The -fun parameter is mandatory; it indicates which function you want to call. Obviously, you can have as many functions as you want, and they are CGI-ready. This is exactly what larger CGI projects require.

A couple of functions come with the module for free. The function DisplayFile displays colorized Python source code on the Web. This one relies on the module py2html.py, available with the standard Python distribution.

cgimodel.py -fun DisplayFile -fileName cgimodel.py
URL equivalent:

cgimodel.py?-fun=DisplayFile&fileName=cgimodel.py
Note the name=value and the & to separate the name,value pairs--the traditional method of specification for CGI.

The method cmd_ShowDict shows all dictionary items in the parDict dictionary and is useful for checking whether you have supplied the correct parameters.

Adding Existing Modules to CGI

Assume you already have this module:

#!/usr/bin/env python
#  testmethod.py
def Method1(name1,name2,name3):
   print name1,name2,name3
if __name__ == '__main__':
   Method1('one','two','three')
Edit the cgidisp.py module, inserting the following method under the class Dispatcher:

def cmd_TestMeth(self,parDict):
   import testmethod 
   name1 = parDict['name1']
   name2 = parDict['name2']
   name3 = parDict['name3']
   testmethod.Method1(name1,name2,name3)
Now it is ready! You can call this on the command line by typing on one line:

cgimodel.py -fun TestMeth -name1 one -name2 two -name3 three
or by URL (all on one line):

cgimodel.py?-fun=TestMeth&name1=one&name2=two&name3=three
or by FORMS:

<FORM METHOD="post" ACTION="/cgi-bin/cgimodel.py">
<INPUT TYPE=hidden name=fun value=TestMeth>
<INPUT TYPE=hidden name=name1 value=one>
<INPUT TYPE=hidden name=name2 value=two>
<INPUT TYPE=hidden name=name3 value=three>
<INPUT TYPE=SUBMIT VALUE="Run">
</FORM>
It would be much better if you could separate HTML text from CGI modules, so that CGI looks thinner and more readable. You can use the template modules (see Resources) to do this. The template module keeps the text away from the CGI and has a page-paragraph structure. Each CGI call can be associated with a page, and each paragraph can be used to set up the view of your HTML page.
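To illustrate the idea of keeping page text out of the CGI code, here is a small sketch. It does not use the template module mentioned above, whose interface is not shown in this article; it substitutes string.Template from the Python 3 standard library, and the names HELLO_PAGE and cmd_hello are hypothetical.

```python
import html
from string import Template

# The page text lives apart from the code. Here it is a constant, but it
# could equally be read from a file.
HELLO_PAGE = Template("<H1>$title</H1>\n<P>Hello, $name!</P>")


def cmd_hello(par_dict):
    """Fill the page template from the request parameters."""
    name = par_dict.get("name") or "World"
    # Escape user input before placing it in HTML.
    return HELLO_PAGE.substitute(title="Greeting", name=html.escape(name))


print(cmd_hello({"name": "Ramu"}))
```

The function now contains only logic, and the look of the page can be changed without touching it.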

cgimodel can host any number of applications, so you no longer need to write a separate CGI front end for each one. Since many applications can be run by a single cgimodel, information particular to each application can be logged for later analysis, to improve server performance, the stability of each application, the quality of service, etc. Currently, this can be done with the log information generated by the web server.

With cgimodel.py, cgidisp.py and possibly the template.py modules, you should find writing and testing CGI programs easier.

Resources

Chenna Ramu (ramu.chenna@embl-heidelberg.de) holds a postgraduate degree in mathematics. He currently works for the European Molecular Biology Laboratory in Heidelberg, Germany, in the area of biocomputing. His interests are the theoretical study of DNA/protein sequences, database development, parsing, compilers, system administration and web technology. He came across Python recently (thanks to Gert Vriend) and found it quite nice for programming.

Christine Gemuend has a degree in computer science. She is interested in parallel computers and database systems, and is working in the area of informatics.