How Popular Is Your Page?

Counting hits with CGI: Gene Devereaux explains his portable, multi-user hit-counting code.

by Gene Devereaux

When my service provider brought up an HTML server, I decided to learn something about the Web, and what better way than to build my own Web pages on-line? I started by bringing up a home page at http://www.halcyon.com/gened/gened.html, posting my resume and documenting several projects I was working on.

As time went on, I became curious as to how many people were actually accessing my home page and several friends were asking the same question. I also wanted to start giving something back to the Internet community for all the free software I have used over the years.

In November 1994 I decided to write a small script that would allow me to keep track of roughly how many people accessed my home page. The goal of the project was to teach myself something about writing CGI scripts and to produce a handy little tool I could release as freeware.

My Development Environment

I decided the only way I would be able to experiment with all aspects of CGI would be to have my own server up and running. My service provider uses the CERN server under Ultrix, so I brought the same server up under Linux and configured it to match www.halcyon.com. After reading the available on-line documentation on CGI, I realized I could choose any one of several languages for my counter: shell scripts, C, Perl and Tcl, to name a few. I chose Tcl because I was familiar with the language and its string-handling capabilities seemed to make the counter straightforward to implement.

After I finished the script and copied it to my service provider's system, I realized that there was no Tcl interpreter available on halcyon. I had bought a book on Perl, and I decided this was my chance to use Perl. The Tcl script is available from http://www.halcyon.com/gened/number.tcl.

Design Requirements

I always try to formulate some sort of design requirements before I start coding, so I settled on this simple list:

  1. The counter should be re-entrant so that it would only be necessary to have a single version of the script on the server.
  2. For ease of maintenance, counts would be kept in files in the same directory tree as the user's HTML files.
  3. To allow an unlimited number of counters to be used, the counter file name would be passed in the HTML tagging.
  4. The counter would be displayed as an XBM image because the XBM format is easy to manipulate and the XBM MIME type is recognized by most graphical browsers.

Building the Numbers

Under the X environment there is a tool called bitmap. I brought up a 16x16 bitmap and built a full set of digits from zero to nine. Figure 1 shows bitmap with the digit zero completed. Right below Figure 1 is the output of the bitmap editor for the zero bitmap. I used this information to fill in the digits array $bmv. I then built a bitmap of six 16x16 digits across a page and used its output to construct the $num array. From the $num array it was easy to determine what I needed to do to build the counter image from the $bmv digits.
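To make the byte layout concrete, here is a minimal sketch that decodes one 16x16 XBM glyph into text. XBM stores each pixel row as two bytes, and within a byte the least significant bit is the leftmost pixel. The glyph data below is a hypothetical hollow box, not one of the author's digits, and render_glyph is a name invented for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical 16x16 glyph (a hollow box), NOT the author's digit data.
# 16 rows x 2 bytes per row = 32 bytes, the same size as one @bmv digit.
my @glyph = (
    (0x00, 0x00) x 2,      # two blank rows
    0xfc, 0x3f,            # top edge
    (0x04, 0x20) x 10,     # left and right walls
    0xfc, 0x3f,            # bottom edge
    (0x00, 0x00) x 2,      # two blank rows
);

# Turn the 32 bytes into 16 strings of '#' (set pixel) and '.' (clear pixel).
sub render_glyph {
    my @bytes = @_;
    my @rows;
    for my $r (0 .. 15) {
        # Second byte of the row holds pixels 8..15.
        my $bits = $bytes[2 * $r] | ($bytes[2 * $r + 1] << 8);
        push @rows, join '', map { ($bits >> $_) & 1 ? '#' : '.' } 0 .. 15;
    }
    return @rows;
}

print "$_\n" for render_glyph(@glyph);
```

Running a decoder like this over each 32-byte slice of a digit table is a quick way to check that the data was typed in correctly.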

Support Routines

Perl under UNIX handles multi-user file access through system calls to flock. If the operating system does not support flock, the flock call will produce a fatal error. The subroutine lock ensures that the file opened under the file handle FVAL cannot be accessed by anyone else until the unlock subroutine frees the file.


    #----------------------
    # Define operation constants
    #----------------------
    $LOCK_SH = 1;
    $LOCK_EX = 2;
    $LOCK_NB = 4;
    $LOCK_UN = 8;
    sub lock {
      flock(FVAL, $LOCK_EX);
    }
    sub unlock {
      flock(FVAL, $LOCK_UN);
    }
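As an alternative to hard-coding the numeric constants, the standard Fcntl module exports them by name. The following is only a sketch under that assumption: with_lock is a hypothetical helper, not part of the author's script, and the demo locks a throwaway temporary file rather than a real counter file.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);          # exports LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN
use File::Temp qw(tempfile);

# Hypothetical helper: run $code while holding an exclusive lock on $path.
sub with_lock {
    my ($path, $code) = @_;
    open(my $fh, '+>>', $path) or die "Can't open $path: $!\n";
    flock($fh, LOCK_EX)        or die "Can't lock $path: $!\n";
    my $result = $code->($fh);
    close($fh) or die "Can't close $path: $!\n";   # closing releases the lock
    return $result;
}

# Demo on a throwaway file; a real counter would lock its own data file.
my (undef, $demo_path) = tempfile(UNLINK => 1);
my $answer = with_lock($demo_path, sub { return 6 * 7 });
print "locked section returned $answer\n";
```

Checking the return value of flock means a failed lock stops the script instead of silently letting two requests update the count at once.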

The get_count subroutine opens the counter file, reads the count into the $elv array, increments the count by one and writes the new value back to the counter file. The first four lines of the subroutine open the counter file and lock it so no other user can access the file and corrupt the count.


    sub get_count {
      local($i);
      open(FVAL,"/archive/local$val_path")
           || die "Can't read value file\n";
      &lock();
Now the number is read from the first six lines of the counter file into the $elv array and incremented by one.

      while (<FVAL>) {
        chop;
        # Force to numeric so we can start with an empty file.
        $elv[$. - 1] = $_ + 0;
      }
      close FVAL;
      # Increment the number, carrying one digit at a time.     
      foreach $i (5,4,3,2,1,0) {
        last if (++$elv[$i]  <=  9);
        $elv[$i] = 0;
        if ( $i == 0 ) {
            #wrap after 999999 accesses....
            @elv = (0, 0, 0, 0, 0, 1);
            last;
        }
      }
The new count is now written back to the counter file, the file lock is cleared, and the file is closed. Each $elv digit is then multiplied by 32, which turns it into the index of that digit's first byte in the $bmv array. The $bmv array holds the digits 0 through 9 in XBM format, 32 bytes per digit.

    # The leading ">" opens the counter file for writing.
    open(FVAL,">/archive/local$val_path")
         || die "Can't write to value file\n";
    foreach $i (0..5) {
      print FVAL "$elv[$i]\n";
    }
    &unlock();
    close FVAL;
    foreach $i (0..5) {
      $elv[$i] *= 32;
    }
  }
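The read, increment and write steps can also be sketched with a single open and a single lock held across all three. This is a hedged variant, not the author's code: increment_digits and bump_counter are hypothetical names, the counter file is assumed to exist already (it may be empty), and the digits are returned without the multiplication by 32.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Add one to a six-digit counter held as a list of digits, wrapping
# 999999 around to 000001 as the article's loop does.
sub increment_digits {
    my @d = @_;
    for my $i (reverse 0 .. 5) {
        return @d if ++$d[$i] <= 9;   # no carry needed, done
        $d[$i] = 0;                   # carry into the next digit to the left
    }
    return (0, 0, 0, 0, 0, 1);
}

# Hypothetical variant of get_count: one handle, locked until it is closed.
sub bump_counter {
    my ($path) = @_;
    open(my $fh, '+<', $path) or die "Can't open $path: $!\n";
    flock($fh, LOCK_EX)       or die "Can't lock $path: $!\n";

    my @d = (0) x 6;                  # an empty file counts as 000000
    while (my $line = <$fh>) {
        last if $. > 6;
        $d[$. - 1] = $1 if $line =~ /^(\d)/;
    }

    @d = increment_digits(@d);
    seek($fh, 0, 0)  or die "Can't rewind $path: $!\n";
    truncate($fh, 0) or die "Can't truncate $path: $!\n";
    print {$fh} "$_\n" for @d;
    close($fh) or die "Can't close $path: $!\n";   # also releases the lock
    return @d;
}

print join('', increment_digits(0, 0, 0, 1, 9, 9)), "\n";
```

Keeping the same handle open from the read to the write means the lock is never dropped between the two, so a second request cannot read the old value in the gap.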
The $val_path variable is set to the contents of the environment variable $PATH_INFO. The Web Server sets $PATH_INFO to the path and file name following the CGI script name. In the following example the value contained in $PATH_INFO would be /your_dir/number.val.
<IMG SRC="http://www../htbin/number.xbm/your_dir/number.val">
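Because PATH_INFO comes straight from the request, a script may want to check it before appending it to a directory name. The sketch below shows one possible check; counter_file is a hypothetical helper, and the allowed character set is an assumption chosen for illustration. Only the /archive/local base directory comes from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: map PATH_INFO to a counter file under $base,
# returning undef for anything that looks unsafe.
sub counter_file {
    my ($base, $path_info) = @_;
    return unless defined $path_info;
    # One or more "/name" components made of a conservative character set.
    return unless $path_info =~ m{^(?:/[A-Za-z0-9_.~-]+)+$};
    # Refuse attempts to climb out of the base directory.
    return if grep { $_ eq '..' } split m{/}, $path_info;
    return $base . $path_info;
}

my $file = counter_file('/archive/local', $ENV{PATH_INFO});
print defined $file ? "$file\n" : "rejected\n";
```

With the IMG tag shown above, PATH_INFO would be /your_dir/number.val, which this check accepts and maps to /archive/local/your_dir/number.val.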

After the get_count routine is called, $elv[0..5] holds indexes into the XBM digit table $bmv that reflect the six-digit number in the number.val file. The 16x16 bitmap for each digit is copied, two bytes at a time, to its proper location in the $num array.

The $num array holds six 16x16 XBM digits side by side. Each pixel row consists of two bytes from the first digit, followed by two bytes from the next digit, and so on through all six digits. The digits selected by $elv are copied into $num to form a six-digit image identical to the one the bitmap editor would build.


  $val_path = $ENV{"PATH_INFO"};
  &get_count;
  $i = 0;   # write position in @num for the current pixel row
  $k = 0;   # byte offset of the current pixel row within a digit
  foreach $char (0..15) {
     $j = $k;
     $k += 2;
     foreach $nv (0..5) {
       $num[$i+($nv*2)] = $bmv[$j+$elv[$nv]];
       $j++;
       $num[$i+1+($nv*2)] = $bmv[$j+$elv[$nv]];
       $j--;
     }
     $i += 12;
   }
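The same interleaving can be sketched row by row with labelled placeholder bytes, which makes it easy to see where each source byte ends up. This is an illustration under stated assumptions: build_num is a hypothetical name, the table holds text labels instead of real bitmap bytes, and the digits are passed as plain values 0 to 9 rather than pre-multiplied by 32.

```perl
use strict;
use warnings;

# Placeholder digit table: entry "d4r0b1" stands for digit 4, row 0, byte 1.
# Real data would be the 32 XBM bytes of each digit, in the same order.
my @bmv;
for my $d (0 .. 9) {
    for my $r (0 .. 15) {
        push @bmv, "d${d}r${r}b0", "d${d}r${r}b1";
    }
}

# Build the 192-entry image: for each pixel row, two bytes from each digit.
sub build_num {
    my ($table, @digits) = @_;
    my @num;
    for my $row (0 .. 15) {
        for my $digit (@digits) {
            my $src = 32 * $digit + 2 * $row;   # first byte of this row
            push @num, @{$table}[$src, $src + 1];
        }
    }
    return @num;
}

my @num = build_num(\@bmv, 0, 0, 0, 0, 4, 2);
print join(' ', @num[0 .. 11]), "\n";           # the first pixel row
```

Each pixel row of the finished image is 12 bytes wide (6 digits x 2 bytes), which is why the article's loop advances its write position by 12 per row.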

The output stream is sent so that the content type is followed by a blank line, as specified by CGI scripting rules. These are the first two lines returned to the server. The rest of the script sends a full ASCII representation of an XBM file from $num, just as it would look if it had been built by the bitmap editor.


   # In the script itself, the here-document body and EOF start in column one.
   print <<EOF;
   Content-type: image/x-xbitmap

   #define number_width 96
   #define number_height 16
   static char number_bits[] =  { 
   EOF
   $row = 1;
   foreach $j (0..190) {
     print "$num[$j], ";
     if ($row == 12) {
       print "\n";
       $row = 1;
     } else {
       $row++;
     }
   }
   print "$num[191]\n};\n";
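A compact way to produce the same kind of response is to build the whole text first and print it once. The sketch below assumes the image bytes are held as numbers (the article's $num entries may instead be ready-made strings), and xbm_response is a hypothetical name used only here.

```perl
use strict;
use warnings;

# Hypothetical helper: format a CGI response carrying an XBM image.
sub xbm_response {
    my ($name, $width, $height, @bytes) = @_;
    my @lines;
    # Twelve values per line: one 96-pixel row of the six-digit counter.
    while (my @row = splice(@bytes, 0, 12)) {
        push @lines, '  ' . join(', ', map { sprintf '0x%02x', $_ } @row);
    }
    return "Content-type: image/x-xbitmap\n\n"      # header, then blank line
         . "#define ${name}_width $width\n"
         . "#define ${name}_height $height\n"
         . "static char ${name}_bits[] = {\n"
         . join(",\n", @lines) . "\n};\n";
}

# Demo with an all-zero (blank) 96x16 image.
print xbm_response('number', 96, 16, (0x00) x 192);
```

Building the string with join avoids the special case for the final element, since the separators only ever go between values.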

Conclusions

I have received a number of requests for a count that is not displayed back to the user. All that has to be done to accomplish this is to modify the script to return a blank bitmap file.
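One way this could look is sketched below: the script would still update the counter file, then return a one-pixel image with no bits set. blank_xbm is a hypothetical name, and the 1x1 size is an assumption made for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: a CGI response carrying a 1x1 XBM with its pixel clear.
sub blank_xbm {
    return "Content-type: image/x-xbitmap\n\n"
         . "#define blank_width 1\n"
         . "#define blank_height 1\n"
         . "static char blank_bits[] = { 0x00 };\n";
}

# In the real script the counter file would be updated before this point.
print blank_xbm();
```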

The Tcl version of the counter follows virtually the same design and implementation.

One of the great benefits of putting a freeware package on the net is all the quality improvements that are sent back by other contributors. I have incorporated several improvements over the past few months that have been sent to me by interested users. The day I started this article I received some great improvements from David J. MacKer and I have included them in the latest release. It has just occurred to me that I should put an acknowledgment in the header of all future packages that I write to give credit where credit is due.

Some people do not like this type of script and have pointed out that the counter has no real purpose because the same information is available in the server log files. Furthermore they believe this sort of thing takes up valuable bandwidth and is not really an accurate measure of how many real accesses have happened. I do agree that the count is only a rough figure and that it is not the most efficient way to get the information about accesses, but I have had fun writing the program; I learned what I set out to learn and a lot of people have enjoyed using the script.

Eugene E. (Gene) Devereaux is a Senior Principal Scientist with Boeing's Airplane Systems Laboratory, where he is a software developer and a member of the ASL WEB Advisory Board. He has been working with computer hardware and software for over 30 years, and his current interests are Java and using Web tools to interface with real-time programs and protocol analyzers. He operates his own Linux-based BBS (TechTalk-101 at 206-584-1178, 8-n-1) and maintains his own home page at http://www.halcyon.com/gened/gened.html. He can be reached by email at gened@halcyon.com.