LJ 42: Literate Programming Using Noweb

Listing 1. Noweb Source file

\documentclass[10pt]{article}
\usepackage{noweb}
\noweboptions{smallcode,longchunks}

\begin{document}
\pagestyle{noweb}

@ \paragraph{Introduction}
This is [[autodefs.perl]]\footnote{Copyright 1997, 
Andrew L.  Johnson and Brad C. Johnson,  All 
rights reserved.}, a Perl script to be used as an
[[autodefs]] filter in the [[noweb]] pipeline to
identify and index some common Perl definitions.
Since this file is also meant to show off some of 
the features of [[noweb]] it is purposely verbose
and contorted.

Perl does not require the formal declaration or
typing of variables which makes it difficult to
differentiate between declarations and usages of
variables.  We may however find definitions of
[[sub]]'s and [[package]]'s with little difficulty
and that is the purpose of this module. Before we
begin we need to know some facts about [[noweb]]'s
pipeline structure.\footnote{Noweb's pipeline
structure is described in the \textit{Noweb 
Hackers Guide} which is included in the [[noweb]]
distribution.} Actual code in the pipeline lie 
between lines of the form [[@begin code]] and 
[[@end code]]. In Perl these are easily recognized
by the following regular expressions.
<<Global variables>>=
$begin_code_pat = "^\@begin code";
$end_code_pat   = "^\@end code";
@ %def $begin_code_pat $end_code_pat

@ Within a code block there are many types of
lines. Ones that contain actual code are prefixed by
[[@text]].
<<Global variables>>=
$code_line_pat = "^\@text";
@ %def $code_line_pat

@ If, on a code line inside a code block, we find
something that should be added to the "Defines"
block at the end of the code chunk and appear in 
the index, then we need to add a line to the
pipeline of the form "[[@index defn <ident>]]".
<<Global variables>>=
$index_prefix = "\@index defn";
@ %def $index_prefix

@ \paragraph{autodefs.perl}
Our actual Perl script has the following simple 
shape:
<<autodefs.perl>>=
#!/usr/bin/perl
<<Global variables>>
<<[[process_code_chunk]] subroutine>>
while ( <> ) {
    print $_;
    if (/$begin_code_pat/o) {
        process_code_chunk;
    }
}
@

\paragraph{Processing the code chunk}
To process the code chunk we need to perform a few
housekeeping tasks. First, we only want to 
consider lines that begin with [[$code_line_pat]]
and second, we want to stop when we find a line 
that matches [[$end_code_pat]]. The following loop
will suffice for this purpose.
<<[[process_code_chunk]] subroutine>>=
sub process_code_chunk {
    while ( ($_ = <>) && !/$end_code_pat/o ) {
        print $_;
        if( /$code_line_pat/o ) {
           <<Find and print any definitions>>
        }
    }
# make sure we print the "@end code" line
    print $_; 
}
@
@ When checking for definitions we first strip off
any comments since [[sub]] or [[package]] may
also occur in a comment.  We then build
a list [[@def_list]] which contain all of the
[[sub]] and [[package]] definitions on the line
and print out an [[@index defn]] line for
each.
<<Find and print any definitions>>=
$_ =~ s/#.*//o;
@def_list = (/subs(\w+)/go, /package\s(\w+)/go);
foreach $item (@def_list) {
    print "$index_prefix $item\n";
}
@

\paragraph{Defined Chunks}\par\noindent

\nowebchunks

\paragraph{Index}\par\noindent

\nowebindex
@
\end{document}