
A Glimpse of Icon

A Language For the Rest of Us

This article gives a quick introduction to the programming language Icon, developed at the University of Arizona.

by Clinton Jeffery and Shamim Mohamed

Linux users are early adopters of new technology, so it's not surprising that many in the Linux community want to use the best programming language for a given application rather than being limited to just one language. The purpose of this article is to tell you about one of the simplest and most powerful programming languages available. It's called Icon, and it is a language for people who love programming. This tutorial is a "teaser" meant to pique your curiosity; the April 1998 issue of Linux Gazette has a longer tutorial that goes into more detail about the features described here.

My Programming Language Can Beat Up Your Programming Language

Languages are the subject of religious wars; very little is gained by arguments "proving" one language is better than another. Icon is not perfect, nor is it the "best" language--but it is a very nice language to use. Icon is for people who don't want to deal with memory management in C or C++; for people who want the power of Perl and beyond, but prefer a cleaner expression syntax and fewer special cases; and for people who have a use for rich data structures and algorithms, but take for granted all the programming building blocks they learned in school. Icon is used for children's games, scripture analysis, CGI scripts, compiler research, literate programming, system administration and visualization. It is in many ways what BASIC should be and what Perl and Java could have been. (If you know a language that allows simpler and more direct solutions to the three short, complete program examples given in this article, please tell us about it.)

Icon: Listing the Basics

Icon's basic philosophy is to make programming easy. Its syntax is similar to C or Pascal; programs are composed of procedures, starting from main. Icon's built-in list and table data types compare well with those of most languages: other languages have similar types, but their operators and semantics are rarely as clean. Both types use familiar subscript notation, hold values of any type and grow or shrink as needed. Lists take the place of arrays, stacks and queues. Tables associate keys of any type with corresponding values. These types are ingeniously implemented; for example, lists are like arrays when you use them like arrays, and like linked lists when you use them like linked lists.
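As a small sketch of the list and table behavior described above (the values and key names are made up for illustration), the same list can be subscripted like an array and used as a stack or queue, and a table can mix key types:

```icon
procedure main()
   L := [10, 20, 30]        # a list literal
   put(L, 40)               # add at the right end: 10 20 30 40
   push(L, 5)               # add at the left end:  5 10 20 30 40
   write(L[1])              # subscripts start at 1; writes 5
   write(pop(L))            # remove from the left end; writes 5
   write(pull(L))           # remove from the right end; writes 40
   write(*L)                # current size; writes 3

   T := table("unknown")    # "unknown" is the default for missing keys
   T["Icon"] := "University of Arizona"
   T[42] := "an integer key"
   write(T["Icon"])         # writes University of Arizona
   write(T["Perl"])         # no such key; writes unknown
end
```

Neither the list nor the table needed a declared size or element type.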

Although Icon has some exotic concepts compared with C or FORTRAN, in several ways Icon programs are more readable, not just shorter. For example, when they succeed (are "true"), the relational operators return the value of the right operand, and they associate left to right, so (12 < x < 20) tests whether x is between 12 and 20.
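A minimal sketch of that range test follows; the values of x are arbitrary. Because a successful comparison yields its right operand, 12 < x produces x, which is then compared with 20:

```icon
procedure main()
   x := 15
   if 12 < x < 20 then
      write(x, " is between 12 and 20")
   write(12 < x < 20)       # the whole comparison yields 20

   x := 25
   if 12 < x < 20 then
      write("in range")
   else
      write(x, " is out of range")   # x < 20 fails, so this branch runs
end
```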

Here is a silly sample program that counts the number of occurrences of each word given on its command line and writes the words out in alphabetical order, along with their corresponding counts. A table is created with all keys mapping to a default value of 0. Then, each argument on the command line is used as a key in the table to increment a counter. The table is sorted, producing a list of two-element lists containing the keys and their values. These pairs are removed from the list one at a time, and the keys and values are written out.

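procedure main(argv)
   T := table(0)              # every key starts with a count of 0
   every T[ !argv ] +:= 1     # count each command-line word
   L := sort(T)               # list of [key, value] pairs, ordered by key
   while pair := pop(L) do
      write(pair[1], ": ", pair[2])
end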

The Joy of Generators

Generators are Icon's unique feature; they are its computer science research contribution. They give the language simpler, more intuitive notation, so they are worth the mental leap. A generator can produce more than one value, and expression evaluation tries each value from a generator until it finds one that makes the enclosing expression succeed and produce a value. For example, (2|3|5|7) is a simple expression that produces the values 2, 3, 5 and 7; so the expression (x = (2|3|5|7)) tests if the value of x is one of those four values.
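The membership test above can be tried in a short program. This is only a sketch, with arbitrary values for x; the last line additionally shows what happens when two generators meet in one expression:

```icon
procedure main()
   x := 5
   if x = (2 | 3 | 5 | 7) then      # tries 2, then 3, then succeeds on 5
      write(x, " is a single-digit prime")

   x := 6
   if not (x = (2 | 3 | 5 | 7)) then   # all four alternatives fail
      write(x, " is not a single-digit prime")

   # every left value is paired with every right value
   every write((1 | 2) + (10 | 20))    # writes 11, 21, 12, 22
end
```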

In the previous program example, the expression !argv generated the elements from the list argv. Expression evaluation tries to obtain a value; the every control structure causes all the values to be produced. This code

every i := (1 to 10) | (20 to 30) do
   write(L[i])

prints the first ten values from the list, followed by elements 20 through 30.
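To make the difference between ordinary evaluation and every concrete, here is a small sketch with a made-up three-element list. Without every, only as many values are produced as the expression needs:

```icon
procedure main()
   L := ["red", "green", "blue"]

   write(!L)              # one value is enough; writes only red
   every write(!L)        # every asks for all values; writes red, green, blue

   every i := (1 | 3) do  # any generator can drive the loop
      write(i, ": ", L[i])
end
```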

Generators are a very natural way to write procedures that compute a sequence of values. In a language like C, the procedure has to maintain its state between calls using static data; in Icon, this is done automatically. Here's one way you might write a web-link checker:

every url := get_url(document) do
   test_url(url)

The procedure get_url scans the document for hyperlinks:

procedure get_url(filename)
   f := open(filename) |
      stop("Couldn't open ", filename)
   while line := read(f) do {
      ...
      url := ...
      suspend url
   }
end

In the above example, get_url is called only once. Each time a suspend occurs, a result is produced for the surrounding expression, and if the surrounding expression fails, the call is resumed where it left off, at the suspend. Generators are the basis for additional powerful language features (see the Linux Gazette article for details).
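Since get_url is only outlined above, here is a complete generator of the same shape that can be run as is. The procedure squares and its limit parameter are hypothetical and exist only for this illustration:

```icon
# Hypothetical generator: produces the perfect squares up to limit.
procedure squares(limit)
   local i
   i := 1
   while i * i <= limit do {
      suspend i * i       # hand one result to the caller, then wait
      i +:= 1             # execution continues here if resumed
   }
end                       # falling off the end means no more results

procedure main()
   every write(squares(50))    # writes 1, 4, 9, 16, 25, 36, 49

   # The comparison fails for 1, 4, 9 and 16, so squares is resumed
   # each time until it produces a value greater than 20.
   write(20 < squares(50))     # writes 25
end
```

The loop counter i survives between results without any static data, which is the point made about C above.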

Graphics and User Interfaces

Icon's built-in graphics have about 40 functions and introduce only one new type, the window, which is a special extension of the file type. This contrasts with graphics APIs in other languages where learning graphics means learning 400 or more functions that manipulate several dozen new types of values. Passing strings and integers into a few functions is all you need to write amazing graphics without excessive code.
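As a rough sketch of that style, the program below opens a window and draws a few shapes using nothing but strings and integers. The window title, the size attribute and the color names are choices made for this illustration, and the program needs an Icon built with graphics support:

```icon
procedure main()
   # "g" opens a graphics window; further arguments are attribute strings
   w := open("sketch", "g", "size=200,150") |
      stop("cannot open a window")

   Fg(w, "red")                       # set the foreground color
   FillRectangle(w, 20, 20, 60, 40)   # x, y, width, height
   Fg(w, "black")
   DrawLine(w, 20, 80, 180, 80)       # from (20,80) to (180,80)
   DrawString(w, 20, 110, "press any key")

   Event(w)                           # wait for a key or mouse event
end
```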

One demonstration of Icon graphics is Brad Myers' "rectangle-follows-mouse" test, a program that opens a window in which a rectangle follows the mouse around on the screen. A window is opened (file mode "g") with an XOR raster drawing operation that causes graphics to erase themselves when redrawn. In the loop, for each user event, the ten-pixel square is erased and redrawn at the new mouse location. &x and &y are Icon keywords that hold the current mouse location; their values are saved in variables x and y. The variables x and y start out as null. The expression \x fails if x is null, so the first call to DrawRectangle is skipped the first time through the loop, when there is not yet a rectangle to erase.

procedure main()
   w := open("win", "g", "drawop=reverse")
   repeat {
      # get mouse/keyboard event
      Event(w)
      # erase old rectangle
      DrawRectangle(w, \x, y, 10, 10)
      # draw new rectangle
      DrawRectangle(w, x := &x, y := &y, 10, 10)
   }
end

Simple graphics programming is easy, but complex graphics are also possible. The Icon Program Library (IPL), a collection of Icon utilities and libraries, offers a more extensive Motif-style user interface toolkit as well as a WYSIWYG (what you see is what you get) interface builder that lets you build interfaces by drawing them. The IPL contains several other examples of graphical games and applications.

POSIX Made Simple

The Unicon flavor of Icon adds an elegant set of UNIX system-level facilities. An ultra-simple version of the ls utility illustrates some of these features. This version takes a directory name on the command line and produces a listing of file information including file size and modified time, sorted by name. (A more interesting version is included in the Linux Gazette article.)

ls reads the directory and performs a stat call on each name it finds. In Icon, opening a directory is exactly the same as opening a file for reading; every read returns one file name.

$include "posix.icn"
procedure main(argv)
   f := open(argv[1]) |
      stop("ls: ", sys_errstr(&errno))
   names := list()
   while name := read(f) do
      push(names, name)
   every name := !sort(names) do {
      # names are relative to the directory, so prefix it for lstat
      p := lstat(argv[1] || "/" || name)
      write(p.size, "\t", ctime(p.mtime)[5:17],
            "\t", name)
   }
end

The lstat function returns a record with all the information that lstat(2) returns. In the Icon version, the mode field is given as a human readable string--not an integer to which you must apply bitwise magic. Also, in Icon, string manipulation is very natural.
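The following sketch shows how the string-valued mode field might be used with ordinary string subscripts. It assumes the mode string is laid out like the first column of ls -l (for example "drwxr-xr-x"); that layout is an assumption of this example, not something stated above:

```icon
$include "posix.icn"
procedure main(argv)
   name := argv[1] | stop("usage: mode filename")
   p := lstat(name) |
      stop("mode: ", sys_errstr(&errno))

   write(p.mode, "\t", p.size, "\t", name)

   # Assumed layout: type character first, then rwx triples.
   if p.mode[1] == "d" then
      write(name, " is a directory")
   if p.mode[3] == "w" then
      write(name, " is writable by its owner")
end
```

No masks or shifts are involved; the tests are plain string comparisons.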

Give Icon a try; whether you're a programmer or not, you'll love it.


Clint Jeffery is an assistant professor in the Division of Computer Science at the University of Texas at San Antonio. He writes and teaches about program monitoring and visualization, programming languages and software engineering. Contact him at jeffery@cs.utsa.edu or read about his research at http://www.cs.utsa.edu/faculty/jeffery.html. He received his Ph.D. from the University of Arizona.

 

Shamim Mohamed met UNIX in 1983 and was introduced to Linux at version 0.99 pl12. These days he is a Silicon Valley polymath and factotum, and an instrument-rated pilot flying taildraggers. He can be reached at spm@drones.com or http://www.drones.com/. He received his Ph.D. from the University of Arizona.