No one expects the Spanish Inquisition, but everyone expects Java--an overview of JPython, an elegant scripting solution for Java systems.
by David Ascher
Java has been misrepresented. Overoptimistically, Sun's marketing department presents Java as revolutionary, because a person can write code on one operating system and have it run on many others. Performance fanatics ridicule Java, because no Java program can match the speed of optimized C code. Both of those points of view are flawed--the marketers are wrong to claim that a revolutionary level of ``write once, run anywhere'' has been achieved, and the speed freaks ignore the fact that Java's performance is ``good enough'' for many tasks. In my view, Java is a well-designed, modern systems programming language which did not try to integrate well with others or be a performance leader. It's a fine language for writing medium to large, fairly static systems, especially if interoperability with Java libraries is desired and interoperability with non-Java ``legacy'' code is not needed. Under those conditions, all that's missing is a good scripting solution which allows the users to control the applications. And that's just what Jim Hugunin did when he rewrote the Python interpreter in Java and called it JPython.
That's right: JPython is a completely new implementation in Java of what Guido van Rossum did in C--7 years later and using a very different language. It's the same thing, and yet altogether different. I will describe what JPython is, how it works, and what it will let you do which ``the original Python'' can't. If you have any need for a scripting language which can interact with Java libraries such as the Swing GUI toolkit, you may find that JPython fills a significant gap. But first, a little history.
In 1990, Guido van Rossum chose a portable, well-established language to write his first version of Python. That language was C. Guido cares deeply about portability, and as a result, Python runs on just about anything with a C compiler. (I should know--I claim to be the first person to have run Python on both a Windows CE handheld and a Cray supercomputer.) History validated this choice; while C++ was available at the time, it was still immature. By 1997, Python had achieved all the portability Guido thought was reasonable--and then Java came along and upset everything. For a variety of reasons, the Java platform was designed to operate in a language vacuum--as a result, the interfaces between Java and other languages such as C and C++ are cumbersome. Furthermore, any Java system made to talk to ``native'' code using the Java Native Interface (JNI) loses several important features of a pure Java program, such as portability and security guarantees. Thus, Python couldn't, as written, run on the ``Java Platform''.
There were three possible ways of solving that problem. The first, to have Python talk to Java via the JNI, was attempted by a few teams, but didn't provide a complete solution and did not allow the development of a 100% Java solution. A second alternative considered was to convert Python virtual machine bytecodes to Java virtual machine bytecodes. While this sounds simple in theory, it wasn't practical. Python's bytecodes are very high level and manipulate complex objects, while Java's bytecodes are more akin to assembly language. The third approach was the ``right'' one, but it was also the most ambitious: to convert Python source files directly to Java bytecodes. Because of Python's dynamic nature, this meant re-implementing the Python parser and interpreter in Java. Luckily for us, Jim Hugunin, a longtime fan of Python and the main author of the Numeric Python extensions, undertook that project.
Soon, Jim Hugunin was so engrossed with JPython that he quit his Ph.D. work at MIT and went to work full-time for the Corporation for National Research Initiatives. There he joined Guido and other Python hackers to finish JPython. Jim released JPython 1.0 in July 1998 and started work on JPython 1.1, a major revision. Jim's success with JPython led to an offer he couldn't resist from Xerox PARC (the folks who invented the laser printer, the mouse, the desktop metaphor, Ethernet, etc.), and in mid-1999, Jim was off to Palo Alto to work on other Java-related projects. Luckily for the rest of us, Barry Warsaw, another of Guido's esteemed colleagues, took over the maintenance and development of JPython. As of this writing, Barry remains the current ``flag-bearer'' for JPython. If you're itching to install JPython at this point, see the ``Installation Issues'' sidebar.
JPython is a Python interpreter and compiler written in and for Java. The fact that JPython is an interpreter means you can type jpython at a command-line shell and get the familiar >>> prompt. As we will see, most Python code written for CPython (what the original Python implementation is called when contrasted with JPython) runs fine in JPython. The fact that JPython is a Python compiler for Java means you can take a Python program and compile it to Java .class files, which in turn can be used by other Java programs in myriad ways. Finally, the fact that it is written in and for Java means JPython allows the Python programmer absolutely seamless access to any and all Java libraries, and that is, as we'll see, its ``killer feature''.
As a teaser, consider the program calc.py, shown in Listing 1.
A few things are worth noting about this program, which I adapted from one written by Guido van Rossum. If you're a Java programmer, you may notice that calc.py is a useful program which doesn't define a single class. This is one of the ways in which Python (here JPython) provides a simple scripting solution, whereas Java requires the programmer to know about classes, inheritance, public vs. private members, etc. The second aspect of calc.py, which Java programmers might notice, is that the code is more compact than the equivalent Java code, and yet its purpose remains clear. This is one of Python's most valued strengths, and JPython builds on it in several ways. Python programmers will probably first notice they can read the code and it ``feels'' like good old Python; and second, the parts of the code they don't immediately understand refer to objects like JPanel, BorderLayout, JTextField, etc. These are defined in Java's Swing GUI. We won't describe Swing in detail here, but see the Swing sidebar for some discussion on what Python programmers should know about Swing.
The average JPython user won't care one iota how it works, as long as it does. If you're curious, however, knowing how a tool works is a necessary step toward knowing how to use it well. At the highest level of analysis, JPython takes Python scripts (.py files) and compiles them to Java Virtual Machine (JVM) bytecodes, which in turn are executed by the Java Runtime Environment (JRE). Using a program called jpythonc, one can even take Python scripts and compile them to .class files, which are then usable as beans, applets or servlets, or in any other system where compiled Java code can be used.
A more useful description of how JPython works is based on the object model it uses. In CPython, Python objects are divided into two kinds of types: built-in types, such as integers, strings, file objects, etc., and instance and class objects. Furthermore, every instance is linked to a class object via a __class__ attribute, and class objects can be linked to base classes via the __bases__ tuple attribute.
In JPython, all Python objects are instances of a Java class. Thus, the number 42 is an instance of the Java class PyInteger, which is accessible through its __class__ attribute. Furthermore, all objects which the JPython user can manipulate are instances of a class which derives from a common base class, PyObject. If these relationships aren't clear, don't worry; the details are almost never needed. What's important is what this design means: all Python objects (including number, string, instance and class objects) are Java objects. Two consequences follow. First, any Java code which can process a Java Object can process a JPython object. Second, all memory management in JPython is done by the normal Java mechanisms, i.e., by the garbage collector implemented in the runtime environment.
What does this mean for the Python user? It means that, unlike in CPython, you don't have to worry about creating cyclic references which CPython's reference-counting garbage collector won't reclaim--in Java, those will be reclaimed. That's the good news. The bad news is the programmer no longer has complete control over when garbage collection occurs. For example, code which relies on file objects being closed when the references to them go out of scope, such as
data = '' for filename in glob.glob('*.txt'): data = data + open(filename).read()might have problems, depending on when the JVM decides to garbage-collect the file objects created by the open calls. To be conservative, the code above should be written as
data = '' for filename in glob.glob('*.txt'): file = open(filename) data = data + file.read() file.close()The final consequence of JPython's delegating memory management to Java is that __del__ methods in user-defined classes are never called. There is no theoretical reason why they couldn't be enabled in the future, but current JVMs take a massive performance hit if the finalize methods perform nontrivial tasks, which would be required for full __del__-method support. Given that the Java spec doesn't guarantee that finalize methods are ever called, relying on __del__ methods would probably not be a good idea anyway.
Let's analyze the first few lines of another sample program, to explore how Python allows clean interaction with Java libraries, as shown in Figure 2.
In line 1, we see what looks like a standard Python import statement. In fact, what goes on behind the scenes is quite remarkable, as java.awt is neither a .py file nor a compiled extension. In fact, JPython is making a runtime Python module out of a Java package. When JPython tries to import a module and can't find a Python module with the specified name (in this case, java), it looks in CLASSPATH for a .class or .jar file, and then uses Java's introspection mechanism (the reflection API) to figure out which subpackages and classes are defined in the file. Using similar mechanisms, JPython finds out that the awt subpackage in the java package defines a class BorderLayout. That class is then returned to the Python code wrapped in a Python object.
In line 2, we import the swing module from the pawt package. pawt stands for Python AWT; AWT stands for Abstract Windowing Toolkit, which is the official name for the Java GUI. Swing has been packaged in various locations in the package hierarchy through the years, and pawt does the work of figuring out which is the case in your installation.
Lines 3 through 5 simply define a function, which ignores its sole argument and quits Python.
Line 6 creates a JFrame instance with the title ``LinuxJournal Example'' and a keyword argument visible with a value of 1. What is hidden is the fact that JPython figured out many things on the fly; specifically, it found out from Java (using the reflection API) that JFrame, defined in the Swing package, is a class in which the constructor takes at most one argument. That argument must be a Java String, so JPython converts the Python string into a Java String automatically. The keyword argument visible is not part of the constructor signature for JFrames (Java doesn't support keyword arguments), but JPython examines JFrame's complete signature and finds that one of JFrame's base classes, JComponent, defines two methods: setVisible and isVisible. Based on that information, JPython infers that ``visible'' is a property of JFrames. (Properties are also obtained by looking at BeanInfo files, for those who use them.) Once a property has been identified, JPython programmers can set and get them with the standard object.attribute notation, or set them with keyword arguments in constructor calls. In other words, the line
frame = JFrame('LinuxJournal Example', visible=1)could have been written
frame = JFrame('LinuxJournal Example') frame.visible = 1or, in Java style:
frame = JFrame('LinuxJournal Example') frame.setVisible(1)Line 7 is similar to line 6, but JPython had to work a little harder to determine which of the four signatures for JButton's constructor had to be used. (JButton can also be called with an Icon, a String and an Icon, or nothing at all.) Also, the actionPerformed keyword argument is even more sophisticated than the property setting just described. The best way to explain this feature is to pretend it didn't exist. We can write
def exit(event): sys.exit(0) button = JButton('Close Me!', actionPerformed=exit)instead of
class action(awt.event.ActionListener): def actionPerformed(self,event): sys.exit(0) button = JButton("Close Me!") button.addActionListener(action())In other words, in standard Java, the event handler for a widget (e.g., determining what should happen when a button is clicked) is specified by passing an instance of a class which defines an actionPerformed method. This fairly cumbersome mechanism is necessary because Java doesn't allow passing function objects. JPython allows use of the simpler Python idiom of setting a callback attribute with a function object, and does the conversion on the fly. The rest of the program is fairly simple, and cheats only a little in that JTree, when called with no arguments, builds a sample tree automatically (see Demo/swing/treedemo.py in the JPython distribution for an example of how to build a tree from scratch).
JPython can perform several other magic tricks behind the scenes, allowing you to write Python code while talking to Java libraries. All of these are documented on the jpython.org web site. They include automatic type conversion, subclassing from Java classes in Python and vice versa, dealing with Java interfaces, dealing with Java arrays (which are different from Python lists) and dealing with Unicode. You'll also need to go to the web site to learn about jpythonc, the JPython compiler, which compiles Python code to .class files.
There are two more benefits which JPython brings, not to the Java world, but to the Python world. First, because it is written in an object-oriented language, modifying JPython is easier than modifying CPython. For example, Barry Warsaw added string methods to JPython in a few hours, but it took him several days to do the same for CPython. Thus, JPython provides a good testbed for playing with language features which can be evaluated in a working implementation before deciding whether to adopt them in the language specification. The experience gained with JPython's more unified type model will probably influence future development of the Python type system.
In addition, JPython helped the Python community simply by being the first complete ``second implementation'' of the language. By its very existence, JPython forced Guido to decide what aspects of CPython were language features, what aspects were implementation features, and what aspects were, if not bugs, ill-chosen features.
JPython's biggest weakness at this point is probably its relative slowness. The slowness is not too surprising, given the slowness of the underlying Java platform; still, it is a noticeable change from CPython's speed. If speed is a concern, then experimenting with some Just-In-Time compilers (JITs) can provide a significant boost. If speed is not a concern and you're willing to help stamp out the remaining bugs, JPython is a very elegant, full-featured scripting solution for Java systems.
David Ascher, Ph.D., is a Senior Developer at ActiveState Tool Corporation. He has taught Python and JPython courses over the last few years and is co-author with Mark Lutz of Learning Python (O'Reilly & Associates). His work with Python since 1995 has spawned scientific code, open-source projects, commercial applications, and a few zealots along the way. If you have good tips, however, you can reach him at DavidA@ActiveState.com.