A look inside RPM packages and how to use Java to extract information.
by Jean-Yves Mengant
Installing a Linux system for the first time is quite straightforward. You can find good and cheap Linux packages on the market that will quickly install themselves. You need only choose some configuration options describing the type of hardware on which you want to install, and that's it.
As time goes on, you will add some new components to your Linux system, and that's where the nightmare may begin. A Linux system is composed of hundreds of components and dynamic libraries. In order to keep your system up and running, you should be careful, since installing a new version of a component may introduce incompatibilities in your system, making it more unstable as time goes on.
Replacing an existing component with a new version at first looks like a trivial task: you just need to pick up a new compiled version of the component, generally available on the Net in a .tgz file (tar + gzip compression), and install it on your system. However, some dynamic libraries of the component you just upgraded may already be used by other installed components, and those components might not be compatible with the new version you just installed.
It would be great to have a tool which could report the dependencies of each component installed on your system. Such a tool could tell you the version of Samba (for instance) installed on your system or that you can't install egcs-1.0.2-8 on your system prior to having binutils 2.9 up and running. This tool already exists on Linux. It is called RPM, and it ships with many Linux distributions, including Red Hat, Caldera, SuSE and Linux Mandrake.
RPM stands for Red Hat Package Manager and is described by its creator as ``an open packaging system available for anyone to use and works on Red Hat Linux as well as other Linux and UNIX systems'' (from the Red Hat installation guide).
Before starting the programming discussion, I will introduce the RPM package manager and give a general overview of it. If you need more information, see Resources for more than 400 pages of interesting details on RPM's history, design, usage and programming.
The diagram in Figure 1 represents the different components involved in the RPM tool. RPM is composed of three main parts:
Let's say you are interested in installing version 1.9.18 of Samba. First, you should look on the Net for an RPM of the Samba package (instead of a tar, gzip package). Once you have it, type:
rpm -Uvh samba-1.9.18p8-50.1.i386.rpm
This command will install (or upgrade) a copy of Samba on your system. It will also check that all dependencies needed by this version of Samba are present on your system. If the rpm command completes with no error messages, you're guaranteed the installed package will be ready to run without trouble at the end of the installation process.
This installation process will also update the RPM database which keeps track of all installed packages on your system and all their dependencies.
So if, six months later, you want to find out which version of Samba is installed on your system, typing the following command:
rpm -q samba
will tell you
samba-1.9.18p8-50.1
If you want to remove a package from your system, the RPM utilities will remove the files which were installed on your system during installation.
Now we know RPM packages are interesting. Many of them are available on the Net today, and a portable tool able to analyze an RPM package before installing it could be a useful utility.
I think there are only two possibilities if you want to be portable to multiple UNIX and non-UNIX systems and easy to use in the Internet context: Perl or Java. From a technical point of view, there is no reason to prefer one over the other. The choice is a personal decision.
I have more experience programming Java than Perl. After a long and difficult thought process, I decided to start in Java, reasoning that if I later needed to add graphical presentation classes to the component, I could use the Java Swing package (which is bundled with JDK 1.2 and available as a separate download for JDK 1.1).
If you look at the /usr/lib directory of a Red Hat distribution, you will find a librpm.a static archive library. This library is provided with its corresponding C language prototypes: rpmlib.h, header.h and dbindex.h, located in /usr/include/rpm.
You can use those prototypes if you need to develop C utilities which deal with RPM resources. Chapter 21 of E. C. Bailey's book (see Resources) provides detailed information on how to do this. But, since we want to provide an independent Java package, these prototypes are of no interest to us.
The right place to start (in the same resource) is Appendix A: Format of the RPM file, which describes the RPM file format. The same appendix also provides us with the following sage advice: ``RPM file format is subject to change.''
If an RPM file is to be manipulated, you are strongly urged to use the RPM routines to access the package file. Why? ``RPM file format is subject to change''!
In our case, we will assume there is no immediate danger in querying an existing RPM package, since we commit to never modifying its structure inside our Java package.
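As an illustration of this kind of read-only query, here is a minimal sketch that decodes the 96-byte lead found at the very start of a package file, following the layout given in Appendix A of the same resource (magic number, format version, package type, architecture, name, OS and signature type, all in network byte order). The class name RpmLeadSketch and its fields are hypothetical and are not the RpmFile class discussed in this article; the sketch also relies on library classes from JDKs newer than the ones mentioned here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical read-only view of an RPM lead; not the article's RpmFile class. */
final class RpmLeadSketch {
    static final int LEAD_SIZE = 96;          // total size of the lead in bytes
    static final int NAME_SIZE = 66;          // NUL-padded package name field
    static final int LEAD_MAGIC = 0xEDABEEDB; // first four bytes of an RPM file

    final int major;         // file format major version
    final int minor;         // file format minor version
    final int type;          // 0 = binary package, 1 = source package
    final int archNum;       // architecture code
    final String name;       // package name string stored in the lead
    final int osNum;         // operating system code
    final int signatureType; // kind of signature section that follows

    private RpmLeadSketch(int major, int minor, int type, int archNum,
                          String name, int osNum, int signatureType) {
        this.major = major;
        this.minor = minor;
        this.type = type;
        this.archNum = archNum;
        this.name = name;
        this.osNum = osNum;
        this.signatureType = signatureType;
    }

    /** Consumes exactly LEAD_SIZE bytes from the stream and decodes them. */
    static RpmLeadSketch read(InputStream in) throws IOException {
        byte[] raw = new byte[LEAD_SIZE];
        new DataInputStream(in).readFully(raw); // EOFException if the input is too short

        // DataInputStream decodes big-endian values, matching the on-disk order.
        DataInputStream fields = new DataInputStream(new ByteArrayInputStream(raw));
        if (fields.readInt() != LEAD_MAGIC) {
            throw new IOException("not an RPM package: bad lead magic");
        }
        int major = fields.readUnsignedByte();
        int minor = fields.readUnsignedByte();
        int type = fields.readUnsignedShort();
        int archNum = fields.readUnsignedShort();
        byte[] rawName = new byte[NAME_SIZE];
        fields.readFully(rawName);
        int osNum = fields.readUnsignedShort();
        int signatureType = fields.readUnsignedShort();
        // The last 16 bytes of the lead are reserved and ignored here.

        // The name is a C string: keep only the bytes before the first NUL.
        int length = 0;
        while (length < NAME_SIZE && rawName[length] != 0) {
            length++;
        }
        String name = new String(rawName, 0, length, StandardCharsets.ISO_8859_1);
        return new RpmLeadSketch(major, minor, type, archNum, name, osNum, signatureType);
    }
}
```

Because the sketch only reads from the stream and never writes, it stays within the "query only" assumption made above. A caller would typically pass it a FileInputStream or the stream of an opened URL.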
Figure 2. Structure of the Java RPM Classes Design in UML Format
Figure 2 represents the structure of the Java RPM class designs in UML format (Unified Modeling Language). Let's explain it in more detail. The UML class design provides a clean high-level representation of what an RPM package is.
The header carries the interesting information on the package and its installation rules. The content itself (not represented in the UML picture, for clarity) is only a compressed archive. When uncompressed, it is a cpio archive in SVR4 format with a CRC checksum (see Resources).
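The section of an RPM file that carries the package information is a small tagged index followed by a data store. The following is a rough sketch of reading the string-valued entries of such a section, assuming the stream is already positioned at the start of the header structure (in a real file, that means skipping the lead and the signature section first, which is not shown here). The class RpmHeaderSketch and its method are hypothetical; the tag numbers 1000 to 1002 (name, version, release) and the string type code 6 come from rpmlib.h rather than from this article, and the size limit is an arbitrary guard.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical reader for the string entries of an RPM header structure. */
final class RpmHeaderSketch {
    static final int HEADER_MAGIC = 0x8EADE801; // 3 magic bytes + 1 version byte
    static final int TYPE_STRING = 6;           // NUL-terminated string entry
    static final int TAG_NAME = 1000;
    static final int TAG_VERSION = 1001;
    static final int TAG_RELEASE = 1002;
    private static final int SANITY_LIMIT = 1 << 24; // arbitrary guard, not from the format

    /** Returns tag number -> value for every entry of string type. */
    static Map<Integer, String> readStringTags(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != HEADER_MAGIC) {
            throw new IOException("bad header magic");
        }
        data.readInt();                   // four reserved bytes
        int entryCount = data.readInt();  // number of 16-byte index entries
        int storeSize = data.readInt();   // size of the data store in bytes
        if (entryCount < 0 || entryCount > SANITY_LIMIT
                || storeSize < 0 || storeSize > SANITY_LIMIT) {
            throw new IOException("implausible header sizes");
        }

        int[] tags = new int[entryCount];
        int[] types = new int[entryCount];
        int[] offsets = new int[entryCount];
        for (int i = 0; i < entryCount; i++) {
            tags[i] = data.readInt();
            types[i] = data.readInt();
            offsets[i] = data.readInt();
            data.readInt();               // item count, unused for single strings
        }

        byte[] store = new byte[storeSize];
        data.readFully(store);

        Map<Integer, String> result = new LinkedHashMap<>();
        for (int i = 0; i < entryCount; i++) {
            if (types[i] == TYPE_STRING) {
                result.put(tags[i], cString(store, offsets[i]));
            }
        }
        return result;
    }

    /** Decodes the NUL-terminated string starting at the given store offset. */
    private static String cString(byte[] store, int start) throws IOException {
        if (start < 0 || start >= store.length) {
            throw new IOException("entry offset outside the data store");
        }
        int end = start;
        while (end < store.length && store[end] != 0) {
            end++;
        }
        return new String(store, start, end - start, StandardCharsets.ISO_8859_1);
    }
}
```

With such a map in hand, looking up RpmHeaderSketch.TAG_NAME or TAG_VERSION gives the same kind of answer as rpm -q, but without needing an installed RPM database.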
I cleanly separate the RPM object from its graphical representation. The classes in Figure 2 implement only operations on RPM files; they don't provide any graphical representation of them. Another class, called RpmFilePanel, will be added to provide a simple Swing display, which will graphically manipulate the basic RpmFile class, designed to implement the behavior of an RPM file.
The first interesting class is the RpmException class. This class inherits from the basic Java Exception class and implements a default constructor with no parameters and a constructor which takes a String message parameter. This class is the only exception thrown by the RpmFile Java package. I am convinced that, when writing a new Java package, the first thing you should do is build an exception wrapper for the package. Later on, all the classes of the RpmFile package will throw an RpmException with an accurate message when something goes wrong. From an object-oriented design point of view, this technique improves your design's robustness by isolating callers of your package from the exceptions of the underlying system layer. You can, of course, do the same thing in C++. The only problem is that some C++ compiler implementations may not support exceptions, which makes portable C++ code more difficult to write.
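A minimal sketch of such an exception wrapper could look like the following. The RpmException class mirrors the description above (it extends Exception and offers the two constructors), but it is a reconstruction and not the article's actual source. The RpmExceptionDemo class and its firstByte method are hypothetical, added only to show a lower-level IOException being translated into the package's own exception.

```java
import java.io.IOException;
import java.io.InputStream;

/**
 * Sketch of the package-level exception: it extends java.lang.Exception
 * and provides a default constructor and a String-message constructor.
 * (Declared without "public" only so the sketch fits in a single file.)
 */
class RpmException extends Exception {
    public RpmException() {
        super();
    }

    public RpmException(String message) {
        super(message);
    }
}

/** Hypothetical caller showing how system-level failures get wrapped. */
final class RpmExceptionDemo {
    /** Reads the first byte of a stream, reporting any failure as RpmException. */
    static int firstByte(InputStream in) throws RpmException {
        try {
            int value = in.read();
            if (value < 0) {
                throw new RpmException("empty RPM stream");
            }
            return value;
        } catch (IOException e) {
            // Callers of the package only ever see RpmException.
            throw new RpmException("cannot read RPM stream: " + e.getMessage());
        }
    }
}
```

Code that uses the package then needs a single catch clause for RpmException, whatever went wrong underneath.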
The next public Java class is the RpmFile class itself. Its constructor takes no parameters and simply builds a class instance. The public methods made available by the RpmFile class implement the following basic services:
Now that our basic class structure, which is able to manipulate headers of RPM packages, is in place, it would be a good idea to add a minimal GUI to it. We will build this very simple interface using the Java Swing package, which is bundled with JDK 1.2 and available as a separate download for JDK 1.1.
This very simple GUI is managed by the RpmFilePanel class. This class extends the basic JPanel graphical component and divides the screen into three parts using a Java BorderLayout:
The RpmFilePanel.java file contains a static main method to launch the Swing frame inside a Java application. You can use the JDK java launcher from the shell to start it:
java rpm.RpmFilePanel
Warning: you should have compiled the Java package and installed it in a location accessible from your CLASSPATH.
You should provide a valid RPM package file in URL form and type it into the URL field as shown below. Finally, pushing on the Load button will:
To build the provided source package, you may use either JDK 1.1 or JDK 1.2.2. If you use JDK 1.1, you must first download the Swing package from Sun's site.
Using Java to analyse RPM file formats may be useful, and it gives you cross-platform portability, which may help you display RPM file contents on heterogeneous platforms. This basic tool may easily be enhanced, for instance to convert RPM contents into other formats or extract other parts of an RPM package.
Jean-Yves Mengant (jymengant@ifrance.com) is Chief Architect at SEFAS Technologies in Paris, France. He has been a professional programmer for more than 20 years on multiple OSes (MVS, Linux), languages (assemblers, C, C++, Java and Perl) and software topics (Network, Tp Monitors and GUI interfaces). He has used Linux systems professionally for programming tasks since 1995 and kernel 1.2, and the Java language since 1996 and JDK 1.0.