Can't find a system that has everything you want? Build your own.
by He Zhu
We have seen various Linux distributions, and yet many others continue to appear. Some are as small as DLX, which fits on a single floppy; others are as big as Red Hat 6.2, which fills five CDs. Things seem to become more complex and harder to manage as systems grow. How is a Linux system put together from pieces of free code? How can we assemble and customize our own system for a particular purpose? It seems to be a hard task.
However, from the view of the base system, in principle, all distributions are assembled in a similar way. The difference is that big ones are armed with more packages and more fancy stuff targeted at more general audiences, and small ones have fewer goodies and are aimed at relatively narrow and specific user groups.
Fully featured Linux distributions are usually unnecessarily big for specialized situations. For example, embedded applications need a slim base for their particular situations, and there are already small Linux distributions available for these purposes. However, because there are so many factors to consider, no one can claim that their distribution is comprehensive and satisfies all customers.
Usually an application needs a customized base system to work efficiently. You can pay for a solution from many Linux service providers, or you can do it yourself. Sometimes knowing how to build the base system on your own is more beneficial. With such skills and knowledge, engineers can easily control and improve the system's behavior for the needs of their customers. DIY (Do-It-Yourself) is not just fun, but strategically important in some cases, and more people are recognizing the value Linux offers. By doing so, you acquire the ability to customize not only the Linux kernel but also all other components of your system to achieve the optimizations best suited for your requirements.
This article aims to show readers that building your own base Linux system is not a daunting task. It shares our experience and gives a brief introduction to building and customizing a Linux system. The result is a base system: small, clean and ready to go. We try to make the complex simple, without losing generality and effectiveness. We show how to make the building steps as easy as 1-2-3, and how to customize the system to be minimal enough to fit on one floppy. Even so, it should be good enough to serve as a starting point for a base system that runs any typical application. This is possible because we build the system directly from unchanged sources, which allows us to always use the latest stable versions and to make all required kernel services available.
A system is simply defined as a combination of inherently connected parts. A Linux system (only considering software in this article) is a combination of a Linux kernel and other components which make the kernel useful. All software components except the kernel need to be resident in a root file system (there may be a few other file systems, but they must be mounted under the root tree to become visible). So, technically, we can simply consider a Linux system as a combination of a kernel and a root file system. All Linux distributions are arranged in this way. For example, a fully installed Linux system is a kernel plus a big root file system. A Linux installation disk and a rescue system are used to install a full system and to repair problem systems, respectively. They are also organized in the same way; that is, consisting of a kernel and an initial root file system, but the initial root file system is small and only holds a few basic components necessary to do limited jobs (see Figure 1). The Linux kernel has been coded to take special measures to locate the root file system, either from a normal file system or from a compressed image of an initial root file system.
Given an application, we want to run it above the kernel using shared libraries on a box. Assume the application doesn't require any rare features which Linux doesn't provide at the moment. We want a base system that can run this application and provide some basic control and management as well. A typical case of the base system is shown in Figure 2. Note that this figure is not complete: a utility may be statically linked, in which case it doesn't require a shared library, or it may be in a.out format, in which case it doesn't use a dynamic loader.
If the application is self-sufficient, that is, statically linked with everything required at runtime, it can run right over the kernel without any support from shared libraries. The base system in this case may mean only the Linux kernel itself. However, almost all systems need support from one or more utilities to manage things like file operations and system monitoring, using commands like mount and ps. We consider a base system to be a combination of the kernel, the dynamic loader, a set of libraries and a set of utilities.
As with many other systems, our goal is to show how to create the system on a floppy that holds both the kernel and a compressed image of an initial root file system, as shown in Figure 3. This compressed initial root file system will be uncompressed by the kernel and put into a ramdisk, that is, a space of RAM set aside to hold a small Linux file system. Creating such a base system is often straightforward, but rather tedious for most people. We simplify the procedure. The idea is to design a well-organized hierarchy of makefiles which will extract the sources, compile them, and set up the contents of the initial ramdisk, then prepare and pack the whole system.
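As a rough sketch of the floppy layout described above, the following shell functions budget the space on a 1440KB floppy. The names blocks_for and plan_floppy are our own illustrations, not part of any tool. We assume the kernel image sits at the start of the floppy with the compressed root image directly after it, and that the kernel's ramdisk word follows the convention documented in the Bootdisk-HOWTO: the offset of the root image in 1KB blocks, plus 16384 to request that a ramdisk be loaded.

```shell
#!/bin/sh
# Sketch only: plan the layout of a 1440KB boot/root floppy.
# Assumptions (not from the article): the kernel starts at block 0, the
# compressed root image follows it directly, and the ramdisk word is
# "offset in 1KB blocks + 16384" as in the Bootdisk-HOWTO.

FLOPPY_BLOCKS=1440   # floppy capacity in 1KB blocks

# Round a byte count up to whole 1KB blocks.
blocks_for() {
    echo $(( ($1 + 1023) / 1024 ))
}

# Usage: plan_floppy KERNEL_BYTES ROOTFS_GZ_BYTES
# Prints the root image offset, the ramdisk word and the blocks used.
# Returns non-zero if the two images do not fit on the floppy.
plan_floppy() {
    kernel_blocks=$(blocks_for "$1")
    rootfs_blocks=$(blocks_for "$2")
    used=$(( kernel_blocks + rootfs_blocks ))
    word=$(( 16384 + kernel_blocks ))
    echo "offset=$kernel_blocks word=$word used=$used"
    [ "$used" -le "$FLOPPY_BLOCKS" ]
}
```

For example, plan_floppy 409600 900000 reports offset=400 word=16784 used=1279. Real sizes could be taken with wc -c from the kernel image and the compressed root image.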
Customization of the above base system depends on the requirements of the application that will run on the box. Choose a configuration for the kernel, a dynamic loader, a set of libraries that are necessary for the application and utilities, and a set of basic utilities which are required to control and manage the system. Then, compile all these things in a consistent environment. After that, package the results and make it bootable. If something is missing, you are free to add it to the list, and make it again.
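One way to choose the set of libraries, sketched here under our own assumptions, is to ask ldd which shared objects each selected executable needs and to collect the unique paths. The function names ldd_paths and needed_libs are hypothetical helpers. The parsing assumes the usual ldd output form, "name => /path (address)" or "/path (address)".

```shell
#!/bin/sh
# Sketch only: collect the shared libraries needed by a set of executables.
# ldd_paths and needed_libs are illustrative names, not standard tools.

# Read ldd output on stdin and print the unique absolute library paths.
ldd_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3; next }
         $1 ~ /^\//               { print $1 }' | LC_ALL=C sort -u
}

# Usage: needed_libs FILE...
# Statically linked files produce no lines and are simply skipped.
needed_libs() {
    for f in "$@"; do
        ldd "$f" 2>/dev/null
    done | ldd_paths
}
```

Running needed_libs on the utilities and the application chosen for the base system gives a first list of libraries to copy into the root file system, along with the dynamic loader itself.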
A curious reader might ask why we do this and what it is good for. Like many others, we want to run some complex software on a box, something that can be generalized as a multitasking application on a typical PC-like machine without a hard disk or monitor. We need an OS kernel and some other elements as the base experimental platform. This platform should be robust, maintainable and customizable. Writing a good OS kernel for this purpose is too daunting for many. Thanks to Linux and the Open Source community, we now have an excellent option.
Basic materials are ready and available for free. Now it is time to pick up pieces we need, assemble our own engine and control it. Then, it is time to enjoy.
Before we start, we need to know the answers to some key questions: How do we compile the kernel? How do we compile a shared library? How do we create an initial root file system? How do we put the kernel image and compressed file system onto a floppy or EPROM? How do we run an application using shared libraries? How do we debug? There are many questions like these. The answers are already documented, though not, as far as we know, in a single place; they are scattered over a wide range of documents. We don't want to write a comprehensive document covering these questions but, rather, to tell our story and give the major part of our answers.
Once the plan has been made for customization, detailed steps can be put into action. General steps in our work are described in the following.
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/specs
gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
Compile other components of the system by adding the option -b as:
gcc -b i386-redhat-linux
As far as we know, it is hard to find a document that tells us in detail how to put images, executables, binaries and scripts together; in other words, how to package things together to assemble a system. Different systems may take different approaches to packaging, although components can be created in the same way. The easiest and most popular way is packaging on floppies. The general steps of packaging a bootable system on a single floppy, that is a boot/root floppy, can be summarized in the following few steps:
An application can be started in different ways depending on how the base system is configured. In most cases, a Linux kernel is configured to run a startup script or a binary executable, called init or linuxrc, in the initial root file system after the kernel is up. This init program usually does things like remounting the root file system to allow read/write permissions, mounting other file systems like proc, and initializing other parts of the system, such as starting a shell interface or running the application immediately. The SysVInit program is very popular in most Linux distributions for this purpose.
For our base system, we don't need a complex init sequence to demonstrate. So, we simply write a shell script like the following. Anyone is free to change and add more commands to it:
mount -n -o remount,rw /
mount /proc /proc -t proc
echo MyCompanyName, Version X.Y, Build Z, August 2000
exec /bin/sh
As an exercise, it might look better if you have the above echo line in your application, and start the application at the end of the script instead of running the standard shell. An example in C++:
cout << COMPANY << VERSION_NO << BUILD_NO << __DATE__ << __TIME__;
In our case, the system prompts after it is up.
pipe-elinux> MyCompanyName, Version X.Y, Build Z, August 2000
pipe-elinux>
After the base system is up, you might think it is not much use without any interesting applications. But it is a base from which you could start your big project. One by one, you can gradually add things into this base system, making it more and more attractive. The following examples might be worth considering:
You might not be satisfied with booting from floppies. Instead, you can implement booting from EPROM or other media. To do this, you have to redesign your packaging approach, but the components are mostly unchanged. What's specific here is the kernel image loader. Booting can be implemented like:
One of the advantages of using Linux is that there are many documents and tools to help you customize your system and solve problems. The code is no secret. Everything inside and outside is open. Besides, there are many other useful sources in print and on the Web. There is no other system which can compare to Linux in this respect, not even FreeBSD, let alone any proprietary operating system.
Typically, for a given problem, we might work out a few different solutions. We always want to pick the best one, of course, but it is not easy to know which is best until we have tried each of them. In many cases, we can find answers by consulting the Linux HOWTOs and docs, or by asking the Linux people in our organization. As an alternative, you can post a message on a Linux newsgroup and hope someone there can give you a quick reply. If you want to pay, there are many Linux-related companies providing technical services as well. (If the problem is stubborn, as a last resort, kick your buggy box a few times, as I sometimes did. Be careful not to break it, and then reboot. That should work; otherwise, repeat the problem-solving sequence from the beginning.)
gdb provides excellent debugging support for applications on the base system. If we don't want to run a full gdb on the target system (that is, the base system), or are unable to, we can run a small remote gdb facility on the target, either a gdb stub or a gdb server. Besides, things like the syslogd dæmon can also help with debugging on the target system.
There are many good problem-solving strategies. Whatever approaches we use, the goal is to find the proper solution. It is usually safe to follow a successful example. For example, we learn something by checking things inside a Red Hat rescue system. We can do this simply with the following few commands:
cat rescue.img | gzip -d > rescue_root.img
mkdir rescue_root
mount -o loop rescue_root.img rescue_root
Here rescue.img is the compressed rescue floppy image found in the Red Hat distribution's images directory. Then we can check its contents by:
ls rescue_root
It displays:
bin dev etc lib lost+found mnt proc sbin tmp usr
You get all the detail in the floppy.
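The top-level layout seen in the rescue image can be reproduced in a staging directory before any files are copied in. The following is a minimal sketch; make_root_skeleton is a hypothetical helper name, and the mode on tmp is the common convention rather than something read from the rescue image. The lost+found directory is left out because it is normally created when the ext2 file system itself is made.

```shell
#!/bin/sh
# Sketch only: create an empty root file system skeleton in a staging
# directory, mirroring the top-level names listed from the rescue image.
# make_root_skeleton is an illustrative name, not a standard tool.

# Usage: make_root_skeleton STAGING_DIR
make_root_skeleton() {
    root=$1
    for d in bin dev etc lib mnt proc sbin tmp usr; do
        mkdir -p "$root/$d" || return 1
    done
    # Assumed convention: world-writable tmp with the sticky bit set.
    chmod 1777 "$root/tmp"
}
```

Calling make_root_skeleton rootfs gives an empty tree into which the dynamic loader, libraries, utilities and startup script can then be placed.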
This article is only an introduction to customization of the Linux base system. For a particular situation, it could be rather complex, especially when modifications at the code level are required, such as to support specialized hardware. But we have shown that it is a manageable task. Our purpose is to make things simple in order to encourage people to take up the challenge. By creating our own customized base system with moderate effort, we get a powerful engine which can drive us into a bright future.
He Zhu (hezhu@yahoo.com) is interested in system software and networking. He is currently working for Bell Labs, New Jersey.