Linux Programming Hints

A Method for Building Shared Libraries

Building shared libraries for Linux is often considered a black art. In this article, Eric explains five simple steps to producing a standard Linux shared library, and tells the curious where to find more information.

by Eric Kasten

Shared libraries are probably most often used because they allow for the creation of dynamically linked executables, which take less disk space. They also allow multiply defined global variables to be collapsed into a single instance of the variable that all program modules share. Also possible is the creation of a compatible, drop-in replacement for an existing shared library. Improvements or fixes in the replacement library are then immediately available to every executable linked against the library. This last possibility is beyond the scope of this article.

Dynamically linked libraries (DLLs) have become an important part of the Linux system. Even though ELF (the executable and linking format designed for Unix SVR4), which makes creating shared libraries trivial, is just over the horizon, the current a.out DLL shared libraries will probably need to be supported for some time. In many cases, older versions of Linux will still need support, and commercial a.out libraries may require that an executable be built using a.out DLLs, because a.out libraries and ELF libraries cannot be mixed in one executable. Until ELF makes its way from the alpha releases of Linux into the more stable releases required for a production environment, and probably even after that, a.out shared libraries will continue to be built and used.

Provided with the source code for a static library, a shared version of the library can be created by completing five well defined steps. This article will explain how to apply these steps to create a simple shared library. Its aim is to help you understand shared libraries and how they are built, so you can successfully create more complicated shared libraries in the future.

Background

This article assumes the use of gcc 2.6.2 and DLL tools 2.16 with libc 4.6.27. Other versions may have slightly different syntax or may operate differently. All these items may be obtained by anonymous ftp from tsx-11.mit.edu in /pub/linux/packages/GCC/ (tools-2.16.tar.gz is in the src directory). Follow closely all the installation instructions in the release notes, or unnecessary problems may result.

Shared libraries consist of two basic parts: the stub and the image. The stub library has an extension of .sa. The stub is the library an executable will be linked to. It provides redirection of shared functions and variables to the location where the real shared functions and variables reside in memory. The library image has an extension of .so, followed by a version number.

The library image contains the actual executable functions used by binary programs. The image also contains two tables of particular note: the jump table and the global offset table (GOT). The jump table contains eight-byte entries which redirect a call to a shared function from the jump table to the real function. The jump table exists to provide a method for creating compatible replacement libraries. Since each function has an entry of fixed size in the jump table, the jump table can provide an entry point for these functions at a location that remains constant between revisions of a library. This allows previously linked executables to continue to function without recompilation. The global offset table functions for global variables as the jump table does for library functions.
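The fixed entry size is what keeps entry points stable between revisions. As a rough sketch in shell arithmetic, assuming purely for illustration that the jump table begins at the library's load address and that slots are numbered from zero (neither detail is stated above), the address of a function's slot depends only on its position in the table:

```shell
#!/bin/sh
# Sketch: address of the Nth jump table slot, given 8-byte entries.
# ASSUMPTION: the table starts at the address passed in as "base".
# slot_address is a hypothetical helper, not part of the DLL tools.
slot_address() {
    base=$1     # start of the jump table, e.g. 0x6a380000
    index=$2    # zero-based slot number
    printf '0x%x\n' $(( $base + 8 * $index ))
}

slot_address 0x6a380000 0    # first slot
slot_address 0x6a380000 3    # fourth slot, 24 bytes further on
```

As long as a function keeps its slot number in a later revision, the address that callers jump to does not change, even if the function body itself moves or grows.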

Each shared library is loaded at a fixed address between 0x60000000 and 0xc0000000. If an executable is linked to two or more shared libraries, the libraries must not occupy the same address range. If two libraries should overlap, the location an executable is redirected to may not contain the expected function or variable. A list of registered shared libraries can be found in the tools 2.16 distribution in the directory doc/table_description. Examine this file when defining the load address for a new shared library to ensure that it doesn't conflict with the address for an existing library. In addition, you should probably register the address space used by a new shared library so that future libraries will not conflict with it. Registration is particularly important if the library is to be distributed.
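Checking a candidate address range against an entry in table_description amounts to a simple interval comparison. The following is a minimal sketch; ranges_overlap is a hypothetical helper and the addresses are invented for illustration, not taken from the real table:

```shell
#!/bin/sh
# Sketch: do two inclusive address ranges [start, end] overlap?
# ranges_overlap is a hypothetical helper; the DLL tools do not provide it.
ranges_overlap() {
    # $1,$2 = start and end of the first range; $3,$4 = the second range
    [ $(( $1 <= $4 && $3 <= $2 )) -eq 1 ]
}

# Invented ranges, for illustration only.
if ranges_overlap 0x6a380000 0x6a39ffff 0x6a390000 0x6a3affff; then
    echo "overlap"
else
    echo "no overlap"
fi
```

Two ranges overlap exactly when each one starts no later than the other ends, so a single test covers partial overlap as well as one range lying wholly inside the other.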

Before Beginning

As mentioned earlier, this procedure is directed at the creation of a simple shared library. Although the steps for building a more complex library are the same, the process of modifying multiple or complex makefiles can become somewhat confusing. For your first attempt it is a good idea to select a library which has all the library source in a single directory. A good choice may be the JPEG library, which can be retrieved by anonymous ftp from ftp.funet.fi with file name /pub/gnu/ghostscript3/jpegsrc.v5.tar.gz. Or you could create several simple source code modules and a makefile to compile and build a static library. This test library need not do anything useful, since it is only for educational purposes. However, since you will already understand the inner workings of the build process, you can avoid the effort of attempting to understand another program's makefile logic. Also, be sure that a static version of the library can be successfully compiled before approaching the construction of a shared one.

Step One: Setup

The method presented here is not the only way to create a shared library, but it has often proved successful. It provides, in the form of a file to include in the makefile, a simple record of the parameters and the method used to build a particular library. First, create the file that will be included in the makefile; call it Shared.inc. The file should look something like:

SL_NAME=libxyz
SL_PATH=/usr/local/lib
SL_VERSION=1.0.0
SL_LOAD_ADDRESS=0x6a380000
SL_JUMP_TABLE_SIZE=1024
SL_GOT_SIZE=1024
SL_IMPORT=/usr/lib/libc.sa
SL_EXTRA_LIBS=/usr/lib/gcc-lib/i486-linux/2.6.2/libgcc.a -lc

SHPARMS=-l$(SL_PATH)/$(SL_NAME)\
	-v$(SL_VERSION) \
	-a$(SL_LOAD_ADDRESS) \
	-j$(SL_JUMP_TABLE_SIZE) \
	-g$(SL_GOT_SIZE)

VERIFYPARMS=-l$(SL_NAME).so.$(SL_VERSION) -- \
	$(SL_NAME).sa

CC=gcc -B/usr/bin/jump

pre-shlib: $(LIBOBJECTS)

shlib-import:
	buildimport $(SL_IMPORT)

shlib:	$(LIBOBJECTS)
	mkimage $(SHPARMS) -- $(LIBOBJECTS) \
		$(SL_EXTRA_LIBS)
	mkstubs $(SHPARMS) -- $(SL_NAME)
	verify-shlib $(VERIFYPARMS)

The first section consists of a series of variable definitions. These variables have the following meanings:

SL_NAME: The name of the library which is being built.
SL_PATH: The location where the shared library will live.
SL_VERSION: The library version.
SL_LOAD_ADDRESS: The absolute address in memory where the library will be loaded. (Examine the table_description file provided with the DLL tools to make sure this address doesn't overlap with another library.)
SL_JUMP_TABLE_SIZE: The size of the jump table. (Give this any value for the moment; an appropriate value will be determined later.)
SL_GOT_SIZE: The size of the global offset table. (Give this any value for the moment; an appropriate value will be determined later.)
SL_EXTRA_LIBS: Other libraries which are required to build the shared image.

SL_IMPORT indicates other shared libraries to import symbols from. These imported symbols are used to help direct global variable references to their proper locations in other shared libraries. The libraries specified here should be any shared libraries which are required to build the target library. The target shlib-import makes use of a /bin/sh script called buildimport, which is invoked with SL_IMPORT as a parameter. The buildimport script should contain the following commands:

#!/bin/sh
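# Start with an empty import file.
echo -n > $JUMP_DIR/jump.import
# Append the GOT symbols of each stub library named on the command line.
for lib in $*;
  do nm --no-cplus -o $lib | \
     grep '__GOT__' | sed 's/__GOT__/_/'\
      >> $JUMP_DIR/jump.import
done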

This script uses nm, grep and sed to extract the symbols from the global offset tables of each of the stub libraries specified on the command line to create a file called jump.import (the nm command sequence is excerpted from "Using DLL Tools With Linux"). Be sure to chmod u+x buildimport.

SL_EXTRA_LIBS are libraries which will be required to successfully build the library. Usually most of these libraries can be determined by examining a makefile which builds an executable using this library (often there are test programs included with the source for the library). libgcc.a is required with gcc 2.6.2; if it is left out, there will be an unresolved reference for _main. It is usually necessary to explicitly specify libc with -lc. If there should be unresolved references when the library image is made, chances are that a required library was omitted.
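To see what the grep and sed stage of buildimport keeps, it can be run on a couple of made-up lines. This is only a sketch: the sample lines below are hypothetical and merely imitate the general shape of nm -o output, which varies with the nm version, and filter_got is a name invented here.

```shell
#!/bin/sh
# Sketch: the grep/sed stage of buildimport applied to made-up input.
# The two sample lines are HYPOTHETICAL; real nm output may differ.
filter_got() {
    # Keep only GOT entries and strip the __GOT__ prefix down to "_".
    grep '__GOT__' | sed 's/__GOT__/_/'
}

printf '%s\n' \
    'libabc.sa:vars.o:00000000 D __GOT__my_counter' \
    'libabc.sa:vars.o:00000000 T _some_function' |
    filter_got
```

Only the first sample line survives the grep, and its symbol is rewritten from __GOT__my_counter to _my_counter.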

The definition of CC as gcc -B/usr/bin/jump is telling the compiler to use an assembler called /usr/bin/jumpas instead of the default assembler. Be sure to check what other parameters are specified in the original makefile (and whether CC was defined as the compiler variable) and make additions and changes as necessary. CC is nearly always defined, and thus has been used in this example. If you use a version of DLL tools earlier than version 2.16, it may be necessary to specify CC as gcc -B/usr/dll/jump/.

The targets pre-shlib and shlib both have LIBOBJECTS as dependencies. You will probably find a list or a variable containing a list of the library dependencies in the target for the static library in the original makefile. You should define LIBOBJECTS as this list of dependencies, or you should replace all instances in Shared.inc with the dependencies specified for the static library. Take care when constructing a dependency list for a shared library; it is not uncommon for source code modules to be compiled even though they are not part of the final library. The only objects that should be compiled during the building of a shared library are those that will eventually become part of the library. If other objects are compiled, the symbols and globals used in those modules will end up in the jump configuration files for the library, and possibly in the library itself. These undesirable functions and variables may result in troublesome behavior or failure of the library build process.

In general, make sure you understand how the library object files are built. Also, make certain that the shared library objects are built using the same flags and options that were present for the original library. Now edit the library makefile (make a backup first), and add the following statement to the end of the list of makefile targets:

 include Shared.inc

Finally, from the source directory of the library, do the following:

 mkdir jump
 JUMP_LIB=libxyz
 export JUMP_LIB
 JUMP_DIR=`pwd`/jump
 export JUMP_DIR

These commands create a work directory for the DLL tools and assembler, and set the necessary environment variables which are required to successfully build a shared library. It will be necessary to use setenv if a csh variant is in use. Remember to replace libxyz with the name of the target library (as specified in SL_NAME).
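Because a mismatch between JUMP_LIB and SL_NAME is easy to introduce, it can be worth checking the environment before compiling. The following is a minimal sketch; check_jump_env is a hypothetical helper, and it assumes SL_NAME is set on a line of its own in Shared.inc, as in the example above:

```shell
#!/bin/sh
# Sketch: compare JUMP_LIB with the SL_NAME recorded in Shared.inc.
# check_jump_env is a hypothetical helper, not part of the DLL tools.
check_jump_env() {
    inc=${1:-Shared.inc}
    # Take the value from the first "SL_NAME=..." line.
    sl_name=$(sed -n 's/^SL_NAME=//p' "$inc" | head -n 1)
    if [ -z "${JUMP_DIR:-}" ] || [ ! -d "$JUMP_DIR" ]; then
        echo "JUMP_DIR is unset or is not a directory" >&2
        return 1
    fi
    if [ "${JUMP_LIB:-}" != "$sl_name" ]; then
        echo "JUMP_LIB (${JUMP_LIB:-unset}) does not match SL_NAME ($sl_name)" >&2
        return 1
    fi
    echo "environment looks consistent for $sl_name"
}
```

After defining the function in the current shell, running check_jump_env from the library source directory reports any mismatch before time is spent on a compile.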

Step Two: The First Compile

Before each compile remove the old .o files to ensure that the object code is rebuilt. Executing a make clean may be sufficient; however, be careful: many makefiles will remove more than the .o files and you may need to reconfigure the source code. Often an rm *.o will work more dependably.

If everything has been set up properly, it should now be possible to begin the first compile by entering:

 make pre-shlib

This step compiles the library using the assembler selected by the -B switch. This extracts the necessary symbols from the library source into a file called jump.log. From jump.log, the global variables and functions will be extracted into the configuration files where the DLL tools will find them. Once all the source has been compiled, change to the directory that was specified in JUMP_DIR. The file jump.log should be in this directory. Now execute the following:

 getvars
 getfuncs
 rm -f jump.log

These commands will create the files jump.vars and jump.funcs. jump.vars contains a list of the global variables found during the compile, while jump.funcs contains a list of functions. If, for some reason, you don't want to export a symbol found in jump.funcs or jump.vars, move the entry to a file called jump.ignore in the JUMP_DIR directory. Be sure to remove any entries added to jump.ignore from the original file. Now return to the compile directory.
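Moving an entry to jump.ignore by hand is easy to get half right, for example by copying the line but forgetting to delete the original. The following sketch does both in one step. It assumes that each line of jump.funcs and jump.vars carries the symbol name as a whitespace-separated field, and ignore_symbol is a hypothetical helper, not one of the DLL tools:

```shell
#!/bin/sh
# Sketch: move every line that mentions a symbol from a jump file into
# jump.ignore in the same directory.
# ASSUMPTION: the symbol appears as a whitespace-separated field.
ignore_symbol() {
    sym=$1      # symbol name exactly as it appears in the file
    file=$2     # path to jump.funcs or jump.vars
    dir=$(dirname "$file")
    # Append the matching lines to jump.ignore ...
    awk -v s="$sym" '{ for (i = 1; i <= NF; i++) if ($i == s) { print; next } }' \
        "$file" >> "$dir/jump.ignore"
    # ... and keep only the non-matching lines in the original file.
    awk -v s="$sym" '{ for (i = 1; i <= NF; i++) if ($i == s) next; print }' \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

A call such as ignore_symbol _debug_hook "$JUMP_DIR/jump.funcs" (with _debug_hook standing in for a real symbol) would then take the place of the manual edit.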

Step Three: Importing Symbols

Now you should create the jump.import file. Since a target was previously defined in Shared.inc, simply enter:

 make shlib-import

There now should be a file called jump.import in the JUMP_DIR directory. Nothing needs to be done with this file; it will be used to determine which global variables should be located in one of the imported libraries.

Step Four: The Second Compile

The second compile is necessary to determine the sizes of the global variables. The sizes of the globals must be known so that the GOT pointers can be set properly. Remove the .o files from the previous compile and then do the following:

 make pre-shlib

Now change to the JUMP_DIR directory and execute:

 getsize > jump.vars-new
 mv jump.vars jump.vars-old
 mv jump.vars-new jump.vars

Step Five: Building The Library

Before actually building the shared image and stub libraries, the jump table and GOT must be allocated enough storage for all the existing functions and global variables as well as for functions or globals that may be added in revisions to the library. To determine the required number of bytes for the jump table and the GOT, execute the following:

 wc -l $JUMP_DIR/jump.funcs
 wc -l $JUMP_DIR/jump.vars

Multiply the resultant line counts by 8 to calculate a lower bound for the number of bytes required for existing functions and global variables, respectively. These values should be padded significantly to allow for future library expansion. Now edit Shared.inc and replace the settings of SL_JUMP_TABLE_SIZE and SL_GOT_SIZE with the values just determined. If you receive an overflow message while building the image, increase these values. Keep in mind that these sizes should be multiples of 8, and that the values calculated are minimums, and will probably not be sufficient to build the library image.
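The arithmetic can be scripted so that it is easy to repeat after symbols are added or ignored. This is a minimal sketch: table_size is a hypothetical helper, and the default padding factor of 2 is an arbitrary choice made here for illustration, since the right amount of headroom depends on how much the library is expected to grow.

```shell
#!/bin/sh
# Sketch: minimum and padded table sizes, in bytes, from a jump file.
# The padding factor is an ARBITRARY assumption; adjust to taste.
# table_size is a hypothetical helper, not part of the DLL tools.
table_size() {
    file=$1
    factor=${2:-2}                          # assumed growth allowance
    entries=$(( $(wc -l < "$file") ))       # one line per symbol
    minimum=$(( entries * 8 ))              # 8 bytes per entry
    padded=$(( minimum * factor ))          # still a multiple of 8
    echo "$minimum $padded"
}
```

For a jump.funcs with 150 lines, table_size "$JUMP_DIR/jump.funcs" prints 1200 2400: the lower bound followed by a padded candidate for SL_JUMP_TABLE_SIZE. The same call on jump.vars gives a candidate for SL_GOT_SIZE.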

Now everything should be ready to actually build the shared image and stub. Without removing the .o files, execute:

 make shlib

This will first build the image, and then the stub library. Then the stub and image will be verified to check that the libraries were built properly. If all goes well, the last message should be something like:

Used address range 0x6a37f020-0x6a395020 be aware! must be unique!
The stub library and the sharable libraries have identical symbols.

The address range indicated in the first line is somewhat misleading, since a load address of 0x6a380000, not 0x6a37f020, was specified. This is normal. However, make note of the last address since it indicates the last address used by the library. This address is usually padded somewhat to make sure that room is left for expansion. The address range might be recorded as 0x6a380000-0x6a395fff or 0x6a380000-0x6a39ffff, depending on how much space might be required in the future.
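Padding the end of the range amounts to rounding the last used address up to the end of an aligned block. A small sketch of that calculation follows; pad_end is a hypothetical helper, and the block size is whatever amount of headroom you choose, provided it is a power of two:

```shell
#!/bin/sh
# Sketch: round the last used address up to the end of an aligned block.
# pad_end is a hypothetical helper; the block size must be a power of two.
pad_end() {
    last=$1     # last address reported when the library was verified
    block=$2    # block size in bytes, a power of two
    # Setting every bit below the block size yields the block's last byte.
    printf '0x%x\n' $(( $last | ($block - 1) ))
}

pad_end 0x6a395020 0x1000     # prints 0x6a395fff
pad_end 0x6a395020 0x10000    # prints 0x6a39ffff
```

The two calls reproduce the two candidate end addresses mentioned above, using 4 KB and 64 KB blocks respectively.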

The second line indicates that the image and stub libraries were built correctly. If the verification process should indicate that the stub and image differ, an error has occurred. Possibly one of the most common errors is when the JUMP_LIB environment variable and SL_NAME don't match. Double check that these two variables match if there should be a problem. If everything has gone correctly there should now be a stub and image library. The image should be copied to the directory specified by SL_PATH and the stub should be placed where it can be found by the compiler and linker. Once these files have been copied to their final directories, enter:

 ldconfig -v

There should be output similar to the following, indicating that ldconfig has created a symbolic link for the new library in which the name only contains the major version number. This is done because a look-up on the library is done using only the major version number.

 libxyz.so.1 => libxyz.so.1.0.0 (changed)

If ldconfig doesn't find the library, make sure that the directory in which the library is located is included in the list in /etc/ld.so.conf. It should now be possible to make use of the new library. Shared.inc, jump.vars, jump.funcs, jump.import and jump.ignore should be saved. These files will be useful if you need to rebuild the library or create a compatible replacement.

Trail's End

This article has outlined a method for creating a simple shared library from scratch. This basic method provides a starting point for understanding and constructing a shared library. Many other topics are covered and more depth is presented in "Using DLL Tools With Linux" by David Engel and Eric Youngdale. This document can be found in the doc directory provided with the tools 2.16 distribution. Information on both DLLs and ELF can also be found in the GCC FAQ, which can be retrieved via anonymous ftp from www.mrc-apu.cam.ac.uk as file /pub/linux/GCC-FAQ.

Eric Kasten has been a systems programmer since 1989. Presently, he is pursuing his masters in computer science at Michigan State University, where his research focuses on networking and distributed systems. Well thought out comments and questions may be directed to him at tigger@petroglyph.cl.msu.edu.