This month's column discusses RCS, the Revision Control System.
by Arnold Robbins
What is RCS? RCS is the Revision Control System. Its job is to manage the process of revising and updating files. It can and should be used for program text and documentation, as well as for any other files that are revised on a frequent basis. RCS allows you to retrieve earlier versions of files, while keeping a log of what changes were made, who made them, and why. RCS makes it easy to compare any two versions of a file, and provides a mechanism for merging changes from two different development "branches" of a source file.
RCS was originally written by Dr. Walter F. Tichy, at Purdue University. Beginning in 1983, it received widespread use in the Unix community with its release as part of the User Contributed Software in 4.2BSD. It was described in an article in "Software--Practice And Experience" in July 1985.
RCS provides a safety net for the software developer. Changes are inevitable as you develop, fix, and improve a program. By saving a stable version of your file in RCS, you can later return to a known state if a set of changes does not work out.
If more than one person is working on the same file, RCS allows you to "lock" a file, so that only one person will be allowed to make changes. Other people can still use the file, e.g., for compiling.
Besides keeping track of what changes were made to a file, RCS tracks who made the change and when. RCS also files a log message describing the change. This makes it easy to figure out who broke the program when the fatal bug is finally isolated.
The user interface is intentionally quite simple, consisting primarily of two commands, ci and co. To start with, make a directory to hold the program and cd into it. Then make a directory named RCS. Although not required, this is the cleanest way to do it; all RCS files will be kept in the RCS subdirectory. We'll also create the first version of the program.
$ mkdir hello
$ cd hello
$ mkdir RCS
$ cat > hello.c    # editors are for wimps! :-)
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
}
^D
$ ls -l hello.c
-rw-r--r-- 1 arnold 66 Nov 5 22:33 hello.c
We now have a C source file that is ready to go. When compiled and run, it prints the well-known, friendly greeting beloved by C programmers the world over.
After making sure it compiles, the first thing to do is "check in" the program with RCS. This is accomplished with ci.
$ ci hello.c
RCS/hello.c,v <-- hello.c
enter description, terminated with single `.' or end of file:
NOTE: This is NOT the log message!
> world famous C program that prints a friendly message.
> .
initial revision: 1.1
done
$ ls
RCS
The first time a file is checked in, RCS wants a description of just what the file is. It reminds us that this is not the log message; something like "initial revision" would therefore be inappropriate here. The > is the prompt for information. Also note that the original file is removed. RCS has the file safely stored in the RCS file hello.c,v in the RCS directory.
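When ci runs from a script or a makefile, nobody is there to answer the prompt. The following is a minimal sketch of a non-interactive first check-in, assuming GNU RCS is installed. The file name notes.txt and the scratch directory are made up for illustration. The -t-text option supplies the description on the command line, -q keeps ci quiet, and -u leaves a read-only working copy behind instead of removing the file.

```shell
#!/bin/sh
# Sketch: first check-in without the interactive prompt.
# notes.txt and the temporary directory are hypothetical names.
command -v ci >/dev/null 2>&1 || { echo "RCS not installed" >&2; exit 0; }

work=$(mktemp -d) && cd "$work" || exit 1
mkdir RCS
printf 'first draft\n' > notes.txt

# -t-TEXT gives the file description; -u keeps a read-only working copy.
ci -q -u -t-"scratch notes for the RCS column" notes.txt

ls -l notes.txt RCS/notes.txt,v
```

The description ends up in the RCS file, where rlog will show it later.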
Well, a file that we can't compile isn't of much use, so the next thing to do is get a copy so that we can actually compile the program and use it. This is done with co, which stands for "check out".
$ co hello.c
RCS/hello.c,v --> hello.c
revision 1.1
done
$ ls -l hello.c
-r--r--r-- 1 arnold 66 Nov 5 22:43 hello.c
$ gcc hello.c -O -o hello; ./hello
hello, world
Note that the file is returned to us, but with read-only permissions. We are thus allowed to use the file, but not change it. In normal use, for instance, to build a whole source tree to install software, you would check out the files read-only, compile the programs, and then remove the source files.
What about when you want to change a file? Programs do evolve, so how do you get to the next revision? The first thing to do is to check out the file, but with a lock on the file. The lock says that you, and only you, are allowed to check in a new revision of the file. This is necessary if more than one person will be working with the source file, so that two people don't trash each other's work.
$ co -l hello.c
RCS/hello.c,v --> hello.c
revision 1.1 (locked)
done
$ ls -l hello.c
-rw-r--r-- 1 arnold 66 Nov 5 22:51 hello.c
This checks out the file, and locks it. Note that the permissions now allow writing to the file. We can edit the file, and make our changes to it.
$ sam hello.c    # a nifty editor,
                 # watch for a future column on it.
$ cat hello.c
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    if (argc > 1 && strcmp(argv[1], "-advice") == 0) {
        printf("Don't Panic!\n");
        exit(42);
    }
    printf("hello, world\n");
    exit(0);
}
$ gcc -O hello.c -o hello
$ ./hello -advice
Don't Panic!
$ ./hello
hello, world
Our program now has a new option, -advice, that prints a friendly piece of advice and exits with a well-known special value. The default behavior remains unchanged, except that exit is now used for the normal case as well.
We can now check in the new version to RCS. Assuming that we will want to do further work on the file, ci also allows us to use the -l option. With this option, ci will perform the check-in and automatically do a co -l for us, so that we can continue to work on the file.
$ ci -l hello.c
RCS/hello.c,v <-- hello.c
new revision: 1.2; previous revision: 1.1
enter log message, terminated with single `.' or end of file:
> Added -advice option, and made regular case use exit.
> .
done
$ ls -l hello.c
-rw-r--r-- 1 arnold 208 Nov 5 22:54 hello.c
Here we see where the log message is entered. Log messages should be relatively brief, describing what was changed and why. In a commercial environment, you might enter the bug number associated with a particular fix into the log, as well. We also see that the file is still available for further editing (permissions -rw-r--r--).
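The whole edit cycle can also be driven from a script. This is a sketch only, assuming GNU RCS is installed; notes.txt and its contents are invented for the example. It uses -m to pass the log message on the command line, -u to keep a read-only copy after check-in (the counterpart of -l), and co -p to print an earlier revision on standard output, which is how you get back to a known state without disturbing the working file.

```shell
#!/bin/sh
# Sketch: lock, edit, check in, then recover the older revision.
# notes.txt is a hypothetical file name.
command -v ci >/dev/null 2>&1 || { echo "RCS not installed" >&2; exit 0; }

work=$(mktemp -d) && cd "$work" || exit 1
mkdir RCS
printf 'version one\n' > notes.txt
ci -q -t-"scratch notes" notes.txt          # revision 1.1; working file removed

co -q -l notes.txt                          # writable copy, locked by us
printf 'version two\n' > notes.txt
ci -q -u -m"Reworded the note." notes.txt   # revision 1.2; read-only copy kept

# -p writes the requested revision to standard output.
co -q -p -r1.1 notes.txt > notes-1.1.txt
cat notes-1.1.txt
```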
You can compare any two versions of a file using the rcsdiff command. This command accepts all the options that the regular diff command does. However, it usurps the -r option for providing revision numbers. By default, rcsdiff compares the current version of the working file with the most recently checked-in version. With one -r option, it compares the current file against the specified previous version. You can supply two instances of the -r option to make it compare two different revisions, neither of which is the current version. Here's an example of the default (and most common) case, run before revision 1.2 was checked in, so the comparison is against revision 1.1:
$ rcsdiff -c hello.c
==================================================
RCS file: RCS/hello.c,v
retrieving revision 1.1
diff -c -r1.1 hello.c
*** 1.1 1994/11/06 03:36:45
--- hello.c 1994/11/06 03:54:49
***************
*** 1,6 ****
  #include <stdio.h>
  
! int main(void)
  {
      printf("hello, world\n");
  }
--- 1,12 ----
  #include <stdio.h>
+ #include <string.h>
  
! int main(int argc, char **argv)
  {
+     if (argc > 1 && strcmp(argv[1], "-advice") == 0) {
+         printf("Don't Panic!\n");
+         exit(42);
+     }
      printf("hello, world\n");
+     exit(0);
  }
This generates a "context diff", showing the surrounding context of what was changed, not just the changes themselves. The lines marked with ! indicate a changed line from the old to the new version, and the lines marked with + indicate lines that were added.
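The two -r form is handy when the working file has moved on and you want to see what happened between two older revisions. A small sketch, assuming GNU RCS and GNU diff are installed; notes.txt and its two revisions are made up for the example. Like diff, rcsdiff exits with status 1 when the revisions differ, so the script treats that as normal.

```shell
#!/bin/sh
# Sketch: compare two checked-in revisions with two -r options.
# notes.txt is a hypothetical file name.
command -v ci >/dev/null 2>&1 || { echo "RCS not installed" >&2; exit 0; }

work=$(mktemp -d) && cd "$work" || exit 1
mkdir RCS
printf 'version one\n' > notes.txt
ci -q -t-"scratch notes" notes.txt          # revision 1.1

co -q -l notes.txt
printf 'version two\n' > notes.txt
ci -q -u -m"Reworded the note." notes.txt   # revision 1.2

# Exit status 1 only means "the revisions differ".
rcsdiff -q -c -r1.1 -r1.2 notes.txt > changes.diff || echo "revisions differ"
cat changes.diff
```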
The rcsdiff program makes it easy to generate updates that can be applied with patch. When a program is finished, simply check in all the files that make it up with a new, higher-level revision number, such as 3.0. Then, for the next release, run rcsdiff against revision 3.0 for all files.
$ rcsdiff -c -r3.0 RCS/* > myprog-3.0-4.0.patch 2>&1
(This doesn't catch the case of brand new files added in 4.0, or deleted files that were in 3.0, but you get the idea.)
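For a single file, the release-numbering step might look like the sketch below. It assumes GNU RCS is installed; prog.txt, its contents, and the patch file name are invented for illustration. The -f option forces a check-in even when the file has not changed since the previous revision, and giving the number to -l (here -l3.0) both names the new revision and keeps the file locked for further work.

```shell
#!/bin/sh
# Sketch: stamp a release as 3.0, keep working, then produce a patch.
# prog.txt and prog-3.0-next.patch are hypothetical names.
command -v ci >/dev/null 2>&1 || { echo "RCS not installed" >&2; exit 0; }

work=$(mktemp -d) && cd "$work" || exit 1
mkdir RCS
printf 'line one\nline two\n' > prog.txt
ci -q -l -t-"release demo" prog.txt           # revision 1.1, still locked

# Force a check-in of the unchanged file as revision 3.0 and keep the lock.
ci -q -f -l3.0 -m"Release 3.0." prog.txt

# Work toward the next release.
printf 'line one\nline two, improved\n' > prog.txt

# Everything that changed since 3.0, in a form patch can apply.
rcsdiff -q -c -r3.0 prog.txt > prog-3.0-next.patch || echo "changes since 3.0"
cat prog-3.0-next.patch
```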
As a side note, in order to build and install the RCS software, you need to have the GNU version of diff. Linux systems have this already. If you don't have GNU diff, you should get it anyway, since it is very full-featured and noticeably faster than the standard Unix version of diff.
It is often useful to be able to look at the contents of a source file and tell what version of the file it is. RCS allows you to do this by performing "keyword substitutions" on the contents of your file when it is checked out. There are a large number of these keywords; the co man page documents them in full. The most common ones are $Id$ and $Log$.
The $Id$ keyword is replaced with text describing the filename, revision, date and time of check-in, the author, the state (e.g., Exp, for experimental), and, if the revision is locked, the name of the person holding the lock. Usually, this is embedded in a C string constant so that a binary, generated from the file, can be identified with the ident command.
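The $Log$ keyword is followed by the log message supplied at check-in, preceded by a header giving the revision, date, and author. Each new message is inserted after the keyword and the earlier ones are kept, so the file accumulates its own change history. This is usually placed inside a comment, so that the source file is self-documenting, showing what was changed and when. This is useful, but should be used with caution: if a file is changed frequently, this log can grow quite a lot.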
We'll now add the keywords and show the state of the file, both before and after checking in the changed version.
$ sam hello.c
$ cat hello.c
#include <stdio.h>
#include <string.h>

static const char rcsid[] = "$Id$";

/*
 * $Log$
 */

int main(int argc, char **argv)
{
    if (argc > 1 && strcmp(argv[1], "-advice") == 0) {
        printf("Don't Panic!\n");
        exit(42);
    }
    printf("hello, world\n");
    exit(0);
}
$ ci -l hello.c
RCS/hello.c,v <-- hello.c
new revision: 1.3; previous revision: 1.2
enter log message, terminated with single `.' or end of file:
>> add id and log keywords.
>> .
done
$ cat hello.c
#include <stdio.h>
#include <string.h>

static const char rcsid[] = "$Id: hello.c,v 1.3 1994/11/07 03:41:32 arnold Exp arnold $";

/*
 * $Log: hello.c,v $
 * Revision 1.3 1994/11/07 03:41:32 arnold
 * add id and log keywords.
 *
 */

int main(int argc, char **argv)
{
    if (argc > 1 && strcmp(argv[1], "-advice") == 0) {
        printf("Don't Panic!\n");
        exit(42);
    }
    printf("hello, world\n");
    exit(0);
}
We see that RCS has filled in the information for both keywords. When the program is compiled, the ident command will give us information about all the files used to compile the program that have RCS ids in them.
$ gcc -O hello.c -o hello
$ ident hello
hello:
     $Id: hello.c,v 1.3 1994/11/07 03:41:32 arnold Exp arnold $
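Nothing about keywords is specific to C. The sketch below, which assumes GNU RCS is installed and uses a made-up file named notes.txt, puts $Id$ into a plain text file behind a # comment marker and lets ident find the expanded string after check-in.

```shell
#!/bin/sh
# Sketch: keyword expansion and ident on a plain text file.
# notes.txt is a hypothetical file name.
command -v ci >/dev/null 2>&1 || { echo "RCS not installed" >&2; exit 0; }

work=$(mktemp -d) && cd "$work" || exit 1
mkdir RCS
printf '%s\n' '# $Id$' 'some notes' > notes.txt

# -u checks the file back out read-only, with the keyword filled in.
ci -q -u -t-"keyword demo" notes.txt

cat notes.txt
ident notes.txt
```

The same approach works for shell scripts, makefiles, and documentation, as long as the keyword sits somewhere the file format treats as a comment or a string.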
You can do just about everything you need to with ci, co, and rcsdiff. There are a few other commands that come with RCS that are also of interest.
The rcs command is used for changing the state of RCS files. In particular, it can be used to lock a file that is not locked or to break someone else's lock on an RCS file. This latter operation is perilous and should only be done in an emergency. There are a number of other operations that rcs can perform; see the man page for details.
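Here is a small sketch of taking and releasing a lock with rcs, assuming GNU RCS is installed; notes.txt is an invented file name. It only releases a lock that we hold ourselves, which is the safe case. The rlog options -L (skip files with no locks) and -R (print just the RCS file name) make a quick "what is locked?" check.

```shell
#!/bin/sh
# Sketch: lock and unlock a revision without checking it out.
# notes.txt is a hypothetical file name.
command -v ci >/dev/null 2>&1 || { echo "RCS not installed" >&2; exit 0; }

work=$(mktemp -d) && cd "$work" || exit 1
mkdir RCS
printf 'some notes\n' > notes.txt
ci -q -u -t-"lock demo" notes.txt     # revision 1.1, not locked

rcs -q -l notes.txt                   # lock the latest revision
locked=$(rlog -L -R notes.txt)        # names the RCS file only if it has a lock
echo "locked: ${locked:-none}"

rcs -q -u notes.txt                   # release our own lock
after=$(rlog -L -R notes.txt)
echo "locked after unlock: ${after:-none}"
```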
It is possible to have "branches" off the main line (or "trunk") of development. For instance, assume that the released version of hello.c is 2.6 and that version 2.7 will be the next released version. Programmer Mary is writing version 2.7, while programmer Joe has to maintain version 2.6. Normally, Joe would start a separate branch off the main development trunk, generating versions 2.6.1.1, 2.6.1.2, and so on. RCS can maintain an arbitrary number of branches off the main trunk, as well as branches off the branches. However, as you might imagine, keeping track of many levels of branching can become confusing.
At some point, Mary will want to make sure that all of Joe's fixes are incorporated into her version of hello.c; she would do this using rcsmerge. (rcsmerge uses a separate program that also comes with RCS, named merge, which does the actual work of merging the files.)
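The Mary-and-Joe scenario can be tried on a toy file. This is a sketch under several assumptions: GNU RCS and diff3 are installed, the file list.txt and its contents are invented, and revision 1.1 stands in for the released 2.6 while 1.2 stands in for the work toward 2.7. Joe's fix lands on branch 1.1.1 as revision 1.1.1.1, and rcsmerge then applies the changes between 1.1 and 1.1.1.1 to Mary's working copy. The two edits touch opposite ends of the file so that they do not overlap.

```shell
#!/bin/sh
# Sketch: a maintenance branch and rcsmerge on a toy file.
# list.txt and its contents are hypothetical.
command -v ci >/dev/null 2>&1 || { echo "RCS not installed" >&2; exit 0; }

work=$(mktemp -d) && cd "$work" || exit 1
mkdir RCS
printf '%s\n' one two three four five six seven eight nine > list.txt
ci -q -t-"merge demo" list.txt                # revision 1.1, the "release"

# Mary: new development on the trunk.
co -q -l list.txt
printf '%s\n' one two three four five six seven eight 'NINE (trunk)' > list.txt
ci -q -m"Trunk work." list.txt                # revision 1.2

# Joe: a fix to the released revision, on a branch.
co -q -l -r1.1 list.txt
printf '%s\n' 'ONE (fix)' two three four five six seven eight nine > list.txt
ci -q -r1.1.1 -m"Maintenance fix." list.txt   # revision 1.1.1.1

# Mary: fold Joe's fix into her locked working copy of 1.2.
co -q -l list.txt
rcsmerge -q -r1.1 -r1.1.1.1 list.txt
cat list.txt
```

After the merge, Mary still has to check in the result; rcsmerge only changes the working file.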
Finally, the rlog command will print out all the log messages for a particular source file. This allows you to see the complete change history of a file.
$ rlog hello.c
RCS file: RCS/hello.c,v
Working file: hello.c
head: 1.3
branch:
locks: strict
    arnold: 1.3
access list:
symbolic names:
comment leader: " * "
keyword substitution: kv
total revisions: 3; selected revisions: 3
description:
world famous C program that prints a friendly message.
----------------------------
revision 1.3 locked by: arnold;
date: 1994/11/07 03:41:32; author: arnold; state: Exp; lines: +6 -0
add id and log keywords.
----------------------------
revision 1.2
date: 1994/11/07 03:40:21; author: arnold; state: Exp; lines: +7 -1
Added -advice option, and made regular case use exit.
----------------------------
revision 1.1
date: 1994/11/07 03:38:50; author: arnold; state: Exp;
Initial revision
====================================================
Most of the initial stuff that rlog prints out is explained in the RCS man pages. Of interest to us are the description and log message parts of the output, which tell us what the program is, what changes were made, by whom, and when. Interestingly, the timestamps are in UTC, not local time. This is so that developers in different time zones can collaborate without getting discrepancies in their Id strings.
The main problems that RCS does not solve are having multiple people work on a file at the same time and the larger issues of release management, i.e., making sure that the release is complete and up to date.
A separate software suite is available for this purpose: cvs, the Concurrent Versions System. From the README file in the cvs distribution:
cvs is a front end to the rcs(1) revision control system which extends the notion of revision control from a collection of files in a single directory to a hierarchical collection of directories consisting of revision-controlled files. These directories and files can be combined together to form a software release. cvs provides the functions necessary to manage these software releases and to control the concurrent editing of source files among multiple software developers.
You can get cvs from ftp.gnu.ai.mit.edu in /pub/gnu. At the time of this writing, the current version is cvs-1.3.tar.gz. By the time you read this, CVS 1.4 may be out, so look for cvs-1.4.tar.gz, and retrieve that version if it is there.
RCS provides complete, flexible revision control in an easy-to-use package. Like make, RCS is a software suite that any serious programmer needs to learn and use.
Thanks to Paul Eggert for reviewing this article. His comments were very useful; several of them were incorporated almost verbatim. Thanks also to Miriam Robbins for forcing me to run spell.
Arnold Robbins is a professional programmer and semi-professional author. He has been doing volunteer work for the GNU project since 1987 and working with Unix and Unix-like systems since 1981.