A look at the MySQL database--where it's been, where it is now, and where it's going.
by David Axmark and Michael Widenius
MySQL's history goes back to 1979 when TcX, the company that developed MySQL, started working with database programs. This first version was a screen builder/reporting tool written in BASIC. At that time, state-of-the-art computers had 4MHz Z80 processors and 16KB of RAM. This tool was moved to UNIX and further developed during the following years. In the mid-1990s, we started having problems with customers who liked the results the tool produced but wanted something they had heard about before (a buzzword was needed). So we started looking at making an SQL (an appropriate buzzword) front end to our low-level libraries. We found mSQL, but it did not work for our purposes. So we started to write an SQL engine from scratch. However, since the mSQL API was useful, we used it as the basis for our own API. This made it easy to port some applications we needed that were available for the mSQL API.
Since this tool would be usable by others, we decided to release it according to the business model pioneered by Peter Deutsch at Aladdin Enterprises with Ghostscript. This copyright is much more free than the mSQL copyright and allows commercial use as long as you don't distribute the MySQL server commercially.
It is not perfectly clear where the name MySQL came from. We have used the prefix ``my'' for libraries and path names since the mid-1980s. The main MySQL developer's daughter is named My--a fairly common name among Swedish-speaking Finns--so naming our database MySQL was very natural.
In May 1996, MySQL version 1.0 was released to a limited group of four people, and in October 1996, MySQL 3.11.1 was released to the public as a binary distribution for Solaris. A month later, a Linux binary and the source distribution were released. The MySQL release included an ODBC driver in source form. This also included many free MySQL clients ported to MySQL.
The initial version of MySQL worked only on Linux and Solaris. The biggest problem in porting to other platforms was that MySQL needed a working POSIX thread library; in January 1997 a modified version of MIT-pthreads was included in the distribution.
To be able to use MySQL from your favorite language, you need an API. The first MySQL version included only C and Perl APIs. Now there are many (see Table 2). With the exception of the Java API, all of these use the C API to communicate with the MySQL server. So, as you can see, MySQL can be used from many popular languages.
When we had gotten a nice working system, we wanted to test it against old versions and against other databases, so we started looking for good benchmarks. We found that most benchmarks (like the TCP ones) represent an SQL server's performance as a single number, often as transactions/second. We regard these to be almost worthless, since comparatively few users run applications that do the same thing as these benchmarks. There is usually no way to determine your application's performance from the numbers given by this type benchmark.
The MySQL benchmarks are designed to show how fast a SQL server is for common operations, such as establishing a connection, performing simple inserts or joining two tables using a key. This also makes it possible to calculate loads on a web site when you know the mix of operations. Of course, you need to actually understand your own application to judge its performance with any database.
Over time, we got many requests on the MySQL mailing lists about MySQL's features and how it compares feature-wise with other databases. As Michael (the main developer) didn't want to dig into old inaccurate reference manuals to find this out, he thought of a program that automatically detects what a SQL server has to offer. He also thought it would be a nice test of how stable the MySQL code is when you start to send it ``abnormal'' queries.
While working with the benchmarks, we needed a list of capabilities for all supported databases. Since doing this by hand was very tedious, we made a tool to do it automatically. While trying early versions of this tool on some different servers, bad things happened--the servers crashed. All this crashing led us to name this tool crash-me. In fact, the only SQL server that has gone through this testing without crashing is Oracle. Of course, all bugs found in MySQL were fixed immediately.
crash-me (and the benchmarks) are implemented as a Perl DBI/DBD program that sends thousands of queries to a database to find out how things work in real life. By doing this, it finds many limits in the server, such as the supported column/query/variable/index lengths.
crash-me is also a nice tool for helping you write portable SQL, since it can provide a list of which functions, types and limits exist in the server you wish to use. Currently, we have crash-me results from Adabas-D, Access, DB2, Empress, Informix, MS-SQL, MySQL, Oracle, PostgreSQL, Solid and Sybase. As the crash-me table is big and very detailed, we will not include it here, but it is available at http://www.mysql.com/crash-me-choose.html/.
What operations does the benchmark test? First and foremost, the basic SQL operations are tested: INSERT, UPDATE, DELETE and SELECT. Other tests include a connect followed by a select, and creation of tables and indexes.
The individual tests should give a good indication of how fast an SQL server is for that specific operation. Do not use the ``total time'' as an overall measure of the value of an SQL server. This is because the tests are not weighted against each other. Some tests are run more times with different options and numbers of rows. An SQL server may be extremely bad at some ``unimportant things'', while it's very good at the things for which you actually intend to use it.
We use the total time to compare things like the same database engine on different operating systems. We also use it to see how new versions of MySQL stack up against old ones.
Since all benchmark tables take even more space than the crash-me results, we include only a few observations on how well MySQL runs on different platforms.
There are still many things to be done for both crash-me and the benchmarks. For example, we would like crash-me to report if there are identical functions that do the same thing (such as, instead of CONCAT one can use ``||''). Also, many new tests should be added to test which sub-select constructs an SQL server allows. Of course, the documentation and presentation of the results could be much improved.
Both these tools give invaluable information to any developer who uses more than one SQL server. If they do not test the feature you need, please contribute a new test. More test results can be found at http://www.mysql.com/benchmark.html.
If you checked out the benchmark page above, you will notice that MySQL is many times faster than the competition on almost everything. Why is this so? Some of the speed comes from things that are not supported in MySQL, such as transactions, foreign keys and triggers.
Because the MySQL server is coded mainly by one person with many years of coding experience, very little redundant code is in it. Most of the basic algorithms also come from an era of slow CPUs and small amounts of memory. The algorithms have mostly been extended to use larger caches if there is available memory.
As a result, MySQL has a compact fast design (the code size of the server is less than 1MB on an i386) which normally uses very little memory, but can be configured to take advantage of large amounts of memory.
MySQL has many useful optimizations for maximum speed. Some examples are:
SELECT ... WHERE col IN (a,b,c,d,e,f,...)
SELECT MAX(index_part2) FROM tbl_name WHERE index_part_1=constant;
WHERE index_column LIKE "prefix%".
One of the design decisions that has resulted in the greatest number of questions is the lack of transactions. There are, of course, applications that require transactions to work, but a wide range of applications work very well without them.
Some people believe that since MySQL does not support transactions, it cannot support many simultaneous users. Each MySQL client gets a dedicated thread in the MySQL server, which allows different users to access the same tables at the same time. All MySQL operations are atomic: no other users can change the result for a running query.
When designing MySQL, we had a greater need for speed than for transactions. It's no use having transactions if the SQL server becomes so slow it's unusable for what you need to get done.
Another common misunderstanding is that transactions provide extra robustness through the redo logs. This extra security can be had by simple means in MySQL. That is, do normal backups and then apply the update log to the backup. The update log is a file containing all SQL statements that change any data.
MySQL also has external utility programs to check, optimize and repair individual tables.
As MySQL has about 50 mirrors over the world, and we don't get download statistics from them, it's hard to tell how many MySQL installations are out there.
The WWW and FTP log at http://www.mysql.com/ gives us the information shown in Tables 3 and 4; all counts are based on the number of distinct IPs.
In the Linux community, many sites use MySQL as a back end for dynamic web pages. Among those are http://slashdot.org/, http://freshmeat.net/ and http://www.linux.com/.
On linux.com, every page does somewhere between 10 and 20 queries to the database. And linux.com does anywhere between 500K and 800K page views per day. They run MySQL on its own server, a dual Xeon system with huge amounts of RAM and hard-disk space.
While writing this, I asked Linux Journal what they use as a web back end, and learned they also use MySQL. Among the awards we have been given, we highly value the ``Most Used Database'' 1998 award we got from Linux Journal's readers.
We have no intention of stopping development of MySQL. Over time, MySQL will be 100% ANSI 92 compatible. As we still want MySQL to be fast, we will always give the user the option of specifying the removal of features which make a normal SQL server slow.
As an example, the GRANT system will not have any speed impact unless you use this to restrict table or column access.
The current ``TODO'' list can be found in the MySQL manual at http://www.mysql.com/doc.html. Everything in this list is in the order we plan to implement it.
We have worked many years with GNU/BSD and other programs from the Net and have always believed that programs should be available in source. Because of this, we chose to use the same license as Aladdin GhostScript for the MySQL server on UNIX, and we made the client's completely free.
By the time this article reaches publication, there should be an old version of MySQL (3.20) with a GPL copyright available. We will continue releasing old versions under the GPL.
This means that for normal (even commercial) internal use on UNIX systems, MySQL costs nothing. You do not have to pay us if you do not want to. A license is required only if:
The following points set forth the philosophy behind our licensing policy:
Both David and Michael have been working as software consultants for over 15 years. Among the things they have worked with are software for a one-card computer used by power companies, a video-rental system, a state-of-the-art market research system, advanced business graphics, a word processor that could handle Z80 Assembler+Basic, and a full operating system for an 8-bit computer (Z80) and many other other projects.Michael Widenius lives in Helsingfors, Finland with his wife and his two kids My and Max. David Axmark (david@detron.se) lives in Uppsala, Sweden with his plants and computers.