Testing plays a very important role in software development, since potentially every new piece of software contains bugs. Some bugs are simple mistakes, such as typos or omissions, that can be spotted quite easily by the developer or by the reviewers of the code. Others, however, result from wrong assumptions made by the developer, and usually they can only be found by running the software on a computer on which these assumptions do not hold. This is particularly true of operating systems, which are often developed under assumptions derived from the observed but undocumented behavior of hardware. It is therefore important to test them at an early stage of development on as many different machines as possible, to make sure that they will work correctly in the majority of practically relevant cases.
The purpose of this guide is to acquaint the reader with testing the Linux kernel. It is divided into several short chapters devoted to the more important issues that every good tester should be familiar with. Still, we do not try to describe thoroughly every problem that can be encountered during kernel testing, since that would be boring. We also want to leave some room for the reader’s own discoveries.
Why is it a good idea to test the Linux kernel?
Now, in my opinion, we should test it if we want to be sure that the next versions of the kernel will correctly handle our hardware and will do what we need them to do. In other words, if we seriously want to use Linux, then it is worth spending some time checking that the next version of the kernel is not going to give us trouble, especially since recently the kernel has been changing quite a lot between consecutive stable releases (which the majority of its users do not even realize). In fact, this is the responsibility of an Open Source user: you should test new versions and report problems. If you do not do this, then later you will not have the right to complain that something has stopped working. Of course, our testing will also benefit other users, but we should rather not count on someone else to check whether or not the new kernel will work on our hardware.
– Rafael J. Wysocki
Unfortunately, many Linux users tend to think that you need to be an expert programmer in order to test the kernel. Yet this is as true as the statement that pilots who test new airplanes should be capable of designing them. Admittedly, the ability to program is very useful in carrying out the tests, because it allows the tester to assess the situation more accurately. It is also necessary for learning the kernel internals. Still, even if you cannot program, you can be a valuable tester.
In most cases, testing the Linux kernel is as simple as downloading the tarball containing its sources, unpacking it, configuring and building the kernel, installing it, booting the system and using it for some time in the usual way. Of course, things can quickly get complicated as soon as we trigger a kernel failure, but this is where the interesting part of the story begins.
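The steps just listed can be sketched as a small shell function. This is only an illustrative dry run, not a definitive procedure: the function name kernel_test_plan and the version string X.Y.Z are hypothetical, the tarball is assumed to be already downloaded as linux-X.Y.Z.tar.xz, and the commands are printed rather than executed so that they can be reviewed first. The last two commands normally have to be run as root.

```shell
#!/bin/sh
# Dry-run sketch: print the usual commands for trying out a kernel tarball.
# Nothing is executed here; the function only writes the commands to stdout.
# kernel_test_plan is a hypothetical helper name, and the tarball name
# linux-<version>.tar.xz is an assumption about how the sources were saved.

kernel_test_plan() {
    kver="$1"    # kernel version string, e.g. the one in the tarball name
    cat <<EOF
tar -xf linux-${kver}.tar.xz
cd linux-${kver}
make oldconfig
make
make modules_install
make install
EOF
}

# Example: show the plan for a placeholder version.
kernel_test_plan "X.Y.Z"
```

After the last step, the system has to be rebooted into the new kernel and then used for some time in the usual way, which is the actual test.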
Certainly, you should be able to distinguish kernel failures from problems caused by user space processes. For this purpose it is necessary to know how the kernel is designed and how it works. There are quite a few sources of such information, like the books Understanding the Linux Kernel by Daniel Bovet and Marco Cesati (http://www.oreilly.com/catalog/understandlk/), Linux Device Drivers by Jonathan Corbet, Alessandro Rubini and Greg Kroah-Hartman (http://lwn.net/Kernel/LDD3/) or Linux Kernel Development by Robert Love (http://rlove.org/kernel_book/). A list of interesting kernel-related books is available from the KernelNewbies web page at http://kernelnewbies.org/KernelBooks. Very useful articles describing the design and operation of some important components of the Linux kernel can be found on the web pages of Linux Weekly News (http://lwn.net/Kernel/Index/). Still, the ultimate source of information on the kernel internals is its source code, although you need to know the C language quite well to be able to read it.
At this point you may be wondering if it is actually safe to use development versions of the kernel, as many people tend to think that this is likely to cause data loss or damage to your hardware. Well, this is as true as saying that if you use a knife, you can lose your fingers: you should just pay attention when you are doing it. Nevertheless, if your data are so important that you cannot afford to lose them under any circumstances, you can carry out the kernel tests on a dedicated system that is not used for any other purpose. For example, it can be installed on a separate disk or partition, so that you do not have to mount any partitions from a "stable" system while a new kernel is being tested. It is also a good idea to back up your data regularly, regardless of whether the system is a test bed or is considered "stable".