How do I apply a patch ?

The answer depends on how the patch was created,
specifically on which directory the diff was generated from.
In general, though, patches are made relative to the root of the source tree
(/usr/src/linux), and the following assumes this.
For example's sake, say you have unpacked a tarball of Linux 2.4.0,
and you want to apply Linus' patch-2.4.1.bz2, which you
have placed in /usr/src. Do the following :
cd /usr/src/linux
bzip2 -dc /usr/src/patch-2.4.1.bz2 | patch -p1 --dry-run
We used the --dry-run option to check that the patch applies cleanly
without actually changing any files.
This can be a life-saver, as it can be a real pain to
back out a partially-applied patch. The -p1 option
strips the first component from the pathname of each changed file in the diff (see
the patch(1) manpage for more details). Now that you've checked that it
should apply cleanly, do :
bzip2 -dc /usr/src/patch-2.4.1.bz2 | patch -p1
to actually apply it. You're done !
This is even simpler with Linus' standard patches, as you can use
the script linux/scripts/patch-kernel to apply the patches for you automatically.
The situation with other patches is not always so simple. For example,
Linus' pre patches (found in pub/linux/kernel/testing) are not incremental.
That is, a pre10.bz2 patch must be applied on top of the tarball of the previous
full release kernel. E.g., patch-2.4.8-pre2 goes on top of an unpacked 2.4.7 tarball,
*not* on top of a patched 2.4.8-pre1 kernel. If you have a 2.4.8-pre1 kernel,
you can get back to 2.4.7 by following the section 'Reversing a patch' below.
Alan Cox's ac patches (pub/linux/kernel/people/alan/)
follow the same method, unless you get the incremental patches from bzimage.org.
Occasionally you may want to test a patch from linux-kernel
or similar. Generally these will be incremental against the named version (so, say,
2.4.0-test1-ac22-hosedmm.diff should be applied against 2.4.0-test1-ac22),
and relative to the root. You may need to play with the -p option.
Reversing a patch
You've applied several patches, and now you want to remove them. Simply use the -R
option to patch, with the same patch file, to back out each patch, working in the
reverse of the order in which they were applied (alas, the patch(1)
manpage is less than clear on this).

How do I compile a kernel ?

(These instructions assume we are installing
version 2.6.0 of the kernel; replace all instances
with the version you are trying to build. These
instructions are also x86-specific; other architectures'
build procedures may differ.)
- Download your tarball from ftp.XX.kernel.org
where XX is your country code. If there isn't a
mirror for your country, just pick a nearby one.
- Unpack the tarball in your /usr/src directory:
bzip2 -dc linux-2.6.0.tar.bz2 | tar xvf -
(Replace bzip2 with gzip if you downloaded the .gz)
- cd into the linux directory.
- You'll now need to configure the kernel to select
the features you want/need. There are several ways
to do this:
a. make config
(Command-line questions)
b. make oldconfig
(Useful only if you kept a .config from a previous
kernel build)
c. make menuconfig
(ncurses based)
d. make gconfig
(GTK+ based X Window System configuration)
e. make xconfig
(Qt based X Window System configuration)
- Now we can build the kernel (for older kernels like 2.4.x, first build
the dependencies with "make dep").
make
- Wait. When it's finished, it will have built both the kernel (bzImage)
and the modules (for older kernels like 2.4.x, you need to run
"make bzImage ; make modules").
- Become root to be able to install the modules and the kernel. Everything before
this point can and should be done as a normal user; there is really no need
to be root to compile a kernel. It's actually a very bad idea to do
everything as root, because root is too powerful: a single mistake is enough
to ruin your system completely.
- Install the modules.
make modules_install
- Install the new kernel:
cp arch/i386/boot/bzImage /boot/vmlinuz-2.6.0
cp System.map /boot/System.map-2.6.0
- Edit /etc/lilo.conf, and add these lines:
image = /boot/vmlinuz-2.6.0
label = 2.6.0
Copy your root=/dev/??? line here too.
- Run /sbin/lilo, reboot, and enjoy. If you get modversion problems
(symbols ending in _Rxxxxxxxx), have a look at
this question in the linux-kernel
mailing list FAQ to solve the problem.
Still not getting it? Try this
more in-depth tutorial.

How does get_current() work ?

static inline struct task_struct * get_current(void)
{
    struct task_struct *current;
    __asm__("andl %%esp,%0; ":"=r" (current) : "0" (~8191UL));
    return current;
}
get_current() is a routine for getting access to the
task_struct of the currently executing task. It uses
the often confusing inline assembly features of GCC to do this,
as follows :
| __asm__(
This signifies a piece of inline assembly that the compiler must insert into
its output code. __asm__ is the same as asm, but can't be disabled by
command line flags.
| "andl %%esp,%0
"%%" is an escape sequence that produces a literal "%" in the emitted assembly.
"%0" is a placeholder that expands to the first input/output specification.
So in this case, the instruction takes the stack pointer (register %esp) and ANDs it
into a register that contains 0xFFFFE000, leaving the result in that register.
Basically, a task's task_struct and its kernel stack occupy an 8KB
block that is 8KB aligned, with the task_struct at the beginning and the stack
growing from the end downwards. So you can find the task_struct by clearing
the bottom 13 bits of the stack pointer value.
| ; "
The semicolon can be used to separate assembly statements, as can the newline
character escape sequence ("\n").
| :"=r" (current)
This specifies an output constraint (all of which occur after the first colon,
but before the second). The '=' marks it as an output. The 'r'
indicates that a general purpose register should be allocated so that the
instruction can place the output value into it. The expression inside the brackets,
'current', is the destination (normally a local variable) that receives the
output value once the asm statement has finished.
| : "0" (~8191UL));
This specifies an input constraint (all of which occur after the second colon,
but before the third). The '0' references another constraint (in this case,
the first output constraint), saying that the same register or memory location
should be used for both. The '~8191UL' inside the brackets is a constant that
is loaded into the register allocated for the output value before
the instructions inside the asm block run.
See also the gcc info pages, Topic "C Extensions", subtopic "Extended Asm".
(Mostly courtesy of David Howells of Red Hat).
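The masking step can be tried out in ordinary userspace C. The following is only a sketch of the arithmetic, not kernel code: block_base() and STACK_BLOCK_SIZE are hypothetical names made up for this illustration, and the 8KB block size is the one assumed in the explanation above.

```c
#include <stdint.h>

/* Size of the combined task_struct + kernel stack block described above. */
#define STACK_BLOCK_SIZE 8192UL

/*
 * Hypothetical helper (not a kernel function): given any address inside an
 * 8KB-aligned block, return the address of the start of that block by
 * clearing the low 13 bits, which is what the andl instruction does to %esp.
 */
static inline uintptr_t block_base(uintptr_t addr_in_block)
{
    return addr_in_block & ~(uintptr_t)(STACK_BLOCK_SIZE - 1);
}
```

For instance, block_base(0xC0123ABC) gives 0xC0122000, since only bits 13 and up survive the mask.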

Can I use library functions in the kernel ?

System libraries (such as glibc, libreadline, libproplist, whatever)
that are typically available to userspace programmers are unavailable to
kernel programmers. When a process is being loaded, the loader will
automatically load any dependent libraries into the address space of the
process. None of this mechanism is available to kernel programmers.
Forget about ISO C libraries: the only things available are what is
already implemented (and exported) in the kernel and what you can
implement yourself.
Note that it is possible to "convert" libraries to work in the kernel;
however, they won't fit well, the process is tedious and error-prone,
and there might be significant problems with stack handling (the kernel
is limited to a small amount of stack space, while userspace programs
don't have this limitation) causing random memory corruption.
Many of the commonly requested functions have already been implemented
in the kernel, sometimes in "lightweight" versions that aren't as
featureful as their userland counterparts. Be sure to grep the headers
for any functions you might be able to use before writing your own
version from scratch. Some of the most commonly used ones are in
include/linux/string.h.
Whenever you feel you need a library function, you should consider your
design, and ask yourself if you could move some or all the code into
user-space instead.
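To give a feel for "what you can implement yourself", here is a sketch of a small bounded string copy written without any libc calls. The name bounded_copy() is hypothetical and chosen for this example only; as advised above, check include/linux/string.h for an existing helper before writing something like this.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy at most size-1 bytes of src into dst and always
 * NUL-terminate dst when size > 0. Returns the length of src, so a return
 * value >= size means the copy was truncated. No libc calls are used.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len = 0;

    /* Measure the source by hand instead of calling strlen(). */
    while (src[len] != '\0')
        len++;

    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        size_t i;

        for (i = 0; i < n; i++)
            dst[i] = src[i];
        dst[n] = '\0';
    }
    return len;
}
```

Note that the function keeps its locals to a few words, which matters given the small kernel stack mentioned above.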

What is asmlinkage ?

The asmlinkage tag is one other thing that we should observe about this
simple function. This is a #define for some gcc magic that tells the
compiler that the function should not expect to find any of its arguments in
registers (a common optimization), but only on the CPU's stack. Recall our
earlier assertion that system_call consumes its first argument, the system
call number, and allows up to four more arguments that are passed along to
the real system call. system_call achieves this feat simply by leaving its
other arguments (which were passed to it in registers) on the stack. All
system calls are marked with the asmlinkage tag, so they all look to the
stack for arguments. Of course, in sys_ni_syscall's case, this doesn't make
any difference, because sys_ni_syscall doesn't take any arguments, but it's
an issue for most other system calls. And, because you'll be seeing
asmlinkage in front of many other functions, I thought you should know what
it was about.
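As a rough illustration of the "gcc magic": on 32-bit x86 kernels of this era, asmlinkage boils down to roughly __attribute__((regparm(0))), which tells gcc to pass no arguments in registers. The sketch below uses a hypothetical macro name, my_asmlinkage, and a made-up function, demo_syscall(), so that it compiles in userspace; on other architectures the attribute does not apply, so the macro is left empty there.

```c
/*
 * Sketch only: my_asmlinkage is a hypothetical stand-in for the kernel's
 * asmlinkage. regparm is a 32-bit x86 attribute, so it is guarded here.
 */
#if defined(__i386__)
#define my_asmlinkage __attribute__((regparm(0)))
#else
#define my_asmlinkage /* regparm(0) is only meaningful on 32-bit x86 */
#endif

/*
 * Made-up function: with the tag applied on i386, the compiler expects
 * both arguments on the stack rather than in registers.
 */
my_asmlinkage long demo_syscall(long fd, long count)
{
    return fd + count;
}
```

The tag changes only the calling convention; callers written in C look exactly the same as for any other function.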

How do I intercept system calls ?

You should probably use something like the Linux Trace Toolkit.
There is also a horrible hack based on modifying entries in the system
call table. This is strongly discouraged: it is not safe against
module unloading, it is not architecture independent, and it is just
ugly anyway.
Having said that, it seems to be a common task for those learning their
way around kernel hacking. Check out the syscalltrack module for some code
that actually does this.
Basically, each pointer value of interest in the global sys_call_table is modified to
point to a new address supplied by the kernel module. In this way, when
a process makes a system call, it will end up in your routine. You can
then call the old value saved from the system call table to actually
process the meat of the request after collecting whatever info you need.
This fails horribly for the execve system call, and there's a very good
reason for this. Let's look at the prototype of sys_execve() :
asmlinkage int sys_execve(struct pt_regs regs)
Note the argument: it is passed by value, not as a pointer ! Your attempt to
intercept sys_execve in the same way is not going to work. This argument
indicates that the process's registers have been saved on the stack.
Code inside sys_execve actually modifies these stack locations so that
the saved PC register points at the start of the new executable, so you must
let the code access the original location on the stack !
For example code that modifies the registers, see
start_thread() called from load_elf_binary().
The simplest way to get around this problem is to call do_execve()
instead of the saved old sys_execve pointer value, duplicating the
kernel's sys_execve() code. Ugly, huh ? Please don't ever do this in real
code. If you want to provide some code in a module that kernel code
needs to call, provide a hook in the kernel code as a patch, then a
module on top of that (an example of this is sys_nfsservctl()).
Note that Linus removed the export of sys_call_table in 2.5 kernels.
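The save-and-replace idea behind the table hack can be shown with a plain userspace table of function pointers. This is only an analogy under stated assumptions: call_table, hooked_double() and the other names are hypothetical, and nothing here touches the real sys_call_table.

```c
/* Hypothetical stand-in for a call table: an array of function pointers. */
typedef long (*call_fn)(long arg);

static long real_double(long arg) { return arg * 2; }
static long real_negate(long arg) { return -arg; }

static call_fn call_table[] = { real_double, real_negate };

static call_fn saved_entry;      /* original pointer, kept so we can chain to it */
static long    intercept_count;  /* the "info" collected by the hook */

/* Replacement routine: record the call, then let the original do the work. */
static long hooked_double(long arg)
{
    intercept_count++;
    return saved_entry(arg);
}

/* Install the hook by saving the old slot value and overwriting it. */
static void install_hook(void)
{
    saved_entry   = call_table[0];
    call_table[0] = hooked_double;
}

/* Put the original pointer back (the step that is hard to get right on unload). */
static void remove_hook(void)
{
    call_table[0] = saved_entry;
}
```

After install_hook(), a call through call_table[0] still returns the doubled value but also bumps intercept_count; remove_hook() restores the original behaviour.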

What about the ac (Alan Cox) series of patches ?

- What's the difference with Linus' kernels?
Alan's kernel patchset usually has a lot of fixes for the latest 2.6 stable kernel.
It contains essential patches from Linus' BitKeeper tree that
missed the previous stable release. Some patches are yet to be
merged, and others are Alan's own work (tty locking fixes, IDE rework).
(Note that most of the 2.4 -ac tree has been merged by Marcelo into current 2.4.
The -pac tree forward ports the other patches.)
While the above is the intent of the -ac series, it
usually also contains the bugfixes that are posted on the kernel mailing list(s) and
have been discussed and reviewed favorably there. Note that Alan's kernel patchset
tries hard to avoid any resultant incompatibilities.
- Where do I get them ?
Go to ftp.xx.kernel.org/pub/linux/kernel/people/alan/
where "xx" is your country code, e.g. "uk"
- How do I apply them ?
Example of how to patch from 2.6.9 -> 2.6.9-ac11.
(Assuming linux-2.6.9.tar.bz2 and patch-2.6.9-ac11.bz2 have both
been downloaded to the same directory.)
bzip2 -dc linux-2.6.9.tar.bz2 | tar xvf -
cd linux-2.6.9
bzip2 -dc ../patch-2.6.9-ac11.bz2 | patch -p1
- I downloaded the .gz files instead of .bz2
Same process, just a different program to unpack.
gzip -dc linux-2.6.9.tar.gz | tar xvf -
cd linux-2.6.9
gzip -dc ../patch-2.6.9-ac11.gz | patch -p1
- But I already applied an ac patch !
Then back it out first (here, going from -ac9 to -ac11) :
bzip2 -dc ../patch-2.6.9-ac9.bz2 | patch -p1 -R
bzip2 -dc ../patch-2.6.9-ac11.bz2 | patch -p1
- Yuck. Why can't there be "incremental diffs" between ac revisions ?
If you're lucky, there are, at http://www.bzimage.org/
or ftp://sunsite.icm.edu.pl/pub/Linux/kernel/incr/2.4

Where do I begin ?

A common question asked by a newbie is "I've just unpacked this huge tarball,
and I want to help out, but I don't know where to start!"
It may seem daunting to be confronted with such a large amount of source code,
but bear in mind that very few kernel hackers understand every area of the
kernel tree.
People specialise. If you're interested in TCP/IP, you won't need to
read the filesystem code. Figure out what it is you want to work on, and
focus on that.
Linux is a professional-quality kernel. This makes it difficult to come up with
small "student projects" by which you can learn: often features are already implemented,
and at a level that requires a good understanding before you can hack on them.
However, there are several practical and useful things you can do until you
have learned enough to really start hacking :
- Test and benchmark
New code is constantly evolving, so benchmark it. You will certainly notice some odd behaviours: there
is your impetus to understand where the behaviour is coming from. Profile things, trace them (e.g. with LTT),
and see if you can work out what might be causing problems. You'll learn the code by accident. Try out
experimental patches posted to linux-kernel and trees like mjc's. Try to understand what a particular
patch does, and how it does it.
- Document
Sounds boring ? Maybe, but you'll be doing everybody a favour, not least yourself. Forcing yourself
to explain things crystallises your own understanding. Documenting behaviour requires you to understand
the code. You'll find code a lot easier to read if you are trying to answer a specific question.
Write articles for kernelnewbies and get them peer-reviewed in the IRC channel. Identify inaccuracies
in the current man-pages, and fix them. Add source docs to the kernel source.
- Kernel janitors
Kernel janitors is a project to fix misuse of kernel APIs as the code mutates. This can quickly get
pretty interesting. An educational talk on the project can be read here.

Are there any good IDEs? How do I handle all this code?

When dealing with a source base as large as the kernel, it certainly
helps to have software tools to help understand how the pieces fit
together. Perhaps the most important tool is a good programmer's
text editor. Popular choices are emacs and any vi clone, such as vim. Generally, text editors written for
programmers are programmable and have features such as syntax highlighting,
text folding, brace matching, text reformatting, man page lookups, and easy
integration with development tools such as make(1) and cvs(1).
Also popular are tools that quickly find uses, definitions, and
declarations of C symbols. grep(1) is almost always available, and
the more powerful version, egrep(1), is very useful to know. But
grep(1) has to search every file on every lookup. Tools
such as cscope,
freescope, etags,
ctags, and idutils
build databases to use when searching for C symbols. Each has its own
idiosyncrasies and features. Some integrate better with your text editor
of choice. (Look especially for plugins to help with integration.)
cgvg is another option, though
it doesn't appear to use a database to speed up searches.

What's going on with the kernel headers ?

On any distribution, there are two sets of kernel headers :
System kernel headers
These are the kernel headers actually used by the system. These are the headers
you compile user-space utilities against. They must be installed to compile
anything in userspace.
The headers are usually found in /usr/include/asm and /usr/include/linux.
They are copies and should never be replaced (unless you are doing a C library upgrade).
These headers contain compatibility code etc. to allow them to be used with
a variety of different running kernels, and are conceptually part of the glibc package.
They can often be found in the kernel-headers or libc6-dev RPM/package.
Kernel source headers
These are the kernel header files that are part of the kernel source package. They should
never be used for compiling user-space programs. Old Linux distributions often made
/usr/include/linux and /usr/include/asm symlinks to the right parts of
the kernel source tree installed in /usr/src/linux. This is the wrong thing to do -
userspace programs must use copies of the kernel headers, suitably modified.
Conversely, when compiling the kernel, or kernel modules, these headers must be used. This is
important when compiling externally packaged modules - the module build should look in the right
place for the headers (by e.g. adding -I/lib/modules/`uname -r`/build/include).
Read Linus' explanation
of the situation.
The kernel hackers are thinking about providing a separate, sanitised set of headers
that you can include in user space. Let's hope it will happen in linux-2.7. Here's
what Linus says.

Why do a lot of #defines in the kernel use do { ... } while(0)?

There are several reasons:
- (from Dave Miller) Empty statements give a warning from the compiler,
so this is why you see #define FOO do { } while(0).
- (from Dave Miller) It gives you a basic block in which to declare
local variables.
- (from Ben Collins) It allows you to use more complex macros in
conditional code. Imagine a macro of several lines of code like:
#define FOO(x) \
    printf("arg is %d\n", x); \
    do_something_useful(x);
Now imagine using it like:
if (blah == 2)
    FOO(blah);
This expands to:
if (blah == 2)
    printf("arg is %d\n", blah);
do_something_useful(blah);;
As you can see, the if then only encompasses the printf(),
and the do_something_useful() call is unconditional (not within
the scope of the if), which is not what you wanted. So, by using a block
like do{...}while(0), you would get this:
if (blah == 2)
    do {
        printf("arg is %d\n", blah);
        do_something_useful(blah);
    } while (0);
Which is exactly what you want.
- (from Per Persson) As both Miller and
Collins point out, you want a block statement so you can have several lines
of code and declare local variables. But then the natural thing would be to
just use, for example:
#define exch(x,y) { int tmp; tmp=x; x=y; y=tmp; }
However, that wouldn't work in some cases. The following code is meant to be
an if-statement with two branches:
if(x>y)
    exch(x,y); // Branch 1
else
    do_something(); // Branch 2
But it would be interpreted as an if-statement with only one branch:
if(x>y) { // Single-branch if-statement!!!
    int tmp; // The one and only branch consists
    tmp = x; // of the block.
    x = y;
    y = tmp;
}
; // empty statement
else // ERROR!!! "parse error before else"
    do_something();
The problem is the semi-colon (;) coming directly after the block.
The solution for this is to sandwich the block between do and
while(0). Then we have a single statement with the capabilities of
a block, but not considered as being a block statement by the compiler.
Our if-statement now becomes:
if(x>y)
    do {
        int tmp;
        tmp = x;
        x = y;
        y = tmp;
    } while(0);
else
    do_something();
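Putting the pieces together, here is a small compilable sketch of the idiom. EXCH() and sort_pair() are hypothetical names made up for this example; the macro follows the exch() example above, with the arguments parenthesised so that expressions such as *lo can be passed safely.

```c
/*
 * Swap two ints. The do { ... } while (0) wrapper turns the block into a
 * single statement that still expects the trailing semicolon at the call site.
 */
#define EXCH(x, y) do { int tmp_ = (x); (x) = (y); (y) = tmp_; } while (0)

/*
 * Hypothetical helper: make sure *lo <= *hi.
 * Returns 1 if the values were swapped, 0 if they were already in order.
 */
static int sort_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        EXCH(*lo, *hi);   /* one statement, so the else below still parses */
    else
        return 0;         /* already ordered */
    return 1;             /* a swap happened */
}
```

If EXCH() were defined with bare braces instead, the semicolon after EXCH(*lo, *hi) would end the if statement and the else would fail to compile, exactly as described above.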

Who can I find on #kernelnewbies ?

| Real name | Nick | Kernel responsibility |
| Anton Altaparmakov | AntonA | ntfs |
| Arjan van de Ven | arjan | kHTTPd, Powertweak |
| Andre Hedrick | ata | IDE guru |
| Jens Axboe | axboe | CDROM/DVD layer |
| Ralf Baechle | Bacchus | Linux-MIPS |
| Ben LaHaise | bcrl | Memory management |
| Dave Jones | davej | 2.5-dj tree maintainer, Powertweak, random hacks |
| Erik Mouw | erikm | ARM Linux, SA1100-Linux |
| f00f | f00f | Larting, jumping, and logging |
| Greg Kroah | gregkh | USB |
| Christoph Hellwig | hch | Filesystems, kbuild, kernel cleanup |
| Jeff Dike | jdike | User Mode Linux |
| Jeff Garzik | jgarzik | Network drivers, PCI, kernel cleanup |
| lxrbot | lxrbot | The channel oracle. Ask it for definitions and uses in kernel source, or query its factoid database. |
| Thiago Rondon | maluco | Random hacking |
| Marcelo W. Tosatti | marcelo | 2.4 maintainer |
| Michael J. Cohen | mjc | Maintainer of -mjc kernel tree |
| John Levon | movement | oprofile, random minor hacking |
| Fabio O. Leite | olive | drbd, High Availability, heartbeat |
| Daniel Phillips | phillips | TUX2 filesystem, ext2 improvement, memory management hacking |
| Juan Quintela | quintela | Memory management |
| Rik van Riel | riel | Memory management |
| Russell King | rmk | ARM Linux |
| Tigran Aivazian | tigran | Random hacking |
| Momchil Velikov | velco | VM hacker etc. |
| Alexander Viro | viro | VFS guru |
| William Lee Irwin | wli | VM hacking, bootmem, more |