Linux ABI
As a newbie, you must have wondered how the kernel executes compiled code. How does a compiler running in user space generate code that the kernel is able to understand and execute? The short answer: through a shared standard called ELF, which is specified as part of the System V ABI.
Among the most important features of ELF is the function calling sequence, which defines how registers are used in a function. Some registers are assigned a special meaning (frame pointer, stack pointer, etc.). ELF defines the roles and names of these special registers, including which registers might need to be saved. It also defines the stack frame: the layout of the run-time stack, including the saved registers, the return address, the frame pointer (if any), and whether the stack grows upwards or downwards. Finally, it defines the parameter passing convention (whether arguments are passed on the stack or in registers). As you might have deduced by now, all of these details are architecture specific.
If you are curious why these details matter to you, look at the following (typical) oops reported on lkml (http://lkml.org/lkml/2006/8/8/157):
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c027c3d2
*pde = 00000000
Oops: 0000 [#3]
Modules linked in: qla2xxx ext3 jbd mbcache sg ide_cd cdrom floppy
CPU: 0
EIP: 0060:[<c027c3d2>] Not tainted VLI
EFLAGS: 00010202 (2.6.17.3 #1)
EIP is at dm_put_device+0xf/0x3b
eax: 00000001 ebx: ee4fcac0 ecx: 00000000 edx: ee4fcac0
esi: ee4fc4e0 edi: ee4fc4e0 ebp: 00000000 esp: c5db3e78
ds: 007b es: 007b ss: 0068
Process multipathd (pid: 15912, threadinfo=c5db2000 task=ef485a90)
Stack: ec4eda40 c02816bd ee4fc4c0 00000000 f7e89498 f883e0bc c02816f6 f7e89480
       f7e8948c c0281801 ffffffea f7e89480 f883e080 c0281ffe 00000001 00000000
       00000004 dfe9cab8 f7a693c0 f883e080 f883e0c0 ca4b99c0 c027c6ee 01400000
Call Trace:
 <c02816bd> free_pgpaths+0x31/0x45
 <c02816f6> free_priority_group+0x25/0x2e
 <c0281801> free_multipath+0x35/0x67
 <c0281ffe> multipath_ctr+0x123/0x12d
 <c027c6ee> dm_table_add_target+0x11e/0x18b
 <c027e5b4> populate_table+0x8a/0xaf
 <c027e62b> table_load+0x52/0xf9
 <c027ec23> ctl_ioctl+0xca/0xfc
 <c027e5d9> table_load+0x0/0xf9
 <c0152146> do_ioctl+0x3e/0x43
 <c0152360> vfs_ioctl+0x16c/0x178
 <c01523b4> sys_ioctl+0x48/0x60
 <c01029b3> syscall_call+0x7/0xb
Code: 97 f0 00 00 00 89 c1 83 c9 01 80 e2 01 0f 44 c1 88 43 14 8b 04 24 59 5b 5e 5f 5d c3 53 89 c1 89 d3 ff 4a 08 0f 94 c0 84 c0 74 2a <8b> 01 8b 10 89 d8 e8 f6 fb ff ff 8b 03 8b 53 04 89 50 04 89 02
EIP: [<c027c3d2>] dm_put_device+0xf/0x3b SS:ESP 0068:c5db3e78
As you can see, the kernel did a good job of producing a backtrace for us. Let's see how the kernel and tools are able to produce a backtrace, in case you ever need to do it yourself. The ELF convention for your architecture should give you all the details required to produce one.
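Each location in the oops is printed as symbol+offset/size, for example dm_put_device+0xf/0x3b at EIP c027c3d2. The sketch below shows the arithmetic behind that notation: subtracting the offset from the address gives the start of the function, and adding the size gives its end. The struct oops_loc type and the parse_oops_loc() helper are made-up names for illustration, not kernel interfaces.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: one "symbol+0xOFF/0xSIZE" location from an oops. */
struct oops_loc {
    char     symbol[64]; /* function name */
    uint32_t offset;     /* bytes from the start of the function */
    uint32_t size;       /* total size of the function in bytes */
    uint32_t start;      /* address where the function begins */
    uint32_t end;        /* first address past the function */
};

/*
 * Parse text such as "dm_put_device+0xf/0x3b" and combine it with the
 * address it was printed for (the EIP, or a call trace entry).
 * Returns 0 on success, -1 if the text does not have the expected shape.
 */
int parse_oops_loc(const char *text, uint32_t addr, struct oops_loc *out)
{
    unsigned int offset, size;

    /* %63[^+] reads the symbol name up to (not including) the '+'. */
    if (sscanf(text, "%63[^+]+%x/%x", out->symbol, &offset, &size) != 3)
        return -1;
    /* The offset must lie inside the function and below the address. */
    if (offset > size || offset > addr)
        return -1;

    out->offset = offset;
    out->size   = size;
    out->start  = addr - offset;
    out->end    = out->start + size;
    return 0;
}
```

For the oops above, calling parse_oops_loc("dm_put_device+0xf/0x3b", 0xc027c3d2, &loc) places dm_put_device at c027c3c3 through c027c3fe, so the faulting instruction is 0xf bytes into a 0x3b byte function.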
ELF i386 convention
Registers
General purpose
eax | return value
edx | dividend register
ecx | count register
ebx | local register variable
ebp | stack frame pointer (optional)
esi | local register variable
edi | local register variable
esp | stack pointer
Floating Point
st0 | floating point stack top, return value
st1 | floating point next to stack top
.. | ..
st7 | floating point stack bottom
Stack Frame
4n+4(ebp) | argument n | High address
... | ... |
8(ebp) | argument 1 |
4(ebp) | return address |
0(ebp) | previous ebp (optional) |
-4(ebp) | unspecified | Low address
Parameter Passing
As shown in the stack frame, the arguments are passed on the stack, at addresses above the return address.
GCC can generate code with a different calling convention, for userspace programs or for the kernel, when given the -mregparm option, but this does not change the userspace<->kernel communication style. glibc uses the previously discussed method of passing parameters whether or not you pass -mregparm to the compiler. In fact, the default configuration of recent kernels results in -mregparm=3 code, while userspace programs are almost always compiled without this option (to maintain binary compatibility; -mregparm is usually an "all or nothing" choice).
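GCC also offers a per-function counterpart of -mregparm, the regparm function attribute, which makes it easy to compare the two conventions side by side. The sketch below is only an illustration: REGPARM3, add3_reg() and add3_stack() are made-up names, and the attribute is applied only when building for i386 with a GCC-compatible compiler, so the file still compiles elsewhere.

```c
/*
 * On 32-bit x86, regparm(3) asks GCC to pass up to the first three
 * integer arguments in eax, edx and ecx instead of on the stack.
 * On other targets the (hypothetical) macro expands to nothing.
 */
#if defined(__i386__) && defined(__GNUC__)
#define REGPARM3 __attribute__((regparm(3)))
#else
#define REGPARM3
#endif

/* Arguments arrive in registers when built for i386. */
REGPARM3 int add3_reg(int a, int b, int c)
{
    return a + b + c;
}

/* Arguments arrive on the stack, above the return address, on i386. */
int add3_stack(int a, int b, int c)
{
    return a + b + c;
}
```

Both functions return the same result; only the calling sequence differs. Compiling for i386 with gcc -m32 -S and comparing the code generated for the two functions and their callers should show the difference. A caller must see the same declaration as the callee, which is one reason the option is usually an "all or nothing" choice.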
1. Quiz - Given a function foo(a, b, c), in what order are the arguments pushed:
a
b
c
or
c
b
a
and why?
i. Hint - Think ellipsis "..."
ii. Hint - Look at stdarg.h
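To explore the stdarg.h hint, here is a small variadic function. It is only a sketch, and sum_ints() is a made-up name. Notice what the callee knows: only its last named parameter. Every further argument has to be found starting from there.

```c
#include <stdarg.h>

/*
 * Sum `count` int arguments that follow the named parameter.
 * The callee has no idea how many arguments the caller really pushed;
 * it trusts `count` and steps through them one at a time.
 */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);          /* start just past the last named argument */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int); /* fetch the next int argument */
    va_end(ap);

    return total;
}
```

A call such as sum_ints(3, 10, 20, 30) returns 60. Compare where count must live relative to the return address with the stack frame table above, then revisit the quiz.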
Prologue and Epilogue
Each function call requires the setup and teardown of a stack frame. The setup is called the 'prologue' and the teardown is called the 'epilogue'.
A typical prologue (generated by gcc) is shown below
80488e4: 55        push %ebp
80488e5: 89 e5     mov %esp,%ebp
80488e7: 83 ec 08  sub $0x8,%esp
A typical epilogue (generated by gcc) is shown below; leave is equivalent to mov %ebp,%esp followed by pop %ebp
804856d: c9        leave
804856e: c3        ret
In addition to the registers shown above, the local variable registers (ebx, esi, edi) might also be pushed and popped, depending on whether they are used within the function or not.
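Because every prologue pushes the caller's ebp and then points ebp at that saved value, the frames form a linked list: 0(ebp) holds the previous ebp and 4(ebp) holds the return address. The sketch below walks such a chain through a captured stack image. It is a simplified illustration, not the kernel's unwinder: walk_frames() and its parameters are made-up names, and it assumes every function on the stack was compiled with a frame pointer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a chain of saved frame pointers in a captured i386 stack image.
 *
 *   stack   : 32-bit words of the dump, lowest address first
 *   nwords  : number of words in stack[]
 *   base    : address of stack[0]
 *   ebp     : frame pointer of the innermost frame
 *   out     : receives the return addresses, innermost caller first
 *   max_out : capacity of out[]
 *
 * Returns the number of return addresses stored in out[].
 */
size_t walk_frames(const uint32_t *stack, size_t nwords, uint32_t base,
                   uint32_t ebp, uint32_t *out, size_t max_out)
{
    size_t n = 0;

    while (n < max_out) {
        size_t idx;
        uint32_t prev;

        /* ebp must be word aligned and point inside the dump. */
        if (ebp < base || (ebp - base) % 4 != 0)
            break;
        idx = (ebp - base) / 4;
        if (idx + 1 >= nwords)
            break;

        out[n++] = stack[idx + 1]; /* 4(ebp): return address */
        prev = stack[idx];         /* 0(ebp): caller's saved ebp */

        /* The stack grows downwards, so callers sit at higher addresses. */
        if (prev <= ebp)
            break;
        ebp = prev;
    }
    return n;
}
```

Each return address found this way can then be looked up in the symbol table to obtain the symbol+offset form shown in the oops. When a function was built without a frame pointer the chain is broken, and tools have to fall back on scanning the stack for words that look like kernel text addresses.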
1. Quiz - generate a backtrace from the following stack
0xf3e2de34 f3e2de70 c0135351 401ef021 00000000 p.bsQS...p......
0xf3e2de44 f3e2de81 00000021 f3e2c000 f7950924 ..bs......bs...w
0xf3e2de54 00000000 401ef000 00000246 00000246 .....p..F...F...
0xf3e2de64 00000001 00000001 f7950000 f3e2df18 ...........w..bs
0xf3e2de74 c02898b5 322d7875 23203235 00007820 5...ux.252...x..
0xf3e2de84 f3e2de98 00000000 f7950bd0 bffff0ec ..bs....P..wlp..
0xf3e2de94 c70cf660 00000000 00000000 bffff0ec .v.G........lp..
0xf3e2dea4 f3e2c000 f7950930 7fffffff 00000000 ..bs0..w........
0xf3e2deb4 00000000 00000001 00000000 bffff07b .............p..
0xf3e2dec4 f5cfd880 bffff07b 00000000 f5f24740 .XOu.p.......Gru
0xf3e2ded4 c0124f87 00000000 00000000 00200200 .O..............
0xf3e2dee4 f3e2defc 067b3067 00000000 f5f24740 ..bsg0.......Gru
0xf3e2def4 c0124f87 f7950934 f7950934 c028331b .O..4..w4..w.3..
0xf3e2df04 00000000 c01b0fe8 f5cfd880 f7950000 ....h....XOu...w
0xf3e2df14 fffffffb f3e2df50 c0283642 00000001 ....P.bsB6......
0xf3e2df24 00000001 00000001 f3e2c000 f6fe30d0 ..........bsP0.v
i. Hint - the current function can always be determined from EIP
References
TODO
- Add other architectures
- Complete references