== test_wp_bit, or how exceptions work ==

During early bootup on the i386 architecture, the kernel checks whether the CPU enforces the write protect (WP) bit while running in ring 0 (supervisor mode). This happens early in the kernel boot, and looks like this:

{{{
Checking if this processor honours the WP bit even in supervisor mode... Ok.
}}}

Linux tests this by trying to write to a read-only page, and checking whether that write operation fails. Not very spectacular, huh? Think again...

A write to a read-only page normally results in a page fault CPU exception. If the page fault handler merely noticed that the page is a kernel page that is already present and then returned, the faulting instruction would be restarted. Of course, the page would still be read-only, and your kernel would get stuck in an infinite loop.

What we need to do to finish the boot process is skip to the next instruction, instead of retrying the instruction that caused the fault. This is done by telling the exception handler that if an exception occurs at a certain address, all the kernel should do is jump to the fixup address registered for that address.

In the `do_test_wp_bit` function below, this is indicated with the `__ex_table` section bit. If the CPU triggers an exception at the address of the instruction at label `1:`, the kernel should jump to the instruction at label `2:`. In this case, that is the end of the code. The `flag` variable starts out as 1 and is only cleared by the `xorl` if the write at `1:` succeeds, so the function returns 1 when the CPU honours the WP bit and 0 when it does not.

{{{
static int noinline do_test_wp_bit(void)
{
	char tmp_reg;
	int flag;

	__asm__ __volatile__(
		"	movb %0,%1	\n"	/* read a byte from the test page */
		"1:	movb %1,%0	\n"	/* write it back; faults if WP is honoured */
		"	xorl %2,%2	\n"	/* only reached if the write succeeded: flag = 0 */
		"2:			\n"	/* fixup target: execution resumes here after a fault */
		".section __ex_table,\"a\"\n"
		"	.align 4	\n"
		"	.long 1b,2b	\n"	/* table entry: fault at 1 -> continue at 2 */
		".previous		\n"
		:"=m" (*(char *)fix_to_virt(FIX_WP_TEST)),
		 "=q" (tmp_reg),
		 "=r" (flag)
		:"2" (1)
		:"memory");

	return flag;
}
}}}

== How it works ==

In `arch/i386/kernel/entry.S` you will see a number of exception handling entry points, which get triggered when the CPU raises a certain kind of exception. In this case we have an attempted write to a read-only page, so we get a page fault.
{{{
KPROBE_ENTRY(page_fault)
	RING0_EC_FRAME
	pushl $do_page_fault
	CFI_ADJUST_CFA_OFFSET 4
	jmp error_code
	CFI_ENDPROC
	.previous .text
}}}

The function `do_page_fault` calls the exception fixup code when a fault happens in kernel context.

{{{
no_context:
	/* Are we prepared to handle this kernel fault? */
	if (fixup_exception(regs))
		return;
}}}

As you can imagine, the real magic is done in `fixup_exception`. To be precise, it searches the exception table for the address of the faulting instruction. If that address is in the table, the saved instruction pointer (`regs->eip`) is replaced by the fixup address before returning from the page fault handler, so execution resumes at the fixup address instead of retrying the faulting instruction.

{{{
int fixup_exception(struct pt_regs *regs)
{
	const struct exception_table_entry *fixup;

	fixup = search_exception_tables(regs->eip);
	if (fixup) {
		regs->eip = fixup->fixup;
		return 1;
	}

	return 0;
}
}}}

As you can guess by now, the exception table is just a table of addresses of instructions that are expected to raise exceptions, each paired with the fixup address that should be put in place.

{{{
struct exception_table_entry
{
	unsigned long insn, fixup;
};
}}}

I guess this magic isn't so magic after all...

----
["CategoryFAQ"]
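
== A user-space sketch of the lookup ==

The lookup-and-replace step performed by `fixup_exception` can be tried out in plain user-space C. The following is only a minimal sketch, not kernel code: `struct fake_regs`, the made-up addresses in `ex_table`, and the helpers `search_table` and `fixup_fake_exception` are all hypothetical names introduced for illustration. The sketch also assumes the table is sorted by `insn` so that a binary search can be used.

```c
#include <assert.h>
#include <stddef.h>

/* Same two-field layout as the i386 struct shown above. */
struct exception_table_entry {
	unsigned long insn, fixup;
};

/* Hypothetical stand-in for struct pt_regs: only the saved instruction pointer. */
struct fake_regs {
	unsigned long eip;
};

/* Made-up addresses for illustration, sorted by insn (an assumption of this sketch). */
static const struct exception_table_entry ex_table[] = {
	{ 0x1000UL, 0x1008UL },
	{ 0x2000UL, 0x2010UL },
	{ 0x3000UL, 0x3004UL },
};

/* Binary search for an entry whose insn equals addr; NULL if there is none. */
static const struct exception_table_entry *
search_table(const struct exception_table_entry *table, size_t n,
	     unsigned long addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].insn == addr)
			return &table[mid];
		if (table[mid].insn < addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/*
 * Mirrors the shape of fixup_exception: if the faulting address is in the
 * table, redirect the saved instruction pointer to the fixup address and
 * return 1; otherwise leave it alone and return 0.
 */
static int fixup_fake_exception(struct fake_regs *regs)
{
	const struct exception_table_entry *fixup;

	assert(regs != NULL);
	fixup = search_table(ex_table, sizeof ex_table / sizeof ex_table[0],
			     regs->eip);
	if (fixup) {
		regs->eip = fixup->fixup;
		return 1;
	}
	return 0;
}
```

Calling `fixup_fake_exception` on a `fake_regs` whose `eip` matches one of the `insn` values rewrites `eip` to the paired `fixup` value and returns 1. Any other address is left untouched and the function returns 0, which in the real kernel means the fault was not an expected one.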