KernelNewbies:

What do I need to fix a bug?

  1. A bug: a specific, well-understood one.
  2. A buggy kernel.
  3. A bit of luck.


Function printk().

printk() is a very useful function, similar to printf(). It can be called almost anywhere and at almost any time; the main exception is the early stage of booting, before the video console is initialized. Each message carries a log level that tells the console how important the message is. The full list of levels:

  1. KERN_EMERG <-- the most important
  2. KERN_ALERT
  3. KERN_CRIT
  4. KERN_ERR
  5. KERN_WARNING
  6. KERN_NOTICE
  7. KERN_INFO
  8. KERN_DEBUG <-- the least important

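The console prints only messages that are more important than console_loglevel, that is, messages whose numeric level is lower than console_loglevel. A printk() call that does not specify a level uses DEFAULT_MESSAGE_LOGLEVEL, which is KERN_WARNING by default (but this may change in the future).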
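The console filter can be modelled in a few lines of user-space C. This is only a sketch of the comparison, not kernel code: the enum and the function model_console_prints() are made-up names, and the numeric values assume the usual mapping of KERN_EMERG to 0 through KERN_DEBUG to 7.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed numeric values of the log levels: lower number = more important. */
enum model_loglevel {
    MODEL_EMERG = 0,
    MODEL_ALERT,
    MODEL_CRIT,
    MODEL_ERR,
    MODEL_WARNING,
    MODEL_NOTICE,
    MODEL_INFO,
    MODEL_DEBUG            /* 7, the least important */
};

/*
 * Hypothetical helper: returns true when a message of msg_level would
 * reach the console, i.e. when its numeric level is strictly lower than
 * console_loglevel.
 */
bool model_console_prints(int msg_level, int console_loglevel)
{
    assert(msg_level >= MODEL_EMERG && msg_level <= MODEL_DEBUG);
    return msg_level < console_loglevel;
}
```

With console_loglevel set to 4, for example, model_console_prints(MODEL_ERR, 4) is true while model_console_prints(MODEL_WARNING, 4) is false, so errors reach the console but warnings only land in the log buffer.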

printk() stores its messages in a circular buffer. klogd then reads the messages from the buffer (through /proc/kmsg) and passes them to syslogd, which writes them to /var/log/messages. (You can configure syslogd by editing /etc/syslog.conf.)
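To illustrate why old messages disappear when nobody reads the buffer fast enough, here is a byte-oriented circular buffer in user-space C. It is a simplified model, not the kernel's implementation: the struct, the function names and the tiny 16-byte size are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_LOG_BUF_LEN 16   /* hypothetical, deliberately tiny */

struct model_log_buf {
    char data[MODEL_LOG_BUF_LEN];
    size_t head;               /* total number of bytes ever written */
};

/* Append msg; once the buffer is full, the oldest bytes are overwritten. */
void model_log_write(struct model_log_buf *lb, const char *msg)
{
    for (; *msg; msg++)
        lb->data[lb->head++ % MODEL_LOG_BUF_LEN] = *msg;
}

/*
 * Copy the bytes that are still in the buffer, oldest first, into out
 * as a NUL-terminated string. Returns the number of bytes copied.
 */
size_t model_log_read(const struct model_log_buf *lb, char *out, size_t out_len)
{
    size_t count = lb->head < MODEL_LOG_BUF_LEN ? lb->head : MODEL_LOG_BUF_LEN;
    size_t start = lb->head - count;
    size_t i;

    assert(out_len > count);   /* room for the data plus the terminator */
    for (i = 0; i < count; i++)
        out[i] = lb->data[(start + i) % MODEL_LOG_BUF_LEN];
    out[count] = '\0';
    return count;
}
```

Writing "first\n" and then "second message\n" into a zero-initialized buffer leaves only the last 16 bytes readable: the word "first" has been overwritten by the newer message.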

Examples:

From: Linux/arch/mips/sgi-ip27/ip27-berr.c

#if 1
        printk("FIXME: disabling master aborts\n");
        csrs->POx_MSK_HEI.csr &= ~(3UL << 14);
#endif

Error oops.

An oops is a report of a bug in the kernel. When an oops occurs, the kernel prints the contents of the registers and a backtrace. An oops does not necessarily mean the system has crashed, because the system can sometimes recover from the error. If it cannot recover, the kernel will panic and stop running. By default the backtrace contains only the addresses of the functions that were called. If you compile your kernel with CONFIG_KALLSYMS=y, the oops is decoded and displays the function names instead. With a 2.4 kernel you can run ksymoops file_with_oops.txt to see the names of the functions.
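Decoding an address into a function name amounts to finding the last symbol that starts at or before that address. The following user-space C sketch shows the idea; the symbol table, its addresses and the names (demo_init and so on) are invented for illustration, and a real oops also prints the function's size, which this model omits.

```c
#include <stddef.h>
#include <stdio.h>

struct model_symbol {
    unsigned long addr;        /* start address of the function */
    const char *name;
};

/* Hypothetical symbol table, sorted by ascending address. */
const struct model_symbol model_symtab[] = {
    { 0x1000, "demo_init" },
    { 0x1400, "demo_work" },
    { 0x2000, "demo_exit" },
};
#define MODEL_NSYMS (sizeof(model_symtab) / sizeof(model_symtab[0]))

/*
 * Format addr as "name+0xoffset", using the last symbol whose start
 * address is <= addr. Returns 0 on success, or -1 if addr lies below
 * the first symbol in the table.
 */
int model_symbolize(unsigned long addr, char *out, size_t out_len)
{
    const struct model_symbol *best = NULL;
    size_t i;

    for (i = 0; i < MODEL_NSYMS; i++) {
        if (model_symtab[i].addr > addr)
            break;
        best = &model_symtab[i];
    }
    if (!best)
        return -1;
    snprintf(out, out_len, "%s+0x%lx", best->name, addr - best->addr);
    return 0;
}
```

With this table, the address 0x1404 is reported as demo_work+0x4, which is far easier to look up in the source than the bare number.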

http://www.urbanmyth.org/linux/oops/slides.html <-- useful link

Additional compiling options.

These options are very useful when debugging the kernel:

CONFIG_PREEMPT=y

CONFIG_DEBUG_KERNEL=y

CONFIG_KALLSYMS=y

CONFIG_DEBUG_SPINLOCK_SLEEP=y (named CONFIG_DEBUG_ATOMIC_SLEEP in newer kernels)

CONFIG_MAGIC_SYSRQ=y

Causing errors and printing extra information.

  1. Sometimes you will want to see oops information when a particular bug is hit. Use BUG() or BUG_ON():

    if (bad_thing)
            BUG();

    or, equivalently:

    BUG_ON(bad_thing);

Examples:

From: Linux/arch/arm/plat-omap/dma.c

        if (omap_dma_in_1510_mode()) {
                printk(KERN_ERR "DMA linking is not supported in 1510 mode\n");
                BUG();
                return;
        }

        BUG_ON(lcd_dma.active);
  2. Sometimes you will want to see oops information and then stop the system. Use panic():

    if (terrible_error)
            panic("var = %ld\n", var);

  3. Sometimes you will want to see the stack. Use dump_stack():

    if (debug_check)
            dump_stack();

Examples:

From: Linux/arch/cris/arch-v32/kernel/dma.c

        if (options & DMA_PANIC_ON_ERROR)
                panic("request_dma error!");

From: Linux/drivers/scsi/hosts.c

        if (!sht->detect) {
                printk(KERN_WARNING "scsi_register() called on new-style "
                                    "template for driver %s\n", sht->name);
                dump_stack();
        }

Magic SysRq Key.

If your kernel was built with CONFIG_MAGIC_SYSRQ=y and SysRq is enabled at run time (for example with 'echo 1 > /proc/sys/kernel/sysrq'), you can use the SysRq key, which on PPC and i386 is 'Alt+PrintScreen'.

  1. SysRq+b Restart the computer
  2. SysRq+c Crash the system and take a crashdump
  3. SysRq+d Dump all locks that are held
  4. SysRq+e Send SIGTERM to all tasks (except init!)
  5. SysRq+f Call the OOM killer to kill a memory hog, but do not panic
  6. SysRq+g Used by kgdb (kernel debugger)
  7. SysRq+h Help
  8. SysRq+i Send SIGKILL to all tasks (except init!)
  9. SysRq+j Forcibly "just thaw it": thaw filesystems that are frozen by the FIFREEZE ioctl
  10. SysRq+k Kill all tasks run from this console
  11. SysRq+l Show a stack backtrace for all active CPUs
  12. SysRq+m Dump current memory info on the console
  13. SysRq+n Used to make RT tasks nice-able
  14. SysRq+o Shut the system down
  15. SysRq+p Print CPU registers on the console
  16. SysRq+q Dump per-CPU lists of all armed hrtimers and info about all clockevent devices
  17. SysRq+r Change keyboard mode from RAW to XLATE
  18. SysRq+s Attempt to sync all mounted filesystems
  19. SysRq+t Show current task info on the console
  20. SysRq+u Attempt to remount all mounted filesystems read-only
  21. SysRq+v Forcefully restore the framebuffer console; do an ETM buffer dump [ARM-specific]
  22. SysRq+w Dump tasks that are in uninterruptible (blocked) state
  23. SysRq+x Used by the xmon interface [PPC, PowerPC]; show global PMU registers [Sparc64]; dump all TLB entries [MIPS]
  24. SysRq+y Show global CPU registers [Sparc64]
  25. SysRq+z Dump the ftrace buffer
  26. SysRq+N N = '0' - '9': sets the console log level ('0' => emergency only; '9' => everything)

Note that any user can use the SysRq keys, and that they may not work properly on an unstable system.
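The same commands can also be issued without a keyboard by writing the command letter to /proc/sysrq-trigger, a kernel interface not covered in the list above. Below is a minimal user-space C sketch; the function name sysrq_trigger() and its path parameter are made up for illustration, and the path is a parameter so that the helper can be pointed at an ordinary file while experimenting.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write a single SysRq command character to the
 * given trigger file. Returns 0 on success and -1 on any failure
 * (for example, when the caller lacks permission to open the file).
 */
int sysrq_trigger(const char *trigger_path, char command)
{
    FILE *f = fopen(trigger_path, "w");

    if (!f)
        return -1;
    if (fputc(command, f) == EOF) {
        fclose(f);
        return -1;
    }
    /* fclose() flushes the byte; a failed flush counts as a failure. */
    return fclose(f) == 0 ? 0 : -1;
}
```

Called as root with sysrq_trigger("/proc/sysrq-trigger", 'm'), this would ask the kernel to dump memory info, just like SysRq+m. Be careful with letters such as 'b' or 'c', which reboot or crash the machine immediately.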

How to use debuggers?

Before I start talking about debuggers, you should know one thing: Linus discourages the use of debuggers, because debuggers don't always tell the truth.

https://sourceware.org/gdb/documentation/ <-- documentation for gdb

http://kgdb.wiki.kernel.org/

https://www.kernel.org/doc/html/latest/dev-tools/kgdb.html

When all else fails.

No one likes bugs. If you have spent hours or days trying to fix a bug without success, write a short, descriptive email containing all of the information you have found and send it to LKML. Good luck with bug hunting.

KernelNewbies: KernelHacking-HOWTO/Debugging_Kernel (last edited 2021-01-14 22:12:27 by RandyDunlap)