* [http://mail.nl.linux.org/kernelnewbies/2003-08/msg00347.html How to debug kernel OOPs and hangs]
How to capture dmesg
Capturing dmesg from the running system only works if your machine doesn't totally crash when you hit the bug. If the system freezes, the dmesg output may never be written to disk. In that case, your only alternative is to use either a serial console or netconsole to capture the dmesg when the crash happens.
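As a minimal sketch of capturing dmesg on a machine that stays up, the kernel log can be redirected to a file and flushed to disk. The function name save_kernel_log and the output file name are hypothetical, chosen only for illustration; on some systems reading dmesg requires root.

```shell
# save_kernel_log FILE: write the current kernel log to FILE, then ask the
# kernel to flush pending writes so the file reaches the disk.
save_kernel_log() {
    dmesg > "$1" && sync
}
```

For example, `save_kernel_log dmesg.txt` run right after reproducing the bug leaves the log in dmesg.txt.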
Kernel options to turn on
Other kernel subsystems often have debug config options you can turn on for more verbose debug output. For example, to enable verbose USB 3.0 debugging, you should turn on CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING.
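To see whether such options are already enabled in the kernel you are running, you can search its config file. This is a sketch under an assumption: many distributions install the config as /boot/config-<release>, but yours may keep it elsewhere (or expose it as /proc/config.gz). The helper name config_enabled is hypothetical.

```shell
# config_enabled OPTION [CONFIG_FILE]
# Succeeds if OPTION is set to y or m in the given kernel config file.
# The default path is a common distribution convention, not a guarantee.
config_enabled() {
    option="$1"
    config_file="${2:-/boot/config-$(uname -r)}"
    grep -Eq "^${option}=[ym]\$" "$config_file" 2>/dev/null
}

for opt in CONFIG_USB_DEBUG CONFIG_USB_XHCI_HCD_DEBUGGING; do
    if config_enabled "$opt"; then
        echo "$opt is enabled"
    else
        echo "$opt is not enabled (or the config file was not found)"
    fi
done
```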
Netconsole is a powerful Linux kernel debugging tool. The dmesg output from a machine under test is sent over an Ethernet link (as UDP packets) to another machine, so you can read the test machine's debugging messages on the screen of that other machine. Netconsole isn't good for debugging early kernel panics, which happen before the network driver and netconsole are up, but it is very useful if your new kernel driver hangs your system.
Netconsole is a kernel module, so you will need to compile a custom kernel with CONFIG_NETCONSOLE=m. If you need help compiling a custom kernel, follow the directions on KernelBuild. Which kernel you choose to compile depends on which kernel you want to reproduce the bug on. You may want to download the source of your distribution kernel, or attempt to reproduce the bug on the latest stable kernel.
First, you need to have some tools installed. You'll need netcat, ping, and (optionally) wireshark. You'll also need to have netconsole compiled as a module on the source machine (the machine under test, which sends the messages). Netconsole has to be a module so you can load it after you get the system set up.
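A quick way to check for the tools is to look each one up on the PATH. This sketch assumes the binaries are named nc, ping, and wireshark, which is common but varies between distributions (netcat in particular ships under several names). The helper name missing_tools is hypothetical.

```shell
# missing_tools NAME...: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Prints nothing if all three tools are installed.
missing_tools nc ping wireshark
```

On the source machine, `modinfo netconsole` should also succeed if the module was built for the running kernel.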
Next, on the source machine, make sure the daemon that routes kernel messages (syslog) is set up so that messages of all priority types will end up in /var/log/messages. You can do this by running
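One relevant knob, offered here as an assumption rather than as the exact command intended above, is the kernel console log level: netconsole is a console, so only messages more urgent than that level are forwarded. The sketch below reads the current level from the kernel.printk sysctl; the helper name console_loglevel is hypothetical.

```shell
# console_loglevel [FILE]: print the current console log level, which is the
# first of the four numbers in the kernel.printk sysctl.
console_loglevel() {
    awk '{ print $1 }' "${1:-/proc/sys/kernel/printk}"
}

# Raising the level to 8 lets every priority (0 through 7) reach the console.
# This needs root, so it is left as a comment to keep the sketch free of
# side effects:
#   sudo dmesg -n 8
```

Running `console_loglevel` before and after `sudo dmesg -n 8` should show the value change to 8.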
sarah@xanatos:~$ sudo ifconfig
eth1      Link encap:Ethernet  HWaddr 12:34:56:78:90:12
          inet addr:10.7.201.12  Bcast:10.7.201.255  Mask:255.255.255.0
          inet6 addr: fe80::3e97:eff:fe39:d710/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:157762 errors:0 dropped:0 overruns:0 frame:0
          TX packets:39377 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:64231841 (64.2 MB)  TX bytes:6863064 (6.8 MB)
          Interrupt:20 Memory:f2500000-f2520000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:4672 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4672 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:394237 (394.2 KB)  TX bytes:394237 (394.2 KB)
Next, on the target machine, use the same command to find its IPv4 address and MAC address. (Netconsole does not currently support IPv6 addresses.) Those are the values in the inet addr and HWaddr fields. Let's assume the IP address is 10.7.201.20 and the MAC address is 12:34:56:78:90:20.
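If you would rather not pick the two values out by eye, they can be extracted with awk. This is a sketch that assumes the older net-tools ifconfig layout shown on this page ("inet addr:" and "HWaddr"); newer ifconfig versions print a different layout and would need different patterns. The helper name iface_addrs is hypothetical.

```shell
# iface_addrs IFACE: read old-style ifconfig output on stdin and print
# "<ipv4-address> <mac-address>" for the interface named IFACE.
iface_addrs() {
    awk -v iface="$1" '
        # A line starting in column 0 begins a new interface stanza.
        /^[^ \t]/ { current = $1 }
        # The MAC address is the field that follows the word HWaddr.
        current == iface && /HWaddr/ {
            for (i = 1; i < NF; i++) if ($i == "HWaddr") mac = $(i + 1)
        }
        # The IPv4 address is the text after "addr:" on the "inet addr:" line.
        current == iface && /inet addr:/ {
            for (i = 1; i <= NF; i++) if ($i ~ /^addr:/) ip = substr($i, 6)
        }
        END { print ip, mac }
    '
}
```

On the target machine, `ifconfig | iface_addrs eth1` would then print the two values on one line (replace eth1 with the target's interface name).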
Next, on the source machine, load the netconsole module. You'll need to load the module with an extra 'netconsole' module parameter. The netconsole documentation describes how you use the module parameters to tell netconsole how to send dmesg packets to your target machine. The parameter format is currently netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]. In our example, we're leaving off the source port and source IP, since the default is fine. The source device (dev) is eth1, from our ifconfig example above. The target IP address is 10.7.201.20, and the target MAC address is 12:34:56:78:90:20, which we found from running ifconfig on the target machine. Thus, the modprobe command would be:
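Filling the example values into the parameter format above gives the following sketch. The helper name netconsole_param is hypothetical; the source port, source IP, and target port are left empty so that netconsole uses its defaults.

```shell
# netconsole_param DEV TGT_IP TGT_MAC: build the module parameter string,
# leaving source port, source IP, and target port at their defaults.
netconsole_param() {
    printf 'netconsole=@/%s,@%s/%s\n' "$1" "$2" "$3"
}

# Prints: netconsole=@/eth1,@10.7.201.20/12:34:56:78:90:20
netconsole_param eth1 10.7.201.20 12:34:56:78:90:20

# Loading the module needs root, so it is left as a comment here:
#   sudo modprobe netconsole netconsole=@/eth1,@10.7.201.20/12:34:56:78:90:20
```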
[ 2009.373932] netpoll: netconsole: local port 6665
[ 2009.373941] netpoll: netconsole: local IP 0.0.0.0
[ 2009.373945] netpoll: netconsole: interface 'eth1'
[ 2009.373949] netpoll: netconsole: remote port 6666
[ 2009.373952] netpoll: netconsole: remote IP 10.7.201.20
[ 2009.373956] netpoll: netconsole: remote ethernet address 12:34:56:78:90:20
[ 2009.373962] netpoll: netconsole: local IP 10.7.201.12
[ 2009.375261] console [netcon0] enabled
[ 2009.375307] netconsole: network logging started