KernelNewbies:

Kernel Janitor's List Maintained by Arnaldo Carvalho de Melo <acme at conectiva dot com dot br>

Please send additions/fixes to kernel-janitor-discuss@lists.sourceforge.net Last Updated in: Wed May 30 18:04:19 EST 2001

This TODO list is maintained in CVS, you can get the latest version in http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/kernel-janitor/kernel-janitor

You should look also at http://linux24.sourceforge.net for more things to do.

E-mail addresses are masked to avoid spammers.


- get rid of check_region, use just request_region checking its return (2.2


- uninitialize static variables initialized to 0, to make it go to the .bss,


- request_irq needs to be checked


- check all init_etherdev return

- check all kmalloc, vmaloc, skb_alloc, etcalloc


- register_netdev has to be checked as well


- release previously successful allocations on failure


- convert drivers to new PCI API


- remove uneeded historic code


- proc_register() is dead. Use create_proc_read_entry() instead. (from Al Viro on lkml)


- check proc_*_create result, again, it can fail, and is common (bad) practice


- check freeing skbs with kfree instead of kfree_skb


- check drivers/ide/ide-probe.c, full of unchecked kmallocs


- get rid of panic function in drivers (watchdogs need to use machine_restart


- get rid of isa_read/write[bwl], use ioremap instead

Note there is a ioremap step at driver init time, and an iounmap step at driver shutdown time. isa_xxx simply doesn't have the remap/unmap stuff.

Read Documentation/IO-mapping.txt for more details.


- sed s/return EWHATEVER/return -EWHATEVER/


- check misc_register return (yes, it can fail, murphy's law applies here as


- converting cli to spinlocks (look at net/netrom/*.c, etc)


- check that net_device interrupt functions use dev_kfree_skb_irq and not


- get rid of verify_area with copy_*_user get/put_user, only needed if


Date: Wed, 14 Feb 2001 12:51:53 +0100 From: Manfred Spraul <manfred at colorfullife dot com>

bug in network drivers:

* dev->mem_start: NULL means "not command line configuration" 0xffffffff means "default". several drivers only check for NULL, not for 0xffffffff.

And then: Date: Wed, 14 Feb 2001 05:54:34 -0600 (CST) From: Jeff Garzik <jgarzik at mandrakesoft dot mandrakesoft dot com>

netdev->mem_start is unsigned long... Should the test be for ~0 instead? The value 0xFFFFFFFF seems wrong for 64-bit machines.



- prumpf suggested:


- prumpf suggested, rmk agreed


Maybe: - check drivers/scsi/ips.c for resource leaks (ips_release doesn't seems


From: Hans Grobler <grobh at sun dot ac dot za>

- audit an ioctl function to make sure there is no way a user can crash

- check for balanced *_lock_* and *_unlock_* along all paths in a function

- check for dev_close calls without rntl_lock held (cases assertion

- check that a freed pointer (*kfree_*) is not used again.

- ioremap must be balanced by an iounmap (this causes a memory leak).


From: Rasmus Andersen <rasmus at jaquet dot dk>

Stuff I observed in drivers/scsi:

o no check of scsi_register's return value o get_free_pages should be added to the list of *allocs to check o check for matching pairs of pci_alloc_consistent and pci_free_consistent


From: Jeff Garzik <jgarzik at mandrakesoft dot com>

* ALL PCI drivers should call pci_enable_device --before-- reading pdev->irq or pdev->resource[]. irq and resource[] may not have correct values until after PCI hotplug setup occurs at pci_enable_device() time. Many PCI drivers need to be evaluated and checked for this. Many of these fixed already at: http://www.lib.uaa.alaska.edu/linux-kernel/archive/2001-Week-11/0150.html

* Net drivers should call SET_MODULE_OWNER instead of MOD_{INC,DEC}_USE_COUNT

* Net drivers should set dev->last_rx immediately after netif_rx


From: David Woodhouse <dwmw2 at infradead dot org>

- Anything which uses sleep_on() has a 90% chance of being broken. Fix

Explanation by Andi Kleen <ak at suse dot de> lkml thread at: http://boudicca.tux.org/hypermail/linux-kernel/2001week05/0305.html

> don't even know WHY it's bad. Not only that, but what am I supposed to use > instead?

You can miss wakeups. The standard pattern is:

When you want to handle signals you can check for them before or after the condition check. Also use TASK_INTERRUPTIBLE in this case.

When you need a timeout use schedule_timeout().

For some cases you can also use the wait_event_* macros which encapsulate that for you, assuming condition can be tested/used lockless.

An alternative is to use a semaphore, although that behaves a bit differently under load.

From: David Woodhouse <dwmw2 at infradead dot org>

> Ok, so how should this code have been written?

DECLARE_WAIT_QUEUE(wait, current);

while(1) {

}

Alternatively, you could up() a semaphore for each task that's do be done, and down() it again each time you remove one from the queue.


From: David Woodhouse <dwmw2 at infradead dot org>

- Oh, and there are still kernel threads which don't use up_and_exit() for


From: Andrew Morton <andrewm at uow dot edu dot au>

Here - have about 300 bugs:

A lot of the timer deletion races are hard to fix because of the deadlock problem.


Date: Mon, 26 Feb 2001 16:53:50 -0800 (PST) From: "David S. Miller" <davem at redhat dot com>

Andrew Morton writes:

This is illegal and broken and will never work.

Once you give an skb to netif_rx() it is not yours to reference any longer.

Donald Becker added:

Easier fix: - np->stats.rx_bytes += skb->len; + np->stats.rx_bytes += pkt_len;

Grouping the writes to np->stats results in better cache usage.

Jeff Garzik added:

It makes use of the existing local 'pkt_len', and it checks off another item that should probably be on the janitor's todo list: Set 'dev->last_rx=jiffies' immediately after netif_rx.


From: Jeff Garzik <jgarzik at mandrakesoft dot com>

> It is highly recommended to always compile with CONFIG_ISAPNP=y due > to these differences. If you grep around for CONFIG_ISAPNP versus > CONFIG_ISAPNP_MODULE, you'll see that many drivers are woefully > unprepared for isapnp support compiled as a module.

Yep.. grep for CONFIG_ISAPNP, look at the code, and evaluate it to make sure that isapnp works for that drivers regardless of whether CONFIG_ISAPNP -or- CONFIG_ISAPNP_MODULE is defined.


From: Jeff Garzik <jgarzik at mandrakesoft dot com>

Audit all memcpy and memmove calls.

If calling memmove, see if the areas ever overlap. If not, use memcpy instead.

If calling memcpy, see if the areas ever overlap. If so, use memmmove instead.


From: Andrey Panin <pazke at orbita dot don dot sitek dot net>

- check printk() calls (should include appropriate KERN_* constant).


From Dave Jones <davej at suse dot de>

- Lots of drivers are doing evil looking things writing to PCI_CACHE_LINE_SIZE

- Checking LINUX_VERSION_CODE & KERNEL_VERSION for specific versions of

- Lots of unnecessary casts in drivers can be removed.

- sound/maestro3.c doesn't pci_free_consistent any buffers if

- drivers/net/aironet*.c could use some functions being made statics.

- de4x5.c doesn't pci_enable_device()

- Many drivers have alignments such as..

- grep POLYNOMIAL drivers/net/*

- drivers/net/rcpci45.c

- Replace uses of suser() and fsuser() with capability checks.

- Check kmallocs for things like GFP_DMA without a memtype.

- Strtok() doesn't seem SMP safe. Replace with strsep() where possible.

- Signedness issues. We have _lots_ of these.

- Audit return codes for..


From: Jeff Garzik <jgarzik at mandrakesoft dot com>

pci_map_* might return 0 for a valid mapping. Some code tests mapping for a non-zero value, which is incorrect.


From: Jeff Garzik <jgarzik at mandrakesoft dot com>

read_eeprom can be marked init in other net drivers also.

akpm: well, not blindly, of course. Need to follow the call graph,


From: Jeff Garzik <jgarzik at mandrakesoft dot com>

The string form

creates two variables in the final assembly output, a static string, and a char pointer to the static string. The alternate string form

is better because it declares a single variable.

For variables marked initdata, the "*foo" form causes only the pointer, not the string itself, to be dropped from the kernel image, which is a bug. Using the "foo[]" form with regular 'ole local variables also makes the assembly shorter.

Look at the diff between include/linux/etherdevice.h in pre7 versus pre8. The assembly for the pre8 version is ~10 lines shorter, with 6 of those removed lines being the string constant stored separately. Maybe this should go into CodingStyle?


From: Jeff Garzik <jgarzik at mandrakesoft dot com>

1) "unsigned int" is preferred to "int", it generates better asm code on all platforms except sh5. This replacement needs to be done manually, because often 'int' is required due to negative values -Exxx commonly passed as error values.

2) someone who knows DocBook, or is willing to learn, should go through and clean up Documentation/DocBook to kill all the warnings that occur during "make pdfdocs" and generally make the documents look nicer, and render smaller PDFs. See, not everyone needs to be a kernel hacker to really help out :)


Pointed out by dwmw2: Audit drivers usage of the return value of ioremap. Some are aparently using this as a pointer, which is wrong.

KernelNewbies: MigratingInProgress/acme-todo (last edited 2008-10-19 00:30:44 by MikeSteiner)