Size: 12983
Comment: Wrapped multi-line code examples in wiki code blocks to display better.
|
Size: 12875
Comment:
|
Deletions are marked like this. | Additions are marked like this. |
Line 1: | Line 1: |
-===================- Kernel Janitor TODO list. $Id: TODO,v 1.52 2006/08/29 04:09:25 dobriyan Exp $ -===================- |
|
Line 32: | Line 27: |
NOTE: scripts/kj.pl was a weekend hack. Don't take it too seriously. |
|
Line 42: | Line 35: |
- clean up Documentation/DocBook/ , fix warnings that occur on make *docs. - printk() calls should include appropriate KERN_* constant (of course |
- clean up `Documentation/DocBook/`, fix warnings that occur on `make *docs`. - `printk()` calls should include appropriate `KERN_*` constant (of course |
Line 48: | Line 41: |
- pr_debug() from kernel.h could replace a lot of DPRINTK and similar macros. - same for pr_info() |
- `pr_debug()` from `kernel.h` could replace a lot of `DPRINTK` and similar macros. - same for `pr_info()` |
Line 58: | Line 51: |
use netdev_priv() for network devices - min/max macros from kernel.h are safe, a lot of handcrafted MIN/MAX are not. - ARRAY_SIZE macro has duplicates, remove them. |
use `netdev_priv()` for network devices - min/max macros from `kernel.h` are safe, a lot of handcrafted MIN/MAX are not. - `ARRAY_SIZE` macro has duplicates, remove them. |
Line 85: | Line 78: |
- convert all explicit lock initializations to spin_lock_init() or rwlock_init(). (Besides consistency this also helps automatic lock |
- convert all explicit lock initializations to `spin_lock_init()` or `rwlock_init()`. (Besides consistency this also helps automatic lock |
Line 95: | Line 88: |
- pci_set_dma_mask() and friends should use DMA_{32,64}BIT_MASK instead of 0xffff... This is not 2.4 compatible, so beware of drivers with same code. |
- `pci_set_dma_mask()` and friends should use `DMA_{32,64}BIT_MASK` instead of `0xffff`... This is not 2.4 compatible, so beware of drivers with same code. |
Line 104: | Line 97: |
- make sure BUG() is used correctly (i.e. if(function()) BUG(); is evil) | - make sure `BUG()` is used correctly (i.e. if(function()) BUG(); is evil) |
Line 107: | Line 100: |
- Code in wrong sections (ie. __init code called from __exit): | - Code in wrong sections (ie. `__init` code called from `__exit`): |
Line 109: | Line 102: |
(read_eeprom in net drivers is a good candidate for __init) | (`read_eeprom` in net drivers is a good candidate for `__init`) |
Line 114: | Line 107: |
Note: Set CONFIG_DEBUG_INFO=n to make compiles faster and smaller. - Audit all memcpy and memmove calls. If areas overlap memmove should be used, if they don't memcpy. - drivers/char/i8k.c duplicates arch/i386/kernel/dmi_scan.c Better would be to have dmi_scan.c set a flag which the i8k driver checks. |
Note: Set `CONFIG_DEBUG_INFO=n` to make compiles faster and smaller. - Audit all `memcpy` and `memmove` calls. If areas overlap `memmove` should be used, if they don't -- `memcpy`. - `drivers/char/i8k.c` duplicates `arch/i386/kernel/dmi_scan.c` Better would be to have `dmi_scan.c` set a flag which the i8k driver checks. |
Line 123: | Line 116: |
copy_to/from_user(), put/get_user(), schedule(), kmalloc() except for GPF_ATOMIC, functions with __might_sleep() in body... |
`copy_to/from_user()`, `put/get_user()`, schedule(), `kmalloc()` except for `GPF_ATOMIC`, functions with `__might_sleep()` in body... |
Line 127: | Line 120: |
excessively useful information to non CAP_SYS_RAWIO users, then loosen | excessively useful information to non `CAP_SYS_RAWIO` users, then loosen |
Line 132: | Line 125: |
- check that buffers used in copy_to_user() don't leak information. - check for dev_close calls without rntl_lock held (causes assertion |
- check that buffers used in `copy_to_user()` don't leak information. - check for `dev_close` calls without `rntl_lock` held (causes assertion |
Line 137: | Line 130: |
- timer_del() vs. timer_del_sync() | - `timer_del()` vs. `timer_del_sync()` |
Line 145: | Line 138: |
- check stack usage (make checkstack) and reduce it in the worst cases. | - check stack usage (`make checkstack`) and reduce it in the worst cases. |
Line 154: | Line 147: |
- look at drivers/char/n_tty.c | - look at `drivers/char/n_tty.c` |
Line 238: | Line 231: |
From Dave Jones <davej at suse dot de> | From Dave Jones |
Line 254: | Line 247: |
Adding a -W to the kernel build line shows them. We have lots of code paths that are not executed because comparisons <0 on an unsigned int always |
Adding a `-W` to the kernel build line shows them. We have lots of code paths that are not executed because comparisons <0 on an `unsigned int` always |
Line 275: | Line 268: |
For variables marked __initdata, the "*foo" form causes only the | For variables marked `__initdata`, the "*foo" form causes only the |
Line 289: | Line 282: |
- delete all pci_find_* functions from the kernel tree. Instead of pci_find_device use pci_get_device().. |
- delete all `pci_find_*` functions from the kernel tree. Instead of `pci_find_device` use `pci_get_device()`. |
Line 297: | Line 290: |
- convert from pci_module_init to pci_register_driver | - convert from `pci_module_init` to `pci_register_driver` |
Line 303: | Line 296: |
Read section "Where to get sparse" in Documentation/sparse.txt. | Read section "Where to get sparse" in `Documentation/sparse.txt`. |
Line 365: | Line 358: |
The same task should be done for extern data declarations. | The same task should not be done for extern data declarations because of FRV. |
Before attacking any of these, we suggest sending a few patches in advance to see if you are doing something wrong, or if someone else is already doing same work.
Read related threads in the mailing list archive.
Where it makes sense, sections are supposed to be ordered by increasing difficulty.
Send patches that add/fix items to kernel-janitors@lists.osdl.org.
Links should be marked with:
D: description/information about the issue
E: example patch
NOTE: Search & Replace
Sometimes all your patch will do is simply changing one function to another. Doing this is especially tempting in "convert to new API" section. History shows this is often wrong: old bugs are not fixed, new bugs are introduced.
If all your patch does is search & replace, double check. If in doubt, ask on mailing list.
- add more checks to kj.pl
Task Type |
Current List |
Audit return codes |
|
Function balancing |
|
Old/New functions |
- clean up Documentation/DocBook/, fix warnings that occur on make *docs.
- printk() calls should include appropriate KERN_* constant (of course
- only at beginning of lines). Also consider using dev_printk() and friends.
- pr_debug() from kernel.h could replace a lot of DPRINTK and similar macros.
- same for pr_info()
- Lots of unnecessary casts in drivers can be removed.
- Example:
- struct netdev_private *np = (struct netdev_private *)dev->priv; + struct netdev_private *np = dev->priv;
use netdev_priv() for network devices
- min/max macros from kernel.h are safe, a lot of handcrafted MIN/MAX are not.
- ARRAY_SIZE macro has duplicates, remove them.
- put BIT macro into kernel.h (?) and remove it from drivers
- Purely cosmetic, but far nicer to read.
- for (list = ymf_devs.next; list != &ymf_devs; list = list->next) { + list_for_each(list, &ymf_devs) {
- Convert comments to C99 initializers:
- /* 10 */ DECLARE_PIIX_DEV("ICH2"), + [10] = DECLARE_PIIX_DEV("ICH2"), (if order matters)
- or
- 15, /* foo */ + .foo = 15,
- make non-global functions static.
- convert all explicit lock initializations to spin_lock_init() or
rwlock_init(). (Besides consistency this also helps automatic lock validators and debugging code.)
- fix compilation warnings/errors
- pci_set_dma_mask() and friends should use DMA_{32,64}BIT_MASK instead of
0xffff... This is not 2.4 compatible, so beware of drivers with same code. [D: http://marc.theaimsgroup.com/?t=108001993000001] Don't forget to #include dma-mapping.h
- check kmallocs for things like GFP_DMA without a memtype.
- check that a freed pointer (*kfree_*) is not used again.
- make sure BUG() is used correctly (i.e. if(function()) BUG(); is evil)
- i.e. even when no-op-ing BUG we still have an if (See also: BUG_ON)
- Code in wrong sections (ie. __init code called from __exit):
Run make buildcheck to find offending code. (read_eeprom in net drivers is a good candidate for __init) [D: http://marc.theaimsgroup.com/?l=kernel-janitor-discuss&m=108640220401558]
- Fix gcc 4 warnings.
[D: http://marc.theaimsgroup.com/?t=110348353800003] Note: Set CONFIG_DEBUG_INFO=n to make compiles faster and smaller.
- Audit all memcpy and memmove calls.
If areas overlap memmove should be used, if they don't -- memcpy.
- drivers/char/i8k.c duplicates arch/i386/kernel/dmi_scan.c
Better would be to have dmi_scan.c set a flag which the i8k driver checks.
- sleeping functions should not be called in interrupts or under spinlocks:
copy_to/from_user(), put/get_user(), schedule(), kmalloc() except for GPF_ATOMIC, functions with __might_sleep() in body...
- go through all the tty/serial drivers and make sure they don't give out
excessively useful information to non CAP_SYS_RAWIO users, then loosen permissions.
- check that buffers used in copy_to_user() don't leak information.
- check for dev_close calls without rntl_lock held (causes assertion
- failures).
- timer_del() vs. timer_del_sync()
[D: http://www.uwsg.iu.edu/hypermail/linux/kernel/0005.3/0269.html] A lot of the timer deletion races are hard to fix because of the deadlock problem.
- fix watchdog drivers to use link order rather than explicit
- initialization calls (i810 is particularly broken)
- check stack usage (make checkstack) and reduce it in the worst cases.
[D: http://bugme.osdl.org/show_bug.cgi?id=386] [D: http://marc.theaimsgroup.com/?l=kernel-janitor-discuss&m=110661227615538]
SUSPECTS:
copy_to/from_user:
- look at drivers/char/n_tty.c
Items with longer description in no particular order:
Date: Wed, 14 Feb 2001 12:51:53 +0100 From: Manfred Spraul
bug in network drivers:
* dev->mem_start: NULL means "not command line configuration" 0xffffffff means "default". several drivers only check for NULL, not for 0xffffffff.
And then: Date: Wed, 14 Feb 2001 05:54:34 -0600 (CST) From: Jeff Garzik
netdev->mem_start is unsigned long... Should the test be for ~0 instead? The value 0xFFFFFFFF seems wrong for 64-bit machines.
- drivers that try to find multiple boards, possibly successfully allocating
- for the first ones, then failing for, lets say, the third board, then it just returns failure for init_module, the module is unloaded but the resources remain allocated... in these cases we need to rollback the allocations, freeing it before returning from init_module or equivalent. - sound/oss/maestro3.c doesn't pci_free_consistent any buffers if
- one allocation fails, but others succeeded.
From: Hans Grobler
- audit ioctl functions to make sure there is no way a user can crash
- the machine or do sensitive stuff. For example, check for the necessary capabilities. Check that no userspace pointers are accessed directly. be careful with this... it takes a while to figure out which ioctls are covered by the parent ioctl function (these can be nested to great depths).
Date: Mon, 26 Feb 2001 16:53:50 -0800 (PST) From: "David S. Miller"
Andrew Morton writes:
> Could this be a driver problem?
> netif_rx(skb);
- ..
> new_skb = skb;
This is illegal and broken and will never work.
Once you give an skb to netif_rx() it is not yours to reference any longer.
Donald Becker added:
Easier fix:
- np->stats.rx_bytes += skb->len; + np->stats.rx_bytes += pkt_len;
Grouping the writes to np->stats results in better cache usage.
Jeff Garzik added:
It makes use of the existing local 'pkt_len', and it checks off another item.
From: Jeff Garzik
> It is highly recommended to always compile with CONFIG_ISAPNP=y due > to these differences. If you grep around for CONFIG_ISAPNP versus > CONFIG_ISAPNP_MODULE, you'll see that many drivers are woefully > unprepared for isapnp support compiled as a module.
Yep.. grep for CONFIG_ISAPNP, look at the code, and evaluate it to make sure that isapnp works for that drivers regardless of whether CONFIG_ISAPNP -or- CONFIG_ISAPNP_MODULE is defined.
From Dave Jones
- Lots of drivers are doing evil looking things writing to PCI_CACHE_LINE_SIZE
- This should be done during PCI initialisation.
- Many drivers have alignments such as..
dev->priv = (void *)(((unsigned long)dev->priv + 7) & ~7); jgarzik:
- What they are doing here is DMA'ing into their private board
(dev->priv), which is wrong -- they should be using PCI DMA instead. They can't use init_etherdev because it doesn't pass GFP_DMA to kmalloc for dev->priv, and since they are DMA'ing into dev->priv, they have to do things manually.
- What they are doing here is DMA'ing into their private board
- Signedness issues. We have _lots_ of these.
Adding a -W to the kernel build line shows them. We have lots of code paths that are not executed because comparisons <0 on an unsigned int always evaluates to true. We also need to check for things like...
char foo[4]={1,2,3,133};
From: Jeff Garzik
1) The string form
- [const] char *foo = "blah";
creates two variables in the final assembly output, a static string, and a char pointer to the static string. The alternate string form
- [const] char foo[] = "blah";
is better because it declares a single variable.
For variables marked __initdata, the "*foo" form causes only the pointer, not the string itself, to be dropped from the kernel image, which is a bug. Using the "foo[]" form with regular 'ole local variables also makes the assembly shorter.
2) "unsigned int" is preferred to "int", it generates better asm code on all platforms except sh5. This replacement needs to be done manually, because often 'int' is required due to negative values -Exxx commonly passed as error values.
From: Greg KH
- delete all pci_find_* functions from the kernel tree.
Instead of pci_find_device use pci_get_device().
> Looking at pci.txt, it appears that if I use pci_get_device(), I will > also need to use pci_dev_put().
also check using the proper shutdown callback.
- convert from pci_module_init to pci_register_driver
- - don't do it for drivers that are 2.4 compatible
1. Fix sparse warnings.
Read section "Where to get sparse" in Documentation/sparse.txt. $ make allyesconfig or whatever config you like. Set CONFIG_DEBUG_INFO=n to make compiles faster and smaller. $ make -k C=1 2>&1 | tee ../W_sparse
- or
$ make -k C=1 CF=-DCHECK_ENDIAN 2>&1 | tee ../W_sparse
_Plenty_ of patches here (well, not all of them fix sparse warnings ): [E: http://linux.bkbits.net:8080/linux-2.6/search/?expr=sparse&search=ChangeSet+comments]
2. For portability reasons pointers should be printed as ("%p", ptr), not ("%08x", (int) ptr) or similar.
3. $ make randconfig $ make -k 2>&1 | tee ../W_randconfig
Find out why it doesn't build. Fix warnings that aren't seen with usual "allyesconfig", "allmodconfig", ...
P. S.: Run it several times, think a little and take bets on whether it'll build or not on the next party among local linuxoids.
4. Some drivers do:
if ((jiffies - data->last_updated > HZ * 2) || (jiffies < data->last_updated))
Should be:
#include <linux/jiffies.h> if (time_after(jiffies, data->last_updated + HZ * 2))
[E: http://linux.bkbits.net:8080/linux-2.6/gnupatch@42261f25iCdhlwTMywIQxvBDrIHj1A]
5.
Some function prototypes (in both .h and .c files) specify attributes like init and exit in the prototype. gcc (at least at 3.3.3) uses the last such attribute that is actually specified, without issuing a warning. So we can have:
* Prototype declarations that use one attribute and a function body
- that uses another attribute.
* Functions that from the .c code appear to be normal text but the .h
- file is silently setting a special attribute. Both are potential sources of programmer confusion or bugs.
Identify: * all places where attributes in prototype and function definition
- don't match (choose the correct one, move it to declaration),
* all places where prototype contains attribute but definition doesn't
- (move it to declaration). The same task should not be done for extern data declarations because of FRV.
Once that is done, remove #include <linux/init.h> from all .h files. Only .[cS] files should specify which section the data and text are stored in, .h files should only define the C language information.