Linux kernel module loader enhancements

Userspace loads Linux kernel modules using modprobe. Within the Linux kernel, modules can also be loaded on demand using the request_module() call.


Problem description

Kernel module loading relies on the kernel usermode helper, which has its own set of issues; refer to the kernel usermode helper enhancement page for more details. This page is dedicated to a few enhancement ideas for one specific use of the usermode helper: loading modules with modprobe via request_module().

Kernel test drivers on kselftests - test_request_module

A test driver has been written to help identify and force issues with the kernel implementation of request_module() and with other users that rely on it. This test driver should be expanded and generalized for upstream integration into kselftests, to ensure that no regressions are added to the kernel moving forward.

kmemleak on early failure on load_module()

There is a memory leak in load_module() if it fails early under a certain condition. A fix has been found and will be sent upstream.

kmod thread limitation

We currently limit module loading to 50 concurrent threads; this limit is statically built into the kernel. The original value was arbitrarily selected. A fix has been written and will be sent upstream for review.
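The effect of a statically built-in cap can be pictured with a small userspace sketch. This is not the kernel's code: the names kmod_sim_enter(), kmod_sim_exit() and KMOD_SIM_MAX_CONCURRENT are hypothetical, C11 atomics stand in for the kernel's atomic counters, and the -EBUSY return value is an assumption made only for the sketch. It shows how a fixed limit turns the 51st concurrent request into an immediate failure, regardless of how capable the system is.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the statically built-in limit of 50. */
#define KMOD_SIM_MAX_CONCURRENT 50

/* Number of simulated module requests currently in flight. */
static atomic_int kmod_sim_concurrent;

/*
 * Claim a slot for one simulated module request.
 * Returns 0 on success, or -EBUSY (an assumption for this sketch)
 * when the cap has already been reached.
 */
static int kmod_sim_enter(void)
{
	int now = atomic_fetch_add(&kmod_sim_concurrent, 1) + 1;

	if (now > KMOD_SIM_MAX_CONCURRENT) {
		/* Over the cap: give the slot back and fail immediately. */
		atomic_fetch_sub(&kmod_sim_concurrent, 1);
		return -EBUSY;
	}
	return 0;
}

/* Release a slot previously claimed with kmod_sim_enter(). */
static void kmod_sim_exit(void)
{
	int before = atomic_fetch_sub(&kmod_sim_concurrent, 1);

	assert(before > 0);
	(void)before;
}
```

A caller would pair each successful kmod_sim_enter() with one kmod_sim_exit(). Because the cap is a compile-time constant in this sketch, changing it means rebuilding, which is the limitation described above.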

incorrect modprobe failure propagation

If modprobe fails to load a module, there is a period of time during which any subsequent modprobe requests will also fail nondeterministically. This needs to be fixed.

races with kmod older than v19

The Linux kernel uses request_module() to enable different parts of the kernel to load needed modules on demand. request_module() relies on the kernel usermode helper to call modprobe on behalf of the kernel and to hand the return status from modprobe back to the kernel. A lot of kernel code assumes that if request_module() returned 0, either the module was loaded successfully or the driver was built-in. modprobe is implemented these days using kmod. kmod releases older than v19 have a bug whereby modprobe returns 0 even if the module is not loaded yet or the driver is not built-in. The bug is due to an incorrect heuristic whereby kmod relied on looking for certain sysfs files to determine if a module was built-in or not.

The incorrect logic was as follows:

This logic is incorrect due to a possible race between kernel threads calling request_module() while another kernel thread is loading the same module. The race happens when one thread is doing work in mod_sysfs_setup(). Specifically, any delay between mod_sysfs_init() and add_usage_links() will trigger the race if other threads are requesting modules in the kernel; adding an msleep(10), a delay of 10 ms, is enough to reproduce the issue easily. In such cases request_module() will return 0 but the module is not ready yet as per the kernel semantics. If any part of the kernel asserts that request_module() returning 0 means a module is loaded or the driver is built-in, we can potentially crash the kernel.

kmod has a fix for this in userspace:

A kernel fix exists for this as well: a sanity check for the work modprobe does. However, the current fix exposes a series of issues and optimizations to consider.

Sanity checker for modprobe

If we do want to be able to assert in kernel code that request_module() returning 0 means a driver was loaded or is built-in, we need a sanity checker to verify that the work modprobe was supposed to do was done. A sanity checker only makes sense once all of the sub-items listed below have been evaluated and addressed.

Assertions of what request_module() returning 0 means

We need to review whether we want the kernel to freely assert that request_module() returning 0 means a module is loaded and ready or is built-in. If we do want this, then we should consider a sanity checker for modprobe calls, as otherwise any bug introduced into kmod/modprobe leaves the kernel in a fragile state and could in the worst case crash it.
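The alternative to asserting is for callers to verify the result themselves. The userspace sketch below illustrates that pattern under stated assumptions: the registry, sim_get_driver() and the two request stand-ins are all hypothetical names invented for illustration, and nothing here calls the real request_module(). It looks the driver up, issues the request only if the lookup fails, and then looks it up again instead of trusting a return value of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_MAX_DRIVERS 8

/* Hypothetical registry of drivers that are actually available. */
static const char *sim_registered[SIM_MAX_DRIVERS];
static size_t sim_nregistered;

/* Add a driver to the registry; returns 0 on success, -1 if full. */
static int sim_register(const char *name)
{
	if (sim_nregistered >= SIM_MAX_DRIVERS)
		return -1;
	sim_registered[sim_nregistered++] = name;
	return 0;
}

/* Return the registered name, or NULL if the driver is not available. */
static const char *sim_lookup(const char *name)
{
	for (size_t i = 0; i < sim_nregistered; i++)
		if (strcmp(sim_registered[i], name) == 0)
			return sim_registered[i];
	return NULL;
}

/* Stand-in for a module request: returns 0 to claim success. */
typedef int (*sim_request_fn)(const char *name);

/* Well-behaved stand-in: returns 0 only after registering the driver. */
static int sim_request_honest(const char *name)
{
	return sim_register(name);
}

/* Buggy stand-in: claims success without doing the work. */
static int sim_request_buggy(const char *name)
{
	(void)name;
	return 0;
}

/*
 * Look up, request on a miss, then look up again. A return value of 0
 * from the request is never treated as proof that the driver exists.
 */
static const char *sim_get_driver(const char *name, sim_request_fn request)
{
	const char *drv;

	assert(name != NULL && request != NULL);

	drv = sim_lookup(name);
	if (drv == NULL && request(name) == 0)
		drv = sim_lookup(name);
	return drv;
}
```

With the buggy stand-in, sim_get_driver() returns NULL and the caller can fail gracefully; a caller that dereferenced the result purely because the request returned 0 would not be so lucky.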

built-in O(1) check

Checking if a driver is built-in currently has a complexity of O(n); with some modifications this could easily be reduced to O(1). This should be reviewed. It might be a welcome addition to the kernel regardless of whether a sanity checker is desirable for modprobe.
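To make the difference concrete, here is a userspace sketch contrasting a linear scan with a hashed lookup. It is only an illustration of the general idea, not the modification proposed for the kernel: the list of built-in names is sample data, every sim_* name is hypothetical, and the hash table gives constant-time lookups on average rather than in the worst case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sample data only: a hypothetical list of built-in driver names. */
static const char *const sim_builtin[] = { "ext4", "usbcore", "e1000" };
#define SIM_NBUILTIN (sizeof(sim_builtin) / sizeof(sim_builtin[0]))

/* Power of two and larger than SIM_NBUILTIN, so probing always ends. */
#define SIM_NBUCKETS 8

static const char *sim_table[SIM_NBUCKETS];

/* 32-bit FNV-1a hash of the name, reduced to a bucket index. */
static size_t sim_hash(const char *s)
{
	uint32_t h = 2166136261u;

	assert(s != NULL);
	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return (size_t)(h & (SIM_NBUCKETS - 1));
}

/* O(n): compare against every built-in name in turn. */
static bool sim_is_builtin_linear(const char *name)
{
	for (size_t i = 0; i < SIM_NBUILTIN; i++)
		if (strcmp(sim_builtin[i], name) == 0)
			return true;
	return false;
}

/* Build the open-addressing table once, for example at init time. */
static void sim_table_init(void)
{
	for (size_t i = 0; i < SIM_NBUCKETS; i++)
		sim_table[i] = NULL;

	for (size_t i = 0; i < SIM_NBUILTIN; i++) {
		size_t slot = sim_hash(sim_builtin[i]);

		while (sim_table[slot] != NULL)
			slot = (slot + 1) & (SIM_NBUCKETS - 1);
		sim_table[slot] = sim_builtin[i];
	}
}

/* Expected O(1): hash the name, then probe until a match or a gap. */
static bool sim_is_builtin_hashed(const char *name)
{
	size_t slot = sim_hash(name);

	while (sim_table[slot] != NULL) {
		if (strcmp(sim_table[slot], name) == 0)
			return true;
		slot = (slot + 1) & (SIM_NBUCKETS - 1);
	}
	return false;
}
```

Both functions must agree on every input; the hashed variant simply avoids walking the whole list once sim_table_init() has run.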

Use of aliases for request_module()

The kernel uses request_module() with a loose set of aliases, and these are not well documented. If we want a sanity checker that verifies modprobe did what it told us it did, we need the semantics of aliases to be well defined and documented. If the driver is not built-in, we need to iterate over all modules and verify that the requested module was loaded. One of the problems with this is that the kernel loosely uses aliases for modules. For instance, get_fs_type() in fs/filesystems.c will request a module for ext4 as 'fs-ext4', but the module will be loaded as 'ext4', so a check for a module named 'fs-ext4' will not yield a valid module. In this case we must instead verify that 'ext4' was loaded. Similarly, ext4 also supports the alias 'fs-ext3'.
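The ext4 example can be sketched in userspace as follows. The alias table holds only the two aliases mentioned above, and sim_resolve() and sim_request_satisfied() are hypothetical names; how a real sanity checker would obtain alias information is exactly the open question, so this is a sketch of the check's shape rather than a proposed implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Maps a requested alias to the name the module is loaded under. */
struct sim_alias {
	const char *alias;
	const char *module;
};

/* Only the two aliases named in the text; a real table would be larger. */
static const struct sim_alias sim_aliases[] = {
	{ "fs-ext4", "ext4" },
	{ "fs-ext3", "ext4" },
};
#define SIM_NALIASES (sizeof(sim_aliases) / sizeof(sim_aliases[0]))

/* Resolve an alias to a module name; unknown names resolve to themselves. */
static const char *sim_resolve(const char *requested)
{
	assert(requested != NULL);

	for (size_t i = 0; i < SIM_NALIASES; i++)
		if (strcmp(sim_aliases[i].alias, requested) == 0)
			return sim_aliases[i].module;
	return requested;
}

/*
 * Check whether a request was satisfied: resolve the alias first, then
 * search the list of loaded modules for the resolved name.
 */
static bool sim_request_satisfied(const char *requested,
				  const char *const *loaded, size_t nloaded)
{
	const char *module = sim_resolve(requested);

	for (size_t i = 0; i < nloaded; i++)
		if (strcmp(loaded[i], module) == 0)
			return true;
	return false;
}
```

Without the resolution step, a request for 'fs-ext4' would be reported as unsatisfied even though 'ext4' is loaded, which is the false negative the paragraph above warns about.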

UMH race issues

request_module() relies on the kernel usermode helper (UMH). The UMH has a series of issues which have also been identified, and these can affect request_module(). They should be reviewed and addressed. For details refer to the kernel usermode helper enhancement page.

KernelNewbies: KernelProjects/module-loader-enhancements (last edited 2016-11-17 19:26:14 by mcgrof)