Linux usermode helper
Linux has a shared framework for calling out to userspace to run a userspace program and return that program's exit status to the kernel. This is provided by the Linux kernel usermode helper (UMH). This is implemented mainly today via the files:
This page documents some future considerations and enhancements for it.
A usermode helper is set up via call_usermodehelper_setup(), whose first argument is the path to the userspace program to be called. Next, call_usermodehelper_exec() executes this userspace helper.
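The kernel API cannot be exercised outside the kernel, so the following is only a userspace analogy of the set-up-then-execute pattern: start a program by path with an argument vector and an environment, wait for it, and hand its exit status back to the caller. The name run_helper() is hypothetical and is not part of the kernel; the sketch relies on fork(), execve() and waitpid() rather than on call_usermodehelper_setup() and call_usermodehelper_exec().

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Userspace analogy only (hypothetical helper, not kernel code):
 * run the program at `path` and return its exit status.
 * Returns -1 if the program could not be started or did not exit normally.
 */
int run_helper(const char *path, char *const argv[], char *const envp[])
{
    int status;
    pid_t pid;

    assert(path != NULL && argv != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace ourselves with the helper program. */
        execve(path, argv, envp);
        _exit(127); /* only reached if execve() failed */
    }

    /* Parent: wait for the helper and report how it exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would build a NULL-terminated argv and envp, pass them together with the helper path, and then act on the returned status, much as kernel callers act on the status of the helper they invoked.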
Review of current usermode helper API users
As of this writing (November 16, 2016) the UMH code is used for the following purposes:
kernel/kmod.c - Kernel module loading for request_module() -- calls /sbin/modprobe
lib/kobject_uevent.c - calling CONFIG_UEVENT_HELPER_PATH when set for each kobject uevent
fs/coredump.c - Kernel coredump for generating core files for processes
init/do_mounts_initrd.c - running /linuxrc when an old initrd is used
request_module()
The kernel makes use of the usermode helper to load modules using /sbin/modprobe. This is implemented in kernel/kmod.c, but note that this file is also where most of the implementation of the kernel usermode helper code resides.
The kernel request_module() call enables the kernel to load modules by relying on /sbin/modprobe and returning modprobe's exit status to the kernel. If modprobe returns 0 it is implied that either the module loaded successfully and the driver's init has been called, or that modprobe detected that the driver is built into the kernel. Either way, much kernel functionality assumes that once request_module() returns 0 the driver functionality is ready, whatever that may be.
The request_module() API has its own set of issues. For a list of them refer to:
https://kernelnewbies.org/KernelProjects/module-loader-enhancements
kobject uevents
lib/kobject_uevent.c implements handling of kobject uevents. The kernel UMH is used when you have set CONFIG_UEVENT_HELPER_PATH. If it is set, this binary is called each time kobject_uevent_env() gets called in the kernel, that is, once for each kobject uevent triggered.
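To illustrate what such a helper binary sees, here is a minimal userspace sketch of the helper side. It assumes the kernel describes the event through the environment variables ACTION, DEVPATH and SUBSYSTEM; the function names describe_uevent() and uevent_is_add_for() are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of an environment variable, or "?" if it is unset. */
static const char *env_or_unknown(const char *name)
{
    const char *value = getenv(name);

    return value ? value : "?";
}

/*
 * Format a one-line summary of the uevent described by the environment.
 * Returns what snprintf() returns: the length the full summary needs.
 */
int describe_uevent(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    return snprintf(buf, len, "%s %s (%s)",
                    env_or_unknown("ACTION"),
                    env_or_unknown("DEVPATH"),
                    env_or_unknown("SUBSYSTEM"));
}

/* Return 1 if the environment describes an "add" event for `subsystem`. */
int uevent_is_add_for(const char *subsystem)
{
    const char *action = getenv("ACTION");
    const char *subsys = getenv("SUBSYSTEM");

    return action != NULL && subsys != NULL &&
           strcmp(action, "add") == 0 &&
           strcmp(subsys, subsystem) == 0;
}
```

Because the helper is started once per uevent, a real helper should do as little work as possible per invocation; this sketch only reads its environment.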
firmware usermode helper fallback
The firmware API of request_firmware*() optionally makes use of the lib/kobject_uevent.c usermode helper. It is described here because it optionally relies on kobject uevents, because of its history and its intertwined complexity with different components, and because it is the only current user of the kernel usermode helper locking mechanism.
Let's review the order of operations taken by the firmware API and highlight where the kernel usermode helper code is involved.
First let us be clear that the firmware usermode helper fallback mechanism is available to the kernel today only as a fallback mechanism, that is, it is only used if direct firmware loading fails. Direct firmware loading is always used first. The firmware usermode helper fallback mechanism also has a few optional feature considerations when loading firmware. Even though the firmware usermode helper is optional and has optional features, the firmware API currently always takes the core kernel usermode helper lock. Below we summarize a chronological set of considerations when loading firmware.
- Built-in firmware is checked first. If the firmware is present we return it immediately
- The firmware cache is looked at next. If the firmware is in the firmware cache we return that immediately. The firmware cache is its own beast:
The firmware cache is not used for calls requiring the usermode helper:
- The firmware cache is set up by adding a devres entry for each device that uses a sync call
- If an async call is used the firmware cache is only set up for a device if the UMH was not explicitly requested, that is if the second argument to request_firmware_nowait() is false
- The firmware devres entry is maintained throughout the lifetime of the device, so even if you release_firmware() the firmware cache is still used on suspend today
- Upon suspend any pending firmware usermode helper requests are killed to avoid stalling the kernel. Kernel calls requiring the usermode helper therefore need to implement their own firmware cache mechanism but must not use the firmware API on suspend.
- The core kernel usermode helper lock is always taken if both of the above two mechanisms fail (this page recommends replacing the lock, see below). This is done with either:
- usermodehelper_read_trylock() for synchronous requests
- usermodehelper_read_lock_wait() for asynchronous requests
- Direct firmware loading is then attempted by the kernel, before any fallback, by looking for the firmware directly on the filesystem using the new shared core file loader call: kernel_read_file_from_path()
Iff the direct firmware loading failed we move on to consider the firmware usermode helper fallback mechanism:
The firmware usermode helper fallback mechanism is optionally built. It is only available if you have enabled in your kernel CONFIG_FW_LOADER_USER_HELPER.
The firmware usermode helper fallback mechanism is also optionally triggered. You can always force it on by enabling another kernel configuration option, CONFIG_FW_LOADER_USER_HELPER_FALLBACK. This kernel configuration option should be enabled only if device drivers require looking for firmware in a non-standard firmware path.
- If CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled, the firmware usermode helper fallback mechanism is always turned on for both synchronous and asynchronous calls
- If CONFIG_FW_LOADER_USER_HELPER_FALLBACK is disabled but you do have CONFIG_FW_LOADER_USER_HELPER enabled, the firmware usermode helper fallback mechanism is only used if an asynchronous firmware request explicitly requested it.
- The firmware usermode helper fallback mechanism works by creating a new struct device using the name of the firmware passed. The new device's parent is the device used to request firmware; this device, and hence the parent, may be NULL.
- By default, kernel kobject uevents are at first suppressed for this newly created firmware-specific device
- The firmware API enables calls to specify that a kobject uevent should be triggered. These kobject uevents are always enabled for all synchronous requests. They are only enabled for asynchronous requests when the second argument to request_firmware_nowait() is true. There are only two device drivers which have this set to false:
- As documented above, kernel kobject uevents trigger a core kernel usermode helper only if CONFIG_UEVENT_HELPER_PATH is set. This is the actual core kernel usermode helper binary used by the kernel.
The firmware usermode helper is implemented by exposing to userspace a sysfs directory and file entries which let userspace upload a file into the kernel for a driver firmware request. It is assumed that a userspace monitor program exists which watches for the respective sysfs files to learn when firmware is needed, and then uses them to load the firmware. This custom program was originally intended for firmware which had non-standard firmware paths. Later on, udev started being shipped, and it would interpret kobject uevents from the kernel and use these to load firmware using the sysfs interface as a fallback mechanism. The call fw_create_instance() creates the custom firmware device. _request_firmware_load() adds this device to the kernel, and then optionally sends a kobject uevent. It then waits, up to a timeout, for userspace to complete the load.
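The userspace side of this exchange can be sketched as follows. The sketch assumes the sysfs directory for the request exposes the attributes "loading" and "data": writing 1 to loading starts the upload, the firmware bytes are written to data, writing 0 to loading completes it, and writing -1 aborts. The function names are hypothetical, and the directory is passed in by the caller rather than discovered.

```c
#include <assert.h>
#include <stdio.h>

/* Write a short control string to <dir>/<name>. Returns 0 on success. */
static int write_attr(const char *dir, const char *name, const char *value)
{
    char path[4096];
    FILE *f;
    int n, ok;

    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fputs(value, f) >= 0;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}

/* Copy the file at fw_path into <dir>/data. Returns 0 on success. */
static int copy_to_data(const char *dir, const char *fw_path)
{
    char path[4096], buf[4096];
    FILE *in, *out;
    size_t got;
    int n, ok = 1;

    n = snprintf(path, sizeof(path), "%s/data", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    in = fopen(fw_path, "rb");
    if (in == NULL)
        return -1;
    out = fopen(path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    while ((got = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, got, out) != got) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Upload fw_path through the sysfs directory `dir`; abort on failure. */
int fw_sysfs_upload(const char *dir, const char *fw_path)
{
    assert(dir != NULL && fw_path != NULL);

    if (write_attr(dir, "loading", "1") != 0)
        return -1;
    if (copy_to_data(dir, fw_path) != 0) {
        write_attr(dir, "loading", "-1"); /* tell the kernel to give up */
        return -1;
    }
    return write_attr(dir, "loading", "0");
}
```

Aborting explicitly on failure matters because otherwise the kernel side keeps waiting until its timeout expires.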
coredump
When a running program receives certain signals (such as SIGSEGV or SIGQUIT) the kernel terminates the process and generates a core dump file which can be used to debug the process, for instance to find out why it terminated unexpectedly. The kernel coredump functionality is implemented in fs/coredump.c. You can tune the pattern for the name of the dumped file using the procfs file:
/proc/sys/kernel/core_pattern
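The pattern may contain % specifiers, and a pattern starting with '|' names a program to run instead of a file, which is where the usermode helper comes in. The following userspace sketch shows the idea for a small subset only (%p for the PID, %e for the executable name, %% for a literal percent sign); it is not the kernel's parser, and the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result of expanding a core pattern. */
struct core_name {
    bool is_pipe;   /* pattern began with '|': run a helper, not a file */
    char name[256]; /* expanded file name or helper command line */
};

/*
 * Expand a simplified core pattern. Supports only %p, %e and %%;
 * other specifiers are dropped. Returns 0 on success, or -1 if the
 * result does not fit into out->name.
 */
int expand_core_pattern(const char *pattern, int pid, const char *comm,
                        struct core_name *out)
{
    const char *p = pattern;
    size_t used = 0;

    assert(pattern != NULL && comm != NULL && out != NULL);

    out->is_pipe = (*p == '|');
    if (out->is_pipe)
        p++;

    while (*p != '\0') {
        char piece[32];
        const char *src = piece;
        size_t len;

        if (*p != '%') {
            /* Ordinary character: copy it through. */
            piece[0] = *p;
            piece[1] = '\0';
        } else {
            p++;
            if (*p == '\0')
                break; /* a lone trailing '%' is dropped */
            switch (*p) {
            case '%':
                src = "%";
                break;
            case 'p':
                snprintf(piece, sizeof(piece), "%d", pid);
                break;
            case 'e':
                src = comm;
                break;
            default:
                src = ""; /* specifier not handled in this sketch */
                break;
            }
        }
        p++;

        len = strlen(src);
        if (used + len >= sizeof(out->name))
            return -1;
        memcpy(out->name + used, src, len);
        used += len;
    }
    out->name[used] = '\0';
    return 0;
}
```

For example, the pattern "core.%e.%p" for a process named sleep with PID 1234 expands to "core.sleep.1234" in this sketch.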
The function format_corename() parses this file and uses it to determine the core filename used. For more details refer to:
initrd linuxrc
Typically Linux distributions need to boot Linux from a block device using a filesystem. Generalizing boot for all possible hardware (a typical Linux distribution requirement) implies having a large kernel with a lot of built-in functionality. Linux distributions do not wish to provide all of these drivers as built-in due to size constraints, so long ago a strategy was devised to stuff files into a file which can be mounted in RAM, and to extract from it the drivers needed to boot systems. One of the first generation mechanisms for this was initrd. The successor to this mechanism is initramfs.
Upon boot the kernel processes an initrd by creating the special device file /dev/ram0 and then loading onto it the initrd file present on the filesystem as /initrd.image (see rd_load_image(), implemented in init/do_mounts_rd.c). rd_load_image() loads an initrd onto the RAM disk /dev/ram0 by copying the contents of the file onto the ramdisk block by block. After this, handle_initrd() creates /dev/root.old0, which is used (?) to mount the initrd to /root/. It then creates the directory /old/, changes directory to it, loads default modules (load_default_modules()) and finally executes /linuxrc using the kernel UMH. Finally it mounts the initrd root to /old/ and chroots to /.
Enhancement ideas
Filesystem suspend
There are very likely races when using the UMH code if the kernel has any scheduled kernel call, or a scheduled kthread, requesting a UMH during any of these events:
- chroot
- pivot_root()
- suspend
- resume
To fix this we should review and generalize use of calling freeze_super() to queue superblock filesystem operations prior to each of these operations. For inspiration we can review and consider the patch being worked on by Jiri Kosina which freezes all filesystems during suspend:
Review removal of the UMH lock
The UMH lock is 'only' used today by the firmware API in drivers/base/firmware_class.c. It is taken if the requested firmware was found not to be built-in, regardless of whether or not the kernel firmware usermode helper fallback is used. The UMH lock was added years ago by Rafael to help trigger a warning if drivers used the request_firmware*() API on resume, given that this use is flawed when using the firmware UMH fallback. The firmware UMH fallback uses sysfs to help a user binary load firmware. Recall that back in the day we always used the firmware UMH; it was only in recent years that direct filesystem loading was added by Linus to the kernel. Although the UMH lock was added to help warn if users called the firmware API on resume, these days it also helps protect against a race with early users of the kernel trying to read files directly from the filesystem early in boot.
If we are to remove the general UMH lock we need to provide a general facility which protects against races with suspend / resume and with initialization. Note that direct filesystem loading has since been generalized, given that a slew of different areas in the kernel were found to use similar code to load files directly from the kernel. This effort is documented here and that work is now complete:
https://kernelnewbies.org/KernelProjects/common-kernel-loader
The new APIs then to consider are:
- kernel_read_file()
- kernel_read_file_from_path()
- kernel_read_file_from_fd()
Although direct filesystem loading is separate from the UMH in the kernel, addressing races against the UMH should also prompt us to consider race issues with direct filesystem loading.
The firmware cache added to the firmware API is only supported for drivers which do not explicitly require the firmware UMH fallback. This is because upon suspend the firmware code kills all pending UMH processes, and waiting on any pending UMH calls proves in practice to stall suspend. The firmware cache is therefore only provided for drivers that depend solely on direct filesystem loading.
TODO
rebase 20200608-umh-fixes onto linux-next
- Test to ensure that this change does not regress KVM qemu guests on x86_64, as this was a reported regression long ago which pushed the changes to be dropped
- ask 0-day to enable kmod selftests on all git trees with kdevops