KernelNewbies:

Linux 4.7 was released on Sun, 24 Jul 2016.

Summary: This release adds support for the recent Radeon RX 480 GPUs, support for parallel pathname lookups in the same directory, a new experimental 'schedutil' frequency governor that should be faster and more accurate than existing governors, support for the EFI 'Capsule' mechanism for upgrading firmware, support for virtual USB devices in USB/IP to make emulated phones behave as real USB devices, a new security module 'LoadPin' that ensures that all kernel modules are loaded from the same filesystem, an interface to create histograms of events in the ftrace interface, support for attaching BPF programs to kernel tracepoints, support for callchains of events in the perf trace utility, stable support for Android's sync_file fencing mechanism, and many other improvements and new drivers.

1. Prominent features

1.1. Support for Radeon RX 480 GPUs

This release includes support in the amdgpu driver for the just-released Radeon RX 480 GPUs; the RX 480 is the first device based on the new Polaris architecture. Support is on par with the other devices handled by the amdgpu driver.

Code: (merge)

1.2. Parallel directory lookups

The directory cache caches information about path names to make them quickly available for pathname lookup. This speeds up many common operations; for example, it makes it possible to determine whether a particular file or directory exists without having to read the disk. Until now, this cache used a mutex to serialize lookups of names in the same directory.

In this release, the serializing mutex has been replaced with a read-write semaphore, allowing parallel pathname lookups in the same directory. Most workloads won't notice any improvement (cached pathname lookups are fast, and lock contention there is very rare), but specific workloads that make very heavy use of pathname lookups in the same directory will be faster because they can now do them in parallel. Most filesystems have been converted to allow this feature.
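The difference between the two locking schemes can be illustrated with a toy readers-writer lock. This is only a userspace sketch of the general idea (many shared holders, or one exclusive holder), not the kernel's rwsem implementation; the class and function names are made up for the example, and writer starvation is deliberately ignored.

```python
import threading


class ReadWriteLock:
    """Toy readers-writer lock: many readers OR one writer at a time."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # threads currently holding the lock shared
        self._writer = False   # True while one thread holds it exclusively

    def acquire_read(self, timeout=None):
        """Take the lock shared; returns False if a writer still holds it."""
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._writer, timeout)
            if ok:
                self._readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self, timeout=None):
        """Take the lock exclusively; returns False if anyone else holds it."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writer and self._readers == 0, timeout)
            if ok:
                self._writer = True
            return ok

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def demo():
    lock = ReadWriteLock()
    # Two "lookups" in the same directory can hold the lock at once.
    both_readers = lock.acquire_read(timeout=0) and lock.acquire_read(timeout=0)
    # A "modification" (create, unlink...) needs exclusive access and must wait.
    writer_blocked = not lock.acquire_write(timeout=0)
    lock.release_read()
    lock.release_read()
    # Once the readers are gone, the writer gets in.
    writer_after = lock.acquire_write(timeout=0)
    lock.release_write()
    return both_readers, writer_blocked, writer_after
```

With a plain mutex, the second acquire_read() in demo() would have had to wait for the first holder; with the shared mode both proceed, which is the effect this release brings to lookups in one directory.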

Code: commit, commit, commit, commit, commit, commit

1.3. New 'schedutil' frequency governor

This release adds a new governor to the dynamic frequency scaling subsystem (cpufreq). There are two main differences between it and the existing governors. First, it uses information provided by the scheduler directly for making its decisions. Second, it can invoke cpufreq drivers and change the frequency to adjust CPU performance right away, without having to spawn work items to be executed in process context or similar.

What this means is that the latency of frequency changes in the face of workload variations should be very small and, thanks to the information provided by the scheduler, the governor can make more accurate decisions. Note also that the schedutil governor, as included in this release, is very simple and is regarded as a foundation for improving the integration of the scheduler with CPU power management; but it works, and the preliminary results are encouraging. The governor shares some of its tunables management with other governors.
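Like any other cpufreq governor, schedutil should be selectable through the standard cpufreq sysfs files, provided the kernel was built with it. The following is a minimal sketch under that assumption: the helper names are hypothetical, it only touches one CPU's policy, and on a real system the write needs root.

```python
from pathlib import Path

# Standard location of the per-CPU cpufreq sysfs files.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def available_governors(cpu=0, root=CPUFREQ_ROOT):
    """Return the governors the kernel offers for the given CPU."""
    path = Path(root) / f"cpu{cpu}" / "cpufreq" / "scaling_available_governors"
    return path.read_text().split()


def select_governor(name="schedutil", cpu=0, root=CPUFREQ_ROOT):
    """Switch the CPU to governor `name`; returns False if it is not offered."""
    if name not in available_governors(cpu, root):
        return False
    (Path(root) / f"cpu{cpu}" / "cpufreq" / "scaling_governor").write_text(name)
    return True
```

The `root` parameter exists only so the helpers can be exercised against a scratch directory instead of the live /sys tree.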

Recommended LWN article: Improvements in CPU frequency management

Code: commit

1.4. Histograms of events in ftrace

'Hist' triggers are a new addition to ftrace, the Linux tracing infrastructure that has been embedded in the kernel since 2.6.27 and lives at /sys/kernel/debug/tracing/. This release adds the "hist" command, which provides the ability to build "histograms" of events by aggregating event hits. As an example, let's say a user needs to know how many bytes each process reads from files. This information can be obtained using hist triggers, with the following command:

echo 'hist:key=common_pid.execname:val=count:sort=count.descending' > /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger

This command writes a hist trigger to the trigger file of the sys_enter_read event (the one corresponding to a process entering the read() system call, that is, trying to read a file). Each hit on this event runs the hist command (hist:), which means the following: get the PID (common_pid; you can see all the fields available to query in /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/format) and convert it to a process name (.execname suffix); this is used as the key (key=) of the histogram. The val=count parameter makes the hist command also sum the count field, which in the sys_enter_read event is the number of bytes requested by the read() call. Finally, after the : separator, sort=count.descending sorts the result by the count field in descending order. This is the resulting output (note that the hits for the same PID are aggregated):

{{{
# cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/hist
# trigger info: hist:keys=common_pid.execname:vals=count:sort=count.descending:size=2048 [active]

{ common_pid: gnome-terminal [ 3196] } hitcount: 280 count: 1093512
{ common_pid: Xorg [ 1309] } hitcount: 525 count: 256640
{ common_pid: compiz [ 2889] } hitcount: 59 count: 254400
{ common_pid: bash [ 8710] } hitcount: 3 count: 66369
{ common_pid: dbus-daemon-lau [ 8703] } hitcount: 49 count: 47739
{ common_pid: irqbalance [ 1252] } hitcount: 27 count: 27648
{ common_pid: 01ifupdown [ 8705] } hitcount: 3 count: 17216
{ common_pid: dbus-daemon [ 772] } hitcount: 10 count: 12396
{ common_pid: Socket Thread [ 8342] } hitcount: 11 count: 11264
{ common_pid: nm-dhcp-client. [ 8701] } hitcount: 6 count: 7424
{ common_pid: gmain [ 1315] } hitcount: 18 count: 6336
.
.
.
{ common_pid: postgres [ 1892] } hitcount: 2 count: 32
{ common_pid: postgres [ 1891] } hitcount: 2 count: 32
{ common_pid: gmain [ 8704] } hitcount: 2 count: 32
{ common_pid: upstart-dbus-br [ 2740] } hitcount: 21 count: 21
{ common_pid: nm-dispatcher.a [ 8696] } hitcount: 1 count: 16
{ common_pid: indicator-datet [ 2904] } hitcount: 1 count: 16
{ common_pid: gdbus [ 2998] } hitcount: 1 count: 16
{ common_pid: rtkit-daemon [ 2052] } hitcount: 1 count: 8
{ common_pid: init [ 1] } hitcount: 2 count: 2
}}}

This output shows which processes are reading files, how much (count), and how often they try to read (hitcount, which wasn't specified but is included by default). For more information about hist and its possibilities, see the hist triggers documentation in Documentation/trace/events.txt, or read this recommended blog post from Brendan Gregg: Hist Triggers in Linux 4.7. For development context of the feature, also see this recommended LWN article: Ftrace and histograms: a fork in the road. For more documentation on ftrace, see Documentation/trace/ftrace.txt or this recommended LWN article.
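Because the histogram is keyed by PID, the same command name can appear several times in the output (postgres and gmain in the example above). As an illustration, the sketch below rebuilds the trigger string from its three parts and post-processes the hist text to sum the count column per command name. The helper names are hypothetical, and the regular expression only assumes the entry layout shown in the example output.

```python
import re
from collections import defaultdict

# One histogram entry, e.g.
#   { common_pid: bash [ 8710] } hitcount: 3 count: 66369
ENTRY_RE = re.compile(
    r"\{\s*common_pid:\s*(?P<comm>.+?)\s*\[\s*(?P<pid>\d+)\s*\]\s*\}"
    r"\s*hitcount:\s*(?P<hits>\d+)\s*count:\s*(?P<count>\d+)"
)


def hist_trigger(key, val, sort):
    """Assemble a hist trigger string from its key, value and sort parts."""
    return f"hist:key={key}:val={val}:sort={sort}"


def bytes_by_command(hist_text):
    """Sum the 'count' column of a hist dump per command name."""
    totals = defaultdict(int)
    for match in ENTRY_RE.finditer(hist_text):
        totals[match.group("comm")] += int(match.group("count"))
    return dict(totals)
```

On a live system, the string returned by hist_trigger("common_pid.execname", "count", "count.descending") would be written to the event's trigger file, and the contents of the hist file passed to bytes_by_command().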

Code: commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit

1.5. perf trace call stacks

In this release, perf trace gains the ability to print a userspace callchain each time a system call is hit. An example of a callchain for a recvmsg() syscall issued by gnome-shell:

{{{
3292.421 ( 0.002 ms): gnome-shell/2287 recvmsg(fd: 11<socket:[35818]>, msg: 0x7ffc5ea266e0 ) = 32
}}}

You can try it with commands such as # perf trace --call-graph dwarf ping 127.0.0.1. You can also print callchains for a single event only, for example: perf trace --event sched:sched_switch/call-graph=fp/ -a sleep 1. Tracing page faults (option -F/--pf) also supports it; for example, tracing write syscalls and major page faults with callchains while starting firefox, limiting the stack to 5 frames, can be done with # perf trace -e write --pf maj --max-stack 5 firefox. An excerpt of a system-wide perf trace --call-graph dwarf session can be found here.

1.6. Allow BPF programs to attach to tracepoints

Tracepoints are a sort of dynamically enabled printf() that developers place in their code so that they can be used later to analyse the system's behaviour. Tracepoints can be accessed from several utilities (LTTng, perf, SystemTap, ftrace...), but until now they couldn't be accessed by BPF programs.

This release adds a new type of BPF program (BPF_PROG_TYPE_TRACEPOINT) that can be attached to kernel tracepoints. This makes it possible to build programs that collect data from tracepoints and process it inside the BPF program. It is a faster way of accessing tracepoints than kprobes, it can make tracing programs more stable, and it allows building more complex tracing tools.

Recommended LWN article: Tracepoints with BPF

Code: commit, commit, commit, commit, commit, commit, commit, commit, commit, commit

1.7. EFI 'Capsule' firmware updates

This release adds support for the EFI Capsule mechanism, which allows passing data blobs to the EFI firmware. The firmware then parses them and makes some decision based upon their contents. The most common use case is to bundle a flashable firmware image into a capsule that the firmware can use on the next boot to upgrade the existing version in flash. Users can upload a capsule by writing the firmware to the /dev/efi_capsule_loader device.
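Uploading a capsule is therefore just a matter of copying a file to the device node. The sketch below does that from Python; the function name and the chunk size are arbitrary choices for the example, the capsule file itself is assumed to come from the hardware vendor, and writing to the real device requires root.

```python
CAPSULE_DEVICE = "/dev/efi_capsule_loader"


def upload_capsule(capsule_path, device=CAPSULE_DEVICE):
    """Copy a capsule file to the loader device; returns the bytes written."""
    written = 0
    with open(capsule_path, "rb") as src, open(device, "wb") as dst:
        # Copy in chunks so large firmware images are not held in memory.
        while chunk := src.read(64 * 1024):
            dst.write(chunk)
            written += len(chunk)
    # Leaving the 'with' block flushes and closes the device; an error
    # reported by the kernel at that point surfaces as an OSError.
    return written
```

The `device` parameter makes it possible to try the function against an ordinary file before pointing it at the real loader device.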

Recommended blog: Better Firmware Updates in Linux using UEFI Capsules

Code: commit, commit

1.8. Support for creating virtual USB Device Controllers in USB/IP

USB/IP allows sharing USB devices over the network. Until now, however, the shared devices had to be real devices. This release brings the ability to create virtual USB Device Controllers without needing any physical USB device, using the USB gadget subsystem.

This feature has several uses; for example, it makes it possible to improve phone emulation in development environments. Emulated phones can now be connected to a developer's machine or to another virtual machine as if they were physical phones. It is also useful for testing USB and for educational purposes.

Code: commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit

1.9. Android's sync_file fencing mechanism considered stable

In this release, the sync_file code that was in the staging/ directory has been moved to the main kernel tree. Until now, the Linux kernel only had an implicit fencing mechanism, where fences are attached directly to buffers and userspace is unaware of what is happening; explicit fencing was not supported.

sync_file is an explicit fencing mechanism designed for Android that lets userspace handle fences directly. Instead of attaching a fence to the buffer, a producer driver sends the fence related to the buffer to userspace via a sync_file, which can then be passed to the consumer; the consumer will not use the buffer for anything before the fence(s) signal. This explicit fencing provides a global mechanism that optimizes the flow of buffers between consumers and producers and avoids a lot of waiting. For example, instead of waiting for a buffer to be processed by the GPU before sending it to DRM in an atomic IOCTL, a compositor can get a sync_file fd from the GPU driver at the moment it submits the buffer for processing. The compositor then passes these fds to DRM in an atomic commit request; the buffer will not be displayed until the fences signal, i.e., until the GPU has finished processing the buffer and it is ready to display.
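The flow described above can be modelled in a few lines of userspace code. This is purely a conceptual toy: a threading.Event stands in for the fence, a thread stands in for the GPU, and none of the names correspond to the kernel's sync_file API.

```python
import threading


class Fence:
    """Stand-in for a fence: signals once the producer is done with a buffer."""

    def __init__(self):
        self._done = threading.Event()

    def signal(self):
        self._done.set()

    def wait(self, timeout=None):
        return self._done.wait(timeout)


def producer_submit(buffer, work):
    """Start processing `buffer` asynchronously and return its fence at once."""
    fence = Fence()

    def run():
        work(buffer)
        fence.signal()   # the buffer is now ready for consumers

    threading.Thread(target=run).start()
    return fence


def consumer_display(buffer, fence, shown):
    """Consumer side: do not touch the buffer before the fence signals."""
    fence.wait()
    shown.append(bytes(buffer))


def demo():
    buffer = bytearray(4)

    def render(buf):
        buf[:] = b"\xff" * len(buf)   # pretend GPU work

    shown = []
    # "Userspace" gets the fence at submit time and hands it to the consumer
    # without waiting for the rendering to finish itself.
    fence = producer_submit(buffer, render)
    consumer_display(buffer, fence, shown)
    return shown
```

The point of the model is that the caller in demo() never waits for the rendering itself; only the consumer blocks, and only at the moment it actually needs the buffer.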

Documentation: Documentation/sync_file.txt

Code: commit, commit, commit, commit

1.10. LoadPin, a security module to restrict the origin of kernel modules

LoadPin is a new Linux Security Module that ensures that all files loaded by the kernel (kernel modules, firmware, kexec images, security policies) originate from the same filesystem. The expectation is that the filesystem is backed by a read-only device such as a CD-ROM or dm-verity (this feature comes from ChromeOS, where the device as a whole is verified cryptographically via dm-verity). This allows systems that have a verified and/or unchangeable filesystem to enforce module and firmware loading restrictions without needing to sign the files individually.
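The policy can be pictured as a small state machine. The sketch below is a userspace model, not the LSM's code: it assumes that the filesystem of the first file loaded becomes the pinned one, and it uses the st_dev field of os.stat() as a stand-in for "which filesystem a file lives on". The class and parameter names are invented for the example.

```python
import os


class LoadPinModel:
    """Toy model: every load must come from the filesystem of the first load."""

    def __init__(self, device_of=lambda path: os.stat(path).st_dev):
        self._device_of = device_of   # maps a path to a filesystem identifier
        self._pinned = None           # filesystem pinned by the first load

    def may_load(self, path):
        dev = self._device_of(path)
        if self._pinned is None:
            self._pinned = dev        # first load pins the filesystem
            return True
        return dev == self._pinned    # later loads must match the pin
```

For instance, once a module from /lib/modules has been loaded, a module copied to a tmpfs mount would be refused by this model because it lives on a different filesystem.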

Recommended LWN article: The LoadPin security module

Code: commit, commit

2. Core (various)

3. File systems

4. Memory management

5. Block layer

6. Security

7. Tracing, perf, BPF

8. Virtualization

9. Networking

10. Architectures

11. Drivers

11.1. Graphics

11.2. Storage

11.3. Staging

11.4. Networking

11.5. Audio

11.6. Input devices: Tablets, touch screens, keyboards, mouses

11.7. TV tuners, webcams, video capturers

11.8. USB

11.9. Serial Peripheral Interface (SPI)

11.10. Watchdog

11.11. Serial

11.12. ACPI, EFI, cpufreq, thermal, Power Management

11.13. Real Time Clock (RTC)

11.14. Voltage, current regulators, power capping, power supply

11.15. Pin Controllers (pinctrl)

11.16. Memory Technology Devices (MTD)

11.17. Multi Media Card (mmc)

11.18. Industrial I/O (iio)

11.19. Multi Function Devices (MFD)

11.20. Inter-Integrated Circuit (I2C)

11.21. Hardware monitoring (hwmon)

11.22. General Purpose I/O (gpio)

11.23. Clocks

11.24. System On Chip specific Drivers

11.25. PCI

11.26. DMA Engine

11.27. Various

12. List of merges

13. Other news sites

KernelNewbies: Linux_4.7 (last edited 2017-12-30 01:30:24 by localhost)