WARNING: This document may not be completely finished at the time of the release. Sorry. In the meantime, you can look at the LWN lists of 2.6.25 features (1, 2, and 3).

Linux kernel version 2.6.25 Released (full SCM git log)


1. Short overview (for news sites, etc)

2. Important things (AKA: the cool stuff)

2.1. RCU Preempt support

Recommended LWN article: "The design of preemptible read-copy-update"

RCU is a very powerful synchronization mechanism used in Linux to scale to very large numbers of CPUs on a single system. However, it was not well suited to the real-time patchsets that have been developed to make Linux an RT OS, because some parts were not preemptible, causing latencies too high for RT workloads. In 2.6.25, RCU can be preempted, eliminating that source of latency and making Linux a bit more RT-ish.

Code: commit e260be673a15b6125068270e0216a3bfbfc12f87

2.2. FIFO ticket spinlocks

Recommended article: "Ticket spinlocks"

In certain workloads, spinlocks can be unfair, i.e. a process spinning on a spinlock can be starved up to 1,000,000 times. Starvation in spinlocks is usually assumed not to be a problem, on the theory that contention becomes a performance problem before any starvation is noticed, but testing has shown the contrary. It is also always possible to find an obscure corner case that generates a lot of contention on some lock, and which of the waiting processors grabs the lock next is essentially random. With the new spinlocks, processes grab the spinlock in FIFO order. Spinlocks in kernels configured for 256 or more CPUs also use a 32-bit value (instead of the 16 bits used when NR_CPUS < 256), which allows a theoretical limit of 65536 processors.

Code: commit 314cdbefd1fd0a7acf3780e9628465b77ea6a836
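The FIFO ordering described above can be illustrated with a small single-threaded model. This is only a sketch of the ticket idea, not the kernel implementation: the class and method names are hypothetical, and a real lock takes its ticket with one atomic instruction and spins in place of the polling loop shown here. The `ticket_bits` parameter is an assumption used to show why the counter width bounds the number of waiters.

```python
class TicketLock:
    """Single-threaded model of a FIFO ticket lock (illustration only)."""

    def __init__(self, ticket_bits=8):
        # Each counter wraps at 2**ticket_bits, like a fixed-width field.
        self.mask = (1 << ticket_bits) - 1
        self.next_ticket = 0   # ticket handed to the next arriving CPU
        self.now_serving = 0   # ticket currently allowed to hold the lock

    def take_ticket(self):
        # In a real lock this would be a single atomic fetch-and-add.
        ticket = self.next_ticket
        self.next_ticket = (ticket + 1) & self.mask
        return ticket

    def may_enter(self, ticket):
        # A waiter keeps spinning until its own ticket is being served.
        return ticket == self.now_serving

    def unlock(self):
        # Releasing the lock hands it to the next ticket in line.
        self.now_serving = (self.now_serving + 1) & self.mask


lock = TicketLock()
# Three hypothetical CPUs arrive in this order and take tickets 0, 1, 2.
tickets = {cpu: lock.take_ticket() for cpu in ("cpu0", "cpu1", "cpu2")}

order = []
while len(order) < len(tickets):
    # Poll the waiters in reverse arrival order on purpose: the lock is
    # still granted in ticket order, whoever happens to check first.
    for cpu, ticket in reversed(list(tickets.items())):
        if cpu not in order and lock.may_enter(ticket):
            order.append(cpu)
            lock.unlock()

print(order)
```

With 8-bit tickets the model can tell 256 waiters apart before the counters wrap; 16-bit tickets raise that to 65536.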

2.3. Better process memory usage measurement

Recommended LWN article: "How much memory are applications really using?"

Measuring how much memory processes are using is more difficult than it looks, especially when processes share memory. Features like /proc/$PID/smaps (added in 2.6.14) help, but they have not been enough. 2.6.25 adds new statistics to make this task easier. A new /proc/$PID/pagemap file is added for each process. In this file the kernel exports (in binary format) the physical page location of each page used by the process. Comparing this file with those of other processes shows which pages they share. Another file, /proc/kpagemaps, exposes another kind of statistics about the pages of the system. The author of the patch, Matt Mackall, proposes two new metrics: "proportional set size" (PSS), which divides each shared page by the number of processes sharing it, and "unique set size" (USS), which counts only the pages that are not shared. The first statistic, PSS, has also been added to each entry in /proc/$PID/smaps. A separate HG repository provides some sample command-line and graphical tools that exploit all these statistics.

Code: commits 1e88328111aae3ea408f346763ba9f9bad71f876, 304daa8132a95e998b6716d4b7bd8bd76aa152b2, 161f47bf41c5ece90ac53cbb6a4cb9bf74ce0ef6, 85863e475e59afb027b0113290e3796ee6020b7d
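The two metrics proposed above are easy to compute once you know which physical pages each process maps. The following is a minimal sketch under stated assumptions: the input is a plain dictionary of page numbers per PID (made-up values, not data read from the kernel), the page size is assumed to be 4096 bytes, and the function name is hypothetical.

```python
from collections import Counter

PAGE_SIZE = 4096  # assumed page size in bytes, for illustration only


def pss_and_uss(pages_by_pid, page_size=PAGE_SIZE):
    """Return {pid: (pss_bytes, uss_bytes)} from per-process page sets."""
    # Count how many processes map each physical page.
    sharers = Counter()
    for pages in pages_by_pid.values():
        sharers.update(set(pages))

    result = {}
    for pid, pages in pages_by_pid.items():
        pages = set(pages)
        # PSS: each page is charged in proportion to its number of sharers.
        pss = sum(page_size / sharers[p] for p in pages)
        # USS: only pages that no other process maps.
        uss = sum(page_size for p in pages if sharers[p] == 1)
        result[pid] = (pss, uss)
    return result


# Hypothetical example: page 3 is shared by both processes.
example = {100: {1, 2, 3}, 200: {3, 4}}
print(pss_and_uss(example))
```

A useful property of PSS is that summing it over all processes gives the total number of distinct pages in use, since each shared page is split among its users instead of being counted once per user.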

2.4. Memory Resource Controller

Recommended LWN article: "Controlling memory use in containers"

The memory resource controller is a cgroups-based feature. Cgroups, aka "Control Groups", is a feature that was merged in 2.6.24. Its purpose is to be a generic framework where several "resource controllers" can plug in and manage different resources of the system, such as process scheduling or memory allocation. It also offers a unified user interface, based on a virtual filesystem, where administrators can assign arbitrary resource constraints to a group of chosen tasks. For example, two resource controllers were merged in 2.6.24: Cpusets and Group Scheduling. The first binds CPU and memory nodes to an arbitrarily chosen group of tasks (a cgroup), and the second binds a CPU bandwidth policy to the cgroup.

The memory resource controller isolates the memory behavior of a group of tasks -cgroup- from the rest of the system. It can be used to:

The configuration interface, like that of all cgroup controllers, consists of mounting the cgroup filesystem with the "-o memory" option, creating an arbitrarily named directory (the cgroup), adding tasks to the cgroup by echoing their PIDs into the 'tasks' file inside the cgroup directory, and reading or writing the following files: 'memory.limit_in_bytes', 'memory.usage_in_bytes' (memory statistic for the cgroup), 'memory.stats' (more statistics: RSS, caches, inactive/active pages), 'memory.failcnt' (number of times that the cgroup exceeded the limit), and 'mem_control_type'. OOM conditions are also handled per cgroup: when the tasks in a cgroup exceed its limit, the OOM killer is called to kill a task chosen from among the tasks in that specific cgroup.

Code: commits 1b6df3aa457690100f9827548943101447766572, 8cdea7c05454260c0d4d83503949c358eb131d17, e552b6617067ab785256dcec5ca29eeea981aacb, 78fb74669e80883323391090e4d26d17fe29488f, 8a9f3ccd24741b50200c3f33d62534c7271f3dfc, 66e1707bc34609f626e2e7b4fe7e454c9748bad5, 67e465a77ba658635309ee00b367bec6555ea544, 0eea10301708c64a6b793894c156e21ddd15eb64, c7ba5c9e8176704bfac0729875fa62798037584d, 8697d33194faae6fdd6b2e799f6308aa00cfdf67, bed7161a519a2faef53e1bce1b47595e297c1d14, e1a1cd590e3fcb0d2e230128daf2337ea55387dc
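The configuration steps described above (create a directory, set a limit, add a task) can be sketched as a few file operations. This is a hedged illustration, not an official tool: the function name is hypothetical, it assumes the cgroup filesystem has already been mounted with the memory controller by root, and it assumes the per-cgroup task list file is called `tasks`.

```python
import os


def configure_memory_cgroup(cgroup_root, name, limit_bytes, pid):
    """Create a cgroup directory, set its memory limit, and add one task."""
    group_dir = os.path.join(cgroup_root, name)
    # On a mounted cgroup filesystem, creating a directory creates the cgroup.
    os.makedirs(group_dir, exist_ok=True)

    # Set the memory limit shared by every task in the group.
    with open(os.path.join(group_dir, "memory.limit_in_bytes"), "w") as f:
        f.write(str(limit_bytes))

    # Move a task into the group by writing its PID (one PID per write).
    with open(os.path.join(group_dir, "tasks"), "w") as f:
        f.write(str(pid))

    return group_dir
```

For example, with the filesystem mounted at a hypothetical /cgroup mount point, `configure_memory_cgroup("/cgroup", "browsers", 64 * 1024 * 1024, os.getpid())` would place the calling process in a group limited to 64 MB.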

2.5. Latencytop

Recommended LWN article: "Finding system latency with LatencyTOP"

Slow servers, skipping audio, jerky video: everyone knows the symptoms of latency. But knowing what is really going on in the system, what is causing the latency, and how to fix it... those are difficult questions without good answers right now. LatencyTOP is a Linux tool for software developers (both kernel and userspace), aimed at identifying where system latency occurs and what kind of operation or action is causing it. With that information, developers can change the code to avoid the worst latency hiccups.

There are many types and causes of latency, and LatencyTOP focuses on the type that causes audio skipping and desktop stutters. Specifically, LatencyTOP focuses on the cases where applications want to run and execute useful code, but some resource is not currently available (and the kernel then blocks the process). This is done both at the system level and at the per-process level, so that you can see what is happening to the system and which process is suffering from and/or causing the delays.

The latencytop userspace tool, including screenshots, is available from the project's website.

Code: commit 9745512ce79de686df354dc70a8d1a74d801892d

SMACK, Simplified Mandatory Access Control

Recommended LWN article: "Smack for simplified access control"

The most used MAC solution in Linux is SELinux, a very powerful security framework. SMACK is an alternative MAC framework, not as powerful as SELinux but simpler to use and configure. Linux is all about flexibility, and in the same way that it has several filesystems, this alternative security framework does not aim to replace SELinux; it is just an alternative for those who find it better suited to their needs.

From the LWN article: Like SELinux, Smack implements Mandatory Access Control (MAC), but it purposely leaves out the role based access control and type enforcement that are major parts of SELinux. Smack is geared towards solving smaller security problems than SELinux, requiring much less configuration and very little application support.

Code: commit e114e473771c848c3cfec05f0123e70f1cdbdc99

2.6. BRK and PIE executable randomization

Exec-shield is a project started in 2003 by Red Hat to implement several security protections; it is mainly used in Red Hat and Fedora. Many of its features were merged a long time ago, but not all of them. In 2.6.25 two more are being merged: brk() randomization and PIE executable randomization. These two features should make address space randomization on i386 and x86_64 complete.

Code: commits c1d171a002942ea2d93b4fbd0c9583c56fce0772, cc503c1b43e002e3f1fed70f46d947e2bf349bb6
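One way to observe the effect of brk() randomization is to compare where the heap starts across several runs of the same program. The sketch below is an illustration only: the function name is hypothetical, and it assumes the usual Linux /proc/PID/maps layout in which the heap mapping is labelled "[heap]".

```python
import os


def heap_start(maps_text):
    """Return the start address of the [heap] mapping, or None if absent."""
    for line in maps_text.splitlines():
        fields = line.split()
        # Line format: address-range perms offset dev inode [pathname]
        if fields and fields[-1] == "[heap]":
            start, _end = fields[0].split("-")
            return int(start, 16)
    return None


# On Linux, print this process's own heap start (0 if no heap mapping).
if os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as f:
        print(hex(heap_start(f.read()) or 0))
```

Running the script several times on a kernel with brk randomization enabled should usually print different addresses; with randomization disabled, the address would be expected to stay the same between runs.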

2.7. EXT4 update

Recommended article: "A better ext4"

The EXT4 mainline snapshot gets an update with a bunch of features: multi-block allocation, large blocksize up to PAGESIZE, journal checksumming, large file support, large filesystem support, inode versioning, and in-inode extended attributes on the root inode. These features should be the last ones that require on-disk format changes. Other features that don't affect the disk format, like delayed allocation, have yet to be merged.

Code: commits c9de560ded61faa5b754137b7753da252391c55a, 0040d9875dcccfcb2131417b10fbd9841bc5f05b, 0fc1b451471dfc3cabd6e99ef441df9804616e63, c14c6fd5c56a0d0495d8a7c0f2bc330be658663e, 25ec56b518257a56d2ff41a941d288e4b5ff9488, 725d26d3f09ccb5bac4b4293096b985a312a0d67, 7a224228ed79d587ece2304869000aad1b8e97dd, 8180a5627d126362c2f64e4fa886d6f608d9632a, 818d276ceb83aa9fdebb5e0a53188290312de987, 8e85fb3f305b24b79c6d9cb7a56d22b062335ad3, afc7cbca5bfd556c3e12d3acefbee5ab0cbd4670

2.8. MN10300/AM33 architecture support

The MN10300/AM33 architecture is now supported under the "mn10300" subdirectory. 2.6.25 adds support for MN10300/AM33 CPUs produced by MEI. It also adds board support for the ASB2303 with the ASB2308 daughter board, and for the ASB2305. The only processor supported is the MN103E010, which is an AM33v2 core plus on-chip devices.

Code: commit b920de1b77b72ca9432ac3f97edb26541e65e5dd

3. Subsystems

3.1. Various

3.2. Filesystems

3.3. Networking

3.4. Crypto

3.5. Security

3.6. Architecture-specific changes

4. Drivers

4.1. Graphics


4.3. Sound

4.4. SCSI

4.5. Network

4.6. V4L/DVB

4.7. I2C

4.8. HID

4.9. Input

4.10. USB

4.11. RDMA

4.12. Hwmon

4.13. MTD

4.14. ACPI

* intel_menlo: introduce new platform specific driver (commit cc0573b3250214034062ddf8c64359596d8af521)

4.15. RTC/W1

4.16. Leds

4.17. Various

* mcp23s08 SPI GPIO expander support (commit e58b9e2762a6ef99e20dba47aba21b911658541d)

KernelNewbies: Linux_2_6_25 (last edited 2008-03-31 20:18:03 by diegocalleja)