Linux kernel version 2.6.26 Released 2008 ([ full SCM git log])

Summary: 2.6.26 adds support for read-only bind mounts, x86 PAT (Page Attribute Tables), PCI Express ASPM (Active State Power Management), ports of KVM to IA64, S390 and PPC, other KVM improvements including basic paravirtualization support, preliminary support for the future 802.11s wireless mesh standard, a simple memory tester, a kernel debugger, BDI statistics and parameters exposed in /sys/class/bdi, a new /proc/PID/mountinfo file for more accurate information about mounts, per-process securebits, a device whitelist for container users, support for the OLPC, some new drivers and many small improvements.


1. Important features (AKA: the cool stuff)

1.1. Read-only bind mounts

Recommended LWN article: [ "Read-only bind mounts"]

Linux has supported bind mounts since 2.4.0. A bind mount works somewhat like a directory symlink: it makes the contents of a directory available under a second path. For example, "mount --bind /foo /bar" makes the contents of /foo visible in /bar as well. In other words, /foo and /bar have the same content, and any modification made through one path is visible through the other. This has been useful for things like chroots or FTP/web servers, but until now, if /foo was writable, there was no way to stop /bar from being writable too.

In Linux 2.6.26, you can make those bind mounts read-only. If the bind mount in the previous example were made read-only, the contents of /foo would still show up in /bar, but an application trying to modify a file through /bar would fail (/foo could remain writable, of course). This has a number of uses. It allows chroots to have parts of filesystems writable. It's useful for containers, because users may have root inside a container but should not be allowed to write to some filesystems. It improves security by making sure that parts of your filesystem are read-only (for example, when you don't trust your FTP server), and it helps when you don't want to have entire new filesystems mounted or when you want atime updated selectively.

(The current implementation cannot create a bind mount that is read-only from the start: you need to make the bind mount first, with "mount --bind /foo /bar", and then remount it read-only, with "mount -o remount,ro /bar".)

Code: [;a=commit;h=8366025eb80dfa0d8d94b286d53027081c280ef1 commit 1], [;a=commit;h=aceaf78da92a53f5e1b105649a1b8c0afdb2135c 2], [;a=commit;h=0622753b800e4cc6cb9319b36b27658c72dd7cdc 3], [;a=commit;h=49e0d02cf018d4edf24bfc8531a816a26367e4ce 4], [;a=commit;h=463c3197263bd26ac59a00d2484990e17e35c50e 5], [;a=commit;h=75c3f29de7451677c59580b0a959f694f36aac28 6], [;a=commit;h=9079b1eb1753f217c3de9f1b7dd7fd549cc3f0cf 7], [;a=commit;h=18f335aff86913de3c76f88d32c8135c1da62ce6 8], [;a=commit;h=a761a1c03a739f04afd6c8d37fd16405bbe754da 9], [;a=commit;h=cdb70f3f74b31576cc4d707a3d3b00d159cab8bb 10], [;a=commit;h=20ddee2c75339cc095f6191c3115f81da8955e96 11], [;a=commit;h=42a74f206b914db13ee1f5ae932dcd91a77c8579 12], [;a=commit;h=74f9fdfa1f229284ee1ea58fa47f2cdeeb12f6fe 13], [;a=commit;h=2af482a7edfb8810539cacc2fdd8242611ca43bb 14], [;a=commit;h=4a3fd211ccfc08a88edc824300e25a87785c6a5f 15], [;a=commit;h=9ac9b8474c39c3ae2c2b37d8e1f08db8a9146124 16], [;a=commit;h=2f676cbc0d60ae806216c7a61c6971bd72dedde8 17], [;a=commit;h=ec82687f29127a954dd0da95dc1e0a4ce92b560c 18], [;a=commit;h=2c463e95480829a2fe8f386589516e13b1289db6 19], [;a=commit;h=2e4b7fcd926006531935a4c79a5e9349fe51125b 20], [;a=commit;h=3d733633a633065729c9e4e254b2e5442c00ef7e 21], [;a=commit;h=ad775f5a8faa5845377f093ca11caf577404add9 22]
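
The two-step sequence described above can be sketched as a small Python helper. This is only an illustration: the function names are hypothetical, the mount(8) arguments are the ones quoted in the text, and actually running them requires root.

```python
import subprocess


def readonly_bind_commands(source, target):
    """Return the two mount(8) invocations, in order, as argument lists."""
    return [
        ["mount", "--bind", source, target],    # step 1: ordinary bind mount
        ["mount", "-o", "remount,ro", target],  # step 2: remount the bind read-only
    ]


def make_readonly_bind(source, target, dry_run=True):
    """Print the commands, or execute them when dry_run is False (needs root)."""
    for argv in readonly_bind_commands(source, target):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
```

Calling make_readonly_bind("/foo", "/bar") with the default dry_run=True only prints the two commands, which makes it safe to try on any machine.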

1.2. KVM ported to IA64, PPC and S390

KVM, the virtualization solution included in [ Linux 2.6.20], has been rearchitected to support architectures other than x86: IA64 (Itanium), S390 and PPC.

S390: [;a=commit;h=402b08622d9ac6e32e25289573272e0f21bb58a7 (commit 1], [;a=commit;h=5b7baf05783b1ac97a510243d7e82293416a7cf6 2], [;a=commit;h=8a88ac6183975c73c65b45f365f6f3b875c1348b 3], [;a=commit;h=b0c632db637d68ad39d9f97f452ce176253f5f4e 4], [;a=commit;h=8f2abe6a1e525e878bdf58f68ccd146d543fde84 5], [;a=commit;h=ba5c1e9b6ceebdc39343cc03eb39f077abd3c571 6], [;a=commit;h=453423dce2785b8e22077e3b3eeecb4f60fe3470 7], [;a=commit;h=5288fbf0ef041ba0e8b4dcb2df4536b5e3a48b32 8], [;a=commit;h=e28acfea5dd9dbc67c2594cbefc140129dbd0e3f 9], [;a=commit;h=77b455f1bcfa0fddb31b8e6f9f2adc246acb4216 10], [;a=commit;h=5ecee4ba4eb2ada7ece7c41eb08cf7bc51b579e2 11], [;a=commit;h=fa5877439d5a062d91c3abd5a690483bbdb4268e 12], [;a=commit;h=e976a2b997fc4ad70ccc53acfe62811c4aaec851 13)]

IA64: [;a=commit;h=1a9c1ac46990194f6b6ddc591c24e385e611fa25 (commit 1], [;a=commit;h=a4f500381ac91969fa4f8b0a4e39e76dbf00a913 2], [;a=commit;h=b024b79322aad213cd2d4f30c23a6c626a0d5b31 3], [;a=commit;h=bb46fb4af160ec7ae6e5102a79a3b2518eaee7af 4], [;a=commit;h=964cd94a2ae3b20f9da9bd43b31aac32c4fe9aee 5], [;a=commit;h=fbd4b5621c8db767f78c89d1ac708ac4bb276caf 6], [;a=commit;h=e30af4ce7fea3d3a470f8f9996c53564f34e4754 7], [;a=commit;h=a793537a970584720347293935a4bb6323791a05 8], [;a=commit;h=60a07bb9baa83e17d4b540a2f371661ecc353c6c 9], [;a=commit;h=7fc86bd9c0830651826d88c65b6aad55086a6e01 10], [;a=commit;h=d62998a681f4688605895bb7068d76d25132e3a2 11], [;a=commit;h=827fa691e41a538bbe941d9c988e07e6abea1648 12], [;a=commit;h=ad86b6c36bbb9c1cac610f1b8a310d87eafea778 13], [;a=commit;h=b693919ca983e9eb989d89dac5493ef3c5e98e77 14], [;a=commit;h=fdae862f91728aec6dd8fd62cd2398868c906b6b 15)]

PowerPC 440: [;a=commit;h=bbf45ba57eaec56569918a8bab96ab653bd45ec1 (commit)]

1.3. Wireless mesh networking (802.11s) draft support

A year ago, in [ Linux 2.6.22], Linux included a new wireless stack. In 2.6.26 that stack adds support for the draft of the wireless mesh networking standard ([ 802.11s]), thanks to the [ open80211s project].

Code: [;a=commit;h=37c5798968d0ce4d479f114f1d5785551b57bfa5 (commit 1], [;a=commit;h=cc0672a1066829be7e1b0128a13e36a2d0a15479 2], [;a=commit;h=2e3c8736820bf72a8ad10721c7e31d36d4fa7790 3], [;a=commit;h=2ec600d672e74488f8d1acf67a0a2baed222564c 4], [;a=commit;h=33b64eb2b1b1759cbdafbe5c59df652f1e7c746e 5], [;a=commit;h=6032f934c818e5c3435c9f17274fe1983f53c6b4 6], [;a=commit;h=c3896d2ca4dd97be290f000cb1079ed759d28574 7], [;a=commit;h=ccf80ddfe4923ae75cd3536723880277d285e779 8], [;a=commit;h=ee3858551ae6d044578f598f8001db5f1a9fd52e 9], [;a=commit;h=f709fc696d72d31273a77b82aa32cb6d19857011 10], [;a=commit;h=050ac52cbe1f3de2fb0d06f02c7919ae1f691c9e 11], [;a=commit;h=9f42f607058a80bfb7b4f687bb84016ae129cfd1 12], [;a=commit;h=c5dd9c2bd0b2422dbcd57fe8158d1d7d36c07dd9 13], [;a=commit;h=eb2b9311fd00a868e9bf85ab66e86b7dee1643e1 14], [;a=commit;h=f7a921443740d7dafc65b17aa32531730d358f50 15], [;a=commit;h=2f5ce793c0817d8d38f1c7ad23945607d57e47d6 16], [;a=commit;h=5c142e8db4b2a10dad103d49f309381cb9fc6a87 17)]

1.4. x86 PAT support

PAT (Page Attribute Table) is an x86 processor feature that allows setting memory attributes at page granularity. PAT is complementary to the MTRR settings, which set memory types over physical address ranges. However, PAT is more flexible than MTRR, because it sets attributes at the page level and because the hardware places no limit on the number of such attribute settings. Linux support for PAT has been in the works for a long time (since 2006!), but it's finally here.

Documentation [;a=commit;h=d27554d874c7eeb14c8bfecdc39c3a8618cd8d32 (commit)] Code: [;a=commit;h=2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6 (commit)]
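
To make "attributes at page granularity" more concrete, the following Python sketch shows how a page's memory type is looked up. The encodings, the power-on default of the IA32_PAT register and the index formula are taken from the Intel architecture manuals, not from this page, and the function names are hypothetical. It models the hardware lookup only; it is not the kernel's code.

```python
# Memory-type encodings used in PAT entries (assumed from the Intel SDM).
PAT_TYPES = {0x00: "UC", 0x01: "WC", 0x04: "WT", 0x05: "WP", 0x06: "WB", 0x07: "UC-"}

# Power-on default value of the IA32_PAT MSR (assumed from the Intel SDM).
DEFAULT_IA32_PAT = 0x0007040600070406


def decode_pat_msr(value):
    """Split the 64-bit MSR into its eight entries PA0..PA7 (one per byte)."""
    return [PAT_TYPES.get((value >> (8 * i)) & 0x07, "reserved") for i in range(8)]


def pat_index(pat, pcd, pwt):
    """Combine the PAT, PCD and PWT bits of a page-table entry into an index 0..7."""
    return (pat << 2) | (pcd << 1) | pwt


def page_memory_type(msr_value, pat, pcd, pwt):
    """Return the memory type selected for a page by its three attribute bits."""
    return decode_pat_msr(msr_value)[pat_index(pat, pcd, pwt)]
```

Because the type is selected by bits in each page-table entry, every page can choose any of the eight entries independently, which is where the extra flexibility over MTRR ranges comes from.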

1.5. Per-process securebits

Recommended LWN article: [ "Per-process securebits"]

Filesystem capability support makes it possible to do away with (set)uid-0 based privilege and use capabilities instead. That is, with filesystem support for capabilities but without this feature, it is (conceptually) possible to manage a system with capabilities alone and never need to obtain privilege via (set)uid-0. Of course, "conceptually possible" isn't quite the same as "currently possible", since few user applications, certainly not enough to run a viable system, are currently prepared to leverage capabilities to exercise privilege. Further, many applications exist that may never get upgraded in this way, and the kernel will continue to want to support their setuid-0 based privilege needs.

Where pure-capability applications evolve and replace setuid-0 binaries, it is desirable that there be a mechanism by which they can contain their privilege. In addition to leveraging the per-process bounding and inheritable sets, this should include suppressing the privilege of the uid-0 superuser from the process' tree of children. The feature added in 2.6.26 can be leveraged to suppress the privilege associated with (set)uid-0. This suppression requires CAP_SETPCAP to initiate, and only immediately affects the 'current' process (it is inherited through fork()/exec()). This reimplementation differs significantly from the historical support for securebits, which was system-wide, unwieldy and ultimately withered to a dead relic in the source of the modern kernel.

Code: [;a=commit;h=3898b1b4ebff8dcfbcf1807e0661585e06c9a91c (commit)]
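
As a rough illustration of the per-process interface, the sketch below reads and decodes the securebits of the calling process through prctl(2). The constant values are assumed from the kernel headers (linux/prctl.h and linux/securebits.h), not from this page. The helper names are hypothetical, and calling a variadic function through ctypes is a simplification that works on common Linux platforms.

```python
import ctypes
import os

# prctl(2) options, assumed from <linux/prctl.h>.
PR_GET_SECUREBITS = 27
PR_SET_SECUREBITS = 28

# Flag values, assumed from <linux/securebits.h>.
SECUREBITS = {
    "SECBIT_NOROOT": 1 << 0,
    "SECBIT_NOROOT_LOCKED": 1 << 1,
    "SECBIT_NO_SETUID_FIXUP": 1 << 2,
    "SECBIT_NO_SETUID_FIXUP_LOCKED": 1 << 3,
    "SECBIT_KEEP_CAPS": 1 << 4,
    "SECBIT_KEEP_CAPS_LOCKED": 1 << 5,
}


def decode_securebits(value):
    """Return the names of the flags set in a securebits value."""
    return [name for name, bit in SECUREBITS.items() if value & bit]


def _prctl(option, arg):
    """Call prctl(2) and raise OSError on failure (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    result = libc.prctl(option, ctypes.c_ulong(arg), 0, 0, 0)
    if result < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return result


def get_securebits():
    """Read the securebits of the calling process."""
    return _prctl(PR_GET_SECUREBITS, 0)


def set_securebits(value):
    """Set the securebits of the calling process (needs CAP_SETPCAP)."""
    _prctl(PR_SET_SECUREBITS, value)
```

On a Linux system, decode_securebits(get_securebits()) lists the flags active for the current process. Setting SECBIT_NOROOT is the kind of step that suppresses the special treatment of uid 0 for the process and its children.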

1.6. KGDB

For many years Linux has not included a kernel debugger. Linus Torvalds vetoed them for years, for reasons that he explained quite well in a [ well-known email]: "When things crash and you fsck and you didn't even get a clue about what went wrong, you get frustrated. Tough. There are two kinds of reactions to that: you start being careful, or you start whining about a kernel debugger [...] Quite frankly, I'd rather weed out the people who don't start being careful early rather than late. That sounds callous, and by God, it _is_ callous. But it's not the kind of "if you can't stand the heat, get out the the kitchen" kind of remark that some people take it for. No, it's something much more deeper: I'd rather not work with people who aren't careful. It's darwinism in software development. [..] I happen to believe that not having a kernel debugger forces people to think about their problem on a different level than with a debugger. I think that without a debugger, you don't get into that mindset where you know how it behaves, and then you fix it from there. Without a debugger, you tend to think about problems another way. You want to understand things on a different _level_."

Despite those objections, many people wanted a debugger, and KGDB is finally going in. It's a remote debugger, so it needs two machines. x86 and sparc machines are supported.

Code: [;a=commit;h=82da3ff89dc2a1842cff9b0d4cbc345cb90b59e1 (commit 1], [;a=commit;h=dc7d552705215ac50a0617fcf51bb9c736255b8e 2], [;a=commit;h=6cdf6e06d70dcf42314edb2c43b7c7ebc56e32e5 3], [;a=commit;h=7c3078b637882303b1dcf6a16229d0e35f6b60a5 4], [;a=commit;h=64e9ee3095b61d0300ea548216a57d2536611309 5], [;a=commit;h=e3e2aaf7dc0d82a055e084cfd48b9257c0c66b68 6], [;a=commit;h=e8d31c204e36e019b9134f2a11926cac0fcf9b19 7], [;a=commit;h=e2fdd7fd99dd68b77caaf2a2272b75b5da890de7 8)]

1.7. Device whitelist on cgroups

Recommended LWN article: [ "Device whitelist on cgroups"]

This feature implements functionality wanted by some virtualization users: the ability to control access to devices on a per-container basis. A cgroup is used to track and enforce open and mknod restrictions on device files. More details can be found in the commit link.

Code: [;a=commit;h=08ce5f16ee466ffc5bf243800deeecd77d9eaf50 (commit)]
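
The sketch below shows what a whitelist entry could look like when built from Python. The entry format (a type of a, b or c; major:minor numbers or *; access letters r, w and m) and the devices.allow file name are taken from the commit description, not from this page. The helper names and the cgroup directory are hypothetical.

```python
import os


def whitelist_entry(dev_type, major, minor, access):
    """Format one whitelist entry: type, major:minor, access."""
    if dev_type not in ("a", "b", "c"):
        raise ValueError("type must be 'a' (all), 'b' (block) or 'c' (char)")
    for number in (major, minor):
        if number != "*" and not (isinstance(number, int) and number >= 0):
            raise ValueError("major/minor must be a non-negative integer or '*'")
    if not access or set(access) - set("rwm"):
        raise ValueError("access must combine 'r' (read), 'w' (write), 'm' (mknod)")
    return "{} {}:{} {}".format(dev_type, major, minor, access)


def allow_device(cgroup_dir, entry):
    """Write an entry to the cgroup's devices.allow file (path is an assumption)."""
    with open(os.path.join(cgroup_dir, "devices.allow"), "w") as handle:
        handle.write(entry + "\n")
```

For example, whitelist_entry("c", 1, 3, "rm") produces "c 1:3 rm", which would let tasks in the cgroup read and mknod character device 1:3 (typically /dev/null) but not write to it.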

1.8. Memtest

[ Memtest] is a commonly used tool for checking your memory. In 2.6.26 Linux includes its own in-kernel memory tester. The goal is not to replace memtest; in fact, this tester is much simpler and less capable than memtest. But it has some advantages: it runs on platforms other than x86, and it is easily enabled with the "memtest" boot parameter.

Code: [;a=commit;h=272b9cad6e7a2f61b13cfcd7dde0010e02e9376e (commit)], [;a=commit;h=c64df70793a9c344874eb4af19f85e0662d2d3ee (commit)]

1.9. Export BDI attributes in sysfs

[ Linux 2.6.24] merged per-device dirty thresholds: the limit that the kernel puts on the amount of memory that a process can "dirty" changed from being global to being per-device. 2.6.26 exposes an interface in /sys/class/bdi that allows setting several parameters. There's another set of read-only parameters that are exposed in debugfs (debug/bdi/<bdi>/stats).

Code: [;a=commit;h=cf0ca9fe5dd9e3693d935757a7b2fc50fc576554 (commit 1], [;a=commit;h=fa799759f9801137f665dbedda2c0815f1bf6f1b 2], [;a=commit;h=b6f2fcbcfca9db2bd7aa24940224fcd3bbdbb8aa 3], [;a=commit;h=189d3c4a94ef19fca2a71a6a336e9fda900e25e7 4], [;a=commit;h=76f1418b485da2707531178e517bbb5cf06b3c76 5], [;a=commit;h=a42dde04152750426cc620fd277e80fffae2f65a 6)]
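
A minimal sketch of reading and writing these parameters from Python follows. The attribute names read_ahead_kb, min_ratio and max_ratio are assumed from the kernel's sysfs ABI documentation rather than from this page, the helper names are hypothetical, and writing normally requires root.

```python
import os

# Attribute names assumed from the kernel's sysfs-class-bdi documentation.
BDI_ATTRIBUTES = ("read_ahead_kb", "min_ratio", "max_ratio")


def read_bdi_attributes(bdi_name, root="/sys/class/bdi"):
    """Return the integer value of each known attribute that exists for a BDI."""
    values = {}
    for attr in BDI_ATTRIBUTES:
        path = os.path.join(root, bdi_name, attr)
        try:
            with open(path) as handle:
                values[attr] = int(handle.read().strip())
        except FileNotFoundError:
            continue  # attribute not provided by this kernel or device
    return values


def write_bdi_attribute(bdi_name, attr, value, root="/sys/class/bdi"):
    """Set one attribute of a BDI (usually needs root)."""
    with open(os.path.join(root, bdi_name, attr), "w") as handle:
        handle.write(str(value))
```

Directory names under /sys/class/bdi typically follow the device's major:minor numbers, so a call such as read_bdi_attributes("8:0") would be a plausible way to inspect the first SCSI disk.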

1.10. /proc/pid/mountinfo

The work being done these days in the VFS, like per-process namespaces, is making some things obsolete, such as /proc/mounts (which is always a link to /proc/self/mounts). In its current form, /proc/mounts lacks important information and suffers from some problems (see the code link). 2.6.26 introduces /proc/PID/mountinfo, which addresses these deficiencies. The information available in these new files is explained in the commit links.

Code: [;a=commit;h=6092d048183b76bfa3f84b32f8158dd8d10bd811 (commit 1], [;a=commit;h=9d1bc60138977d9c79471b344a64f2df13b2ccef 2], [;a=commit;h=73cd49ecdde92fdce131938bdaff4993010d181b 3], [;a=commit;h=719f5d7f0b90ac2c8f8ca4232eb322b266fea01e 4], [;a=commit;h=a1a2c409b666befc58c2db9c7fbddf200f153470 5], [;a=commit;h=2d4d4864ac08caff5c204a752bd004eed4f08760 6], [;a=commit;h=97e7e0f71d6d948c25f11f0a33878d9356d9579e 7)]
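
As an illustration of what the new file offers, here is a small Python parser for one mountinfo line. The field layout (mount ID, parent ID, major:minor, root, mount point, mount options, zero or more optional fields, a "-" separator, then filesystem type, source and superblock options) is assumed from the kernel's proc documentation, not from this page. The names MountInfo, parse_mountinfo_line and read_mountinfo are hypothetical.

```python
from collections import namedtuple

MountInfo = namedtuple(
    "MountInfo",
    "mount_id parent_id major_minor root mount_point mount_options "
    "optional_fields fs_type source super_options",
)


def parse_mountinfo_line(line):
    """Parse one line of /proc/PID/mountinfo into a MountInfo tuple."""
    fields = line.split()
    # Optional fields start at index 6 and end at a lone "-" separator.
    sep = fields.index("-", 6)
    tail = fields[sep + 1:]
    if len(tail) < 3:
        raise ValueError("truncated mountinfo line: %r" % line)
    return MountInfo(
        int(fields[0]), int(fields[1]), fields[2], fields[3], fields[4],
        fields[5], fields[6:sep], tail[0], tail[1], tail[2],
    )


def read_mountinfo(pid="self"):
    """Parse every entry of /proc/<pid>/mountinfo (Linux only)."""
    with open("/proc/{}/mountinfo".format(pid)) as handle:
        return [parse_mountinfo_line(line) for line in handle if line.strip()]
```

The mount ID and parent ID let a program rebuild the mount tree, and the root field shows which part of a filesystem a bind mount exposes. /proc/mounts provides neither.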

1.11. Generic semaphores

Since the introduction of mutexes, semaphores are no longer performance-critical, so the old architecture-specific (and often hand-coded in assembly) implementation has been replaced by a generic one written in C for maintainability, debuggability and extensibility. It removes 7365 LoC.

Code: [;a=commit;h=64ac24e738823161693bf791f87adc802cf529ff (commit)]
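
To show why a portable implementation is feasible, here is a conceptual counting semaphore in Python, built only from a lock, a counter and a wait queue. It is an analogy and not the kernel's C code: the class and method names are hypothetical, and threading.Condition stands in for the kernel's spinlock and waiter list.

```python
import threading


class CountingSemaphore:
    """Counting semaphore made from a lock, a counter and a wait queue."""

    def __init__(self, count):
        if count < 0:
            raise ValueError("count must be non-negative")
        self._count = count
        self._cond = threading.Condition()  # provides both the lock and the queue

    def down(self):
        """Acquire the semaphore, sleeping while the count is zero."""
        with self._cond:
            while self._count == 0:
                self._cond.wait()
            self._count -= 1

    def try_down(self):
        """Acquire without sleeping; return True on success, False otherwise."""
        with self._cond:
            if self._count == 0:
                return False
            self._count -= 1
            return True

    def up(self):
        """Release the semaphore and wake one waiter, if any."""
        with self._cond:
            self._count += 1
            self._cond.notify()
```

Nothing in this logic depends on the processor architecture, which is the property that allows a single generic version to replace the per-architecture ones once raw speed no longer matters.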

2. Subsystems

2.1. Various

2.2. Filesystems

2.3. Networking

2.4. Crypto

2.5. Security

2.6. KVM

3. Architecture-specific changes

4. Drivers


4.2. Networking

4.3. Graphics

4.4. Sound

4.5. Input

4.6. V4L/ DVB

4.7. SCSI


4.9. HWMON

4.10. USB

4.11. Infiniband

4.12. ACPI and Power Management

4.13. MTD

4.14. I2C

4.15. Various

KernelNewbies: Linux_2_6_26 (last edited 2008-07-01 22:59:07 by diegocalleja)