KernelNewbies:

Linux 2.6.22 Released, 2007 ([http://kernel.org/pub/linux/kernel/v2.6/testing/ChangeLog-2.6.22 full SCM git log])


Short overview (for news sites, etc)

Important things (AKA: ''the cool stuff'')

New Slab allocator: SLUB

Recommended article from [http://lwn.net]: [http://lwn.net/Articles/229984/ "The SLUB allocator"]

The slab allocator is an object-caching kernel memory allocator used for dealing with "objects that are frequently allocated and freed" (see the [http://citeseer.ist.psu.edu/bonwick94slab.html "slab allocator" paper from Jeff Bonwick]). It's a critical piece of the internals of the memory management subsystem, and critical for good performance. The Linux slab allocator works quite well for pretty much everybody; however, some people (SGI) have found its current design inefficient in some cases. For example, in configurations with 1K nodes/processors, several GB of memory are wasted on object queues alone, not counting the objects themselves. It has also become too complex as it grew features like proper NUMA policy support.

As a result, a new slab allocator called "SLUB" has been developed by Christoph Lameter from SGI to solve those and other problems. Its design is simpler, and it also addresses some problems in ways that can result in better performance in some cases and more efficient memory usage (see the full design notes in this [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=81819f0fc8285a2a5a921c019e3e3d7b6169d225 commit link]). It also has better debug capabilities. There's a slabinfo userspace tool that you can find in Documentation/vm/slabinfo.c.

Its aim is to transparently replace slab, but in 2.6.22 this new slab allocator is optional and not enabled by default. You can enable it at compile time (making it the third option along with SLOB, the embedded-oriented slab allocator). SLUB has been tested for some time and it's solid enough to try on your systems, but due to the importance of this part of the kernel, it won't completely replace the current slab allocator until it has had more exposure and testing; hence it's not yet recommended for production systems. Testing reports, especially of regressions, are greatly appreciated.

[http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=mm/slub.c;hb=HEAD source code of mm/slub.c]; [http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/vm/slub.txt user documentation]

New Wireless stack

For too long, Linux wireless support hasn't been as bright as it should be. 2.6.22 changes this. In this release a completely new, better wireless stack is being included. This new wireless stack has been donated by the well-known wifi specialist company Devicescape (many thanks to [http://www.devicescape.com Devicescape] for their contribution and [http://www.devicescape.org support to open source]!). This wireless stack has many features, like a complete software MAC implementation, WEP, WPA, a "link-layer" bridging module, hostapd, QoS support to prioritize things like VoIP, 802.11g support and full debug capabilities, all in a single implementation that drivers can use without implementing parts of those features themselves, as has sadly been done at times in the Linux wifi world.

Another feature of this stack is a completely new user interface. The old stacks have an (ugly) ioctl-based interface which was standardized under the name of wext, "wireless extensions". The new interface is netlink-based and suited to the needs of desktop-based configuration interfaces, while retaining userspace compatibility with the old interface.

The disadvantage is the lack of drivers using this stack: the drivers that have been in the tree for a long time do not support it and will need to be ported (which will hopefully not be that hard, since the new stack is actually a much better ground to build drivers upon than the current mess). Quite a lot of new and ported drivers already use the new stack; they have not been merged in this release but will get merged in future releases, like the RT2x00 drivers, the bcm43xx driver, zd1211rw, adm8211, rtl818x, Intel iwlwifi (ipw3945 and ipw4965).... (distros like Ubuntu and Fedora are already using them).

In any case, this is the starting building block that will bring top-class wireless support to Linux.

[http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f0706e828e96d0fa4e80c0d25aa98523f6d589a0 (commit 1)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=64a327a7029d3860ddf6a024816afa9e6673eb57 2], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a9de8ce0943e03b425be18561f51159fcceb873d 3], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e9f207f0ff90bf60b825800d7450e6f2ff2eab88 4], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=704232c2718c9d4b3375ec15a14fc0397970c449 5], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2a5e1c0eb9efe26eed1dd072fe08de5797a7efd5 6], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9e101eab153073d8a1fc7ea22b20af65de8ab44b 7)]

New Firewire stack

The FireWire stack is also getting a rewrite, with the old stack being kept around. The main driver behind this work, according to the author, is "to get a small, maintainable and supportable FireWire stack, with an acceptable backwards compatibility story".

This stack has many advantages:

The regressions are:

Code: [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3038e353cfaf548eb94f02b172b9dbe412abd24c (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9ba136d0fe5a3dd33533b4a2a21156aa22f80ebe (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ed5689122f4cdb5cb8c6770ad1a2c8561b32d9b3 (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=19a15b937b26638933307bb02f7b1801310d6eb2 (commit)]

Blackfin architecture

2.6.22 adds support for yet another architecture: the Analog Devices Blackfin processor architecture. It currently supports the BF533, BF532, BF531, BF537, BF536, BF534, and BF561 (Dual Core) devices, with a variety of development platforms including those available from Analog Devices (BF533-EZKit, BF533-STAMP, BF537-STAMP, BF561-EZKIT) and Bluetechnix Tinyboards.

The Blackfin architecture was jointly developed by Intel and Analog Devices Inc. (ADI) as the Micro Signal Architecture (MSA) core and introduced in December 2000. Since then ADI has put this core into its Blackfin processor family of devices. The Blackfin core has the advantages of a clean, orthogonal, RISC-like microprocessor instruction set. It combines a dual-MAC (Multiply/Accumulate), state-of-the-art signal processing engine and single-instruction, multiple-data (SIMD) multimedia capabilities into a single instruction-set architecture.

The Blackfin architecture, including the instruction set, is described by the [http://blackfin.uclinux.org/gf/download/frsrelease/29/2549/Blackfin_PRM.pdf ADSP-BF53x/BF56x Blackfin Processor Programming Reference]. The Blackfin processor is already supported by major releases of gcc, and [http://blackfin.uclinux.org/gf/project/toolchain/frs binary and source rpms/tarballs are available for many architectures]. There is [http://docs.blackfin.uclinux.org/ complete documentation, including "getting started" guides], which provides links to the sources and patches you will need in order to set up a cross-compiling environment for bfin-linux-uclibc. All the code is actively supported by Analog Devices Inc. at http://blackfin.uclinux.org

[http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1394f03221790a988afc3e4b3cb79f2e477246a9 (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a5f6abd4f7558fea97bc4021fd0eb7dcc5d16a77 (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=8cc75c9a1498913d668b6d3559940c6837cee8bf (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d24ecfcc3953f9c3b833508cd839be614a3f3c64 (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0851a2848cfd40012063ca9cf86fb67b7bebceff (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=194de5612777a9ff4f96dae1932f77a5a89e5f0a (commit)]

UBI

The shortest description for UBI is "LVM for NAND flash memory devices". Why duplicate LVM? Well, because flash devices can't really be handled as typical hard disks. UBI provides wear-leveling support across the whole flash chip. UBI completely hides two aspects of flash chips which make them very difficult to work with: 1. wear of eraseblocks; 2. bad eraseblocks. UBI also makes it possible to dynamically create, delete and re-size flash partitions (UBI volumes).

http://www.linux-mtd.infradead.org/doc/ubi.html

Secure RxRPC sockets

* Provide secure RxRPC sockets for use by both userspace and the kernel [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=17926a79320afa9b95df6b977b40cca6d8713cea (commit)]
* Add an interface to the AF_RXRPC module for the AFS filesystem to use [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=651350d10f93bed7003c9a66e24cf25e0f8eed3d (commit)]
* Make the in-kernel AFS filesystem use AF_RXRPC instead of the old RxRPC code [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=08e0e7c82eeadec6f4871a386b86bf0f0fbcb4eb (commit)]
* Delete the old RxRPC code. [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=63b6be55e8b51cb718468794d343058e96c7462c (commit)]

Signal/timer event notifications through file descriptors

Linux currently lacks a proper way to get complete event reporting like other systems do. poll/epoll isn't a solution for everything, because it only works on file descriptors, so things like timer and signal notifications aren't covered by it. To get, for example, signal notifications in the main event loop, people have needed to use (clever) hacks, like writing a byte between the two ends of an internal pipe.

After considering the inclusion of [http://linux-net.osdl.org/index.php/Kevent an implementation] of a [http://www.freebsd.org/cgi/man.cgi?query=kevent&apropos=0&sektion=0&manpath=FreeBSD+6.2-RELEASE&format=html FreeBSD/OSX-like] generic event notification mechanism, a simpler, more Unixy solution ([http://groups.google.com/group/linux.kernel/msg/1f3fc521db812a07 inspired by Linus] some years ago) has been adopted.

Three new syscalls have been added: signalfd(), timerfd() and eventfd(). These syscalls implement event delivery into file descriptors, so you can use the standard read(), select(), poll() and epoll calls on those fds.

http://lwn.net/Articles/225714/

The signalfd() system call implements signal delivery into a file descriptor. That fd supports the standard calls poll(), read(), select(), epoll() etc. This allows a program to receive signals via that file descriptor, which is more flexible than traditional signal handlers.

The timerfd() system call implements timer event delivery into file descriptors, so you can use the standard calls poll(), epoll(), select() and read() on timers.

The eventfd() system call creates a very simple and lightweight file descriptor that can be used as an event wait/dispatch mechanism by userspace (both wait and dispatch) and by the kernel (dispatch only). It can be used instead of pipe(2) in all cases where a pipe would simply be used to signal events. Its kernel overhead is much lower than that of a pipe, and it does not consume two fds. When used in the kernel, it can offer an fd-bridge to enable, for example, functionalities like KAIO or syslets/threadlets to signal to an fd the completion of certain operations. More generally, an eventfd can be used by the kernel to signal readiness, in a POSIX poll/select way, of interfaces that would otherwise be incompatible with it.

Code: Anonymous inode source [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5dc8bf8132d59c03fe2562bce165c2f03f021687 (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=da66f7cb0f69ab27dbf5b9d0b85c4b97716c44d1 (commit)] ; signalfd: [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fba2afaaec790dc5ab4ae8827972f342211bbb86 (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2121e24bd8dd16b4e3f8d995428e2a748d5180cc (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6d18c9220965b437287c3a7e803725c24992ceac (commit)]; timerfd: [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b215e283992899650c4271e7385c79e26fb9a88e (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=57ac8898508638ca6d15ecd8b911a431d673ff30 (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=83f5d1266926c75890f1bc4678e49d79483cb573 (commit)]; eventfd: [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e1ad7468c77ddb94b0615d5f50fa255525fde0f0 (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fdb902b1225e1668315f38e96d2f439452c03a15 (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9c3060bedd84144653a2ad7bea32389f65598d40 (commit)]

Process footprint measurement facility

2.6.22 adds a "Referenced" line to each VMA in /proc/pid/smaps, which indicates how many pages within it are currently marked as referenced or accessed. There's also a new /proc/pid/clear_refs file. When any non-zero number is written to this clear_refs file, the Referenced field is cleared.

With these mechanisms it is now possible to measure approximately how much memory a task is using by clearing the reference bits with "echo 1 > /proc/pid/clear_refs" and checking the Referenced count for each VMA in the /proc/pid/smaps output after a measured time interval (e.g. 1 second). This is a valuable tool to get an approximate measurement of the memory footprint of a task.

[http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f79f177c25016647cc92ffac8afa7cb96ce47011 (commit)], [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b813e931b4c8235bb42e301096ea97dbdee3e8fe (commit)]

utimensat()

The next revision of POSIX will support fine-grained filesystem timestamps, and struct stat will report nanosecond values. During development one additional problem was found: there is no interface to set the file timestamp with that precision. utimes() only takes a timeval structure, which allows only microsecond resolution. This is why the utimensat() interface was created. It is basically the same as the futimesat() interface, but it takes a timespec structure.

Code: [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1c710c896eb461895d3c399e15bb5f20b39c9073 (commit)]

New drivers

Crashing soon a kernel near you

This is a list of some of the ongoing patches being developed in the kernel community that will be part of future Linux releases. Those features may take many months to get into Linus' git tree, or may be dropped. The features are tested in the -mm tree, but be warned: it can crash your machine, eat your data (unlikely but not impossible) or kidnap your family (just because it has never happened doesn't mean you're safe):

Various core changes

Architecture-specific changes

Various subsystems

Filesystems

Networking

DM

SELinux

Audit

* auditing ptrace [http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a5cb013da773a67ee48d1c19e96436c22a73a7eb (commit)]

Crypto

KVM

Power Management

Drivers

Network drivers

SATA/IDE/SCSI

Graphics

Sound


Input

MTD

USB

V4L/DVB

Added VIDIOC_INT_G_STD_OUTPUT and VIDIOC_INT_S_STD_OUTPUT to allow drivers to set the TV standard for video output separately from the video capture. This is needed for cx23415 support where the decoder is separate from the encoder and can have a different TV standard. Modified the saa7127 module to listen to VIDIOC_INT_G/S_STD_OUTPUT instead of VIDIOC_G/S_STD.

I2C

Bluetooth

Cpufreq

ACPI

HwMon

Watchdog

Various

KernelNewbies: Linux_2_6_22 (last edited 2007-06-06 21:26:53 by diegocalleja)