The Linux 2.6.27 kernel ([ full SCM git log]) is neither released nor finished. Its merge window was officially closed by the [ 2.6.27-rc1 release on July 28th 2008.]

Summary: 2.6.27 adds support for ...


1. Important features (the cool stuff)

1.1. UBIFS

UBIFS is a new filesystem designed to work with flash devices, developed by Nokia with the help of the University of Szeged. It's important to understand that UBIFS is very different from any traditional filesystem: UBIFS does not work with block-based devices, but with pure flash-based devices, handled by the MTD subsystem in Linux. Hence, UBIFS does not work with what many people consider flash devices, such as flash-based hard drives, SD cards, USB sticks, etc., because those devices use a block device emulation layer called FTL (Flash Translation Layer) that makes them look like traditional block-based storage devices to the outside world. UBIFS is instead designed to work with flash devices that do not have a block device emulation layer, that are handled by the MTD subsystem, and that present themselves to userspace as MTD devices.

UBIFS works on top of UBI volumes. UBI is an LVM-like layer which was included in [ Linux 2.6.22] and which itself works on top of MTD devices. UBIFS offers various advantages over JFFS2: faster and scalable mount times (unlike JFFS2, UBIFS does not have to scan the whole media when mounting), tolerance to unclean reboots (UBIFS is a journaling filesystem), write-back (which dramatically improves performance), and support for on-the-fly compression.

Documentation: UBIFS [ FAQ], more [ documentation]

Code: [;a=commit;h=1e51764a3c2ac05a23a22b2a95ddee4d9bffb16d (commit)], [;a=commit;h=0d7eff873caaeac84de01a1acdca983d2c7ba3fe (commit)], [;a=commit;h=e56a99d5a42dcb91e622ae7a0289d8fb2ddabffb (commit)]

1.2. Ext4: Delayed Allocation

In this release, Ext4 adds one of its most important planned features: delayed allocation, also called [ "Allocate-on-flush"]. It doesn't change the disk format in any way, but it improves performance in a wide range of workloads. This is how it works: when an application write()s data to the disk, the data is usually not written to the disk immediately; it's cached in RAM for a while. But despite not writing the data immediately, the filesystem allocates the necessary disk structures for it immediately. Delayed allocation consists of not allocating space for that cached data; instead, only the free space counter is updated when write() is called. The on-disk blocks and structures are allocated only when the cached data is finally written to the disk, not when a process writes something (IOW: "delayed allocation"). This approach, used by filesystems such as XFS, btrfs, ZFS, or Reiser 4, noticeably improves performance on many workloads. It also results in better block allocation decisions, because when allocation decisions are made at write() time, the block allocator cannot know whether any other write()s are going to follow.
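
The difference between reserving space at write() time and allocating blocks at flush time can be illustrated with a toy model. This is only a sketch under loud assumptions: the `toy_fs` structure and the `toy_*` functions are hypothetical names invented for this example, the "disk" is just a counter, and none of it reflects the actual Ext4 code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not ext4 code. */
#define TOY_NBLOCKS 64u

struct toy_fs {
    unsigned free_blocks;   /* free space counter, updated at write() time */
    unsigned next_block;    /* first on-disk block not yet allocated */
    unsigned dirty_blocks;  /* blocks cached in RAM, no on-disk location yet */
    unsigned allocations;   /* how many times the block allocator ran */
};

void toy_init(struct toy_fs *fs)
{
    fs->free_blocks = TOY_NBLOCKS;
    fs->next_block = 0;
    fs->dirty_blocks = 0;
    fs->allocations = 0;
}

/*
 * write(): only reserve space by decrementing the free space counter.
 * No on-disk block is chosen here. Returns 0, or -1 when the
 * filesystem is full.
 */
int toy_write(struct toy_fs *fs, unsigned nblocks)
{
    if (nblocks > fs->free_blocks)
        return -1;
    fs->free_blocks -= nblocks;
    fs->dirty_blocks += nblocks;
    return 0;
}

/*
 * Flush: the allocator runs once and sees every pending write, so it
 * can place all of them in a single contiguous extent. Returns the
 * first block of the extent and stores its length in *len, or
 * returns -1 when nothing is dirty.
 */
long toy_flush(struct toy_fs *fs, unsigned *len)
{
    long start;

    if (fs->dirty_blocks == 0)
        return -1;
    start = (long)fs->next_block;
    *len = fs->dirty_blocks;
    fs->next_block += fs->dirty_blocks;
    fs->dirty_blocks = 0;
    fs->allocations++;
    return start;
}
```

In this model, two `toy_write()` calls of 4 blocks each followed by one `toy_flush()` produce a single 8-block extent from a single allocator run, while the free space counter already reflected both writes before the flush.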

Code: [;a=commit;h=29a814d2ee0e43c2980f33f91c1311ec06c0aa35 (commit 1], [;a=commit;h=64769240bd07f446f83660bb143bb609d8ab4910 2], [;a=commit;h=d2a1763791a634e315ec926b62829c1e88842c86 3], [;a=commit;h=cd1aac32923a9c8adcc0ae85e33c1ca0c5855838 4], [;a=commit;h=dd919b9822c5fd9fd72f95a602440130297c3857 5)]

There's also a new implementation of the default data=ordered journaling mode, based on inodes rather than on jbd buffer heads. Code: [;a=commit;h=c851ed540173736e60d48b53b91a16ea5c903896 (commit 1], [;a=commit;h=678aaf481496b01473b778685eca231d6784098b 2], [;a=commit;h=87c89c232c8f7b3820c33c3b9bc803e9358027da 3], [;a=commit;h=772cb7c83ba256a11c7bf99a11bef3858d23767c 4)]

1.3. ftrace, sysprof support

Ftrace is a very simple function tracer, unrelated to kprobes/SystemTap, which was born in the -rt patches. It uses a compiler feature to insert a small, 5-byte No-Operation instruction at the beginning of every kernel function; this NOP sequence is then dynamically patched into a tracer call when tracing is enabled by the administrator. While tracing is disabled, the overhead of the instructions is very small and not measurable even in micro-benchmarks. Although ftrace is the function tracer, it also includes a plugin infrastructure that allows for other types of tracing. The tracers currently in ftrace include a tracer of context switches, and tracers that measure the time it takes for a high-priority task to run after it was woken up, how long interrupts are disabled, and the time spent in preemption-off critical sections.

The interface to access ftrace can be found in /debugfs/tracing and is documented in Documentation/ftrace.txt. There's also a sysprof plugin that can be used with a development version of sysprof - "svn checkout sysprof"
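
Since the interface is a directory of plain files, a tracer can be selected by writing its name into a control file and reading the results back from another. The following is a minimal sketch of that idea; the helper names `tracing_write()` and `tracing_read_line()` are invented for this example, and the file name `current_tracer` and the tracer name `sched_switch` are assumptions that should be checked against Documentation/ftrace.txt for the kernel in use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write `value` followed by a newline to <dir>/<name>, much like
 * `echo value > dir/name` would. Hypothetical helper, not a kernel
 * or libc API. Returns 0 on success, -1 on error.
 */
int tracing_write(const char *dir, const char *name, const char *value)
{
    char path[512];
    FILE *f;
    int ok;
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);

    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fputs(value, f) >= 0 && fputc('\n', f) != EOF;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Read the first line of <dir>/<name> into buf, without the trailing
 * newline. Hypothetical helper. Returns 0 on success, -1 on error.
 */
int tracing_read_line(const char *dir, const char *name,
                      char *buf, size_t len)
{
    char path[512];
    FILE *f;
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);

    if (len == 0 || n < 0 || (size_t)n >= sizeof path)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

On a system with debugfs mounted at /debugfs, a call such as `tracing_write("/debugfs/tracing", "current_tracer", "sched_switch")` would be the equivalent of echoing the tracer name into that file as root; the directory is passed as a parameter so that the helpers can also be tried against an ordinary directory.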

Code: [;a=commit;h=7c731e0a495e25e79dc1e9e68772a67a55721a65 (commit 1], [;a=commit;h=502825282e6f79c975a644afc124432ec1744de4 2], [;a=commit;h=6e766410c4babd37bc7cd5e25009c179781742c8 3], [;a=commit;h=16444a8a40d4c7b4f6de34af0cae1f76a4f6c901 4], [;a=commit;h=bc0c38d139ec7fcd5c030aea16b008f3732e42ac 5], [;a=commit;h=1b29b01887e6032dcaf818c14999c7a39593b4e7 6], [;a=commit;h=35e8e302e5d6e32675df2fc1dd3a53dfa6630dc1 7], [;a=commit;h=352ad25aa4a189c667cb2af333948d34692a2d27 8], [;a=commit;h=81d68a96a39844853b37f20cc8282d9b65b78ef3 9], [;a=commit;h=6cd8a4bb2f97527a9ceb30bc77ea4e959c6a95e3 10], [;a=commit;h=3d0833953e1b98b79ddf491dd49229eef9baeac1 11], [;a=commit;h=b0fc494fae96a7089f3651cb451f461c7291244c 12], [;a=commit;h=4e491d14f2506b218d678935c25a7027b79178b1 13] [;a=commit;h=f06c38103ea9dbca27c3f4d77f444ddefb5477cd 14], [;a=commit;h=f984b51e0779a6dd30feedc41404013ca54e5d05 15], [;a=commit;h=014c257cce65e9d1cd2d28ec1c89a37c536b151d 16)]

1.4. Mmiotrace

Mmiotrace is a tool for trapping [ memory mapped IO] (MMIO) accesses within the kernel. Since MMIO is used by drivers, this tool can be used for debugging and especially for reverse engineering binary drivers.

Code: [;a=commit;h=8b7d89d02ef3c6a7c73d6596f28cea7632850af4 (commit)], Documentation: [;a=commit;h=c6c67c1afcce71335b18ed8769b1165c468bfb03 (commit)]

Voltage and Current Regulator

This framework is designed to provide a generic interface to voltage and current regulators. The intention is to allow systems to dynamically control regulator output in order to save power and prolong battery life. This applies to both voltage regulators (where voltage output is controllable) and current sinks (where current output is controllable). This framework is designed around SoC-based devices and has also been designed against two Power Management ICs (PMICs) currently on the market, namely the Freescale MC13783 and the Wolfson WM8350; however, it is quite generic and should apply to all PMICs.

Code: [;a=commit;h=571a354b1542a274d88617e1f6703f3fe7a517f1 (commit 1], [;a=commit;h=e2ce4eaa76214f65a3f328ec5b45c30248115768 2], [;a=commit;h=414c70cb91c445ec813b61e16fe4882807e40240 3], [;a=commit;h=48d335ba3164ce99cb8847513d0e3b6ee604eb20 4], [;a=commit;h=4b74ff6512492dedea353f89d9b56cb715df0d7f 5], [;a=commit;h=4c1184e85cb381121a5273ea20ad31ca3faa0a4f 6], [;a=commit;h=c080909eef2b3e7fba70f57cde3264fba95bdf09 7], [;a=commit;h=6392776d262fcd290616ff5e4246ee95b22c13f0 8], [;a=commit;h=8e6f0848be83c5c406ed73a6d7b4bfbf87880eec 9], [;a=commit;h=ba7e4763437561763b6cca14a41f1d2a7def23e2 10], [;a=commit;h=e7d0fe340557b202dc00135ab3cc877db794a01f 11], [;a=commit;h=e8695ebe5568921c41c269f4434e17590735865c 12], [;a=commit;h=e941d0ce532daf8d8610b2495c06f787fd587b85 13], [;a=commit;h=0eb5d5ab3ec99bfd22ff16797d95835369ffb25b 14)]

1.5. Lockless page cache and get_user_pages()

The page cache is where the kernel keeps a copy of a file's data in RAM, improving performance by avoiding disk I/O when the data that needs to be read is already in RAM. Each "mapping", which is the data structure that keeps track of the correspondence between a file and the page cache, is SMP-safe thanks to its own lock. So when different processes on different CPUs access different files, there's no lock contention, but if they access the same file (shared libraries, for example), they can hit some contention on that lock. In 2.6.27, thanks to some rules on how the page cache can be used and the usage of RCU, the page cache is able to do lookups (i.e., "read" the page cache) without needing to take the mapping lock, which improves scalability.
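
One way to picture a lookup that takes no mapping lock is the "speculative reference" pattern: read the slot, try to take a reference unless the count is already zero, then re-check that the slot still points to the same page. The sketch below shows that pattern in userspace C11 under heavy assumptions. All `toy_*` names are hypothetical, a fixed array stands in for the kernel's lookup structure, and there is no RCU here, so the example is only safe because the pages in it are never freed. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not kernel code. */
struct toy_page {
    atomic_int refcount;    /* 0 means the page is being torn down */
    int data;
};

#define TOY_SLOTS 16

struct toy_mapping {
    _Atomic(struct toy_page *) slots[TOY_SLOTS];
};

/* Take a reference only if the count is not zero. Returns 1 on success. */
int toy_get_unless_zero(struct toy_page *page)
{
    int old = atomic_load(&page->refcount);

    while (old != 0) {
        /* On failure the CAS reloads `old` with the current value. */
        if (atomic_compare_exchange_weak(&page->refcount, &old, old + 1))
            return 1;
    }
    return 0;
}

void toy_put(struct toy_page *page)
{
    atomic_fetch_sub(&page->refcount, 1);
}

/*
 * Lockless lookup: no lock is taken on the mapping. Assumes index is
 * below TOY_SLOTS and that a writer clears the slot when it drops a
 * page's count to zero; otherwise the retry loop would spin.
 */
struct toy_page *toy_find_get_page(struct toy_mapping *m, size_t index)
{
    for (;;) {
        struct toy_page *page = atomic_load(&m->slots[index]);

        if (page == NULL)
            return NULL;            /* nothing cached at this index */
        if (!toy_get_unless_zero(page))
            continue;               /* page is going away, look again */
        if (page == atomic_load(&m->slots[index]))
            return page;            /* slot unchanged, reference is valid */
        toy_put(page);              /* slot changed under us, retry */
    }
}
```

The re-check after taking the reference is what replaces the lock: if a writer swapped the slot between the first load and the reference grab, the reader notices, drops the stale reference and starts over.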

Code: [;a=commit;h=47feff2c8eefe85099f87c43d3096855f0085ca0 (commit 1], [;a=commit;h=e286781d5f2e9c846e012a39653a166e9d31777d 2], [;a=commit;h=a60637c85893e7191faaafa6a72e197c24386727 3)]

Lockless get_user_pages(): get_user_pages() is a function used in direct I/O operations to pin the userspace memory that is going to be transferred. It's a complex function that requires holding the mmap_sem semaphore in the mm_struct of the process and the page table lock. This is a scalability problem when several processes use get_user_pages() in the same address space (for example, databases that do direct I/O), because there will be lock contention. In 2.6.27, a new get_user_pages_fast() function has been introduced, which does the same work that get_user_pages() does, but it's simplified to speed up the most common workloads that exercise those paths within the same address space. This new function can avoid taking the mmap_sem semaphore and the page table locks in those cases. Benchmarks showed a 10% speedup running an OLTP workload with an IBM DB2 database on a quad-core system.

Code: [;a=commit;h=21cc199baa815d7b3f1ace4be20b9558cbddc00f (commit 1], [;a=commit;h=8174c430e445a93016ef18f717fe570214fa38bf 2], [;a=commit;h=f5dd33c494a427b1d1a3b574de5c9e511c888864 3], [;a=commit;h=bc40d73c950146725e9e768e856a416ec8949065 4], [;a=commit;h=652ea695364142b2464744746beac206d050ef19 5], [;a=commit;h=30002ed2e41830ec03ec3e577ad83ac6b188f96e 6)]

2. Architecture-specific changes


3. Crypto

4. Drivers

4.1. Graphics

4.2. Network

4.3. SCSI

4.4. Sound

4.5. V4L/DVB

4.6. Input

4.7. MTD

4.8. RTC


5. External Links

If you want to know what is still pending before this KernelNewbies summary is completed, you can read:

* A wireless 2.6.27 feature-list at [ Linux Wireless]


KernelNewbies: Linux_2_6_27 (last edited 2008-09-18 17:47:54 by diegocalleja)