Lightweight userspace priority inheritance (PI) support

Details in this [ LWN article]. PI is a critical feature for RT-ish apps. Currently (without PI), if a high-prio and a low-prio task share a lock, then even if all critical sections are coded carefully to be deterministic (i.e. all critical sections are short in duration and only execute a limited number of instructions), the kernel cannot guarantee deterministic execution of the high-prio task: any medium-priority task could preempt the low-prio task while it holds the shared lock and executes the critical section, and could delay it indefinitely. User-space PI helps achieve or improve determinism for user-space applications. Glibc patch (necessary): [ (link)]; justification for this feature and design documentation: [;a=commit;h=a6537be9324c67b41f6d98f5a60a1bd5a8e02861 (commit)]; code: [;a=commit;h=e2970f2fb6950183a34e8545faa093eb49d186e1 (commit)], [;a=commit;h=b29739f902ee76a05493fb7d2303490fc75364f4 (commit)], [;a=commit;h=23f78d4a03c53cbd75d87a795378ea540aa08c86 (commit)]
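As a rough illustration of how an application might opt in, the sketch below creates a mutex with the POSIX PTHREAD_PRIO_INHERIT protocol. It assumes a glibc that includes the PI support mentioned above; the helper name init_pi_mutex is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

/*
 * Initialise *m as a priority-inheritance mutex (hypothetical helper).
 * Returns 0 on success or a pthread error number on failure.
 */
int init_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);
    if (err != 0)
        return err;

    /* Request priority inheritance instead of the default PTHREAD_PRIO_NONE. */
    err = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (err == 0)
        err = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return err;
}
```

With such a mutex, a low-prio task holding the lock should be boosted to the priority of the highest-priority waiter until it unlocks, which is what closes the window for the medium-priority preemption described above.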

lockdep, a kernel lock validator

Linux's locking style is known for being quite simple compared with other SMP-friendly Unix derivatives. Still, locking is a necessary evil that is hard for most programmers to get right, and locking bugs can be very difficult to find. The kernel lock validator is a debugging tool that tries to make such things easier. lockdep is a complex piece of kernel infrastructure that can be used to prove that none of the locking patterns observed in a running system could ever deadlock the kernel (more details in this [ LWN article]). Design documentation: [;a=commit;h=f3e97da38e1d69d24195d76f96b912323f5ee30c (commit)], code: [;a=commit;h=fbb9ce9530fd9b66096d5187fa6a115d16d9746c (commit)]
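The general idea of validating observed lock ordering can be sketched with a toy user-space model. This is not lockdep's implementation; the names toy_acquire and toy_release and the fixed-size tables are assumptions made for illustration. Each time a lock is taken while others are held, the order is recorded, and an acquisition that would close a cycle in the recorded orderings is reported even though no deadlock actually happened.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8

/* order[a][b] is true once lock b has been taken while a was held. */
static bool order[MAX_LOCKS][MAX_LOCKS];
static int held[MAX_LOCKS];
static int held_count;

/* Depth-first search: is there a recorded chain from -> ... -> to? */
static bool reaches(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_LOCKS; next++)
        if (order[from][next] && !seen[next] && reaches(next, to, seen))
            return true;
    return false;
}

/* Returns false if taking `lock` now contradicts an ordering seen earlier. */
bool toy_acquire(int lock)
{
    if (lock < 0 || lock >= MAX_LOCKS || held_count >= MAX_LOCKS)
        return false;

    bool ok = true;
    for (int i = 0; i < held_count; i++) {
        bool seen[MAX_LOCKS] = { false };
        if (reaches(lock, held[i], seen))
            ok = false;                  /* held[i] was once taken after lock */
        else
            order[held[i]][lock] = true; /* remember: held[i] before lock */
    }
    held[held_count++] = lock;
    return ok;
}

/* Drop the most recent acquisition of `lock` from the held list. */
void toy_release(int lock)
{
    for (int i = held_count - 1; i >= 0; i--) {
        if (held[i] == lock) {
            memmove(&held[i], &held[i + 1],
                    (size_t)(held_count - i - 1) * sizeof held[0]);
            held_count--;
            return;
        }
    }
}
```

Taking lock 0 then lock 1 records the ordering; a later attempt to take lock 0 while holding lock 1 makes toy_acquire() return false, flagging the classic AB-BA pattern without the two code paths ever having to race.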

Power saving policy for the process scheduler

In machines with several multicore/SMT "packages" (which will become increasingly common in the future), power consumption can be improved by letting some packages idle while others do all the work, instead of spreading the tasks over all CPUs, so an optional power saving policy has been developed to make this possible. When this policy is enabled (by writing 1 to the sysfs entry 'sched_mc_power_savings' or 'sched_smt_power_savings' placed under /sys/devices/system/cpu/cpuX/) and the system is under light load, the scheduler minimizes the number of physical packages/CPU cores carrying the load, conserving power at a performance cost that depends on the workload characteristics. When there is a lot of work to do, all CPUs are still used; to completely disable individual CPUs, use the already available CPU hotplugging feature by writing 0 to the "online" file in that same sysfs directory. Read the "Chip Multi Processing(CMP) aware Linux Kernel Scheduler" talk from [ the OLS 2005] (page 201 and onwards) for more details on the effect of this policy [;a=commit;h=5c45bf279d378d436ce45825c0f136696c7b6109 (commit)]
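Toggling these entries amounts to writing a small integer to a file. A minimal sketch follows; the helper name write_sysfs_int is hypothetical, and the path passed to it is whatever location the entry has on the running system.

```c
#include <stdio.h>

/*
 * Write an integer followed by a newline to a sysfs-style file.
 * Returns 0 on success, -1 on failure (missing file, no permission, ...).
 */
int write_sysfs_int(const char *path, int value)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = fprintf(f, "%d\n", value) > 0;

    /* Buffered data reaches the kernel on close, so check that too. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Under the description above, writing 1 to 'sched_mc_power_savings' would enable the policy and writing 0 to an "online" file would take that CPU offline; both normally require root.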

'SMPnice': take priority into account when balancing processes between CPUs

One of the design principles of the new 2.6 scheduler (aka "Ingo's O(1) scheduler") was to have a separate run queue of processes for each CPU present on the system, instead of a single run queue for all CPUs, for scalability reasons. Periodically, the scheduler balances the per-CPU run queues to distribute all the jobs and keep all the CPUs busy. However, priority levels were not taken into account when doing this balancing, and it was possible to create scenarios where the kernel was unfair, especially with unprivileged processes. "SMPnice" is an implementation of a solution to this problem [ (LWN article)], [;a=commit;h=2dd73a4f09beacadde827a032cf15fd8b1fa3d48 (commit)]
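A toy model may help show why counting tasks is not enough. The weight scale below (20 minus the nice value) is invented for this example and is not the formula used by the kernel; the function names are likewise hypothetical.

```c
#include <stddef.h>

/* Toy weight: nice 19 -> 1, nice 0 -> 20, nice -20 -> 40 (not the kernel's scale). */
static int toy_weight(int nice)
{
    return 20 - nice;
}

/* Weighted load of a run queue described by the nice values of its tasks. */
int toy_queue_load(const int *nice, size_t ntasks)
{
    int load = 0;
    for (size_t i = 0; i < ntasks; i++)
        load += toy_weight(nice[i]);
    return load;
}

/* Positive: queue 0 is heavier than queue 1; negative: the reverse. */
int toy_imbalance(const int *q0, size_t n0, const int *q1, size_t n1)
{
    return toy_queue_load(q0, n0) - toy_queue_load(q1, n1);
}
```

With two nice 0 tasks on one CPU and two nice 19 tasks on another, a balancer that only counts tasks sees 2 versus 2 and does nothing, while the weighted view sees 40 versus 2 and has a reason to move one of the nice 0 tasks.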

Swapless page migration

The ability to migrate physical pages between nodes in NUMA-like systems, to improve the [ locality of reference], was introduced in [ Linux 2.6.16], but it did not use a very clean method: pages were swapped out on purpose, and the next time they were faulted in they were swapped in to the target node instead of the old one. The feature has now been completed with "direct page migration": pages are moved directly from one node to another, without using swap. It includes a new system call that moves individual pages of a process from one node to another: long move_pages(pid, number_of_pages_to_move, addresses_of_pages[], nodes[] or NULL, status[], flags) (the swap-based migration had already added a migrate_pages() syscall and a MPOL_MF_MOVE flag for the mbind() syscall). For full details, read this [ (LWN article)]. Code: [;a=commit;h=0697212a411c1dae03c27845f2de2f3adb32c331 (commit)], [;a=commit;h=6c5240ae7f48c83fcaa8e24fa63e7eb09aba5651 (commit)], [;a=commit;h=d75a0fcda2cfc71b50e16dc89e0c32c57d427e85 (commit)], [;a=commit;h=04e62a29bf157ce1edd168f2b71b533c80d13628 (commit)], [;a=commit;h=8d3c138b77f195ca0eee6fb639ae73f5ea9edb6b (commit)], [;a=commit;h=742755a1d8ce2b548428f7aacf1758b4bba50080 (commit)]
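The "nodes[] or NULL" argument means the same call can be used just to ask where pages currently live. The sketch below invokes the raw system call through syscall(), since a libc wrapper may not be available; the helper name query_page_nodes is hypothetical, and the fallback value of MPOL_MF_MOVE is taken from the kernel's mempolicy header.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MPOL_MF_MOVE
#define MPOL_MF_MOVE (1 << 1)   /* value from the kernel's mempolicy header */
#endif

/*
 * Ask the kernel which NUMA node each page currently lives on.
 * With nodes == NULL nothing is moved: status[i] receives the node
 * number of pages[i], or a negative errno value for that page.
 * Returns 0 on success, -1 with errno set on failure (for example
 * when the kernel was built without page migration support).
 */
long query_page_nodes(void **pages, unsigned long count, int *status)
{
    return syscall(SYS_move_pages, 0 /* pid 0: calling process */,
                   count, pages, NULL /* nodes */, status, MPOL_MF_MOVE);
}
```

Passing a real nodes[] array instead of NULL would request migration of each page to the corresponding node, with the per-page outcome reported through status[].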

Generic IRQ layer

This is Yet More Generalization of the IRQ layer. Not all architectures were using the current IRQ layer (notably ARM), and the current one had some shortcomings. Details in this [ LWN article]: these patches attempt to take lessons learned about optimal interrupt handling on all architectures, mix in the quirks found in the fifty (yes, fifty) ARM subarchitectures, and create a new IRQ subsystem which is truly generic, and more powerful as well. Design documentation: [;a=commit;h=11c869eaf1a9c97ef273f824a697fac017d68286 (commit)]; code: [;a=commit;h=6a6de9ef5850d063c3d3fb50784bfe3a6d0712c6 (commit)], [;a=commit;h=94d39e1f6e8132ea982a1d61acbe0423d3d14365 (commit)], [;a=commit;h=6550c775cb5ee94c132d93d84de3bb23f0abf37b (commit)], [;a=commit;h=a4633adcdbc15ac51afcd0e1395de58cee27cf92 (commit)], [;a=commit;h=dd87eb3a24c4527741122713e223d74b85d43c85 (commit)], [;a=commit;h=e76de9f8eb67b7acc1cc6f28c4be8583adf0a90c (commit)], [;a=commit;h=3418d72404e35eb19e7995cbf3e7a76ba8fefbce (commit)], [;a=commit;h=ba9a2331bae5da8f65be3722b9e2d210f1987857 (commit)]

Zoned vm counters

[;a=commit;h=34aa1330f9b3c5783d269851d467326525207422 (commit)]

[;a=commit;h=9a865ffa34b6117a5e0b67640a084d8c2e198c93 (commit)]

[;a=commit;h=ca889e6c45e0b112cb2ca9d35afc66297519b5d5 (commit)]

Big libata (SATA) update

[ (LWN article)] Mainstream libata has been missing some features like NCQ and hotplug. The code was written a while ago (more than a year ago in the case of NCQ), but only now has it been considered stable. The features included in this update are: revamped error handling across all the libata code, which makes libata more robust to errors and failures and makes problems easier to debug [;a=commit;h=022bdb075b9e1f224088a0b268de56268d7bc5b6 (commit)]; NCQ ([ Native Command Queuing]), which greatly improves performance for many workloads [;a=commit;h=3dc1d88193b9c65b01b64fb2dc730e486306649f (commit)]; hotplug [;a=commit;h=084fe639b81c4d418a2cf714acb0475e3713cb73 (commit)], warmplug [;a=commit;h=83c47bcb3c533180a6dda78152334de50065358a (commit)], and bootplug (boot probing via the hotplug path) support [;a=commit;h=3e706399b03bd237d087d731d4b1b029e546b33d (commit)]; and interrupt-driven PIO mode (instead of the inefficient polling method) [;a=commit;h=312f7da2824c82800ee78d6190f12854456957af (commit)]

[;a=commit;h=734efb467b31e56c2f9430590a9aa867ecf3eea1 (commit)]

[;a=commit;h=5d0cf410e94b1f1ff852c3f210d22cc6c5a27ffa (commit)]

[;a=commit;h=669bfad906522e74ee8d962801552a8c224c0d63 (commit)]

[;a=commit;h=2e3646e51b2d6415549b310655df63e7e0d7a080 (commit)]

[;a=commit;h=f6a88aa86027bdecfc74ef7c6bf6c68233e86bb3 (commit)]

[;a=commit;h=3d5631e0631a11633c649bc995a6537ec21b67b4 (commit)]

[;a=commit;h=a6a888b3c20cf559c8a2e6e4d86c570dda2ef0f5 (commit)]

[;a=commit;h=0080e667550db5ae8c9318181500c413b99ff164 (commit)]

[;a=commit;h=4552d5dc08b79868829b4be8951b29b07284753f (commit)]

[;a=commit;h=4961f10e2205d0ededa291e12ec634efc58aa93c (commit)]

[;a=commit;h=c220153654ede57b41900159eb8d1f6029d85642 (commit)], base support for the Freescale MPC8349E-mITX eval board [;a=commit;h=00280166993af8469dbfee24b779b61d3dd326c3 (commit)], 85xx CDS board support [;a=commit;h=591f0a4287d0de243493fd0c133c862e1d1f1c97 (commit)], 86xx HPCN platform support [;a=commit;h=4ca4b6274c30d53d22014fb6974efe2b3e52cfdc (commit)], [;a=commit;h=b809b3e86f39651475b30ceb1caf535071534d4d (commit)], [;a=commit;h=c9b484b5c1201321f40b04870e8b417033b6fe76 (commit)], [;a=commit;h=9674ed38d8e4a9ce15c61b4306ef803cad0e1dc0 (commit)], [;a=commit;h=96abe9358becb543c21121699c711897374bcbdf (commit)], [;a=commit;h=6b543404058a5ffdca8c48e95e0b8a69bb4bdba9 (commit)], Freescale mpc7448 (Taiga) board support [;a=commit;h=c5d56332fd6c2f0c7cf9d1f65416076f2711ea28 (commit)]

[;a=commit;h=b184a4c9a4e542890265b4cdd3ff7908f4adc9c4 (commit)]

[;a=commit;h=6b2652936b9e61df47664a8dde46872a74d7dba2 (commit)]

[;a=commit;h=189acaaef81b1d71aedd0d28810de24160c2e781 (commit)]

[;a=commit;h=1717ffc58850dfa9e08b4977f8d0323cb3336863 (commit)]

[;a=commit;h=1b06e6ba25a37fe1c289049d0e0300d71ae39eff (commit)]

[;a=commit;h=fe610671d7a88e363e8cebcb7e2f32078b0151ce (commit)]

[;a=commit;h=b0c9ad8e0ff154f8c4730b8c4383f49b846c97c4 (commit)]

[;a=commit;h=9e8e30a0cc0ccb43773d14d8b8b84bcc585e9cc1 (commit)]

[;a=commit;h=04837f6447c7f3ef114cda1ad761822dedbff8cf (commit)]

[;a=commit;h=dcdcf63ef12dc3fbaa17a6d04f16ada8e63bb4d0 (commit)]

[;a=commit;h=c067a7899790ed4c03b00ed186c6e3b6a3964379 (commit)], L5D [;a=commit;h=ebccb84810729f0e86a83a65681ba2de45ff84d8 (commit)], A3G [;a=commit;h=ed2cb07b2bb04f14793cdeecb0b384374e979525 (commit)], A4G [;a=commit;h=f78c589d108f4b06a012817536c9ced37f473eae (commit)], LED display support [;a=commit;h=42cb891295795ed9b3048c8922d93f7a71f63968 (commit)]

[;a=commit;h=45dc2de1e53a29f898b81326b8a16e6192d52e4e (commit)]

[;a=commit;h=d4dbd0250ea1d24bb3d2d13559432fa069d795e2 (commit)]

[;a=commit;h=9e653b6342c94016f5cc9937061ef99e9c4b4045 (commit)]

[;a=commit;h=f655675b3fe09c4d0506d357527fe07544623009 (commit)]

[;a=commit;h=a94213b1fa7b26dcc271bf4b4f9eebf1f1af33a2 (commit)]

[;a=commit;h=c93983bf517c100a31e40ef087e19bd3d7aa2d28 (commit)]

[;a=commit;h=59e35ba1257903eaff5203f62f77554da02f5b63 (commit)]

KernelNewbies: Linux_2_6_18 (last edited 2006-07-12 19:50:31 by diegocalleja)