KernelNewbies:

Direct from Central Square, in Cambridge, Massachusetts for the weekend of May 31st 2009, I'm Jon Masters with a summary of Friday through Sunday's Linux Kernel Mailing List traffic.

In today's issue: Xen, page allocator sanitization, poisonous hardware, magic sysrq, System Management Interrupts, and Intel Atom CPU support.

Xen. Jeremy Fitzhardinge reminded everyone that "Xen is a feature" and that it shouldn't be dismissed out of hand just because some people have personal objections to it. He wanted to steer the conversation purely toward the technical merits of the patches he has sent recently, which, it has to be said, have been many in number and frequently rebased in response to concerns. Jeremy noted that Xen is far from niche, and that it has over 500K established commercial users who are currently forced to use various out-of-tree patched kernels. There is a certain compelling argument for making life easy for these folks, who include large brand names as well as the Mozilla and Debian projects (Greg Kroah-Hartman posted a followup mentioning the particular support problems those projects face in running on patched kernels). Ian Campbell also posted yet another set of non-Xen specific cleanup patches. Expect to see a few fireworks as these folks try to push Xen dom0 support in the forthcoming 2.6.31 merge window and community members push back on the specific kernel changes - whether truly intrusive, or simply uncomfortable.

Page allocator sanitization. As covered twice over the past two weeks, Larry Highsmith has been proposing various patches for low-level page "sanitization". By this, he means more than simply calling kzfree to zero memory on release: he also wants essentially every page to be zeroed when it is freed, controlled by a tunable kernel command line option. The idea is one that certain security folks really find attractive since it helps mitigate particular classes of attack - similarly to how NX and SELinux can help to reduce the impact of existing code flaws. Consequently, there was some interest on a technical level. But much of the attention was instead focused on the tone and nature of the conversation, which Ingo Molnar described as "condescending", and which Linus Torvalds summed up by saying: "I'm also not in the least convinced about how you just dimiss everybodys concerns". Others were annoyed too, including David Miller, and Pekka Enberg (who noted an additional off-list discussion on the #mm IRC channel). Pekka (the SLAB maintainer) repeatedly asked Larry to explain where kzfree is broken, to which Ray Lee later added: "How about, for the third time, just sharing that information with the whole rest of us reading along?". Clearly, the LKML masses were not amused.

There is obvious concern whenever someone claims that fundamental kernel algorithms are broken, whether or not that is actually proven to be true. Alan Cox had a more basic concern that various different issues were being muddled together, so he split out three different concerns: "#1. Is ksize() buggy?", "#2 Using kzfree() to clear specific bits of memory", "#3 People wanting to be able to select for more security *irrespective* of performance cost". On a technical level, Ingo Molnar remained unconvinced that the sledgehammer approach to zeroing memory was better than a few carefully placed kzfree() calls in slow path key/crypto/similar sections (Larry did follow up with examples of these). Ingo wasn't alone in that - his views were echoed by Peter Zijlstra, although Rik van Riel did point out the mitigating nature of this, similar to how the SELinux features can be selectively enabled by those wanting to reduce the impact of an already compromised piece of user software. Additionally, Ingo pointed out that little is being done to zero out the kernel stack, which might persist for some time in the case of long-running tasks, and might contain sensitive data that was written deep within certain complex crypto functions (data that will not be overwritten unless other code with a similarly deep call chain, and so a similar kernel stack footprint, happens to run afterwards).

Out of the discussion came some useful points, including an exercise in how not to introduce new patches, documentation on the use of kzfree and how it may wind up (correctly) zeroing out more memory than the programmer might actually be anticipating, a plan to overhaul the low-level allocation of pages used by the tty code, and likely a lot more.
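To make the kzfree point a little more concrete, here is a toy user-space model in Python. It is only a sketch: it assumes an allocator that rounds each request up to a fixed size class, and the size classes and the names usable_size, toy_alloc and toy_zfree are invented for illustration rather than taken from the kernel. The point it shows is that a "zero, then free" helper which clears the whole usable object ends up clearing more bytes than the caller originally asked for.

```python
# Toy model only: these size classes and function names are illustrative
# assumptions, not the kernel's actual slab configuration or API.
SIZE_CLASSES = (32, 64, 128, 256)


def usable_size(requested: int) -> int:
    """Return the smallest size class that fits the request."""
    for size in SIZE_CLASSES:
        if requested <= size:
            return size
    raise ValueError("request too large for this toy allocator")


def toy_alloc(requested: int) -> bytearray:
    """Hand back a buffer that is as large as the whole size class."""
    return bytearray(usable_size(requested))


def toy_zfree(buf: bytearray) -> int:
    """Zero the entire usable object before release; return bytes cleared."""
    cleared = len(buf)
    buf[:] = bytes(cleared)  # overwrite every byte with zero
    return cleared


# The caller asks for 40 bytes of key material...
key = toy_alloc(40)
key[:40] = b"S" * 40

# ...but the whole 64-byte object is cleared on release.
cleared = toy_zfree(key)
```

In this model the caller who stored 40 bytes sees 64 bytes zeroed, which is harmless here but is the sort of behavior worth documenting.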

Poisonous hardware. Andi Kleen posted another version of the "HWPOISON" patchset. These patches enable support for newer Intel hardware that implements so-called "MCA recovery" - that is to say that the hardware can detect certain ECC memory corruption and respond by marking the backing pages as corrupt, or poisoned. The OS assists in this process and ultimately kills off the unfortunate task that happens to be using that memory, if indeed it was a task. It isn't clear how happy the ending is for other kernel code. Andi responded to "all feedback" except that he didn't move handlers into separate files, preferring to keep them all together for now, and he is retaining the "pagepoison" bit because he views it as cleaner than other hacks that were proposed in response to his initial posting of the patch series. Andi posted a fairly lengthy (separate) summary justifying why a dedicated bit is necessary, and why "pageflags compression" as suggested by Alan Cox isn't actually going to be practical, at least for the moment.

Magic sysrq. Jason Wessel posted patches for everyone's favorite USB dongle chipset - the highly prolific Prolific 2303 - that would add sysrq support. In preparing these patches, he re-discovered a long-standing problem with serial hardware initialization that has plagued those of us working with UNIX (think Sun Solaris) and Cisco hardware in the past: the port might receive arbitrary characters that include a NULL as it is being reset, which might be misinterpreted as a sysrq. Jason proposes only registering sysrq when a port is actually marked as being a serial console, which in this author's view makes a great deal of sense.
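The gating idea can be sketched as a small Python model. This is not Jason's patch and not the kernel's serial code: the ToyPort class, its methods, and the "break followed by a command character" convention modeled here are all assumptions made for illustration. What it shows is simply that line noise during a reset can only be taken for a sysrq on a port that has been marked as a console.

```python
class ToyPort:
    """Hypothetical model of a serial port; not a real kernel structure."""

    def __init__(self, is_console: bool):
        self.is_console = is_console
        self.break_seen = False
        self.sysrq_log = []  # characters that were treated as sysrq commands
        self.data = []       # characters passed through as ordinary input

    def receive_break(self) -> None:
        # Arm sysrq handling only on console ports; on any other port a
        # break (or reset noise that looks like one) is ignored.
        if self.is_console:
            self.break_seen = True

    def receive_char(self, ch: str) -> None:
        if self.break_seen:
            # The character after a break is taken as the sysrq command.
            self.break_seen = False
            self.sysrq_log.append(ch)
        else:
            self.data.append(ch)


# Reset noise on an ordinary port is passed through as data.
noisy = ToyPort(is_console=False)
noisy.receive_break()
noisy.receive_char("b")

# The same sequence on a console port is treated as a sysrq request.
console = ToyPort(is_console=True)
console.receive_break()
console.receive_char("b")
```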

An SMI detector. Jon Masters posted version 2 of his SMI detector. SMIs, or System Management Interrupts, are a mechanism provided by modern Intel (and other compatible) processors that enable the CPU to enter a special state in order to execute support code provided by the BIOS vendor to facilitate anything from fan and thermal control management to full-blown remote management of systems. They are also often used to emulate legacy hardware - for example, to provide a legacy IDE interface to older Operating System environments where the underlying hardware actually is SATA based, for floppy disk emulation support, and many other purposes. Since SMIs are a kind of "poor man's" alternative to dedicated hardware, they steal CPU cycles from the Operating System, which typically won't notice a few microseconds, and isn't normally able to detect that SMIs are even occurring, except perhaps through a few performance counters on the most recent of CPU implementations (unlike NMIs, these also switch the CPU into a near debug-like mode with a separate memory map). However, certain SMI implementations can steal the CPU for noticeable periods of time - even up into the milliseconds - which is especially noticeable on "Real Time" systems. The SMI detector uses stop_machine to steal the CPU for configurable time intervals, sampling the CPU timestamp counter, to detect SMIs.
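The sampling idea behind the detector can be illustrated in user space with a few lines of Python. This is only a sketch of the general technique, not the posted driver: the detect_gaps function and its parameters are made up for this example, it reads a monotonic clock rather than the timestamp counter, and without something like stop_machine it cannot tell an SMI apart from ordinary scheduling or interpreter delays.

```python
import time


def detect_gaps(duration_ns, threshold_ns, clock=time.perf_counter_ns):
    """Spin on a clock and record every jump larger than threshold_ns.

    Returns a list of (offset_ns, gap_ns) tuples, where offset_ns is the
    time since the start of sampling at which the gap began. The clock
    argument is an assumption of this sketch so that it can be swapped out.
    """
    gaps = []
    start = last = clock()
    while last - start < duration_ns:
        now = clock()
        # Back-to-back samples should be very close together; a large jump
        # means something else held the CPU between the two reads.
        if now - last > threshold_ns:
            gaps.append((last - start, now - last))
        last = now
    return gaps


# Sample for 10 ms and report anything that stole more than 100 us.
for offset_ns, gap_ns in detect_gaps(10_000_000, 100_000):
    print(f"gap of {gap_ns} ns at +{offset_ns} ns")
```

On a normal desktop this will mostly report scheduler noise, which is exactly why the real detector takes over the CPU before it starts sampling.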

Intel Atom finally got a custom config option. Recent versions of GCC specifically support the in-order pipelining architecture re-visited in the Atom, for which many people have proposed less efficient compilation options (thinking that Atom is akin to the original Pentium in ways it is not). When this patch is applied, and a new version of GCC is in use, the -mtune=atom setting will be passed through to the GCC build options.

Finally today, floppy disk hibernation support comes to Linux. Ondrej Zary posted a patch that expanded upon work done by Ingo Molnar in 2006. The question of course is, is anyone still using floppies three years later?

In today's announcements: No major announcements today. No 2.6.30-rc8 release has snuck out as of yet, which means one is likely forthcoming at any moment.

The current kernel release is 2.6.30-rc7, which was released by Linus over the US Memorial Day Weekend holiday last weekend. Andrew Morton posted an mm-of-the-moment for 2009-05-28-17-48 which includes a lot of interesting patches that have been sitting in his queue.

Stephen Rothwell posted a linux-next tree for May 29th. Since Thursday, the pxa tree lost its build failure and the powerpc tree continues to fail to build in an allyesconfig powerpc build configuration. The total sub-tree count remains steady today at 141.

Rafael J. Wysocki posted a list of regressions in recent git snapshots. For example, since 2.6.29 there are currently a total of 37 regressions of which 35 are pending and 28 are marked as unresolved. These include performance regressions, boot hangs, and a suspend/resume problem.

That's a summary of the weekend's LKML traffic. For further information visit kernel.org. I'm Jon Masters.
