Linux 3.6 was released on 30 Sep 2012
Summary: This Linux release includes new features in Btrfs: subvolume quotas, quota groups and snapshot diffs (aka "send/receive"). It also includes support for suspending to disk and memory at the same time, a TCP "Fast Open" mode, a "TCP small queues" feature to fight bufferbloat, support for safe swapping over NFS/NBD, better Ext4 quota support, support for the PCIe D3cold power state, and VFIO, which allows safe access from guest drivers to bare-metal host devices. Many small features, new drivers and fixes are also available.
Prominent features in Linux 3.6
- Btrfs: subvolume quotas, quota groups, snapshot diff, cross-subvolume file clones
- Suspend to disk and memory at the same time
- Preparatory work to support the SMBv2 protocol
- TCP Fast Open (client side)
- Bufferbloat fight: TCP small queues
- Safe swap over NFS/NBD
- ext4: better quota support
- PCIe D3cold power state support
- VFIO: bare-metal safe access to devices from userspace drivers
- Driver and architecture-specific changes
- Various core changes
- Memory Management
- File systems
- Other news sites that track the changes of this release
1. Prominent features in Linux 3.6
1.1. Btrfs: subvolume quotas, quota groups, snapshot diff, cross-subvolume file clones
1.1.1. Subvolume quotas and quota groups
A size limit can be set for each subvolume. Once the subvolume reaches that limit, it is no longer possible to write more data to it. This feature can be used as a substitute for quotas, by assigning a subvolume to each user's home directory and setting a size limit on it.
However, handling subvolume quotas individually can be hard, so Btrfs supports the concept of quota groups. It is possible to create a quota group and toss multiple subvolumes into that group: the quota limits will be automatically applied to all subvolumes in the group. The commands used for this feature are the btrfs qgroup subcommands: create/destroy, assign/remove and show/limit
(commit 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
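As a rough illustration of the quota group workflow described above, the following Python sketch builds the btrfs command lines for enabling quotas, creating a quota group, assigning subvolumes to it and limiting the group. The mount point, the qgroup ID 1/100, the subvolume IDs and the 10G limit are made-up example values, and the exact syntax may differ between btrfs-progs versions. The sketch only prints the commands; it does not run them.

```python
import shlex


def qgroup_commands(mount, group_id, subvol_ids, limit):
    """Return the btrfs command lines (as argv lists) for one quota group.

    All arguments are illustrative: `group_id` is a "level/id" string and
    each subvolume is addressed through its level-0 qgroup "0/<subvolid>".
    """
    cmds = [
        ["btrfs", "quota", "enable", mount],
        ["btrfs", "qgroup", "create", group_id, mount],
    ]
    for subvol_id in subvol_ids:
        # Put the subvolume's own qgroup under the shared group.
        cmds.append(["btrfs", "qgroup", "assign",
                     "0/%d" % subvol_id, group_id, mount])
    # The limit then applies to everything accounted in the group.
    cmds.append(["btrfs", "qgroup", "limit", limit, group_id, mount])
    return cmds


if __name__ == "__main__":
    # Hypothetical values; nothing is executed, the commands are only shown.
    for argv in qgroup_commands("/mnt/data", "1/100", [256, 257], "10G"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the commands as argv lists makes it easy to hand them to subprocess.run() later, once they have been checked against the installed btrfs-progs.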
1.1.2. Snapshot diffs, aka "send/receive"
Btrfs can compute the differences between two snapshots and store the differences into a file. This file can be replayed later to reconstruct the sent subvolumes/snapshots. The main, but not only, usage for send/receive is backups.
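A backup built on send/receive could look roughly like the sketch below, which assembles the btrfs send and btrfs receive command lines for an incremental transfer between two read-only snapshots. All paths are hypothetical, and the commands are only constructed, not executed; check the options against the installed btrfs-progs before relying on them.

```python
def send_command(snapshot, outfile, parent=None):
    """Build a `btrfs send` command line writing the stream to `outfile`.

    If `parent` is given, only the differences between `parent` and
    `snapshot` are sent. Both must be read-only snapshots.
    """
    cmd = ["btrfs", "send"]
    if parent is not None:
        cmd += ["-p", parent]
    cmd += ["-f", outfile, snapshot]
    return cmd


def receive_command(infile, destination):
    """Build a `btrfs receive` command line replaying `infile`."""
    return ["btrfs", "receive", "-f", infile, destination]


if __name__ == "__main__":
    # Hypothetical snapshot and backup paths.
    print(send_command("/mnt/snap/today", "/tmp/diff.stream",
                       parent="/mnt/snap/yesterday"))
    print(receive_command("/tmp/diff.stream", "/backup/snap"))
```

For an incremental stream to be replayed, the receiving side needs to already hold a copy of the parent snapshot.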
1.1.3. Cross-subvolume file clones
The Btrfs copy-on-write design allows many files to share the same underlying data. This makes it possible to copy files or directories (using cp --reflink) without duplicating the space usage. This had a limitation, though: it was not possible to clone across different subvolumes. This restriction has been removed (it is still not possible to clone files when they cross vfsmounts, i.e. two subvolumes from one filesystem mounted separately)
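What cp --reflink does can be sketched in a few lines of Python: ask the kernel to clone the source file into the destination with an ioctl, and fall back to an ordinary copy if cloning is not possible. The request number below is the Btrfs clone ioctl (_IOW(0x94, 9, int)), which later kernels also expose as FICLONE. The function name and the fallback policy are choices made for this example, not something taken from the kernel or coreutils.

```python
import fcntl
import shutil

# _IOW(0x94, 9, int): BTRFS_IOC_CLONE, known as FICLONE in later kernels.
CLONE_IOCTL = 0x40049409


def clone_or_copy(src, dst):
    """Clone `src` into `dst`, sharing data blocks when the kernel allows it.

    Returns True if the file was cloned (no data duplicated) and False if
    a regular byte-for-byte copy was made instead.
    """
    with open(src, "rb") as source, open(dst, "wb") as dest:
        try:
            fcntl.ioctl(dest.fileno(), CLONE_IOCTL, source.fileno())
            return True
        except OSError:
            # Not supported here (other filesystem, different mounts, ...).
            pass
    shutil.copyfile(src, dst)
    return False
```

On a filesystem without clone support the call simply behaves like a normal copy, so it is safe to use unconditionally.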
Recommended LWN article: Btrfs send/receive
1.2. Suspend to disk and memory at the same time
On portable devices it is useful to write a hibernation image to disk and then suspend to memory. If the battery runs out or power is otherwise lost, the computer will power off, but it can still be resumed from the hibernation image. Otherwise, it will resume normally from the memory suspend, and the hibernation image will be discarded.
If you would like to write the hibernation image to swap and then suspend to RAM, you can try "echo suspend > /sys/power/disk; echo disk > /sys/power/state"
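The same two writes can be issued from a program. The sketch below is a minimal Python equivalent of the shell commands above; the function name and the power_dir parameter are inventions of this example (the parameter exists so the function can be exercised against a scratch directory instead of the real /sys/power, which requires root and really suspends the machine).

```python
import os


def hybrid_suspend(power_dir="/sys/power"):
    """Write a hibernation image, then suspend to RAM.

    Mirrors: echo suspend > /sys/power/disk; echo disk > /sys/power/state
    """
    # Select "suspend" as the action taken after the image is written.
    with open(os.path.join(power_dir, "disk"), "w") as disk:
        disk.write("suspend")
    # Trigger hibernation; on real sysfs this blocks until resume.
    with open(os.path.join(power_dir, "state"), "w") as state:
        state.write("disk")
```

The order matters: the mode in /sys/power/disk has to be set before the write to /sys/power/state starts the transition.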
1.3. Preparatory work to support the SMBv2 protocol
Note: SMBv2 support isn't actually available in this release; it was turned off before the release. It will be available in 3.7.
The cifs networking filesystem has added support for version 2 of the SMB protocol. The SMB2 protocol is the successor to the popular CIFS and SMB network file sharing protocols, and has been the native file sharing mechanism for Windows operating systems since it was introduced in Windows Vista in 2006. SMB2 enablement will eventually give users better performance, security and features than would be possible with previous protocols.
1.4. TCP Fast Open (client side)
"Fast Open" is an optimization to the process of establishing a TCP connection that allows the elimination of one round-trip time (RTT) from certain kinds of TCP conversations. Fast Open could result in speed improvements of between 4% and 41% in the page load times on popular web sites. In this version only the client side has been merged.
Recommended LWN article: TCP Fast Open: expediting web services
Code: (commit 1, 2, 3, 4, 5, 6, 7)
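From an application's point of view, a Fast Open client replaces the usual connect() plus send() pair with a single sendto() call carrying the MSG_FASTOPEN flag, so the kernel can place the data in the SYN when it holds a valid cookie for the server. The Python sketch below illustrates that pattern. The function name and the fallback to a normal connect are choices made for this example, and the numeric fallback for MSG_FASTOPEN is the Linux value, used only if the socket module does not export the constant.

```python
import socket

# Linux value of MSG_FASTOPEN, used if this Python build lacks the constant.
MSG_FASTOPEN = getattr(socket, "MSG_FASTOPEN", 0x20000000)


def fast_open_send(host, port, payload):
    """Connect to (host, port) and send `payload`, using TCP Fast Open
    when the kernel allows it. Returns the connected socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Connect and send in one call; the kernel falls back to a regular
        # three-way handshake when no Fast Open cookie is available.
        sent = sock.sendto(payload, MSG_FASTOPEN, (host, port))
        if sent < len(payload):
            sock.sendall(payload[sent:])
    except OSError:
        # Fast Open unavailable or disabled: use the classic sequence.
        sock.close()
        sock = socket.create_connection((host, port))
        sock.sendall(payload)
    return sock
```

Because data in a SYN may be delivered more than once if the SYN is retransmitted, this pattern is best suited to requests that are safe to repeat.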
1.5. Bufferbloat fight: TCP small queues
TCP small queues is another mechanism designed to fight bufferbloat. Its goal is to reduce the number of TCP packets in the transmit queues (qdisc and device queues), in order to reduce RTT and cwnd bias, which are part of the bufferbloat problem. Without any reduction of nominal bandwidth, buffering per bulk sender drops to less than 1 ms on Gbit links (instead of 50 ms with TSO) and less than 8 ms on 100 Mbit links (instead of 132 ms).
Recommended LWN article: TCP small queues
1.6. Safe swap over NFS/NBD
The Linux Terminal Server Project manual recommends the use of the Network Block Device (NBD) for swap. Documentation and tutorials on how to set up swap over NBD can also be found in several places, and nbd-client documents the use of NBD as swap as well. Despite this, a machine using NBD for swap could deadlock within minutes if swap was used intensively. This release allows safe swapping over NBD and also adds support for swapping over NFS.
Recommended LWN article: Safely swapping over the net
1.7. ext4: better quota support
ext4 now supports quotas as a first-class feature. Instead of living in separate files visible in the file system directory hierarchy, the quota files are stored in hidden inodes as file system metadata and are managed directly by e2fsprogs, and quota is enabled automatically as soon as the file system is mounted. The repquota program will not function initially, until a new QUOTASCAN_OPEN interface is implemented. More details at https://ext4.wiki.kernel.org/index.php/Design_For_1st_Class_Quota_in_Ext4
1.8. PCIe D3cold power state support
This release adds PCI Express runtime D3cold power state support. D3cold is the deepest power saving state for a PCIe device, where its main power is removed.
1.9. VFIO: bare-metal safe access to devices from userspace drivers
The VFIO driver is an IOMMU/device agnostic framework for exposing direct device access to userspace, in a secure, IOMMU protected environment. In other words, this allows safe, non-privileged, userspace drivers. Why does Linux want that? Virtual machines often make use of direct device access ("device assignment") when configured for the highest possible I/O performance. From a device and host perspective, this simply turns the VM into a userspace driver, with the benefits of significantly reduced latency, higher bandwidth, and direct use of bare-metal device drivers. Some applications, particularly in the high performance computing field, also benefit from low-overhead, direct device access from userspace. Examples include network adapters (often non-TCP/IP based) and compute accelerators.
Recommended LWN article: Safe device assignment with VFIO
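A userspace driver starts by opening the VFIO container device and checking what the kernel offers. The Python sketch below performs only that first probing step: it opens /dev/vfio/vfio, queries the API version and asks whether the Type1 IOMMU backend is available. The ioctl numbers are derived from the VFIO header definitions (type ';', base 100) using the x86 encoding of _IO(); on architectures that encode ioctl numbers differently they would need adjusting. The function name and the returned dictionary are inventions of this example.

```python
import fcntl
import os

VFIO_TYPE = ord(";")
VFIO_BASE = 100


def _io(ioctl_type, number):
    """_IO() as encoded on x86: no direction bits, no argument size."""
    return (ioctl_type << 8) | number


VFIO_GET_API_VERSION = _io(VFIO_TYPE, VFIO_BASE + 0)
VFIO_CHECK_EXTENSION = _io(VFIO_TYPE, VFIO_BASE + 1)
VFIO_TYPE1_IOMMU = 1  # extension ID of the Type1 IOMMU backend


def probe_vfio(path="/dev/vfio/vfio"):
    """Return basic VFIO capabilities, or None if VFIO is not usable."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None  # module not loaded, no permission, or no such device
    try:
        version = fcntl.ioctl(fd, VFIO_GET_API_VERSION)
        type1 = fcntl.ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)
        return {"api_version": version, "type1_iommu": bool(type1)}
    except OSError:
        return None
    finally:
        os.close(fd)
```

A real driver would continue from here by opening its IOMMU group, attaching it to the container and obtaining a device file descriptor; those steps are left out of this sketch.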
2. Driver and architecture-specific changes
All the driver and architecture-specific changes can be found in the Linux_3.6_DriverArch page
3. Various core changes
Add symlink and hardlink restrictions to the Linux VFS, which help to solve a long-standing class of security issues based on symlink time-of-check-time-of-use races. Some distributions have been using this functionality for a while. Recommended LWN article: Tightening security: not for the impatient Code: (commit 1, 2)
IOMMU groups (commit)
process scheduler: Remove broken power estimation (commit)
Thermal: Add Hysteresis attributes (commit), make Thermal trip points writeable (commit)
cpuidle: add support for states that affect multiple CPUs (commit)
RCU: Control RCU_FANOUT_LEAF from boot-time parameter (commit)
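The symlink and hardlink restrictions listed at the top of this section are controlled through the fs.protected_symlinks and fs.protected_hardlinks sysctls. As a small illustration, the Python sketch below reports whether each restriction is switched on. The function name and the sysctl_dir parameter are inventions of this example; the parameter only exists so the function can be pointed at a test directory.

```python
import os


def link_protections(sysctl_dir="/proc/sys/fs"):
    """Report the state of the symlink/hardlink restrictions.

    Returns a dict mapping each sysctl name to True (enabled),
    False (disabled) or None (not available on this kernel).
    """
    state = {}
    for name in ("protected_symlinks", "protected_hardlinks"):
        try:
            with open(os.path.join(sysctl_dir, name)) as handle:
                state[name] = handle.read().strip() == "1"
        except OSError:
            state[name] = None  # kernel predates the feature or no /proc
    return state
```

Calling link_protections() with no arguments reads the live values, which needs no special privileges.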
4. Memory Management
Allow swap readahead I/O requests to be merged, which improves throughput and at the same time lowers CPU consumption (commit)
Add a new operation code (BLKPG_RESIZE_PARTITION) to the BLKPG ioctl that allows altering the size of an existing partition, even if it is currently in use (commit)
Device mapper RAID: Add support for MD RAID10 (commit)
Device mapper thin: add read-only and fail I/O modes (commit)
Device mapper: remove persistent data debug space map checker (commit)
md/raid1: prevent merging too large request (commit)
hists browser: Implement printing snapshots to files (commit)
Add sort by src line/number (commit)
Add PMU event alias support (commit)
Add tcm_vhost, a vhost-level TCM fabric driver for virtio SCSI initiators into KVM guest (commit)
virtio: rng: s3/s4 support (commit)
Add mcelog support for Xen platform (commit)
Delete ipv4 routing cache (commit)
tcp: implement the RFC 5961 3.2 mitigation against Blind Reset attacks using the RST bit (commit) and the RFC 5961 4.2 mitigation against attacks using the SYN bit (commit)
VTI support: Virtual (secure) IP tunneling. This can be used with xfrm tunnel mode to provide the notion of a secure tunnel for IPsec, with routing protocols then used on top (commit)
mac802154: add WPAN device-class support (commit)
Add interface option to enable routing of 127.0.0.0/8 (commit)
tun: experimental zero-copy tx support (commit)
Add support for 40GbE link (commit)
Add 802.11ad (60 GHz band) support (commit)
Add kernel support for the EEE ethtool commands (commit)
Speedup /proc/net/unix (commit)
wireless: remove wext sysfs (commit)
Hardware acceleration in Atmel processors for the following algorithms: AES (commit), DES/TDES (commit) and SHA-1/SHA-256 (commit)
CRC hardware driver for Blackfin BF60x family processors (commit)
caam: add support for SEC v5.x RNG4 (commit), ahash HMAC support (commit), hwrng support (commit)
serpent: add x86_64/avx assembler implementation (commit)
twofish: add x86_64/avx assembler implementation (commit)
talitos: add sha224, sha384 and sha512 to existing AEAD algorithms (commit)
11. File systems
Copy up POSIX ACL and read-only flags from lower mount (commit)
Reduce file fragmentation (commit)
12. Other news sites that track the changes of this release
H-Online: What's new in Linux 3.6