Linux 3.14 was released on Sun, 30 Mar 2014.
Summary: This release includes the deadline task scheduling policy for real-time tasks, a memory compression mechanism that is now considered stable, a port of the locking validator to userspace, the ability to store properties such as compression for each inode in Btrfs, trigger support for tracing events, improvements to userspace probing, kernel address space randomization, automatic TCP coalescing of consecutive small writes, a new network packet scheduler to fight bufferbloat, new drivers and many other small improvements.
Contents
Prominent features
- Deadline scheduling class for better real-time scheduling
- zram: Memory compression mechanism considered stable
- Btrfs: inode properties
- Trigger support for tracing events
- Userspace probes access to all arguments
- Userspace locking validator
- Kernel address space randomization
- TCP automatic corking
- Antibufferbloat: "Proportional Integral controller Enhanced" packet scheduler
- Drivers and architectures
- Core
- Memory management
- Block layer
- File systems
- Networking
- Virtualization
- Security
- Crypto
- Tracing/perf
- Other news sites that track the changes of this release
1. Prominent features
1.1. Deadline scheduling class for better real-time scheduling
Operating systems traditionally provide scheduling priorities for processes: the higher the priority of a process, the more scheduling time it can get relative to processes with lower priorities. In Linux, users usually set scheduling priorities to a value between -20 and 19 with nice(2) (in addition, Linux supports the notion of scheduling classes: each class provides a different scheduling policy; for example, there is a SCHED_FIFO class with a "first in, first out" policy, and a SCHED_RR class with a round-robin policy).
The approach of process priorities is, however, not well suited for real-time tasks. Evidence Srl and the ReTiS Lab have created an alternative designed around real-time concepts: deadline scheduling, implemented as a new scheduling policy, SCHED_DEADLINE.
Deadline scheduling does away with the notion of process priorities. Instead, processes provide three parameters: runtime, period, and deadline. A SCHED_DEADLINE task is guaranteed to receive "runtime" microseconds of execution time every "period" microseconds, and these "runtime" microseconds are available within "deadline" microseconds from the beginning of the period. The task scheduler uses that information to run the process with the earliest deadline, a behavior closer to the requirements of real-time systems. For more details about the scheduling algorithms, read the documentation.
Recommended LWN article: Deadline scheduling: coming soon?
Recommended page on Wikipedia: SCHED_DEADLINE
Documentation: Documentation/scheduler/sched-deadline.txt
Code: commit 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
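The earliest-deadline-first rule described in this section can be illustrated with a small user-space simulation. This is only a sketch: the `Task` class, the `simulate_edf` function and the abstract integer time units are hypothetical names chosen for illustration, and the model ignores the kernel's bandwidth enforcement and admission control.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A periodic task described by the three SCHED_DEADLINE-style parameters."""
    name: str
    runtime: int   # execution time needed in each period
    deadline: int  # relative deadline, counted from the start of the period
    period: int    # distance between two activations


def simulate_edf(tasks, ticks):
    """Return the name of the task run in each tick ('idle' if none is ready)."""
    remaining = {t.name: 0 for t in tasks}     # work left in the current period
    abs_deadline = {t.name: 0 for t in tasks}  # absolute deadline of the current job
    timeline = []
    for now in range(ticks):
        for t in tasks:
            if now % t.period == 0:            # a new period starts: release a job
                remaining[t.name] = t.runtime
                abs_deadline[t.name] = now + t.deadline
        ready = [t for t in tasks if remaining[t.name] > 0]
        if not ready:
            timeline.append("idle")
            continue
        # EDF rule: run the ready task whose absolute deadline is earliest.
        chosen = min(ready, key=lambda t: abs_deadline[t.name])
        remaining[chosen.name] -= 1
        timeline.append(chosen.name)
    return timeline
```

For example, `simulate_edf([Task("A", 1, 2, 4), Task("B", 2, 4, 4)], 8)` runs A first in every period because its deadline is nearer, then B for two ticks, and leaves the last tick of each period idle.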
1.2. zram: Memory compression mechanism considered stable
zram provides RAM block devices. Everything written to these block devices gets compressed. If zram block devices are used as swap, when the system tries to move parts of memory to swap it will be effectively moving memory from one part of the RAM to another, except that the data will be compressed before being copied to the destination. This effectively works as a cheap memory compression mechanism to improve responsiveness in systems with limited amounts of memory. Zram is being used by TV companies, Android 4.4, Cyanogenmod, Chrome OS, Lubuntu...
Zram has been in staging since Linux 2.6.33. In this release, zram has been moved out of staging to drivers/block/zram.
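To get a feel for why compressing swapped-out pages pays off, the sketch below compresses page-sized buffers in user space. It is an analogy only: it uses Python's zlib module as a stand-in for the compressor zram uses inside the kernel, and the 4096-byte page size and the `compressed_size` helper are assumptions made for illustration.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed page size; the real value depends on the architecture


def compressed_size(page: bytes) -> int:
    """Return how many bytes the page occupies after zlib compression."""
    return len(zlib.compress(page))


zero_page = bytes(PAGE_SIZE)         # highly compressible, like untouched memory
random_page = os.urandom(PAGE_SIZE)  # essentially incompressible

print("zero page:  ", compressed_size(zero_page), "bytes")
print("random page:", compressed_size(random_page), "bytes")
```

How much memory is actually saved depends on how compressible the swapped-out data is, which is why the gain varies between workloads.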
1.3. Btrfs: inode properties
This release adds infrastructure in Btrfs to attach name/value pairs to inodes as xattrs. The purpose of these pairs is to store properties for inodes, such as compression. These properties can be inherited: when a directory inode has inheritable properties set, they are added to new inodes created under that directory. Subvolumes can also have properties associated with them, and they can be inherited from their parent subvolume. This release adds one specific property implementation, named "compression", which is inheritable and whose value can be "lzo" or "zlib".
Code: commit
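Because the properties are stored as xattrs, they can in principle be set from user space with the ordinary extended-attribute calls. The sketch below assumes that the compression property is exposed under the attribute name "btrfs.compression"; that name, the `set_compression` helper and its injectable `setxattr` parameter are assumptions for illustration and should be checked against your kernel.

```python
import os

# Assumption: the compression property is stored as the extended attribute
# "btrfs.compression". Verify this against your kernel before relying on it.
COMPRESSION_XATTR = "btrfs.compression"
ALLOWED_VALUES = ("lzo", "zlib")  # the two values named in this release


def set_compression(path, value, setxattr=None):
    """Set the compression property on a file or directory.

    `setxattr` is injectable so the function can be exercised without a
    Btrfs mount; by default os.setxattr (Linux only) is used.
    """
    if value not in ALLOWED_VALUES:
        raise ValueError(f"unsupported compression value: {value!r}")
    if setxattr is None:
        setxattr = os.setxattr
    setxattr(path, COMPRESSION_XATTR, value.encode("ascii"))
```

Calling `set_compression("/mnt/data", "lzo")` on a directory (the path is made up) would, per the inheritance rule described above, also apply to inodes created under it afterwards.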
1.4. Trigger support for tracing events
The tracing infrastructure in the Linux kernel makes it easy to register probe functions as events (for more details, see Documentation/trace/events.txt). This release allows these events to conditionally trigger "commands". These commands can take various forms; examples include enabling or disabling other trace events, or dumping a stack trace whenever the trace event is hit. Any given trigger can additionally have an event filter, in which case the command is only invoked if the event being invoked passes the associated filter.
For example, the following trigger dumps a stacktrace the first 5 times a kmalloc request happens with a size >= 64K:
{{{
# echo 'stacktrace:5 if bytes_req >= 65536' > \
    /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
}}}
For more details, see Section 6 in Documentation/trace/events.txt.
Recommended LWN article: Triggers for tracing
Code: commit 1, 2, 3, 4, 5, 6, 7
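The trigger in this section follows the pattern "command, optional count, optional filter". A minimal sketch of composing such a string and writing it to an event's trigger file could look like this; the helper names `build_trigger` and `install_trigger` are hypothetical, and writing to the real tracing directory requires root and a mounted debugfs.

```python
import os

TRACING_DIR = "/sys/kernel/debug/tracing"  # location used by the kmalloc example


def build_trigger(command, count=None, filter_expr=None):
    """Compose a trigger string such as 'stacktrace:5 if bytes_req >= 65536'."""
    trigger = command if count is None else f"{command}:{count}"
    if filter_expr:
        trigger += f" if {filter_expr}"
    return trigger


def install_trigger(event, trigger, tracing_dir=TRACING_DIR):
    """Write the trigger to events/<event>/trigger and return the path used."""
    path = os.path.join(tracing_dir, "events", event, "trigger")
    with open(path, "w") as f:
        f.write(trigger + "\n")
    return path
```

With these helpers, `install_trigger("kmem/kmalloc", build_trigger("stacktrace", 5, "bytes_req >= 65536"))` performs the same write as the echo command in this section.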
1.5. Userspace probes access to all arguments
Userspace probes (uprobes) are a Linux 3.5 feature that allows tracing probes to be set in userspace programs at runtime. This release makes it possible to fetch further types of arguments in uprobes: memory, stack, dereference, bitfield, retval and file offset. For more details see here.
1.6. Userspace locking validator
The Linux kernel has had (since 2.6.18) a lock validator that can find locking issues at runtime. This release makes it possible to run the Linux locking validator in userspace, so that locking issues in userspace programs can be debugged as well. For more details, see the recommended LWN link.
Recommended LWN article: User-space lockdep
Code: commit 1, 2, 3, 4, 5, 6, 7
1.7. Kernel address space randomization
This release allows randomizing the physical and virtual address at which the kernel image is decompressed, as a security feature that deters exploit attempts relying on knowledge of the location of kernel internals.
Recommended LWN article: Kernel address space layout randomization
Code: commit 1, 2, 3, 4, 5, 6, 7, 8, 9
1.8. TCP automatic corking
When applications issue consecutive small write()/sendmsg() system calls, the Linux kernel will try to coalesce these small writes as much as possible in order to lower the total number of packets sent. This feature is called "automatic corking". Automatic corking is done if at least one prior packet for the flow is waiting in the Qdisc queues or the device transmit queue. Applications can still use TCP_CORK for optimal behavior when they know how and when to uncork their sockets. A new sysctl (/proc/sys/net/ipv4/tcp_autocorking) has been added to control this feature, which is enabled by default. For benchmarks and more details see the commit link.
Code: commit
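The two mechanisms mentioned in this section, the tcp_autocorking sysctl and explicit TCP_CORK, can both be reached from user space. The following is a sketch under assumptions: the helper names `autocorking_enabled` and `send_corked` are hypothetical, and `socket.TCP_CORK` is only available on Linux.

```python
import socket

AUTOCORK_SYSCTL = "/proc/sys/net/ipv4/tcp_autocorking"


def autocorking_enabled(path=AUTOCORK_SYSCTL):
    """Return True/False for the sysctl value, or None if the file is missing."""
    try:
        with open(path) as f:
            return f.read().strip() != "0"
    except OSError:
        return None


def send_corked(sock, chunks):
    """Explicit corking: queue several small writes, then flush them together."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
    try:
        for chunk in chunks:
            sock.sendall(chunk)
    finally:
        # Clearing TCP_CORK sends any partial frame that is still queued.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)
```

An application that knows where its message boundaries are can keep using something like `send_corked`, while applications that issue small writes without such knowledge rely on automatic corking.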
1.9. Antibufferbloat: "Proportional Integral controller Enhanced" packet scheduler
Bufferbloat is a phenomenon where excess buffers in the network cause high latency and jitter. As more and more interactive applications (e.g. voice over IP, real-time video streaming and financial transactions) run over the Internet, high latency and jitter degrade application performance. There have been a number of features and improvements in the Linux kernel network stack that try to address this problem.
This release adds a new network packet scheduler, PIE (Proportional Integral controller Enhanced), that can effectively control the average queueing latency to a target value. Simulation results, theoretical analysis and Linux testbed results have shown that PIE can ensure low latency and achieve high link utilization under various congestion situations. The design incurs very small overhead. For more information, see the technical paper about PIE in the IEEE Conference on High Performance Switching and Routing 2013. You can also refer to the IETF draft submission. All relevant code, documents, test scripts and results can be found at ftp://ftpeng.cisco.com/pie/.
Code: commit
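The idea of steering queueing latency towards a target can be sketched as a proportional-integral update of a drop probability. The function below is a simplified form of the control law described in the PIE paper, not the kernel implementation: the name `pie_update` and the default values for `target`, `alpha` and `beta` are illustrative assumptions, and refinements such as burst allowance and automatic tuning of the gains are left out.

```python
def pie_update(drop_prob, delay, old_delay, target=0.020, alpha=0.125, beta=1.25):
    """One periodic update of the packet drop probability.

    delay, old_delay and target are queueing delays in seconds.
    The first term reacts to the distance from the target delay,
    the second to whether the delay is currently rising or falling.
    """
    drop_prob += alpha * (delay - target) + beta * (delay - old_delay)
    return min(1.0, max(0.0, drop_prob))  # keep the result a valid probability
```

When the measured delay sits exactly at the target and is not changing, the probability stays where it is; a delay above the target or a growing queue raises it, and a shrinking queue lowers it.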
2. Drivers and architectures
All the driver and architecture-specific changes can be found in the Linux_3.14-DriversArch page
3. Core
Tool for suspend/resume performance analysis and optimization commit
futexes: Increase hash table size for better performance commit
IPC queues: remove limits on the number of system-wide queues that were added in 93e6f119c0ce commit
kexec: add sysctl to disable future kexec usage commit
lib: introduce arch optimized hash library commit
locking: Optimize lock_bh functions commit
scheduler: Drop sysctl_numa_balancing_settle_count sysctl commit
scheduler: add tracepoints related to NUMA task migration commit
stackprotector: Introduce CONFIG_CC_STACKPROTECTOR_STRONG commit
swap: add a simple detector for inappropriate swapin readahead commit
sysfs, kernfs: add skeletons for kernfs commit
rcutorture: Add --bootargs argument to specify additional boot arguments commit, add --buildonly dry-run capability commit, add --kmake-arg argument to kvm.sh commit, add --no-initrd argument to kvm.sh commit, add --qemu-args argument to kvm.sh commit, add KVM-based test framework commit, add SRCU Kconfig-fragment files commit, add datestamp argument to kvm.sh commit, add per-Kconfig fragment boot parameters commit, add per-version default Kconfig fragments and module parameters commit, add v3.12 version, which adds sysidle testing commit, eliminate --rcu-kvm argument commit, eliminate configdir argument from kvm-recheck.sh script commit, remove decorative qemu argument commit
4. Memory management
/proc/meminfo: provide estimated available memory commit
Add overcommit_kbytes sysctl variable, which allows finer-grained configuration than overcommit_ratio on machines with lots of memory commit
Document improved handling of swappiness==0 (implemented a long time ago) commit
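The available-memory estimate from the first item in this section appears in /proc/meminfo as the MemAvailable field. A minimal sketch of reading it follows; the helper names `parse_meminfo` and `available_kb` are hypothetical, and kernels older than this release do not provide the field.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its numeric value (kB for sizes)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def available_kb(text):
    """Return MemAvailable in kB, or None on kernels that lack the field."""
    return parse_meminfo(text).get("MemAvailable")
```

On a running system the text would come from `open("/proc/meminfo").read()`; passing it in as a string keeps the parsing step easy to test.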
5. Block layer
Immutable bio vecs commit
rbd: add support for single-major device number allocation scheme commit, enable extended devt in single-major mode commit
- Device Manager
6. File systems
- Btrfs
Incompatible format change to remove hole extents commit
Add a few mount options so that features can be changed on remounts: "barrier" commit, "datacow" commit, "datasum" commit, "noautodefrag" commit, "nodiscard" commit, "noenospc_debug" commit, "noflushoncommit" commit, "noinode_cache" commit, "treelog" commit
Publish btrfs internal information in sysfs, some of the features can be changed commit 1, 2, 3, 4, 5, 6, 7, 8
Add ioctls to query/change feature bits online commit
Performance improvements: Various performance improvements, see each commit for details commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit, commit
Add ioctl to export size of global metadata reservation, for better btrfs df reporting commit
- f2fs
hfsplus: add HFSX subfolder count support commit
exofs: Allow O_DIRECT open commit
ext4: enable punch hole for bigalloc commit
XFS: Allow logical-sector sized O_DIRECT commit
9P: Introduction of a new cache=mmap model. commit
7. Networking
ipv6 addrconf: add IFA_F_NOPREFIXROUTE flag to suppress creation of IP6 routes commit
ipv6: enable anycast addresses as source addresses for datagrams commit
ipv6: router reachability probing commit
ipv6: send Change Status Report after DAD is completed commit
ipv6: support IPV6_PMTU_INTERFACE on sockets commit
- ipv6: add the option to use anycast addresses as source addresses in echo reply
- mac80211
macvtap: Add support of packet capture on macvtap device. commit
net-gre-gro: Add GRE support to the GRO stack commit
net-sysfs: add support for device-specific rx queue sysfs attributes commit
Add GRO support for UDP encapsulating protocols commit
Add GRO support for vxlan traffic commit
Add NETDEV_PRECHANGEMTU to notify before mtu change happens commit
if_arp: add ARPHRD_6LOWPAN type commit
net_tstamp: Add SIOCGHWTSTAMP ioctl to match SIOCSHWTSTAMP commit
netconf: add proxy-arp support commit, add support for IPv6 proxy_ndp commit
- netfilter
Add IPv4/6 IPComp extension match support commit
Introduce l2tp match extension commit
nf_nat: add full port randomization support commit
nf_tables: add "inet" table for IPv4/IPv6 commit
nf_tables: add nfproto support to meta expression commit
nf_tables: add support for multi family tables commit
nfnetlink_queue: enable UID/GID socket info retrieval commit
nft: add queue module commit
nft_ct: Add support to set the connmark commit
nft_meta: add l4proto support commit
nft_reject: support for IPv6 and TCP reset commit
numa: add a sysctl for numa_balancing commit
openvswitch: Allow user space to announce ability to accept unaligned Netlink messages commit, enable memory mapped Netlink i/o commit
packet: improve socket create/bind latency in some cases commit, introduce PACKET_QDISC_BYPASS socket option commit, use percpu mmap tx frame pending refcount commit
tcp: metrics: New netlink attribute for src IP and dumped in netlink reply commit
sunrpc: add an "info" file for the dummy gssd pipe commit
tun: Add support for RFS on tun flows commit
pktgen, xfrm: Add statistics counting when transforming commit
rtnetlink: provide api for getting and setting slave info commit
IB: Add flow steering support for IPoIB UD traffic commit, ethernet L2 attributes in verbs/cm structures commit
NFC: NCI: Add set_config API commit
af_packet: Add Queue mapping mode to af_packet fanout operation commit
batman-adv: add bonding again commit
bonding: add netlink attribute support: ad_info commit, ad_select commit, all_slaves_active commit, arp_all_targets commit, arp_interval commit, arp_ip_target commit, arp_validate commit, downdelay commit, fail_over_mac commit, lacp_rate commit, lp_interval commit, miimon commit, min_links commit, num_grat_arp commit, packets_per_slave commit, primary commit, resend_igmp commit, updelay commit, use_carrier commit, xmit_hash_policy commit
bonding: add sysfs /slave dir for bond slave devices. commit
bonding: add option lp_interval for loading module commit
cfg80211: Add support for QoS mapping commit
filter: bpf_dbg: add minimal bpf debugger commit
8. Virtualization
Add support for Hyper-V reference time counter commit
virtio-net: auto-tune mergeable rx buffer size for improved performance commit
virtio-net: initial rx sysfs support, export mergeable rx buffer size commit
xen/pvh: Support ParaVirtualized Hardware extensions (v3). commit
xen-netfront: add support for IPv6 offloads commit
xen/events: Add the hypervisor interface for the FIFO-based event channels commit
xen: balloon: enable for ARM commit
9. Security
Smack: Make the syslog control configurable commit
- audit
10. Crypto
Support for AMD Cryptographic Coprocessor which can be used to accelerate or offload encryption operations such as SHA, AES and more commit 1, 2, 3, 4, 5, 6, 7
mxs - Add Freescale MXS DCP driver commit, remove the old DCP driver commit
aesni: AVX and AVX2 version of AESNI-GCM encode and decode commit
11. Tracing/perf
perf kvm: Introduce option -v for perf kvm command. commit, make perf kvm diff support --guestmount. commit
perf probe: Support basic dwarf-based operations on uprobe events commit
perf record: add --initial-delay option commit, default -t option to no inheritance commit, make per-cpu mmaps the default. commit, rename --initial-delay to --delay commit, rename --no-delay to --no-buffering commit
perf report: Add --header/--header-only options commit
perf script: add --header/--header-only options commit, add an option to print the source line number commit, print callchains and symbols if they exist commit, print comm, fork and exit events also commit, print mmap events also commit
perf timechart: Add --highlight option commit, add backtrace support commit, add backtrace support to CPU info commit, add option to limit number of tasks commit, add support for -P and -T in timechart recording commit, add support for displaying only tasks related data commit, always try to print at least 15 tasks commit, group figures and add title with details commit
perf tools: Add 'build-test' make target commit, add build and install plugins targets commit, allow '--inherit' as the negation of '--no-inherit' commit
perf trace: Add support for syscalls vs raw_syscalls commit
perf ui/tui: Implement header window commit
perf stat: Add event unit and scale support commit