Linux 4.4 [https://lkml.org/lkml/2016/1/10/305 has been released] on Sun, 10 Jan 2016.

Summary: This release adds 3D support to the virtual GPU driver, which allows 3D hardware-accelerated graphics in virtualization guests; loop device support for Direct I/O and Asynchronous I/O, which saves memory and increases performance; support for Open-channel SSDs, which are devices that share the responsibility of the Flash Translation Layer with the operating system; completely lockless TCP listener handling, which allows for faster and more scalable TCP servers; journalled RAID5 in the MD layer, which fixes the RAID write hole; eBPF programs that can now be run by unprivileged users and made persistent, along with perf support for eBPF programs; a new mlock2() syscall that allows users to request memory to be locked on page fault; and block polling support for improved performance in high-end storage devices. There are also new drivers and many other small improvements.

1. Prominent features

1.1. Faster and leaner loop device with Direct I/O and Asynchronous I/O support

This release introduces support for Direct I/O and asynchronous I/O in the loop block device. Using direct I/O and AIO to read and write the loop device's backing file has several advantages: double caching is avoided, which reduces memory usage considerably; unlike user space direct I/O, there is no cost of pinning pages; and context switches are avoided in some cases, because concurrent submissions can be avoided. See the commits for benchmarks.

Code: [https://git.kernel.org/torvalds/c/ab1cb278bc7027663adbfb0b81404f8398437e11 commit], [https://git.kernel.org/torvalds/c/2e5ab5f379f96a6207c45be40c357ebb1beb8ef3 commit], [https://git.kernel.org/torvalds/c/5b5e20f421c0b6d437b3dec13e53674161998d56 commit], [https://git.kernel.org/torvalds/c/bc07c10a3603a5ab3ef01ba42b3d41f9ac63d1b6 commit], [https://git.kernel.org/torvalds/c/e03a3d7a94e2485b6e2fa3fb630b9b3a30b65718 commit]
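
User space can ask the loop driver to use direct I/O on the backing file with the LOOP_SET_DIRECT_IO ioctl that accompanies this work. The following is a minimal Python sketch, not a reference implementation: the ioctl number 0x4C08 is taken from <linux/loop.h>, the helper name set_loop_direct_io is hypothetical, and the call needs a kernel with this support plus permission to open the loop device.

```python
import fcntl
import os

# ioctl number from <linux/loop.h>; assumed to match the running kernel.
LOOP_SET_DIRECT_IO = 0x4C08


def set_loop_direct_io(loop_path, enable=True):
    """Ask the loop driver to use (or stop using) direct I/O on its backing file.

    Returns True if the ioctl succeeded, and False if the device could not
    be opened or the kernel rejected the request.
    """
    try:
        fd = os.open(loop_path, os.O_RDWR)
    except OSError:
        return False
    try:
        fcntl.ioctl(fd, LOOP_SET_DIRECT_IO, 1 if enable else 0)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

A call such as set_loop_direct_io("/dev/loop0") would typically be made after the backing file has been attached; the device name here is only an example.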

1.2. 3D support in virtual GPU driver

virtio-gpu is a driver for virtualization guests that allows them to use the host graphics card efficiently. In this release, it allows the virtualization guest to use the capabilities of the host GPU to accelerate 3D rendering. In practice, this means that a virtualized Linux guest can run an OpenGL game while using the GPU acceleration capabilities of the host, as shown in [https://www.youtube.com/watch?v=ONFGnUaln-4 this] or [https://www.youtube.com/watch?v=ZuuF092RDDc this] video. This also requires running [http://wiki.qemu.org/ChangeLog/2.5#virtio QEMU 2.5].

[https://virgil3d.github.io/ project page] (possibly outdated)

[https://www.youtube.com/watch?v=rPeMrmeLTig 44m linux.conf talk about the project]

Code: [https://git.kernel.org/torvalds/c/3187567222178d4b3742e88242f7abb3c3b7a215 commit]

1.3. LightNVM adds support for Open-Channel SSDs

Open-channel SSDs are devices that share responsibilities with the operating system in order to implement and maintain features that typical SSDs keep strictly in firmware. These include the Flash Translation Layer (FTL), bad block management, and hardware units such as the flash controller, the interface controller, and large amounts of flash chips. In this way, Open-channel SSDs expose direct access to their physical flash storage, while keeping a subset of the internal features of SSDs.

LightNVM is a specification that supports Open-channel SSDs. LightNVM allows the host to manage data placement, garbage collection, and parallelism. Device-specific responsibilities such as bad block management, FTL extensions to support atomic IOs, or metadata persistence are still handled by the device. This release adds support for LightNVM, including LightNVM support in the NVMe driver.

Recommended LWN article: [https://lwn.net/Articles/641247/ Taking control of SSDs with LightNVM]

Code: [https://git.kernel.org/torvalds/c/48add0f5a6f46919dd307575aad6ea3de7c9cb2a commit], [https://git.kernel.org/torvalds/c/cd9e9808d18fe7107c306f6e71c8be7230ee42b4 commit], [https://git.kernel.org/torvalds/c/ca0640850e43f5f80c6029e2895b119b705f23bd commit], [https://git.kernel.org/torvalds/c/b2b7e00148a203e9934bbd17aebffae3f447ade7 commit], [https://git.kernel.org/torvalds/c/ae1519ec448bc31a7fe7369b66e7c78872f91e84 commit]

1.4. TCP listener handling completely lockless, making TCP servers faster and more scalable

In this release, as the result of an effort that started two years ago, the TCP implementation has been refactored to make the TCP listener fast path completely lockless. During tests, a server was able to process 3,500,000 SYN packets per second on one listener and still have available CPU cycles, about two to three orders of magnitude more than was possible before. SO_REUSEPORT has also been extended (see the Networking section) to add proper CPU/NUMA affinities, so that heavy-duty TCP servers can get proper siloing thanks to multi-queue NICs.

Code: [https://git.kernel.org/torvalds/c/4d54d86546f62c7c4a0fe3b36a64c5e3b98ce1a9 commit], [https://git.kernel.org/torvalds/c/e6934f3ec00b04234acb24a1a2c28af59763d3b5 commit], [https://git.kernel.org/torvalds/c/c3fc7ac9a0b978ee8538058743d21feef25f7b33 commit]
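
The "one listener per RX queue" pattern enabled by SO_REUSEPORT together with the new setsockopt() support for SO_INCOMING_CPU (see the Networking section) could look roughly like the Python sketch below. It rests on assumptions: the helper name listeners_per_cpu is hypothetical, the numeric fallback 49 for SO_INCOMING_CPU is the value used on most architectures, and a real server would also pin one worker thread or process per listener.

```python
import socket

# Newer Python versions export the constant; 49 is the asm-generic value
# and is only an assumption for the architecture in use.
SO_INCOMING_CPU = getattr(socket, "SO_INCOMING_CPU", 49)


def listeners_per_cpu(port, cpus, host="127.0.0.1"):
    """Create one SO_REUSEPORT TCP listener per CPU in `cpus`.

    Each listener is tagged with SO_INCOMING_CPU so that the kernel can
    prefer the listener whose CPU matches the one handling the packet.
    """
    listeners = []
    for cpu in cpus:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        try:
            sock.setsockopt(socket.SOL_SOCKET, SO_INCOMING_CPU, cpu)
        except OSError:
            # Kernels without setsockopt() support for this option
            # still give plain SO_REUSEPORT balancing.
            pass
        sock.bind((host, port))
        sock.listen(128)
        listeners.append(sock)
    return listeners
```

For the affinity to be useful, the CPU numbers passed in should match the CPUs that service the NIC's RX queue interrupts.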

1.5. Journalled RAID5 MD support

This release adds journalled RAID5 support to the MD (RAID/LVM) layer. With a journal device configured (typically NVRAM or SSD), data/parity writes to a RAID array are first written to the log, and then to the array's disks. If a crash happens, data can be recovered from the log. This can speed up RAID resync and fixes the RAID5 write hole issue: a crash during degraded operation cannot result in data corruption. In future releases the journal will also be used to improve performance and latency.

Code: [https://git.kernel.org/torvalds/c/ac322de6bf5416cb145b58599297b8be73cd86ac merge]
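
The log-first ordering described above can be illustrated with a toy model. This is only a conceptual Python sketch, not how the MD code is written: the class ToyJournalledArray, the crash_after parameter and the single XOR parity block are all invented for illustration.

```python
def xor_parity(a, b):
    """XOR two equally sized data blocks into a parity block."""
    return bytes(x ^ y for x, y in zip(a, b))


class ToyJournalledArray:
    """Toy model: a stripe update is logged in full before any disk is touched."""

    def __init__(self, ndisks):
        self.disks = [dict() for _ in range(ndisks)]  # per disk: stripe -> block
        self.journal = []                             # logged stripe updates

    def write_stripe(self, stripe, blocks, crash_after=None):
        """Write one block per disk; crash_after=N simulates a crash after N disk writes."""
        # Step 1: data and parity go to the journal first.
        self.journal.append((stripe, list(blocks)))
        # Step 2: only then are the member disks updated.
        for i, block in enumerate(blocks):
            if crash_after is not None and i >= crash_after:
                return False  # simulated crash: the stripe is now inconsistent
            self.disks[i][stripe] = block
        self.journal.pop()    # stripe is fully on disk, the log entry can go
        return True

    def recover(self):
        """Replay every logged stripe so that data and parity agree again."""
        for stripe, blocks in self.journal:
            for i, block in enumerate(blocks):
                self.disks[i][stripe] = block
        self.journal.clear()
```

If a write is interrupted after only some disks were updated, the stale parity no longer matches the data (the write hole); calling recover() replays the logged stripe and restores consistency.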

1.6. Unprivileged eBPF + persistent eBPF programs

Unprivileged eBPF

eBPF programs got their own syscall in [http://kernelnewbies.org/Linux_3.18#head-ead251efb6bbdbe2922e7c6bd0c7b46342e03dad Linux 3.18], but until now its use had been restricted to root, because these programs were considered a security risk. eBPF programs are, however, validated by the kernel, and in this release the eBPF verifier has been improved so that unprivileged users can use it (although unprivileged eBPF is only meaningful for 'socket filter'-like programs; eBPF programs for tracing and TC classifiers/actions will stay root only). This feature can be switched off with the sysctl kernel.unprivileged_bpf_disabled (once set to true, bpf programs and maps cannot be accessed from unprivileged processes, and the toggle cannot be set back to false).

Recommended LWN article: [http://lwn.net/Articles/660331/ Unprivileged bpf()]

Code: [https://git.kernel.org/torvalds/c/1be7f75d1668d6296b80bf35dcf6762393530afc commit], [https://git.kernel.org/torvalds/c/aaac3ba95e4c8b496d22f68bd1bc01cfbf525eca commit]
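
A program that wants to know whether unprivileged bpf() use has been switched off can read the sysctl through procfs. The small Python sketch below assumes the usual /proc/sys mapping of kernel.unprivileged_bpf_disabled; the function name is hypothetical.

```python
def unprivileged_bpf_disabled(path="/proc/sys/kernel/unprivileged_bpf_disabled"):
    """Return the sysctl value as an int, or None if it cannot be read.

    None usually means the kernel predates the sysctl or procfs is not
    mounted; a non-zero value means unprivileged bpf() is disabled.
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```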

Persistent eBPF maps/progs

This release also adds support for "persistent" eBPF maps/programs. Here "persistent" means that maps/programs have a facility that lets them survive process termination. This is desired by various eBPF subsystem users, for example the tc classifier/action. Whenever tc parses the ELF object, extracts and loads maps/progs into the kernel, these file descriptors are out of reach after the tc instance exits, so a subsequent tc invocation can't access or relocate against this resource, and therefore maps cannot easily be shared, for example between the ingress and egress networking data paths.

To fix issues such as these, a new minimal file system has been created that can hold map/prog objects at /sys/fs/bpf/. Any subsequent mounts within a given namespace will point to the same instance. The file system allows for creating a user-defined directory structure. Map/prog objects are pinned and fetched through bpf(2), together with a pathname, using two new commands (BPF_OBJ_PIN/BPF_OBJ_GET); pinning creates the file system node. The user can later use that pathname to access maps and progs again through bpf(2).

Code: [https://git.kernel.org/torvalds/c/b2197755b2633e164a439682fb05a9b5ea48f706 commit], [https://git.kernel.org/torvalds/c/42984d7c1e563bf92e6ca7a0fd89f8e933f2162e commit]
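
The two commands can be exercised directly through the raw bpf(2) syscall. The Python sketch below is illustrative only and makes several assumptions: the command values 6 and 7 come from enum bpf_cmd in <linux/bpf.h>, the syscall numbers cover just x86_64 and aarch64, the attribute layout is the pathname/bpf_fd pair used by the BPF_OBJ_* commands, and the helper names are hypothetical.

```python
import ctypes
import errno
import os
import platform

BPF_OBJ_PIN = 6  # enum bpf_cmd values from <linux/bpf.h>
BPF_OBJ_GET = 7

# bpf(2) syscall numbers; only these two architectures are covered here.
NR_BPF = {"x86_64": 321, "aarch64": 280}


class BpfObjAttr(ctypes.Structure):
    """Attribute layout used by the BPF_OBJ_* commands."""
    _fields_ = [
        ("pathname", ctypes.c_uint64),  # user pointer to a NUL-terminated path
        ("bpf_fd", ctypes.c_uint32),    # fd to pin (unused for BPF_OBJ_GET)
    ]


def _bpf_obj(cmd, path, fd=0):
    nr = NR_BPF.get(platform.machine())
    if nr is None:
        raise OSError(errno.ENOSYS, "bpf syscall number not known for this machine")
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    path_buf = ctypes.create_string_buffer(os.fsencode(path))
    attr = BpfObjAttr(pathname=ctypes.addressof(path_buf), bpf_fd=fd)
    ret = libc.syscall(ctypes.c_long(nr), ctypes.c_int(cmd),
                       ctypes.byref(attr), ctypes.c_uint(ctypes.sizeof(attr)))
    if ret < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return ret


def bpf_obj_pin(fd, path):
    """Pin an existing map/prog file descriptor at `path` (under a bpf mount)."""
    _bpf_obj(BPF_OBJ_PIN, path, fd)


def bpf_obj_get(path):
    """Return a new file descriptor for the object pinned at `path`."""
    return _bpf_obj(BPF_OBJ_GET, path)
```

A first process would call bpf_obj_pin(map_fd, "/sys/fs/bpf/my_map") before exiting, and a later one would call bpf_obj_get("/sys/fs/bpf/my_map") to get the same map back; the path name is only an example.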

1.7. perf + eBPF integration

In this release, eBPF programs have been integrated with perf. When perf is given an eBPF .c source file (or a .o file built for the 'bpf' target with clang), it will automatically build, validate and load it into the kernel, where it can then be used and seen using perf trace and other tools.

Users can attach a BPF filter with a command such as # perf record --event ./hello_world.o ls; the eBPF program is attached to a newly created perf event, which works with all tools.

1.8. Block polling support

This release adds basic support for polling for specific IO to complete, which can improve latency and throughput in very fast devices. Currently, O_DIRECT sync reads/writes are supported. This support is only intended for testing; in future releases, stats tracking will be used to auto-tune it. For now, for benchmarking and testing purposes, a sysfs file (io_poll) has been added that controls whether polling is enabled or not.

Recommended LWN article: [http://lwn.net/Articles/663879/ Block-layer I/O polling]

Code: [https://git.kernel.org/torvalds/c/15c4f638f3d41bae52105ca4c0c8760afbcbeaab commit], [https://git.kernel.org/torvalds/c/05229beeddf7e75e2e616ddaad4b70e7fca9528d commit], [https://git.kernel.org/torvalds/c/a0fa9647a54e81883abd57c5c865d1747f68a577 commit]
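
To see which block devices currently have polling enabled, the io_poll files can be read from sysfs. The Python sketch below assumes the file lives at /sys/block/<disk>/queue/io_poll; the function name is hypothetical, and devices without the file are reported as None.

```python
import os


def io_poll_status(sys_block="/sys/block"):
    """Map each block device name to True/False (polling on/off) or None (no io_poll file)."""
    status = {}
    try:
        devices = sorted(os.listdir(sys_block))
    except OSError:
        return status  # sysfs not available
    for dev in devices:
        path = os.path.join(sys_block, dev, "queue", "io_poll")
        try:
            with open(path) as f:
                status[dev] = f.read().strip() == "1"
        except OSError:
            status[dev] = None
    return status
```

Enabling polling for a test run amounts to writing 1 to the same file as root.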

1.9. mlock2() syscall allows users to request memory to be locked on page fault

mlock() allows a user to control page out of program memory, but this comes at the cost of faulting in the entire mapping when it is allocated. For large mappings this is not ideal. For example, security applications that need mlock() are forced to lock an entire buffer, no matter how big it is. Applications that work on large graphical models, where the path through the graph is not known until run time, are forced to either lock the entire graph or lock it page by page as the pages are faulted in.

The new mlock2() syscall creates a middle ground. Pages are marked to be placed on the unevictable LRU (locked) when they are first used, but they are not faulted in by the mlock call. The new system call takes a flags argument along with the start address and size. This flags argument gives the caller the ability to request that memory be locked in the traditional way, or locked only after each page is faulted in (MLOCK_ONFAULT). A new MCL flag (MCL_ONFAULT) is also added to mirror the lock-on-fault behavior of mlock2() in mlockall().

Recommended LWN article: [http://lwn.net/Articles/650538/ Deferred memory locking]

Code: [https://git.kernel.org/torvalds/c/de60f5f10c58d4f34b68622442c0e04180367f3f commit], [https://git.kernel.org/torvalds/c/b0f205c2a3082dd9081f9a94e50658c5fa906ff1 commit], [https://git.kernel.org/torvalds/c/a8ca5d0ecbdde5cc3d7accacbd69968b0c98764e commit], [https://git.kernel.org/torvalds/c/1aab92ec3de552362397b718744872ea2d17add2 commit]
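
A lock-on-fault request from user space could look like the following Python sketch. It assumes that the C library exports an mlock2() wrapper (older libraries do not, in which case the sketch simply reports ENOSYS), that MLOCK_ONFAULT has the value 0x01 from <linux/mman.h>, and that the mapping is a writable mmap object; the helper name lock_on_fault is hypothetical.

```python
import ctypes
import errno
import mmap

MLOCK_ONFAULT = 0x01  # from <linux/mman.h>


def lock_on_fault(mapping):
    """Mark a writable mmap object as lock-on-fault.

    Returns 0 on success, or an errno value (ENOSYS if no mlock2() wrapper
    is available, ENOMEM/EPERM if the memlock limit does not allow it).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if not hasattr(libc, "mlock2"):
        return errno.ENOSYS
    libc.mlock2.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_uint]
    libc.mlock2.restype = ctypes.c_int
    view = ctypes.c_char.from_buffer(mapping)  # gives access to the start address
    rc = libc.mlock2(ctypes.addressof(view), len(mapping), MLOCK_ONFAULT)
    err = ctypes.get_errno() if rc != 0 else 0
    del view  # drop the buffer export so the mapping can be closed later
    return err
```

With this call no page of the mapping is faulted in up front; each page becomes locked when it is first touched.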

2. Drivers and architectures

All the driver and architecture-specific changes can be found in the [http://kernelnewbies.org/Linux_4.4-DriversArch Linux_4.4-DriversArch] page

3. Core (various)

process scheduler: Apply a frequency scaling correction factor to per-entity load tracking to make it invariant with respect to CPU frequency. Currently, load appears bigger when the CPU is running at slower frequencies, which affects load-balancing decisions [https://git.kernel.org/torvalds/c/e0f5f3afd2cffa96291cd852056d83ff4e2e99c7 commit], [https://git.kernel.org/torvalds/c/e3279a2e6d697e00e74f905851ee7cf532f72b2d commit]
seccomp: add support for dumping a process' (classic BPF) seccomp filters via ptrace + PTRACE_SECCOMP_GET_FILTER [https://git.kernel.org/torvalds/c/f8e529ed941ba2bbcbf310b575d968159ce7e895 commit]
watchdog: Mimic the softlockup_panic kernel knob and create a /proc/sys/kernel/hardlockup_panic. It enables a hardlockup to panic the machine [https://git.kernel.org/torvalds/c/ac1f591249d95372f3a5ab3828d4af5dfbf5efd3 commit]
watchdog: optionally perform all-CPU backtrace in case of hard lockup. Can be enabled with sysctl /proc/sys/kernel/hardlockup_all_cpu_backtrace [https://git.kernel.org/torvalds/c/55537871ef666b4153fd1ef8782e4a13fee142cc commit]
coredump: Add two new flags to the existing coredump mechanism for ELF and FDPIC ELF files to allow us to explicitly filter DAX mappings. This is desirable because DAX mappings, like hugetlb mappings, have the potential to be very large [https://git.kernel.org/torvalds/c/5037835c1f3eabf4f22163fc0278dd87165f8957 commit], [https://git.kernel.org/torvalds/c/ab27a8d04b32b6ee8c30c14c4afd1058e8addc82 commit]
test_printf: test printf family at runtime [https://git.kernel.org/torvalds/c/707cc7280f452a162c52bc240eae62568b9753c2 commit]
Make sync_file_range(2) use WB_SYNC_NONE writeback. It helps PostgreSQL avoid large latency spikes when flushing data in the background [https://git.kernel.org/torvalds/c/23d0127096cb91cb6d354bdc71bd88a7bae3a1d5 commit]
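
The two new coredump flags for DAX mappings mentioned above are exposed as extra bits in the per-process coredump filter. The Python sketch below decodes such a mask; it assumes the bit assignments documented for /proc/PID/coredump_filter (bits 7 and 8 for private and shared DAX mappings), and the helper names are hypothetical.

```python
# Bit positions as documented for /proc/PID/coredump_filter (assumed here).
COREDUMP_FILTER_BITS = {
    0: "anonymous private",
    1: "anonymous shared",
    2: "file-backed private",
    3: "file-backed shared",
    4: "ELF headers",
    5: "hugetlb private",
    6: "hugetlb shared",
    7: "DAX private",
    8: "DAX shared",
}


def decode_coredump_filter(mask):
    """Return the names of the mapping types that `mask` includes in core dumps."""
    return [name for bit, name in sorted(COREDUMP_FILTER_BITS.items())
            if mask & (1 << bit)]


def with_dax(mask):
    """Return `mask` with both DAX bits set."""
    return mask | (1 << 7) | (1 << 8)


def read_coredump_filter(pid="self"):
    """Read the current filter (a hex string in procfs) or return None."""
    try:
        with open(f"/proc/{pid}/coredump_filter") as f:
            return int(f.read().strip(), 16)
    except (OSError, ValueError):
        return None
```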

4. File systems

XFS
- Add per-filesystem stats in /sys/fs/xfs/<block>/stats/stats, and a stats_clear file to clear them. Also, the global stats that are currently present in /proc are duplicated in /sys/fs/xfs/stats/stats (along with a stats_clear file) [https://git.kernel.org/torvalds/c/bb230c124730f21eea13deab433f9f8fc96bd5f3 commit], [https://git.kernel.org/torvalds/c/225e4635580ce9fb12f8a2dc88473161cd64dbf6 commit], [https://git.kernel.org/torvalds/c/ff6d6af2351caea7db681f4539d0d893e400557a commit]
BTRFS
- Add fragment debug mount option. It can be used to cause extreme fragmentation in data, metadata or both [https://git.kernel.org/torvalds/c/d0bd456074dca089579818312da7cbe726ad2ff9 commit]
- Add balance filter for stripes. This is useful to selectively rebalance only chunks that do not span enough devices, applies to RAID0/10/5/6. [https://git.kernel.org/torvalds/c/dee32d0ac3719ef8d640efaf0884111df444730f commit]
CIFS
- Allow duplicate extents (cp --reflink) in SMB3.0 not just SMB3.1.1 [https://git.kernel.org/torvalds/c/ca9e7a1c85594f61d7ffb414071e6cae82eae23a commit]
- Add resilienthandles mount parameter. Since many servers (Windows clients, and non-clustered servers) do not support persistent handles but do support resilient handles, allow the user to specify a mount option "resilienthandles" in order to get more reliable connections and less chance of data loss (at least when using SMB2.1 or later). The default resilient handle timeout (120 seconds on recent Windows servers) is used [https://git.kernel.org/torvalds/c/592fafe644bf3a48b9e00e182a67d301493634fc commit]
- Add support for persistent handles, which are like durable file handles with strong guarantees [https://git.kernel.org/torvalds/c/b2a3077414fd6ff1de8972ea55e91f27bcabd913 commit], [https://git.kernel.org/torvalds/c/f16dfa7cd1b588e5d7ef4b5a19ee579f11b7a41f commit], [https://git.kernel.org/torvalds/c/b618f001a20e44f691dd0e2ffea651a40a651871 commit]
- Allow copy offload (copychunk) across shares [https://git.kernel.org/torvalds/c/7b52e2793a58af61b5d349c2c080437a437a4edb commit]
NFS
- Support for NFSv4.2 file CLONE using the btrfs ioctl [https://git.kernel.org/torvalds/c/21fad313d5890b674432fe3ad0c7bcf040320340 commit], [https://git.kernel.org/torvalds/c/e5341f3a5762d17be9cdd06257c02c0098bdcab8 commit], [https://git.kernel.org/torvalds/c/36022770de6cf9a403c40a68712ed2d2ea2746be commit], [https://git.kernel.org/torvalds/c/bea51b30b281039f0f43fb4f42028ddf33fb601f commit], [https://git.kernel.org/torvalds/c/a340abcf4173461f688292a6879b4d5bc781c2b1 commit]
EXT4
- Store checksum seed in superblock [https://git.kernel.org/torvalds/c/8c81bd8f586c46eaf114758a78d82895a2b081c2 commit]
OCFS2
- Improve performance for localalloc [https://git.kernel.org/torvalds/c/1d1aff8cf367d2216a678c722161784e207965c4 commit]
UBIFS
- atime support [https://git.kernel.org/torvalds/c/8c1c5f263833ec2dc8fd716cf4281265c485d7ad commit]

5. Memory management

Get rid of vmalloc_info from /proc/meminfo. It is too expensive to calculate and shows up in real workloads; people who actually want to know the state of the vmalloc area should look at the much more complete /proc/vmallocinfo instead [https://git.kernel.org/torvalds/c/a5ad88ce8c7fae7ddc72ee49a11a75aa837788e0 commit]
Add HugetlbPages field to /proc/PID/status. Until now there was no easy way to get the per-process usage of hugetlb pages, which is inconvenient because userspace applications that use hugetlb may need it [https://git.kernel.org/torvalds/c/5d317b2b6536592a9b51fe65faed43d65ca9158e commit]
Add hugetlb-related fields to /proc/PID/smaps to show per-task or per-vma hugetlb usage: AnonHugePages shows the amount of memory backed by transparent hugepages; Shared_Hugetlb and Private_Hugetlb show the amounts of memory backed by hugetlbfs pages, which are not counted in the RSS or PSS fields for historical reasons. They are not included in the {Shared,Private}_{Clean,Dirty} fields either [https://git.kernel.org/torvalds/c/25ee01a2fca02dfb5a3ce316e77910c468108199 commit]
memcontrol: eliminate memory.current on the root level, because it doesn't add anything that wouldn't be more accurate and detailed using system statistics [https://git.kernel.org/torvalds/c/f5fc3c5d817435970aa301d066820a9ac12c8120 commit]
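
The new HugetlbPages field can be picked out of /proc/PID/status with a few lines of code. The Python sketch below assumes the usual "HugetlbPages: <n> kB" line format; the function names are hypothetical.

```python
def parse_hugetlb_kb(status_text):
    """Return the HugetlbPages value in kB from the text of a status file, or None."""
    for line in status_text.splitlines():
        if line.startswith("HugetlbPages:"):
            return int(line.split()[1])
    return None


def hugetlb_pages_kb(pid="self"):
    """Return the hugetlb usage of a process in kB, or None if unavailable."""
    try:
        with open(f"/proc/{pid}/status") as f:
            return parse_hugetlb_kb(f.read())
    except OSError:
        return None
```

On kernels older than this release the field is absent and the helper returns None.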

6. Block layer

Block polling support [https://git.kernel.org/torvalds/c/15c4f638f3d41bae52105ca4c0c8760afbcbeaab commit], [https://git.kernel.org/torvalds/c/05229beeddf7e75e2e616ddaad4b70e7fca9528d commit], [https://git.kernel.org/torvalds/c/a0fa9647a54e81883abd57c5c865d1747f68a577 commit]
loop: direct and asynchronous I/O [https://git.kernel.org/torvalds/c/ab1cb278bc7027663adbfb0b81404f8398437e11 commit], [https://git.kernel.org/torvalds/c/2e5ab5f379f96a6207c45be40c357ebb1beb8ef3 commit], [https://git.kernel.org/torvalds/c/5b5e20f421c0b6d437b3dec13e53674161998d56 commit], [https://git.kernel.org/torvalds/c/bc07c10a3603a5ab3ef01ba42b3d41f9ac63d1b6 commit], [https://git.kernel.org/torvalds/c/e03a3d7a94e2485b6e2fa3fb630b9b3a30b65718 commit]
Add Persistent Reservations support. It includes a user space interface for simplified Persistent Reservations which map to block devices that support these (only SCSI for now). Persistent Reservations allow restricting access to block devices to specific initiators in a shared storage setup [https://git.kernel.org/torvalds/c/bbd3e064362e5057cc4799ba2e4d68c7593e490b commit], [https://git.kernel.org/torvalds/c/924d55b06347d813b38c51e75ce1a6666c113933 commit], [https://git.kernel.org/torvalds/c/71cdb6978a80f9f6c51bef0622388c1414c2fe32 commit]
Export integrity data interval size in /sys/block/<disk>/integrity/protection_interval_bytes, so that apps can tell whether the interval is different from the device's logical block size [https://git.kernel.org/torvalds/c/4c241d08dbfcbdc7a949b91d72707a289d464954 commit]
cdrom: Random writing support for BD-RE media [https://git.kernel.org/torvalds/c/f7e7868b4743f1cc5e59e6e0ddd3ccf9cfe53a1b commit]

7. Cryptography

crypto: caam - add support for acipher xts(aes) [https://git.kernel.org/torvalds/c/c6415a6016bff0b547c13cadb1d5e50e9ace2be3 commit]
crypto: keywrap - add key wrapping block chaining mode [https://git.kernel.org/torvalds/c/e28facde3c39005071cc5323d56539bb44efa446 commit]
crypto: qat - add support for ctr(aes) and xts(aes) [https://git.kernel.org/torvalds/c/def14bfaf30d5d5a4a8fe5bf600ce09232e688c0 commit]

8. Security

9. Tracing and perf tool

Integration of perf with eBPF that, given an eBPF .c source file (or .o file built for the 'bpf' target with clang), will get it automatically built, validated and loaded into the kernel via the sys_bpf syscall, which can then be used and seen using 'perf trace' and other tools. Users can run commands like perf record --event bpf-file.c ls to try it [https://git.kernel.org/torvalds/c/69d262a93a25cf475012ea2e00aeb29f4932c028 commit], [https://git.kernel.org/torvalds/c/84c86ca12b2189df751eed7b2d67cb63bc8feda5 commit], [https://git.kernel.org/torvalds/c/ed63f34c026e9a60d17fa750ecdfe3f600d49393 commit], [https://git.kernel.org/torvalds/c/1f45b1d49073541947193bd7dac9e904142576aa commit], [https://git.kernel.org/torvalds/c/4edf30e39e6cff32390eaff6a1508969b3cd967b commit], [https://git.kernel.org/torvalds/c/71dc2326252ff1bcdddc05db03c0f831d16c9447 commit], [https://git.kernel.org/torvalds/c/d509db0473e40134286271b1d1adadccf42ac467 commit], [https://git.kernel.org/torvalds/c/aa3abf30bb28addcf593578d37447d42e3f65fc3 commit], [https://git.kernel.org/torvalds/c/1e5e3ee8ff3877db6943032b54a6ac21c095affd commit], [https://git.kernel.org/torvalds/c/ba1fae431e74bb427a699187434142fd3fe98390 commit]
Add a new branch type sampling filter to perf record, named 'call' (perf record -j call -e cycles .....), that samples only call branches (function calls), unlike 'any_call' that included direct, indirect calls and far jumps. Only x86 and PowerPC are supported in this release [https://git.kernel.org/torvalds/c/43e41adc9e8c36545888d78fed2ef8d102a938dc commit], [https://git.kernel.org/torvalds/c/c229bf9dc179d2023e185c0f705bdf68484c1e73 commit]
Add Intel cstate (aka idle states) Performance Monitoring Unit support. This allows perf to support cstate related free running (read-only and system-wide) counters. For example, to calculate the fraction of time when the core is running in C6 state: perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 sleep 5 [https://git.kernel.org/torvalds/c/7ce1346a6842550a3c4c453cdf1c7b81fb60b07e commit]
CPU socket filtering: perf tools introduce a new sort type "socket" for the processor socket, e.g. perf report --stdio --sort socket,comm,dso,symbol [https://git.kernel.org/torvalds/c/2e7ea3ab8282f6bb1d211d8af760a734c055f493 commit]. Also, perf report introduces a --socket-filter option to only show entries for a processor socket that matches this filter [https://git.kernel.org/torvalds/c/21394d948a0c7c451d4a4d68afed9a06c4969636 commit]. The perf hists browser can zoom in/out for a processor socket [https://git.kernel.org/torvalds/c/84734b06b63093cd44533f4caa43d4452fb11ec3 commit]
perf tools: Introduce 'P' modifier, it will cause the event to get maximum possible detected precise level. For example, perf record -e cycles:P ... will detect maximum precise level for 'cycles' event and use it [https://git.kernel.org/torvalds/c/7f94af7a489fada17d28cc60e8f4409ce216bd6d commit]
perf tools: Add support for sorting on the iaddr. New sort option is: symbol_iaddr, header label is 'Code Symbol', eg perf mem report --stdio -F +symbol_iaddr [https://git.kernel.org/torvalds/c/28e6db205b3ed3f1d86a00c69b3304190377da5f commit]
perf tools: enables config terms for tracepoint perf events. Valid terms for tracepoint events are 'call-graph' and 'stack-size', so different callgraph settings can be used for each event and eliminate unnecessary overhead. An example for using different call-graph config for each tracepoint: perf record -e syscalls:sys_enter_write/call-graph=fp -e syscalls:sys_exit_write/call-graph=no dd if=/dev/zero of=test bs=4k count=10 [https://git.kernel.org/torvalds/c/e637d17757a10732fa5d573c18f20b3cd4d31245 commit]
perf script: Enable printing of branch stack via the 'brstack' and 'brstacksym' arguments to the field selection option -F. The option is off by default and operates only if the perf.data file has branch stack content [https://git.kernel.org/torvalds/c/dc323ce8e72d6d1beb9af9bbd29c4d55ce3d7fb0 commit]
perf auxtrace: Add AUX area tracing option 'l' to synthesize branch stacks on samples just like sample type PERF_SAMPLE_BRANCH_STACK [https://git.kernel.org/torvalds/c/601897b54c7ed492a89b262dccd7c6f7faf12b30 commit]
perf hists browser: Add 'm' key for context menu display [https://git.kernel.org/torvalds/c/31eb4360546b4bd890f349db01295a173c09b0fb commit]
perf inject: Add --strip option which is used with --itrace to strip out non-synthesized events [https://git.kernel.org/torvalds/c/f56fb9864c501dc85ebe40af5bf925dd07d990c0 commit]
perf script: Allow time to be displayed in nanoseconds [https://git.kernel.org/torvalds/c/83e1986032dfcd3f9e9fc0d06e11d9153edae19b commit]
Intel PT hardware tracer: Accept a zero --itrace period, meaning "as often as possible". In the case of Intel PT that is the same as a period of 1 and a unit of 'instructions' (i.e. --itrace=i1i) [https://git.kernel.org/torvalds/c/e1791347b5d57d13326cf0114df1a3f3b1c4ca24 commit]
Intel PT: Add support for generating branch stack context for PT samples. This is useful for reporting accurate basic block edge frequencies through the perf report branch view, or for use with --branch-history to get the wider context of samples. For example, to record with Intel PT: perf record -e intel_pt//u ls
ftrace: add module globbing [https://git.kernel.org/torvalds/c/0b507e1ed1b7364def464cfb348ea7c9e87e6e18 commit]

10. Virtualization

Support for VT-d posted interrupts (i.e. PCI devices can inject interrupts directly into vCPUs). Used by KVM and VFIO [https://git.kernel.org/torvalds/c/f73f8173126ba68eb1c42bd9a234a51d78576ca6 commit]
KVM: Nested virtualization now supports VPID (same as PCID but for vCPUs) which makes it quite a bit faster [https://git.kernel.org/torvalds/c/99b83ac893b84ed1a62ad6d1f2b6cc32026b9e85 commit], [https://git.kernel.org/torvalds/c/089d7b6ec5151ad06a2cd524bc0580d311b641ad commit], [https://git.kernel.org/torvalds/c/5c614b3583e7b6dab0c86356fa36c2bcbb8322a0 commit]
KVM: Support for "split irqchip", i.e. LAPIC in kernel and IOAPIC/PIC/PIT in userspace, which reduces the attack surface of the hypervisor [https://git.kernel.org/torvalds/c/b053b2aef25d00773fa6762dcd4b7f5c9c42d171 commit], [https://git.kernel.org/torvalds/c/7543a635aa09eb138b2cbf60ac3ff19503ae6954 commit], [https://git.kernel.org/torvalds/c/1c1a9ce973a7863dd46767226bce2a5f12d48bc6 commit]
KVM: add capability for any-length ioeventfds. With KVM_CAP_IOEVENTFD_ANY_LENGTH, a zero length ioeventfd is allowed, and the kernel will ignore the length of guest write and may get a faster vmexit [https://git.kernel.org/torvalds/c/e9ea5069d9e569c32ab913c39467df32e056b3a7 commit]
VMware balloon: Get notified immediately via VMCI when a balloon target is set, instead of waiting for up to one second [https://git.kernel.org/torvalds/c/48e3d668b7902cca3c61e9e2098e7f76b5646c28 commit]
VMware balloon: Support ballooning with 2 MB sized pages. It significantly reduces the hypervisor side (and guest side) overhead of ballooning and unballooning [https://git.kernel.org/torvalds/c/365bd7ef7ec8eb9c2e081cd970a5cdfa237dc243 commit]
VMware vmxnet3: Extend register dump support [https://git.kernel.org/torvalds/c/b6bd9b5448a9362e3ca33b21f1461baa5500520f commit]

11. Networking

Lockless TCP listener [https://git.kernel.org/torvalds/c/4d54d86546f62c7c4a0fe3b36a64c5e3b98ce1a9 commit], [https://git.kernel.org/torvalds/c/e6934f3ec00b04234acb24a1a2c28af59763d3b5 commit], [https://git.kernel.org/torvalds/c/c3fc7ac9a0b978ee8538058743d21feef25f7b33 commit]
[http://kb.linuxvirtualserver.org/wiki/IPVS IP Virtual Server]
- Support scheduling of ICMP packets to IPVS instances. A new sysctl net.ipv4.vs.schedule_icmp has been introduced that enables this feature if set to 1 (by default it is set to 0, to retain the old behaviour) [https://git.kernel.org/torvalds/c/99cb99aa055a72d3880d8a95a71034c4d64bcf9a merge commit]
- Allow ignoring tunnelled packets with the new sysctl net.ipv4.vs.ignore_tunneled. If set, ipvs will set the ipvs_property on all packets which are of unrecognised protocols. This prevents the kernel from routing tunnelled protocols like ipip, which is useful to prevent rescheduling packets that have been tunneled to the ipvs host (i.e. to prevent ipvs routing loops when ipvs is also acting as a real server) [https://git.kernel.org/torvalds/c/4e478098ac0ac1b6ef9a70fcdc2ec8b93f1b59a1 commit]
Add setsockopt() support for SO_INCOMING_CPU and extend the SO_REUSEPORT selection logic: if a TCP listener or UDP socket has this option set, a packet is delivered to this socket only if the CPU handling the packet matches the specified one. This makes it possible to build very efficient TCP servers, using one listener per RX queue, as the associated TCP listener should only accept flows handled in softirq by the same CPU. This provides optimal NUMA behavior and keeps CPU caches hot [https://git.kernel.org/torvalds/c/76973dd79fd52f187ba3df018bca65792a3d942 commit], [https://git.kernel.org/torvalds/c/70da268b569d32a9fddeea85dc18043de9d89f89 commit]
Provide FIB table ID in ipv4 route dumps just as ipv6 does [https://git.kernel.org/torvalds/c/b7503e0cdb5dbec5d201aa69d8888c14679b5ae8 commit]
Allow the user to ask for the statistics to be filtered out of ipv4/ipv6 address netlink dumps, because many commonly used functions like getifaddrs() invoke RTM_GETLINK to dump the interface information, and do not need the AF_INET6 statistics, which are expensive to calculate [https://git.kernel.org/torvalds/c/d5566fd72ec1924958fcfd48b65c022c8f7eae64 commit]
wireless: implement Very High Throughput support for mesh networks [https://git.kernel.org/torvalds/c/c85fb53c4fa6521352028c40ce096a808aabd389 commit]
bridge: Allow setting the bridge attribute ageing_time in rocker and switchdev [https://git.kernel.org/torvalds/c/c62987bbd8a1a1664f99e89e3959339350a6131e commit], [https://git.kernel.org/torvalds/c/d0cf57f9dddb50ea404bf747a3c6b22b29f82b9a commit], [https://git.kernel.org/torvalds/c/f55ac58ae64cbb0315382e738681fe31837dcac0 commit]
vxlan: support both IPv4 and IPv6 sockets in a single vxlan device [https://git.kernel.org/torvalds/c/b1be00a6c39fda2ec380e168d7bcf96fb8c9da42 commit]
bridge: complete the bridge device's netlink support and make it possible to view and configure everything that can be configured via sysfs [https://git.kernel.org/torvalds/c/3e087caa23ef36370bfb925d3bbca78e8302d3ce commit]
IPv4: Hash-based multipath routing. When the routing cache was [http://kernelnewbies.org/Linux_3.6#head-85de5e5247a939f2a61d0c5ccbc13ff5b4f1a6a0 removed in 3.6], the IPv4 multipath algorithm changed from more or less being destination-based into being quasi-random per-packet scheduling. This increased the risk of out-of-order packets and made it impossible to use multipath together with anycast services. In this release, the multipath routing implementation is replaced with a flow-based load balancing based on a hash over the source and destination addresses [https://git.kernel.org/torvalds/c/07355737a8badd951e6b72aa8609a2d6eed0a7e7 merge commit]
IPv6 support to the Virtual Routing and Forwarding (VRF) devices [https://git.kernel.org/torvalds/c/ccf3c8c3fe1bd4828556650ae7928da6ffb4aaf6 commit], [https://git.kernel.org/torvalds/c/35402e31366349a32b505afdfe856aeeb8d939a0 commit], [https://git.kernel.org/torvalds/c/ca254490c8dfdaddb5df8a763774db0f4c5200c3 commit]
TCP: Recent ACK (RACK) loss recovery. RACK uses the notion of time instead of packet sequence (FACK) or counts (dupthresh) to detect losses (see commit for details). RACK is being developed to replace or consolidate FACK/dupthresh, early retransmit, and thin-dupack, but since it is still experimental, in this release it is only used as a supplemental loss detection mechanism on top of the existing algorithms and does not trigger fast recovery. It can be disabled with the sysctl net.ipv4.tcp_recovery [https://git.kernel.org/torvalds/c/eb9fae328faff9807a4ab5c1834b19f34dd155d4 commit]
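
The idea behind the hash-based IPv4 multipath routing listed above, where every packet of a flow takes the same path, can be shown with a toy model. This Python sketch is not the kernel's algorithm: CRC32 stands in for the kernel's hash function, next hops are treated as equally weighted, and the function name pick_nexthop is hypothetical.

```python
import ipaddress
import zlib


def pick_nexthop(src, dst, nexthops):
    """Choose a next hop from a hash over the source and destination addresses.

    The same (src, dst) pair always maps to the same next hop, which avoids
    the packet reordering that per-packet scheduling can cause.
    """
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    # CRC32 is only a stand-in for the hash used by the kernel.
    return nexthops[zlib.crc32(key) % len(nexthops)]
```

Calling pick_nexthop("192.0.2.1", "198.51.100.7", ["gw-a", "gw-b"]) repeatedly returns the same gateway every time, while other address pairs may be spread over the remaining next hops.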

TODO

mpls: flow-based multipath selection [https://git.kernel.org/torvalds/c/1c78efa8319cad2f10f421afa627745fb4d9b29f commit]
mpls: multipath route support [https://git.kernel.org/torvalds/c/f8efb73c97e2fa0abbe2e07c5c5df07800312643 commit]
bridge: allow adding of fdb entries pointing to the bridge device [https://git.kernel.org/torvalds/c/3741873b4f73b572b8f8835e6bd114e08316a160 commit]
bonding: support encapsulated ipv6 TSO [https://git.kernel.org/torvalds/c/e87eb4051efe76b35d0a297db772f5964a001544 commit]
net: Add support for filtering neigh dump by device index [https://git.kernel.org/torvalds/c/16660f0bd942cec203eaf4de0e2ac1695bd9d32d commit]
net: Add support for filtering neigh dump by master device [https://git.kernel.org/torvalds/c/21fdd092acc7ebda0dfe682008592eb79c382707 commit]
net/core: generic support for disabling netdev features down stack [https://git.kernel.org/torvalds/c/fd867d51f889aec11cca235ebb008578780d052d commit]
net/ethoc: support big-endian register layout [https://git.kernel.org/torvalds/c/06e60e5912c0373b15143cc52e4a11fafeaafff3 commit]
net/wireless: enable wiphy device to suspend/resume asynchronously [https://git.kernel.org/torvalds/c/9f0e13546ef5773b7059b531a667ec47a5f897ee commit]
net: Introduce L3 Master device abstraction [https://git.kernel.org/torvalds/c/1b69c6d0ae90b7f1a4f61d5c8209d5cb7a55f849 commit]
net: dummy: add more features [https://git.kernel.org/torvalds/c/8f3af27786913851e720bc9466d1abffcfa7aff6 commit]
net: tso: add support for IPv6 [https://git.kernel.org/torvalds/c/8941faa161b526199e55ca7764cf875383453612 commit]
netfilter: nfnetlink_log: allow to attach conntrack [https://git.kernel.org/torvalds/c/a29a9a585b2840a205f085a34dfd65c75e86f7c3 commit]
nl80211: put current TX power in interface info [https://git.kernel.org/torvalds/c/d55d0d598e6610bbfcc1f2ecd6e8af669b94783b commit]
nl80211: support vendor dumpit commands [https://git.kernel.org/torvalds/c/7bdbe400d1b2aac116513f90b75969ad2365fba6 commit]
nl802154: add support for security layer [https://git.kernel.org/torvalds/c/a26c5fd7622d4951425131d54a8c99f076fe2068 commit]
ipconfig: send Client-identifier in DHCP requests [https://git.kernel.org/torvalds/c/26fb342c734061859fec1bd9e987bb6b78061ef0 commit]
ipv4: implement support for NOPREFIXROUTE ifa flag for ipv4 address [https://git.kernel.org/torvalds/c/7b1311807f3d3eb8bef3ccc53127838b3bea3771 commit]
ipv6: gro: support sit protocol [https://git.kernel.org/torvalds/c/feec0cb3f20b837f8ca36e974267918d7a4497f8 commit]
ieee802154: 6lowpan: add tx/rx stats [https://git.kernel.org/torvalds/c/1c64f147d3cc9bbafe091a7b335ea3ec700186f0 commit]
if_link: Add control trust VF [https://git.kernel.org/torvalds/c/dd461d6aa894761fe67c30ddf81eec0d08be216b commit]
IB/addr: Pass network namespace as a parameter [https://git.kernel.org/torvalds/c/565edd1d555513ab5d67a847d50d7c14c82ef6c3 commit]
IB/cma: Add support for network namespaces [https://git.kernel.org/torvalds/c/fa20105e09e97e81aadf02f722c31195e4a75c84 commit]
IB/cma: Separate port allocation to network namespaces [https://git.kernel.org/torvalds/c/4be74b42a6d05a74a21362010cd3920fa17f63c7 commit]
IB/core: Add support of checksum capability reporting for RC and RAW [https://git.kernel.org/torvalds/c/470a55358186d0bb93558a87d13159dfbc989351 commit]
bpf, seccomp: prepare for upcoming criu support [https://git.kernel.org/torvalds/c/bab18991871545dfbd10c931eb0fe8f7637156a9 commit]
cfg80211: allow changing station capabilities for unassociated stations [https://git.kernel.org/torvalds/c/47edb11b522561658fe719e56aa69a3c3098a3fe commit]
cfg80211: reg: make CRDA support optional [https://git.kernel.org/torvalds/c/b68630369167a7fd2c4c3d1be96430defc59fb9a commit]
mac80211: advertise support for full station state in AP mode [https://git.kernel.org/torvalds/c/44674d9c2267f454f38df7b2395939bfa911f92e commit]
mac80211: allow the driver to advertise A-MSDU within A-MPDU Rx support [https://git.kernel.org/torvalds/c/99e7ca44bb910f0cbfda5d9008e8517df0ebc939 commit]
mac80211: allow to transmit A-MSDU within A-MPDU [https://git.kernel.org/torvalds/c/e3abc8ff0fc18b3925fd5d5c5fbd1613856f4e7c commit]
openvswitch: netlink attributes for IPv6 tunneling [https://git.kernel.org/torvalds/c/6b26ba3a7d952e611dcde1f3f77ce63bcc70540a commit]
switchdev: Add support for flood control [https://git.kernel.org/torvalds/c/741af0053b43d8b9a688a12c57ece62338616ae8 commit]
switchdev: Make flood to CPU optional [https://git.kernel.org/torvalds/c/371e59adcebf9953385bf46d5325ac39a53c5520 commit]
tipc: introduce capability bit for broadcast synchronization [https://git.kernel.org/torvalds/c/fd556f209af53b9cdc45df8c467feb235376c4df commit]
tipc: introduce jumbo frame support for broadcast [https://git.kernel.org/torvalds/c/959e1781aa230aecc90e4deb80117fd9a53dede7 commit]
xprtrdma: Enable swap-on-NFS/RDMA [https://git.kernel.org/torvalds/c/a045178887ebafa9514d6b4cb840ac13a26c8365 commit]