When NetBSD and FreeBSD swap out a page from a process, they look in the process's virtual address space for nearby pages that are also candidates for being swapped out, and write them out together. This greatly increases the chance that swapin IO pulls in related data, which reduces the number of disk seeks involved in swap IO.

It would be good if somebody could implement this functionality in Linux.

The FreeBSD code that implements swapout clustering is in vm/vm_pageout.c. On Linux it will need to be slightly different because Linux does not have BSD-style VM objects.
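
To make the clustering idea concrete, here is a minimal user-space sketch. It is not a transcription of the FreeBSD code and not kernel code: `struct toy_page`, `CLUSTER_PAGES` and `cluster_swapout()` are hypothetical names invented for this illustration, and the process address space is modelled as a flat array of pages. Given the page chosen for swapout, it scans the aligned window of virtual pages around it and collects the neighbours that are resident and inactive, so the whole group could be written to adjacent swap slots.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cluster window size, in pages. */
#define CLUSTER_PAGES 8

/* Toy stand-in for per-page state; not a real kernel structure. */
struct toy_page {
    bool resident;  /* page is currently in memory */
    bool inactive;  /* page has not been referenced recently */
};

/*
 * Collect swapout candidates in the aligned window of virtual pages
 * that contains `victim`. The victim itself is always included.
 * `out` must have room for CLUSTER_PAGES entries; it receives virtual
 * page indices in ascending order. Returns the number of entries.
 */
size_t cluster_swapout(const struct toy_page *pages, size_t npages,
                       size_t victim, size_t *out)
{
    size_t start, end, i, n = 0;

    if (victim >= npages)
        return 0;

    start = victim - (victim % CLUSTER_PAGES);
    end = start + CLUSTER_PAGES;
    if (end > npages)
        end = npages;

    for (i = start; i < end; i++) {
        if (i == victim || (pages[i].resident && pages[i].inactive))
            out[n++] = i;
    }
    return n;
}
```

A caller would then hand the returned indices to the swap writer in one batch. A real Linux implementation would have to walk page tables or the VMA rather than an array, and would need to decide whether to honour page table reference bits when picking neighbours.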

For bonus points, keep track of how many of the swapped-in pages actually get used and dynamically vary the IO cluster size for swapin accordingly. Maybe even have swapin readahead on a per-VMA basis.
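
One possible shape for the adaptive part, again as a user-space sketch under assumptions: `struct ra_state`, `ra_adjust()`, the `RA_MIN`/`RA_MAX` bounds and the 50%/25% thresholds are all made up for illustration and are not taken from any existing kernel code. The idea is to count how many readahead pages were brought in and how many of those were later touched, then grow the window when most were used and shrink it when few were.

```c
/* Hypothetical bounds on the readahead window, in pages. */
#define RA_MIN 1u
#define RA_MAX 32u

/* Toy readahead state; one instance could be kept per VMA. */
struct ra_state {
    unsigned window;  /* pages to read ahead on the next swapin */
    unsigned issued;  /* readahead pages brought in since last adjustment */
    unsigned used;    /* how many of those were later accessed */
};

/*
 * Adjust the window from the observed hit rate, then reset the counters.
 * At least half used: double the window. Fewer than a quarter used:
 * halve it. Otherwise leave it alone.
 */
void ra_adjust(struct ra_state *ra)
{
    if (ra->issued == 0)
        return;

    if (ra->used * 2 >= ra->issued) {
        ra->window *= 2;
        if (ra->window > RA_MAX)
            ra->window = RA_MAX;
    } else if (ra->used * 4 < ra->issued) {
        ra->window /= 2;
        if (ra->window < RA_MIN)
            ra->window = RA_MIN;
    }

    ra->issued = 0;
    ra->used = 0;
}
```

Keeping one such state per VMA rather than globally would let a sequentially scanned mapping get a large window while a randomly accessed one stays small.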

Interested in implementing this feature? Go to #mm or claim the feature on the KernelProjects page.

Since there are multiple sub-features that can be implemented, there could be enough work for multiple small projects or one large project.

An example patch that clusters virtually related, inactive anonymous pages at LRU isolation time, without taking page table references into account, can be found here.

A swapin readahead implementation that takes both VMA-relationship and physical swap layout into account when looking for candidates can be found here.


KernelNewbies: KernelProjects/SwapoutClustering (last edited 2017-12-30 01:29:51 by localhost)