When NetBSD and FreeBSD swap out a page from a process, they look in the process virtual address space to find nearby pages that are also candidates for being swapped out. This greatly increases the chance of swapin IO pulling in related data, reducing the number of disk seeks involved in swap IO.
It would be good if somebody could implement this functionality in Linux.
The FreeBSD code that implements swapout clustering is in [http://fxr.watson.org/fxr/source/vm/vm_pageout.c#L312 vm/vm_pageout.c]. On Linux the implementation will need to be slightly different, because Linux does not have BSD-style VM objects.
For bonus points, keep track of how many of the swapped-in pages actually get used and dynamically vary the IO cluster size for swapin accordingly. Maybe even do swapin readahead on a per-VMA basis.
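One way the bonus idea could look is sketched below as a userspace model, not as kernel code. The `ra_state` structure, the `ra_init()` and `ra_account()` helpers, and the thresholds (grow above a 50% hit rate, shrink below 25%, sample over 64 pages) are all hypothetical and chosen only for illustration; real values would need measurement.

```c
#include <assert.h>

/* Hypothetical per-VMA readahead state; not an existing kernel structure. */
struct ra_state {
    unsigned int window;  /* pages to read per swapin, kept a power of two */
    unsigned int issued;  /* pages read ahead in the current sample period */
    unsigned int used;    /* of those, pages that were later touched */
};

/* Assumed tuning values, picked for the example only. */
enum { RA_MIN = 1, RA_MAX = 32, RA_START = 8, RA_SAMPLE = 64 };

static void ra_init(struct ra_state *ra)
{
    ra->window = RA_START;
    ra->issued = 0;
    ra->used = 0;
}

/*
 * Record the outcome of one readahead batch. Once at least RA_SAMPLE
 * pages have been issued, adapt the window and start a new period:
 * double it when more than half of the pages were used, halve it when
 * fewer than a quarter were used, otherwise leave it alone.
 */
static void ra_account(struct ra_state *ra, unsigned int issued,
                       unsigned int used)
{
    assert(used <= issued);

    ra->issued += issued;
    ra->used += used;
    if (ra->issued < RA_SAMPLE)
        return;

    if (ra->used * 2 > ra->issued && ra->window < RA_MAX)
        ra->window *= 2;
    else if (ra->used * 4 < ra->issued && ra->window > RA_MIN)
        ra->window /= 2;

    ra->issued = 0;
    ra->used = 0;
}
```

Keeping one such state per VMA would let a sequentially accessed mapping grow its window while a randomly accessed one shrinks toward single-page swapin.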
Interested in implementing this feature? Go to #mm on irc.oftc.net or claim the feature on the KernelProjects page.
- At swapout time, allocate a larger chunk of swap at once.
- Gather up pages from virtual addresses near the selected page at pageout time.
  - Where shrink_page_list() calls add_to_swap()?
- Test the nearby pages:
  - Recently referenced? Do not swap out.
  - Active? This needs testing.
- What to do with the found pages?
  - Write them to swap, obviously.
  - Evict them early? Maybe this is a good idea, maybe not. This needs testing.
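The steps in the list above can be sketched as a small userspace model. This is not kernel code: `model_page`, `gather_cluster()`, `alloc_swap_run()`, the 8-page cluster size, and the boolean slot map are all hypothetical stand-ins, and whether active pages should be skipped is left as a parameter because that question is still open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one anonymous page; not the kernel's struct page. */
struct model_page {
    bool resident;    /* in memory, so it can be swapped out */
    bool referenced;  /* recently referenced: do not swap out */
    bool active;      /* on the active list: policy needs testing */
};

#define CLUSTER_PAGES 8  /* assumed cluster size, must be a power of two */

/*
 * Collect the indices of swapout candidates in the aligned window of
 * CLUSTER_PAGES virtual pages around the victim. The victim itself is
 * always included, since reclaim already selected it. Returns the
 * number of indices stored in out[].
 */
static size_t gather_cluster(const struct model_page *pages, size_t npages,
                             size_t victim, bool skip_active,
                             size_t out[CLUSTER_PAGES])
{
    size_t start, end, i, n = 0;

    assert(victim < npages);
    start = victim & ~(size_t)(CLUSTER_PAGES - 1);
    end = start + CLUSTER_PAGES;
    if (end > npages)
        end = npages;

    for (i = start; i < end; i++) {
        if (i != victim) {
            if (!pages[i].resident || pages[i].referenced)
                continue;
            if (skip_active && pages[i].active)
                continue;
        }
        out[n++] = i;
    }
    return n;
}

/*
 * Allocate n contiguous swap slots at once from a simple used/free map,
 * so the gathered pages land next to each other on disk. Returns the
 * first slot of the run, or -1 if no run of n free slots exists.
 */
static long alloc_swap_run(bool *slot_used, size_t nslots, size_t n)
{
    size_t i, j, run = 0;

    if (n == 0)
        return -1;
    for (i = 0; i < nslots; i++) {
        run = slot_used[i] ? 0 : run + 1;
        if (run == n) {
            size_t first = i + 1 - n;

            for (j = first; j <= i; j++)
                slot_used[j] = true;
            return (long)first;
        }
    }
    return -1;
}
```

A caller would gather the cluster first, then request exactly that many slots and write the pages out in virtual-address order; if no contiguous run is free, falling back to swapping out only the victim is one possible choice.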
Since there are multiple sub-features that can be implemented, there could be enough work for multiple small projects or one large project.
An example patch that clusters virtually related, inactive anonymous pages at LRU isolation time, without taking page table references into account, can be found [http://cmpxchg.org/~hannes/kernel/mm-reclaim-anonymous-memory-in-virtual-clusters.patch here].
A swapin readahead implementation that takes both VMA relationships and the physical swap layout into account when looking for candidates can be found [http://cmpxchg.org/~hannes/kernel/mm-virtual-swap-readahead/ here].