Revision 1 as of 2023-11-18 04:54:10
The ultimate goal of the folios project is to turn struct page into:
struct page { unsigned long memdesc; };
The memdesc contains a 4-bit type field that describes what the remaining bits (60 on 64-bit systems, 28 on 32-bit systems) are used for.
type | Meaning | Remaining bits
0 | No Pointer | See below
1 | File | Pointer to struct folio
2 | Anon | Pointer to struct anon_folio
3 | KSM | Pointer to struct ksm (TBD)
4 | Slab | Pointer to struct slab
5 | Free | Pointer to struct buddy (TBD)
6 | Movable | Pointer to struct movable (TBD)
7 | PageTable | Pointer to struct ptdesc
8 | NetPool | Pointer to struct netpool (TBD)
9 | HWPoison | Pointer to struct hwpoison (TBD)
10-15 | not yet assigned |
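As a rough illustration of the type table above, the sketch below pulls the type out of a memdesc. The enum and helper names are hypothetical (nothing here is existing kernel API), and it assumes the type occupies the bottom 4 bits of the word, as the "Memdesc pointers" section implies.

```c
#include <stdint.h>

/* Hypothetical names; the numeric values follow the type table. */
enum memdesc_type {
    MEMDESC_TYPE_NONE     = 0,  /* no pointer, see "Type 0" */
    MEMDESC_TYPE_FILE     = 1,
    MEMDESC_TYPE_ANON     = 2,
    MEMDESC_TYPE_KSM      = 3,
    MEMDESC_TYPE_SLAB     = 4,
    MEMDESC_TYPE_FREE     = 5,
    MEMDESC_TYPE_MOVABLE  = 6,
    MEMDESC_TYPE_PGTABLE  = 7,
    MEMDESC_TYPE_NETPOOL  = 8,
    MEMDESC_TYPE_HWPOISON = 9,
    /* 10-15 not yet assigned */
};

#define MEMDESC_TYPE_BITS 4
#define MEMDESC_TYPE_MASK ((1UL << MEMDESC_TYPE_BITS) - 1)

/* The end-goal struct page from the text. */
struct page {
    unsigned long memdesc;
};

/* Assumption: the type lives in the bottom 4 bits of the memdesc. */
static inline enum memdesc_type page_memdesc_type(const struct page *page)
{
    return (enum memdesc_type)(page->memdesc & MEMDESC_TYPE_MASK);
}

/* Everything above the type field, shifted down (60 or 28 bits). */
static inline unsigned long page_memdesc_rest(const struct page *page)
{
    return page->memdesc >> MEMDESC_TYPE_BITS;
}
```

A page walker would switch on page_memdesc_type() before deciding how to interpret the remaining bits.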
Type 0
Type 0 is used for allocations which do not require a struct pointer. The next four bits distinguish what kind of allocation this is:
subtype | Meaning
0 | Driver allocation
1 | PageReserved
2 | PageGuard
3 | PageOffline
4 | Vmalloc Mappable
5 | Vmalloc Unmappable
6-15 | not yet assigned
The high bits are used to store zone/node/... information, as is done today with the page flags. Some bits are also used to store the order of the allocation.
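A minimal sketch of decoding a type 0 memdesc follows. It assumes "the next four bits" means bits 4-7, directly above the type field. All names are hypothetical, and the layout of the high bits (zone, node, order) is left out because the text does not specify it.

```c
#include <stdint.h>

/* Hypothetical names; the values follow the subtype table. */
enum memdesc_subtype {
    MEMDESC_SUB_DRIVER             = 0,
    MEMDESC_SUB_RESERVED           = 1,
    MEMDESC_SUB_GUARD              = 2,
    MEMDESC_SUB_OFFLINE            = 3,
    MEMDESC_SUB_VMALLOC_MAPPABLE   = 4,
    MEMDESC_SUB_VMALLOC_UNMAPPABLE = 5,
    /* 6-15 not yet assigned */
};

#define MEMDESC_TYPE_MASK     0xfUL
#define MEMDESC_SUBTYPE_SHIFT 4      /* assumption: bits 4-7 */
#define MEMDESC_SUBTYPE_MASK  0xfUL

/* True if the memdesc carries no struct pointer (type 0). */
static inline int memdesc_is_type0(unsigned long memdesc)
{
    return (memdesc & MEMDESC_TYPE_MASK) == 0;
}

/* Only meaningful when memdesc_is_type0() is true. */
static inline enum memdesc_subtype memdesc_subtype(unsigned long memdesc)
{
    return (enum memdesc_subtype)
        ((memdesc >> MEMDESC_SUBTYPE_SHIFT) & MEMDESC_SUBTYPE_MASK);
}

/* Build a type 0 memdesc with the given subtype; high bits left zero. */
static inline unsigned long memdesc_make_type0(enum memdesc_subtype sub)
{
    return ((unsigned long)sub & MEMDESC_SUBTYPE_MASK) << MEMDESC_SUBTYPE_SHIFT;
}
```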
Memdesc pointers
All structs pointed to from a memdesc must be allocated from a slab which has its alignment set to 16 bytes (in order to allow the bottom 4 bits to be used for the type). That implies that they are a multiple of 16 bytes in size. The slab must also have the SLAB_TYPESAFE_BY_RCU flag set, as some page walkers will attempt to look up the memdesc from the page while holding only the RCU read lock.
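The alignment requirement can be illustrated in user space. The sketch below uses C11 aligned_alloc() as a stand-in for a slab cache with 16-byte alignment; memdesc_t, demo_desc and the helper names are assumptions made for this example, not kernel API. (The kernel would use unsigned long; uintptr_t is used here only for portability.)

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MEMDESC_ALIGN     16
#define MEMDESC_TYPE_MASK ((uintptr_t)0xf)

typedef uintptr_t memdesc_t;   /* hypothetical typedef */

/* Stand-in descriptor; _Alignas pads its size to a multiple of 16. */
struct demo_desc {
    _Alignas(MEMDESC_ALIGN) unsigned long payload;
};

/* User-space substitute for a 16-byte-aligned slab allocation. */
static struct demo_desc *demo_desc_alloc(void)
{
    return aligned_alloc(MEMDESC_ALIGN, sizeof(struct demo_desc));
}

/* Combine a 16-byte-aligned pointer with a 4-bit type. */
static inline memdesc_t memdesc_pack(const void *desc, unsigned int type)
{
    uintptr_t addr = (uintptr_t)desc;

    assert((addr & MEMDESC_TYPE_MASK) == 0);  /* bottom 4 bits must be free */
    assert(type <= MEMDESC_TYPE_MASK);
    return addr | type;
}

static inline unsigned int memdesc_type(memdesc_t memdesc)
{
    return (unsigned int)(memdesc & MEMDESC_TYPE_MASK);
}

/* Mask off the type bits to recover the descriptor pointer. */
static inline void *memdesc_ptr(memdesc_t memdesc)
{
    return (void *)(memdesc & ~MEMDESC_TYPE_MASK);
}
```

Because the allocation is 16-byte aligned, packing and unpacking round-trips the pointer exactly; a misaligned pointer would trip the first assertion.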
Allocating memory
Device drivers that do not touch the contents of struct page can continue calling alloc_pages() as they do today.
We'll add a new memdesc_alloc_pages() family of functions which allocate the memory and set page->memdesc to the passed-in memdesc. So each memdesc allocator will first use slab to allocate a memdesc, then allocate the pages that point to that memdesc.
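The two-step order described above (descriptor first, then pages pointing at it) can be modelled in user space. This is only a sketch: the real memdesc_alloc_pages() signature is not given in the text, so the model_ functions, their parameters and struct model_desc are all hypothetical, with calloc() and aligned_alloc() standing in for the page allocator and slab.

```c
#include <stdint.h>
#include <stdlib.h>

struct page {
    uintptr_t memdesc;   /* unsigned long in the kernel */
};

/* Model: allocate 2^order struct pages and point each at memdesc. */
static struct page *model_memdesc_alloc_pages(uintptr_t memdesc,
                                              unsigned int order)
{
    size_t i, nr = (size_t)1 << order;
    struct page *pages = calloc(nr, sizeof(*pages));

    if (!pages)
        return NULL;
    for (i = 0; i < nr; i++)
        pages[i].memdesc = memdesc;
    return pages;
}

/* Hypothetical descriptor, padded to 16 bytes so the low 4 bits are free. */
struct model_desc {
    _Alignas(16) struct page *first;
    unsigned int order;
};

/* Step 1: allocate the descriptor. Step 2: allocate pages that point to it. */
static struct model_desc *model_desc_alloc_pages(unsigned int type,
                                                 unsigned int order)
{
    struct model_desc *desc = aligned_alloc(16, sizeof(*desc));

    if (!desc)
        return NULL;
    desc->order = order;
    desc->first = model_memdesc_alloc_pages((uintptr_t)desc | (type & 0xf),
                                            order);
    if (!desc->first) {
        free(desc);   /* undo step 1 if step 2 fails */
        return NULL;
    }
    return desc;
}
```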
There's a minor recursion problem for the slab memdesc. This can be avoided by special-casing the struct slab allocation; any time we need to allocate a new slab for the slab memdesc cache, we _do not_ allocate a struct slab for it; we use the first object in the allocated memory for its own struct slab.
I don't know how we'll handle alloc_pages_exact().
Freeing memory
Folios (file/anon/ksm) have a refcount. These should be freed with folio_put(). Other memdescs may not have a refcount.
XXX:
We need to use memory to track free memory. This puts us in the awkward position of needing to allocate memory in order to free memory, and the memory might be highmem so we can't necessarily use the memory we're freeing. So we actually have to allocate the free memdesc at the time we allocate memory and then have a way to find a free memdesc at the
Memory control group
Mapping memory into userspace
File, anon and KSM memory are rmappable. The rmap does not apply to other kinds of memory (networking, device driver, vmalloc, etc). These kinds of memory should be added to VM_MIXEDMAP or VM_PFNMAP mappings only.