KernelNewbies: KernelMemoryAllocation

by Arnout Vandecappelle, Mind

In the kernel, malloc() is not available. Instead, the kernel defines its own memory allocation functions. There is not just one, though: many different allocation mechanisms exist. This article gives an overview of them.

References

The memory manager is discussed as part of an introductory course.

http://linux-mm.org/LinuxMMInternals is a wiki about the kernel memory manager.

http://www.win.tue.nl/~aeb/linux/lk/lk-9.html and http://www.linuxjournal.com/article/6930 give an overview of the three main kernel memory allocation mechanisms.

http://www.informit.com/content/images/0131453483/downloads/gorman_book.pdf is Mel Gorman's complete book on the Linux kernel memory manager. It's a bit too detailed, though.

Summary

All allocations take place from one of three zones: ZONE_DMA (which is accessible by ISA DMA), ZONE_NORMAL, and ZONE_HIGHMEM (which is not permanently mapped into the kernel's address space; it has to be mapped in temporarily before the kernel can access it, and exists to support large amounts of memory on 32-bit machines).

HIGHMEM

See http://linux-mm.org/HighMemory.

The Linux kernel normally uses a very simple way to map virtual to physical addresses: subtract PAGE_OFFSET (0xC0000000 on x86). However, that leaves only 1 GiB of addressable space for the kernel. Therefore, the kernel defines high memory. When high memory is allocated, it is not directly addressable. To address it, the kmap() function first has to be called to enter the memory page into the kernel page table. The address is then valid until kunmap() is called. This kmap()/kunmap() sequence has to wrap every access to the page.
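As a minimal sketch, assuming a struct page that may come from ZONE_HIGHMEM (e.g. allocated with alloc_page(GFP_HIGHUSER)), accessing the page looks like this:

    #include <linux/highmem.h>
    #include <linux/string.h>

    static void zero_highmem_page(struct page *page)
    {
        /* Map the page into the kernel's address space. */
        void *vaddr = kmap(page);

        /* vaddr is only valid between kmap() and kunmap(). */
        memset(vaddr, 0, PAGE_SIZE);
        kunmap(page);
    }

Recent kernels also provide kmap_local_page()/kunmap_local() for short-lived, CPU-local mappings, which are cheaper than kmap().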

High memory is mostly relevant for I/O buffers to mass storage devices: they require a lot of kernel space and may exhaust the 1 GiB kernel address space. The kernel provides an additional feature, bounce buffers (cf. bounce_buffer_create), to manage this type of buffer on large-memory systems.

DMA

https://elixir.bootlin.com/linux/latest/source/Documentation/core-api/dma-api.rst and https://elixir.bootlin.com/linux/latest/source/Documentation/core-api/dma-api-howto.rst in the kernel source tree document how to do DMA. There is a large overlap in the content of the two documents. dma-api.rst is a bit more high-level. However, dma-api-howto.rst contains some good skeleton code you can start from when writing a driver.

DMA requires memory that can be accessed by the hardware (which often requires it to lie in the ZONE_DMA memory region), that is not cached, and that is physically contiguous. Therefore, drivers of DMA-capable hardware use dma_alloc_coherent() to allocate DMA-able space. (Older kernels also offered bus-specific wrappers: pci_alloc_consistent() for DMA over the PCI bus, and usb_buffer_alloc(), later renamed usb_alloc_coherent(), for USB.) Note that you still need to use memory barriers to make sure the accesses are not reordered by the processor. Basically, the only thing guaranteed here is that the DMA region is uncacheable.
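A minimal sketch of a coherent allocation, assuming dev is the driver's struct device and BUF_SIZE is a driver-chosen (hypothetical) buffer size:

    #include <linux/dma-mapping.h>

    #define BUF_SIZE 4096   /* hypothetical buffer size */

    dma_addr_t dma_handle;
    void *cpu_addr;

    cpu_addr = dma_alloc_coherent(dev, BUF_SIZE, &dma_handle, GFP_KERNEL);
    if (!cpu_addr)
        return -ENOMEM;

    /* Give dma_handle (the bus address) to the device;
     * the CPU accesses the buffer through cpu_addr. */

    dma_free_coherent(dev, BUF_SIZE, cpu_addr, dma_handle);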

How this coherency/consistency is guaranteed is processor-dependent, therefore these functions are implemented in the architecture-specific directories.

Since dma_alloc_coherent() allocates at least a full page, use dma_pool_create() to allocate space for smaller transfers. Then, take some space from the pool with dma_pool_alloc().
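For example (a sketch; the pool name, block size and alignment are chosen arbitrarily here):

    #include <linux/dmapool.h>

    struct dma_pool *pool;
    dma_addr_t handle;
    void *vaddr;

    /* name, device, block size, alignment, boundary (0 = none) */
    pool = dma_pool_create("mydriver-pool", dev, 64, 64, 0);
    if (!pool)
        return -ENOMEM;

    vaddr = dma_pool_alloc(pool, GFP_KERNEL, &handle);
    /* ... use the 64-byte block, then ... */
    dma_pool_free(pool, vaddr, handle);
    dma_pool_destroy(pool);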

Since the cache-coherent mapping may be expensive, a streaming mapping also exists. This is a buffer for one-way communication, which means coherency is limited to flushing the data from the cache when a write finishes. The buffer has to be pre-allocated (e.g. with kmalloc()). DMA for it is set up with dma_map_single(). When the DMA is finished (e.g. when the device has raised an interrupt signaling the end of the DMA), call dma_unmap_single(). Between map and unmap, the device is in control of the buffer: if the CPU writes data for the device, it has to do so before dma_map_single(); if it reads what the device wrote, it has to do so after dma_unmap_single().
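A sketch of a one-shot transmit buffer (direction DMA_TO_DEVICE; dev and len are assumed to come from the driver):

    #include <linux/dma-mapping.h>
    #include <linux/slab.h>

    void *buf;
    dma_addr_t handle;

    buf = kmalloc(len, GFP_KERNEL);
    if (!buf)
        return -ENOMEM;
    /* Fill buf *before* mapping it. */

    handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
    if (dma_mapping_error(dev, handle))
        return -ENOMEM;

    /* Start the transfer, wait for the completion interrupt... */

    dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);
    kfree(buf);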

The streaming DMA may use bounce buffers if necessary, i.e. if the physical address is not accessible to the device's DMA, as specified by the DMA mask set for the device with dma_set_mask(). Bounce buffers require extra memory-to-memory copies. This is an issue on large-memory systems with devices that can only address 32 (or fewer) bits. Note that the implementation of dma_unmap_single() is architecture-specific and may not include bounce buffers (e.g. on x86 it doesn't, and there is no check).
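Setting the mask is typically done once at probe time. A sketch for a device that can only address 32 bits (dma_set_mask_and_coherent() sets both the streaming and the coherent mask in one call):

    #include <linux/dma-mapping.h>

    if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32))) {
        dev_warn(dev, "no suitable DMA addressing available\n");
        return -EIO;
    }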

If the buffer is not physically contiguous, it must be passed through a scatter/gather list. Use dma_map_sg() instead of dma_map_single().
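A sketch with two (hypothetical) buffer fragments buf0 and buf1:

    #include <linux/scatterlist.h>
    #include <linux/dma-mapping.h>

    struct scatterlist sg[2];
    int i, count;

    sg_init_table(sg, 2);
    sg_set_buf(&sg[0], buf0, len0);
    sg_set_buf(&sg[1], buf1, len1);

    /* May return fewer entries than passed if the mapping merges them. */
    count = dma_map_sg(dev, sg, 2, DMA_TO_DEVICE);
    for (i = 0; i < count; i++) {
        /* Program the device with sg_dma_address(&sg[i])
         * and sg_dma_len(&sg[i]). */
    }

    dma_unmap_sg(dev, sg, 2, DMA_TO_DEVICE);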

If you're doing a lot of DMA, you would normally have a sequence of map-unmap-map-unmap requests. Rather than unmapping, you can keep the address mapped and just synchronise with dma_sync_single_for_cpu() or dma_sync_single_for_device(), as appropriate.
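A sketch for a receive buffer that stays mapped across transfers (handle and len as in the streaming example above):

    /* Take the buffer back from the device to read the received data. */
    dma_sync_single_for_cpu(dev, handle, len, DMA_FROM_DEVICE);
    /* ... CPU examines the data ... */

    /* Hand the buffer back to the device for the next transfer. */
    dma_sync_single_for_device(dev, handle, len, DMA_FROM_DEVICE);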


CategoryKernelHacking
