#language en

''Note: this article is based on an article by Greg Kroah-Hartman originally published in Linux Journal Embedded, and has been added
here by the original author. Please feel free to update it like any other page on this site.''

Almost all Linux kernel device drivers work on more than just one type of processor. This only happens because device-driver writers adhere to a few important rules. These rules include using the proper variable types, not relying on specific memory page sizes, being aware of endian issues with external data, setting up proper data alignment and accessing device memory locations through the proper interface. This article explains these rules, shows why it is important that they be followed and gives examples of them in use.

InternalKernelDataTypes

== Memory Issues ==

As we saw above in the example taken from drivers/char/serial.c, you can ask the kernel for a memory page. A memory page is not always 4KB in size (as it is on i386). If you are going to be referencing memory pages, you need to use the {{{PAGE_SHIFT}}} and {{{PAGE_SIZE}}} defines.

{{{PAGE_SHIFT}}} is the number of bit positions by which the value 1 is shifted left to get {{{PAGE_SIZE}}}; in other words, {{{PAGE_SIZE}}} equals {{{1 << PAGE_SHIFT}}}. Different architectures define this to different values. The table below shows a short list of some architectures and the values of {{{PAGE_SHIFT}}} and the resulting value for {{{PAGE_SIZE}}}.

||||Architecture||PAGE_SHIFT||PAGE_SIZE||
||||i386||12||4K||
||||MIPS||12||4K||
||||Alpha||13||8K||
||||m68k||12||4K||
||||m68k||13||8K||
||||ARM||12||4K||
||||ARM||14||16K||
||||ARM||15||32K||
||||IA-64||12||4K||
||||IA-64||13||8K||
||||IA-64||14||16K||
||||IA-64||16||64K||



Even on the same base architecture, you can have different page sizes. Sometimes this depends on a configuration option (as on IA-64), and sometimes on the variant of the processor (as on ARM).
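The relationship between the two defines, and the reason to avoid a hard-coded 4096, can be sketched in plain user-space C. The helpers {{{page_size()}}}, {{{pages_needed()}}} and {{{page_align()}}} are hypothetical names that take the shift as a parameter so the arithmetic can be tried with several values from the table; a driver would use the kernel's own {{{PAGE_SHIFT}}} and {{{PAGE_SIZE}}} instead.

```c
#include <assert.h>

/* Hypothetical helpers, not kernel API: the shift is passed in as an
 * argument so the same arithmetic can be tried for several architectures. */

/* PAGE_SIZE is the value 1 shifted left by PAGE_SHIFT bits. */
static unsigned long page_size(unsigned int page_shift)
{
        return 1UL << page_shift;
}

/* Number of whole pages needed to hold len bytes (rounds up). */
static unsigned long pages_needed(unsigned long len, unsigned int page_shift)
{
        return (len + page_size(page_shift) - 1) >> page_shift;
}

/* Round addr up to the next page boundary. */
static unsigned long page_align(unsigned long addr, unsigned int page_shift)
{
        unsigned long mask = page_size(page_shift) - 1;

        return (addr + mask) & ~mask;
}
```

With these helpers, a 10000-byte buffer needs three pages when the shift is 12 but only two when it is 13, which is exactly the kind of difference a hard-coded page size hides.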

The code snippet from drivers/usb/audio.c in Listing 1 shows how PAGE_SHIFT and PAGE_SIZE are used when accessing memory directly.

Listing 1.
Accessing Memory Directly


== Endian Issues ==

Processors store multibyte data in one of two byte orders: little-endian or big-endian. A little-endian processor stores the least significant byte at the lowest address, so the most significant byte ends up at the highest address (the right-most byte when memory is written out in increasing address order). A big-endian processor does the opposite: the most significant byte is stored at the lowest address (the left-most byte).

For example, Table 2 shows how the decimal value 684686 is stored in a 4-byte integer on the two different processor types (684686 decimal = a728e hex = 00000000 00001010 01110010 10001110 binary).

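Table 2.
How the Decimal Value 684686 is Stored in a 4-Byte Integer

||Address||Big-Endian||Little-Endian||
||0||00000000||10001110||
||1||00001010||01110010||
||2||01110010||00001010||
||3||10001110||00000000||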


Intel processors, for example the i386 and IA-64 series, are little-endian machines, whereas the SPARC processors are big-endian. The PowerPC processors can be run in either little- or big-endian mode; Linux has traditionally run them in big-endian mode. ARM processors can also run in either mode, depending on the specific chip and its configuration, but under Linux they usually run in little-endian mode.
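A quick way to see which layout a given machine uses is to copy an integer's bytes out of memory in address order. The sketch below is plain user-space C, and {{{bytes_in_memory()}}} and {{{cpu_is_little_endian()}}} are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of a 32-bit value into out[0..3],
 * lowest address first. */
static void bytes_in_memory(uint32_t value, unsigned char out[4])
{
        memcpy(out, &value, sizeof(value));
}

/* Returns 1 on a little-endian CPU, 0 on a big-endian one: the value 1
 * has its only non-zero byte at the lowest address on little-endian. */
static int cpu_is_little_endian(void)
{
        unsigned char b[4];

        bytes_in_memory(1, b);
        return b[0] == 1;
}
```

For 684686, a little-endian machine fills the array with 8e 72 0a 00 and a big-endian machine with 00 0a 72 8e. Driver code does not need a runtime test like this, because the conversion macros described in this section select the right behavior when the kernel is built.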

Because of the different endian types of processors, you need to be aware of data you receive from external sources and the order in which it appears. For example, the USB specification dictates that all multibyte data fields are in little-endian form. So if you have a USB driver that reads a multibyte field from the USB connection, you need to convert that data into the processor's native format. Code that skips the conversion still happens to work on a little-endian processor, because the format on the wire already matches the native one. But the same code fails on big-endian processors such as PowerPC or SPARC, and this is a leading cause of drivers that are broken on different platforms.
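The idea behind the conversion can be shown without any kernel headers: assemble the value from individual bytes, so the result does not depend on how the CPU lays out integers. This is a user-space sketch, and {{{get_le16()}}} and {{{get_le32()}}} are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not kernel API. Each one reads a little-endian
 * field from a byte buffer: byte 0 is the least significant byte. */
static uint16_t get_le16(const unsigned char *p)
{
        return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const unsigned char *p)
{
        return (uint32_t)p[0] |
               ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) |
               ((uint32_t)p[3] << 24);
}
```

Given the bytes 8e 72 0a 00 as they would arrive from a little-endian device, {{{get_le32()}}} yields 684686 on any processor, whereas a plain pointer cast would give that answer only on a little-endian one.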

Thankfully, there are a number of helpful macros that have been created to make this an easy task. All of the following macros can be found in the asm/byteorder.h header file.

To convert from the processor's native format into little-endian form you can use the following functions:
{{{
#!cplusplus
__le64 cpu_to_le64(u64);
__le32 cpu_to_le32(u32);
__le16 cpu_to_le16(u16);
}}}

To convert from little-endian format into the processor's native format you should use these functions:
{{{
#!cplusplus
u64 le64_to_cpu(__le64);
u32 le32_to_cpu(__le32);
u16 le16_to_cpu(__le16);
}}}

For big-endian forms, the following functions are available:

{{{
#!cplusplus
__be64 cpu_to_be64(u64);
__be32 cpu_to_be32(u32);
__be16 cpu_to_be16(u16);
u64 be64_to_cpu(__be64);
u32 be32_to_cpu(__be32);
u16 be16_to_cpu(__be16);
}}}


If you have a pointer to the value to convert, then you should use the following functions:

{{{
#!cplusplus
__le64 cpu_to_le64p(const u64 *);
__le32 cpu_to_le32p(const u32 *);
__le16 cpu_to_le16p(const u16 *);
u64 le64_to_cpup(const __le64 *);
u32 le32_to_cpup(const __le32 *);
u16 le16_to_cpup(const __le16 *);
__be64 cpu_to_be64p(const u64 *);
__be32 cpu_to_be32p(const u32 *);
__be16 cpu_to_be16p(const u16 *);
u64 be64_to_cpup(const __be64 *);
u32 be32_to_cpup(const __be32 *);
u16 be16_to_cpup(const __be16 *);
}}}


If you want to convert the value within a variable and store the modified value in the same variable (in situ), then you should use the following functions:

{{{
#!cplusplus
void cpu_to_le64s(u64 *);
void cpu_to_le32s(u32 *);
void cpu_to_le16s(u16 *);
void le64_to_cpus(u64 *);
void le32_to_cpus(u32 *);
void le16_to_cpus(u16 *);
void cpu_to_be64s(u64 *);
void cpu_to_be32s(u32 *);
void cpu_to_be16s(u16 *);
void be64_to_cpus(u64 *);
void be32_to_cpus(u32 *);
void be16_to_cpus(u16 *);
}}}
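The three calling styles (by value, through a pointer, and in situ) differ only in how the value is passed. The following user-space sketch mimics the 32-bit little-endian variants. The {{{my_}}} prefixed names are hypothetical stand-ins, and the runtime byte-order probe is an assumption made for the sake of a portable example; the kernel headers pick the implementation at compile time instead.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
static uint32_t my_swab32(uint32_t x)
{
        return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
               ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Runtime probe: on big-endian the first byte of the value 1 is zero. */
static int my_cpu_is_big_endian(void)
{
        const uint16_t probe = 1;

        return *(const unsigned char *)&probe == 0;
}

/* By value: swap only when the CPU is not already little-endian. */
static uint32_t my_le32_to_cpu(uint32_t le)
{
        return my_cpu_is_big_endian() ? my_swab32(le) : le;
}

/* Through a pointer: read the value, return the converted copy. */
static uint32_t my_le32_to_cpup(const uint32_t *p)
{
        return my_le32_to_cpu(*p);
}

/* In situ: overwrite the variable with the converted value. */
static void my_le32_to_cpus(uint32_t *p)
{
        *p = my_le32_to_cpu(*p);
}
```

The in situ form is convenient for fixing up every field of a structure right after it has been read from a device, while the by-value form fits naturally inside expressions.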


As stated before, the USB protocol is in little-endian format. The code snippet from drivers/usb/serial/visor.c presented in Listing 2 shows how a structure is read from the USB connection and then converted into the proper CPU format.

Listing 2.
How a structure is read from the USB connection and converted into the proper CPU format.


== Data Alignment ==

The gcc compiler typically aligns each field of a structure on the boundary the target architecture prefers for that field's type, inserting padding where needed, in order to provide faster execution. For example, consider the code and resulting output shown in Listing 3.

Listing 3.
Alignment of Individual Fields of a Structure


The output shows that the compiler aligned fields b and c in the struct foo on even byte boundaries. This is not a good thing when we want to overlay a structure on top of a memory location. Structures that describe hardware registers or data on the wire typically have no such padding between their fields. Because of this, the gcc {{{__attribute__ ((packed))}}} attribute is used to tell the compiler not to place any "memory holes" within a structure.
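The padding is easy to observe with {{{offsetof()}}}. The following is a sketch of the kind of program described above, not the original listing; {{{foo_offsets()}}} is a hypothetical helper, and the exact offsets depend on the compiler and target ABI.

```c
#include <assert.h>
#include <stddef.h>

/* The same three fields as in the text, without any packing attribute. */
struct foo {
        char    a;
        short   b;
        int     c;
};

/* Fill off[0..2] with the byte offsets of fields a, b and c. */
static void foo_offsets(size_t off[3])
{
        off[0] = offsetof(struct foo, a);
        off[1] = offsetof(struct foo, b);
        off[2] = offsetof(struct foo, c);
}
```

With gcc on common 32-bit and 64-bit targets the offsets typically come out as 0, 2 and 4: one byte of padding follows {{{a}}} so that {{{b}}} starts on an even address, and the structure occupies 8 bytes although its fields add up to only 7.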

If we change the struct foo structure to use the packed attribute like this:
{{{
#!cplusplus
struct foo {
        char    a;
        short   b;
        int     c;
} __attribute__ ((packed));
}}}

Then the output of the program changes to:
{{{
offset A = 0
offset B = 1
offset C = 3
}}}

Now there are no more memory holes in the structure.

This packed attribute can be used to pack an entire structure, as shown above, or it can be used only to pack a number of specific fields within a structure.

For example, the struct usb_ctrlrequest is defined in include/usb.h as the following:
{{{
#!cplusplus
struct usb_ctrlrequest {
        __u8 bRequestType;
        __u8 bRequest;
        __le16 wValue;
        __le16 wIndex;
        __le16 wLength;
} __attribute__ ((packed));
}}}

This ensures that the entire structure is packed, so that it can be used to write data directly to a USB connection.
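Because the whole point of packing is to match an external layout byte for byte, it is worth letting the compiler verify that layout. The sketch below is a user-space copy of the structure, with {{{stdint.h}}} types standing in for the kernel's {{{__u8}}} and {{{__le16}}}; the name {{{ctrlrequest_sketch}}} is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the structure above (hypothetical name). */
struct ctrlrequest_sketch {
        uint8_t  bRequestType;
        uint8_t  bRequest;
        uint16_t wValue;
        uint16_t wIndex;
        uint16_t wLength;
} __attribute__ ((packed));

/* A USB setup packet is 8 bytes on the wire. Refuse to compile if the
 * structure has a different size or if a field has moved. */
_Static_assert(sizeof(struct ctrlrequest_sketch) == 8,
               "setup packet must be 8 bytes");
_Static_assert(offsetof(struct ctrlrequest_sketch, wLength) == 6,
               "wLength must start at byte 6");
```

A check like this costs nothing at run time and catches a forgotten packed attribute, or an accidentally widened field, on the first build for a new architecture.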

But the definition of the struct usb_endpoint_descriptor looks like:
{{{
#!cplusplus
struct usb_endpoint_descriptor {
        __u8   bLength           __attribute__ ((packed));
        __u8   bDescriptorType   __attribute__ ((packed));
        __u8   bEndpointAddress  __attribute__ ((packed));
        __u8   bmAttributes      __attribute__ ((packed));
        __le16 wMaxPacketSize    __attribute__ ((packed));
        __u8   bInterval         __attribute__ ((packed));
        __u8   bRefresh          __attribute__ ((packed));
        __u8   bSynchAddress     __attribute__ ((packed));
        unsigned char *extra;   /* Extra descriptors */
        int extralen;
};
}}}

This ensures that the first part of the structure is packed and can be used to read directly from a USB connection, but the extra and extralen fields of the structure can be aligned to whatever the compiler thinks will be fastest to access.

== I/O Memory Access ==

Unlike on most typical embedded systems, accessing I/O memory on Linux cannot be done directly. This is due to the wide range of different memory types and maps present on the wide range of processors on which Linux runs. To access I/O memory in a portable manner, you must call ioremap() to gain access to a memory region and iounmap() to release access.

ioremap() is defined as:

{{{
#!cplusplus
void * ioremap (unsigned long offset, unsigned long size);
}}}


You pass in the starting offset of the region you wish to access and the size of the region in bytes. Do not use the return value as an ordinary pointer to read from and write to directly; treat it as a token that must be passed to the accessor functions in order to read and write data.

The functions to read and write data using memory mapped by ioremap() are:

{{{
#!cplusplus
u8  readb (unsigned long token);                /* read 8 bits */
u16 readw (unsigned long token);                /* read 16 bits */
u32 readl (unsigned long token);                /* read 32 bits */
void writeb (u8 value, unsigned long token);    /* write 8 bits */
void writew (u16 value, unsigned long token);   /* write 16 bits */
void writel (u32 value, unsigned long token);   /* write 32 bits */
}}}


After you are finished accessing the memory, you must call iounmap() to release the mapping so that the resources it used become available again.
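The calling pattern (obtain a token, pass it to an accessor for every read and write, never dereference it yourself) can be tried out in user space with a mock. In the sketch below everything is an assumption made for illustration: {{{fake_regs}}} stands in for the device, {{{mock_readl()}}} and {{{mock_writel()}}} stand in for the kernel accessors, and {{{reg_set_bits()}}} is a hypothetical driver helper. A real driver would obtain the token from ioremap() and release it with iounmap().

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a block of four 32-bit device registers. */
static uint32_t fake_regs[4];

/* Mock accessors. The volatile qualifier makes the compiler perform
 * every access exactly as written, which device registers require. */
static uint32_t mock_readl(const volatile void *token)
{
        return *(const volatile uint32_t *)token;
}

/* Argument order mirrors writel(): value first, then the location. */
static void mock_writel(uint32_t value, volatile void *token)
{
        *(volatile uint32_t *)token = value;
}

/* Read-modify-write: set bits in the register at byte offset reg
 * from base, then read the register back and return its new value. */
static uint32_t reg_set_bits(void *base, unsigned int reg, uint32_t bits)
{
        volatile unsigned char *addr = (volatile unsigned char *)base + reg;

        mock_writel(mock_readl(addr) | bits, addr);
        return mock_readl(addr);
}
```

Keeping every access behind an accessor is what lets the same driver source work on architectures where device memory needs special instructions, barriers or address translation.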

The code example in Listing 4 from the Compaq PCI Hot Plug driver in drivers/hotplug/cpqphp_core.c shows how to access a PCI device's resource memory properly.

Listing 4.
Accessing a PCI Device's Resource Memory


== Accessing PCI Memory ==

To access the PCI configuration space of a device, you again must use some general functions and not try to access the memory directly. This is due to the different ways the PCI bus can be accessed, depending on the type of hardware you have. If you use the general functions, then your PCI driver will be able to work on any type of Linux system that has a PCI bus.

To read data from the PCI bus use the following functions:

{{{
#!cplusplus
int pci_read_config_byte(struct pci_dev *dev, int where, u8 *val);
int pci_read_config_word(struct pci_dev *dev, int where, u16 *val);
int pci_read_config_dword(struct pci_dev *dev, int where, u32 *val);
}}}


and to write data, use these functions:

{{{
#!cplusplus
int pci_write_config_byte(struct pci_dev *dev, int where, u8 val);
int pci_write_config_word(struct pci_dev *dev, int where, u16 val);
int pci_write_config_dword(struct pci_dev *dev, int where, u32 val);
}}}


Where are the pci_read_config_* and pci_write_config_* functions actually defined? If you look closely in the file drivers/pci/pci.c, you will see the following code:

{{{
#!cplusplus
#define PCI_OP(rw,size,type) \
int pci_##rw##_config_##size (struct pci_dev *dev, \
                              int pos, type value) \
{                                                  \
    int res;                                       \
    unsigned long flags;                           \
    if (PCI_##size##_BAD) return                   \
       PCIBIOS_BAD_REGISTER_NUMBER;                \
    spin_lock_irqsave(&pci_lock, flags);           \
    res = dev->bus->ops->rw##_##size(dev, pos,     \
                                     value);       \
    spin_unlock_irqrestore(&pci_lock, flags);      \
    return res;                                    \
}

PCI_OP(read, byte, u8 *)
PCI_OP(read, word, u16 *)
PCI_OP(read, dword, u32 *)
PCI_OP(write, byte, u8)
PCI_OP(write, word, u16)
PCI_OP(write, dword, u32)
}}}



This bit of macro fun creates the six pci_read_config_* and pci_write_config_* functions by abusing the C preprocessor #define of PCI_OP().
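The token-pasting trick is easier to follow in a small, self-contained form. The user-space sketch below applies the same {{{##}}} technique to a fake 256-byte configuration space. {{{CFG_OP()}}}, {{{CFG_BAD()}}}, {{{fake_cfg}}} and the generated {{{cfg_read_*}}} and {{{cfg_write_*}}} functions are all hypothetical names; the alignment test plays the role of the {{{PCI_##size##_BAD}}} checks, and multibyte values are stored least significant byte first, as PCI configuration space is little-endian.

```c
#include <assert.h>
#include <stdint.h>

/* Fake configuration space standing in for a real device. */
static uint8_t fake_cfg[256];

/* Reject positions that are out of range or not aligned to the
 * access size (words on 2-byte, dwords on 4-byte boundaries). */
#define CFG_BAD(pos, type)                                      \
        ((pos) < 0 || (pos) > 256 - (int)sizeof(type) ||        \
         ((pos) & ((int)sizeof(type) - 1)))

/* One invocation defines a read/write pair; ## pastes the size name
 * into the function names, just as PCI_OP() does. */
#define CFG_OP(size, type)                                      \
static int cfg_read_##size(int pos, type *val)                  \
{                                                               \
        type v = 0;                                             \
        unsigned int i;                                         \
        if (CFG_BAD(pos, type))                                 \
                return -1;                                      \
        for (i = 0; i < sizeof(type); i++)                      \
                v |= (type)fake_cfg[pos + i] << (8 * i);        \
        *val = v;                                               \
        return 0;                                               \
}                                                               \
static int cfg_write_##size(int pos, type val)                  \
{                                                               \
        unsigned int i;                                         \
        if (CFG_BAD(pos, type))                                 \
                return -1;                                      \
        for (i = 0; i < sizeof(type); i++)                      \
                fake_cfg[pos + i] = (uint8_t)(val >> (8 * i));  \
        return 0;                                               \
}

CFG_OP(byte, uint8_t)
CFG_OP(word, uint16_t)
CFG_OP(dword, uint32_t)
```

The three invocations expand into six functions, from {{{cfg_read_byte()}}} to {{{cfg_write_dword()}}}. Writing one body and stamping it out per size keeps the variants from drifting apart, at the cost of definitions that a plain text search for the function name will not find.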

These functions allow you to read or write 8, 16 or 32 bits at a specific location in the configuration space of a specific PCI device. If you wish to access the configuration space of a PCI device that has not been initialized by the Linux PCI core yet, you can use the following functions that are present in the pci_hotplug core code:

{{{
#!cplusplus
int pci_read_config_byte_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u8 *val);
int pci_read_config_word_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u16 *val);
int pci_read_config_dword_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u32 *val);
int pci_write_config_byte_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u8 val);
int pci_write_config_word_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u16 val);
int pci_write_config_dword_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u32 val);
}}}


An example of reading and writing to PCI memory by a driver can be seen in the USB OHCI driver at drivers/usb/usb-ohci.c (see Listing 5).

Listing 5.
Reading and Writing to PCI Memory


== Conclusion ==

If you follow these different rules when creating a new Linux kernel device driver, or when modifying an existing one, the resulting code will run successfully on a wide range of processors. These rules are also good to remember when debugging a driver that only works on one platform (remember those endian issues).

The most important thing to remember is to study existing kernel drivers that are known to work on different platforms. One of Linux's strengths is the open access of its code, which provides a powerful learning tool for aspiring driver authors.