• Immutable Page
  • Info
  • Attachments

InitcallMechanism/BackgroundInformation

Before I start explaining what I found, I think it would be helpful to begin with a bit of a review of the ELF file format and how things get executed in Linux.

Executable File Formats

The product of compiling a C program is some machine language. But raw machine language isn't enough to allow the OS to run your code. The OS will want to know several pieces of meta-information with regards to your program before loading and running it; such as:

  • information necessary to allow dynamic linking

  • information about the size of your executable

  • how much executable code and how much data space is used by your application

  • how to lay out the application in memory

  • debugging information

  • ...and many many other things.

One way to solve this problem is to use a file format, a file format that contains not only the raw machine language code, but all the required additional information. There have been many such file formats devised over the years. Ones that I am familiar with include:

  • DOS .exe

    • A simple format consisting of a header (which contains all the required meta-information) plus the code itself. Very similar to most bitmap graphics file formats.

  • DOS .com

    • A "file format" that contains no meta-information whatsoever. This "format" is literally a raw dump of the machine language. All such meta-information is held in assumptions, the OS makes assumptions for any information it needs, and the programmer must follow these assumptions. Therefore, in a sense, it does "contain" meta-information, it's just that this information isn't contained anywhere in the file itself.

  • a.out

    • A fairly basic file format based on the notion of sections. Unfortunately, it was created before shared libraries became main-stream and couldn't specify the dynamic linking information easily. It also does not allow for an arbitrary number of sections.

  • ELF

    • A nicer format that expands on the ideas of the a.out format and adds additional features.

Linker

The linker (in our case, GNU ld) is responsible for taking the raw machine code from the compiler/assembler and creating a valid ELF file. ELF files consist of several different sections and have many different abilities. In other words, laying out an ELF file can be quite involved considering all the options that are available and all the information that needs to be stored. The linker, therefore, uses a script which helps guide how the output file is laid out. If you don't supply a script, there is an implicit default one provided for you.

The different sections and their attributes are used by, for example, the operating system's loader (when you want to execute an application). For example, some sections contain executable code that needs to be loaded into memory, other sections don't contain any data at all, but rather they instruct the loader to allocate some memory for the application to use (sometimes this memory even needs to be explicitly zeroed). Some sections aren't used by the loader at all, for example, sections that contain debugging information are of no use to the loader but are used by a debugger instead.

By default, executable code you write ends up in a section called .text, initialised data in a .data section, read-only data in .rodata, uninitialised data in .bss, and so on.

Compiler

All compilers have their own little extensions built into them which either extend the language in some way or provide the programmer with #pragma-like control of the environment; GCC is no exception. Of particular interest is the ability GCC gives the developer to specify the name of the section into which to place some object. The name of the ELF segment into which this object will be placed is only one of numerous such attributes the belong to this object.

Tell others about this page:

last edited 2006-10-11 15:58:01 by TrevorWoerner