Before I start explaining what I found, I think it would be helpful to begin with a bit of a review of the ELF file format and how things get executed in Linux.
Executable File Formats
The product of compiling a C program is some machine language. But raw machine language isn't enough to allow the OS to run your code. The OS will want to know several pieces of meta-information with regards to your program before loading and running it; such as:
- information necessary to allow dynamic linking
- information about the size of your executable
- how much executable code and how much data space is used by your application
- how to lay out the application in memory
- debugging information
- ...and many many other things.
One way to solve this problem is to use a file format, a file format that contains not only the raw machine language code, but all the required additional information. There have been many such file formats devised over the years. Ones that I am familiar with include:
- DOS .exe
- A simple format consisting of a header (which contains all the required meta-information) plus the code itself. Very similar to most bitmap graphics file formats.
- DOS .com
- A "file format" that contains no meta-information whatsoever. This "format" is literally a raw dump of the machine language. All such meta-information is held in assumptions, the OS makes assumptions for any information it needs, and the programmer must follow these assumptions. Therefore, in a sense, it does "contain" meta-information, it's just that this information isn't contained anywhere in the file itself.
- a.out
- A fairly basic file format based on the notion of sections. Unfortunately, it was created before shared libraries became main-stream and couldn't specify the dynamic linking information easily. It also does not allow for an arbitrary number of sections.
- ELF
- A nicer format that expands on the ideas of the a.out format and adds additional features.