KernelNewbies:

Before/After main()

Specifying an attribute gives the compiler information about how an object is intended to be used, thereby allowing it to not only better optimize your code but to also perform additional checks for you. In general (using gcc) attributes can be specified for functions, variables, and types. Full information can be found by visiting the GCC onlinedocs website and looking for the relevant subsections on attributes.

Unless you've stumbled across this before, you probably thought that the first line of your main() is the first line of code that gets executed when your executable is run. This isn't true. There are a number of functions that run before your main() gets called. Then, after your main() terminates, a number of additional clean-up routines are also called. Your main() is just one of several functions for the loader to run.

gcc allows you to specify functions it should call during the phase before main() is called as well as functions to call during the phase after main() is done. The following code demonstrates this and serves as an example of how to specify attributes on functions.

/*
 * Copyright (C) 2006  Trevor Woerner
 */

#include <stdio.h>

void my_ctor (void) __attribute__ ((constructor));
void my_dtor (void) __attribute__ ((destructor));

void
my_ctor (void)
{
        printf ("hello before main()\n");
}

void
my_dtor (void)
{
        printf ("bye after main()\n");
}

int
main (void)
{
        printf ("hello\nbye\n");
        return 0;
}

Compiling and running the above yields:

[trevor@trevor code]$ make beforeafter
cc     beforeafter.c   -o beforeafter
[trevor@trevor code]$ ./beforeafter 
hello before main()
hello
bye
bye after main()


Section and Object Layout

For this example we're going to build ./main composed of main.c and add.c:

/*
 * Copyright (C) 2006  Trevor Woerner
 */

#include <stdio.h>

int add (int, int);
int global_val;
int gval_init = 0;

int
main (void)
{
        int local_val = 25;
        global_val = 17;

        printf ("local_val: %d    global_val: %d    gval_init: %d\n",
                        local_val, global_val, gval_init);
        printf ("%d + %d = %d\n", local_val, global_val,
                        add (local_val, global_val));

        return 0;
}

/*
 * Copyright (C) 2006  Trevor Woerner
 */

int
add (int i, int j)
{
        return i+j;
}

By the way, the local_val, gval_init, and global_val were just added so we could see which sections they end up in.

Compiling is a simple process of:

[trevor@trevor code]$ gcc -c main.c
[trevor@trevor code]$ gcc -c add.c
[trevor@trevor code]$ gcc -o main main.o add.o

Here is a dump of the info from add.o:

[trevor@trevor code]$ objdump -t add.o

add.o:     file format elf32-i386

SYMBOL TABLE:
00000000 l    df *ABS*  00000000 add.c
00000000 l    d  .text  00000000 .text
00000000 l    d  .data  00000000 .data
00000000 l    d  .bss   00000000 .bss
00000000 l    d  .note.GNU-stack        00000000 .note.GNU-stack
00000000 l    d  .comment       00000000 .comment
00000000 g     F .text  0000000b add

Here is a similar dump of main.o:

[trevor@trevor code]$ objdump -t main.o

main.o:     file format elf32-i386

SYMBOL TABLE:
00000000 l    df *ABS*  00000000 main.c
00000000 l    d  .text  00000000 .text
00000000 l    d  .data  00000000 .data
00000000 l    d  .bss   00000000 .bss
00000000 l    d  .rodata        00000000 .rodata
00000000 l    d  .note.GNU-stack        00000000 .note.GNU-stack
00000000 l    d  .comment       00000000 .comment
00000000 g     O .bss   00000004 gval_init
00000000 g     F .text  0000007d main
00000004       O *COM*  00000004 global_val
00000000         *UND*  00000000 printf
00000000         *UND*  00000000 add

Here is a dump of the final executable:

[trevor@trevor code]$ objdump -t main

main:     file format elf32-i386

SYMBOL TABLE:
08048114 l    d  .interp        00000000              .interp
08048128 l    d  .note.ABI-tag  00000000              .note.ABI-tag
08048148 l    d  .hash  00000000              .hash
08048174 l    d  .dynsym        00000000              .dynsym
080481d4 l    d  .dynstr        00000000              .dynstr
08048234 l    d  .gnu.version   00000000              .gnu.version
08048240 l    d  .gnu.version_r 00000000              .gnu.version_r
08048260 l    d  .rel.dyn       00000000              .rel.dyn
08048268 l    d  .rel.plt       00000000              .rel.plt
08048280 l    d  .init  00000000              .init
08048298 l    d  .plt   00000000              .plt
080482d8 l    d  .text  00000000              .text
08048488 l    d  .fini  00000000              .fini
080484a4 l    d  .rodata        00000000              .rodata
080484ec l    d  .eh_frame      00000000              .eh_frame
080494f0 l    d  .ctors 00000000              .ctors
080494f8 l    d  .dtors 00000000              .dtors
08049500 l    d  .jcr   00000000              .jcr
08049504 l    d  .dynamic       00000000              .dynamic
080495cc l    d  .got   00000000              .got
080495d0 l    d  .got.plt       00000000              .got.plt
080495e8 l    d  .data  00000000              .data
080495f4 l    d  .bss   00000000              .bss
00000000 l    d  .comment       00000000              .comment
00000000 l    d  *ABS*  00000000              .shstrtab
00000000 l    d  *ABS*  00000000              .symtab
00000000 l    d  *ABS*  00000000              .strtab
080482fc l     F .text  00000000              call_gmon_start
00000000 l    df *ABS*  00000000              crtstuff.c
080494f0 l     O .ctors 00000000              __CTOR_LIST__
080494f8 l     O .dtors 00000000              __DTOR_LIST__
08049500 l     O .jcr   00000000              __JCR_LIST__
080495f4 l     O .bss   00000001              completed.4583
080495f0 l     O .data  00000000              p.4582
08048320 l     F .text  00000000              __do_global_dtors_aux
08048354 l     F .text  00000000              frame_dummy
00000000 l    df *ABS*  00000000              crtstuff.c
080494f4 l     O .ctors 00000000              __CTOR_END__
080494fc l     O .dtors 00000000              __DTOR_END__
080484ec l     O .eh_frame      00000000              __FRAME_END__
08049500 l     O .jcr   00000000              __JCR_END__
08048460 l     F .text  00000000              __do_global_ctors_aux
00000000 l    df *ABS*  00000000              main.c
00000000 l    df *ABS*  00000000              add.c
08049504 g     O .dynamic       00000000              _DYNAMIC
080495fc g     O .bss   00000004              global_val
080484a4 g     O .rodata        00000004              _fp_hw
080494f0 g       *ABS*  00000000              .hidden __fini_array_end
080495ec g     O .data  00000000              .hidden __dso_handle
08048458 g     F .text  00000005              __libc_csu_fini
08048280 g     F .init  00000000              _init
080483fc g     F .text  0000000b              add
080495f8 g     O .bss   00000004              gval_init
080482d8 g     F .text  00000000              _start
080494f0 g       *ABS*  00000000              .hidden __fini_array_start
08048408 g     F .text  0000004f              __libc_csu_init
080495f4 g       *ABS*  00000000              __bss_start
0804837c g     F .text  0000007d              main
00000000       F *UND*  00000187              __libc_start_main@@GLIBC_2.0
080494f0 g       *ABS*  00000000              .hidden __init_array_end
080495e8  w      .data  00000000              data_start
00000000       F *UND*  00000039              printf@@GLIBC_2.0
08048488 g     F .fini  00000000              _fini
080494f0 g       *ABS*  00000000              .hidden __preinit_array_end
080495f4 g       *ABS*  00000000              _edata
080495d0 g     O .got.plt       00000000              .hidden _GLOBAL_OFFSET_TABLE_
08049600 g       *ABS*  00000000              _end
080494f0 g       *ABS*  00000000              .hidden __init_array_start
080484a8 g     O .rodata        00000004              _IO_stdin_used
080495e8 g       .data  00000000              __data_start
00000000  w      *UND*  00000000              _Jv_RegisterClasses
080494f0 g       *ABS*  00000000              .hidden __preinit_array_start
00000000  w      *UND*  00000000              __gmon_start__

The interesting thing for me is how such a small amount of code generates such a large number of sections! Notice how both of the *.o files contain their own .text, .data, and .bss sections. When they are combined into the one final executable main, it contains just one of each such section (i.e. it makes no distinction about where the specific parts come from, they all get combined into one larger section of the same name).

If we want to know the linker script that was used (to find out how ld lays out all the sections), all we have to do is pass the --verbose flag to ld via gcc (like this: gcc -Wl,--verbose ...) and we will get the linker script spat out on stdout. Here is the linker script that I get for this code:

/* Script for -z combreloc: combine and sort reloc sections */
OUTPUT_FORMAT("elf32-i386", "elf32-i386",
              "elf32-i386")
OUTPUT_ARCH(i386)
ENTRY(_start)
SEARCH_DIR("/usr/i386-redhat-linux/lib"); SEARCH_DIR("/usr/local/lib"); SEARCH_DIR("/lib"); SEARCH_DIR("/usr/lib");
/* Do we need any of these for elf?
   __DYNAMIC = 0;    */
SECTIONS
{
  /* Read-only sections, merged into text segment: */
  PROVIDE (__executable_start = 0x08048000); . = 0x08048000 + SIZEOF_HEADERS;
  .interp         : { *(.interp) }
  .hash           : { *(.hash) }
  .dynsym         : { *(.dynsym) }
  .dynstr         : { *(.dynstr) }
  .gnu.version    : { *(.gnu.version) }
  .gnu.version_d  : { *(.gnu.version_d) }
  .gnu.version_r  : { *(.gnu.version_r) }
  .rel.dyn        :
    {
      *(.rel.init)
      *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*)
      *(.rel.fini)
      *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*)
      *(.rel.data.rel.ro*)
      *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*)
      *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*)
      *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*)
      *(.rel.ctors)
      *(.rel.dtors)
      *(.rel.got)
      *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*)
    }
  .rela.dyn       :
    {
      *(.rela.init)
      *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*)
      *(.rela.fini)
      *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*)
      *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*)
      *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*)
      *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*)
      *(.rela.ctors)
      *(.rela.dtors)
      *(.rela.got)
      *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*)
    }
  .rel.plt        : { *(.rel.plt) }
  .rela.plt       : { *(.rela.plt) }
  .init           :
  {
    KEEP (*(.init))
  } =0x90909090
  .plt            : { *(.plt) }
  .text           :
  {
    *(.text .stub .text.* .gnu.linkonce.t.*)
    KEEP (*(.text.*personality*))
    /* .gnu.warning sections are handled specially by elf32.em.  */
    *(.gnu.warning)
  } =0x90909090
  .fini           :
  {
    KEEP (*(.fini))
  } =0x90909090
  PROVIDE (__etext = .);
  PROVIDE (_etext = .);
  PROVIDE (etext = .);
  .rodata         : { *(.rodata .rodata.* .gnu.linkonce.r.*) }
  .rodata1        : { *(.rodata1) }
  .eh_frame_hdr : { *(.eh_frame_hdr) }
  .eh_frame       : ONLY_IF_RO { KEEP (*(.eh_frame)) }
  .gcc_except_table   : ONLY_IF_RO { KEEP (*(.gcc_except_table)) *(.gcc_except_table.*) }
  /* Adjust the address for the data segment.  We want to adjust up to
     the same address within the page on the next page up.  */
  . = ALIGN (0x1000) - ((0x1000 - .) & (0x1000 - 1)); . = DATA_SEGMENT_ALIGN (0x1000, 0x1000);
  /* Exception handling  */
  .eh_frame       : ONLY_IF_RW { KEEP (*(.eh_frame)) }
  .gcc_except_table   : ONLY_IF_RW { KEEP (*(.gcc_except_table)) *(.gcc_except_table.*) }
  /* Thread Local Storage sections  */
  .tdata          : { *(.tdata .tdata.* .gnu.linkonce.td.*) }
  .tbss           : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) }
  /* Ensure the __preinit_array_start label is properly aligned.  We
     could instead move the label definition inside the section, but
     the linker would then create the section even if it turns out to
     be empty, which isn't pretty.  */
  . = ALIGN(32 / 8);
  PROVIDE (__preinit_array_start = .);
  .preinit_array     : { KEEP (*(.preinit_array)) }
  PROVIDE (__preinit_array_end = .);
  PROVIDE (__init_array_start = .);
  .init_array     : { KEEP (*(.init_array)) }
  PROVIDE (__init_array_end = .);
  PROVIDE (__fini_array_start = .);
  .fini_array     : { KEEP (*(.fini_array)) }
  PROVIDE (__fini_array_end = .);
  .ctors          :
  {
    /* gcc uses crtbegin.o to find the start of
       the constructors, so we make sure it is
       first.  Because this is a wildcard, it
       doesn't matter if the user does not
       actually link against crtbegin.o; the
       linker won't look for a file to match a
       wildcard.  The wildcard also means that it
       doesn't matter which directory crtbegin.o
       is in.  */
    KEEP (*crtbegin*.o(.ctors))
    /* We don't want to include the .ctor section from
       from the crtend.o file until after the sorted ctors.
       The .ctor section from the crtend file contains the
       end of ctors marker and it must be last */
    KEEP (*(EXCLUDE_FILE (*crtend*.o ) .ctors))
    KEEP (*(SORT(.ctors.*)))
    KEEP (*(.ctors))
  }
  .dtors          :
  {
    KEEP (*crtbegin*.o(.dtors))
    KEEP (*(EXCLUDE_FILE (*crtend*.o ) .dtors))
    KEEP (*(SORT(.dtors.*)))
    KEEP (*(.dtors))
  }
  .jcr            : { KEEP (*(.jcr)) }
  .data.rel.ro : { *(.data.rel.ro.local) *(.data.rel.ro*) }
  .dynamic        : { *(.dynamic) }
  .got            : { *(.got) }
  . = DATA_SEGMENT_RELRO_END (12, .);
  .got.plt        : { *(.got.plt) }
  .data           :
  {
    *(.data .data.* .gnu.linkonce.d.*)
    KEEP (*(.gnu.linkonce.d.*personality*))
    SORT(CONSTRUCTORS)
  }
  .data1          : { *(.data1) }
  _edata = .;
  PROVIDE (edata = .);
  __bss_start = .;
  .bss            :
  {
   *(.dynbss)
   *(.bss .bss.* .gnu.linkonce.b.*)
   *(COMMON)
   /* Align here to ensure that the .bss section occupies space up to
      _end.  Align after .bss to ensure correct alignment even if the
      .bss section disappears because there are no input sections.  */
   . = ALIGN(32 / 8);
  }
  . = ALIGN(32 / 8);
  _end = .;
  PROVIDE (end = .);
  . = DATA_SEGMENT_END (.);
  /* Stabs debugging sections.  */
  .stab          0 : { *(.stab) }
  .stabstr       0 : { *(.stabstr) }
  .stab.excl     0 : { *(.stab.excl) }
  .stab.exclstr  0 : { *(.stab.exclstr) }
  .stab.index    0 : { *(.stab.index) }
  .stab.indexstr 0 : { *(.stab.indexstr) }
  .comment       0 : { *(.comment) }
  /* DWARF debug sections.
     Symbols in the DWARF debugging sections are relative to the beginning
     of the section so we begin them at 0.  */
  /* DWARF 1 */
  .debug          0 : { *(.debug) }
  .line           0 : { *(.line) }
  /* GNU DWARF 1 extensions */
  .debug_srcinfo  0 : { *(.debug_srcinfo) }
  .debug_sfnames  0 : { *(.debug_sfnames) }
  /* DWARF 1.1 and DWARF 2 */
  .debug_aranges  0 : { *(.debug_aranges) }
  .debug_pubnames 0 : { *(.debug_pubnames) }
  /* DWARF 2 */
  .debug_info     0 : { *(.debug_info .gnu.linkonce.wi.*) }
  .debug_abbrev   0 : { *(.debug_abbrev) }
  .debug_line     0 : { *(.debug_line) }
  .debug_frame    0 : { *(.debug_frame) }
  .debug_str      0 : { *(.debug_str) }
  .debug_loc      0 : { *(.debug_loc) }
  .debug_macinfo  0 : { *(.debug_macinfo) }
  /* SGI/MIPS DWARF 2 extensions */
  .debug_weaknames 0 : { *(.debug_weaknames) }
  .debug_funcnames 0 : { *(.debug_funcnames) }
  .debug_typenames 0 : { *(.debug_typenames) }
  .debug_varnames  0 : { *(.debug_varnames) }
  /DISCARD/ : { *(.note.GNU-stack) }
}

This might look like it's difficult to read, but it's not. As in C code, text between /* and */ is a comment and is ignored. A period by itself (.), as in assembly notation, stands for the current value of the output location counter.

At the top of the file is a bunch of housekeeping stuff. Then it gives the SECTIONS command which indicates the start of the script that defines how the sections of the output ELF file are going to be laid out. As an example, let's look at the following lines of code which lay out the .text part of the image (i.e. the part where the executable code is placed):

  .text           :
  {
    *(.text .stub .text.* .gnu.linkonce.t.*)
    KEEP (*(.text.*personality*))
    /* .gnu.warning sections are handled specially by elf32.em.  */
    *(.gnu.warning)
  } =0x90909090

This snippet says:

  1. Now I am going to lay out a .text section in the output file, at this point in the output.

  2. This section is going to be composed of all the .text, .stub, .text.*, and .gnu.linkonce.t.* sections from all input files I am given, in the order in which I encounter them (the * before the parenthesized list indicates which input files to consider).

  3. KEEP marks the matching .text.*personality* sections as ones I must not discard, even if I am asked to garbage-collect unused sections.

  4. This is followed by all .gnu.warning sections I encounter from all input files.

  5. The =0x90909090 written at the end of the section's description tells me the fill pattern to use for any gaps left inside the section (mostly due to alignment constraints). On i386, 0x90 is the NOP instruction.

Just playing around and experimenting some more, here is the objdump -t of the executable again, with most of the cruft removed, and sorted by address:

080482d8 g     F .text  00000000              _start
080482d8 l    d  .text  00000000              .text
080482fc l     F .text  00000000              call_gmon_start
08048320 l     F .text  00000000              __do_global_dtors_aux
08048354 l     F .text  00000000              frame_dummy
0804837c g     F .text  0000007d              main
080483fc g     F .text  0000000b              add
08048408 g     F .text  0000004f              __libc_csu_init
08048458 g     F .text  00000005              __libc_csu_fini
08048460 l     F .text  00000000              __do_global_ctors_aux

If I change the link line to be:

[trevor@trevor code]$ gcc -o main add.o main.o

watch how the positions of the functions main() and add() change place in the executable image:

080482d8 g     F .text  00000000              _start
080482d8 l    d  .text  00000000              .text
080482fc l     F .text  00000000              call_gmon_start
08048320 l     F .text  00000000              __do_global_dtors_aux
08048354 l     F .text  00000000              frame_dummy
0804837c g     F .text  0000000b              add
08048388 g     F .text  0000007d              main
08048408 g     F .text  0000004f              __libc_csu_init
08048458 g     F .text  00000005              __libc_csu_fini
08048460 l     F .text  00000000              __do_global_ctors_aux

This happens because the linker script, when creating the .text section, does a wildcard match on all .text sections and joins them together into one single .text section in the order in which they are encountered. During the first link we specified the order as main.o followed by add.o; therefore the symbols were placed in the executable starting with the symbols from main.o followed by the symbols from add.o. In the second case we specified the object files in the reverse order, therefore the symbols were stored in the executable in the reverse order too.


Putting Objects into their own ELF Sections

I'm going to start with the code that we saw before in the section on code layout and modify it a bit so that different parts will now be in their own ELF sections:

/*
 * Copyright (C) 2006  Trevor Woerner
 */

#include <stdio.h>

int add (int, int) __attribute__ ((section ("my_code_section")));
int global_val     __attribute__ ((section ("my_data_section")));
int gval_init      __attribute__ ((section ("my_data_section"))) = 29;

int
add (int i, int j)
{
        return i+j;
}

int
main (void)
{
        int local_val = 25;
        global_val = 17;

        printf ("local_val: %d    global_val: %d    gval_init: %d\n",
                        local_val, global_val, gval_init);
        printf ("%d + %d = %d\n", local_val, global_val,
                        add (local_val, global_val));

        return 0;
}

Now when we do an objdump -t on the result we get the following:

00000000       F *UND*  00000039              printf@@GLIBC_2.0
00000000       F *UND*  00000187              __libc_start_main@@GLIBC_2.0
00000000  w      *UND*  00000000              _Jv_RegisterClasses
00000000  w      *UND*  00000000              __gmon_start__
00000000 l    d  *ABS*  00000000              .shstrtab
00000000 l    d  *ABS*  00000000              .strtab
00000000 l    d  *ABS*  00000000              .symtab
00000000 l    d  .comment       00000000              .comment
00000000 l    df *ABS*  00000000              crtstuff.c
00000000 l    df *ABS*  00000000              crtstuff.c
00000000 l    df *ABS*  00000000              new.c
08048114 l    d  .interp        00000000              .interp
08048128 l    d  .note.ABI-tag  00000000              .note.ABI-tag
08048148 l    d  .hash  00000000              .hash
08048174 l    d  .dynsym        00000000              .dynsym
080481d4 l    d  .dynstr        00000000              .dynstr
08048234 l    d  .gnu.version   00000000              .gnu.version
08048240 l    d  .gnu.version_r 00000000              .gnu.version_r
08048260 l    d  .rel.dyn       00000000              .rel.dyn
08048268 l    d  .rel.plt       00000000              .rel.plt
08048280 g     F .init  00000000              _init
08048280 l    d  .init  00000000              .init
08048298 l    d  .plt   00000000              .plt
080482d8 g     F .text  00000000              _start
080482d8 l    d  .text  00000000              .text
080482fc l     F .text  00000000              call_gmon_start
08048320 l     F .text  00000000              __do_global_dtors_aux
08048354 l     F .text  00000000              frame_dummy
0804837c g     F .text  0000007a              main
080483f8 g     F .text  0000004f              __libc_csu_init
08048448 g     F .text  00000005              __libc_csu_fini
08048450 l     F .text  00000000              __do_global_ctors_aux
08048478 g       *ABS*  00000000              __start_my_code_section
08048478 g     F my_code_section        0000000b              add
08048478 l    d  my_code_section        00000000              my_code_section
08048483 g       *ABS*  00000000              __stop_my_code_section
08048484 g     F .fini  00000000              _fini
08048484 l    d  .fini  00000000              .fini
080484a0 g     O .rodata        00000004              _fp_hw
080484a0 l    d  .rodata        00000000              .rodata
080484a4 g     O .rodata        00000004              _IO_stdin_used
080484e8 l     O .eh_frame      00000000              __FRAME_END__
080484e8 l    d  .eh_frame      00000000              .eh_frame
080494ec g       *ABS*  00000000              .hidden __fini_array_end
080494ec g       *ABS*  00000000              .hidden __fini_array_start
080494ec g       *ABS*  00000000              .hidden __init_array_end
080494ec g       *ABS*  00000000              .hidden __init_array_start
080494ec g       *ABS*  00000000              .hidden __preinit_array_end
080494ec g       *ABS*  00000000              .hidden __preinit_array_start
080494ec l     O .ctors 00000000              __CTOR_LIST__
080494ec l    d  .ctors 00000000              .ctors
080494f0 l     O .ctors 00000000              __CTOR_END__
080494f4 l     O .dtors 00000000              __DTOR_LIST__
080494f4 l    d  .dtors 00000000              .dtors
080494f8 l     O .dtors 00000000              __DTOR_END__
080494fc l     O .jcr   00000000              __JCR_END__
080494fc l     O .jcr   00000000              __JCR_LIST__
080494fc l    d  .jcr   00000000              .jcr
08049500 g     O .dynamic       00000000              _DYNAMIC
08049500 l    d  .dynamic       00000000              .dynamic
080495c8 l    d  .got   00000000              .got
080495cc g     O .got.plt       00000000              .hidden _GLOBAL_OFFSET_TABLE_
080495cc l    d  .got.plt       00000000              .got.plt
080495e4  w      .data  00000000              data_start
080495e4 g       .data  00000000              __data_start
080495e4 l    d  .data  00000000              .data
080495e8 g     O .data  00000000              .hidden __dso_handle
080495ec l     O .data  00000000              p.4582
080495f0 g       *ABS*  00000000              __start_my_data_section
080495f0 g     O my_data_section        00000004              gval_init
080495f0 l    d  my_data_section        00000000              my_data_section
080495f4 g     O my_data_section        00000004              global_val
080495f8 g       *ABS*  00000000              __bss_start
080495f8 g       *ABS*  00000000              __stop_my_data_section
080495f8 g       *ABS*  00000000              _edata
080495f8 l     O .bss   00000001              completed.4583
080495f8 l    d  .bss   00000000              .bss
080495fc g       *ABS*  00000000              _end

Running the executable gives:

local_val: 25    global_val: 17    gval_init: 29
25 + 17 = 42

The first thing to note is that the executable works! (yea!) The second thing you should notice is the existence of new section names (my_code_section and my_data_section) in the executable image. You will also notice that these sections contain the objects that we placed in them.

...
08048478 g       *ABS*  00000000              __start_my_code_section
08048478 g     F my_code_section        0000000b              add
08048478 l    d  my_code_section        00000000              my_code_section
08048483 g       *ABS*  00000000              __stop_my_code_section
...
080495f0 g       *ABS*  00000000              __start_my_data_section
080495f0 g     O my_data_section        00000004              gval_init
080495f0 l    d  my_data_section        00000000              my_data_section
080495f4 g     O my_data_section        00000004              global_val
080495f8 g       *ABS*  00000000              __bss_start
080495f8 g       *ABS*  00000000              __stop_my_data_section

Something else that is very worthy of note is the fact that ld has been kind enough to add a couple of global absolute symbols which delimit our newly-defined sections without us needing to ask it to: __start_my_code_section/__stop_my_code_section and __start_my_data_section/__stop_my_data_section. Notice how __start_my_code_section has the same address as our add() function and that __start_my_data_section has the same address as our gval_init.

You may have just asked yourself: "In the generated my_data_section above, why did the gval_init object come first?". Having a look at the generated assembly (gcc -S) helps us to investigate this question:

.globl gval_init
        .section        my_data_section,"aw",@progbits
        .align 4
        .type   gval_init, @object
        .size   gval_init, 4
gval_init:
        .long   29
        .section        my_code_section,"ax",@progbits
.globl add
        .type   add, @function
add:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        addl    8(%ebp), %eax
        leave
        ret
        .size   add, .-add
        .section        .rodata
        .align 4
.LC0:
        .string "local_val: %d    global_val: %d    gval_init: %d\n"
.LC1:
        .string "%d + %d = %d\n"
        .text
.globl main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        andl    $-16, %esp
        movl    $0, %eax
        addl    $15, %eax
        addl    $15, %eax
        shrl    $4, %eax
        sall    $4, %eax
        subl    %eax, %esp
        movl    $25, -4(%ebp)
        movl    $17, global_val
        movl    gval_init, %eax
        movl    global_val, %edx
        pushl   %eax
        pushl   %edx
        pushl   -4(%ebp)
        pushl   $.LC0
        call    printf
        addl    $16, %esp
        movl    global_val, %eax
        pushl   %eax
        pushl   -4(%ebp)
        call    add
        addl    $8, %esp
        movl    global_val, %edx
        pushl   %eax
        pushl   %edx
        pushl   -4(%ebp)
        pushl   $.LC1
        call    printf
        addl    $16, %esp
        movl    $0, %eax
        leave
        ret
        .size   main, .-main
.globl global_val
        .section        my_data_section
        .align 4
        .type   global_val, @object
        .size   global_val, 4
global_val:
        .zero   4

Notice how the two variables are emitted under two separate .section directives (both naming the same section). Why did this happen? I'm not 100% sure, but the gcc sub-program that converts the C code into assembly set up two separate blocks for the two variables, and because both blocks name the same section, the variables ended up together. I don't know why the initialized global data ended up at the top and the uninitialized one ended up at the bottom. Perhaps this is something to explore some other day.

Basically, the answer to the above question "why gval_init ended up first" is that gcc separated them that way. If we make both of them uninitialized globals we'll see that gcc emits only one .section directive for both of them, and that they appear in our section in the order in which they're found in the source code:

code:
int global_val     __attribute__ ((section ("my_data_section")));
int gval_init      __attribute__ ((section ("my_data_section")));

assembly: (at the bottom of file)
.globl global_val
        .section        my_data_section,"aw",@progbits
        .align 4
        .type   global_val, @object
        .size   global_val, 4
global_val:
        .zero   4
.globl gval_init
        .align 4
        .type   gval_init, @object
        .size   gval_init, 4
gval_init:
        .zero   4

objdump -t:
080495f0 g       *ABS*  00000000              __start_my_data_section
080495f0 g     O my_data_section        00000004              global_val
080495f0 l    d  my_data_section        00000000              my_data_section
080495f4 g     O my_data_section        00000004              gval_init
080495f8 g       *ABS*  00000000              __bss_start
080495f8 g       *ABS*  00000000              __stop_my_data_section

code:
int gval_init      __attribute__ ((section ("my_data_section")));
int global_val     __attribute__ ((section ("my_data_section")));

assembly:
.globl gval_init
        .section        my_data_section,"aw",@progbits
        .align 4
        .type   gval_init, @object
        .size   gval_init, 4
gval_init:
        .zero   4
.globl global_val
        .align 4
        .type   global_val, @object
        .size   global_val, 4
global_val:
        .zero   4

objdump -t:
080495f0 g       *ABS*  00000000              __start_my_data_section
080495f0 g     O my_data_section        00000004              gval_init
080495f0 l    d  my_data_section        00000000              my_data_section
080495f4 g     O my_data_section        00000004              global_val
080495f8 g       *ABS*  00000000              __bss_start
080495f8 g       *ABS*  00000000              __stop_my_data_section

KernelNewbies: InitcallMechanism/SimpleExamples (last edited 2021-01-13 04:54:43 by RandyDunlap)