What is that "jmp 1b\n" doing in the down() function?
While going through the implementation of down(), newbies get confused by the strange-looking "jmp 1b" instruction. I asked on the mailing list, and Thomas PETAZZONI was kind enough to answer.
static inline void down(struct semaphore * sem)
{
        might_sleep();
        __asm__ __volatile__(
                "# atomic down operation\n\t"
                LOCK "decl %0\n\t"       /* --sem->count */ // locked and decremented.
                "js 2f\n"                // if sign is set jump to 2:
                "1:\n"
                LOCK_SECTION_START("")   // starts a new section in elf. This is the key to the mystery :-)
                "2:\tlea %0,%%eax\n\t"   // loads &sem (==&sem->count) into eax.
                "call __down_failed\n\t" // this eventually returns when sem is acquired
                "jmp 1b\n"               // goes back to 1:. To know why, read the explanation
                LOCK_SECTION_END
                :"=m" (sem->count)
                :
                :"memory","ax");
}
This is quite subtle. The explanation lies in the LOCK_SECTION_START() and LOCK_SECTION_END() macros. These macros put the code between them in another section, far away from the current one. So what actually follows the 1: label in memory is not the "lea", but the first instruction after the down(). The "jmp 1b" at the end of the out-of-line code therefore jumps back to the code that follows down().
So, if you write some C code like:
    a = 2;
    down ();
    c = 1;
It will end up like this:
    - some assembly code to set a to 2
    - LOCK decl %0
    - js 2f /* Jump only in the contended case */
    - some assembly code to set c to 1 /* the 1: label points here */
and then, far away:
    - lea %0, %%eax /* the 2: label points here */
    - call __down_failed
    - jmp 1b /* back to the code that sets c to 1 */
The idea is to optimize the non-contended case, when there is no contention on the semaphore. Keeping the rarely executed instructions out of line avoids filling the i-cache with code that is useless most of the time, and it probably also improves pipeline usage by making sure that the prefetched instructions are the ones most likely to be executed.
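The same fast-path/slow-path split can be sketched in plain C. This is only an illustration, not the kernel's code: the names toy_sem, toy_down, toy_down_failed and toy_slow_path_calls are made up for this example, the decrement is not atomic, and the slow path only counts how often it was entered instead of sleeping. It assumes GCC or Clang, whose "cold" attribute marks a function as rarely executed (on many targets GCC groups such functions in a separate text subsection) and whose __builtin_expect() tells the compiler which branch is the likely one.

```c
/* Hypothetical toy semaphore; only the counter matters for this sketch. */
struct toy_sem {
    int count;
};

/* Counts slow-path entries so the effect can be observed (illustration only). */
static int toy_slow_path_calls;

/*
 * Contended path. "cold" hints that this function is rarely executed and
 * "noinline" keeps its body out of the callers. A real implementation
 * would put the caller to sleep here until the semaphore is released.
 */
__attribute__((cold, noinline))
static void toy_down_failed(struct toy_sem *sem)
{
    (void)sem;
    toy_slow_path_calls++;
}

static inline void toy_down(struct toy_sem *sem)
{
    /* NOT atomic: the real down() uses a LOCK-prefixed decl instead. */
    if (__builtin_expect(--sem->count < 0, 0))
        toy_down_failed(sem);   /* plays the role of the code at label 2: */
    /* Either way, execution continues with whatever follows the call,
     * which is what "jmp 1b" achieves in the assembly version. */
}
```

With a toy_sem whose count starts at 1, the first toy_down() takes only the fast path, and the second one drives the count negative and enters toy_down_failed() once.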