Parent Node : [:New Kernel Hacking HOWTO/Subsystems:Subsystems]
Exceptions and Interrupts Handling
This section covers the internals of interrupt handling in the Linux kernel (all explanations relate to the i386 platform). This section is under development and might be incomplete right now.
I will cover the following topics, explaining both the hardware and the software side: how interrupts are generated, routed and then handled by the low-level code of the Linux kernel.
- Introduction
- Interrupt Routing
- Details of Programmable Interrupt Controller
- Details of Interrupt Descriptor Table
- Task Gates
- Trap Gates
- Interrupt Gates
- Hardware Checks for Interrupts and Exceptions
- [:New Kernel Hacking HOWTO/Subsystems/Exceptions and Interrupts Handling/Details of do IRQ() function:Linux Kernel support for Handling Interrupts] - Details of do_IRQ() function, core of Interrupt Handling
Introduction
This section discusses interrupt handling from the hardware perspective of the CPU, the Linux kernel's interrupt routing subsystem, and the role of device drivers in interrupt handling.
The term interrupt is self-explanatory: interrupts are signals sent to the CPU on its INTR line whenever a device wants the CPU's attention. As soon as the interrupt signal occurs, the CPU defers its current activity and services the interrupt by executing the interrupt handler corresponding to that interrupt number (also known as the IRQ number).
Interrupts can be classified as follows:
- Synchronous Interrupts (also known as software interrupts)
- Asynchronous Interrupts (also known as hardware interrupts)
The basic difference between the two is that synchronous interrupts are generated by the CPU's control unit itself, either when the CPU detects an abnormal condition or when it executes one of the special instructions such as 'int' or 'int3'; in Intel's terminology these are known as exceptions. Asynchronous interrupts, on the other hand, are generated by the outside world (devices connected to the CPU). Because they can occur at any point in time, they are known as asynchronous interrupts.
It is important to note that both synchronous and asynchronous interrupts are handled by the CPU on completion of the instruction during which the interrupt occurs. A machine instruction does not execute in a single CPU cycle; it takes several cycles to complete. An interrupt that arrives in the middle of an instruction is not handled immediately. Rather, the CPU handles it after that instruction completes.
Interrupt Routing
There are a few things we always expect the CPU to do when an interrupt occurs. Whenever an interrupt occurs, the CPU performs some hardware checks, which are required to keep the system secure. Before discussing the hardware checks, we will explain how interrupts are routed to the CPU from the hardware devices.
Details of Programmable Interrupt Controller
On Intel architecture, system devices (device controllers) are connected to a special device known as the PIC (Programmable Interrupt Controller). The CPU has two lines for receiving interrupt signals: NMI and INTR. The NMI line receives non-maskable interrupts, that is, interrupts that cannot be blocked. These interrupts have the highest priority and are rarely used. The INTR line is the line on which all interrupts from system devices are received. These interrupts can be masked (blocked).
Since all the interrupt signals need to be multiplexed onto a single CPU line, we need some mechanism through which interrupts from different device controllers can be routed to that single line. This routing, or multiplexing, is done by the PIC. The PIC sits between the system devices and the CPU and has multiple input lines, each connected to a different device controller in the system. On the other side, the PIC has only one output line, which is connected to the CPU's INTR line and on which it signals the CPU. There are two PICs joined together: the output of the second PIC is connected to input line 2 (IRQ2) of the first PIC. This setup allows a maximum of 15 input lines to which different system device controllers can be connected. Each PIC has some programmable registers through which the CPU communicates with it (giving commands, masking/unmasking interrupt lines, reading status). Both PICs have their own copies of the following registers:
Mask Register
Status Register
The mask register is used to mask/unmask a specific interrupt line. The CPU can ask the PIC to mask (block) a specific interrupt by setting the corresponding bit in the mask register. Unmasking is done by clearing that bit. While a particular interrupt is masked, the PIC still receives interrupts on the corresponding input line but does not send the interrupt signal to the CPU, so the CPU keeps on doing what it was doing. Interrupts that arrive while masked are not lost; the PIC remembers them and sends the interrupt to the CPU once the CPU unmasks that interrupt line.
Masking is different from blocking all interrupts to the CPU. The CPU can ignore all interrupts arriving on the INTR line by clearing the IF (Interrupt Flag) bit in its EFLAGS register. While this bit is cleared, interrupts arriving on the INTR line are simply ignored by the CPU; we can consider this to be blocking of interrupts. So masking is done at the PIC level, where individual interrupt lines can be masked or unmasked, whereas blocking is done at the CPU level and applies to all interrupts coming to that CPU except NMIs (Non-Maskable Interrupts), which are received on the NMI line of the CPU and cannot be blocked or ignored.
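The difference between masking at the PIC and blocking at the CPU can be sketched with a small C model. This is only an illustration, not real 8259A programming: the struct and function names are hypothetical, the numbering (first PIC serves IRQ 0-7, second PIC serves IRQ 8-15) is the usual PC convention rather than something stated above, and the "lowest line first" priority is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the two cascaded PICs and the CPU's IF flag. */
enum { PIC_LINES = 8, CASCADE_LINE = 2, MASTER = 0, SLAVE = 1 };

struct pic_pair {
    uint8_t mask[2];     /* bit n set: input line n of that PIC is masked   */
    uint8_t pending[2];  /* bit n set: line n raised, not yet sent to CPU   */
};

/* Bit inside the owning PIC's registers for a system IRQ number (0-15). */
static uint8_t line_bit(int irq)
{
    assert(irq >= 0 && irq < 2 * PIC_LINES);
    return (uint8_t)(1u << (irq % PIC_LINES));
}

static void mask_irq(struct pic_pair *p, int irq)
{
    uint8_t bit = line_bit(irq);
    p->mask[irq / PIC_LINES] |= bit;
}

static void unmask_irq(struct pic_pair *p, int irq)
{
    uint8_t bit = line_bit(irq);
    p->mask[irq / PIC_LINES] &= (uint8_t)~bit;
}

/* A device raises its line; the PIC remembers it even if the line is masked. */
static void raise_irq(struct pic_pair *p, int irq)
{
    uint8_t bit = line_bit(irq);
    assert(irq != CASCADE_LINE);  /* this input is occupied by the second PIC */
    p->pending[irq / PIC_LINES] |= bit;
}

/* Returns the IRQ sent to the CPU next, or -1 if nothing can be delivered. */
static int deliver_next(struct pic_pair *p, bool interrupt_flag)
{
    if (!interrupt_flag)
        return -1;  /* blocked at CPU level: INTR is ignored while IF is clear */

    for (int line = 0; line < PIC_LINES; line++) {
        uint8_t bit = (uint8_t)(1u << line);

        if (p->mask[MASTER] & bit)
            continue;  /* masked at PIC level: stays pending */

        if (line == CASCADE_LINE) {
            /* The second PIC's requests arrive through this input. */
            uint8_t ready = (uint8_t)(p->pending[SLAVE] & ~p->mask[SLAVE]);
            for (int s = 0; s < PIC_LINES; s++) {
                if (ready & (1u << s)) {
                    p->pending[SLAVE] &= (uint8_t)~(1u << s);
                    return PIC_LINES + s;
                }
            }
        } else if (p->pending[MASTER] & bit) {
            p->pending[MASTER] &= (uint8_t)~bit;
            return line;
        }
    }
    return -1;
}
```

In this model a masked interrupt stays in `pending` and is delivered as soon as its line is unmasked, while clearing `interrupt_flag` holds back every line at once.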
Nowadays, the interrupt architecture is not as simple as described above: machines use the APIC (Advanced Programmable Interrupt Controller), which can support up to 256 interrupt vectors. In addition to the I/O APIC, which collects interrupts from devices, every CPU has a built-in local APIC. I won't go into the details of these right now.
Once the interrupt signal is received by the CPU, the CPU performs some hardware checks for which no software machine instructions are executed. Before looking into what these checks are, we need to understand some architecture-specific data structures maintained by the kernel.
Details of Interrupt Descriptor Table (IDT)
The kernel needs to maintain one IDT (Interrupt Descriptor Table), which maps each interrupt number to its interrupt handler routine. This table has 256 entries of 8 bytes each. The first 32 entries are used for exceptions and the remaining ones are used for hardware interrupts received from the outside world. The table can contain three different types of entries:
Task Gates, Trap Gates and Interrupt Gates
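The layout arithmetic described above (256 entries of 8 bytes each, with the first 32 reserved for exceptions) can be written down as a short C sketch. The macro and function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define IDT_ENTRIES           256u  /* number of gates in the table      */
#define IDT_ENTRY_BYTES       8u    /* size of one gate descriptor       */
#define FIRST_EXTERNAL_VECTOR 32u   /* entries 0-31 are exceptions       */

enum vector_class { VECTOR_EXCEPTION, VECTOR_EXTERNAL };

/* Total size of the table in bytes. */
static size_t idt_size_bytes(void)
{
    return (size_t)IDT_ENTRIES * IDT_ENTRY_BYTES;
}

/* Byte offset of the gate for a given interrupt number inside the table. */
static size_t idt_entry_offset(unsigned vector)
{
    assert(vector < IDT_ENTRIES);
    return (size_t)vector * IDT_ENTRY_BYTES;
}

/* Exceptions occupy the first 32 entries; everything else is "the rest". */
static enum vector_class classify_vector(unsigned vector)
{
    assert(vector < IDT_ENTRIES);
    return vector < FIRST_EXTERNAL_VECTOR ? VECTOR_EXCEPTION : VECTOR_EXTERNAL;
}
```

With these numbers the whole table occupies 2048 bytes, and the gate for interrupt number 128 starts 1024 bytes into it.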
Let's see what these gates are and where they are used.
a). Task Gates
Format of task gates is as follows:
- 0 to 15 bits : reserved (not used)
- 16 to 31 bits : points to the TSS (Task State Segment) entry of the process to which we need to switch.
- 32 to 39 bits : these bits are reserved and are not currently used.
- 40 to 43 bits : specify the type of entry (its value for task gate is 0101)
- 44th bit : always 0, not used
- 45 to 46 bits : this specifies the DPL (Descriptor Privilege Level) of the gate entry.
- 47th bit : specifies if this entry is valid or not (1 - valid, 0 - invalid)
- 48 to 63 bits : reserved (not used)
Task gates are used in the IDT to allow a user process to make a context switch to another process without asking the kernel to do it. As soon as this gate is hit (an interrupt is received on a line for which there is a task gate in the IDT), the CPU saves the context (the state of the processor's registers) of the currently running process to the TSS of that process, whose address is held in the TR (Task Register) of the CPU. After saving the context of the current process, the CPU loads its registers with the values stored in the TSS of the new process, which is identified by bits 16-31 of the task gate. Once the registers are set to these new values, the processor is running the new process and the context switch is done. Linux does not use task gates; it only uses trap and interrupt gates in the IDT, so I will not explain task gates any further.
b). Trap Gates
Format of trap gates is as follows:
- 0-15 bits : low 16 bits of a pointer to the kernel function which needs to be invoked when this gate is hit
- 16-31 bits : indicates the index of the segment descriptor in the GDT (Global Descriptor Table)
- 32-36 bits : these bits are reserved and are not currently used.
- 37-39 bits : always 000, not used
- 40-43 bits : specify the type of entry (its value for trap gate is 1111)
- 44th bit : always 0, not used
- 45-46 bits : this specifies the DPL (Descriptor Privilege Level) of the gate entry.
- 47th bit : specifies if this entry is valid or not (1 - valid, 0 - invalid)
- 48-63 bits : high 16 bits of a pointer to the kernel function which needs to be invoked when this gate is hit
Trap gates are used to handle exceptions generated by the CPU. Bits 0-15 and bits 48-63 together form the pointer to a kernel function (an offset within the segment identified by bits 16-31 of this entry). The only difference between trap gates and interrupt gates is that whenever an interrupt gate is hit, the CPU automatically disables interrupts by clearing the IF flag in the CPU's EFLAGS register. In the case of a trap gate this is not done and interrupts remain enabled. As mentioned earlier, trap gates are used for exceptions, so in the Linux kernel the first 32 entries in the IDT are initialized with trap gates. In addition, the Linux kernel also uses a trap gate for the system call entry (entry 128 of the IDT).
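As a rough illustration of the bit layout listed above, the following C sketch packs a handler offset, a segment value, a type and a DPL into one 64-bit gate. The function names and the example values are hypothetical; this is a model of the format, not the kernel's own code for setting up gates.

```c
#include <assert.h>
#include <stdint.h>

#define GATE_TYPE_INTERRUPT 0xEu  /* type field 1110 */
#define GATE_TYPE_TRAP      0xFu  /* type field 1111 */

/* Build an 8-byte trap or interrupt gate from its fields. */
static uint64_t make_gate(uint32_t handler_offset, uint16_t segment,
                          unsigned type, unsigned dpl)
{
    assert(type <= 0xFu && dpl <= 3u);

    uint64_t gate = 0;
    gate |= (uint64_t)(handler_offset & 0xFFFFu);    /* bits 0-15: low half of offset   */
    gate |= (uint64_t)segment << 16;                 /* bits 16-31: segment in the GDT  */
                                                     /* bits 32-39: reserved / zero     */
    gate |= (uint64_t)type << 40;                    /* bits 40-43: gate type           */
                                                     /* bit 44: always 0                */
    gate |= (uint64_t)dpl << 45;                     /* bits 45-46: DPL                 */
    gate |= (uint64_t)1 << 47;                       /* bit 47: entry is valid          */
    gate |= (uint64_t)(handler_offset >> 16) << 48;  /* bits 48-63: high half of offset */
    return gate;
}

/* Reassemble the 32-bit handler offset from the two halves of a gate. */
static uint32_t gate_offset(uint64_t gate)
{
    return (uint32_t)(gate & 0xFFFFu) | ((uint32_t)(gate >> 48) << 16);
}
```

Calling `make_gate(0xC0123456u, 0x0010, GATE_TYPE_TRAP, 3)` with these made-up values yields `0xC012EF0000103456`: the offset is split across the two ends of the entry, and the byte `0xEF` in the middle carries the valid bit, the DPL and the type.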
c). Interrupt Gates
Format is as follows:
- 0-15 bits : low 16 bits of a pointer to the kernel function which needs to be invoked when this gate is hit
- 16-31 bits : indicates the index of the segment descriptor in the GDT (Global Descriptor Table)
- 32-36 bits : these bits are reserved and are not currently used.
- 37-39 bits : always 000, not used
- 40-43 bits : specify the type of entry (its value for interrupt gate is 1110)
- 44th bit : always 0, not used
- 45-46 bits : this specifies the DPL (Descriptor Privilege Level) of the gate entry.
- 47th bit : specifies if this entry is valid or not (1 - valid, 0 - invalid)
- 48-63 bits : high 16 bits of a pointer to the kernel function which needs to be invoked when this gate is hit
The format of interrupt gates is the same as that of the trap gates explained above, except for the value of the type field (bits 40-43). For trap gates it has the value 1111 and for interrupt gates it has the value 1110.
Note: whenever an interrupt gate is hit, interrupts are disabled automatically.
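The three gate formats can be told apart by the type field alone, and that field also decides what happens to the IF flag on entry. A minimal C sketch of this decoding follows; all names are hypothetical and the model covers only the fields listed in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum gate_kind { GATE_TASK, GATE_INTERRUPT, GATE_TRAP, GATE_UNKNOWN };

/* Decode the type field (bits 40-43) of an 8-byte IDT entry. */
static enum gate_kind gate_kind_of(uint64_t gate)
{
    switch ((unsigned)((gate >> 40) & 0xFu)) {
    case 0x5: return GATE_TASK;       /* 0101 */
    case 0xE: return GATE_INTERRUPT;  /* 1110 */
    case 0xF: return GATE_TRAP;       /* 1111 */
    default:  return GATE_UNKNOWN;
    }
}

/* DPL of the gate entry (bits 45-46). */
static unsigned gate_dpl(uint64_t gate)
{
    return (unsigned)((gate >> 45) & 0x3u);
}

/* Valid bit of the gate entry (bit 47). */
static bool gate_valid(uint64_t gate)
{
    return ((gate >> 47) & 0x1u) != 0;
}

/* State of the IF flag once the handler starts running: only an
 * interrupt gate clears it, a trap gate leaves it as it was. */
static bool interrupt_flag_after_entry(uint64_t gate, bool if_before)
{
    return gate_kind_of(gate) == GATE_INTERRUPT ? false : if_before;
}
```

For example, the made-up entry `0xC0128E0000103456` decodes as a valid interrupt gate with DPL 0, so a handler entered through it starts with interrupts disabled.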
Hardware Checks for Interrupts and Exceptions
Whenever an exception or interrupt occurs, the corresponding trap/interrupt gate is hit and the CPU performs some checks on the fields of these gates. The CPU does the following:
1). gets the i-th entry from the IDT (the base address and size of the IDT are stored in the IDTR register of the CPU); here 'i' is the interrupt number.
2). reads the segment descriptor index from bits 16-31 of the IDT entry; let's call this 'n'.
3). gets the segment descriptor from the n-th entry in the GDT (the base address and size of the GDT are stored in the GDTR register of the CPU).
4). checks that the DPL of the n-th entry in the GDT is less than or equal to the CPL (Current Privilege Level, held in the read-only lowest two bits of the CS register). If DPL > CPL, the CPU generates a general protection exception. We will discuss later what this check means and why it is done. In short:
DPL (of GDT entry) > CPL : general protection exception
If DPL (of GDT entry) < CPL, we are entering a higher privilege level (probably going from user to kernel mode). In this case the CPU switches the hardware stack (SS and ESP registers) from the currently running process's user mode stack to its kernel mode stack. We will see later exactly how this stack switch is done. Note: the stack switch is mentioned here, but it actually happens after the 5th step below.
5). for software interrupts (generated by the assembly instruction 'int'), one more check is done. This check is not performed for hardware interrupts (interrupts generated by system devices and forwarded by the PIC). In short:
DPL (of IDT entry) >= CPL : ok, we have permission to enter through this gate
DPL (of IDT entry) < CPL : general protection exception
6). switches the stack if DPL (of GDT entry) < CPL. In addition, the mode of the CPU (the least significant two bits of CS) is changed from CPL to the DPL of the GDT entry.
7). if the stack switch has taken place (SS and ESP registers reset to the kernel stack), pushes the old values of SS and ESP (pointing to the user stack) onto this new stack (the kernel stack).
8). pushes the EFLAGS, CS and EIP registers onto the stack (note: we are now working on the kernel stack). This saves the pointer to the user application instruction to which we need to return after servicing the interrupt or exception.
9). in the case of exceptions, if there is a hardware error code, the processor pushes that onto the kernel stack as well.
10). loads CS with the segment value from bits 16-31 of the IDT entry (which identifies the GDT entry checked above) and EIP with the offset stored in the IDT entry (bits 0-15 and bits 48-63).
All of the above is done by the CPU hardware without executing any software instruction. The checks performed in steps 4 and 5 above are important.
The 4th check makes sure that the code we are about to execute (the Interrupt Service Routine) does not lie in a segment with lower privilege than the current one. Obviously the ISR cannot be in a less privileged segment than the one we are in. DPL and CPL can take four values (0, 1 and 2 for kernel mode and 3 for user mode). Of these four, Linux uses only two: 0 (kernel mode) and 3 (user mode).
The 5th check makes sure that an application can enter kernel mode only through specific gates; in Linux this is the 128th gate entry, which is used for system call invocation. If we set the DPL field of an IDT entry to 0, 1 or 2, an application program (running with CPL 3) cannot enter through that gate entry. If it tries, the CPU generates a general protection exception. This is why in Linux the DPL fields of the IDT entries are initialized to 0, except for the 128th entry used for system calls (and a few exception entries, such as breakpoint and overflow, that user code is allowed to raise with 'int3' and 'into'); this makes sure only kernel code can use these gates, not application code. In Linux the 128th entry (used for system calls) is a trap gate and its DPL is initialized to 3, so that application code can enter through this gate with the assembly instruction "int 0x80".
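The two privilege checks explained above can be condensed into a few lines of C. This is a simplified model under stated assumptions: the struct, the enum and the function name are hypothetical, and everything else the hardware does during entry is left out.

```c
#include <assert.h>
#include <stdbool.h>

enum entry_result { ENTRY_OK, ENTRY_GENERAL_PROTECTION };

struct entry_request {
    unsigned cpl;             /* current privilege level (low two bits of CS) */
    unsigned segment_dpl;     /* DPL of the GDT entry named by the gate       */
    unsigned gate_dpl;        /* DPL of the IDT entry itself                  */
    bool software_interrupt;  /* true if raised by an 'int' instruction       */
};

/* Apply the 4th and 5th checks; on success also report whether the CPU
 * has to switch to a more privileged stack. */
static enum entry_result check_entry(const struct entry_request *req,
                                     bool *stack_switch)
{
    assert(req->cpl <= 3 && req->segment_dpl <= 3 && req->gate_dpl <= 3);
    *stack_switch = false;

    /* 4th check: the handler's segment must not be less privileged
     * than the code that is running now. */
    if (req->segment_dpl > req->cpl)
        return ENTRY_GENERAL_PROTECTION;

    /* 5th check: only for 'int', the gate must be open to the caller. */
    if (req->software_interrupt && req->gate_dpl < req->cpl)
        return ENTRY_GENERAL_PROTECTION;

    /* Entering a more privileged level requires a stack switch. */
    *stack_switch = req->segment_dpl < req->cpl;
    return ENTRY_OK;
}
```

In this model a user program (CPL 3) executing "int 0x80" against a gate with DPL 3 is let through and triggers a stack switch, while the same program issuing 'int' against a gate with DPL 0 gets a general protection exception. A hardware interrupt arriving in user mode skips the 5th check and is always let through.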
Now let's see how the stack switch happens when DPL (of GDT entry) < CPL. The CPU has a TR (Task Register), which points to the TSS (Task State Segment) of the currently running process. The TSS is an architecture-defined data structure which holds the state of the processor registers whenever a context switch of this process happens. The TSS includes three sets of SS and ESP fields, one for each of the processor levels 0, 1 and 2. These fields specify the stack to be used whenever we enter that processor level. Let's say the DPL value in the GDT entry is 0; in this case the CPU loads the SS register with the value of the SS field for level 0 in the TSS and the ESP register with the value of the ESP field for level 0 in the TSS. After loading SS and ESP with these values, the CPU is pointing to the kernel level stack of the current process. The old values of SS and ESP (which the CPU keeps internally during the switch) are now pushed onto this new kernel level stack; this is done because we need to return to the old stack once we have serviced the interrupt, exception or system call. Attentive readers may be wondering why there is no field for a level 3 stack in the TSS. The reason is that the TSS is never used to switch from a higher CPU level (kernel mode - 0, 1 and 2) to a lower CPU level (user mode - 3). This is why the CPU, while entering the higher level (kernel mode), saves the previously used lower level stack (user mode) on the kernel stack.
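The stack switch described above, together with the pushes from steps 7 and 8, can be modelled in C as follows. This is a toy sketch: the kernel stack is a small array and ESP on that stack is an index into it rather than a real address, and all struct names, function names and register values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_WORDS 16u  /* size of the toy kernel stack, in 32-bit words */

/* The part of the TSS that matters here: one SS/ESP pair per level 0-2. */
struct tss_model {
    uint16_t ss[3];
    uint32_t esp[3];  /* in this model: an index into kernel_stack.words */
};

struct cpu_regs {
    uint16_t ss, cs;
    uint32_t esp, eip, eflags;
};

struct kernel_stack {
    uint32_t words[KSTACK_WORDS];
};

/* Push one word; the stack grows towards lower indices. */
static void push(struct cpu_regs *cpu, struct kernel_stack *stack, uint32_t value)
{
    assert(cpu->esp > 0 && cpu->esp <= KSTACK_WORDS);
    cpu->esp--;
    stack->words[cpu->esp] = value;
}

/* Enter a more privileged level: switch stacks using the TSS, save the
 * old stack and the return point, then jump to the handler. */
static void enter_kernel(struct cpu_regs *cpu, const struct tss_model *tss,
                         struct kernel_stack *stack, unsigned target_level,
                         uint16_t handler_cs, uint32_t handler_eip)
{
    assert(target_level < 3);  /* the TSS has no stack fields for level 3 */

    /* The old user stack is kept aside while SS and ESP are reloaded. */
    uint16_t old_ss = cpu->ss;
    uint32_t old_esp = cpu->esp;

    cpu->ss = tss->ss[target_level];
    cpu->esp = tss->esp[target_level];

    push(cpu, stack, old_ss);       /* step 7: old SS and ESP      */
    push(cpu, stack, old_esp);
    push(cpu, stack, cpu->eflags);  /* step 8: EFLAGS, CS and EIP  */
    push(cpu, stack, cpu->cs);
    push(cpu, stack, cpu->eip);

    cpu->cs = handler_cs;           /* step 10: jump to the handler */
    cpu->eip = handler_eip;
}
```

After `enter_kernel()` returns, the top five words of the toy kernel stack hold, from the top down, EIP, CS, EFLAGS, the old ESP and the old SS, which is everything needed to return to the interrupted user code later.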
Once all of this is done, the CPU's CS and EIP registers point to the kernel function written to handle the interrupt or exception. The CPU simply starts executing instructions at this point (we are now in kernel mode, level 0).
Linux Kernel support for Handling Interrupts - Details of do_IRQ() function, core of Interrupt Handling
As this is the software side of interrupt handling and may be of interest to a wider audience, I wrote it up on a separate page; please find it [:KernelHacking-HOWTO/Subsystems/Exceptions and Interrupts Handling/Details of do IRQ() function:here].