Recall discussion of interrupts in L09-syscalls:
Kernel execution occurs in one of two modes:
- Process context
  - system calls execute kernel code on behalf of a process
  - the current task_struct is available, allowing the task to be placed on a wait queue and have schedule() called to switch to another task
- Interrupt context
  - interrupt handlers run in interrupt context
  - schedule() cannot be called from interrupt context: there is no process to put to sleep
  - kmalloc(), copy_to/from_user() may trigger I/O which causes the caller to sleep until the I/O is satisfied; they can’t be called from interrupt context either
Only time-critical work should be dealt with in the handler so that we can return to the interrupted task ASAP. Push the remainder of the work to the “bottom half”.
For example, the two halves of dealing with network packet arrival could look like the sketch after this list:
Single interrupt will not nest, so handler need not be reentrant
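A minimal C sketch of the split, assuming a made-up network driver (the helper functions and the deferral mechanism are hypothetical placeholders; irqreturn_t and IRQ_HANDLED are the real Linux types):

/* Top half: runs in interrupt context; only time-critical work. */
irqreturn_t nic_interrupt(int irq, void *dev_id)
{
    ack_nic_irq(dev_id);          /* silence the NIC (hypothetical helper) */
    copy_packet_off_nic(dev_id);  /* grab packet before it's overwritten */
    schedule_bottom_half();       /* defer the rest (e.g., via a softirq) */
    return IRQ_HANDLED;
}

/* Bottom half: runs later, outside interrupt context. */
void nic_bottom_half(void)
{
    process_protocol_layers();    /* TCP/IP processing, checksums, ... */
    wake_up_waiting_socket();     /* unblock the task reading the socket */
}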
We’ve seen two synchronization primitives so far that allow us to enforce mutual exclusion:
semaphore: L05-ipc
pthread_mutex: L06-thread
For example, L06-thread’s bank demo showed that concurrent increments/decrements on a shared integer needed synchronization to prevent data corruption due to the interleaving of non-atomic operations:
pthread_mutex_lock(&balance_lock);
++balance;
pthread_mutex_unlock(&balance_lock);
Semaphore and pthread_mutex are examples of sleeping locks. The calling task is put to sleep while it waits for the critical section to become available.
Mutual exclusion can also be achieved using a spinning lock. Instead of sleeping until the critical section is free, a spin lock polls until it is free.
High-level idea: lock() polls until flag == 0, then sets flag = 1. unlock() sets flag = 0.
int flag = 0;

void lock(void) {
    while (flag == 1)
        ;
    // This gap between testing and setting the variable
    // creates a race condition!
    flag = 1;
}

void unlock(void) {
    flag = 0;
}
Race condition: what if task 1 is about to set flag = 1, but before it can, task 2 sees flag == 0?
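Concretely, the bad interleaving looks like this:

// task 1: while (flag == 1) ;   sees flag == 0, exits the loop
// task 2: while (flag == 1) ;   also sees flag == 0, exits the loop
// task 1: flag = 1;             enters the critical section
// task 2: flag = 1;             also enters the critical section –
//                               mutual exclusion is violated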
Correct implementation using the atomic test_and_set hardware instruction:
int flag = 0;

void lock(void) {
    while (test_and_set(&flag))
        ;
}

void unlock(void) {
    flag = 0;
}
In C pseudocode, the test_and_set hardware instruction looks like:
int test_and_set(int *lock) {
    int old = *lock;    // in hardware, this load and the store below
    *lock = 1;          // execute as one atomic operation
    return old;
}
spin_lock() / spin_unlock()
- keep the critical section as small as possible
- don’t call anything that may sleep while holding the lock – kmalloc(), copy_from_user()
- spin_lock() prevents kernel preemption by ++preempt_count
- no need to spin in uniprocessor – ++preempt_count is all spin_lock() does

spin_lock_irqsave() / spin_unlock_irqrestore()
- save current interrupt state, disable all interrupts on local CPU, and lock; unlock, then restore interrupts to how they were before
- need to use this version if the lock is something that an interrupt handler may try to acquire (see the sketch after this list)
- no need to worry about interrupts on other CPUs – spin lock will work normally
- again, no need to spin in uniprocessor – just ++preempt_count & disable irq

spin_lock_irq() / spin_unlock_irq()
- disable & enable irq assuming it was enabled to begin with
- should not be used in most cases
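A sketch of when the irqsave variant matters, assuming a made-up device driver (DEFINE_SPINLOCK, spin_lock_irqsave(), and irqreturn_t are real kernel APIs; the ring buffer and read_device_byte() are hypothetical):

#include <linux/spinlock.h>
#include <linux/interrupt.h>

static DEFINE_SPINLOCK(buf_lock);
static int buf[64], head, tail;   /* shared with the interrupt handler */

/* Interrupt context: the irq line is already masked on this CPU,
   so plain spin_lock() suffices here. */
static irqreturn_t dev_isr(int irq, void *dev_id)
{
    spin_lock(&buf_lock);
    buf[head] = read_device_byte();   /* hypothetical helper */
    head = (head + 1) % 64;
    spin_unlock(&buf_lock);
    return IRQ_HANDLED;
}

/* Process context (e.g., a read() syscall path): must disable local
   interrupts while holding the lock. */
static int dev_read_one(void)
{
    unsigned long flags;
    int v;

    spin_lock_irqsave(&buf_lock, flags);
    if (tail == head) {
        v = -1;                       /* buffer empty */
    } else {
        v = buf[tail];
        tail = (tail + 1) % 64;
    }
    spin_unlock_irqrestore(&buf_lock, flags);
    return v;
}

If dev_read_one() took the lock with plain spin_lock() and the device interrupted on the same CPU, dev_isr() would spin forever on a lock that can never be released – a deadlock.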
Sleeping lock incurs the cost of a context switch to put the caller to sleep: worthwhile only if the wait is longer than a context switch.
Spinning lock consumes CPU time by polling: acceptable only if the wait is short.
Can only use spin locks in interrupt context
Can’t sleep while holding spin lock
Recall from L10-run-wait-queues: there is a state transition from running to runnable because of preemption (e.g., by a timer interrupt).
Kernel cannot always rely on tasks to willingly yield the CPU – programs could run indefinitely otherwise.
Instead, the kernel tracks a per-process TIF_NEED_RESCHED flag. If it is set, preemption occurs by calling schedule() in the following cases (a simplified sketch follows the list):
- Returning to user space:
  - from a system call
  - from an interrupt handler
- Returning to kernel from an interrupt handler, only if preempt_count is zero
- preempt_count just became zero – right after spin_unlock(), for example
- Task running in kernel mode calls schedule() itself – blocking syscall, for example
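In pseudocode, the check made on each of these return paths looks roughly like this (a simplified sketch of the logic, not actual kernel source; the helper name is made up):

/* pseudocode – conceptually what happens on each path listed above */
void maybe_preempt(void)
{
    if (need_resched_flag_set(current) && preempt_count == 0)
        schedule();   /* preempt: switch to another runnable task */
}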
Last updated: 2022-02-28