Recall discussion of interrupts in L09-syscalls:
Kernel execution occurs in one of two modes:
- Process context
  - system calls execute kernel code on behalf of a process
  - the current task_struct is available, allowing the task to be placed on a wait queue and have schedule() called to switch to another task
- Interrupt context
  - interrupt handlers run in interrupt context
  - schedule() cannot be called from interrupt context: there is no process to put to sleep
  - kmalloc(), copy_to/from_user() may trigger I/O which causes the caller to sleep until the I/O is satisfied; they can’t be called from interrupt context either
Only time-critical work should be dealt with in the handler so that we can return to the interrupted task ASAP. Push the remainder of the work to the “bottom half”.
For example, the two halves of dealing with network packet arrival could look like the sketch after this list:
Single interrupt will not nest, so handler need not be reentrant
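A minimal C sketch of the split, assuming a made-up network driver (the helper functions and the deferral mechanism are hypothetical placeholders; irqreturn_t and IRQ_HANDLED are the real Linux types):

/* Top half: runs in interrupt context; only time-critical work. */
irqreturn_t nic_interrupt(int irq, void *dev_id)
{
    ack_nic_irq(dev_id);          /* silence the NIC (hypothetical helper) */
    copy_packet_off_nic(dev_id);  /* grab packet before it's overwritten */
    schedule_bottom_half();       /* defer the rest (e.g., via a softirq) */
    return IRQ_HANDLED;
}

/* Bottom half: runs later, outside interrupt context. */
void nic_bottom_half(void)
{
    process_protocol_layers();    /* TCP/IP processing, checksums, ... */
    wake_up_waiting_socket();     /* unblock the task reading the socket */
}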
We’ve seen two synchronization primitives so far that allow us to enforce mutual exclusion:
semaphore: L05-ipc
pthread_mutex: L06-thread
For example, L06-thread’s bank demo showed that concurrent increments/decrements on a shared integer needed synchronization to prevent data corruption due to the interleaving of non-atomic operations:
pthread_mutex_lock(&balance_lock);
++balance;
pthread_mutex_unlock(&balance_lock);
Semaphore and pthread_mutex are examples of sleeping locks. The calling task is put to sleep while it waits for the critical section to become available.
Mutual exclusion can also be achieved using a spinning lock. Instead of sleeping until the critical section is free, a spin lock polls until it is free.
High-level idea: lock() polls until flag == 0, then sets flag = 1. unlock() sets flag = 0.
int flag = 0;

void lock(void) {
    while (flag == 1)
        ;
    // This gap between testing and setting the variable
    // creates a race condition!
    flag = 1;
}

void unlock(void) {
    flag = 0;
}
Race condition: what if task 1 is about to set flag = 1, but before it can, task 2 sees flag == 0?
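Concretely, the bad interleaving looks like this:

// task 1: while (flag == 1) ;   sees flag == 0, exits the loop
// task 2: while (flag == 1) ;   also sees flag == 0, exits the loop
// task 1: flag = 1;             enters the critical section
// task 2: flag = 1;             also enters the critical section –
//                               mutual exclusion is violated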
Correct implementation using the atomic test_and_set hardware instruction:
int flag = 0;

void lock(void) {
    while (test_and_set(&flag))
        ;
}

void unlock(void) {
    flag = 0;
}
In C pseudocode, the test_and_set hardware instruction looks like:
int test_and_set(int *lock) {
    int old = *lock;    // in hardware, this load and the store below
    *lock = 1;          // execute as one atomic operation
    return old;
}
spin_lock() / spin_unlock()
- keep the critical section as small as possible
- don’t call anything that may sleep while holding the lock – kmalloc(), copy_from_user()
- spin_lock() prevents kernel preemption by ++preempt_count
- no need to spin in uniprocessor – ++preempt_count is all spin_lock() does

spin_lock_irqsave() / spin_unlock_irqrestore()
- save current interrupt state, disable all interrupts on local CPU, and lock; unlock, then restore interrupts to how they were before
- need to use this version if the lock is something that an interrupt handler may try to acquire (see the sketch after this list)
- no need to worry about interrupts on other CPUs – spin lock will work normally
- again, no need to spin in uniprocessor – just ++preempt_count & disable irq

spin_lock_irq() / spin_unlock_irq()
- disable & enable irq assuming it was enabled to begin with
- should not be used in most cases
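A sketch of when the irqsave variant matters, assuming a made-up device driver (DEFINE_SPINLOCK, spin_lock_irqsave(), and irqreturn_t are real kernel APIs; the ring buffer and read_device_byte() are hypothetical):

#include <linux/spinlock.h>
#include <linux/interrupt.h>

static DEFINE_SPINLOCK(buf_lock);
static int buf[64], head, tail;   /* shared with the interrupt handler */

/* Interrupt context: the irq line is already masked on this CPU,
   so plain spin_lock() suffices here. */
static irqreturn_t dev_isr(int irq, void *dev_id)
{
    spin_lock(&buf_lock);
    buf[head] = read_device_byte();   /* hypothetical helper */
    head = (head + 1) % 64;
    spin_unlock(&buf_lock);
    return IRQ_HANDLED;
}

/* Process context (e.g., a read() syscall path): must disable local
   interrupts while holding the lock. */
static int dev_read_one(void)
{
    unsigned long flags;
    int v;

    spin_lock_irqsave(&buf_lock, flags);
    if (tail == head) {
        v = -1;                       /* buffer empty */
    } else {
        v = buf[tail];
        tail = (tail + 1) % 64;
    }
    spin_unlock_irqrestore(&buf_lock, flags);
    return v;
}

If dev_read_one() took the lock with plain spin_lock() and the device interrupted on the same CPU, dev_isr() would spin forever on a lock that can never be released – a deadlock.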
Sleeping lock incurs the cost of a context switch to put the caller to sleep: worthwhile only if the wait is longer than a context switch.
Spinning lock consumes CPU time by polling: acceptable only if the wait is short.
Can only use spin locks in interrupt context
Can’t sleep while holding spin lock
Recall from L10-run-wait-queues: there is a state transition from running to runnable because of preemption (e.g., by a timer interrupt).
Kernel cannot always rely on tasks to willingly yield the CPU – programs could run indefinitely otherwise.
Instead, the kernel tracks a per-process TIF_NEED_RESCHED flag. If it is set, preemption occurs by calling schedule() in the following cases (a simplified sketch follows the list):
- Returning to user space:
  - from a system call
  - from an interrupt handler
- Returning to kernel from an interrupt handler, only if preempt_count is zero
- preempt_count just became zero – right after spin_unlock(), for example
- Task running in kernel mode calls schedule() itself – blocking syscall, for example
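In pseudocode, the check made on each of these return paths looks roughly like this (a simplified sketch of the logic, not actual kernel source; the helper name is made up):

/* pseudocode – conceptually what happens on each path listed above */
void maybe_preempt(void)
{
    if (need_resched_flag_set(current) && preempt_count == 0)
        schedule();   /* preempt: switch to another runnable task */
}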
Last updated: 2022-02-28