
Preemption Disabling

Because the kernel is preemptive, a process in the kernel can stop running at any instant to allow a process of higher priority to run. This means a task can begin running in the same critical region as a task that was preempted. To prevent this, the kernel preemption code uses spin locks as markers of nonpreemptive regions. If a spin lock is held, the kernel is not preemptive. Because the concurrency issues with kernel preemption and SMP are the same, and the kernel is already SMP-safe, this simple change makes the kernel preempt-safe, too.
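In other words, acquiring a spin lock marks the start of a nonpreemptive region and releasing it marks the end. The following is only a conceptual sketch of that ordering (the function names are hypothetical stand-ins, not the kernel's actual spin lock implementation):

#include <linux/preempt.h>
#include <linux/spinlock.h>

/* conceptual sketch: a spin lock region doubles as a nonpreemptive region */
static void sketch_lock(spinlock_t *lock)
{
        preempt_disable();      /* kernel is now nonpreemptive on this processor */
        /* ... actually acquire the lock (other processors spin) ... */
}

static void sketch_unlock(spinlock_t *lock)
{
        /* ... release the lock ... */
        preempt_enable();       /* kernel is preemptive again (if the count is zero) */
}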

Or so we hope. In reality, some situations do not require a spin lock, but do need kernel preemption disabled. The most frequent of these situations is per-processor data. If the data is unique to each processor, there may be no need to protect it with a lock because only that one processor can access the data. If no spin locks are held, the kernel is preemptive, and it would be possible for a newly scheduled task to access this same variable, as shown here:

task A manipulates per-processor variable foo, which is not protected by a lock
task A is preempted
task B is scheduled
task B manipulates variable foo
task B completes
task A is rescheduled
task A continues manipulating variable foo

Consequently, even if this were a uniprocessor computer, the variable could be accessed pseudo-concurrently by multiple processes. Normally, this variable would require a spin lock (to prevent true concurrency on multiprocessing machines). If this were a per-processor variable, however, it might not require a lock.
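As a concrete illustration, consider a counter kept per processor in an array indexed by processor number (the array and function names here are hypothetical). On a preemptive kernel, the read-modify-write below can be interleaved exactly as in the sequence above:

#include <linux/smp.h>
#include <linux/threads.h>

/* hypothetical per-processor counter, one slot per possible processor */
static unsigned long foo[NR_CPUS];

static void touch_foo(void)
{
        /*
         * Unsafe without disabling preemption: the task can be preempted
         * in the middle of this increment, and another task scheduled on
         * the same processor can modify the same slot before it resumes.
         */
        foo[smp_processor_id()]++;
}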

To solve this, kernel preemption can be disabled via preempt_disable(). The call is nestable; you may call it any number of times. For each call, a corresponding call to preempt_enable() is required. The final corresponding call to preempt_enable() re-enables preemption. For example:

preempt_disable();
/* preemption is disabled ... */
preempt_enable();
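
Because the calls nest, a function that has disabled preemption can safely call another function that does the same; preemption is actually re-enabled only by the outermost preempt_enable(). A brief sketch (the function names are hypothetical):

static void inner(void)
{
        preempt_disable();      /* count: 1 -> 2 */
        /* ... */
        preempt_enable();       /* count: 2 -> 1, still nonpreemptive */
}

static void outer(void)
{
        preempt_disable();      /* count: 0 -> 1 */
        inner();
        preempt_enable();       /* count: 1 -> 0, preemption re-enabled */
}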

The preemption count stores the number of held locks and preempt_disable() calls. If the number is zero, the kernel is preemptive. If the value is one or greater, the kernel is not preemptive. This count is incredibly useful: it is a great way to do atomicity and sleep debugging. The function preempt_count() returns this value. See Table 9.9 for a listing of kernel preemption-related functions.

Table 9.9. Kernel Preemption-Related Functions

Function                        Description

preempt_disable()               Disables kernel preemption by incrementing the preemption counter

preempt_enable()                Decrements the preemption counter and checks and services any pending reschedules if the count is now zero

preempt_enable_no_resched()     Enables kernel preemption but does not check for any pending reschedules

preempt_count()                 Returns the preemption count
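
As noted above, the preemption count is handy for atomicity and sleep debugging. A hedged sketch of such a check, using a hypothetical helper rather than any existing kernel interface:

#include <linux/kernel.h>
#include <linux/preempt.h>

/* hypothetical sanity check for code that might sleep */
static void assert_preemptible(void)
{
        if (preempt_count() != 0)
                printk(KERN_WARNING "called while nonpreemptive; sleeping here would be a bug\n");
}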

As a cleaner solution to per-processor data issues, you can obtain the processor number (which presumably is used to index into the per-processor data) via get_cpu(). This function disables kernel preemption prior to returning the current processor number; the corresponding call to put_cpu() re-enables preemption when you are done with the data:

int cpu;

/* disable kernel preemption and set "cpu" to the current processor */
cpu = get_cpu();
/* manipulate per-processor data ... */

/* reenable kernel preemption, "cpu" can change and so is no longer valid */
put_cpu();
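
Tying this back to the earlier per-processor counter, a sketch of the same update performed safely with get_cpu() and put_cpu() (the data and function names remain hypothetical):

static void touch_foo_safely(void)
{
        int cpu;

        cpu = get_cpu();        /* preemption disabled; we stay on this processor */
        foo[cpu]++;             /* no other task can run here and race with us */
        put_cpu();              /* preemption re-enabled */
}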
