Building a Robust PintOS Thread Scheduler
Alarm Clock, Priority Donation, and MLFQ Scheduling
Introduction
In this comprehensive guide, we'll transform PintOS's basic round-robin thread scheduler into a sophisticated, production-quality scheduling system. Our journey encompasses three critical enhancements:
- Alarm Clock: Replace busy-waiting in timer_sleep() with an interrupt-driven blocking system
- Priority Scheduling: Implement nested priority donation to prevent priority inversion
- MLFQS: Integrate a Multi-Level Feedback Queue Scheduler for dynamic priority adjustment
By the end of this guide, you'll understand how to build a scheduler that efficiently handles I/O-bound sleeps, ensures high-priority threads never starve, and dynamically adapts to workload patterns using fixed-point arithmetic.
Project Overview
Objective
Transform PintOS's thread scheduler with three critical features:
| Feature | Purpose | Key Benefit |
|---------|---------|-------------|
| **Alarm Clock** | Replace busy-waiting with interrupt-driven blocking | Eliminate CPU waste during sleeps |
| **Priority Scheduling** | Implement nested priority donation | Prevent high-priority thread starvation |
| **MLFQS** | Dynamic priority adjustment based on CPU usage | Balance fairness and responsiveness |
Key Challenges
- Eliminating CPU waste during sleep operations
- Preventing high-priority thread starvation due to lock contention
- Balancing fairness and responsiveness in scheduling decisions
1. 🔔 Alarm Clock: Efficient Sleep Implementation
The Busy-Waiting Problem
The original timer_sleep() implementation suffered from a critical inefficiency:
```c
void timer_sleep(int64_t ticks) {
  int64_t start = timer_ticks();
  while (timer_ticks() < start + ticks) {
    /* Busy-wait: keep looping until enough ticks have passed. */
  }
}
```
Problems with this approach:
- 100% CPU utilization during seemingly idle periods
- Wasted cycles - threads perform no useful work while spinning
- Poor system responsiveness due to unnecessary CPU consumption
Solution: A Sorted Sleep Queue
Our solution maintains a sleep queue sorted by wake-up time, enabling efficient O(1) wake-up checks.
Data Structure
```c
struct sleeping_thread {
  int64_t wakeup_tick;    /* Absolute wake time in ticks */
  struct thread *thread;  /* Thread that is blocked */
  struct list_elem elem;  /* Sorted insertion element */
};

/* Global sleep queue, sorted by wakeup_tick */
static struct list sleeping_threads_list;
```
Implementation
```c
/* Comparison function for sorting by wakeup_tick */
static bool sleep_cmp_wakeup_tick(const struct list_elem *a,
                                  const struct list_elem *b,
                                  void *aux UNUSED) {
  struct sleeping_thread *sa = list_entry(a, struct sleeping_thread, elem);
  struct sleeping_thread *sb = list_entry(b, struct sleeping_thread, elem);
  return sa->wakeup_tick < sb->wakeup_tick;
}

/* Insert a sleeping_thread into the sorted sleep queue */
static void add_to_sleep_list(struct sleeping_thread *entry) {
  list_insert_ordered(&sleeping_threads_list,
                      &entry->elem,
                      sleep_cmp_wakeup_tick,
                      NULL);
}

/* Enhanced timer_sleep() implementation */
void timer_sleep(int64_t ticks) {
  if (ticks <= 0)
    return;

  int64_t wakeup_tick = timer_ticks() + ticks;

  struct sleeping_thread *entry = malloc(sizeof *entry);
  if (entry == NULL)
    return;  /* Out of memory: skip the sleep rather than crash. */
  entry->wakeup_tick = wakeup_tick;
  entry->thread = thread_current();

  /* Interrupts must be off so that inserting into the queue and
     blocking happen atomically with respect to the timer interrupt. */
  enum intr_level old_level = intr_disable();
  add_to_sleep_list(entry);
  thread_block();
  intr_set_level(old_level);
}
```
Wake-Up in Timer Interrupt Handler
```c
void timer_interrupt(struct intr_frame *args UNUSED) {
  ticks++;

  /* Wake up threads whose sleep time has arrived.  Because the list is
     sorted by wakeup_tick, we only ever need to examine its front. */
  while (!list_empty(&sleeping_threads_list)) {
    struct sleeping_thread *first =
        list_entry(list_front(&sleeping_threads_list),
                   struct sleeping_thread, elem);
    if (ticks >= first->wakeup_tick) {
      list_pop_front(&sleeping_threads_list);
      thread_unblock(first->thread);
      free(first);  /* Simplification: calling the allocator from an
                       interrupt handler is unsafe in general; see the
                       optimization ideas in Section 4. */
    } else {
      /* Since the list is sorted, no other thread is due yet. */
      break;
    }
  }

  thread_tick();
}
```
🎉 Key Benefits
- O(1) Wake-Up Checks: Only inspect the first element in sorted list
- Zero CPU Usage While Sleeping: Blocked threads removed from ready queue
- Safe Synchronization: Interrupt disabling ensures atomic operations
2. Priority Scheduling & Nested Donation
The Priority Inversion Problem
Consider this classic scenario:
Timeline: Thread L (priority 30) → Thread M (priority 40) → Thread H (priority 63)
1. Thread L acquires Lock A
2. Thread H attempts to acquire Lock A and blocks
3. Thread M preempts Thread L, preventing Lock A release
4. Thread H waits indefinitely despite having highest priority
Result: High-priority threads starve behind medium-priority threads! 😱
Solution: Nested Priority Donation
When a high-priority thread blocks on a lock held by a lower-priority thread, we donate the high priority to the lock holder. This donation propagates recursively through lock chains.
Enhanced Data Structures
```c
struct thread {
  int priority;               /* Effective priority (may be donated) */
  int ori_priority;           /* Original base priority */
  struct lock *waiting_lock;  /* Lock the thread is currently blocked on */
  struct list locks;          /* Locks held, sorted by highest waiter priority */
  struct list_elem elem;      /* For insertion into ready list or waiters list */
  /* … existing PintOS thread fields … */
};

struct lock {
  struct thread *holder;      /* Current owner of the lock, or NULL */
  struct list waiters;        /* Threads blocked on this lock, sorted by priority */
  struct list_elem lock_elem; /* For insertion into a thread's locks list */
  int priority;               /* Highest priority among threads waiting on this lock */
  /* … existing PintOS lock fields … */
};
```
Priority Donation Implementation
```c
void donate_priority(struct thread *donor, struct lock *lock) {
  struct thread *holder = lock->holder;
  if (holder == NULL)
    return;

  if (holder->priority < donor->priority) {
    holder->priority = donor->priority;
    /* Recursive donation through lock chains */
    if (holder->waiting_lock != NULL)
      donate_priority(holder, holder->waiting_lock);
  }
}
```
Lock Acquisition with Donation
```c
void lock_acquire(struct lock *lock) {
  enum intr_level old_level = intr_disable();
  struct thread *cur = thread_current();

  if (lock->holder != NULL) {
    cur->waiting_lock = lock;
    donate_priority(cur, lock);
    /* Insert cur into lock->waiters, highest priority first */
    list_insert_ordered(&lock->waiters,
                        &cur->elem,
                        thread_cmp_priority,
                        NULL);
    thread_block();
    /* Execution resumes here after lock_release() hands us the lock.
       lock_release() has already set lock->holder = cur, inserted the
       lock into our locks list, and cleared waiting_lock, so we must
       not repeat those steps here. */
    intr_set_level(old_level);
    return;
  }

  /* Uncontended case: take the lock directly. */
  lock->holder = cur;
  list_insert_ordered(&cur->locks,
                      &lock->lock_elem,
                      lock_cmp_priority,
                      NULL);
  intr_set_level(old_level);
}
```
Lock Release and Priority Restoration
```c
void lock_release(struct lock *lock) {
  enum intr_level old_level = intr_disable();
  struct thread *cur = thread_current();

  /* Remove lock from cur->locks */
  list_remove(&lock->lock_elem);

  if (!list_empty(&lock->waiters)) {
    /* Hand the lock directly to the highest-priority waiter. */
    struct list_elem *max_elem = list_pop_front(&lock->waiters);
    struct thread *next = list_entry(max_elem, struct thread, elem);
    lock->holder = next;
    list_insert_ordered(&next->locks,
                        &lock->lock_elem,
                        lock_cmp_priority,
                        NULL);
    next->waiting_lock = NULL;
    thread_unblock(next);
  } else {
    lock->holder = NULL;
  }

  /* Recompute cur's priority from its base and any remaining donations. */
  int donated = get_highest_donated_priority(cur);
  cur->priority = donated > cur->ori_priority ? donated : cur->ori_priority;

  /* Yield if a higher-priority thread is now ready. */
  if (!list_empty(&ready_list)) {
    struct thread *highest_ready =
        list_entry(list_front(&ready_list), struct thread, elem);
    if (cur->priority < highest_ready->priority)
      thread_yield();
  }

  intr_set_level(old_level);
}
```
Why Nested Donation Matters
Example Scenario:
- Thread H (priority 63) waits for Lock A → held by Thread M (priority 40)
- Thread M is blocked on Lock B → held by Thread L (priority 30)
Without nested donation:
- H donates 63 to M, but M remains blocked on B
- L stays at priority 30, allowing medium-priority threads to preempt
- H waits indefinitely!
With nested donation:
- H's donation propagates: M becomes priority 63
- M donates 63 to L (recursive propagation)
- L runs immediately at priority 63, releases B, then A
- H can finally proceed!
3. Multi-Level Feedback Queue Scheduler (MLFQS)
Why MLFQS?
Static priority scheduling has limitations:
- CPU-bound high-priority threads can monopolize the CPU
- Manual priority tuning burden on programmers
- No adaptation to changing workload patterns
MLFQS automatically:
- Tracks CPU usage (recent_cpu) with exponential decay
- Monitors system load (load_avg) reflecting the number of runnable threads
- Dynamically adjusts priorities using the formula:

priority = PRI_MAX - (recent_cpu / 4) - (nice × 2)
This BSD-inspired design promotes fairness between CPU-bound and I/O-bound workloads.
Fixed-Point Arithmetic Implementation
Since the PintOS kernel cannot use floating-point arithmetic, we implement 17.14 fixed-point math (17 integer bits and 14 fractional bits, plus a sign bit):
```c
typedef int fixed_point_t;

/* Convert integer n to fixed point */
static fixed_point_t int_to_fp(int n) {
  return n << 14;
}

/* Convert fixed-point x to the nearest integer (round half away from zero) */
static int fp_to_int_round(fixed_point_t x) {
  if (x >= 0)
    return (x + (1 << 13)) >> 14;
  else
    return (x - (1 << 13)) >> 14;
}

/* Fixed-point arithmetic operations */
static fixed_point_t fp_add(fixed_point_t x, fixed_point_t y) {
  return x + y;
}

static fixed_point_t fp_sub(fixed_point_t x, fixed_point_t y) {
  return x - y;
}

static fixed_point_t fp_mul(fixed_point_t x, fixed_point_t y) {
  return ((int64_t)x * y) >> 14;
}

static fixed_point_t fp_div(fixed_point_t x, fixed_point_t y) {
  return ((int64_t)x << 14) / y;
}
```
Enhanced Thread Structure
```c
struct thread {
  int nice;                  /* Niceness: -20 (high priority) .. +20 (low priority) */
  fixed_point_t recent_cpu;  /* Scaled CPU usage (17.14 fixed-point) */
  /* … existing fields … */
};

static fixed_point_t load_avg;  /* System load average (17.14 fixed-point) */
```
Key MLFQS Formulas
1. System Load Average
Updated once per second (every TIMER_FREQ ticks):
load_avg = (59/60) × load_avg + (1/60) × ready_threads
```c
void update_load_avg(void) {
  /* Count ready threads (exclude idle) */
  size_t ready_threads = list_size(&ready_list);
  if (thread_current() != idle_thread)
    ready_threads++;

  fixed_point_t term1 = fp_mul(fp_div(int_to_fp(59), int_to_fp(60)), load_avg);
  fixed_point_t term2 = fp_mul(fp_div(int_to_fp(1), int_to_fp(60)),
                               int_to_fp((int) ready_threads));
  load_avg = fp_add(term1, term2);
}
```
2. Recent CPU Usage
Updated per tick for running thread and per second for all threads:
recent_cpu = (2×load_avg)/(2×load_avg + 1) × recent_cpu + nice
```c
void update_recent_cpu_all(void) {
  /* The decay coefficient depends only on load_avg, so compute it once
     rather than on every loop iteration. */
  fixed_point_t coeff = fp_div(fp_mul(int_to_fp(2), load_avg),
                               fp_add(fp_mul(int_to_fp(2), load_avg),
                                      int_to_fp(1)));
  struct list_elem *e;
  for (e = list_begin(&all_list); e != list_end(&all_list); e = list_next(e)) {
    struct thread *t = list_entry(e, struct thread, allelem);
    if (t != idle_thread)
      t->recent_cpu = fp_add(fp_mul(coeff, t->recent_cpu), int_to_fp(t->nice));
  }
}
```
3. Priority Calculation
Updated every 4 ticks and when threads unblock:
```c
void update_priority(struct thread *t) {
  if (t == idle_thread)
    return;

  /* Compute recent_cpu / 4, rounding to nearest */
  int recent_div_4 = fp_to_int_round(fp_div(t->recent_cpu, int_to_fp(4)));
  int new_priority = PRI_MAX - recent_div_4 - (t->nice * 2);

  /* Clamp to valid range */
  if (new_priority > PRI_MAX) new_priority = PRI_MAX;
  if (new_priority < PRI_MIN) new_priority = PRI_MIN;
  t->priority = new_priority;
}
```
Integration into Timer Interrupt
```c
void thread_tick(void) {
  struct thread *cur = thread_current();

  if (thread_mlfqs) {
    /* Increment recent_cpu for the running thread */
    if (cur != idle_thread)
      cur->recent_cpu = fp_add(cur->recent_cpu, int_to_fp(1));

    /* Every second: update load_avg and all threads' recent_cpu */
    if (timer_ticks() % TIMER_FREQ == 0) {
      update_load_avg();
      update_recent_cpu_all();
    }

    /* Every 4 ticks: update the running thread's priority */
    if (timer_ticks() % 4 == 0)
      update_priority(cur);
  }

  /* … existing scheduling logic … */
}
```
📊 MLFQS Behavior Example
Consider two threads A and B, both with nice = 0:
Initial State:
priority_A = 63 - (0/4) - (0 × 2) = 63
priority_B = 63 - (0/4) - (0 × 2) = 63
Timeline:
| Ticks | Running Thread | recent_cpu_A | recent_cpu_B | priority_A | priority_B | Next Scheduled |
|-------|----------------|--------------|--------------|------------|------------|----------------|
| 0-7   | A              | 8            | 0            | 61         | 63         | B              |
| 8-11  | B              | 8            | 4            | 61         | 62         | B              |
| 12-15 | B              | 8            | 8            | 61         | 61         | Round-robin    |
Key Insight: CPU-bound threads accumulate higher recent_cpu, leading to lower priorities and better fairness! 🎯
4. 🔧 Design Trade-Offs and Optimizations
Alarm Clock: Overhead vs. Efficiency
| Pros ✅ | Cons ❌ |
|---------|---------|
| Sleeping threads consume zero CPU | O(N) insertion into sorted list |
| O(1) wake-up checks per tick | Memory allocation/deallocation overhead |
🚀 Optimization Ideas:
- Use binary min-heap for O(log N) insertion, O(1) peek
- Embed sleep entries in thread structure to avoid dynamic allocation
Priority Donation: Power vs. Complexity
| Pros ✅ | Cons ❌ |
|---------|---------|
| Solves priority inversion completely | Recursive donation can be expensive |
| Works with nested lock scenarios | Complex bookkeeping for held locks |
🚀 Optimization Ideas:
- Limit donation depth to prevent unbounded recursion
- Use max-heap for O(1) highest-donated priority lookup
MLFQS: Fairness vs. Performance
| Pros ✅ | Cons ❌ |
|---------|---------|
| Automatic workload balancing | Fixed-point arithmetic adds overhead |
| No manual priority tuning needed | O(T) updates every second for all threads |
🚀 Optimization Ideas:
- Stagger per-thread updates across multiple ticks
- Update priorities only on unblock/yield events
🎉 Conclusion
Our enhanced PintOS scheduler now delivers:
- ⚡ Idle-Time Efficiency: Zero CPU usage for sleeping threads via sorted sleep queues
- 🔒 Inversion-Resilient: Nested priority donation prevents high-priority thread starvation
- 🔄 Dynamic Fairness: MLFQS automatically balances CPU-bound vs. I/O-bound workloads
🔑 Key Takeaways
- Sorted Wait Queues enable O(1) wake-up performance critical for timer interrupts
- Nested Donation must correctly propagate through lock chains and restore priorities
- Fixed-Point Arithmetic allows BSD-style MLFQS without floating-point support
- Feature vs. Overhead Balance requires careful consideration of complexity costs
With these enhancements, PintOS transforms from a basic round-robin scheduler into a production-grade thread scheduler capable of handling diverse workloads efficiently and fairly! 🚀
📚 References
- Anderson, T., & Dahlin, M. Operating Systems: Principles and Practice
- 4.4BSD Scheduler Documentation (load_avg, recent_cpu, priority formulas)
- PintOS Project Specification: "Project 2: Alarm Clock, Priority Scheduling, and MLFQS"
- PintOS Source Code (Stanford University)
Built with ❤️ for operating systems education