[CS161] The Classic CV Error
This is a very technical post, largely for the benefit of the students of CS161: Operating Systems, for which I am a Teaching Fellow this semester. It may be useful to you if you're interested in operating systems for some reason, but if you're not in a CS mood today, maybe just move along.
From what I've seen as a TF for this course, it is very, very normal to write condition-variables code that looks like this:
struct cv {
struct semaphore *sem;
volatile int waiters;
}
void cv_wait(struct cv *cv, struct lock *lock) {
KASSERT(lk_do_i_hold(lock));
cv->waiters++;
lk_release(lock);
P(cv->sem);
lk_acquire(lock);
}
void cv_broadcast(struct cv *cy struct lock *lock) {
KASSERT(lk_do_i_hold(lock));
for (; cv->waiters > 0; cv->waiters--)
V(cv->sem);
}
This code is wrong (or, more specifically, badly synchronized). And it is such a common error that I'm choosing to dub it The Classic CV Error. It's subject to a race condition in e.g. the following case:
int before_and_after (int *the_thing, struct cv *thing_changed_cv, struct lock *thing_changed_lock) {
lk_acquire(thing_changed_lock);
int before = *the_thing;
cv_wait(thing_changed_cv, thing_changed_lock);
int after = *the_thing;
KASSERT(before != after);
lk_release(thing_changed_lock);
return compare_things(before,after);
}
void mess_around (int *the_thing, struct cv *thing_changed_cv, struct lock *thing_changed_lock) {
lk_acquire(thing_changed_lock);
mutate(the_thing);
cv_broadcast(thing_changed_cv, thing_changed_lock);
lk_release(thing_changed_lock);
}
This code is properly synchronized (and in particular, protects the KASSERT(before != after)), assuming that your locks and CVs