[CS161] The Classic CV Error
This is a very technical post, largely for the benefit of the students of CS161: Operating Systems, for which I am a Teaching Fellow this semester. It may be useful to you if you're interested in operating systems for some reason, but if you're not in a CS mood today, maybe just move along.
From what I've seen as a TF for this course, it is very, very normal to write condition-variables code that looks like this:
struct cv { struct semaphore *sem; volatile int waiters; } void cv_wait(struct cv *cv, struct lock *lock) { KASSERT(lk_do_i_hold(lock)); cv->waiters++; lk_release(lock); P(cv->sem); lk_acquire(lock); } void cv_broadcast(struct cv *cy struct lock *lock) { KASSERT(lk_do_i_hold(lock)); for (; cv->waiters > 0; cv->waiters--) V(cv->sem); }
This code is wrong (or, more specifically, badly synchronized). And it is such a common error that I'm choosing to dub it The Classic CV Error. It's subject to a race condition in e.g. the following case:
int before_and_after (int *the_thing, struct cv *thing_changed_cv, struct lock *thing_changed_lock) { lk_acquire(thing_changed_lock); int before = *the_thing; cv_wait(thing_changed_cv, thing_changed_lock); int after = *the_thing; KASSERT(before != after); lk_release(thing_changed_lock); return compare_things(before,after); } void mess_around (int *the_thing, struct cv *thing_changed_cv, struct lock *thing_changed_lock) { lk_acquire(thing_changed_lock); mutate(the_thing); cv_broadcast(thing_changed_cv, thing_changed_lock); lk_release(thing_changed_lock); }
This code is properly synchronized (and in particular, protects the KASSERT(before != after)
), assuming that your locks and CVs