CS 162
Spring 1996
Homework 4 -- sample solution and discussion
1 Condition variables from semaphores
Consider the following Condition class implementation:

    class Condition {
        Semaphore *semaphore;
        int numWaiting;
    };

    Condition::Condition() {
        semaphore = new Semaphore(0);
        numWaiting = 0;
    }

    void Condition::Wait(Lock *lock) {
        numWaiting++;
        lock->Release();
        semaphore->P();
        lock->Acquire();
    }

    void Condition::Signal(Lock *lock) {
        if (numWaiting > 0) {
            numWaiting--;
            semaphore->V();
        }
    }
This code implements neither Mesa-style nor Hoare-style condition variables. To see why the implementation fails, consider three threads A, B, and C such that A and C run the code fragment:
    ...
    lock->Acquire();
    condition->Wait(lock);
    lock->Release();
    ...
and B runs the code fragment:

    ...
    lock->Acquire();
    condition->Signal(lock);
    lock->Release();
    ...

Finally, consider the case in which these threads execute their Condition method calls in the sequence <A, B, C>. It is possible for a context switch to occur immediately after A releases the lock, such that B executes its critical region and C acquires the lock and executes before A is scheduled again.
While running Condition::Wait, A incremented numWaiting but had not yet called semaphore->P. Therefore B, running Condition::Signal, calls semaphore->V and the semaphore takes the value 1. Then C acquires the lock and may call semaphore->P before A runs again; but this means that C decrements the semaphore back to 0 and does not block. In effect, Condition::Signal may awaken a thread that calls Condition::Wait after the signal! This is a clear violation of the semantics of condition variables.
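For contrast, here is a minimal sketch (assuming the Nachos Semaphore and List interfaces; this is not the actual Nachos condition-variable code) of a semaphore-based scheme that avoids the lost-wakeup problem: each waiter blocks on a private semaphore that is enqueued before the lock is released, so a signal can never be consumed by a later waiter.

    class Condition2 {
      public:
        Condition2() { waiters = new List; }
        void Wait(Lock *lock);
        void Signal(Lock *lock);
      private:
        List *waiters;          // FIFO queue of Semaphore*, one per waiter
    };

    void Condition2::Wait(Lock *lock) {
        Semaphore *self = new Semaphore(0);  // private wakeup channel
        waiters->Append(self);  // enqueue BEFORE releasing the lock...
        lock->Release();
        self->P();              // ...so a V issued between Release and P
        lock->Acquire();        // is banked in self rather than lost
        delete self;
    }

    void Condition2::Signal(Lock *lock) {
        Semaphore *waiter = (Semaphore *)waiters->Remove();
        if (waiter != NULL)
            waiter->V();        // wakes a thread that was already waiting;
                                // a later Wait allocates a new semaphore
                                // and so cannot steal this V
    }

Because each V is delivered to a semaphore enqueued before the signal, a thread that calls Wait after the signal cannot consume it; this is exactly the property the implementation above lacks.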
The assignment also asks whether the solution can be any simpler than the one currently in Nachos. The immediate response to this question should be to ask, "simpler in what way?" Had the code above successfully implemented the needed semantics, it would certainly have used fewer lines of code and incurred less overhead in both time and space.
However, as everyone who did this assignment should realize, the code above is much more difficult to reason about than the concrete implementation currently in Nachos. Unless one has the luck of immediately seeing the counter-example used above, a tedious case analysis must be undertaken in an attempt to derive a proof of correctness for the algorithm; barring a mistake in one of the numerous steps, such an analysis will eventually produce a counter-example showing that the implementation fails. By comparison, the current implementation has a transparent proof of correctness built into it: an intuitive proof can be constructed around the concrete invariants of the shared data structures.
When development resources are limited, simplicity of validation is an important metric with which to evaluate implementation strategies.
2 Bridge control: synchronizing with condition variables
The bridge controller must prevent collisions by restricting bridge travel to one direction at a time. Further, it must limit the number of vehicles simultaneously on the bridge to 3 or fewer.
2.a Bridge controller
This class-based implementation of the bridge controller encapsulates all of the required routines behind the thread-safe method OneVehicle:
    class Bridge {
        // This class is a thread-safe controller for bridge traffic.
        // Do not use members concurrently with constructor/destructor.
      public:
        Bridge();      // initialize an empty bridge controller
        ~Bridge();     // destroy the bridge controller

        enum Direction { East, West };  // used to specify travel direction

        void OneVehicle(Direction direc);
            // cross the bridge safely.
            // satisfies the requirements of the protected methods.
            // will never deadlock, but starvation is possible.

      protected:
        void ArriveBridge(Direction direc);
            // when this returns, it is safe for the vehicle to cross
        void CrossBridge(Direction direc);
            // travel across the bridge.  only legal when it is safe!
        void ExitBridge(Direction direc);
            // called after a vehicle has crossed safely.
            // only legal to call after calling ArriveBridge.
        bool isSafe(Direction direc);
            // only legal to call while holding lock

        Lock lock;               // guards all controller state
        Condition safe;          // signaled when crossing may be safe
        Direction currentDirec;  // direction of current bridge traffic
        int currentNumber;       // number of vehicles now on the bridge
    };

    Bridge::Bridge() {
        currentNumber = 0;
    }

    Bridge::~Bridge() { }

    void Bridge::OneVehicle(Bridge::Direction direc) {
        ArriveBridge(direc);
        CrossBridge(direc);
        ExitBridge(direc);
    }

    void Bridge::ArriveBridge(Bridge::Direction direc) {
        lock.Acquire();
        while ( ! isSafe(direc) )
            safe.Wait(&lock);    // recheck on every wakeup (Mesa style)
        currentNumber++;
        currentDirec = direc;
        lock.Release();
    }

    void Bridge::CrossBridge(Bridge::Direction direc) {
        // go, man, go!
    }

    void Bridge::ExitBridge(Bridge::Direction direc) {
        lock.Acquire();
        currentNumber--;
        safe.Signal(&lock);
        lock.Release();
    }

    bool Bridge::isSafe(Bridge::Direction direc) {
        if ( currentNumber == 0 )
            return TRUE;    // always safe when bridge is empty
        else if ( (currentNumber < 3) && (currentDirec == direc) )
            return TRUE;    // room for us to follow others in direc
        else
            return FALSE;   // bridge is full or has oncoming traffic
    }
Because the above implementation provides a thread-safe public interface, it is easier to reason about the correctness of client programs: a proof of correctness is derived once for the interface, and all future client threads can then use it with impunity.
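As a hypothetical usage sketch (the VehicleThread name and the thread-forking details are illustrative, not part of the assignment's required interface), each vehicle reduces to a single call into the shared controller:

    Bridge theBridge;                 // one controller shared by all vehicles

    void VehicleThread(int direc) {   // body run by each vehicle's thread
        theBridge.OneVehicle((Bridge::Direction) direc);
    }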
2.b Controller behavior
If a car A arrives (meaning it calls OneVehicle) while traffic is currently moving in its direction and a car B is currently waiting to cross in the opposite direction (meaning its call to OneVehicle has already blocked in ArriveBridge), then depending on the situation either car might cross next. If there is only one car C on the bridge and it exits (running ExitBridge) before A acquires the lock in ArriveBridge, then B will be awake and competing for the lock; depending on which car gets the lock, either might enter the empty bridge. If A acquires the lock before C can exit the bridge, then A will get to cross before B can possibly observe a safe condition. Therefore, the solution permits starvation.
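To make the starvation scenario concrete, here is one possible interleaving, written out under the Mesa-style semantics of the condition variables (the time-step labels are illustrative):

    // Initially: eastbound car C is on the bridge (currentNumber == 1,
    // currentDirec == East); westbound car B is blocked in safe.Wait
    // inside ArriveBridge; eastbound car A calls OneVehicle.
    //
    // t1: C runs ExitBridge: currentNumber becomes 0, and safe.Signal
    //     readies B, but B must still reacquire the lock.
    // t2: A acquires the lock first, sees isSafe(East) == TRUE (the
    //     bridge is empty), and enters: currentNumber = 1, direc = East.
    // t3: B finally runs, rechecks isSafe(West), finds it FALSE, and
    //     waits again.
    //
    // If an eastbound car keeps winning the race at step t2, B waits
    // forever: this is the starvation the solution permits.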
3 Handling deadlock
Algorithms to handle deadlock differ in the amount of concurrency they permit and in their runtime costs.
3.a Concurrency ratings
In order from most concurrent to least, there is a rough partial order on the deadlock-handling algorithms:

1. Detect deadlock and kill a deadlocked thread, releasing its resources; detect deadlock and roll back a deadlocked thread's actions; restart a thread that waits too long. None of these algorithms limits concurrency before deadlock occurs, since they rely on runtime checks rather than static restrictions. Their effects after deadlock is detected are harder to characterize: they still allow lots of concurrency (in some cases they enhance it), but the computation may no longer be sensible or efficient. The third algorithm is the strangest, since so much of its concurrency will be useless repetition; because threads compete for execution time, this algorithm also prevents useful computation from advancing. Hence it is listed twice in this ordering, at both extremes.

2. The banker's algorithm; resource ordering (a sketch of the latter follows this list). These algorithms cause more unnecessary waiting than the previous two by restricting the range of allowable computations. The banker's algorithm prevents unsafe allocations (a proper superset of deadlock-producing allocations), and resource ordering restricts allocation sequences so that threads have fewer options as to whether they must wait or not.

3. Reserve all resources in advance. This algorithm allows less concurrency than the previous two, but is less pathological than the worst one. By reserving all resources in advance, threads have to wait longer and are more likely to block other threads while they work, so the system-wide execution is in effect more linear.

4. Restart a thread that waits too long. As noted above, this algorithm has the dubious distinction of allowing both the most and the least amount of concurrency, depending on the definition of concurrency.
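The following is a minimal sketch of the resource-ordering discipline from the second tier, using the Nachos Lock interface; the lock names and thread bodies are illustrative only. Because every thread acquires the locks in the same global order, no cycle of waiting threads can form:

    Lock *lockA;    // global acquisition order: lockA before lockB
    Lock *lockB;    // (assume both are initialized before the threads run)

    void ThreadOne() {
        lockA->Acquire();   // both threads honor the same order...
        lockB->Acquire();
        // ... use both resources ...
        lockB->Release();
        lockA->Release();
    }

    void ThreadTwo() {
        lockA->Acquire();   // ...so no thread can hold lockB while
        lockB->Acquire();   // waiting for lockA, and no deadlock
        // ... use both resources ...
        lockB->Release();
        lockA->Release();
    }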
3.b Efficiency ratings
In order from most efficient to least, there is a rough partial order on the deadlock-handling algorithms:

1. Resource ordering; reserving all resources in advance. These algorithms are the most efficient because they involve no runtime overhead. Notice that this is a result of the same static restrictions which made them rank poorly in concurrency.

2. The banker's algorithm; deadlock detection (with a thread killed to break the deadlock). These algorithms involve runtime checks on allocations which are roughly equivalent in cost: the banker's algorithm performs a search to verify safety which costs O(nm) per pass over the threads and resource types (a sketch of this check appears at the end of this section), and deadlock detection performs a cycle-detection search which is O(n) in the length of resource-dependency chains. Resource-dependency chains are bounded by the number of threads, the number of resources, and the number of allocations.

3. Deadlock detection with rollback. This algorithm performs the same runtime check discussed previously, but also entails a logging cost which is O(n) in the total number of memory writes performed.

4. Restart a thread that waits too long. This algorithm is grossly inefficient for two reasons. First, because threads run the risk of being restarted, they have a low probability of completing. Second, they compete with other restarting threads for finite execution time, so the entire system advances towards completion slowly, if at all.
This ordering does not change when deadlock is more likely. The algorithms in the first group incur no additional runtime penalty because they statically disallow deadlock-producing executions. The second group incurs a minimal, bounded penalty when deadlock occurs. The algorithm in the third tier incurs the unrolling cost as well, which is O(n) in the number of memory writes performed between checkpoints. The status of the final algorithm is questionable because it does not allow deadlock to occur at all; it might be the case that restarting becomes more expensive, but the behavior of this restart algorithm is so variable that accurate comparative analysis is nearly impossible.
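To make the cost of the banker's safety check concrete, here is a sketch in the same style as the code above; the array layout and the example sizes are illustrative assumptions, not part of the assignment. Each pass over the threads costs O(nm), and at most n passes are needed before every thread is marked finished or no progress is possible:

    const int N = 4;   // example number of threads
    const int M = 3;   // example number of resource types

    bool IsSafeState(int available[M], int need[N][M], int alloc[N][M]) {
        int work[M];
        bool finished[N] = { false };
        for (int j = 0; j < M; j++)
            work[j] = available[j];          // resources free right now

        for (int pass = 0; pass < N; pass++) {   // at most N passes
            bool progress = false;
            for (int i = 0; i < N; i++) {    // find a thread that can finish
                if (finished[i])
                    continue;
                bool canFinish = true;
                for (int j = 0; j < M; j++)
                    if (need[i][j] > work[j])
                        canFinish = false;
                if (canFinish) {             // simulate its completion:
                    for (int j = 0; j < M; j++)
                        work[j] += alloc[i][j];  // it returns its resources
                    finished[i] = true;
                    progress = true;
                }
            }
            if (!progress)
                break;                       // no thread can finish
        }
        for (int i = 0; i < N; i++)
            if (!finished[i])
                return false;                // some thread may deadlock
        return true;                         // a safe completion order exists
    }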
4 Scheduling
For batch processing, we know that the algorithm which minimizes average response time is Shortest Job First (SJF) and that the algorithm which maximizes average response time is Longest Job First (LJF). Recall that non-preemptive SJF and LJF perform equivalently for one workload--jobs of equal length--and that preemptive and non-preemptive SJF are equivalent, but preemptive LJF is worse than non-preemptive LJF. To evaluate other scheduling policies we should consider under what workloads (if any) they perform as well as SJF and under what workloads (if any) they perform as poorly as preemptive LJF.
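For example, given three jobs of lengths 1, 2, and 3 arriving together, SJF completes them at times 1, 3, and 6, for an average response time of (1 + 3 + 6)/3 = 10/3, while non-preemptive LJF completes them at times 3, 5, and 6, for an average of (3 + 5 + 6)/3 = 14/3.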
4.a FIFO
When jobs arrive in SJF order, FIFO is optimal and performs equivalently to SJF. When jobs arrive in LJF order, FIFO is pessimal and performs equivalently to non-preemptive LJF. When all jobs have equal length, FIFO performs equivalently to both SJF and non-preemptive LJF, which are of course equivalent under such a load.
4.b Round robin
For mixed workloads, round-robin performs equivalently regardless of arrival order; with mixed jobs, round-robin performs worse than SJF and non-preemptive LJF, but better than preemptive LJF. With uniformly long jobs round-robin is pessimal and performs equivalently to preemptive LJF; the shorter the jobs relative to the scheduling quantum, the closer round-robin is to FIFO, until the jobs fit within a single quantum and round-robin is optimal, simulating FIFO.
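As a concrete illustration of the pessimal case: with two jobs of length 10 and a quantum of 1, round-robin finishes the jobs at times 19 and 20, for an average response time of 19.5, while FIFO finishes them at times 10 and 20, for an average of 15.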
4.c Shortest time to completion first with preemption
Shortest time to completion first with preemption (STCFp) is equivalent to preemptive SJF. STCFp performs optimally for all workloads; its worst case is a single job, where it is also equivalent to either form of LJF.
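For example, if a job of length 10 arrives at time 0 and a job of length 1 arrives at time 1, STCFp preempts the long job: the short job finishes at time 2 (response time 1) and the long job at time 11, for an average of 6, whereas any non-preemptive policy that has already started the long job makes the short job wait until time 10 before it can even begin.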
4.d Multi-level feedback
For mixed workloads, multi-level feedback (with preemption) approximates STCFp, except that the length of an I/O-bound job must be redefined as the time until its next I/O; interactive jobs are therefore scheduled first, and the average response time for these sub-jobs, rather than for the entire jobs, is optimized. For uniform workloads, either I/O-bound or CPU-bound, this algorithm approximates round-robin and is approximately pessimal. However, this reflects an inadequacy of average response time as an evaluation metric as much as it reflects an inadequacy in multi-level feedback schedulers.
4.e Lottery scheduling with ticketing inversely proportional to execution
This scheduler approximates multi-level feedback, but with a randomizing effect. For mixed loads it is similarly near-optimal, favoring the short I/O sub-jobs. For uniform workloads it is similarly pessimal, approximating round-robin.
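A toy sketch of the lottery draw itself appears below; the array names and the rule for assigning tickets (here, inversely proportional to CPU time consumed so far) are illustrative assumptions about one way to implement the policy:

    #include <cstdlib>

    const int N = 3;                 // example number of runnable threads

    // tickets[i] is assumed to be set inversely proportional to thread
    // i's CPU usage, e.g. tickets[i] = BIG / (1 + cpuTimeUsed[i]), so
    // threads that have run the least hold the most tickets.
    int PickWinner(int tickets[]) {
        int total = 0;
        for (int i = 0; i < N; i++)
            total += tickets[i];     // count every ticket in the lottery
        int draw = rand() % total;   // choose one ticket uniformly
        for (int i = 0; i < N; i++) {
            if (draw < tickets[i])
                return i;            // thread i holds the winning ticket
            draw -= tickets[i];
        }
        return N - 1;                // not reached if total is consistent
    }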
Some of these scheduling policies exhibit more complex behavior than is explicitly described here. Much of the variation comes with workloads which are heterogeneous with respect to job-length and I/O frequency, where the policies have differing strengths and weaknesses. In addition, there are other metrics by which to judge policies besides average response time, such as fairness and other human-interface considerations.