CS 162
Spring 1996

Homework 4 -- sample solution and discussion

1 Condition variables from semaphores

Consider the following Condition class implementation:

  class Condition {
      Semaphore *semaphore;
      int numWaiting;
  };

  Condition::Condition() {
      semaphore = new Semaphore(0);
      numWaiting = 0;
  }

  void Condition::Wait(Lock *lock) {
      numWaiting++;
      lock->Release();
      semaphore->P();
      lock->Acquire();
  }

  void Condition::Signal(Lock *lock) {
      if (numWaiting > 0) {
        numWaiting--;
        semaphore->V();
      }
  }

This code implements neither Mesa-style nor Hoare-style condition variables. To see why it fails, consider three threads A, B, and C, where A and C run the code fragment:

  ...
  lock->Acquire();
  condition->Wait(lock);
  lock->Release();
  ...

and B runs the code fragment:

  ...
  lock->Acquire();
  condition->Signal(lock);
  lock->Release();
  ...
Finally, consider the case in which these threads execute their Condition method calls in the sequence <A, B, C>. A context switch can occur immediately after A releases the lock, so that B executes its critical region and C acquires the lock and runs before A is scheduled again.

While running Condition::Wait, A has incremented numWaiting but has not yet called semaphore->P. Therefore B, running Condition::Signal, calls semaphore->V and the semaphore takes the value 1. Then C acquires the lock and may call semaphore->P before A runs again, in which case it decrements the semaphore back to 0 and does not block. In effect, Condition::Signal may wake a thread that calls Condition::Wait after the signal! This is a clear violation of the semantics of condition variables.
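
For contrast, one well-known way to avoid this race when building condition variables from semaphores is to give each waiter its own private semaphore, enqueued while the lock is held, so that a signal can only be consumed by a thread that is already waiting. The sketch below is illustrative rather than the Nachos implementation; the Semaphore and Lock interfaces are assumed to be those used above, and std::list stands in for whatever queue class is available:

  #include <list>

  class Condition2 {
      std::list<Semaphore*> waiters;    // one private semaphore per waiter

    public:
      void Wait(Lock *lock) {
          Semaphore mine(0);
          waiters.push_back(&mine);     // enqueued while the lock is held,
                                        // so Signal can see this waiter
          lock->Release();
          mine.P();                     // sleep until Signal V's our semaphore
          lock->Acquire();
      }

      void Signal(Lock *lock) {         // caller must hold the lock
          if (!waiters.empty()) {
              waiters.front()->V();     // wake exactly one enqueued waiter
              waiters.pop_front();
          }
      }
  };

Because Signal only ever calls V on a semaphore that is already on the queue, a thread like C that calls Wait after the signal cannot steal the wakeup intended for A.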

The assignment also asks whether the solution can be any simpler than the one currently in Nachos. The immediate response to this question should be to ask, "simpler in what way?" Had the code above successfully implemented the needed semantics, it would certainly have used fewer lines of code and less time and space.

However, as everyone who did this assignment should realize, the code above is much more difficult to reason about than the concrete implementation currently in Nachos. Unless one had the luck of immediately seeing the counter-example above, a tedious case analysis had to be undertaken in an attempt to derive a proof of correctness for the algorithm; provided no error is made in any of the numerous steps, such an analysis eventually produces a counter-example showing that the implementation fails. By comparison, the implementation currently in Nachos has a transparent proof of correctness built into it: an intuitive proof can be constructed around the concrete invariants of the shared data structures.

When development resources are limited, simplicity of validation is an important metric with which to evaluate implementation strategies.

2 Bridge control: synchronizing with condition variables

The bridge controller must prevent collisions by restricting bridge travel to one direction at a time. Further, it must limit the number of vehicles simultaneously on the bridge to 3 or fewer.

2.a Bridge controller

This class-based implementation of the bridge controller encapsulates all of the required routines behind the thread-safe method OneVehicle:

  class Bridge {
    // this class is a thread-safe controller for bridge traffic.
    // do not use members concurrently with constructor/destructor
    //
    public:
      Bridge();   // initialize an empty bridge controller
      ~Bridge();  // destroy the bridge controller

      enum Direction { East, West };
      // used to specify travel direction...

      void OneVehicle(Direction direc);
      // cross the bridge safely.
      // satisfies the requirements of the protected methods.
      // will never deadlock, but starvation is possible.

    protected:
      void ArriveBridge(Direction direc);
      // when this returns, it is safe for the vehicle to cross

      void CrossBridge(Direction direc);
      // travel across the bridge.
      // only legal when it is safe!

      void ExitBridge(Direction direc);
      // called after a vehicle has crossed safely
      // only legal to call after calling ArriveBridge.

      bool isSafe(Direction direc);
      // only legal to call while holding lock

      Lock lock;
      Condition safe;

      Direction currentDirec;
      int currentNumber;
  };

  Bridge::Bridge() {
    currentNumber = 0;
  }

  Bridge::~Bridge() { }

  void Bridge::OneVehicle(Bridge::Direction direc) {
    ArriveBridge(direc);
    CrossBridge(direc);
    ExitBridge(direc);
  }

  void Bridge::ArriveBridge(Bridge::Direction direc) {
    lock.Acquire();
      while ( ! isSafe(direc) ) safe.Wait(&lock);
      currentNumber++;
      currentDirec = direc;
    lock.Release();
  }

  void Bridge::CrossBridge(Bridge::Direction direc) {
    // go, man, go!
  }

  void Bridge::ExitBridge(Bridge::Direction direc) {
    lock.Acquire();
      currentNumber--;
      safe.Signal(&lock);
    lock.Release();
  }

  bool Bridge::isSafe(Bridge::Direction direc) {
    if ( currentNumber == 0 ) 
      return TRUE;    // always safe when bridge is empty
    else if ( (currentNumber < 3) &&
              (currentDirec == direc) ) 
      return TRUE;    // room for us to follow others in direc
    else
      return FALSE;   // bridge is full or has oncoming traffic.
  }

Because the above implementation provides a thread-safe public interface, it is easier to reason about the correctness of client programs. A proof of correctness need be derived only once, for the interface, and all future client threads can then use it with impunity.
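
As an illustration, a hypothetical client might look like the following, where each vehicle is a thread making a single call into the interface (the Thread::Fork usage is Nachos-style, and the names are illustrative):

  Bridge *bridge;       // shared by all vehicle threads

  void VehicleThread(int which) {
      // alternate directions for the sake of the example
      Bridge::Direction direc = (which % 2 == 0) ? Bridge::East
                                                 : Bridge::West;
      bridge->OneVehicle(direc);    // all synchronization is internal
  }

  void BridgeTest(int numVehicles) {
      bridge = new Bridge;
      for (int i = 0; i < numVehicles; i++)
          (new Thread("vehicle"))->Fork(VehicleThread, i);
  }

No client ever touches the lock or the condition variable directly, which is precisely what makes the interface easy to use correctly.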

2.b Controller behavior

If a car A arrives (meaning it calls OneVehicle) while traffic is currently moving in its direction, and a car B is already waiting to cross in the opposite direction (meaning its call to OneVehicle has blocked in ArriveBridge), then depending on the timing either car might cross next. If there is only one car C on the bridge and it exits (running ExitBridge) before A acquires the lock in ArriveBridge, then B will have been signaled and will be competing for the lock; depending on which car gets the lock first, either might enter the empty bridge. If instead A acquires the lock before C exits the bridge, then A gets to cross before B can possibly observe a safe condition. Therefore, the solution permits starvation.

3 Handling deadlock

Algorithms to handle deadlock differ in the amount of concurrency they provide and in the runtime costs associated with the algorithms.

3.a Concurrency ratings

In order from most-concurrent to least, there is a rough partial order on the deadlock-handling algorithms:

  1. detect deadlock and kill thread, releasing its resources
    detect deadlock and roll back thread's actions
    *restart thread and release all resources if thread needs to wait

    None of these algorithms limit concurrency before deadlock occurs, since they rely on runtime checks rather than static restrictions. Their effects after deadlock is detected are harder to characterize: they still allow lots of concurrency (in some cases they enhance it), but the computation may no longer be sensible or efficient. The third algorithm is the strangest, since so much of its concurrency will be useless repetition; because threads compete for execution time, this algorithm also prevents useful computation from advancing. Hence it is listed twice in this ordering, at both extremes.

  2. banker's algorithm
    resource ordering

    These algorithms cause more unnecessary waiting than the previous ones by restricting the range of allowable computations. The banker's algorithm prevents unsafe allocations (a proper superset of deadlock-producing allocations), and resource ordering restricts allocation sequences so that threads have fewer choices about when they must wait (a sketch of resource ordering appears after this list).

  3. reserve all resources in advance

    This algorithm allows less concurrency than the previous two, but is less pathological than the worst one. By reserving all resources in advance, threads have to wait longer and are more likely to block other threads while they work, so the system-wide execution is in effect more linear.

  4. *restart thread and release all resources if thread needs to wait

    As noted above, this algorithm has the dubious distinction of allowing both the most and the least amount of concurrency, depending on the definition of concurrency.
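
To make the resource-ordering restriction concrete, here is a minimal sketch, assuming the Lock interface used earlier (the names are illustrative): every thread acquires locks in increasing index order, so no thread ever waits for a lower-numbered lock while holding a higher-numbered one, and a cycle of waiting threads cannot form.

  const int NumResources = 8;          // illustrative bound
  Lock resourceLock[NumResources];     // one lock per resource

  void AcquireBoth(int a, int b) {
      // enforce the global order before touching either lock
      if (a > b) { int tmp = a; a = b; b = tmp; }
      resourceLock[a].Acquire();
      resourceLock[b].Acquire();
  }

The cost of this discipline is the waiting described above: a thread that discovers it needs a lower-numbered lock while holding a higher-numbered one must release and reacquire in order.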

3.b Efficiency ratings

In order from most-efficient to least, there is a rough partial order on the deadlock-handling algorithms:

  1. reserve all resources in advance
    resource ordering

    These algorithms are most efficient because they involve no runtime overhead. Notice that this is a result of the same static restrictions which made these rank poorly in concurrency.

  2. banker's algorithm
    detect deadlock and kill thread, releasing its resources

    These algorithms involve runtime checks on allocations which are roughly comparable in cost: the banker's algorithm performs a search to verify safety (see the sketch following this list), which costs roughly O(m n^2) for n threads and m resource types, and deadlock detection performs a cycle-detection search which is O(n) in the length of resource-dependency chains. Resource-dependency chains are bounded by the number of threads, the number of resources, and the number of allocations.

  3. detect deadlock and roll back thread's actions

    This algorithm performs the same runtime check discussed previously but also entails a logging cost which is O(n) in the total number of memory writes performed.

  4. restart thread and release all resources if thread needs to wait

    This algorithm is grossly inefficient for two reasons. First, because threads run the risk of being restarted, they have a low probability of completing. Second, restarting threads compete with one another for finite execution time, so the entire system advances toward completion slowly, if at all.
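
For reference, the safety search mentioned in group 2 can be sketched as follows. This is a toy version, not code from Nachos: the array bounds and names are illustrative, and per-thread need and allocation tables are assumed. The check repeatedly finds a thread whose remaining needs fit within the available resources, reclaims that thread's allocation, and declares the state safe only if every thread can finish this way:

  const int MaxThreads = 16, MaxResources = 8;    // illustrative bounds

  // returns true iff every thread can run to completion in some order
  bool IsSafeState(int n, int m,       // n <= MaxThreads, m <= MaxResources
                   int available[MaxResources],
                   int need[MaxThreads][MaxResources],
                   int allocated[MaxThreads][MaxResources]) {
      int work[MaxResources];
      bool finished[MaxThreads];
      for (int r = 0; r < m; r++) work[r] = available[r];
      for (int t = 0; t < n; t++) finished[t] = false;

      bool progress = true;
      while (progress) {
          progress = false;
          for (int t = 0; t < n; t++) {
              if (finished[t]) continue;
              bool canFinish = true;
              for (int r = 0; r < m; r++)
                  if (need[t][r] > work[r]) { canFinish = false; break; }
              if (canFinish) {                  // thread t could run to completion,
                  for (int r = 0; r < m; r++)   // so reclaim its resources
                      work[r] += allocated[t][r];
                  finished[t] = true;
                  progress = true;
              }
          }
      }
      for (int t = 0; t < n; t++)
          if (!finished[t]) return false;       // some thread can never finish
      return true;
  }

Each pass over the threads costs O(n m), and at most n passes are needed, which is the source of the bound given above.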

This ordering does not change when deadlock is more likely. The algorithms in the first group incur no additional runtime penalty because they statically disallow deadlock-producing executions. The second group incurs a minimal, bounded penalty when deadlock occurs. The algorithm in the third tier incurs the rollback cost, which is O(n) in the number of memory writes performed between checkpoints. The final algorithm is hard to place because it never allows deadlock to occur; restarts might become more frequent and hence more costly, but the behavior of this restart algorithm is so variable that accurate comparative analysis is nearly impossible.

4 Scheduling

For batch processing, we know that the algorithm which minimizes average response time is Shortest Job First (SJF) and that the algorithm which maximizes average response time is Longest Job First (LJF). Recall that non-preemptive SJF and LJF perform equivalently for one workload--jobs of equal length--and that preemptive and non-preemptive SJF are equivalent, but preemptive LJF is worse than non-preemptive LJF. To evaluate other scheduling policies we should consider under what workloads (if any) they perform as well as SJF and under what workloads (if any) they perform as poorly as preemptive LJF.

4.a FIFO

When jobs arrive in SJF order, FIFO is optimal and performs equivalently to SJF. When jobs arrive in LJF order, FIFO is pessimal and performs equivalently to non-preemptive LJF. When all jobs have equal length FIFO performs equivalently to both SJF and non-preemptive LJF, which are of course equivalent under such a load.
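
For a concrete example (job lengths chosen arbitrarily): given three jobs of lengths 1, 2, and 3 that arrive together, FIFO in SJF order finishes them at times 1, 3, and 6, for an average response time of 10/3, while FIFO in LJF order finishes them at times 3, 5, and 6, for an average of 14/3.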

4.b Round robin

For mixed workloads, round robin performs the same regardless of arrival order; with mixed jobs it performs worse than SJF and non-preemptive LJF, but better than preemptive LJF. With uniformly long jobs round robin is pessimal, performing equivalently to preemptive LJF; the shorter the jobs, the closer round robin comes to FIFO, until the jobs are the same length as the scheduling quantum, at which point round robin simulates FIFO and is optimal.
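
For example, two jobs of length 10 under round robin with a quantum of 1 finish at times 19 and 20 (average response time 19.5), just as under preemptive LJF, whereas FIFO finishes them at times 10 and 20 (average 15).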

4.c Shortest time to completion first with preemption

Shortest time to completion first with preemption (STCFp) is equivalent to preemptive SJF. STCFp performs optimally for all workloads; its worst case is a single job, for which it is also equivalent to either form of LJF.

4.d Multi-level feedback

For mixed workloads, multi-level feedback (with preemption) approximates STCFp, with the exception that job length for I/O-bound jobs must be redefined as the time to the next I/O; interactive jobs are therefore scheduled first, and the average response time for these sub-jobs is optimized rather than the time for the entire job. For uniform workloads, whether I/O-bound or CPU-bound, this algorithm approximates round robin and is approximately pessimal. However, this reflects an inadequacy of average response time as an evaluation metric as much as it reflects an inadequacy in multi-level feedback schedulers.

4.e Lottery scheduling with ticketing inversely proportional to execution

This scheduler approximates multi-level feedback but with a randomizing effect. For mixed loads it is similarly optimal, favoring the short I/O sub-jobs. For uniform workloads it is similarly pessimal, approximating round-robin.

Some of these scheduling policies exhibit more complex behavior than is explicitly described here. Much of the variation comes with workloads which are heterogeneous with respect to job-length and I/O frequency, where the policies have differing strengths and weaknesses. In addition, there are other metrics by which to judge policies besides average response time, such as fairness and other human-interface considerations.