CSE Curriculum Revision -- Dan's Inventory

Introduction

What are we teaching in our 7 required 300-level courses (plus STAT390/391)? It is easy to think we know based on catalog descriptions, what we did when we last taught these courses, or what we learned in similar courses we took long ago. It is likely we don't actually know, certainly not well enough to attempt improving the curriculum. It seems logical that a useful step in improving something is a clear understanding of where we stand. Dan G. took this "inventory" in January 2009 and learned plenty. In hindsight, this is not surprising: Dan has been here 5.5 years and has taught only 303 and 341 during that time. If you haven't taught more than, say, 2 of these courses in the last, say, decade (and very few people have), you will probably learn something too.

If committee members internalized most of the information below, it would probably make our discussions more efficient and productive.

New addition: Based on Tom's suggestion at our January 20 meeting, Dan added example homeworks, projects, exams. The examples are somehwere between "random" and "representative". Basically Dan looked for instructive examples, but was also working quickly.

Caveats

This inventory was compiled by reading course web pages over just the last year or two. It is understandably biased toward instructors that post easy-to-understand lecture schedules. The numbers for each topic is number-of-lectures-on-topic. It generally doesn't account for things taught in section. Ranges account for diversity among instructors, particularly note things like 0-3, which would mean one instructor spends a week on something another doesn't cover. That's not necessarily a problem; the goal is just to capture "where we are" one solid level deeper than "we have a course on programming languages where they learn functional programming and a bunch of other stuff". Obviously everything is approximate. For example, Winter quarter is always shorter. If you think something is wrong or confusing, please do offer suggestions. Remember Dan G. has never taught 5 of these courses.

A potential outcome of the committee would be a new curriculum described at about this level of detail. Of course, this would be a starting point that would naturally evolve over the first few iterations, since an alleged one-week topic might actually take two weeks or vice-versa. We would also continue to have natural differences for different instructors. Material "removed" would almost surely be covered in a relevant 400-level course.

Inventory

303: Concepts and Tools for Software Development 3 credits

Bash shell (2-3)
grep (1)
C basics (6)
break-point debuggers (1)
makefiles (1)
testing (1)
specifications (1)
version control (1)
linking (1)
manual memory management (1)
societal implications (4)
sed (0-1)
profiling (0-1)
concurrency (1-3)
program design (0-2)
some C++ basics (2-5)
Perl (0-2)
cgi (0-2)

Notes: Offerings by Balazinska/Perkins/Grossman are quite similar with Balazinska doing more C++. Zahorjan is less close, but still within 20-25%.

Sample work:

Nontrivial C program with dynamic data structures (Perkins)
Implement, test, and benchmark a simple memory manager (Perkins)
Final exam (Grossman)

321: Discrete Structures 4 credits

Logic (propositional, first-order) (5)
Proofs, induction (3)
Number theory (e.g., primes) (3)
Sets, functions, relations (4)
Combinatorics (3)
Discrete probability, Bayes (2-3)
Expectation (2)
Graphs (3-4)

Notes: Appears quite stable, but several recent offerings post only homeworks, not a lecture schedule.

Sample work:

Final exam (Reges)
Final exam (Benson Limketkai)

322: Intro. to Formal Models in Comp. Sci. (3 credits) Pre-req: 321

finite automata (6)
regular expressions (3)
Pumping lemma for finite automata (2)
Myhill-Nerode (more finite automata) (2)
String matching (2)
Context-free languages (4)
Pushdown automata (3)
Pumping lemma for context-free languages (2)
Turing machines / decidability / halting (3, at end of term)

Notes: Seems very consistent.

Sample work:

A sample final exam (Bacon/Beame)
Final-exam topics (Beame)

326: Data Structures (4 credits) Pre-req: 321

Stacks / queues (1)
Asymptotic analysis (2-3)
Memory performance (0-2)
Priority queues / heaps (4-5)
Binomial queues (1-2)
Balanced trees (6)
Hashing (3)
Sorting (4)
Union/find (1-2)
Graphs (4-5)
NP-Completeness (1-2, at end of term)

Sample work:

Implementing splay trees (Tompa)
Understanding and analyzing splay trees (Tompa)
Project word-frequency analysis (multiple)

341: Programming Languages (4 credits)

Basic functional programming (3)
Datatypes and ML/Haskell-style pattern-matching (3)
Higher-order functions (4)
Tail recursion (1)
Modularity (1-2)
Generics (i.e., polymorphism) (1)
Laziness (1-2)
Macros (1)
Quoting/eval (1)
Type systems, static vs. dynamic (1)
Subtyping (2)
Dynamically-Typed OO (3)
OO vs. functional (1)
Iterators (1)
Advanced OO topics (e.g., mixins) (0-1)
Garbage collection (0-1)
Logic programming (0-3)

Notes: Sort of an average/sum of Borning/Reges/Grossman with slight bias toward Grossman because less exact schedule available for Borning

Sample work:

Homework defining and using higher-order functions (see also provided code) (Grossman)
Homework writing an interpreter for a language with first-class function closures (see also provided code) (Grossman)
Final exam (Grossman)
Final exam (Borning)
Sample final exam (see also solution) (Reges)

370: Introduction to Digital Design (4 credits)

Binary number systems (1)
Boolean algebra, truth tables (1-2)
Truth tables / boolean cubes / Karnaugh maps (3)
Gates, circuit logic (3)
Mux / demux (1)
Flip-flops (1)
Delays, clocks, glitches (1)
Verilog (1-2)
State diagrams (1)
finite state machines (3-4)
Computer ogranization (2)
Adders, multipliers, ALUs (1-4)
FPGAs (0-1)
FETs, CMOS (0-1)
PALs, ROMs, etc. (0-1)

Notes: The least consistent aspect seems to be the part on underlying technologies, but this isn't a major component of the course?

Sample work:

378: Machine Organization and Assembly Language (4 credits), Pre-reqs: 303, 370

MIPS (2)
Assembly programming, data/pointers (1-2)
Assembly programming, control/procedures (3)
Machine language (instruction format) (1)
Multi-cycle control/data, microporgramming (4)
Pipelining (3-5)
Caches (4)
Performance (defining, measuring) (1)
Virtual memory (1-2)
Exceptions (2)
I/O (1-2)
Parallelism (2-4)
Floating-point numbers (0-2)

Notes: The list above doesn't indicate that the course also includes a substantial lab building a MIPS processor with FPGAs.

Sample work:

Autumn 2008 lab sequence (Ruth Anderson)
homework: assembly programming with recursion (Oskin)
homework: cache mechanics (Oskin)
homework: memory system (Ceze)

Statistics: 390 or 391 or (394&395)

Students take either 390 or 391. While 391 was designed to server our students needs better, it is offered only once a year so most of our students take 390.

Catalog description: STAT390: Probability and Statistics in Engineering and Science (4) Concepts of probability and statistics. Conditional probability, independence, random variables, distribution functions. Descriptive statistics, transformations, sampling errors, confidence intervals, least squares and maximum likelihood. Exploratory data analysis and interactive computing.

Current course web page for 390

Some facts extracted from the web page and syllabus:

Text: Applied Statistics for Engineers and Scientists (2nd Edition)
Text coverage: Chapters 1-11 except skip 1.5, skip page 84, skim 4.3, skip 6, skip page 318, skip 7.6, skip 8.4, skip 10.
Buzz words/concepts: Data, histogram, mean, standard deviation, scatterplot, correlation, regression, conditional probability, distribution, binomial, normal, sampling, estimation, inference, confidence interval, hypothesis/significance testing, anova, etc.

Catalog dscription: STAT391: Probability and Statistics for Computer Science (4) Fundamentals of probability and statistics from the perspective of the computer scientist. Random variables, distributions and densities, conditional probability, independence. Maximum likelihood, density estimation, Markov chains, classification. Applications in computer science. Pre-reqs: 2.5 in MATH 126; 2.5 in MATH 308; either CSE 326, CSE 373, CSE 417, or CSE 421

Spring 2008 web page for 391

Some facts extracted from the web page and syllabus:

"This course was developed by Marina Meila as an equivalent alternative to STAT 390 (Probability and Statistics in Engineering and Science), that is tailored specifically for the needs of computer scientists."
Taught in Spring 2008 by Assaf Oron, by Marina 2001--2007
Week-by-week lecture topics:
1. Programming with R, basic probability (measures, independence, ...), discrete distributions (uniform, Bernouilli, ...)
2. Continuous distributions, expectation, maximum-likelihood estimation, random variables
3. Descriptive statistics (exploratory data analysis), nonparametric statistics
4. Variance, laws of large numbers, central limit theorem, normal and Student's t-distributions
5. Decisions under uncertainty, classical statistical inference: hypothesis tests, confidence intervals, p-values
6. Conditional probability, Bayes' rule, Bayesian estimation
7. Covariance, Correlation, Regression (linear, R-squared, multiple, inference)
8. Regression continued; nonparametric statistical inference: permutation tests, bootstrapping, cross-validation
9. Review
10. Machine Learning: Clustering and Classification, Cross-Validation
The week 9 review includes this comparison to prior versions: "We did more descriptives, hypotheses, and estimation than Marina's courses -- but less probability theory and kernel estimates"
Looking at Spring 2006 web page suggests it also covered histograms, K-means clustering, and marginal distributions (but did not cover everything covered in 2008).
No textbook, though Marina had course notes available in book form.

Dan G. could still use some more information on how STAT390/391 came to be and what our relation to them is.