Creating and grading exams

by Michael Ernst

May, 2011
Last updated: March 29, 2016

This document describes one successful approach to creating a good exam or quiz — such as a midterm or final for a class.

Exam process

The lecturers are ultimately responsible for the exam, including selecting problems, editing and assembling the exam, correcting problems that are discovered during playtesting, and many other tasks. TAs are asked to contribute questions, test the exam, and assist in other ways.

As with any document, keep the exam under version control; email is a poor coordination mechanism, because it is error-prone and suffers time lags.

As a related point, write the exam in a format (such as LaTeX or HTML) that can be conveniently edited by multiple people, can be diffed, and is compatible with version control systems. As a minor advantage, these formats also allow code examples to be included directly from source files, which makes it easy to verify that the source code compiles.

Keep the answer key up to date at all times. Whenever the exam is tested by a TA, test the answer key by grading the completed exam. TAs shouldn't look at the answer key until after they have playtested the exam, but the answer key should be kept under version control.

Obviously, the exam needs to be playtested multiple times, long enough before it is given to students to permit corrections and more playtesting.

Exam questions

I prefer a closed-book exam. An open-book exam too often turns into a “tree-killer”: students print out everything from the course, but generally don't refer to it. The exam should test whether students have internalized and can apply the material, not whether they can look it up, and the exam should be testing that sort of understanding rather than taking off points for niggling details. Plus, a closed-book exam rewards lecture attendance.

Exam questions should

When you make up a question, you need to write down three types of information:

Don't just take phrases from the book or slides, and ask students to fill in a word. Instead, think about the concept that is being conveyed, and how you can test the concept rather than testing whether a student can regurgitate a phrase.

For multiple-choice questions, always indicate in the question how many answers students should circle.

As in all technical writing, be precise. For instance, short-answer questions should be precise about the length of answer required. Don't say, “answer briefly”. Instead, be specific; for instance, “one sentence”.

Whenever possible, write questions so that they have only one possible answer. For instance, don't ask for any example of a particular phenomenon; instead, ask for the shortest or best example. This makes grading much, much easier: it is easier both to understand whether an answer is right and to understand what is wrong with an incorrect one. Furthermore, solutions should always be as simple as possible; a short solution is less likely to mislead students with a red herring, and we want students to be able to understand the essentials of a solution rather than getting caught up in inessential matters.

Try to avoid questions that require students to read or write non-trivial amounts of source code. The assignments evaluate students' ability to read, write, and debug code; students will have had plenty of experience with such activities. The exam should be used to evaluate students on other aspects of the course. Code-related questions can be very frustrating to students; for instance, “find a bug in the following code” can be an “aha” experience that is not well-suited to a limited-time exam (especially with the pressure of an exam). Asking for the result of running a piece of code is ill-motivated, since a programmer would just run it. Questions about code tend to be very long, since you need to provide specifications for every library routine that might be called; students can't be expected to have memorized these. Finally, there are so many possible answers to a coding question that they tend to be quite difficult to grade.

If you write code, use good code style. For instance, comments should be in English, not pidgin: use full sentences, started by capital letters and terminated by periods (or other appropriate punctuation). Comments that are not in clear English are much harder to read, and they set a bad example to the students.
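As a minimal sketch of the comment style described above, the following Python snippet contrasts a pidgin comment with full-sentence comments. The function `median` and everything in it are hypothetical, chosen only for illustration; they do not come from any particular exam.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    # Pidgin style, to be avoided: "sort, grab mid"
    # Sort a copy so that the caller's list is not modified.
    ordered = sorted(values)
    middle = len(ordered) // 2
    # An odd-length list has a single middle element.
    if len(ordered) % 2 == 1:
        return ordered[middle]
    # An even-length list has two middle elements; return their mean.
    return (ordered[middle - 1] + ordered[middle]) / 2
```

Each full-sentence comment states why the line exists, which a student reading under time pressure can absorb without guessing at abbreviations.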

Make up exam questions throughout the term; do not wait until just before the exam to create it. An excellent way to make up exam questions is to pay attention during lecture or section and write down anything that pops into your mind. Or, if there is a common misperception that you notice one week in office hours, make that into an exam question too. If a student asks a question (in lecture or office hours), that is often an excellent exam question as well, since it was something that could confuse a student but that was covered in class. If you follow this process, then creating an exam requires very little extra work — it essentially comes for free.

Print the exam on only one side of the page. If you use figures (whether code or otherwise) that are referenced by a question not on the same page, you should duplicate the figure on a tear-out page at the end of the exam, so that students can see everything relevant at once rather than being forced to flip pages.

Exam reviews

My style is to prepare no material to present to students during an exam review. I only answer questions that students bring to the exam review session. (Naturally, tell students that this will be the case, so that they can prepare for the review!) Questions such as “Can you explain this whole section of the course?” inspire no respect, and you needn't answer them directly. But when students have specific questions, often that can segue into a broader discussion, and that is a quite productive way to run a review session.

Exam grading

Immediately after giving an exam, the course staff will gather to grade the exam. My policy is that no one leaves until all the exams are graded, but if you have a conflict, you can pop out for a class and then return afterward. Typically, this takes 3-4 hours, but it can range from 2 to 8.

The grading time depends almost entirely on the quality of the exam. The worst situation is when the staff disagrees about the best answer to a question (even a multiple-choice one) and has to work it out in the grading room. The next-worst is when a question required students to write a long explanation; pithy ones are easiest to grade.

If you grade a question, you are responsible for making up a grading key or improving an existing one, and you are responsible for recording that (typically in comments in the exam document) for use when dealing with makeup exams, regrade requests, etc. Furthermore, you are responsible for improvements to the solutions, in particular explanations of any issues that many students got wrong.

After the exam, distribute both the original version of the exam and a version with solutions. This lets future students test their understanding, and it explains any of their misunderstandings.

