This post shows how to get around the limitations of the destruct tactic when doing case analysis on dependent types, without resorting to dependent destruction, which relies on additional axioms.

New users of Coq, especially those who have worked in Agda or another dependently typed programming language, sometimes wonder why programs written in Coq don't take (more) advantage of the dependent type system to enforce invariants. It turns out it's more of a pain to use dependent types in your Coq programs than it is in Agda. This is partially due to foundational reasons (Agda uses the K axiom to make pattern matching easier), but it's also due to purely superficial annoyances on the Coq side.

Doug summed up his opinion of dependently typed programming in Coq like this:

Coq has this really powerful type system, but... don't use it.
—Doug ('s Dad)

On the other hand, it's hard to resist using a feature that's there. If you do use dependent types in your programs, you may have run into trouble with pattern matching or reasoning by case analysis over these types. There are ways to get around these problems, but the most common ones require assuming extra axioms.

The other day in the freenode IRC channel for Coq, user osa1 asked about defining and reasoning about the cardinality of types. Similar to how it might be done in math, he defined a type to have cardinality n if there was a bijection between that type and Fin n, the type with exactly n elements. osa1 was working through a (seemingly) simple example and got stuck. He needed to do case analysis on a term of type Fin 2 but couldn't get it to work. In the process of discussing this example, I came up with a kind of design pattern for implementing case analysis on dependent types without assuming any axioms.

In this post we'll walk through a proof that the booleans have cardinality 2. I know, pretty impressive, right? First we'll prove it using the dependent destruction tactic, which uses an axiom that is a strict extension of Coq's underlying theory. Then, we'll show how to do it "by hand" in a way that doesn't rely on any extra assumptions. Finally, I'll give another quick example and a hand-wavy description my design pattern for dependent case analysis without axioms.

This post is generated from a Coq source file via a hacky python script and pandoc. You can view the original source if you want to step through the proofs in Coq. The source also contains a link to the python script. Finally, there's also a version of the source with all the blog text and typesetting commands removed, for easy stepping.

Cardinalities of sets

As discussed above, we're going to define cardinalities in terms of the type Fin n, which we define as follows.

Inductive Fin : nat -> Set :=
  F1 : forall n : nat, Fin (S n)
| FS : forall n : nat, Fin n -> Fin (S n).

Fin n is a set with n elements, namely: F1, FS F1, ..., FS (FS (... F1)). 1

We now formalize the statement "A has cardinality n" a la osa1.

Definition cardinality (n : nat) (A : Type) : Prop :=
  exists (f : A -> Fin n) (g : Fin n -> A),
      (forall x, g (f x) = x) /\
      (forall y, f (g y) = y).

The definition takes the two directions of the bijection and proofs that the two are inverses.

As a simple use of this definition, let's prove that the booleans have cardinality 2. We're going to need functions bool -> Fin 2 and Fin 2 -> bool. The first direction is easy.

Definition bool_to_Fin_2 (x : bool) : Fin 2 :=
  if x then FS _ (F1 _) else F1 _.

The other direction is almost as easy.

Definition Fin_2_to_bool (y : Fin 2) : bool :=
  match y with
    | F1 _ => false
    | FS _ (F1 _) => true
    | _ => false (* bogus! *)
  end.

This may look like a strange definition. Since y has type Fin 2, there are only two possibilities, either y is F1 or FS F1, right? Well, yes, but no. It is true (and provable in Coq) that this is the case, but Coq's pattern matching will not give this fact to you automatically. It is possible to convince Coq that the last case is impossible, but for now we take a simpler approach. We just supply a bogus value in the impossible case. We know that this case can never happen, so the value we put there doesn't matter as long as it type checks. In fact, during a proof by case analysis y, we will be able to show that the third case is impossible.

Here's the main theorem. (Pro tip: you can hover over a tactic to see the state of the goal before and after the tactic runs. Use this to step through the proofs in your browser as they are discussed.)

Theorem bool_cardinality_2 :
  cardinality 2 bool.
Proof.
  unfold cardinality.
  exists bool_to_Fin_2.
  exists Fin_2_to_bool.

Here we just unfold the definition of cardinality and provide the two directions of the bijection. Next, we're asked to prove that they are in fact inverses.

  split; intros.

In the first direction, everything is easy (just like in defining bool_to_Fin_2).

  - destruct x; auto.

In the second direction, things are more difficult.

  - (* hover here to see subgoal *)

Looking at the goal, we want to do case analysis on y, but a simple destruct won't work.

  destruct y. (* hover to see error *)

This error is rather inscrutable, but the gist is that Coq's heuristic for doing case analysis has failed because the dependency in the types is too complicated (note: pretty much any dependency is too much dependency, recall Doug's Dad's wisdom).

When Coq performs a case analysis, it first abstracts over all indices. You may have seen this manifest as a loss of information when using destruct on predicates (try destructing even 3 for example: it just deletes the hypothesis!), or when doing induction on a predicate with concrete indices (try proving forall n, even (2*n+1) -> False by induction on the hypothesis (not the nat) -- you'll be stuck!). Coq essentially forgets the concrete values of the indices. When trying to induct on such a hypothesis, one solution is to replace each concrete index with a new variable together with a constraint that forces the variable to be equal to the correct concrete value. destruct does something similar: when given a term of some inductive type with concrete index values, it first replaces the concrete values with new variables. It doesn't add the equality constraints (but inversion does). The error here is about abstracting out the indices. You can't just go replacing concrete values with arbitrary variables and hope that things still type check. It's just a heuristic.

Getting back to the point, the error stems from the fact that destruct replaces 2 with a new variable, say n. At which point y has type Fin n. But we pass y into the function Fin_2_to_bool, which requires an argument of type Fin 2. Thus the goal is ill typed, just like Coq said.

So what should we do? One option is to not use dependent types. Another (as suggested by CPDT, for example) is to use the dependent destruction tactic, which comes from the Program module.

    Require Import Program.
    dependent destruction y.

    + reflexivity.
    + (* hover here to see subgoal *)

This does exactly what we want! It gives us two goals, in the first case, y has been substituted out for F1, while in the second it has become FS y (confusingly reusing the variable name y). The first case goes through easily by computation. In the second case, we just want to case analyze y again.

      dependent destruction y.
      * reflexivity.
      * (* hover here to see subgoal *)

More progress! We easily deal with another case (where the argument to Fin_2_to_bool is FS F1). But we are given one final case. This corresponds to the third case of the match when we defined Fin_2_to_bool. reflexivity won't work, since bool_to_Fin_2 never returns FS (FS F1). Instead, we can finish by showing that this case is actually impossible. We have an inhabitant of Fin 0 in the context. But the set with 0 elements is uninhabited. inversion derives the necessary contradiction

        inversion y.
Qed.

This works just fine. However, dependent destruction is a little too good to be true. Let's ask Coq to print the assumptions (ie, axioms) used in the proof of bool_cardinality_2.

Print Assumptions bool_cardinality_2. (* hover to see assumptions *)

We see that our proof relies on the axiom JMeq_eq. This states that heterogeneous equality implies the usual propositional equality.

Heterogeneous equality (aka John Major equality, hence the name JMeq in Coq) lets one state equalities between terms of different types. This sounds strange, but it turns out to be very convenient, especially when you have a lot of dependent types floating around. It eliminates a lot of uses of eq_rect and generally makes things nice. Elimination of heterogeneous equalities is implemented in Coq using JMeq_eq.

How did we manage to use heterogeneous equality in our proof? It turns out that the implementation of dependent destruction uses it internally. Let's take another crack at proving bool_cardinality_2 and see if we can avoid this axiom. We'll just look at the part of the proof that required dependent destruction.2

Lemma bool_to_Fin_2__Fin_2_to_bool__id :
  forall y,
    bool_to_Fin_2 (Fin_2_to_bool y) = y.
Proof.
  intros.

Here we are back at the fateful spot where we want to do case analysis on y. Remember that destruct errors out and inversion doesn't do what we want. Both of these tactics correspond to pattern matches in Gallina, and the trick we're going to use is to write our own matches (and more importantly, match annotations) to do the case analysis and carry the information we need.

Update: Note that the following development is slightly more complicated than necessary, but is a reasonable first cut. See the next post for a cleaner proof of the fin_case function below.

As a first attempt, we can try to do a simple match on y. (If you haven't seen refine before, it's a handy tactic that lets you "sketch" a proof term, using underscores to leave holes that will be filled in later. Below, the underscores on the left of the match arrow (=>) are "don't care" patterns, while underscores to the right of the arrow are holes for refine.)

  refine (match y with
            | F1 _ => _
            | FS _ _ => _
          end).

This gives us two subgoals, which is the number we expect. But they aren't the subgoals we expect. They're a mess! This is an example of Coq's match annotation inference being a little too smart (or not smart enough?) for its own good. To clean things up and see what's going on, let's explicitly give a return annotation.

  refine (match y as y' return bool_to_Fin_2 (Fin_2_to_bool y) = y with
            | F1 _ => _
            | FS _ _ => _
          end).

All I did was give a new name (y') to the discriminee (the thing being matched on) and then copy-paste the type of the goal into the return slot. y and y' are secretly two names for the same thing, but Coq will automatically refine the type of y' based on what case of the match it's in. This sounds like a good thing, and it can be, but in this case Coq will refine the type from Fin 2 to Fin n where n is a new variable. Thus, if we replace y with y' in the return annotation, it won't type check, similar to the problem with destruct that was described above. So in this case we just stick with the unrefined y.

Anyway, this generates two reasonable looking subgoals. The first corresponds to the case where y = F1, and the second to y = FS F1. However, we don't have enough information to proceed. We'd like Coq to generate equalities describing what y must be in each case. This is common when writing your own matches. To have Coq generate this equality, we use something called the "convoy pattern". The convoy pattern works by adding things to the return annotation. Coq automatically refines the type given in the return annotation based on what case of the match you're in. This will let us prove our equality.

Remember that Coq will automatically change y' to correspond to the current case of the match, so to state that y = F1 in the first case and y = FS in the second case, we just say y = y'. Let's give it a try.

  refine (match y as y' (* hover to see error *)
                return y = y' ->
                       bool_to_Fin_2 (Fin_2_to_bool y) = y with
            | F1 _ => _
            | FS _ _ => _
          end _).

This would work if the type of y wasn't dependent. As it is, though, y has type Fin 2 while y' has type Fin n after refinement introduces a new variable n. Since these two types aren't equal, we can't even state that the terms are equal.

This is exactly where dependent destruction would use heterogeneous equality. Because heterogeneous equality does not require the types to be equal in order to state an equality between terms.

Instead, we'll further extend the return annotation to generate an equality n = 2 and then explicitly transport one of the terms across this equality at the type level (Fin n = Fin 2) to state the desired term-level equality (y' = y).

  refine (match y as y' in Fin n
                return forall pf : n = 2,
                         eq_rect n Fin y' 2 pf = y ->
                         bool_to_Fin_2 (Fin_2_to_bool y) = y with
            | F1 _ => _
            | FS _ _ => _
          end eq_refl eq_refl).

That's an epic match annotation. First, we add the clause in Fin n to bind a name for the index to Fin. Then, in the return, we take a proof that n = 2 and a proof that y' (which has type Fin n) transported across the equality n = 2 is equal to y. Finally, we return the same goal as before.

The return type of the match is a function of two arguments (proofs of the two equalities). The first of which asks us to prove n = 2. Coq automatically refines this to 2 = 2 since the discriminee (y) has type Fin 2. So we can give eq_refl for the first proof. The second proof is only a bit more complicated: once we've given eq_refl for the first proof, the type simplifies to y = y, so we can again give eq_refl. This leaves us to fill in the two branches of the match.

With the new equalities in context, we now have enough information to finish the first case.

  - intros.
    inversion pf.
    subst.
    rewrite <- Eqdep_dec.eq_rect_eq_dec by apply eq_nat_dec.
    reflexivity.

The main trick here is to use decidable equality of the natural numbers to get rid of the eq_rect. This is the insight that makes the whole technique work. Decidable equality on a type means, in the language of HoTT, that the type is an h-Set. Thus there are provably no non-refl paths, and so the transports can be eliminated. The rest of the proof is easy now that we have all the necessary information in our context.

The second case starts out just the same.

  - intros.
    inversion pf.
    subst.
    rewrite <- Eqdep_dec.eq_rect_eq_dec by apply eq_nat_dec.
    simpl.

We're left wanting to do another case analysis, this time on f. The same technique will work again, but instead of copy-pasting, let's take a step back and implement a general solution.

A case analysis tactic for Fin

Our goal is to use the technique we just saw to define an analogue to dependent destruction, but without relying on any axioms and specialized to Fin. We'll first prove a lemma that feels kind of like an induction principle, except it doesn't give an induction hypothesis. A destruction principle, perhaps.

Lemma fin_case :
  forall n (P : Fin (S n) -> Type),
    (P (F1 _)) ->
    (forall x, P (FS _ x)) ->
    (forall x, P x).

The lemma takes a natural number n and predicate P on Fin (S n) and gives sufficient conditions to prove that P holds for all x : Fin (S n).3 In particular, it takes a proof that P holds on F1 and that, forall x : Fin n, P holds on FS x. Intuitively, these are indeed sufficient. Let's prove it.

Proof.
  intros.
  refine (match x as x' in Fin n'
                return forall pf : n' = S n,
                         eq_rect n' Fin x' (S n) pf = x ->
                         P x with
            | F1 _ => _
            | FS _ _ => _
          end eq_refl eq_refl).
  - intros.
    inversion pf.
    subst.
    rewrite <- Eqdep_dec.eq_rect_eq_dec by apply eq_nat_dec.
    auto.
  - intros.
    inversion pf.
    subst.
    rewrite <- Eqdep_dec.eq_rect_eq_dec by apply eq_nat_dec.
    auto.
Qed.

We use the same flavor of match annotation as before and are again left with three cases. In each case we simplify by getting rid of the transport using decidable equality as before. The first two cases follow from the two hypotheses of the lemma, and the proof of the last equality is trivial.

Let's now use our lemma to complete and cleanup the proof of the eloquently named bool_to_Fin_2__Fin_2_to_bool__id.

Lemma bool_to_Fin_2__Fin_2_to_bool__id :
  forall y,
    bool_to_Fin_2 (Fin_2_to_bool y) = y.
Proof.
  intros.

At this point we want to destruct y using our new fin_case lemma. If we try that directly, we don't get quite what we hoped for.

  apply fin_case.

The problem is that the conclusion of fin_case is the very general P x. We're asking Coq to come up with a predicate P : Fin (S n) -> Type and a term x : Fin (S n) such that the goal is P x. Coq decides to take the second occurrence of y as the argument of the predicate. Thus fin_case only destructs that occurrence.

To get the desired result, we need to guide Coq towards choosing the right predicate that destructs both occurrences of y. One way to achieve this is using a with (P := fun x => ...) clause to apply. Another way that involves less typing is the pattern tactic, which "beta expands" the goal with respect to some term, thus factoring the goal into P applied to x. Once the goal has been rewritten in this way, Coq chooses the correct predicate for fin_case.

  pattern y.
  apply fin_case.

This time the case analysis works as expected. We are left with two cases, the first of which is trivial.

  - reflexivity.

In the second case, we do another case analysis to consider FS F1.

  - intros.
    pattern x.
    apply fin_case.
    + reflexivity.

Finally, we are given the absurd case with an inhabit of Fin 0.

    + intros.
      inversion x0.
Qed.

As a last cleanup pass, we can wrap up the pattern ...; apply fin_case sequence in a custom tactic. The final proof is identical to the dependent destruction proof, and it 100% axiom-free!

Ltac fin_dep_destruct v :=
  pattern v; apply fin_case; clear v; intros.

Lemma bool_to_Fin_2__Fin_2_to_bool__id' :
  forall y,
    bool_to_Fin_2 (Fin_2_to_bool y) = y.
Proof.
  intros.
  fin_dep_destruct y.
  - reflexivity.
  - fin_dep_destruct x.
    + reflexivity.
    + inversion x.
Qed.

Print Assumptions bool_to_Fin_2__Fin_2_to_bool__id'.

A few final notes. First, this tactic will not properly destruct a term that appears in hypotheses as well. The fix would be to revert those hypotheses before using pattern. This could even be achieved with a relatively low level of pain using additional Ltac hacking, but I won't go into that here. Exercise for the reader! Second, note that fin_case is similar to the lemma caseS in the standard library (the only difference being where n is quantified).4 There's even a recursive (inductive) version, called rectS that essentially implements a dependent form of induction. I actually developed this trick independently, but it's not at all surprising that such a thing already existed. What is surprising to me is that no readers of the first version of this post pointed out caseS!

Another Example and a Design Pattern for Dependent Case Analysis without Axioms

We've seen how to implement dependent case analysis for Fin, but what about in general? As far as I can tell, the same basic trick works: prove a "destruction principle" for your type, which takes a hypothesis proving the goal on each constructor of the type. The tricky bit is giving a type to the predicate P. In the case of Fin, both constructors return elements of Fin (S n). Importantly, the indices from both cases unify. This allowed us to give P type Fin (S n) -> Type, which simplified things. When the indices from different constructors don't unify, the statement of the case lemma is more complicated, and requires explicit transports to make it typecheck.

I've worked this out in detail for an example from CPDT: the constant folding example from the ProgLang Chapter using the first order representation of syntax. In this situation, we have a types type and term that represent the types and expressions in the object language. The member family encodes dependent deBruijn indices for variable names. The cfold transformation (not shown below) implements constant folding, which we want to prove sound. The proof of its soundness requires case analysis on some terms, for which CPDT uses a variant of dependent destruction. Using the techniques from this post, we can eliminate the reliance on axioms.

Here is the relevant code in a standalone snippet. See the book for more details and the actual proof of soundness. I won't explain it in detail, but hopefully term_case will help people looking to implement their own axiom-less dependent case analysis lemmas. (I also left out the cute hover contexts because the proof isn't that interesting).

Set Implicit Arguments.
Inductive type : Type :=  Nat : type | Func : type -> type -> type.

Require Import List.
Import ListNotations.

Inductive member (A : Type) (elm : A) : list A -> Type :=
    HFirst : forall ls : list A, member elm (elm :: ls)
  | HNext : forall (x : A) (ls : list A),
            member elm ls -> member elm (x :: ls).

Inductive term : list type -> type -> Type :=
  Var : forall (G : list type) (t : type), member t G -> term G t
| Const : forall G : list type, nat -> term G Nat
| Plus : forall G : list type, term G Nat -> term G Nat -> term G Nat
| Abs : forall (G : list type) (dom ran : type),
          term (dom :: G) ran -> term G (Func dom ran)
| App : forall (G : list type) (dom ran : type),
          term G (Func dom ran) -> term G dom -> term G ran
| Let : forall (G : list type) (t1 t2 : type),
          term G t1 -> term (t1 :: G) t2 -> term G t2.

Lemma ty_eq_dec :
  forall x y : type, {x = y} + {x <> y}.
Proof.
  decide equality.
Qed.
Lemma term_case :
  forall G t (P : term G t -> Type),
    (forall p, P (Var p)) ->
    (forall n (pf : t = Nat),
        P (eq_rect Nat _ (Const G n) t (eq_sym pf))) ->
    (forall e1 e2 (pf : t = Nat),
        P (eq_rect Nat _ (Plus e1 e2) t (eq_sym pf))) ->
    (forall dom ran (pf : t = Func dom ran) b,
        P (eq_rect (Func dom ran) _ (Abs b) _ (eq_sym pf))) ->
    (forall dom e1 e2, P (App (dom:=dom) e1 e2)) ->
    (forall ty1 e1 e2, P (Let (t1:=ty1) e1 e2)) ->
    forall e, P e.
Proof.
  intros.
  destruct e eqn:?; auto.
  - specialize (X0 n eq_refl). 
    auto.
  - specialize (X1 t0_1 t0_2 eq_refl). 
    auto.
  - specialize (X2 dom ran eq_refl t0). 
    auto.
Qed.

The term_case lemma follows the same pattern as fin_case: one hypothesis for each constructor. It is complicated by the fact that term has two indices instead of just 1 (but the use of G is simple enough that it doesn't require special handling) and the fact that the constructors do not return indices that unify (the type index is different in each case). Thus to state the hypotheses, we have to use extra transports (ie, eq_rects). The proof of the lemma is easy after using simplification based on decidable equality. We also had to prove that equality on type is decidable. Again, that is the key to this whole approach: if your indices have a type with decidable equality, you don't need JMeq to get dependent case analysis.

At this point we could define a custom tactic to call pattern and term_case, just as we did with fin_case, but I'll leave those out because they are virtually identical. The proof of cfold_sound (which you can find in the book) now goes through by replacing dependent destruction with applications of term_case.

Last updated: January 13, 2016


  1. There's another standard definition of Fin as a Coq Fixpoint from nat to Type. This Fixpoint encoding is often easier to work with because Coq can perform type-level computation. In fact, this encoding would make the example in this post straightforward to define and would not require dependent destruction or any other trickery. But that's not the point! The point is that one does not always have such an encoding to work with, so it's useful to have techniques that work with the inductive definitions. This is just a simple example.

  2. There is some debate in the proof assistant community about whether axioms like JMeq should be used in verification efforts. According to CPDT, JMeq has been proven (on paper) to be a consistent extension of Coq's type theory. I tend to come down on the side of using whatever axioms you want if it helps you be productive. Clearly there are trade-offs here. And in any case, it's nice to know that you can avoid this axiom in the common case (of index types with decidable equality) with a little extra work.

  3. Because both constructors of the Fin type return terms of type Fin (S n), it simplifies the statement of the lemma to give P type Fin (S n) -> Type instead of the more natural (?) Fin n -> Type. For more complex inductive type families where the index is not constant across all constructors, this simplification isn't possible, and a different technique is needed. We'll see that in the example at the end of the post.

    Finally, note that this simplification can make it harder to actually use fin_case. Before you apply it, you need to know that your element of Fin is at index S n for some n. This is always the case for Fin (since Fin 0 is uninhabited), but you might expect fin_case (or the helper tactic defined below) to do it for you. Extending the tactic to do so is a good exercise!

  4. The proof term for caseS is interesting (read: way better than mine). It uses a similarly complicated match annotation, but one that avoids the need for decidable equality or eq_rect. I haven't fully explored how that technique would generalize to non-Fin types, but I expect the results would be cleaner.