LIME - Local Interpretable Model-Agnostic Explanations

Update

I've rewritten this blog post elsewhere, so you may want to read that version instead (I think it's much better than this one)


In this post, we'll talk about the method for explaining the predictions of any classifier described in this paper, and implemented in this open source package.

Motivation: why do we want to understand predictions?

Machine learning is a buzzword these days. With computers beating professionals in games like Go, many people have started asking if machines would also make for better drivers, or even doctors.

Many of the state of the art machine learning models are functionally black boxes, as it is nearly impossible to get a feeling for its inner workings. This brings us to a question of trust: do I trust that a certain prediction from the model is correct? Or do I even trust that the model is making reasonable predictions in general? While the stakes are low in a Go game, they are much higher if a computer is replacing my doctor, or deciding if I am a suspect of terrorism (Person of Interest, anyone?). Perhaps more commonly, if a company is replacing some system with one based on machine learning, it has to trust that the machine learning model will behave reasonably well.

Read More

The greedy algorithm for monotone submodular maximization

First, some context:

Submodular Functions

Submodular functions have the intuitive diminishing returns property. Formally, a submodular function \(f:2^V \rightarrow \mathbb{R}\) assigns a subset \(A \subseteq V\) a utility value \(f(A)\) such that

\[f(A \cup \{i\}) - f(A) \geq f(B \cup \{i\}) - f(B)\]

for any \(A \subseteq B \subseteq V\) and \(i \in V \setminus B\). We call \(V\) the ground set.

Read More

Gradient Descent

\(\newcommand{\norm}[1]{\left\| #1\right\|} \require{cancel}\) Gradient descent is a very popular optimization method. In its most basic form, we have a function \(f : \mathbb{R}^n \rightarrow \mathbb{R}\) that is convex and differentiable. We want to find:

\[x^* = argmin_{x \in \mathbb{R}^n}f(x)\]
Read More

What is a norm?

\(\newcommand{\norm}[1]{\left\| #1\right\|}\) A norm is a function (usually indicated by the vertical bars, such as \(\norm{\cdot}\)) such that for all \(w \in \mathbb{R}^n\):

  • \[\norm{w} \geq 0\]
  • For all \(w\) and for all \(a \in \mathbb{R}\), $$\norm{aw} = a \norm{w}\(. Note that\) a \(denotes the absolute value of\)a$$
  • \(\norm{w} = 0\) if and only if \(w = 0\). Note that 0 can be the zero vector of any length.
  • For all \(u, w\), \(\norm{u +w} \leq \norm{u} + \norm{w}\). This is called the triangle inequality.

Norms are used for measuring different notions of length or size.

Read More