LIME - Local Interpretable Model-Agnostic Explanations
Update
I've rewritten this blog post elsewhere, so you may want to read that version instead (I think it's much better than this one).
In this post, we'll talk about a method for explaining the predictions of any classifier, described in this paper and implemented in this open source package.
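To give a concrete feel for what the package does before we get into the details, here is a minimal sketch of how it is typically used on a text classifier. The toy sentiment model, its four training sentences, and the example text are placeholders of my own, not taken from the paper; any model that exposes a probability-prediction function would work the same way.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

from lime.lime_text import LimeTextExplainer

# A tiny toy sentiment classifier standing in for any black-box model.
# LIME only needs a prediction function, not access to the model internals.
texts = ["great movie, loved it", "wonderful and fun",
         "terrible film", "a boring waste of time"]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Explain one prediction: which words pushed the classifier toward each class?
explainer = LimeTextExplainer(class_names=["negative", "positive"])
exp = explainer.explain_instance(
    "a wonderful film, not a waste of time",
    model.predict_proba,   # black-box prediction function
    num_features=4,        # number of words to include in the explanation
)
print(exp.as_list())       # [(word, weight), ...] for the positive class
```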
Motivation: why do we want to understand predictions?
Machine learning is a buzzword these days. With computers beating professionals in games like Go, many people have started asking if machines would also make for better drivers, or even doctors.
Many state-of-the-art machine learning models are functionally black boxes: it is nearly impossible to get a feel for their inner workings. This brings us to a question of trust: do I trust that a certain prediction from the model is correct? Or do I even trust that the model is making reasonable predictions in general? While the stakes are low in a game of Go, they are much higher if a computer is replacing my doctor, or deciding whether I am a terrorism suspect (Person of Interest, anyone?). Perhaps more commonly, if a company replaces some system with one based on machine learning, it has to trust that the machine learning model will behave reasonably well.