Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)

Authors: Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda Viegas, Rory Sayres
Year: 2018
Summary: TCAV explains a model's predictions in terms of high-level, human-understandable concepts rather than low-level input features. A user defines a concept (e.g., 'stripes' for a zebra classifier), and TCAV quantifies how important that concept is to the model's prediction for a class of inputs. Because the score is computed over a whole class of inputs rather than a single example, the resulting explanation is global to that class, and it is stated in terms a person can readily interpret.
Link: https://arxiv.org/abs/1711.11279
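
Sketch: The following is a minimal illustration of how a TCAV-style score can be computed, not the authors' implementation. In the paper, a Concept Activation Vector (CAV) is the normal to a linear classifier that separates a layer's activations for concept examples from its activations for random examples, and the TCAV score is the fraction of a class's inputs whose class logit has a positive directional derivative along that vector. The function names learn_cav and tcav_score are hypothetical, the linear classifier is assumed here to be a plain logistic regression fitted by gradient descent, and the activations and gradients are synthetic stand-ins for values that would normally come from a real network and an autodiff framework.

```python
import numpy as np


def learn_cav(concept_acts, random_acts, lr=0.1, steps=500):
    """Fit a logistic-regression separator (an assumed choice of linear
    classifier) and return its unit normal vector as the CAV."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(concept)
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient step on the weights
        b -= lr * float(np.mean(p - y))         # gradient step on the bias
    return w / np.linalg.norm(w)                # points toward the concept side


def tcav_score(logit_grads, cav):
    """Fraction of inputs whose class logit increases along the CAV.

    logit_grads has one row per input: the gradient of the class logit
    with respect to the layer activations for that input.
    """
    directional = logit_grads @ cav             # directional derivatives
    return float(np.mean(directional > 0))


# Synthetic stand-ins (assumptions for illustration only).
rng = np.random.default_rng(0)
dim = 8
concept_dir = np.zeros(dim)
concept_dir[0] = 1.0

# Layer activations for concept examples and for random examples.
concept_acts = rng.normal(size=(200, dim)) + 4.0 * concept_dir
random_acts = rng.normal(size=(200, dim))
cav = learn_cav(concept_acts, random_acts)

# Gradients of the class logit w.r.t. the layer activations, one row per
# input of the class being explained (e.g., zebra images).
logit_grads = rng.normal(size=(100, dim)) + 2.0 * concept_dir
score = tcav_score(logit_grads, cav)
print(f"TCAV score: {score:.2f}")
```

A score near 1 means the concept direction raises the class logit for most inputs of the class; a score near 0 means it lowers it. The paper additionally repeats the procedure against multiple random sets and applies a statistical significance test, which this sketch omits.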