Senior research scientist at Google Brain

I am interested in designing high-performance machine learning methods that make sense to humans. Quanta Magazine described well why I am doing what I am doing. Thank you, John Pavlus, for writing this piece! Here is another short writeup about why I care.

My focus is building interpretability methods for already-trained models or building inherently interpretable models. In particular, I believe the language of explanations should include higher-level, human-friendly concepts so that it can make sense to everyone.

I gave a couple of tutorials on interpretability:
        Deep Learning Summer School at the University of Toronto, Vector Institute, in 2018 (slides)
        CVPR 2018 (slides)
        Tutorial on Interpretable machine learning at ICML 2017 (slides, video).

Other stuff I help with:
        Workshop Chair at ICLR 2018
        Senior program committee at AISTATS 2019
        Area chair and program chair at NIPS 2017 and 2018, ICML 2019
        Steering committee and area chair at FAT* conference
        Program committee at ICML 2017/2018, AAAI 2017, IJCAI 2016 (and many other conferences before that...)
        Executive board member of Women in Machine Learning.
        Co-organizer of the 1st ICML 2016, 2nd ICML 2017, and 3rd ICML 2018 Workshop on Human Interpretability in Machine Learning (WHI), and the NIPS 2016 Workshop on Interpretable Machine Learning for Complex Systems.

Google Scholar



To Trust Or Not To Trust A Classifier

TL;DR: A very simple method that tells you whether to trust your prediction or not, that happens to also have nice theoretical properties!

Heinrich Jiang, Been Kim, Melody Guan, Maya Gupta
NIPS 2018 (poster)
[pdf] [code] [bibtex]


Human-in-the-Loop Interpretability Prior

TL;DR: Ask humans which models are more interpretable DURING model training to learn a more interpretable model for the end task.

Isaac Lage, Andrew Slavin Ross, Been Kim, Samuel J. Gershman, Finale Doshi-Velez
NIPS 2018 (spotlight)
[TBD] [bibtex]


Sanity Checks for Saliency Maps

TL;DR: Saliency maps are a type of post-training interpretability method used to explain the 'evidence' behind predictions. But it turns out that some of them have little to do with the model's prediction! Some saliency maps are visually indistinguishable before and after we randomize the weights of the network (i.e., when the network is producing garbage predictions).

Julius Adebayo, Justin Gilmer, Ian Goodfellow, Moritz Hardt, Been Kim
NIPS 2018 (spotlight)
[https://arxiv.org/abs/1810.03292] [bibtex]


Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)

TL;DR: We can learn human concepts in any layer of an already-trained neural network. Then we can do hypothesis testing with them to get quantitative explanations.

Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda Viégas, Rory Sayres
ICML 2018
[pdf] [code] [bibtex] [slides]


The (Un)reliability of saliency methods

TL;DR: Existing saliency methods could be unreliable. We should be careful using them.

Pieter-Jan Kindermans, Sara Hooker, Julius Adebayo, Maximilian Alber, Kristof T. Schütt, Sven Dähne, Dumitru Erhan, Been Kim
NIPS workshop 2017 on Explaining and Visualizing Deep Learning
[pdf] [bibtex]


SmoothGrad: removing noise by adding noise

Daniel Smilkov, Nikhil Thorat, Been Kim, Fernanda Viégas, Martin Wattenberg
ICML workshop on Visualization for deep learning 2017
[pdf] [code] [bibtex]


QSAnglyzer: Visual Analytics for Prismatic Analysis of Question Answering System Evaluations

Nan-chen Chen and Been Kim
VAST 2017
[pdf] [bibtex]


Towards A Rigorous Science of Interpretable Machine Learning

Finale Doshi-Velez and Been Kim
[to appear] Springer Series on Challenges in Machine Learning: "Explainable and Interpretable Models in Computer Vision and Machine Learning", arxiv in 2017
[pdf] [bibtex]


Examples are not Enough, Learn to Criticize! Criticism for Interpretability

Been Kim, Rajiv Khanna and Sanmi Koyejo
Neural Information Processing Systems 2016
[pdf] [NIPS oral presentation talk slides] [talk video] [bibtex] [code]


Diff-clustering: Interpretable embedding example-based clustering

Been Kim, Peter Turney and Peter Clark
under review


Mind the Gap: A Generative Approach to Interpretable Feature Selection and Extraction

Been Kim, Finale Doshi-Velez and Julie Shah
Neural Information Processing Systems 2015
[pdf] [variational inference in gory detail] [bibtex]


iBCM: Interactive Bayesian Case Model Empowering Humans via Intuitive Interaction

Been Kim, Elena Glassman, Brittney Johnson and Julie Shah
coming soon (see my thesis for details).


Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification

Been Kim, Cynthia Rudin and Julie Shah
Neural Information Processing Systems 2014
[pdf] [poster] [bibtex]

This work was featured on MIT News and the MIT front page spotlight.


Scalable and interpretable data representation for high-dimensional complex data

Been Kim, Kayur Patel, Afshin Rostamizadeh and Julie Shah
AAAI Conference on Artificial Intelligence 2015
[pdf] [bibtex]


A Bayesian Generative Modeling with Logic-Based Prior

Been Kim, Caleb Chacha and Julie Shah
Journal of Artificial Intelligence Research 2014
[pdf] [bibtex]


Learning about Meetings

Been Kim and Cynthia Rudin
Data Mining and Knowledge Discovery Journal 2014
[arxiv] [pdf] [bibtex]

This work was featured in the Wall Street Journal.


Inferring Robot Task Plans from Human Team Meetings: A Generative Modeling Approach with Logic-Based Prior

Been Kim, Caleb Chacha and Julie Shah
AAAI Conference on Artificial Intelligence 2013
[pdf] [bibtex] [video]

This work was featured in:
"Introduction to AI" course at Harvard (COMPSCI180: Computer science 182) by Barbara J. Grosz.
[Course website]
"Human in the loop planning and decision support" tutorial at AAAI15 by Kartik Talamadupula and Subbarao Kambhampati.
[slides from the tutorial]


Multiple Relative Pose Graphs for Robust Cooperative Mapping

Been Kim, Michael Kaess, Luke Fletcher, John Leonard, Abraham Bachrach, Nicholas Roy, and Seth Teller
International Conference on Robotics and Automation 2010
[pdf] [bibtex] [video]


Human-inspired Techniques for Human-Machine Team Planning

Julie Shah, Been Kim and Stefanos Nikolaidis
AAAI Technical Report - Human Control of Bioinspired Swarms 2013
[pdf] [bibtex]



Interactive and Interpretable Machine Learning Models for Human Machine Collaboration

Been Kim
PhD Thesis 2015
[pdf] [bibtex] [slides]