Talk
As machine learning models become more accurate and complex, explainability remains essential. Explainability helps not just with trust and transparency but also with generating actionable insights and guiding decision-making. One way of interpreting model outputs is SHapley Additive exPlanations (SHAP). In this talk, I will go through the concept of Shapley values and their mathematical intuition, and then walk through a few real-world examples for different ML models. Attendees will gain a practical understanding of SHAP's strengths and limitations and how to use it effectively to explain model predictions in their own projects.
Basic understanding of:
- Tree-based models
- Neural networks
This talk is for Data Scientists and Machine Learning Engineers at any level. Basic knowledge of machine learning is useful but not necessary.
Attendees will learn why explainable machine learning is important and how to use and interpret SHAP values for their models.
ML models behave as black boxes in most scenarios: the model produces an output, but it is difficult to derive actionable insights from that output directly, largely because we have no visibility into which features contribute most to the model's behavior internally. SHAP provides a way to explain model predictions and can be an important tool in a data scientist's toolbox.
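For reference, SHAP builds on the Shapley value from cooperative game theory: a feature's contribution is its marginal contribution to the prediction, averaged over every subset of the remaining features. In the standard formulation, for a feature set $F$ and a model $f_S$ restricted to a subset $S$, the attribution for feature $i$ is

$$\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,\left(|F| - |S| - 1\right)!}{|F|!} \left[ f_{S \cup \{i\}}\left(x_{S \cup \{i\}}\right) - f_S(x_S) \right]$$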
In this talk, we will begin by motivating the need for explainability and why it is essential to look beyond what the model outputs. We will then briefly review the mathematical intuition behind Shapley values and their origins in game theory. After that, we will walk through a couple of case studies of tree-based and neural-network-based models, focusing on interpreting SHAP values through various plots. Finally, we will discuss best practices for interpreting SHAP visualizations, handling large datasets, and common pitfalls to avoid.
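To give a flavor of the workflow the case studies cover, here is a minimal, illustrative sketch using the shap library with a tree ensemble; the dataset and model below are stand-ins for the example, not the ones used in the actual talk.

```python
# Illustrative sketch of a tree-based SHAP workflow (dataset and model
# are placeholders, not the talk's case-study materials).
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Fit any tree ensemble; TreeExplainer computes exact Shapley values
# for tree models efficiently.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer(X)  # Explanation object: one row per sample

# Global view: which features matter most across the whole dataset.
shap.plots.beeswarm(shap_values)

# Local view: how each feature pushed one prediction up or down.
shap.plots.waterfall(shap_values[0])
```

For the neural-network case, shap's DeepExplainer (or the model-agnostic KernelExplainer) plays the analogous role to TreeExplainer.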
Staff Data Scientist
Avik Basu is a Staff Data Scientist passionate about building intelligent, scalable systems that blend research with practical impact. With extensive experience in time series modeling, anomaly detection, and explainable AI, he focuses on making machine learning robust, interpretable, and production-ready. Avik is a frequent speaker at conferences such as PyCascades, PyData, and KubeCon, where he shares insights on topics including reproducible ML workflows and ML-driven observability. He is also an active contributor to the open-source ecosystem, serving as a maintainer of the real-time data processing framework Numaflow and as a reviewer for scientific Python projects. Outside of work, he explores the intersection of machine learning, personal finance, and open-source tools, aiming to build software that is accessible, self-hostable, and privacy-focused. He is driven by a strong belief in community, transparency, and empowering others through education and mentorship.