Talk
As machine learning models become more accurate and complex, explainability remains essential. Explainability helps not just with trust and transparency but also with generating actionable insights and guiding decision-making. One way of interpreting model outputs is SHapley Additive exPlanations (SHAP). In this talk, I will go through the concept of Shapley values and their mathematical intuition, and then walk through a few real-world examples for different ML models. Attendees will gain a practical understanding of SHAP's strengths and limitations and how to use it effectively to explain model predictions in their own projects.
Basic understanding of:
- Tree-based models
- Neural networks
This talk is for Data Scientists and Machine Learning Engineers at any level. Basic knowledge of machine learning is useful but not necessary.
Attendees will learn why explainable machine learning is important and how to use and interpret SHAP values for their models.
ML models behave as black boxes in most scenarios: the model produces an output, but it is difficult to derive actionable insights from that output directly, largely because we have no visibility into which features contribute most to the model's behavior internally. SHAP provides a way to explain model predictions and can be an important tool in a data scientist's toolbox.
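For reference, SHAP builds on the Shapley value from cooperative game theory: a feature's contribution is its marginal contribution to the prediction, averaged over every subset of the remaining features. In the standard formulation, for a feature set $F$ and a model $f_S$ restricted to a subset $S$, the attribution for feature $i$ is

$$\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,\left(|F| - |S| - 1\right)!}{|F|!} \left[ f_{S \cup \{i\}}\left(x_{S \cup \{i\}}\right) - f_S(x_S) \right]$$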
In this talk, we will begin by motivating the need for explainability and why it is essential to look beyond what the model outputs. We will then briefly review the mathematical intuition behind Shapley values and their origins in game theory. After that, we will walk through a couple of case studies of tree-based and neural-network-based models, focusing on interpreting SHAP values through various plots. Finally, we will discuss best practices for interpreting SHAP visualizations, handling large datasets, and common pitfalls to avoid.
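To give a flavor of the workflow the case studies cover, here is a minimal, illustrative sketch using the shap library with a tree ensemble; the dataset and model below are stand-ins for the example, not the ones used in the actual talk.

```python
# Illustrative sketch of a tree-based SHAP workflow (dataset and model
# are placeholders, not the talk's case-study materials).
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Fit any tree ensemble; TreeExplainer computes exact Shapley values
# for tree models efficiently.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer(X)  # Explanation object: one row per sample

# Global view: which features matter most across the whole dataset.
shap.plots.beeswarm(shap_values)

# Local view: how each feature pushed one prediction up or down.
shap.plots.waterfall(shap_values[0])
```

For the neural-network case, shap's DeepExplainer (or the model-agnostic KernelExplainer) plays the analogous role to TreeExplainer.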
Staff Data Scientist
Avik Basu is a Staff Data Scientist passionate about building intelligent, scalable systems that blend research with practical impact. With extensive experience in time series modeling, anomaly detection, and explainable AI, he focuses on making machine learning robust, interpretable, and production-ready. Avik is a frequent speaker at conferences such as PyCascades, PyData, and KubeCon, where he shares insights on topics including reproducible ML workflows and ML-driven observability. He is also an active contributor to the open-source ecosystem, serving as a maintainer of the real-time data processing framework Numaflow and as a reviewer for scientific Python projects. Outside of work, he explores the intersection of machine learning, personal finance, and open-source tools, aiming to build software that is accessible, self-hostable, and privacy-focused. He is driven by a strong belief in community, transparency, and empowering others through education and mentorship.