PyData & Scientific Libraries Stack

A Beginner's Guide to State Space Modeling

Tutorial

Level: Novice
Company/Institute: Miami Marlins

Abstract

**State Space Models** (SSMs) are powerful tools for time series analysis, widely used in finance, economics, ecology, and engineering. They allow researchers to encode structural behavior into time series models, including *trends*, *seasonality*, *autoregression*, and *irregular fluctuations*, to name just a few. Many workhorse time series models, including ARIMA, VAR, and ETS, are special cases of the general state space framework. In this practical, hands-on tutorial, attendees will **learn how to leverage PyMC's new state space modeling** capabilities (`pymc_extras.statespace`) to build, fit, and interpret Bayesian state space models. Starting from fundamental concepts, we'll **explore several real-world use cases**, demonstrating how SSMs help tackle common time series challenges, such as handling missing observations, integrating external regressors, and generating forecasts.

Prerequisites

Prior experience with PyMC is not required but will be beneficial.

Optional additional resources:

  • [Introduction to PyMC state space module](https://www.youtube.com/watch?v=G9VWXZdbtKQ)
  • [Podcast episode on PyMC's state space module](https://learnbayesstats.com/episode/124-state-space-models-structural-time-series-jesse-grabowski)
  • [PyMC State Space Module GitHub Repository](https://github.com/pymc-devs/pymc-extras/tree/main/pymc_extras/statespace)

Description

State Space Models offer a structured yet flexible framework for time series analysis. They elegantly handle latent processes like trends, seasonality, and noisy observations, making them particularly valuable in real-world applications.

We'll start with a brief overview of the theory behind SSMs, followed by practical examples where participants will:

  • Understand the components of SSMs, including observation and state equations.
  • Learn how to specify and fit SSMs using PyMC's state space module.
  • Implement a modeling workflow using a survey data example, showing how to use SSMs to model the data and generate predictions.
  • Explore advanced topics such as incorporating external regressors, generating forecasts, and building custom models.

Target Audience

This tutorial is aimed at data scientists, statisticians, and data analysts with a basic understanding of statistics and Python, who are interested in expanding their toolkit with Bayesian time series methods. Prior experience with PyMC is not required but will be beneficial.

Takeaways

By the end of this tutorial, attendees will:

  • Understand the theoretical foundations of State Space Models.
  • Be able to implement common SSMs (local level, trend, and seasonal models) in PyMC.
  • Evaluate and interpret Bayesian state space models using PyMC.
  • Appreciate practical scenarios where SSMs outperform traditional time series approaches.

Background Knowledge Required

Basic understanding of probability and statistics, and familiarity with Python. Prior experience with PyMC is not required but will be beneficial.

Materials Distribution

All tutorial materials, including notebooks and datasets, will be made available via a GitHub repository.

Outline

0 - 10 min: Introduction to State Space Models

  • What are SSMs, and why use them?

10 - 25 min: State Space Model Fundamentals

  • Observation and state equations.
  • Latent states, Kalman filters, and smoothing in Bayesian frameworks.
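
For orientation, the observation and state equations named above can be written for the linear Gaussian case as follows. The symbols ($y_t$, $\alpha_t$, $Z$, $T$, $R$, $H$, $Q$) follow a common textbook convention and are an assumption here; the tutorial materials may use different notation.

$$
\begin{aligned}
y_t &= Z\,\alpha_t + \varepsilon_t, & \varepsilon_t &\sim \mathcal{N}(0, H) \\
\alpha_{t+1} &= T\,\alpha_t + R\,\eta_t, & \eta_t &\sim \mathcal{N}(0, Q)
\end{aligned}
$$

Here $y_t$ is the observed series and $\alpha_t$ is the latent state. The local-level model is the simplest instance: the state is a scalar level $\mu_t$, with $Z = T = R = 1$, $H = \sigma_\varepsilon^2$, and $Q = \sigma_\eta^2$, which gives $y_t = \mu_t + \varepsilon_t$ and $\mu_{t+1} = \mu_t + \eta_t$.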

25 - 55 min: Implementing SSMs with PyMC (Hands-On)

  • Setting up a local-level model in PyMC.
  • Extending models to incorporate trends and seasonality.
  • Posterior inference: interpreting results and uncertainty.
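
To make the local-level model concrete, here is a minimal sketch of the Kalman filter recursion written in plain NumPy. It is not the `pymc_extras.statespace` API; the function name `local_level_filter`, its parameters, and the simulated data are all hypothetical choices for illustration. The noise scales are fixed by hand here, whereas a Bayesian fit would place priors on them and could use a log-likelihood like the one returned below.

```python
import numpy as np


def local_level_filter(y, sigma_eps, sigma_eta, m0=0.0, p0=1e6):
    """Kalman filter for y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t.

    Returns the filtered means and variances of mu_t and the
    Gaussian log-likelihood of the observations.
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    filtered_mean = np.empty(n)
    filtered_var = np.empty(n)
    loglike = 0.0
    m, p = m0, p0  # predicted mean and variance of mu_t before seeing y_t
    for t in range(n):
        v = y[t] - m            # one-step-ahead prediction error
        f = p + sigma_eps**2    # variance of the prediction error
        k = p / f               # Kalman gain
        loglike += -0.5 * (np.log(2.0 * np.pi * f) + v**2 / f)
        # Update step: condition on y_t
        filtered_mean[t] = m + k * v
        filtered_var[t] = (1.0 - k) * p
        # Prediction step: propagate the level to t + 1
        m = filtered_mean[t]
        p = filtered_var[t] + sigma_eta**2
    return filtered_mean, filtered_var, loglike


# Simulated data (assumed values, for illustration only)
rng = np.random.default_rng(0)
level = np.cumsum(rng.normal(0.0, 0.1, size=200))
y = level + rng.normal(0.0, 0.5, size=200)

mean, var, ll = local_level_filter(y, sigma_eps=0.5, sigma_eta=0.1)
```

The large default `p0` stands in for a diffuse prior on the initial level, so the first filtered value is essentially the first observation.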

55 - 75 min: Advanced State Space Modeling (Hands-On)

  • Dealing with missing data and irregular intervals.
  • Adding external covariates (regression components).
  • Model diagnostics and posterior predictive checks.
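
One reason state space models cope well with missing observations is that the Kalman filter can skip the update step when a value is absent and carry only the prediction forward. The sketch below shows this for a local-level model in plain NumPy, and treats forecasting as a run of missing values appended to the series. The function name `filter_with_gaps` and its parameters are hypothetical, and this is not how `pymc_extras.statespace` exposes the feature.

```python
import numpy as np


def filter_with_gaps(y, sigma_eps, sigma_eta, horizon=0, m0=0.0, p0=1e6):
    """One-step-ahead predictions for a local-level model with NaN gaps.

    Missing values (NaN) skip the update step. Forecasts are obtained by
    appending `horizon` missing values to the end of the series.
    """
    y = np.concatenate([np.asarray(y, dtype=float), np.full(horizon, np.nan)])
    n = y.shape[0]
    pred_mean = np.empty(n)
    pred_var = np.empty(n)
    m, p = m0, p0  # predicted mean and variance of the level
    for t in range(n):
        pred_mean[t] = m
        pred_var[t] = p + sigma_eps**2  # predictive variance of y_t
        if not np.isnan(y[t]):
            # Update step, only when y_t was actually observed
            k = p / (p + sigma_eps**2)
            m = m + k * (y[t] - m)
            p = (1.0 - k) * p
        # Prediction step: uncertainty about the level always grows
        p = p + sigma_eta**2
    return pred_mean, pred_var


# Two missing points in the middle, then a three-step forecast
y = np.array([1.0, np.nan, np.nan, 1.2])
pred_mean, pred_var = filter_with_gaps(y, sigma_eps=0.5, sigma_eta=0.1, horizon=3)
```

Inside a gap and over the forecast horizon, the predicted mean stays flat while the predictive variance grows by `sigma_eta**2` per step, which is the widening uncertainty band one would expect from a random-walk level.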

75 - 85 min: Real-world Application Case Study

  • Demonstrating an end-to-end modeling example with real data.
  • Discussing best practices for practical time series modeling.

85 - 90 min: Wrap-up and Interactive Q&A

  • Open floor for questions and further resources.

Additional Resources

We believe this tutorial will empower participants with practical knowledge of state space modeling in PyMC, enabling them to effectively analyze complex time series data using Bayesian approaches.

Speakers

Alexandre Andorra

Senior Applied Scientist

⚾ Senior Applied Scientist @ Miami Marlins
🎙️ Creator @ LearnBayesStats Podcast
📊 Cofounder @ PyMC Labs
👨‍🏫 Teacher @ Intuitive Bayes

Jesse Grabowski

Jesse Grabowski is a PhD candidate at Paris 1 Panthéon-Sorbonne. He is also a principal data scientist at PyMC Labs and a core developer of PyMC, PyTensor, and related packages. His research areas include time series modeling, macroeconomics, and finance.
