Talk
Optimizing user funnels is a common task for data analysts and data scientists. Funnels are not always linear in the real world. Often, the next step depends on earlier responses or actions. This results in complex funnels that can be tricky to analyze. I’ll introduce an open-source Python library I developed that analyzes and visualizes non-linear, conditional funnels using Graphviz and Streamlit. It calculates conversion rates, drop-offs, and time spent on each step, and highlights bottlenecks by color. Attendees will learn how to quickly explore complex user journeys and generate insightful funnel data.
Basic knowledge of Python, pip, and analytics
When we talk about funnels in analytics, most people think of linear funnels, where users move step-by-step through a fixed sequence of actions. But in real-world applications like dynamic forms, on-boarding flows, or diagnostic tools, funnels are often conditional and non-linear. The next step in the journey depends on user input at earlier stages, leading to different paths and variable funnel lengths for every user.
An example is a vehicle pricing tool: while all users answer general questions (e.g., type, mileage), follow-up questions may differ based on previous answers. For instance, only users with electric cars are asked about battery capacity. This branching logic creates challenges for traditional funnel visualization techniques, which mostly treat funnels as linear.
The readily available alternatives fall short:
Visuals like Sankey diagrams are too limited/general and often visually collapse under real-world data messiness (users going back and forth, drop-offs, missing events).
Milestone-based funnels (where you set a few milestones during the funnel to mimic linear funnels) simplify things too much, hiding key details and masking where things actually break down.
As a data analyst, I needed a way to understand and visualize such non-linear flows in a more straightforward and consumable way. Finding no library that met this need out of the box, I created funnelius, a Python library that processes raw event logs into ready-to-consume funnel graphs.
The library accepts a pandas DataFrame with user_id, action and action_timestamp columns. It uses pandas to transform the DataFrame into a format suitable for Graphviz and adds the columns needed to filter and declutter the graph. It then renders the funnel with the dot engine. Features include:
- Calculating key metrics for every step: number of users per step, conversion rates, time spent, percentage of total users and drop-offs.
- Conditional formatting based on different metrics to highlight bottlenecks.
- Comparing against another DataFrame and showing the changes.
- Showing the answers that users gave in each step and calculating the percentage of each answer per step.
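To make the transformation described above concrete, here is a minimal sketch in plain pandas of how an event log could be turned into funnel edges with user counts, conversion rates, and time spent, and then into DOT source for Graphviz. This is not funnelius's actual code or API. The function names `build_edges` and `to_dot` and the sample data are hypothetical, and only the three column names come from the description above. It also assumes each user visits a step at most once.

```python
import pandas as pd

# Hypothetical event log with the three columns named in the text.
events = pd.DataFrame(
    {
        "user_id": [1, 1, 1, 2, 2, 3, 3, 4, 5],
        "action": ["type", "mileage", "price", "type", "battery",
                   "type", "mileage", "type", "type"],
        "action_timestamp": pd.to_datetime(
            [
                "2024-01-01 10:00:00", "2024-01-01 10:00:30", "2024-01-01 10:01:30",
                "2024-01-01 11:00:00", "2024-01-01 11:00:20",
                "2024-01-01 12:00:00", "2024-01-01 12:00:50",
                "2024-01-01 13:00:00",
                "2024-01-01 14:00:00",
            ]
        ),
    }
)


def build_edges(events: pd.DataFrame) -> pd.DataFrame:
    """Aggregate per-user step transitions into funnel edges (illustrative only)."""
    ordered = events.sort_values(["user_id", "action_timestamp"])
    grouped = ordered.groupby("user_id")
    # Pair every event with the same user's next event.
    ordered = ordered.assign(
        next_action=grouped["action"].shift(-1),
        next_timestamp=grouped["action_timestamp"].shift(-1),
    )
    ordered["seconds"] = (
        ordered["next_timestamp"] - ordered["action_timestamp"]
    ).dt.total_seconds()

    # Users who reached each step at all.
    step_users = ordered.groupby("action")["user_id"].nunique()

    # A user's last event has no next action, so it forms no edge.
    transitions = ordered.dropna(subset=["next_action"])
    edges = (
        transitions.groupby(["action", "next_action"])
        .agg(users=("user_id", "nunique"), avg_seconds=("seconds", "mean"))
        .reset_index()
    )
    # Share of users at the source step who moved on to the target step.
    edges["conversion"] = edges["users"] / edges["action"].map(step_users)
    return edges


def to_dot(edges: pd.DataFrame) -> str:
    """Render the edges as DOT source that Graphviz's dot engine can lay out."""
    lines = ["digraph funnel {"]
    for row in edges.itertuples(index=False):
        label = f"{row.users} users ({row.conversion:.0%})"
        lines.append(f'  "{row.action}" -> "{row.next_action}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


edges = build_edges(events)
dot_source = to_dot(edges)
print(dot_source)
```

In this sample, five users reach the "type" step, two continue to "mileage" and one to "battery", so the branch is visible directly in the edge table. The printed DOT text could be saved to a file and rendered with the `dot` command-line tool.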
The graph can be fine-tuned with options such as:
- Only show the top-N routes to declutter the graph
- Show or hide data for dropped users
- Only include users who started from specific steps. If we know that users must have specific steps as a starting point, this helps remove possible data issues if any.
- Define metrics that should be calculated
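As an illustration of the first and third options, the following sketch shows one way such filters could be applied to the raw event log with pandas before any graph is built. It is an assumption about how the filtering might work, not the library's implementation. `filter_events`, `top_n`, and `start_steps` are hypothetical names, and a "route" is taken here to mean a user's full ordered sequence of actions.

```python
import pandas as pd

# Hypothetical per-user paths; user 6 starts mid-funnel, which suggests a data issue.
paths = {
    1: ["type", "mileage", "price"],
    2: ["type", "mileage", "price"],
    3: ["type", "mileage", "price"],
    4: ["type", "battery"],
    5: ["type", "battery"],
    6: ["mileage", "price"],
}
start = pd.Timestamp("2024-01-01")
events = pd.DataFrame(
    [
        (user_id, action, start + pd.Timedelta(minutes=i))
        for user_id, steps in paths.items()
        for i, action in enumerate(steps)
    ],
    columns=["user_id", "action", "action_timestamp"],
)


def filter_events(events, top_n=None, start_steps=None):
    """Keep only events of users matching the route filters (illustrative only)."""
    ordered = events.sort_values(["user_id", "action_timestamp"])
    by_user = ordered.groupby("user_id")["action"]
    # One string per user describing the full ordered path.
    routes = by_user.agg(" > ".join)

    if start_steps is not None:
        # Drop users whose first recorded step is not an allowed entry point.
        first_step = by_user.first()
        routes = routes[first_step.isin(start_steps)]

    if top_n is not None:
        # Keep only users who followed one of the N most common routes.
        top_routes = routes.value_counts().head(top_n).index
        routes = routes[routes.isin(top_routes)]

    return events[events["user_id"].isin(routes.index)]


kept = filter_events(events, top_n=1, start_steps=["type"])
print(sorted(kept["user_id"].unique().tolist()))
```

Filtering on the starting step first means a malformed session such as user 6 does not count towards the route ranking.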
There is also a Streamlit-based UI for adjusting parameters interactively and exporting the funnel analysis as a PDF, as an alternative to doing so programmatically.
This tool can be helpful for data analysts and data scientists with Python knowledge who need to analyse conditional funnels.
GitHub Repository:
https://github.com/yaseenesmaeelpour/funnelius
Senior Data Analyst