Modeling conversion funnels in Looker, a MySQL subselect approach

  • 28 March 2022
  • 0 replies
  • 27 views

Userlevel 5

This content, written by Joshua Moskovitz, was initially posted in Looker Blog on Jul 7, 2014. The content is subject to limited support.

In this post we will describe the process to model an event funnel in LookML. The high-level steps of this process include:

  1. Define the stages of the funnel
  2. Define the duration of stages in the funnel (the interval of time the funnel is evaluated upon)
  3. Create a LookML view, "funnel facts", based on a derived table
  4. Define the dimensions and measures to expose funnel facts for exploration in Looker

Events table

For the sake of this blog post assume all events in your DB are stored in one large table called "events", which contains all of the events for all of the visitors. The important columns:

  1. created_at: the time of the event
  2. user_id: if the visitor signs up, this is their user id
  3. visitor_id: a cookie that gets set for a visitor
  4. action: the name of the event

Part 1: Define all the steps within your funnel

In this example the 'action' is a step in the funnel. Let's assume that our funnel consists of 4 steps.

  1. a user lands on the product page, action = 'product_view'.
  2. a user can select a product type, action = 'select_product_type'.
  3. a user adds the chosen product to the cart, action = 'add_to_cart'.
  4. a user attempts a checkout, action = 'checkout'.

We want to produce a table that looks like the following example. For every visitor ID we'll include a column for each of our steps. If the step occurred we'll place the timestamp of the event in the column. If the step did not occur within our specified time interval we'll use NULL instead.

Part 2 - Define a subselect for each step in the funnel, grouped by visitor_id

Next, we write a subselect statement that defines each step in the funnel, grouped by visitor ID. We'll use these definitions later as we fill in the LookML to create a derived table for the funnel.

Optional: Determine user_id (if exists)

We can also include a final subselect to determine the user_id (if there is one) for a visitor_id, however this can slow down the performance of the query. Without it, the entire table can be built in a few seconds on a few million rows.

Part 3: Define JOIN syntax and determine interval of funnel

Next, we will determine the join logic for each step of the funnel, which we will use later in the LookML view file. Note that in this step we can configure the interval of the funnel the following way.

The join is ensuring that the visitor_ids are the same and that the step event, in this case step 3, was fired within the interval of time allotted for our funnel.

Part 4 - Create a derived table in LookML view file

Now we take the SQL subselect statements and JOIN logic from the last two steps to create a derived table in LookML to give us a view of our funnel. We start with our SQL transformation within a derived table LookML view file.

We now have our derived table in the correct format. "Note the last two parameters,indexes and sql_trigger_value.indexes sets the index on visitor_id.sql_trigger_value manages the persistence of this table in the looker_scratch schema." For details, check the LookML documentation for .

Part 5 - Create dimensions from the derived table for analysis in Looker

With our derived table we can now create LookML dimensions that reference this table. We define a dimension for each of the following:

  • visitor_id
  • whether a visitor completed a step
  • the time each step was completed (or NULL if it was not completed)
  • the interval of time between each step and the first (in minutes)

Part 6 - Creating aggregation measures from the dimensions

With our dimensions created, we can create measures that reference them to compute the following:

  • unique visitors
  • how many visitors completed each step
  • percent fall-off between each step
  • the average time spent in each step of the funnel

Querying our funnel

With all of this in place we can now query our funnel (once we've exposed it as a base view). Let's see how our funnel is performing over time.

Various ways to extend this technique further

Dealing with massive elements tables

For cases in which you have a very large events table (hundreds of millions, if not billions of rows) theres an alternate strategy to first summarize all events to an individual time frame (day) and visitor (visitor_id).

In this case we'd transform the events table to the following form:

Time bounded funnels

It's not uncommon for e-commerce companies to look at both lifetime funnels (described above) and much shorter day funnels, especially when examining product changes. We might have a product funnel bounded by the day of a new product release. This would allow users to quickly look at the impact of different in-app changes on fall-off behavior (rather than looking at longer lagged lifetime metrics).

We can limit the funnel interval within the join itself by ensuring that each step was completed within 24 hours of the user completing the first step.

Filtering by User, Device, and OS dimension

To really harness the power of Looker, we could also include a join of the users view to this funnel, which would allow us to group by every dimension in the users view of our project.

Now our funnel analysis base view will include all of the dimensions from the User view. The funnel can be viewed through the context of any particular user dimension.

We could also utilize the fact that there is information about platform and OS associated with each event to see how the funnel performs across those facets. This would require also selecting those values from the events table when creating the funnel view, we can identify from which platform/device the event was fired.

Testing many variations of funnels

Imagine we had many hundreds of different event actions. We could use the methodology outlined above to create many different funnel variations, in order to analyze different sequences of visitors through those funnels.

We've got many different customers at Looker, all analyzing their funnels in interesting ways. The combination of Looker, LookML, and your own eventstream will empower you to easily and rapidly explore your company's key funnels.


0 replies

Be the first to reply!

Reply