[Analytic Block] Measuring Activity and Engagement

  • 12 February 2015
  • 1 reply


About This Block

Tracking users’ activity and engagement allows you to dig beyond a simple user count like Daily Active Users and begin to develop a deeper understanding of how customers are interacting with your product. Using this block will enable you to answer questions like:

  • How much time are users spending in your product?

  • How long does a task take on average?

  • How do different segments of the user population interact with the product? For example, how long do repeat customers take to complete a task compared to first-time buyers?

  • What are the characteristics of the most/least engaged customers?

Ideal Data Types

This block is a great way to analyze event data, which can come from Google Analytics, Segment, Snowplow, or a custom event tracking system that stores individual events as rows.

Expected Output

Suppose I have an e-commerce store logging web traffic in an event table. Approximating product usage per user allows us to dig in and answer questions like: over the past week, what was the total time people spent on my website, and what was the average time each user spent, broken out by hour of day?


We can dig even deeper into usage data by filtering on the type of event and joining in user data so that we can segment our customers. In this example, we look at the daily usage in minutes that users from different traffic sources spend in the /add_to_cart section of our website:



How it’s Done:

Users’ events are typically not uniform in time; rather, they occur in clusters. We take advantage of that fact to approximate the time a user is engaged with the product by making an assumption about the timeframe in which events typically occur. We treat this timeframe as an ‘engagement’ and can think of it as roughly the attention span of the user on our product. In the example above using web pageviews, we use a 2-minute engagement. So if a user comes to our site, looks at one page, and leaves, we call that a 2-minute engagement. If a user comes to our site and views pages over a 30-minute period, loading at least one page every two minutes, we count that as 30 minutes of engagement. By counting up the 2-minute engagements, we approximate total usage time in a very simple way that gives us powerful insight into our customers’ activity.
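The bucketing idea can be sketched in a few lines of Python (the timestamps below are made up for illustration):

```python
import math

ENGAGEMENT_SECONDS = 2 * 60  # 2-minute engagement window

def approximate_minutes(event_epochs):
    """Approximate engaged minutes from a list of event timestamps (epoch seconds)."""
    buckets = {math.floor(ts / ENGAGEMENT_SECONDS) for ts in event_epochs}
    return len(buckets) * (ENGAGEMENT_SECONDS // 60)

# A user who views a single page and leaves: one bucket -> 2 minutes.
print(approximate_minutes([1_000_000]))  # 2

# A user loading a page every 2 minutes across a ~30-minute session:
pageviews = [1_000_000 + i * 120 for i in range(16)]
print(approximate_minutes(pageviews))  # 32 -> roughly the 30 minutes of activity
```

Note that the estimate lands near, not exactly on, the session length; that is the "good enough" trade-off this block makes.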

Selecting an appropriate engagement length will depend on the product and how customers use it. A gaming company may want to use a 1-minute engagement, since players produce frequent events. Here at Looker, we use a 5-minute engagement because we find it better captures the attention span of a typical user. Consider that a day-long engagement would effectively re-create Daily Active Users, while shrinking it to 1 second would start to approximate an event count. Remember, the goal is to approximate and get an answer that is good enough!
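To see how the choice of engagement length shifts the measurement, here is a small Python sketch over hypothetical event times: a tiny bucket approaches a raw event count, while a day-long bucket counts at most one engagement per active day:

```python
import math

def distinct_buckets(event_epochs, bucket_seconds):
    """Count distinct engagement buckets for one user at a given bucket size."""
    return len({math.floor(ts / bucket_seconds) for ts in event_epochs})

# 10 events, one every 90 seconds, all within a single day (made-up data).
events = [1_700_000_000 + i * 90 for i in range(10)]

print(distinct_buckets(events, 1))        # 10 -> one bucket per event (an event count)
print(distinct_buckets(events, 5 * 60))   # 4  -> a handful of 5-minute engagements
print(distinct_buckets(events, 86_400))   # 1  -> one bucket per active day (like DAU)
```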

We want to associate each event with a particular timeframe. We can accomplish this by converting the timestamp into seconds, dividing by the length of the engagement (in seconds) and taking the floor of the result. In Redshift, with a 2-minute engagement, this looks like:

FLOOR(extract(epoch from event.rtime)/(60*2))

We also want to identify each engagement with a particular user. So we concatenate the bucketed time with the user’s id:

event.user_id || FLOOR(extract(epoch from event.rtime)/(60*2))

Finally, we count each distinct user engagement and multiply by our timeframe in order to approximate minutes:

(COUNT(DISTINCT (event.user_id || FLOOR(extract(epoch from event.rtime)/(60*2)))))*2
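As a sanity check, the same counting logic can be reproduced outside Redshift. This sketch uses Python's built-in SQLite (integer division stands in for the FLOOR/extract expression) on a handful of made-up event rows:

```python
import sqlite3

# In-memory table mimicking an event log: user_id plus an epoch-seconds timestamp.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (user_id TEXT, rtime INTEGER)")
conn.executemany(
    "INSERT INTO event VALUES (?, ?)",
    [
        ("alice", 1_000_000),  # alice: two events in the same 2-minute bucket
        ("alice", 1_000_030),
        ("alice", 1_000_500),  # alice: an event in a later bucket
        ("bob",   1_000_010),  # bob: one bucket
    ],
)

# Mirrors: (COUNT(DISTINCT (user_id || FLOOR(epoch/120)))) * 2
row = conn.execute(
    "SELECT COUNT(DISTINCT (user_id || (rtime / 120))) * 2 FROM event"
).fetchone()
print(row[0])  # 6 -> three distinct user-buckets, roughly 6 minutes of total usage
```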

Event tables can grow very large very quickly. This block can easily be integrated into a persistent derived table to stage the data for enhanced performance. See this post from our blog for more information on how to roll this up to a daily usage persistent derived table.

Try it Yourself

- dimension: user_time_bucketed          # Buckets the time into 2-minute increments and combines it with
  hidden: true                           # the user_id to create 2-minute buckets of usage for each user
  type: int
  sql: |
    FLOOR(extract(epoch from event.rtime)/(60*2)) || ${user_id}

- measure: number_of_time_buckets        # Counts the number of 2-minute buckets for each user
  hidden: true
  type: count_distinct
  sql: ${user_time_bucketed}

- measure: approximate_usage_in_minutes  # Multiplies the number of buckets by 2 to get approximate usage in minutes
  type: number
  sql: ${number_of_time_buckets}*2

- measure: number_of_distinct_users
  type: count_distinct
  sql: ${user_id}

- measure: average_usage_minutes_per_user
  type: number
  sql: 1.0*${approximate_usage_in_minutes}/nullif(${number_of_distinct_users},0)
  value_format: '0.0'

1 reply

Does this still work?