[Analytic Block] Sequencing Transactions

Dillon_Morrison · 01-16-2015 05:03 PM

##About This Block

Nearly all businesses want customers to come back. It’s often much easier to increase your profitability by selling more to your existing customers than to acquire new customers. An interesting field to create for this is ORDERS Order Sequence Number.

ORDERS means its an attribute of an order. Order Sequence Number means that we are computing an index for this order relative to the other orders a user has placed. For a buyer, their first order will be numbered ‘1’, their second order will be numbered ‘2’, and so on.

Sequencing Transactions is an excellent foundation to build other analytical patterns on. Once you have ordered transaction, you gain insights into new facets of customer behavior, and answer questions such as (among others):

What items are driving return purchasers?
What is the first-order behavior of our customers?
What percent of business comes from new customers? Returning Customers? Power Users?
What type of products draw people to your company?

##Ideal Data Types

This Block works great a large portion of companies in the e-commerce, consumer goods, marketing, healthcare, finance, and gaming spaces, among other verticals. The logic can be applied to any usage or transactional data where actions can be mapped to users. Sample usage and transactional data includes orders, order items, sessions, visits, postings, donations, events, etc…

Expected Output

Here’s an example of insights you can gather once you’ve sequenced transactions. This table shows us how many of our orders are a customers’ first, second, third (and so on). We can use this information to form the basis of business decisions. Do we want to incentivize repeat orders? By digging into users who have ordered at least 3 times, what can we learn about how to encourage high numbers of orders?

Please note that this data is fictitious, and may not be representative of standard metric levels.

###Orders by Order Sequence Number, the last 10 days.

Explore Data in Full Screen

###Breakdown of Orders by Order Sequence Number (Up to 10 Orders)

Explore Data in Full Screen

As we see in this case, the majority of purchases are from first-time orders. Ideally we’d like to see higher values for higher numbers, which would indicate that customers are satisfied and coming back to buy from you again.

Try it Yourself

How it’s Done:

There are two samples of code at the bottom of this article. The first utilizes a correlated subquery to count the number of orders by this user, before the current order. In order for this to work you will need an index on orders.user_id. This pattern is compatible MySQL and SQL Server. The second example utilizes a window function. Window functions are compatible with PL/pgSQL (PostgreSQL, Redshift, Greenplum), Microsoft SQL Server (2012+), Vertica, among others.

###Correlated Subquery

First, create a derived table with two columns:

order_id, squence_number

Then, for each order in the orders table, create a row in the order_user_sequence table. The index the table and join it into orders.

- connection: thelook     ## ignore
- scoping: true           ## ignore

- explore: orders
  hidden: true            ## ignore

- view: orders
  fields:
  - dimension: id
    type: int
    primary_key: true
    sql: ${TABLE}.id

  - dimension: user_order_sequence_number
    type: number
    sql: |
      (
        SELECT COUNT(*)
        FROM orders o
        WHERE o.id <= ${TABLE}.id
          AND o.user_id = ${TABLE}.user_id
      )

  - measure: count
    type: count
    drill_fields: [id, created_time]

Window Functions

Window Functions are recommended for large tables.

- explore: orders
  joins:
  - join: order_user_sequence
    foreign_key: id

- view: orders
  fields:
  - dimension: id
    type: int
    primary_key: true
    sql: ${TABLE}.id

  - measure: count
    type: count
    detail: [id, created_time]

# Compute the order's sequence over the users lifetime.  Is this the first, second, third, etc.

- view: order_user_sequence
  derived_table:
    sql: |
      SELECT
        id AS order_id
        , ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY created_at) as user_order_sequence_number
      FROM orders
    
  fields:
  - dimension: order_id
    primary_key: true
    sql: ${TABLE}.order_id

  - dimension: user_order_sequence_number
    type: int
    sql: ${TABLE}.user_order_sequence_number

Vaite

Hi,

I received error message ‘PSQLException: ERROR: This type of correlated subquery pattern is not supported yet’.
Is there a workaround?
Thanks

lloydtabb

@Vaite, if your SQL dialect doesn’t support coorelated sub queries, you can compute sequence nubmers with a window function instead. Here is a writeup on how to do it.

Sequencing Transactions With Window Functions

Vaite

Thanks, I’ll do that!

segahm

Also, for larger tables it is not recommended to use correlated subqueries. Window functions (in Lloyd’s link) are preferred.