Question

Calculating User Repurchase Rate

  • 9 February 2018
  • 1 reply
  • 2761 views

Userlevel 4

Definition:


How frequently do your users, who have already purchased an item, purchase again?


Example


Mary buys shoes on Monday.

Steve buys socks on Monday.

John buys a tie on Monday.

Elena buys a hat on Tuesday.


Mary buys a dress on Friday the same week. (5 days later)

Steve buys a belt 57 days after he bought his shoes. (57 days later)

Mary buys another dress the same day as Steve. (57 days later)

John doesn’t buy again.

Elena buys socks the following Wednesday. (8 days later)


The 7 day repurchase rate is 25%. Only Mary (1) of all customers (4) bought within 7 days.

The 30 day RPR is 50%. Mary and Elena (2) of all customers (4) bought within 30 days.

The 60 day RPR is 75%. Mary, Steve and Elena (3) of all customers (4) bought within 60 days.


This means that within 60 days of a user originally purchasing, they’re 75% likely to purchase something again. Note that Mary bought 2 things within 60 days, but we don’t count her towards that percentage. We can, but for now let’s not to keep it simple.


Actionable Insights:

Understanding the answer to a question such as “How long after my users initially make an order will I see an organic drop off?”



So we see that users aren’t super likely to buy again within a month, but within 2 months of their first purchase, they are ~35% likely to rebuy. Between the 3rd and 4th month, there’s an organic drop off.


The resulting action might be to send a promotion via email or Twilio using Looker’s Data Actions.


I’ve also included a Templated Filter in the LookML which will expose a Filter option to users if they want to understand a question such as “For users who buy X product, how long after they’ve bought it will they buy that same product again?”


Example:


LookML


view: order_items_repurchase_facts {
derived_table: {
# sql_trigger_value: SELECT CURRENT_DATE() ;; --uncomment for persistence, but be careful of Templated Filter
sql: SELECT ROW_NUMBER() OVER(ORDER BY ordered) as RN,*
FROM
( SELECT
order_items.user_id as user_id,
order_items.id AS order_id,
p.category as category,
MIN(order_items.created_at) OVER(PARTITION BY order_items.user_id) as first_ordered_date,
order_items.created_at as ordered,
COUNT(order_items.id) OVER(partition by order_items.user_id) as lifetime_orders,
ROW_NUMBER() OVER(PARTITION BY order_items.user_id ORDER BY order_items.created_at) as order_sequence_number,
LEAD(order_items.created_at) OVER(partition by order_items.user_id ORDER BY order_items.created_at) as second_created_at,
DATEDIFF(DAY,CAST(order_items.created_at as date),CAST(LEAD(order_items.created_at) over(partition by order_items.user_id ORDER BY order_items.created_at) AS date)) as repurchase_gap,
DATEDIFF(DAY, CURRENT_DATE,CAST(MIN(order_items.created_at) OVER(partition by order_items.user_id) as DATE)) as days_since_first_order
FROM order_items
JOIN inventory_items ii ON order_items.inventory_item_id=ii.id
JOIN products p on ii.product_id = p.id
WHERE {% condition product_category %} p.category {% endcondition %}
GROUP BY 1,2,3,5
)
;;
}

filter: product_category {
suggest_dimension: category
}
dimension: category {sql:${TABLE}.category ;; type:string}

dimension: row_number {
primary_key: yes
hidden: yes
sql: ${TABLE}.RN ;;
}
measure: count {
hidden:yes
type: count
drill_fields: [detail*]
}
dimension: user_id {
hidden: yes
}

dimension: lifetime_orders {
sql: ${TABLE}.lifetime_orders ;;
hidden: yes
}

dimension_group: first_ordered_date {
sql: ${TABLE}.first_ordered_date ;;
type: time
timeframes: [month,month_name,year]
}
dimension: lifetime_orders_tier {
sql: ${lifetime_orders} ;;
type: tier
tiers: [1,2,3,4,5,6,7,8,9,10]
style: integer
}

dimension: order_id {
# primary_key: yes
type: string
sql: ${TABLE}.order_id ;;
}

dimension: ordered {
hidden:yes
type: string
sql: ${TABLE}.ordered ;;
}

dimension: order_sequence_number {
type: string
sql: ${TABLE}.order_sequence_number ;;
}

dimension: second_created_at {
hidden: yes
type: string
sql: ${TABLE}.second_created_at ;;
}

dimension: repurchase_gap {
type: number
sql: ${TABLE}.repurchase_gap ;;
}

dimension: repurchase_tier {
type: tier
tiers: [30,60,90,120,150,180]
style: integer
sql: ${repurchase_gap} ;;
}

dimension: days_since_first_order {
type: number
sql: ${TABLE}.days_since_first_order ;;
}
dimension: days_since_first_order_tier {
type: tier
tiers: [30,60,90,120,150,180]
style: integer
sql: ${days_since_first_order} ;;
}

dimension: repurchase_made{
type: yesno
hidden: yes
sql: ${repurchase_gap} IS NOT NULL ;;
}

measure: count_repurchases {
description: "Count of unique users who have made more than 1 purchase"
type: count_distinct
drill_fields: [detail*]
sql: ${user_id} ;;
filters: {
field: repurchase_made
value: "yes"
}
filters: {
field: order_sequence_number
value: "2"
}
}

### Using an Average Distinct measure will get the average time it takes for a user to make a 2nd purchase
### and disregard duplicate users and subsequent orders
measure: average_repurchase_gap {
description: "The average time in days it takes for users to make a subsequent purchase"
drill_fields: [detail*]
type: average_distinct
sql: ${repurchase_gap} ;;
sql_distinct_key: ${order_id} ;;
filters: {
field: order_sequence_number
value: "2"
}

}

### These dimensions will check if a user's 2nd purchase was within certain time intervals
dimension: repurchase_30 {type:yesno sql:${repurchase_gap} <=30 AND ${order_sequence_number}=2;; hidden:yes}
dimension: repurchase_60 {type:yesno sql:${repurchase_gap} <=60 AND ${order_sequence_number}=2;; hidden:yes}
dimension: repurchase_90 {type:yesno sql:${repurchase_gap} <=90 AND ${order_sequence_number}=2;; hidden:yes}
dimension: repurchase_120 {type:yesno sql:${repurchase_gap} <=120 AND ${order_sequence_number}=2;; hidden:yes}
dimension: repurchase_150 {type:yesno sql:${repurchase_gap} <=150 AND ${order_sequence_number}=2;; hidden:yes}
dimension: repurchase_180 {type:yesno sql:${repurchase_gap} <=180 AND ${order_sequence_number}=2;; hidden:yes}

### Count of repurchases by users in N days since first purchase
measure: count_repurchases_first_30_days {
label: "1m"
group_label: "Count Repurchases"
type: count_distinct
sql: ${user_id} ;;
drill_fields: [detail*]
filters: {
field: repurchase_30
value: "yes"
}
}
measure: count_repurchases_first_60_days {
label: "2m"
group_label: "Count Repurchases"
type: count_distinct
sql: ${user_id} ;;
drill_fields: [detail*]
filters: {
field: repurchase_60
value: "yes"
}
}
measure: count_repurchases_first_90_days {
label: "3m"
group_label: "Count Repurchases"
type: count_distinct
sql: ${user_id} ;;
drill_fields: [detail*]
filters: {
field: repurchase_90
value: "yes"
}
}
measure: count_repurchases_first_120_days {
label: "4m"
group_label: "Count Repurchases"
type: count_distinct
sql: ${user_id} ;;
drill_fields: [detail*]
filters: {
field: repurchase_120
value: "yes"
}
}
measure: count_repurchases_first_150_days {
label: "5m"
group_label: "Count Repurchases"
type: count_distinct
sql: ${user_id} ;;
drill_fields: [detail*]
filters: {
field: repurchase_150
value: "yes"
}
}
measure: count_repurchases_first_180_days {
label: "6m"
group_label: "Count Repurchases"
type: count_distinct
sql: ${user_id} ;;
drill_fields: [detail*]
filters: {
field: repurchase_180
value: "yes"
}
}
#### Repurchase rates

measure: repurchase_rate {
group_label: "Repurchase Rates"
type: number
drill_fields: [detail*]
value_format_name: percent_1
sql: 1.0*${count_repurchases}/nullif(${count_customers},0) ;;
}
measure: repurchase_rate_30 {
label: "1m"
group_label: "Repurchase Rates"
type: number
drill_fields: [detail*]
value_format_name: percent_1
sql: 1.0*${count_repurchases_first_30_days}/nullif(${count_customers},0) ;;
}
measure: repurchase_rate_60 {
label: "2m"
group_label: "Repurchase Rates"
type: number
drill_fields: [detail*]
value_format_name: percent_1
sql: 1.0*${count_repurchases_first_60_days}/nullif(${count_customers},0) ;;
}
measure: repurchase_rate_90 {
label: "3m"
group_label: "Repurchase Rates"
type: number
drill_fields: [detail*]
value_format_name: percent_1
sql: 1.0*${count_repurchases_first_90_days}/nullif(${count_customers},0) ;;
}
measure: repurchase_rate_120 {
label: "4m"
group_label: "Repurchase Rates"
type: number
drill_fields: [detail*]
value_format_name: percent_1
sql: 1.0*${count_repurchases_first_120_days}/nullif(${count_customers},0) ;;
}
measure: repurchase_rate_150 {
label: "5m"
group_label: "Repurchase Rates"
type: number
drill_fields: [detail*]
value_format_name: percent_1
sql: 1.0*${count_repurchases_first_150_days}/nullif(${count_customers},0) ;;
}
measure: repurchase_rate_180 {
label: "6m"
group_label: "Repurchase Rates"
type: number
drill_fields: [detail*]
value_format_name: percent_1
sql: 1.0*${count_repurchases_first_180_days}/nullif(${count_customers},0) ;;
}
measure: count_customers {
drill_fields: [detail*]
type: count_distinct
sql: ${user_id} ;;
}
measure: count_orders {
drill_fields: [detail*]
type: count_distinct
sql: ${order_id} ;;
}
measure: percent_of_customers {
drill_fields: [detail*]
type: percent_of_total
sql: ${count_customers} ;;
}
set: detail {
fields: [
user_id,
order_id,
ordered,
order_sequence_number,
lifetime_orders,
first_ordered_date_month
]
}
}

This simple View can easily be joined to other Views as part of a larger Explore so you can now leverage these metrics to understand these rates across products, brands and demographics:


explore: order_items {
join: order_items_repurchase_facts {
type: left_outer
sql_on: ${order_items.id}=${order_items_repurchase_facts.order_id} ;;
relationship: one_to_one
}
join: inventory_items {
type: left_outer
sql_on: ${order_items.inventory_item_id} = ${inventory_items.id} ;;
relationship: many_to_one
}

join: users {
type: left_outer
sql_on: ${order_items.user_id} = ${users.id} ;;
relationship: many_to_one
}

join: products {
type: left_outer
sql_on: ${inventory_items.product_id} = ${products.id} ;;
relationship: many_to_one
}

join: distribution_centers {
type: left_outer
sql_on: ${products.distribution_center_id} = ${distribution_centers.id} ;;
relationship: many_to_one
}
}



(1) Which brand do Men repeatedly buy more than others?





(2) Where are these high repurchasers?



1 reply

I am trying to adapt this block to my company’s data structure.  Can you please describe the underlying data for “order_items”, “inventory_items” and “products” as used here?  

Our data is comprised of (1) ORDERS: Each user’s orders which can contain many “order_lines” (products), (2) ORDER_LINES: Each product for each order, and (3) PRODUCTS: product description and categories of products.  

They are connected in the following way: ORDERS.order_id = ORDER_LINES.order_id; ORDER_LINES.product_id = PRODUCTS_product_id; ORDERS.user_id = ORDER_LINES.user_id.

If I replace the “order_items” in the Select statement with order “orders” I am able to get the correct measures for 1st, 2nd, and repurchase gap by user ONLY if I do not join with ORDER_LINES.  However, I need to join with ORDER_LINES to get category.

Thank you!

Reply