Question

How to: Creating a View of Your Customer Without a Dedicated Customer ETL

  • 30 October 2019
  • 0 replies
  • 296 views

Userlevel 2

How to: Creating a View of your Customer Without a Dedicated Customer Data Feed


Originally authored for Looker 6.22.12


Business Requirement:


Your business intelligence platform is by necessity a low-cost application. Budgets for enhancements and additional data sources are very limited. Business users have requested insight into customer behavior but your database does not currently have a CRM-sourced data feed. You will need to use the data available in your sales history table to provide the most insight possible.


Technical Pre-Requisites/Assumptions:


This write up is intended for a technical audience. The write up assumes a strong understanding of LookML as well as write privileges for your instance. All SQL is written for BigQuery.


Although this example is taken from a physical retail and ecommerce business, the approach could be applied to any business in which your sales history contains a key for customer (this could be email, customer name, a foreign key to your CRM system, etc.).


Step 1: Writing the query to define your customer PDT


Author the query that will populate your customer PDT. The code below uses a query/sub-query to this effect:


Sub-query: This query pulls the key for customer and all associated transactions in order (notice the application of the dense_rank( function). Pay special attention to the where clause as it helps to cleanse the data somewhat (disregards invalid or null email addresses, etc.).


Query: Based on the results of the sub-query this outer statement aggregates the results at a customer level and begins to expose insightful customer data points.


The code below assumes the following data structure:



  • date and time are held in separate fields

  • email is the Looker primary key for customer

  • only two different ‘channels’ of consumption exist (physical retail store vs. ecommerce)

  • each transaction is assigned a unique universal_no


Please feel free to copy this code and adapt it to your data. Although the column names will change, the logic and functions are universal.


SELECT
xx.email,
MIN(xx.adate) AS firstpurchasedate,
MAX(xx.adate) AS lastpurchasedate,
DATE_DIFF(CAST(MAX(xx.adate) AS DATE), CAST(MIN(xx.adate) AS DATE), day) AS customerdurationdays,
DATE_DIFF(CAST(CURRENT_DATE() AS DATE), CAST(MAX(xx.adate) AS DATE), day) AS dayslapsed,
COUNT(DISTINCT(xx.alocation)) AS locationcount,
COUNT(DISTINCT(xx.channel)) AS channelcount,
COUNT(DISTINCT(xx.universal_no)) AS transcount
FROM (
SELECT
email AS email,
adate,
atime,
alocation,
universal_no,
CASE
WHEN CAST(alocation AS string) LIKE "%ECOMMERCE%" THEN "Yes"
WHEN CAST(alocation AS string) NOT LIKE "%ECOMMERCE" THEN "No"
END
AS channel,
DENSE_RANK() OVER (PARTITION BY email ORDER BY ADATE ASC, ATIME ASC) AS purchaseorderoldtonew,
DENSE_RANK() OVER (PARTITION BY email ORDER BY ADATE DESC, ATIME DESC) AS purchaseordernewtoold
FROM
twr.salesjournalhistory
WHERE
email IS NOT NULL
AND CAST(email AS string) != ""
AND CAST(email AS string) != " "
AND CAST(email AS string) LIKE "%@%"
GROUP BY
1, 2, 3, 4, 5
ORDER BY
email ASC, adate ASC, atime ASC) AS xx
GROUP BY
1
ORDER BY
1 ASC, 2 ASC

To read more about ranking logic in BigQuery, follow this link.


Step 2: Construction of View to Expose Customer


Code sample below:


  dimension: customeremail {
label: "Cust Email"
primary_key: yes
type: string
sql: ${TABLE}.email ;;
}

dimension: customeremailmask {
label: "Cust Email Masked"
description: "Customer email encrypted as MD5 string. Anonymizes email for presentation/sharing."
html: <font size = 3.5px>{{rendered_value}}</font> ;;
type: string
sql: MD5(${TABLE}.email) ;;
}

dimension: customeremaildomain {
label: "Cust Email Domain"
html: <font size = 3.5px>{{rendered_value}}</font> ;;
type: string
sql: SUBSTR(CAST(${TABLE}.email AS string), STRPOS(email, '@'), LENGTH(email)) ;;
}

dimension: customerchinadomain {
label: "Cust Email China"
description: "Yes/No field indicating if the email domain is China-based"
html: <font size = 3.5px>{{rendered_value}}</font> ;;
type: string
sql: CASE
WHEN SUBSTR(CAST(${TABLE}.emailouter AS string), STRPOS(email, '@'), LENGTH(email)) in
('@qq.com', '@163.com', '@126.com', '@sina.com', '@123.com', '@vip.qq.com', '@sina.cn',
'@sohu.com', '@vip.163.com', '@vip.sina.com', '@qq.con', '@263.net')
then "China Domain"
ELSE "Rest of World" END ;;
}

dimension: customerfirstpurchasedate {
label: "Cust Acq Date"
type: date
sql: ${TABLE}.firstpurchasedate ;;
}

dimension: customeracquisitioncohort {
label: "Cust Acq Cohort"
description: "Acquisition date: YYYY-MM. Enables cohort analysis."
type: string
sql: substr(cast(${TABLE}.firstpurchasedate as string), 0, 7) ;;
}

dimension: customerlastpurchasedate {
label: "Cust Last Purch Date"
description: "Maximum date on which email purchased from any channel"
type: date
sql: ${TABLE}.lastpurchasedate ;;
}

dimension: customerdurationdays {
label: "Cust Lifespan"
description: "Days between first purchase date and last purchase date"
type: number
sql: ${TABLE}.customerdurationdays ;;
}

dimension: customerdayslapsed {
label: "Cust Lapsed Days"
type: number
sql: ${TABLE}.dayslapsed ;;
}

dimension: customerlocationlapsed {
label: "Cust Location Count"
description: "Count of unique locations through which a customer has purchased"
type: number
sql: ${TABLE}.locationcount ;;
}

dimension: customerlifetimepurchasecount {
label: "Cust Lifetime Purchase Count"
description: "Count of unique transactions completed; includes sales and returns"
type: number
sql: ${TABLE}.transcount ;;
}

dimension: customerchannelcount {
label: "Cust OMNI?"
type: string
case: {
when: {
sql: cast(${TABLE}.channelcount as string) = "1" ;;
label: "Not Omni"
}
when: {
sql: cast(${TABLE}.channelcount as string) = "2" ;;
label: "Omni"
}
}
}

measure: customercount {
label: "Customer Count"
value_format: "#,###"
type: count_distinct
sql: ${TABLE}.email ;;
}
}

Step 3: Building Reporting


If this is a new subject area for your users, it is important to build some easily-understood reports that demonstrate the functionality of this subject area. The view defintion above enables reports such as:



  • Customer Acquisition by Month

  • Lapsed Customers (could be used to prescribe action in stores or by email campaigns)

  • Best Customers (based on life-to-date transaction count)


Step 4: IT Testing, UAT, Deployment to Production, etc.


Step 5: Change Management & End-User Training


Depending on what other data and views you have available in your project, it may be possible to join this explore with a sales explore to begin calculating lifetime value (LTV). Customer email is a good candidate for a join key between your existing sales explore and this view of a customer. Be aware that some customers may purchase under multiple email addresses (customer : email :: one : many).


Good luck!


0 replies

Be the first to reply!

Reply