creating a combinationtable to explore customer buying behaviour.

Hello.

I have a set of customers, and the products they have bought. I want to do a vizualisation exploring the how many other products are usually bought when a specific product is bought. as well as what the most common product combinations are both in number of products and specific products.

I have a hard time going knowing where to start and would like some help. I have built the whole solution in excel mostly to see if my model was sound, but i’m now having a hard time moving this to an interactive Look.

In the picture above I have three tables.
The raw data is derived from our bigquery table and is the only data I need. customer_Id and if the customer have bought a product or not. the number of bought products is irrelevant for this.

The overview table is essentially a better form (imo) of the raw table where I have each product as its own dimension and a true/false (1=true) value for if the customer have bought the product or not.

I use the overview table to create the final table, which essentially is one large sum-if function. showing in what combinations a product has been bought.

For example:
product b has been bought by two different customers, and both bought it in combination with one other product.
product a has also been bought by two different customers, one who bought one additional product and the other bought two other products.

and so on.

How should I go about doing this in looker? I have a hard time knowing where to start and actually creating the components necessary. I am new to using Looker, so explain this to me as if I was 5 years old.

0 1 300
1 REPLY 1

lizzy1
New Member

I think I wouldn’t be trying table calculations for this. Mostly because it’s requiring an aggregation to get from Raw to Overview and then another aggregation to get from Overview to Final. When things require subqueries like that it’s a lot easier to write a derived table with the SQL you need to do each step as long as your access allows you to write derived tables. If you do here is what I would write:

with numproducts as
     (select customer_id, count(distinct(product_id)) as total_products 
      from raw
     group by 1)

  , raw2 as
          (select a.customer_id, a.product_id, b.total_products
           from raw a
           join numproducts b
           on a.customer_id = b.customer_id)

   select product_id, total_products, count(distinct(customer_id)) as customers
   from raw2
   group by 1, 2

Then I would build the viz on that table selecting the product_id as the rows, pivoting by the total_products, and putting the count of customers inside

Top Labels in this Space
Top Solution Authors