Question

'Checking' a dimension for a long period back in time

  • 27 March 2017
  • 4 replies
  • 87 views

My problem

I would like to know which product types have reached a certain sales volume threshold for the first time in the previous week (e.g., sports clothing items have not been sold in the past 13 weeks and last week have been sold at least 100 times).


What data do I have?

Weekly sales data over multiple product types for more than a year.


How do I try this at the moment?

I have now pivoted the week dimension, included the product type as normal dimension, and the sales volume is the measure I use. I’ve made a table calculation to check whether a product type has been sold for the first time (significantly) in the past week. To be certain it was not sold before I check for 13 weeks back and see if there is a signifiant sales volume in those weeks. The table calculation basically works as follows:


pivot_index(${datatable.salesvolume},14)-if(is_null(pivot_index(${datatable.salesvolume},13)),0,pivot_index(${datatable.salesvolume},13))>100 AND (pivot_index(${datatable.salesvolume},13)<5 OR is_null(pivot_index(${datatable.salesvolume},13))) AND (pivot_index(${datatable.salesvolume},12)<5 OR is_null(pivot_index(${datatable.salesvolume},12))) ... AND (pivot_index(${datatable.salesvolume},2)<5 OR is_null(pivot_index(${datatable.salesvolume},2))) AND (pivot_index(${datatable.salesvolume},1)<5 OR is_null(pivot_index(${datatable.salesvolume},1)))


I filter the data on the past 14 completed weeks. Accordingly, I only show the newly introduced product types in my visualisation by Hiding no’s from visualization.


The problem

As I have approx. 3000 product types, and look back 14 weeks in time, I’ve quite a massive table. When I schedule this table, the results are sent to the recipients before the Hiding no’s from visualization can do the trick, i.e., it’ll send the full table (also with the non-new product types) instead of only the newly created products.


Does anyone have an idea how to work around this precise issue? Thanks in advance for your help!


4 replies

Userlevel 6
Badge

This sounds like a job for a derived table.


You could build a table that looks something like the table below and join it in, then use is_first_week to limit results to first week of sales.


view: product_week_facts {
derived_table: {
sql:
SELECT
product_id
, SUM(amount) as sales_valume
, MIN( DATE_TRUNC(sale_date, week)) as first_week
, MAX( DATE_TRUNC(sale_date, week)) as last_week
, ROW_NUMBER() OVER () as id
FROM sales
GROUP BY 1
;;
}
dimension: id {primary_key: yes}
dimension: product_id {}
dimension: is_first_week {
type: yesno
sql: ${TABLE}.first_week = ${datatable.sale_week}
}
}

Hi @Tomas, Thanks for the post!


@lloydtabb comment is a great way to improve the performance of these calculations for you. Another thing to note is that you can also save the Explore as a Look. If you select the Table as the type of visualization for the Look, then any dimensions you hide in the resulting dimensions will disappear in the Table. This will ensure that the hidden columns don’t get sent when the Look is scheduled.

Hi @lloydtabb and @jonathon,


Thanks for your comments. I’ve tried playing around with the example code that you gave me and we’re getting there, but there’s two more issues:



  1. In my particular example, it could be that a new product type already has been sold a few times in one piloting store in earlier, but since last week we started selling it in all stores. What I’m trying to say is that we only consider it to be officially introduced when a certain sales volume is reached. The example SQL query does not account for that, even if a new product type has a sales volume of 10 dollars, it is already considered as ‘newly introduced’.

  2. Another complicating factor is that some product types might already be selling small volumes for a while in a piloting store (but have not met the threshold value yet). These older sales volumes are now incorporated in the sales volume, but I’d only like to show the sales volume in the last week for a newly introduced product type. In other words, I don’t want the full sales volume, but for the product types where the min(week)=max(week) (and threshold value) holds for, I’d only like the sales volume of that last week.


Let me know if you’ve any questions. Thanks for your help!

Userlevel 6
Badge

You can introduce a step in your query to eliminate the noise. First step is to generate sales by week. Second step is to filter on thresholds and the third step is to find the minimum.


view: product_week_facts {
derived_table: {
sql:
-- first, get all the sales by week.
WITH step1 AS (
SELECT
product_id
, DATE_TRUNC(sales_date, week) as sale_week
, SUM(amount) as sales_valume
, COUNT(DISTINCT store_id) as num_stores
FROM sales
GROUP BY 1,2
),
-- second, eliminate the test data
step2 AS (
SELECT *
FROM step1
WHERE
num_stores > 5
OR sales_volume > XXX
)
-- step 3, find the minimums.
SELECT
prioduct_id
, MIN( sale_week) as first_week
, ROW_NUMBER() OVER () as id
FROM step2
GROUP BY 1
;;
}
dimension: id {primary_key: yes}
dimension: product_id {}
dimension: is_first_week {
type: yesno
sql: ${TABLE}.first_week = ${datatable.sale_week}
}
}

Some SQL dialects don’t support the WITH syntax in which case you can write a nested queries in the form below.


   SELECT ...
FROM (
SELECT ..
FROM (
SELECT ...
FROM xx
) step1
) step2

Reply