Question

Offset_list with filter

  • 26 April 2019
  • 3 replies
  • 1222 views

Hi,


I have a pretty tricky situation which I’d be interested in seeing if it’s familiar to anyone. I have a table with the raw data has the form:


week, metric, value

20190415, ‘Signups’ 15443

20190415, ‘Sales’, 28474

20190408, ‘Signups’ 19393

20190408, ‘Sales’, 39494


etc. (in reality many more metrics)


I have created some calculated fields to get current quarter, and weeks since beginning of quarter:


week, metric, value, current_q, weeks_since_beginning_q

20190415, ‘Signups’ 15443, 20190401, 2

20190415, ‘Sales’, 28474, 20190401, 2

20190408, ‘Signups’ 19393, 20190401, 1

20190408, ‘Sales’, 39494, 20190401, 1


Now what I want to do is get Quarter to Date numbers for each metric, so something like sum(offset_list(value,0, weeks_since_beginning_q)), but only for rows where metric matches the current row’s metric:


week, metric, value, current_q, weeks_since_beginning_q, qtd

20190415, ‘Signups’ 15443, 20190401, 2, 34836 (15443 + 19393)

20190415, ‘Sales’, 28474, 20190401, 2, 67968 (28474 + 39494)

20190408, ‘Signups’ 19393, 20190401, 1, 19393

20190408, ‘Sales’, 39494, 20190401, 1, 19393


Does anyone know of some intricate combination of ifs/lookups/offsets that would accomplish this? I’ve really tried but can’t seem to be able to work it out. 😦 This looks to be aiming at something similar but seems to have been handled offline: Summing rows with a specific condition


If there was an interval parameter in offset_list that would work here too as there’ll always be a constant number of metrics, but doesn’t seem like there is. Any ideas would be greatly appreciated. Thanks!


/Petter


3 replies

Userlevel 3

Hi @Petter,


First of all, are there too many metrics that you would not want them to be pivot’ed?

Because that might solve the problem, using a running_total, quite easily already.


I would still not be sure how to implement this in your example, but I have at least one question for more clarification:

In your example your quarter starts exactly with the beginning of a week. However, what would you - for example - have wanted to see in January, where the Quarter starts at 2019-01-01, but weeks started at 2018-12-31 and 2019-01-07? When is one week passed and when not?


Sadly the start of weeks vs. months/quarters/years work separately and quite differently.

(If only we had 360 days in a year, with 9 months of 40 days and 4 weeks of 10 days…)

Userlevel 3

I see, that would be a lot indeed. Maybe with a different visualisation, it might become clearer?

Anyhow, great that you solved it!


If you are using a Derived Tables anyway, you could - instead of an UNION for each metric - also use a Window function.

In PostgreSQL that would look like:


SUM(CASE WHEN week >= current_q THEN value ELSE 0 END) OVER (PARTITION BY metric ORDER BY week ASC)

This will sum all "value"s, only if the week-date is AFTER the current_q value, ordered by week, AND grouped by metric(!). For any amount of different metric-types.

This way you won’t need the “weeks_since_beginning_q” either.


This might actually also work using the running_total function in Table Calculations. But I’m not sure. 🙂 This calculation below does not look at the metric, which was actually the problem of course…


running_total(if(week>=current_q,value,0))

Maybe this will help you even further?

Thanks for the response @DaanIF! After trying to figure out a smart way to do this with table calculations, I eventually gave up and created a derived table with a bunch on UNIONs, which achieves the right thing but requires another derived table, is slow when it’s built etc, but I’m not sure it’s possible to do it in a better way.


The reason I wanted each metric on a row each is mainly that there are three columns per metric already, so there would be like 10 different metrics that’d be 30 columns and just one row to see the performance for last week. 🙂 For the beginning of quarter/year, I just set the first week as the beginning, so it’s not exact but acceptable here.


I feel like an additional parameter in offset_list to get the values for, say, every other row would be a great addition for this and other use cases, but it might not be possible to do what I had been thinking of with the current set of functions.

Reply