Solved

How to filter on last 7 available dates.

  • 10 May 2022
  • 9 replies
  • 129 views

 

I am trying to filter on last 7 available dates in a dataset.


For example I found this post filter-on-the-last-day-with-available-data that allows you to filter on the last day in your dataset but I need the last 7 days in my dataset.

How can I reword this dimension to pull this off?
 

dimension: latest_date { 
type: yesno
sql: ${created_date} = max(${created_date}) ;;
}


 

Say I have a date column in my dataset that looks like this:

COMPLETED_DATE:

05/09/2022
05/06/2022
05/05/2022
05/04/2022
05/03/2022
04/20/2022
04/18/2022
04/17/2022
04/16/2022

I need looker to display the results related to the last 7 reported dates like this:

COMPLETED_DATE:

05/09/2022
05/06/2022
05/05/2022
05/04/2022
05/03/2022
04/20/2022
04/18/2022
icon

Best answer by MikeM 12 May 2022, 14:16

View original

9 replies

I believe you might want to use a “Window function” here https://help.looker.com/hc/en-us/articles/4419773061395-Window-Functions-in-Looker-Community-

That is a bit vague. You do not specify what function might be useful or why. That is like if someone asked how to write a proper sentence and you suggested they use a pen. I need something that I can work with. I am aware of windows functions I have used them rarely with SQL but I have not enough experience with them to know what to use in this case.

Userlevel 7
Badge +1

In this case, date filter is not going to work. Last 7 days, or anything like that will always translate to time and you can’t guarantee what are the last 7 days, can you?

Therefore, I would use the following in your Data Model (wherever you write your SQL transformations):

ROW_NUMBER() OVER(PARTITION BY completed_date ORDER BY completed_date DESC)

This is just an assumption. If completed_date is a timestamp/datetime (not date) you want to use only the DATE part in your partition

ROW_NUMBER() OVER(PARTITION BY DATE(completed_date) ORDER BY completed_date DESC)

But if your columns are named correctly, I hope it’s a date. 

This column could be exposed then to Looker as:

  • completed_date_reverse_order
  • completed_date_seq_reversed

Those are just example naming conventions that could be used.

 

Then in Looker you expose it as a dimension:

dimension: completed_date_reverse_order {

  type: number

  sql: ${TABLE}.completed_date_reverse_order ;;

}

 

You can hide it, if people are not going to use this, meaning you only want last 7, nobody will need last 10, 15, 27

 

Then, to simplify things for yoursel, create yesno dimension to use as a filter

dimension: last_7_dates {

  type: yesno

  sql: ${completed_date_reverse_order} <= 7 ;;

}

 

So oddly enough that does not work with a datetime field.

Results end up like this.

 

SELECT distinct top 31 received_time, ROW_NUMBER() OVER(PARTITION BY CAST(received_time as DATE) ORDER BY received_time DESC)
FROM TABLE_NAME
ORDER BY received_time DESC
DT                         Count
2022-05-06 23:00:20.000    1
2022-05-06 23:00:20.000    2
2022-05-06 22:00:14.000    4
2022-05-06 22:00:14.000    3
2022-05-06 21:00:22.000    5
2022-05-06 21:00:22.000    6

 

If I try to cast the order by it gets much worse:

 

SELECT distinct top 31 received_time, ROW_NUMBER() OVER(PARTITION BY CAST(received_time as DATE) ORDER BY CAST(received_time as DATE)  DESC)

FROM TABLE_NAME
ORDER BY received_time DESC
DT                      Count
2022-05-06 23:00:20.000 4
2022-05-06 23:00:20.000 116
2022-05-06 22:00:14.000 117
2022-05-06 22:00:14.000 5
2022-05-06 21:00:22.000 118
2022-05-06 21:00:22.000 6


So I tried to write a table that just had the unique dates and then for whatever reason this windows function returns only 1’s either as a sub query or as a temp table.
 

IF OBJECT_ID('tempdb..#RT_Temp','U') IS NOT NULL
DROP TABLE #RT_Temp;
CREATE TABLE #RT_Temp
(RT DATE);

INSERT INTO #RT_Temp

select distinct CAST(CD.RECEIVED_TIME as DATE) RT
from TABLE_NAME CD
group by CAST(CD.RECEIVED_TIME as DATE)
ORDER BY CAST(CD.RECEIVED_TIME as DATE)
DESC


select RT,
ROW_NUMBER() OVER(PARTITION BY RT ORDER BY RT DESC) RN
from #RT_Temp
ORDER BY RT desc

​​​

RT	RN
2022-05-06 1
2022-05-05 1
2022-05-04 1
2022-05-03 1
2022-05-02 1
2022-05-01 1
2022-04-30 1
2022-04-29 1
2022-04-28 1

 

Userlevel 7
Badge +1

In the data you provided in your first post, you showed us dates, so what is the structure of the data. Do you want to show last 7 rows or all the rows from last 7 dates?

The need is all rows that match the last 7 unique dates.

So for example if the last 100 rows have 10 unique dates between all rows and the last 7 unique dates only show up on 50 of those rows I need those 50 rows of data.

 

Userlevel 7
Badge +1

In this case try DENSE_RANK window function but keep partitioning as CAST(received_time AS DATE)

Turns out removing the “partition by portion” fixed the count issue.
I removed it after seeing an error that said the ORDER_BY was required with the OVER clause and for some reason that prompted me to check what it would say if I removed the PARTITION BY portion. For whatever reason the counts are showing up now using ROW_NUMBER and RANK without issue. 

 


Select received_date, ROW_NUMBER() OVER(ORDER BY received_date DESC) from
(SELECT distinct CAST(received_time as DATE) received_date FROM TABLE_NAME) s
WHERE received_date <= '2022-04-12'
ORDER BY received_date DESC

AND
 

Select received_date, RANK() OVER(ORDER BY received_date DESC) from
(SELECT distinct CAST(received_time as DATE) received_date FROM TABLE_NAME) s
WHERE received_date <= '2022-04-12'
ORDER BY received_date DESC

Both results in:

 

received_date	RANK
2022-04-12 1
2022-04-11 2
2022-04-10 3
2022-04-08 4
2022-04-07 5
2022-04-06 6
2022-04-05 7

As you can see I get the count I was looking for as we do not have a 04-09 date in the available dates and the count moves on to the next date. 

Thank you for your time @Dawid. Just so you know the DENSE_RANK() was resulting in the same problems I was seeing with ROW_NUMBER() but also worked fine after removing the PARTITION BY section.

Userlevel 7
Badge +1

I’m glad it worked, I actually was working on something similar and totally forgot how ORDER BY impacted my grouping :)

Reply