Question

Measure based on date dimension

  • 12 July 2018
  • 6 replies
  • 2641 views

Userlevel 2

I have a table with 4 columns: timestamp, machine, created_date, deleted_date



An example of data would be:



timestamp       machine      created_date          deleted_date

2018-04 A 2018-04 2018-06

2018-04 B 2018-04 null

2018-05 A 2018-04 2018-06

2018-05 B 2018-04 null

2018-05 C 2018-05 null

2018-05 D 2018-05 2018-07

2018-06 A 2018-04 2018-06

2018-06 B 2018-04 null

2018-06 C 2018-05 null

2018-06 D 2018-05 2018-07

2018-06 E 2018-06 null

2018-07 B 2018-04 null

2018-07 C 2018-05 null

2018-07 D 2018-05 2018-07

2018-07 E 2018-06 null

2018-07 F 2018-07 null

2018-07 G 2018-07 null



So every month I get a list of existing machines with the values of creation and deletion (once the machine is deleted, the records are updated for the deleted_date of the previous months).


What I’m trying to do is create a chart with the cumulative growth of number of machines per month.


I can’t simply do a count distinct machine because they repeat each month and I would be counting multiple times the same one on the cumulative chart.


In the example above I have


2018-04 > 2 created


2018-05 > 2 created


2018-06 > 1 created and 1 deleted


2018-07 > 2 created and 1 deleted



So my chart would have the values: 2 - 4 - 4 - 5


My question is: how to create a measure that will count by my timestamp how many machines were created minus how many were deleted on that month?



Thank you


6 replies

I’m just another Looker user, not anyone official, but I’d suggest creating a dimension which is the concatenation of the machine column and the created_date, then doing a count distinct on that column, and filtering for when deleted_date is null.



 dimension: machine_created {

type : string

sql: ${machine} || ${created_date_month} ;;

}



measure: machines_created {

type: count_distinct

sql: ${machine_created}

}
Userlevel 2

Unfortunately, that wouldn’t help. First because machine or machine + create date would be the same, the machine is already unique, so the same distinct rule apply for both cases.


Filtering the deleted with null would only make the ones that still exist to show up, which is not what I want, I want how is the total count of machines cumulative through time

Userlevel 6
Badge

Maybe I’m missing something, but it looks like you just need to drop some of the records that you might accidentally count.



If you simply group by timestamp and measure machine count, you over count because the machines are showing up in the month they where deleted. If you drop the deleted records, your counts should be accurate.



explore: machine_activity {

sql_always_where: ${timestamp} < ${deleted_date}

}



view: machine_activity {

dimension: timestamp {}

dimension: machine {}

dimension: created_date {}

dimension: deleted_date {}

measure: machines_count {

type: count_distinct

sql: ${machine}

drill_fields: [machine}

}

}

With the same logic as lloydtabb mentioned, you could create a measure like this…



measure: count_of_machines{ type: sum sql: CASE WHEN ${deleted_date}>${timestamp} OR ${deleted_date} IS NULL THEN 1 ELSE 0 END;; drill_fields: [machine] }



Then as long as timestamp is your dimension, it will supply you with the proper sum you need.

Userlevel 2

I can’t simply delete the records. The table is a lot more complex then what I mentioned, it gives several stats and information by machine per month, for example, amount of activity for that month, so I need a record per machine per month, even on the month it was deleted (since I have data until the day it was deleted)


A simple way of explaining will be, imagine I have


April = [ A, B];


May = [ A, B, C, D];


June = [A*, B, C, D, E];


July = [B, C, D*, E, F, G];



X* meaning it will be deleted that month



I have a chart with machines per month, which would be 2, 4, 5, 6


Now I want a cumulative one, so: 2, 4 (2 new), 4 (1 new and 1 deleted), 5 (2 new and 1 deleted)



I’m going to try kevinbalzer solution 🙂

Userlevel 2

Kevin’s idea didn’t quite work but gave me the idea for the solution 🙂


Just in case someone has this issue in the future, my solution was create the measure:



measure: cumulative_count_of_machines {

type: sum

sql: CASE WHEN ${created_month} = ${deleted_month} THEN 0 WHEN ${created_month} = ${timestamp_month} THEN 1 WHEN ${deleted_month} = ${timestamp_month} THEN -1 ELSE 0 END;;

drill_fields: [instance_id]

}



With this I have the difference for each month and on the chart creation I added an extra column for the cumulative calculation with



running_total(${my_view.cumulative_count_of_machines})



Thank you everyone for the help and suggestions!

Reply