Given that some SQL dialects cannot perform aggregations on aggregations [e.g. avg(sum(xyzvalue))] I was wondering if anyone has any ideas on how to tackle this while still allowing a “user” to select dynamic dimensions over which these pre-aggregations and then post-aggregations take place.
EXAMPLE
Assume the data I’m working with is defined by the following:
select 1 uid, 100 value, ‘x’ clientid, 101 itemid, ‘20161230’ asofdate UNION ALL
select 2 uid, 200 value, ‘x’ clientid, 202 itemid, ‘20161230’ asofdate UNION ALL
select 3 uid, 500 value, ‘x’ clientid, 303 itemid, ‘20161230’ asofdate UNION ALL
select 4 uid, 800 value, ‘y’ clientid, 101 itemid, ‘20161230’ asofdate UNION ALL
select 5 uid, 250 value, ‘y’ clientid, 303 itemid, ‘20161230’ asofdate UNION ALL
select 6 uid, 100 value, ‘x’ clientid, 101 itemid, ‘20161231’ asofdate UNION ALL
select 7 uid, 400 value, ‘x’ clientid, 202 itemid, ‘20161231’ asofdate UNION ALL
select 8 uid, 600 value, ‘x’ clientid, 303 itemid, ‘20161231’ asofdate UNION ALL
select 9 uid, 100 value, ‘y’ clientid, 303 itemid, ‘20161231’ asofdate
a few illustrative analyses I may want to allow a user to perform are:
- Calculate the average total value (across all items) by client by asofdate…or “average total value from client by date”
- Calculate the average total value (across all items and all clients) by asofdate …or “average total value by date”
- Calculate the average total value (across all clients) by item by asofdate…or “average total value from item by date”
There are several approaches I’ve tested but with little success… i may be missing something and/or may not fully understand the posts i found related but would love to hear if anyone has been able to successfully implement something similar and could shed some light for me:
PDTs with window-analytic functions: this allows for pre-aggregation --> but doesnt fit this use case in that it restricts the dimensions over which aggs take place …
Waterfall sequencing of pre- and post-aggregation PDTs: “PDTstep1” with my pre-aggregations and “PDTstep2” as the aggregation of the aggregations in “PDTstep1” (example code below) --> but the lookml aggregations defined in PDTstep1 don’t come through as actual “columns” in “PDTstep2”
explore: PDTstep1and2 {
from: PDTstep1
join: PDTstep2 {
sql_on: ${PDTstep1and2.uid} = ${PDTstep2.uid} ;;
type: inner
relationship: one_to_one
}
}
view: PDTstep1 {
derived_table: {
sql:
select 1 uid, 100 value, 'x' clientid, 101 itemid, '20161230' asofdate UNION ALL
select 2 uid, 200 value, 'x' clientid, 202 itemid, '20161230' asofdate UNION ALL
select 3 uid, 500 value, 'x' clientid, 303 itemid, '20161230' asofdate UNION ALL
select 4 uid, 800 value, 'y' clientid, 101 itemid, '20161230' asofdate UNION ALL
select 5 uid, 250 value, 'y' clientid, 303 itemid, '20161230' asofdate UNION ALL
select 6 uid, 100 value, 'x' clientid, 101 itemid, '20161231' asofdate UNION ALL
select 7 uid, 400 value, 'x' clientid, 202 itemid, '20161231' asofdate UNION ALL
select 8 uid, 600 value, 'x' clientid, 303 itemid, '20161231' asofdate UNION ALL
select 9 uid, 100 value, 'y' clientid, 303 itemid, '20161231' asofdate
;;
}
dimension: uid {
type: number
value_format_name: id
sql: ${TABLE}.uid ;;
primary_key: yes
}
dimension: date_key {
type: string
sql: ${TABLE}.asofdate ;;
}
dimension: asofdate {
type: date
sql: ${TABLE}.asofdate ;;
}
dimension: client_id {
type: string
sql: ${TABLE}.clientid ;;
}
dimension: item_id {
type: number
value_format_name: id
sql: ${TABLE}.itemid ;;
}
measure: value_pre_sum {
description: ""
type: sum
sql: ${TABLE}.value ;;
value_format_name: decimal_2
}
}
view: PDTstep2 {
derived_table: {
sql:
select
*
from ${PDTstep1.SQL_TABLE_NAME} agg;;
}
dimension: uid {
hidden: yes
type: number
value_format_name: id
sql: ${TABLE}.uid ;;
# primary_key: yes
}
dimension: date_key {
hidden: yes
type: string
sql: ${TABLE}.date_key ;;
}
dimension: asofdate {
type: date
sql: ${TABLE}.asofdate ;;
}
dimension: client_id {
hidden: no
type: string
sql: ${TABLE}.clientid ;;
}
dimension: item_id {
hidden: no
type: number
value_format_name: id
sql: ${TABLE}.itemid ;;
}
measure: value_sum {
description: ""
type: sum
sql: ${TABLE}.value_pre_sum ;;
value_format_name: decimal_2
}
measure: value_avg {
description: ""
type: average
sql: ${TABLE}.value_pre_sum ;;
value_format_name: decimal_2
}
}