Question

Creating time dimensional based measures from records with two dates

  • 11 October 2017
  • 2 replies
  • 695 views

We’re analyzing JIRA data that at a root level contains a start and end date, and trying to report by day and with filters reporting on the ‘age in days’ on any given day. Obviously I can report this overall as a stand alone sum or average, but the requirements call to chart and trend this across time. Without a traditional aggregate ‘fact’ table, I’m just not sure how to do it because the measure involves both dates for every issue, and I’m not sure where or how (or even if) I join the issues to the calendar table. Any ideas?


2 replies

Userlevel 6
Badge

We use Looker on top of our Github data. To get what you are looking for join a table of dates and create a new dimension ‘active_date’.


Once you have ‘active_date’ you can group by it and see how many issues where opened on a particular day.


Here is the code in our explore:


explore: issue {
description: "Github issue with joins to issue_facts, analyst, repo, milestone, issue_label, label, comment, issue_mentions, customer_reported_issues, engage_tickets"
label: "Github Issues "

# ... Bunch of Joins

join: active_date {
from: dates
sql_on: active_date.event > issue.created_at
AND ( active_date.event < issue.closed_at OR issue.closed_at IS NULL)
AND active_date.event < GETDATE() ;;
relationship: one_to_many
type: inner
}
}

You can use this pattern to create a date table if your database doesn’t actively support it.



We’ve done very similar things to get at counts of issues with a particular status by day. Thanks for the reply! We are trying to report on the delta in days between created and resolved dates on a given day, and that is where our trouble lies.


Because there is no immediate relationship with our calendar dimension to each issue’s created and resolved dates directly, I’m having a hard time summing up the proper results. Factually, it is even difficult to envision it as a SQL query - everything is fine until you get to the part where you try to join the date dimension to the created and resolved dates - because both joins would be required - and that is when things get challenging.


Certainly I could build an aggregate fact table, one that has a day for each date then a pre-aggregate sum of the days between the two dates in question, but this would be at the expense of having to very carefully choose what dimensional attributes to group by, as too many permutations would explode the data. (I think…)


Happy to hear any other ideas or feedback you have on this and thanks again for the reply!

Reply