We have a table which pulls in rows of data each time our continuous integration server runs tests. It imports ~500 rows for each CI run, one for each source code file, and for each one shows the test coverage metrics ‘# lines tested’, and ‘# lines in file’, and file category (things like ‘model’, ‘controller’)
Schema is like:
- timestamp
- file_name
- num_tested_lines
- num_relevant_lines
- file_category
What I’m trying to do is show a snapshot of ‘% coverage by category’
However, I can’t figure out how to structure things so I’m just looking at the “most recent version” of each file.
In plain SQL, to figure out coverage for controllers, I would write:
SELECT 100.0*SUM(num_tested_lines) / SUM(num_relevant_lines)
FROM code_coverage_reports
WHERE file_category = 'controller'
AND timestamp = (SELECT max(timestamp) from code_coverage_reports)
It’s the timestamp filtering that I can’t figure out in Looker. What I’m asking semantically is “when doing the aggregation, only consider the last set of matching rows”. I feel like there’s a WINDOW trick for this, but I haven’t been able to figure it out.
Any pointers?