This content, written by Mike Xu, was initially posted in Looker Blog on Feb 27, 2014. The content is subject to limited support.
Today, we are announcing the next big step for Looker: Persistent Derived Tables, an elegant new tool for managing data transformations. While Looker already supported derived tables, the addition of persistence greatly expands the ways they can be used to extract useful results from complex datasets.
Data transformations are a crucial part of analysis. Raw data is not always sufficient for answering all of our questions, especially the complex questions. To meet these needs, we need to transform the raw data into a useful form. Traditionally, this is slow and expensive because it requires engineering efforts to build out and maintain data structures within the data stack. At Looker, we’ve been seeing this as a growing burden to our customers as they become increasingly sophisticated and curious about all areas of their business.
Persistent Derived Tables allow analysts to simply create and manage transformations directly within their Looker model. They seamlessly integrate with connected data source(s) to make complex data transforming effortless. Analysts are now more empowered than ever to discover profound insights.
Data transformation with derived tables
In analyzing complex data, it’s often useful to have transformations that involve multiple steps. A derived table is a powerful abstraction for capturing one stage of a transformation, preparing for and presenting new dimensions of access to the information hidden in a dataset. For those familiar with the concept from working with databases, a derived table can be thought of as a kind of .
The information users are looking for is often obscured or associated inconveniently, and derived tables have proven to be a simple, powerful way of solving this class of problem. In LookML, derived tables are simply another type of view. They can be used as a base view for end-user exploration or joined for incorporating the results of the transformation into other views. LookML provides the ability to specify the SQL for the transformation step, and the LookML dimension and measure syntax used for all views provides comprehension of the computational results.
Entity/Transaction Facts are the most common derived table pattern. From transaction records, a fact table rolls up information into an indexed table to enable business-model queries simply and quickly. Information hidden in the transaction records, such as the following, becomes visible:
- User lifetime orders
- User first and last order date
- User number of orders in the last n days
- User lifetime value
- User number of months in which a purchase was made
- Product number of items ever sold
- Product rank
- Product average days in inventory
Other common uses of derived tables include:
- Sequence numbering of transactions
- Combining multiple transactions into a single event table
- Cleaning up errors in data
- Adding indexes to tables
What are Persistent Derived Tables?
Sometimes the abstraction provided by a derived table is sufficient in itself to enable the information to be accessed and shared as needed by a business. However, there are cases where the time needed to perform the computation that results in a derived table is significant.
In these cases, analysts can eliminate this overhead cost by using their knowledge of the underlying data referenced by the computation. In today’s release of Looker, there are new LookML properties for derived tables that enable Looker to retain the data from a derived table computation in a Persistent Derived Table. These new LookML properties (persist_for:, sql_trigger_value:, and sql_trigger_if:) allow analysts to precisely specify the conditions which would require the derived table computation to be run again.