Looker and Vertica - geospatial analytics

  • 28 March 2022
  • 0 replies
  • 57 views

Userlevel 5

This content, written by Erin Franz, was initially posted in Looker Blog on Dec 22, 2015. The content is subject to limited support.

Vertica offers a suite of geospatial analytics functions with its HP Vertica Place package. This package enables storing and querying of geometric objects - translating locations into geometric points and regions into polygons. Common asks for measuring distance, length, area can be completed directly in Vertica and these requests are optimized at scale using spatial indexes and the underlying power of Vertica’s database.

Because Looker performs all analytical workloads in HP Vertica, Looker can utilize these geospatial analytic functions to enable exploration and visualization of results dynamically. To demonstrate some of what’s possible, we’ll start with a table of bookings in Vertica that have associated latitude and longitude locations. When we query for a sample of the table, called locations, we see the following:

select * from place.locations limit 10


In Looker, we can easily examine where the bookings are coming from by creating a view file, where we define how to query the table through dimensions and measures. In the view file, we can derive a location type field so that Looker can plot the latitude and longitude on a map.

- explore: locations

- view: locations
  sql_table_name: place.locations
  fields:

  - dimension: booking_id
    type: number
    sql: ${TABLE}.booking_id

  - dimension: booking_lat
    type: number
    sql: ${TABLE}.booking_lat

  - dimension: booking_long
    type: number
    sql: ${TABLE}.booking_long

  - dimension: booking_location
    type: location
    sql_latitude: ${booking_lat}
    sql_longitude: ${booking_long}

  - measure: count
    type: count
    drill_fields: []


Using these dimensions and measures, we can see where our bookings are coming from in Looker, by selecting the location dimension and count measure.

Though we can visually see where the bookings are coming from, we’d probably want to categorize them further for reporting. Suppose we have 4 sales regions: Northeast, Northwest, Southeast, and Southwest. We can define these regions using Polygon Geometry Objects and store this data in a table called regions.

CREATE TABLE place.regions AS
SELECT '1' as region_id, 'northeast' as region_name, 'POLYGON((-70 50,-95 50,-95 40,-70 40,-70 50))' as region_shape UNION ALL 
SELECT '2', 'southeast', 'POLYGON((-70 40,-95 40,-95 30,-70 30,-70 40))' UNION ALL 
SELECT '3', 'northwest', 'POLYGON((-95 50,-120 50,-120 40,-95 40,-95 50))' UNION ALL 
SELECT '4', 'southwest', 'POLYGON((-95 40,-120 40,-120 30,-95 30,-95 40))'


And create the corresponding view file in LookML to reference the values from table as dimensions:

- view: regions
  sql_table_name: place.regions
  fields:

  - dimension: region_id
    primary_key: true
    type: number
    sql: ${TABLE}.region_id

  - dimension: region_name
    sql: ${TABLE}.region_name

  - dimension: region_shape
    sql: ${TABLE}.region_shape


We can then take advantage of two Vertica Place functions to map our booking locations to regions. First, we’ll use ST_GeomFromText to convert text into a geometry object, for both the polygons defining the sales regions and the points defining the bookings. Then, we can use the ST_Intersects function to determine if the points overlap with the regions at any given point. This can be done via a join between the locations table and the regions table, which we can define in our Explore definition.

- explore: locations
  joins:
    - join: regions
      sql_on: |
        ST_Intersects(
          ST_GeomFromText('POINT(' || ${locations.booking_long} || ' ' || ${locations.booking_lat} || ')'),
          ST_GeomFromText(${regions.region_shape})
        )
      relationship: many_to_one


Now, when we explore locations, we can filter on region. For instance, maybe we are interested in performance in the Southwest region.

We can now associate each location to a region and look at region specific performance, or compare regions side by side like below. It’s easy to see that the Southwest region has the largest number of booking requests overall.

Now that we have an idea of regional performance, we likely want to dig deeper and examine performance even more locally, perhaps at a metro level. Assuming that we have a table called cities with latitude and longitude information, it’s possible to look at the bookings with a certain distance from each city. This mapping is done similarly to our region join, but instead of the ST_Intersects function, we’ll use the STV_DWithin function, which determines if the shortest distance from one object to another object is within a specified distance. We can add this to our explore definition, assuming in our view file for cities, in addition to the dimensions for city name and location, we’ve created a parameter for selecting the distance we’d like to be included in our metro area.

- explore: locations
  joins:
    - join: regions
      sql_on: |
        ST_Intersects(
          ST_GeomFromText('POINT(' || ${locations.booking_long} || ' ' || ${locations.booking_lat} || ')'),
          ST_GeomFromText(${regions.region_shape})
        )
      relationship: many_to_one
    - join: cities
      sql_on: |
        STV_DWithin(
          ST_GeomFromText('POINT(' || ${locations.booking_long} || ' ' || ${locations.booking_lat} || ')'),
          ST_GeomFromText('POINT(' || ${cities.longitude} || ' ' || ${cities.latitude} || ')'),
          {% parameter cities.distance_miles %} / 65)
      relationship: many_to_one


Let’s assume we’re interested in looking at Dallas bookings within a 250 mile radius. Filtering on both City and Distance Miles yields the following data and visualization in Explore:

We can now further examine the bookings isolated to the Dallas Metro, completing all transformation by defining it in the LookML modeling layer and using Vertica Place analytics package by directly querying the database.

Visit to learn more about Looker and the current features.


0 replies

Be the first to reply!

Reply