Question

Build Analytics using Square Connect

  • 20 December 2015
  • 0 replies
  • 63 views

Userlevel 3

Recently I’ve had the pleasure of using Square Connect API for a family business. Square’s API is interesting in that it is designed fully with Entity-Relational-Model in mind.


With that, let’s have a look at how we would begin our Square model to power a typical Dashboard.


First, we would probably want to design everything around Business entities and Transactions. There is no hard rule for this, but from my experience I know that we might be interested in very specific data (at the transactional level) and we would want that to be unfiltered. We would also definitely want to analyze specific businesses - irrespective of whether they had transactions or not.


So perhaps something like this:


- explore: business
joins:
- join: locations
sql_on: ${business.id} = ${locations.business_id}
relationship: one_to_many

- join: employees
sql_on: ${business.id} = ${employees.business_id}
relationship: one_to_many

- join: roles
sql_on: ${employees.role_id} = ${roles.id}

- join: timecards
sql_on: ${employees.id} = ${timecards.employee_id}
relationship: one_to_many

- explore: transactions
joins:
- join: locations
sql_on: ${transactions.location_id} = ${locations.id}

- join: items
sql_on: ${locations.id} = ${items.location_id}
relationship: one_to_many

- join: business
sql: ${locations.business_id} = ${business.id}

Here I’ve described some of the basic relationships between tables. We can now begin asking questions about transactions volume per business/location/employee. We can analyze transaction types and amounts.


Second, we want to add additional mathematical concepts to describe potentially useful aggregates:


    - measure: total_amount
type: sum
sql: ${amount}

- measure: amount_tiers
type: tier
tiers: [10,20,50,100]
sql: ${amount}

Having total_amount will allow us to answer most simple questions about businesses and locations–whether over time or at some snapshot.


However, we still can’t infer anything about revenue. Purchases can be cancelled and some stores might run into significant cancellations.


Since cancellations are just negative amounts for the same item and customer, we can create the following logic:


- view: cancelled_transactions  
derived_table:
sql: |
SELECT
user_id
, item_id
, amount
FROM transactions
WHERE amount < 0

- explore: transactions
joins:
- join: cancelled_transactions
sql_on: |
${transactions.user_id} = ${cancelled_transactions.user_id}
AND
${transactions.item_id} = ${cancelled_transactions.item_id}
AND
${transactions.amount} = -${cancelled_transactions.amount}
...

This would work for simple scenarios, but if we have repeated positive equal amount and a cancellation in-between, we would fail miserably. So perhaps it is time to stretch our brain muscles and come up with something more reliable:


- view: cancelled_transactions  
derived_table:
sql: |
SELECT
*
FROM (
SELECT
id
, LAG(id) OVER (PARTITION BY user_id, item_id, -amount ORDER BY created_at ASC ROWS UNBOUNDED PRECEDING) AS cancelled_id
, item_id
, amount
FROM transactions
) as a
WHERE
amount < 0

...
- explore: transactions
joins:
- join: cancelled_transactions
sql_on: |
sql_on: ${transactions.id} = ${cancelled_transactions.cancelled_id}
relationship: one_to_one


To be clear, the point of all this is not to perform a specific analysis. We are simply laying the foundation (or framework) for future repeated analysis. We do 80% of the work upfront, so that 80% of our future analysis only takes 20% of the effort. You can call this 80-80-20 rule 😉.


Good luck with your Square Connect modeling!


0 replies

Be the first to reply!

Reply