At Buffer, we are continually working on ways to streamline access to data within the company. We are big on transparency, and that goes for data too! After noticing a bit of a friction point with interpreting the data of quantitative user experiments (ie, A/B tests) for non-analysts, we built a LookML pattern to serve up the analysis dynamically at any time to any user. Many of the experiments we run are simple two group A/B tests , with one control group and one enabled group. Our solution utilizes a Python UDF in Redshift, along with a case statement and conditional formatting in LookML. We thought it might be helpful to others in the Looker community to share 🙂 . Let’s take a look using a simple marketing landing page A/B test as a example.
The Actual Stats Part (yeah math!)
Leveraging the awesomeness of Redshift’s python UDF’s, which already ships with a bunch of libraries that are often used in data science and analytics work (like numpy, scipy and pandas), we started with creating our own UDF to calculate a p-value. Given the nature of most experiments we run, we decided on the SciPy function chi2_contingency
, which performs a chi-square test of independence of variables, which we named experiment_result_p_value()
. This function excepts four parameters (the control group population, the control group count of successes, the enabled group population, and the enabled group count of successes) and returns a p_value. If you are up for exploring more, feel free to read in more detail how we created this UDF.
Using the UDF in Looker
With the UDF created in Redshift, we just needed to create four measures and pass them through as parameters in experiment_result_p_value()
. Our experiment data is stored as events in Redshift, which tracks: user_id, experiment name, experiment group name (ie, control or enabled), and the timestamp the user entered into the experiment. Since we run a fair number of experiments on the main marketing landing page, we felt it worthwhile to create a specific view in Looker to handle experiments where the determination of success was if a visitor created an account and became a user. We extended our main visitor_experiment_events
view as a new view visitor_experiment_conversions
in order to add in the scenario specific dimensions and measures. Here are the four measures we needed for the parameters in the UDF:
Old LookML
```
- measure: visitor_count_control
type: count_distinct
sql: ${TABLE}.visitor_id
filters:
test_group: '%control%'
measure: visitor_count_enabled
type: count_distinct
sql: ${TABLE}.visitor_id
filters:
test_group: ‘%enabled%’
measure: user_count_control
type: count_distinct
sql: ${TABLE}.user_id
filters:
test_group: ‘%control%’
measure: user_count_enabled
type: count_distinct
sql: ${TABLE}.user_id
filters:
test_group: ‘%enabled%’
</details>
<details>
<summary>New LookML</summary>
measure: visitor_count_control {
type: count_distinct
sql: ${TABLE}.visitor_id ;;
filters: {
field: test_group
value: “%control%”
}
}
measure: visitor_count_enabled {
type: count_distinct
sql: ${TABLE}.visitor_id ;;
filters: {
field: test_group
value: “%enabled%”
}
}
measure: user_count_control {
type: count_distinct
sql: ${TABLE}.user_id ;;
filters: {
field: test_group
value: “%control%”
}
}
measure: user_count_enabled {
type: count_distinct
sql: ${TABLE}.user_id ;;
filters: {
field: test_group
value: “%enabled%”
}
}
</details>
So with our four measures created, we could then pass those through as parameters in the `experiment_result_p_value()` function in the `sql:` line of the `p_value` dimension. We then used the `p_value` to create an `is_significant` yesno dimension.
<details open>
<summary>Old LookML</summary>
dimension: p_value
type: number
sql: round(experiment_result_p_value(${visitor_count_control}, ${user_count_control}, ${visitor_count_enabled}, ${user_count_enabled}), 3)
dimension: is_significant
type: yesno
sql: ${p_value} < 0.05
</details>
<details>
<summary>New LookML</summary>
dimension: p_value {
type: number
sql: round(experiment_result_p_value(${visitor_count_control}, ${user_count_control}, ${visitor_count_enabled}, ${user_count_enabled}), 3) ;;
}
dimension: is_significant {
type: yesno
sql: ${p_value} < 0.05 ;;
}
</details>
Additionally, we created two measures for calculating the conversion ratio from visitor to user for each of the experiment groups:
<details open>
<summary>Old LookML</summary>
measure: control_conversion_ratio
type: number
sql: 100.00 * ${user_count_control} / NULLIF(${visitor_count_control}, 0)
measure: enabled_conversion_ratio
type: number
sql: 100.00 * ${user_count_enabled} / NULLIF(${visitor_count_enabled}, 0)
</details>
<details>
<summary>New LookML</summary>
measure: control_conversion_ratio {
type: number
sql: 100.00 * ${user_count_control} / NULLIF(${visitor_count_control}, 0) ;;
}
measure: enabled_conversion_ratio {
type: number
sql: 100.00 * ${user_count_enabled} / NULLIF(${visitor_count_enabled}, 0) ;;
}
</details>
**Packaging it all up in a single `experiment_result` measure**
We wanted the analysis to be really convenient to digest, so combined the above dimensions and measures with a case statement and some html conditional formatting and returned a single measure that reflects the current result.
The experiment result text and background color changes based on the four scenarios:
- enabled group has a higher rate of success and is significant (green background color)
- enabled group has a higher rate of success but is not significant (yellow background color)
- control group has a higher rate of success and is significant (orange background color)
- control group has a higher rate of success but is not significant (red background color)
Here is what the LookML looks like:
<details open>
<summary>Old LookML</summary>
- measure: experiment_result
sql: |
CASE WHEN (${control_conversion_ratio} > ${enabled_conversion_ratio}) AND (${p_value} < 0.05) THEN ‘control_sig’
WHEN (${control_conversion_ratio} > ${enabled_conversion_ratio}) AND (${p_value} >= 0.05) THEN ‘control_notsig’
WHEN ${control_conversion_ratio} = ${enabled_conversion_ratio} THEN ‘tied’
WHEN (${control_conversion_ratio} < ${enabled_conversion_ratio}) AND (${p_value} < 0.05) THEN ‘enabled_sig’
WHEN (${control_conversion_ratio} < ${enabled_conversion_ratio}) AND (${p_value} >= 0.05) THEN ‘enabled_notsig’
ELSE NULL END
html: |
{% if value == “enabled_sig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the enabled test group has the highest conversion rate, and the result is statistically significant
{% elsif value == “enabled_notsig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the enabled test group has the highest conversion rate, BUT the result is NOT statistically significant
{% elsif value == “control_notsig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the control test group has the highest conversion rate, BUT the result is NOT statistically significant
{% elsif value == “control_sig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the control test group has the highest conversion rate, and the result is statistically significant
{% elsif value == “tied” %}
The conversion rate is the same for both the control and enabled test groups
{% else %}
{{ rendered_value }}
{% endif %}
</details>
<details>
<summary>New LookML</summary>
measure: experiment_result {
sql: CASE WHEN (${control_conversion_ratio} > ${enabled_conversion_ratio}) AND (${p_value} < 0.05) THEN ‘control_sig’
WHEN (${control_conversion_ratio} > ${enabled_conversion_ratio}) AND (${p_value} >= 0.05) THEN ‘control_notsig’
WHEN ${control_conversion_ratio} = ${enabled_conversion_ratio} THEN ‘tied’
WHEN (${control_conversion_ratio} < ${enabled_conversion_ratio}) AND (${p_value} < 0.05) THEN ‘enabled_sig’
WHEN (${control_conversion_ratio} < ${enabled_conversion_ratio}) AND (${p_value} >= 0.05) THEN ‘enabled_notsig’
ELSE NULL END
;;
html: {% if value == “enabled_sig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the enabled test group has the highest conversion rate, and the result is statistically significant
{% elsif value == “enabled_notsig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the enabled test group has the highest conversion rate, BUT the result is NOT statistically significant
{% elsif value == “control_notsig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the control test group has the highest conversion rate, BUT the result is NOT statistically significant
{% elsif value == “control_sig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the control test group has the highest conversion rate, and the result is statistically significant
{% elsif value == “tied” %}
The conversion rate is the same for both the control and enabled test groups
{% else %}
{{ rendered_value }}
{% endif %}
;;
}
</details>
When used in a Look, all that needs to be set are filters for the experiment name and the starting date of the experiment.
<img src='/uploads/default/original/2X/e/e741809e24bdbbb2f66e1acdbc712b2e5143d8bf.png'>
Our hope is that by surfacing the experiment results in this way, non-analyst members of the Buffer team are able to realize even faster feedback loops in build-measure-learn workflows.
**The Full LookML**
<details open>
<summary>Old LookML</summary>
- view: visitor_experiment_conversions
extends: visitor_experiment_events
fields:
- dimension: p_value
type: number
sql: round(experiment_result_p_value(${visitor_count_control}, ${user_count_control}, ${visitor_count_enabled}, ${user_count_enabled}), 3)
- dimension: is_significant
type: yesno
sql: ${p_value} < 0.05
- measure: experiment_result
sql: |
CASE WHEN (${control_conversion_ratio} > ${enabled_conversion_ratio}) AND (${p_value} < 0.05) THEN 'control_sig'
WHEN (${control_conversion_ratio} > ${enabled_conversion_ratio}) AND (${p_value} >= 0.05) THEN 'control_notsig'
WHEN ${control_conversion_ratio} = ${enabled_conversion_ratio} THEN 'tied'
WHEN (${control_conversion_ratio} < ${enabled_conversion_ratio}) AND (${p_value} < 0.05) THEN 'enabled_sig'
WHEN (${control_conversion_ratio} < ${enabled_conversion_ratio}) AND (${p_value} >= 0.05) THEN 'enabled_notsig'
ELSE NULL END
html: |
{% if value == "enabled_sig" %}
<b><p style="color: black; background-color: #AECE63; margin: 0; border-radius: 5px; text-align:center">For experiment '{{ visitor_experiment_conversions.experiment_name._value }}', the enabled test group has the highest conversion rate, and the result is statistically significant</p></b>
{% elsif value == "enabled_notsig" %}
<b><p style="color: black; background-color: #E8E47F; margin: 0; border-radius: 5px; text-align:center">For experiment '{{ visitor_experiment_conversions.experiment_name._value }}', the enabled test group has the highest conversion rate, BUT the result is NOT statistically significant</p></b>
{% elsif value == "control_notsig" %}
<b><p style="color: black; background-color: #F7B251; margin: 0; border-radius: 5px; text-align:center">For experiment '{{ visitor_experiment_conversions.experiment_name._value }}', the control test group has the highest conversion rate, BUT the result is NOT statistically significant</p></b>
{% elsif value == "control_sig" %}
<b><p style="color: black; background-color: #FF865E; margin: 0; border-radius: 5px; text-align:center">For experiment '{{ visitor_experiment_conversions.experiment_name._value }}', the control test group has the highest conversion rate, and the result is statistically significant</p></b>
{% elsif value == "tied" %}
<b><p style="color: black; margin: 0; border-radius: 5px; text-align:center">The conversion rate is the same for both the control and enabled test groups</p></b>
{% else %}
<font color="darkred">{{ rendered_value }}</font>
{% endif %}
- measure: visitor_count_control
type: count_distinct
sql: ${TABLE}.visitor_id
filters:
test_group: '%control%'
- measure: visitor_count_enabled
type: count_distinct
sql: ${TABLE}.visitor_id
filters:
test_group: '%enabled%'
- measure: user_count_control
type: count_distinct
sql: ${TABLE}.user_id
filters:
test_group: '%control%'
- measure: user_count_enabled
type: count_distinct
sql: ${TABLE}.user_id
filters:
test_group: '%enabled%'
- measure: control_conversion_ratio
type: number
sql: 100.00 * ${user_count_control} / NULLIF(${visitor_count_control}, 0)
- measure: enabled_conversion_ratio
type: number
sql: 100.00 * ${user_count_enabled} / NULLIF(${visitor_count_enabled}, 0)
</details>
<details>
<summary>New LookML</summary>
view: visitor_experiment_conversions {
extends: [visitor_experiment_events]
dimension: p_value {
type: number
sql: round(experiment_result_p_value(${visitor_count_control}, ${user_count_control}, ${visitor_count_enabled}, ${user_count_enabled}), 3) ;;
}
dimension: is_significant {
type: yesno
sql: ${p_value} < 0.05 ;;
}
measure: experiment_result {
sql: CASE WHEN (${control_conversion_ratio} > ${enabled_conversion_ratio}) AND (${p_value} < 0.05) THEN ‘control_sig’
WHEN (${control_conversion_ratio} > ${enabled_conversion_ratio}) AND (${p_value} >= 0.05) THEN ‘control_notsig’
WHEN ${control_conversion_ratio} = ${enabled_conversion_ratio} THEN ‘tied’
WHEN (${control_conversion_ratio} < ${enabled_conversion_ratio}) AND (${p_value} < 0.05) THEN ‘enabled_sig’
WHEN (${control_conversion_ratio} < ${enabled_conversion_ratio}) AND (${p_value} >= 0.05) THEN ‘enabled_notsig’
ELSE NULL END
;;
html: {% if value == “enabled_sig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the enabled test group has the highest conversion rate, and the result is statistically significant
{% elsif value == “enabled_notsig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the enabled test group has the highest conversion rate, BUT the result is NOT statistically significant
{% elsif value == “control_notsig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the control test group has the highest conversion rate, BUT the result is NOT statistically significant
{% elsif value == “control_sig” %}
For experiment ‘{{ visitor_experiment_conversions.experiment_name._value }}’, the control test group has the highest conversion rate, and the result is statistically significant
{% elsif value == “tied” %}
The conversion rate is the same for both the control and enabled test groups
{% else %}
{{ rendered_value }}
{% endif %}
;;
}
measure: visitor_count_control {
type: count_distinct
sql: ${TABLE}.visitor_id ;;
filters: {
field: test_group
value: "%control%"
}
}
measure: visitor_count_enabled {
type: count_distinct
sql: ${TABLE}.visitor_id ;;
filters: {
field: test_group
value: "%enabled%"
}
}
measure: user_count_control {
type: count_distinct
sql: ${TABLE}.user_id ;;
filters: {
field: test_group
value: "%control%"
}
}
measure: user_count_enabled {
type: count_distinct
sql: ${TABLE}.user_id ;;
filters: {
field: test_group
value: "%enabled%"
}
}
measure: control_conversion_ratio {
type: number
sql: 100.00 * ${user_count_control} / NULLIF(${visitor_count_control}, 0) ;;
}
measure: enabled_conversion_ratio {
type: number
sql: 100.00 * ${user_count_enabled} / NULLIF(${visitor_count_enabled}, 0) ;;
}
}
</details>