How to highlight and count matching patterns in a String or Array using Liquid HTML (Pseudo-Regex)

0 1 1,040
Knowledge Drop

Last tested: Jun 22, 2020
 

The Problem

Have you ever wanted to highlight a portion of text in a phrase based on some defined search criteria? Well, one customer needed the same where they needed to highlight a pattern in a list of Epoch times.

In a nut-shell, Can I highlight a pattern in a list and find the number of times it occurs in the list? YES

A Solution

image.png

By using liquid HTML and some basic programming:

Study the code below or look at the explanations below.

CODE

view: liquid_conditional_format {

derived_table: {

sql:

SELECT 1 as id, "[120, 300, 400]" as pattern, "[120, 300, 400]" as epoch_list, 5 as count UNION ALL

SELECT 2 as id, "[120, 300, 400]" as pattern, "[100, 120, 300, 400, 120, 300, 400]" as epoch_list, 5 as count UNION ALL

SELECT 3 as id, "[100, 250, 300]" as pattern, "[100, 250, 300, 400]" as epoch_list, 5 as count UNION ALL

SELECT 4 as id, "[350, 500, 600]" as pattern, "[100, 250, 300, 400]" as epoch_list, 5 as count UNION ALL

SELECT 5 as id, "[Hello]" as pattern, "[Hello, World]" as epoch_list, 5 as count

;;

}
 

dimension: id {

primary_key:yes

type:number

sql:${TABLE}.id;;

}

dimension: pattern {

type: string

sql:${TABLE}.pattern;;

html: <p style="text-align:center">{{value}}</p> ;;

}

dimension: matches {

type: string

sql:${TABLE}.pattern;;

html: {% assign _count = 0 %}

{% assign __pattern = value | remove: '['| remove: ']'%}

{% assign __epoch_list = epoch_list._value | remove: '['| remove: ']' | split: ', '%}

{% assign _epochlist_countseed = epoch_list._value %}

{% for item in __epoch_list %}

{% if _epochlist_countseed contains __pattern %}

{% assign _count = _count | plus: 1 %}

{% assign _epochlist_countseed = _epochlist_countseed | replace_first: __pattern, '' %}

{% endif %}

{% endfor %}

html: <p style="text-align:center"> {{ _count }}</p> ;;

}

dimension: epoch_list {

type: string

sql:${TABLE}.epoch_list;;

html: {% assign _pattern = pattern._value | remove: '['| remove: ']' | split: ', ' | join: "-" %}

{% assign _epoch_list = value | remove: '['| remove: ']' | split: ', ' | join: "-" %}

{% if _epoch_list contains _pattern %}

<p style="background-color: lightgreen; color: black; text-align:center">{{value}} <i class="fa fa-check" style="font-size:14px;color:green"></i></p>

{% else %}

<p style="background-color: orange; color: black; text-align:center">{{value}} <i class="fa fa-times" style="font-size:14px;color:red"></i></p>

{% endif %};;

}

dimension: epoch_list_highlighted {

type: string

sql:${TABLE}.epoch_list;;

html:

{% assign stripped_pattern = pattern._value | remove: '['| remove: ']' %}

{% assign highlight = stripped_pattern | prepend: '<mark>' | append: '</mark>' %}

{% if value contains stripped_pattern %}

<p style="color: black; text-align:center">{{value | replace: stripped_pattern, highlight }}</p>

{% else %}

<p style="color: black; text-align:center">{{value}}</p>

{% endif %};;

}

measure: count {

type: sum

sql:${TABLE}.count;;

}

}

Explanation of CODE

Remove the square braces to get text only (e.g. value = [120, 300, 400] becomes __pattern = 120, 300, 400)

{% assign __pattern = value | remove: '['| remove: ']'%}

Note: Assign and using liquid variables with the contains operator works best with the if statement.

 

{% if _epochlist_countseed contains __pattern %}

Trying to use the full expressions with the string filters and pararentheses (xxxxx) won't return a true result based on how the contains operator and liquid is processed. Feel free to test this out and notice how the if statement seems to return false when a match actually exists.

Dimension: matches

  1. With each loop, an item is stripped and the count is only incremented with the plus filter if a match is found.
  2. __epoch_list array generated using the split filter and ', ' was used for the Loop length in the matches dimension for this example with a short list of items. The size of the loop can be adjusted as desired.
  3. If we were to print out the value of _epochlist_countseed as we go through the loop, we w

[100, 120, 300, 400, 120, 300, 400] #<----- Recall Pattern was [120, 300, 400]

[100, , 120, 300, 400]

[100, , ]

[100, , ]

[100, , ]

[100, , ]

[100, , ]

2 #<----- count is two(2) based on two matches.

# You can see why the size of the size of the loop (7 items in the seed) would need adjusting to suit your scenario or be more efficient. For large datasets, it could be an expensive loop when the match is already completed. It worked for me as a goog upper limit so I left it as is as.

Dimension: epoch_list_highlighted

  1. This is where the actual match is visible. It was refined from the epoch_list dimension (The 1st attempt)
  2. epoch_list dimension was based on existing documentation for conditional formatting in Looker HTML Parameter.
  3. {% assign _pattern = pattern._value | remove: '['| remove: ']' | split: ', ' | join: "-" %}.
  4. Later realised that splitting the pattern by ', ' and joining them again using hyphens (join: "-") was an unnecessary extra step. (i.e. pattern._value = [120, 300, 400] becomes _pattern = 120-300-400)
  5. You can replace the prepend: '<mark>' and append: '</mark>' with your desired form of highlighting (*, <>, -, <b></b>, <i> ...)

Resources:

This content is subject to limited support.                

Comments
Ismaila
Staff
d7f2374d-3af3-490d-8e4d-92a95a25a6cf.png
Highlight sample

Highlight sample

Version history
Last update:
‎07-07-2021 01:15 PM
Updated by: