Help with JSON extract in BigQuery

Gui
Bronze 1
Bronze 1

Hi everyone!

I need some help with an issue I'm facing while working with JSON data.

Context:

  • I'm working with a column that stores JSON (string) data in the format of events.
  • Each event contains information such as agent work time, requester wait time, resolution time, etc.
  • The goal is to extract this data and analyze system performance.

Problem:

I've tried various methods to extract the data from the JSON, including using functions like JSON_QUERY, but haven't been successful. All attempts have only resulted in null values.

Possible causes:

  • Lack of a consistent pattern in the events (some have more keys than others).
  • Errors in my extraction code.

Examples of how the data comes in the metric_events column (JSON). 

 

{[{24041479452308, 0, agent_work_time, null, 123468, 2024-02-25T20:37:39Z, measure}, {24041540959252, 1, agent_work_time, null, 123468, 2024-02-25T20:37:39Z, activate}, {24041497645972, 1, agent_work_time, null, 123468, 2024-02-25T20:38:03Z, fulfill}, {24041497646356, 1, agent_work_time, {0, 0}, 123468, 2024-02-25T20:38:03Z, update_status}], [{24041479456916, 0, group_ownership_time, null, 123468, 2024-02-25T20:37:39Z, measure}, {24041479457300, 1, group_ownership_time, null, 123468, 2024-02-25T20:37:39Z, activate}, {24041464986516, 1, group_ownership_time, null, 123468, 2024-02-25T20:38:03Z, fulfill}, {24041464987540, 1, group_ownership_time, {0, 0}, 123468, 2024-02-25T20:38:03Z, update_status}], [{24041479452564, 0, pausable_update_time, null, 3743909, 2024-02-25T20:37:39Z, measure}], [{24041540962452, 0, periodic_update_time, null, 3743909, 2024-02-25T20:37:39Z, measure}], [{null, 24041479453076, 0, reply_time, null, null, 123468, 2024-02-25T20:37:39Z, measure}], [{null, 24041479452692, 0, requester_wait_time, null, null, 3743909, 2024-02-25T20:37:39Z, measure}, {null, 24041540960020, 1, requester_wait_time, null, null, 123468, 2024-02-25T20:37:39Z, activate}, {null, 24041464994324, 1, requester_wait_time, null, null, 123468, 2024-02-25T20:38:03Z, fulfill}, {null, 24041464994708, 1, requester_wait_time, null, {0, 0}, 123468, 2024-02-25T20:38:03Z, update_status}], [{null, 24041479453460, 0, resolution_time, null, null, 123468, 2024-02-25T20:37:39Z, measure}, {null, 24041479453588, 1, resolution_time, null, null, 3743909, 2024-02-25T20:37:39Z, activate}, {null, 24041540961172, 1, resolution_time, {true, {null, 17406820649492, SLA - Geral}, 720}, null, 123468, 2024-02-25T20:37:39Z, apply_sla}, {null, 24041497663124, 1, resolution_time, null, null, 123468, 2024-02-25T20:38:03Z, fulfill}, {null, 24041497666324, 1, resolution_time, null, {0, 0}, 123468, 2024-02-25T20:38:03Z, update_status}]}

 

The code i've originally tried to use:

 

WITH ranked_tickets AS (
  SELECT
    id,
    agent_work_time,
    group_ownership_time,
    pausable_update_time,
    periodic_update_time,
    reply_time,
    requester_wait_time,
    resolution_time
  FROM (
    SELECT
      id,
      JSON_QUERY(metric_events, '$.agent_work_time') AS agent_work_time,
      JSON_QUERY(metric_events, '$.group_ownership_time') AS group_ownership_time,
      JSON_QUERY(metric_events, '$.pausable_update_time') AS pausable_update_time,
      JSON_QUERY(metric_events, '$.periodic_update_time') AS periodic_update_time,
      JSON_QUERY(metric_events, '$.reply_time') AS reply_time,
      JSON_QUERY(metric_events, '$.requester_wait_time') AS requester_wait_time,
      JSON_QUERY(metric_events, '$.resolution_time') AS resolution_time
    FROM `company-dataviz-customer-serv.zendesk.tickets`
    
  ) AS parsed_json
)
SELECT *
FROM ranked_tickets
WHERE id = 123468

 

 Someone can help me out with it? 

0 1 214
1 REPLY 1

In your example, the input data is not JSON.   Here is the start of the data you have provided:

{[{24041479452308, 0, agent_work_time, null, 123468, 2024-02-25T20:37:39Z, measure},...

JSON data would be of the form:

{[{"a": 24041479452308, "b": 0, "c": "agent_work_time", "d": 123468, "e": "2024-02-25T20:37:39Z", "f": "measure"}, ...

Because the input string is not JSON, you can't use the JSON_QUERY function against it.