Proper case for BigQuery

jonf
Participant I

I was struggling trying to find a way to capitalize names in Looker with data coming from BigQuery, since BigQuery lacks a PROPER() function – only LOWER() and UPPER() are available.

Searched in some forums and pretty much all solutions involved a User Defined Function in BigQuery, which (as far as I know) is not supported in Looker.

So, I came up with this subquery to properly capitalize strings:

WITH test as (
  SELECT 1 as AuthorId, "john smith" as Author
  UNION ALL
  SELECT 2, "jane doe"
)
SELECT  
AuthorId,
(
  SELECT 
    STRING_AGG(
      CONCAT(
        UPPER(
          SUBSTR(authors, 1, 1)
        ), 
        LOWER(
          SUBSTR(authors,2)
        )
      ), ' ' ORDER BY pos
    ) 
  FROM 
    UNNEST(SPLIT(Author, " ")) authors  WITH OFFSET pos
)
FROM test
0 3 4,234
3 REPLIES 3

BigQuery supports user-defined functions (UDFs). A UDF enables you to create a function using another SQL expression or JavaScript. This way you can create functions that complement the standard set of functions. Here we have a very good example for this. Let’s create our version of a PROPER() function. Using the above script we are defining the function .f_proper(x). You have to replace with a real dataset from your db.

CREATE FUNCTION .f_proper(x STRING)
RETURNS STRING
AS
(
(SELECT
STRING_AGG(
CONCAT(
UPPER(
SUBSTR(w, 1, 1)
),
LOWER(
SUBSTR(w,2)
)
), ’ ’ ORDER BY pos
)
FROM
UNNEST(SPLIT(x, " ")) w WITH OFFSET pos
)
);

Now we can use this function in all our scripts.

Let’s test it:

WITH test as (
SELECT 1 as AuthorId, “john smith” as Author
UNION ALL
SELECT 2, “jane doe”
)
SELECT AuthorId, Author, .f_proper(Author) as Proper_Author
FROM test;

BigQuery supports user-defined functions (UDFs). A UDF enables you to create a function using another SQL expression or JavaScript. This way you can create functions that complement the standard set of functions.

Here we have a very good example for this. Let’s create our version of a PROPER() function. Using the above script we are defining the function dataset.f_proper(x). You have to replace dataset with a real dataset from your db.

CREATE FUNCTION <dataset>.f_proper(x STRING) 
               RETURNS STRING 
AS 
(
  (SELECT 
    STRING_AGG(
      CONCAT(
        UPPER(
          SUBSTR(w, 1, 1)
        ), 
        LOWER(
          SUBSTR(w,2)
        )
      ), ' ' ORDER BY pos
    ) 
   FROM 
    UNNEST(SPLIT(x, " ")) w  WITH OFFSET pos
  )
);

Now we can use this function in all our scripts.

Let’s test it:

WITH test as (
  SELECT 1 as AuthorId, "john smith" as Author
  UNION ALL
  SELECT 2, "jane doe"
)
SELECT AuthorId, Author, <dataset>.f_proper(Author) as ProperAuthor
FROM test;

FYI, those interested to do proper, this available in BigQuery as INITICAP:
 https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#initcap

Top Labels in this Space
Top Solution Authors