Question

Generating a Numbers Table in MySQL and Redshift

  • 24 February 2015
  • 10 replies
  • 2033 views

Userlevel 6
Badge

Postgres implements function generate_series, a function that returns a table of a set of integers. Unfortunately, MySQL lacks this function and Redshifts implementation is pretty useless (It only works on the leader node?? Really?).



In a combination with a CROSS JOIN, number tables are often used to build map tables, from attributes stored in lists in strings. They are also used to build date table (see below).



I’ve seen a few implementations of numeric series generating SELECT statements, but they are usually more complex and larger than this one.



The SQL code makes a bunch of two row tables and cross joins them, using them as binary counters. The code will generate all the numbers between 0 and 255 (2^8-1)



SELECT 

p0.n

+ p1.n*2

+ p2.n * POWER(2,2)

+ p3.n * POWER(2,3)

+ p4.n * POWER(2,4)

+ p5.n * POWER(2,5)

+ p6.n * POWER(2,6)

+ p7.n * POWER(2,7)

as number

FROM

(SELECT 0 as n UNION SELECT 1) p0,

(SELECT 0 as n UNION SELECT 1) p1,

(SELECT 0 as n UNION SELECT 1) p2,

(SELECT 0 as n UNION SELECT 1) p3,

(SELECT 0 as n UNION SELECT 1) p4,

(SELECT 0 as n UNION SELECT 1) p5,

(SELECT 0 as n UNION SELECT 1) p6,

(SELECT 0 as n UNION SELECT 1) p7



Its pretty easy to create a derived table for general use. Play with this Code on Learn.looker.com



- view: numbers

derived_table:

persist_for: 5 hours

indexes: [number]

sql: |

SELECT

p0.n

+ p1.n*2

+ p2.n * POWER(2,2)

+ p3.n * POWER(2,3)

+ p4.n * POWER(2,4)

+ p5.n * POWER(2,5)

+ p6.n * POWER(2,6)

+ p7.n * POWER(2,7)

as number

FROM

(SELECT 0 as n UNION SELECT 1) p0,

(SELECT 0 as n UNION SELECT 1) p1,

(SELECT 0 as n UNION SELECT 1) p2,

(SELECT 0 as n UNION SELECT 1) p3,

(SELECT 0 as n UNION SELECT 1) p4,

(SELECT 0 as n UNION SELECT 1) p5,

(SELECT 0 as n UNION SELECT 1) p6,

(SELECT 0 as n UNION SELECT 1) p7

fields:

- dimension: number

type: number



Another useful form of this table is to generate dates.



- view: dates

derived_table:

persist_for: 5 hours

indexes: [series_date]

sql: |

SELECT

DATE_ADD('2001-01-01', INTERVAL numbers.number DAY)

as series_date

FROM ${numbers.SQL_TABLE_NAME} AS numbers

fields:

- dimension: series_date

type: date


10 replies

 

For Redshift updating dateadd function,

SELECT dateadd(day, numbers.number, '2013-01-01') as series_date FROM ${numbers.SQL_TABLE_NAME}

I am trying this by I keep getting errors. The code is not that hard, so I’m running out of ideas what might be the error….

 

It works if I replace numbers.num with 1 for instance. Adding the whole range of the numbers table generates the error.

Try casting the number as an integer, I got an error that it wouldn’t work for type “double precision” but it worked once I did that. 


SELECT dateadd(day, numbers.number::integer, '2013-01-01') as series_date FROM ${numbers.SQL_TABLE_NAME}

For Redshift updating dateadd function,

SELECT dateadd(day, numbers.number, '2013-01-01') as series_date FROM ${numbers.SQL_TABLE_NAME}

I am trying this by I keep getting errors. The code is not that hard, so I’m running out of ideas what might be the error….

 

It works if I replace numbers.num with 1 for instance. Adding the whole range of the numbers table generates the error.

Has anyone had success creating a date table in MS SQL Server 2008+?

Userlevel 4

For BigQuery, a new native method (GENERATE_DATE_ARRAY):


SELECT date FROM UNNEST(GENERATE_DATE_ARRAY(DATE_SUB(CURRENT_DATE, INTERVAL 5 YEAR), CURRENT_DATE)) date

For Redshift updating dateadd function,


SELECT dateadd(day, numbers.number, '2013-01-01') as series_date FROM ${numbers.SQL_TABLE_NAME}

Userlevel 2

Snowflake for good measure: 😉



select seq8() as integer_sequence

, dateadd(day, seq8() + 1, current_date) as calendar_date

from table(generator(rowCount => 365))

Userlevel 6
Badge

@russ In BigQuery Standard SQL, you can use a built in function to do most of this:



SELECT(DATE_ADD(DATE(CURRENT_TIMESTAMP()), INTERVAL -1* n DAY)) 

FROM UNNEST(GENERATE_ARRAY(0,2000,1)) n

Userlevel 3

Since there were a few nuances that tripped me up, I figured I’d post working examples for both versions of Big Query (Both Adaptations on the code above):



Big Query Standard SQL:


Uses the row number window/analytic function to reduce complexity. The pattern can be extended to provide more rows by simply repeating the pattern in the from clause



SELECT

/* Have Me Look from today backward*/

DATE(TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 * (row_number() OVER ()) DAY)) AS dt

/* Have me look from a fixed date forward*/

-- DATE(TIMESTAMP_ADD(TIMESTAMP(2012-10-01 02:03:04), INTERVAL 1 * (row_number() OVER ()) DAY)) AS dt

FROM

(SELECT 0 as n UNION ALL SELECT 1),

(SELECT 0 as n UNION ALL SELECT 1),

(SELECT 0 as n UNION ALL SELECT 1),

(SELECT 0 as n UNION ALL SELECT 1),

(SELECT 0 as n UNION ALL SELECT 1),

(SELECT 0 as n UNION ALL SELECT 1),

(SELECT 0 as n UNION ALL SELECT 1),

/* I'm extensible, if you need more rows you can add more of me into the from clause (each addition will increase the number of rows by a power of two, 2 4 8 32 64 etc) */

(SELECT 0 as n UNION ALL SELECT 1) --,

ORDER BY 1 DESC



Big Query Legacy SQL:


In Legacy SQL the row number function didn’t want to cooperate, so I kept the power of two logic. Meaning that it’s still extensible, but you need to provide another from clause entry and make the corresponding addition to the math in the select clause. Also to note, there is no UNION operator in legacy so the from clause is explicitly cross joining the legacy syntax for union “,”.



SELECT

/* Have Me Look from today backward*/

DATE(DATE_ADD(CURRENT_TIMESTAMP(), -1 *

p0.n

+ p1.n * 2

+ p2.n * POW(2,2)

+ p3.n * POW(2,3)

+ p4.n * POW(2,4)

+ p5.n * POW(2,5)

+ p6.n * POW(2,6)

+ p7.n * POW(2,7)

, "DAY")) AS dt

/* Have me look from a fixed date forward*/

-- DATE(DATE_ADD(TIMESTAMP('2012-10-01 02:03:04'), 1 *

-- p0.n

-- + p1.n * 2

-- + p2.n * POW(2,2)

-- + p3.n * POW(2,3)

-- + p4.n * POW(2,4)

-- + p5.n * POW(2,5)

-- + p6.n * POW(2,6)

-- + p7.n * POW(2,7)

-- , "DAY")) AS dt

FROM

(SELECT n as n FROM (select 0 as n) a , (select 1 as n) b) p0,

CROSS JOIN (SELECT n as n FROM (select 0 as n) a , (select 1 as n) b) p1,

CROSS JOIN (SELECT n as n FROM (select 0 as n) a , (select 1 as n) b) p2,

CROSS JOIN (SELECT n as n FROM (select 0 as n) a , (select 1 as n) b) p3,

CROSS JOIN (SELECT n as n FROM (select 0 as n) a , (select 1 as n) b) p4,

CROSS JOIN (SELECT n as n FROM (select 0 as n) a , (select 1 as n) b) p5,

CROSS JOIN (SELECT n as n FROM (select 0 as n) a , (select 1 as n) b) p6,

CROSS JOIN (SELECT n as n FROM (select 0 as n) a , (select 1 as n) b) p7

/* I'm extensible too!, just create an entry for p8...x (each addition will increase the number of rows by a power of two, 2 4 8 32 64 etc) and make the corresponding change to

the select clause */

ORDER BY 1
Userlevel 6
Badge

@mmcloughlin Thanks for pointing that out.



I ended up refining it further sql dialects that support window functions 😄



https://learnbeta.looker.com/projects/lookml_design_patterns/files/numbers.view.lookml



    SELECT 

ROW_NUMBER() OVER () as number

FROM

(SELECT 0 as n UNION SELECT 1) p0,

(SELECT 0 as n UNION SELECT 1) p1,

(SELECT 0 as n UNION SELECT 1) p2,

...

This is way better than other similar solutions I’ve seen, thanks. I just wanted to point out a typo in your post: your query produces numbers 0 to 255, not 0 to 65535.

Reply