DATAFORM - Load a table with data from CSV files using Dataform

Hi,

Following this conversation, I decided to put my case as a new message here. My goal is to load a table with data from CSV files using Dataform. For that, I'm using a sqlx file like this::

 

 

LOAD DATA OVERWRITE
  [MY_PROJECT].ILS.test_david_ILS_categories ( category_id String,
    category_name String )
FROM FILES ( format = 'CSV',
    uris = ['gs://[PROJECT_WITH_DATA_SOURCES]/categories_2024_02*.csv']);
SELECT
  *
FROM
  [MY_PROJECT].ILS.test_david_ILS_categories

 

 

That is working fine. But when I add the config block, it fails when executing as a workflow. This is how it looks the sqlx file with the config block:

 

 

config {
    type: "table"
}

LOAD DATA OVERWRITE
  [MY_PROJECT].ILS.test_david_ILS_categories ( category_id String,
    category_name String )
FROM FILES ( format = 'CSV',
    uris = ['gs://[PROJECT_WITH_DATA_SOURCES]/categories_2024_02*.csv']);
SELECT
  *
FROM
  [MY_PROJECT].ILS.test_david_ILS_categories

 

 

In the details of the workflow, I'm getting this error: Syntax error: Unexpected keyword LOAD at [31:1]

Screenshot 2024-04-10 at 18.12.29.png

 

 

 

BEGIN
  CREATE SCHEMA IF NOT EXISTS `[MY_PROJECT].ILS` OPTIONS(location="US");
EXCEPTION WHEN ERROR THEN
  IF NOT CONTAINS_SUBSTR(@@error.message, "already exists: dataset") AND
    NOT CONTAINS_SUBSTR(@@error.message, "too many dataset metadata update operations") AND
    NOT CONTAINS_SUBSTR(@@error.message, "User does not have bigquery.datasets.create permission")
  THEN
    RAISE USING MESSAGE = @@error.message;
  END IF;
END;
    BEGIN
      DECLARE dataform_table_type DEFAULT (
  SELECT ANY_VALUE(table_type)
  FROM `[MY_PROJECT].ILS.INFORMATION_SCHEMA.TABLES`
  WHERE table_name = 'test_david_ILS_categories'
);
          IF dataform_table_type IS NOT NULL THEN
      IF dataform_table_type = 'VIEW' THEN DROP VIEW IF EXISTS `[MY_PROJECT].ILS.test_david_ILS_categories`;
ELSEIF dataform_table_type = 'MATERIALIZED VIEW' THEN DROP MATERIALIZED VIEW IF EXISTS `[MY_PROJECT].ILS.test_david_ILS_categories`;
END IF;
    END IF;
      BEGIN
        
            CREATE OR REPLACE TABLE `[MY_PROJECT].ILS.test_david_ILS_categories`
    
    
    OPTIONS(description='''A table description here''')
    AS (
      

LOAD DATA OVERWRITE
  [MY_PROJECT].ILS.test_david_ILS_categories ( category_id String,
    category_name String )
FROM FILES ( format = 'CSV',
    uris = ['gs://[PROJECT_WITH_SOURCE_DATA]/categories_2024_02*.csv']);
SELECT
  *
FROM
  [MY_PROJECT].ILS.test_david_ILS_categories

    );
        
      END;
    END;

 

 

I suppose that Stored Procedure was expecting a SELECT statement. Is there a way to accomplish the table creation from a CSV directly from Dataform?

--
Best regards
David Regalado
Web | Linkedin | Cloudskillsboost

Solved Solved
7 2 213
1 ACCEPTED SOLUTION

See the documentation, you should use `operations` type in the config block (with hasOutput=true). Or you could keep `table` and wrap the LOAD statement in the `pre_operations` block.

View solution in original post

2 REPLIES 2

See the documentation, you should use `operations` type in the config block (with hasOutput=true). Or you could keep `table` and wrap the LOAD statement in the `pre_operations` block.

I just figure that out, but thanks for your answer. The operations type did the trick!

--
Best regards
David Regalado
Web | Linkedin | Cloudskillsboost