Error while reading table: error message: Data between close quote character (") and field separator

Hi Folks,

I have a file sample.txt with the below content,

"501286 - Contract Info" "No Chain" "12/03/2023" "Products" "Fee Adj - Manual" 5000.00
"114173 - Contract Info" "No Chain" "12/03/2023" "Products" "Flat Fee Adj - Manual" 863.03
"114174 - Contract Info" "No Chain" "12/03/2023" "Products" "Flat Fee Adj - Manual" 95.89

I'm trying to read the file as an external table with one column raw_data_txt but I'm getting below error,

Error while reading table:
dba_raw_sample_t_external_tables, error message: Data between close quote
character (") and field separator.; line_number: 2 byte_offset_to_start_of_line:
7 column_index: 0 column_name: "raw_data_txt" column_type: STRING value: "501286
- Contract..." File: gs://ad-us-est4-01-gcs/invoker/app/que/file/sample.txt
Failure details:
- gs://adh-nz-dwh-us-
est4-01-gcs/invokerjob/apps/queue/files/sample.txt: Error
while reading data, error message: Data between close quote
character (") and field separator.; line_number: 2
byte_offset_to_start_of_line: 7 column_index: 0 column_name:
"raw_data_txt" column_type: STRING value: "501286 - Contract..."

 

Any suggestions to fix this are much appreciated.

Solved Solved
0 3 1,888
1 ACCEPTED SOLUTION

The problem is the existence of the quote character in your data.  The LOAD DATA SQL (by default) looks for quote characters in a special way.  You can disable that.  The following SQL worked for me to load your data:

 

load data overwrite `test.onecol` (
  raw STRING
)
from files(
  format="csv",
  uris=["gs://kolban-csv1/test1.csv"],
  quote="\000"
)

and for an external table:

create or replace external table `test.onecol` (
  raw STRING
)
options (
  format="csv",
  uris=["gs://kolban-csv1/test1.csv"],
  quote="\000"
)

 

View solution in original post

3 REPLIES 3

The problem is the existence of the quote character in your data.  The LOAD DATA SQL (by default) looks for quote characters in a special way.  You can disable that.  The following SQL worked for me to load your data:

 

load data overwrite `test.onecol` (
  raw STRING
)
from files(
  format="csv",
  uris=["gs://kolban-csv1/test1.csv"],
  quote="\000"
)

and for an external table:

create or replace external table `test.onecol` (
  raw STRING
)
options (
  format="csv",
  uris=["gs://kolban-csv1/test1.csv"],
  quote="\000"
)

 

Kolban, 

Thank you so much for providing your suggestion. I have tried several ways to escape those double character but wasn't able to and the option you have suggested worked for me but there is a problem with the datatype as the column i'm creating in the external table is STRING. I'm getting the below error,

Option quote value has type
INT64 which cannot be coerced to expected type STRING at [4:9]

CREATE OR REPLACE EXTERNAL TABLE $DBID.${RAWTABLENAME}_external_tables (raw_data_txt string)
OPTIONS (
format='CSV',
quote= '\"',
field_delimiter = CHR(1),
uris=['${GCP_LOAD_FILE}${LOADFILE}']
);

This piece of code is being executed as a part of shell-script execution.

Any ways to workaround this?

@kolban As I said I was executing this statement in a shell script because of which the quote statement was resolving to 000. Now I have re-executed it by changing as quote=\"\\000\" which escaped backslashes and it worked. Thank you so much for your help. Much appreciated.