Redshift COPY - err_reason Invalid value err_code 1216 - json

After attempting to run my Redshift COPY command:
copy my_schema.my_table
from 's3://my-bucket/my_json_file.json.gz'
iam_role 'my_role'
JSON 'auto';
I get the following error:
[XX000] ERROR: Load into table 'my_table' failed. Check 'stl_load_errors' system table for details.
So when querying
select raw_line, err_code, err_reason
from stl_load_errors;
I get:
+----------------------------+--------+--------------+
|raw_line |err_code|err_reason |
+----------------------------+--------+--------------+
|.......c..my_json_file.json |1216 |Invalid value.|
+----------------------------+--------+--------------+
which doesn't really help me to understand what the problem is. Any help?

As I'm loading GZIP-compressed data, I had to tell the COPY command about the compression format by adding the GZIP option:
copy my_schema.my_table
from 's3://my-bucket/my_json_file.json.gz'
iam_role 'my_role'
GZIP
JSON 'auto';
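If stl_load_errors still looks cryptic for a future failed COPY, it carries more columns than raw_line, err_code and err_reason; filename, line_number, colname and raw_field_value usually pinpoint the offending record. Below is a minimal sketch for pulling them from Python with psycopg2; the connection details are placeholders, only the stl_load_errors column names are taken from Redshift itself.

# Minimal sketch: pull richer diagnostics from stl_load_errors after a failed COPY.
# Assumes psycopg2 is installed; replace the connection details with your own.
import psycopg2

conn = psycopg2.connect(
    host="my-cluster.example.redshift.amazonaws.com",  # hypothetical endpoint
    port=5439,
    dbname="my_db",
    user="my_user",
    password="my_password",
)

query = """
    select starttime, filename, line_number, colname, type,
           raw_field_value, err_code, err_reason
    from stl_load_errors
    order by starttime desc
    limit 10;
"""

with conn, conn.cursor() as cur:
    cur.execute(query)
    for row in cur.fetchall():
        print(row)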

Related

bigquery error: "Could not parse '41.66666667' as INT64"

I am attempting to create a table using a .tsv file in BigQuery, but keep getting the following error:
"Failed to create table: Error while reading data, error message: Could not parse '41.66666667' as INT64 for field Team_Percentage (position 8) starting at location 14419658 with message 'Unable to parse'"
I am not sure what to do as I am completely new to this.
Here is a file with the first 100 lines of the full data:
https://wetransfer.com/downloads/25c18d56eb863bafcfdb5956a46449c920220502031838/f5ed2f
Here are the steps I am currently taking to create the table:
https://i.gyazo.com/07815cec446b5c0869d7c9323a7fdee4.mp4
Appreciate any help I can get!
As confirmed with the OP (@dan), the error is caused by selecting Auto detect when creating a table from a .tsv file as the source.
The fix is to manually create a schema and define the data type of each column properly (a sketch of doing this through the Python client is below). For more on specifying schemas in BQ, see this document.
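For reference, a hedged sketch of the same fix through the Python BigQuery client, declaring Team_Percentage as FLOAT instead of the auto-detected INT64. The project, table, file name and the other column names are placeholders; only Team_Percentage comes from the error message.

# Hedged sketch: load the .tsv with an explicit schema instead of auto-detect.
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # hypothetical destination table

schema = [
    bigquery.SchemaField("Team", "STRING"),            # placeholder column
    # every other column of the file must be listed here, in order
    bigquery.SchemaField("Team_Percentage", "FLOAT"),  # the field that failed as INT64
]

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,  # a .tsv is CSV with a tab delimiter
    field_delimiter="\t",
    skip_leading_rows=1,                      # assumes a header row
    schema=schema,
    autodetect=False,
)

with open("first_100_lines.tsv", "rb") as f:  # hypothetical local file name
    job = client.load_table_from_file(f, table_id, job_config=job_config)
job.result()  # wait for the load to finish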

Special characters are causing an issue while loading a CSV file into db2 table

My file has data as below
abc|def|I completed by degreeSymbol®|210708
My load/import statement, which is run in a Linux shell script, is below. The LANG environment variable value is en_US.UTF-8.
load client from filename of del MODIFIED BY CHARDEL timestampformat="YYYYMMDD" coldel| usedefaults fastparse messages logfilename insert into tablename nonrecoverable;
In the table, the data is loaded as
abc def (null) 210708
Also, when I run a select query I get the below error from db2:
com.ibm.db2.jcc.am.SqlException: Caught java.io.CharConversionException

Error while reading data, error message: CSV table references column position 15, but line starting at position:0 contains only 1 columns

I am new to BigQuery. I am trying to load data into a GCP BigQuery table that I created manually, and I have one bash file which contains this bq load command:
bq load --source_format=CSV --field_delimiter=$(printf '\u0001') dataset_name.table_name gs://bucket-name/sample_file.csv
My CSV file contains multiple rows with 16 columns; a sample row is:
100563^3b9888^Buckname^https://www.settttt.ff/setlllll/buckkkkk-73d58581.html^Buckcherry^null^null^2019-12-14^23d74444^Reverb^Reading^Pennsylvania^United States^US^40.3356483^-75.9268747
Table schema -
When I execute the bash script from Cloud Shell, I get the following error:
Waiting on bqjob_r10e3855fc60c6e88_0000016f42380943_1 ... (0s) Current status: DONE
BigQuery error in load operation: Error processing job 'project-name-staging:bqjob_r10e3855fc60c6e88_0000ug00004521': Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.
Failure details:
- gs://bucket-name/sample_file.csv: Error while reading data, error message: CSV table references column position 15, but line starting at position:0 contains only 1 columns.
What would be the solution? Thanks in advance.
You are trying to insert values into your table that do not match the schema you provided.
Based on the table schema and your data example, I ran this command:
./bq load --source_format=CSV --field_delimiter=$(printf '^') mydataset.testLoad /Users/tamirklein/data2.csv
1st error
Failure details:
- Error while reading data, error message: Could not parse '39b888'
as int for field Field2 (position 1) starting at location 0
At this point, I manually removed the b from 39b888 and now I get this
2nd error
Failure details:
- Error while reading data, error message: Could not parse
'14/12/2019' as date for field Field8 (position 7) starting at
location 0
At this point, I changed 14/12/2019 to 2019-12-14, which is the BQ date format, and now everything is OK:
Upload complete.
Waiting on bqjob_r9cb3e4ef5ad596e_0000016f42abd4f6_1 ... (0s) Current status: DONE
You will need to clean your data before uploading (a sketch of that cleanup is below), or load a larger data sample with the --max_bad_records flag (some of the lines will load and some will not, depending on your data quality).
Note: unfortunately there is no way to control the date format during the upload; see this answer as a reference.
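A hedged sketch of that pre-upload cleanup with pandas, using the ^ delimiter from the command above and the two fields reported in the errors (Field2 at position 1, Field8 at position 7). The file names are placeholders.

# Hedged sketch of the "clean your data before upload" step with pandas.
import pandas as pd

df = pd.read_csv("data2.csv", sep="^", header=None, dtype=str)

# Field2 (position 1) must be an integer: drop non-digit characters like the stray 'b'.
df[1] = df[1].str.replace(r"\D", "", regex=True)

# Field8 (position 7) must be a DATE in YYYY-MM-DD: reparse 14/12/2019-style values.
df[7] = pd.to_datetime(df[7], dayfirst=True, errors="coerce").dt.strftime("%Y-%m-%d")

df.to_csv("data2_clean.csv", sep="^", header=False, index=False)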
We had the same problem while importing data from local files into BigQuery. After examining the data, we saw that some values started with \r or \s.
After applying ua['ColumnName'].str.strip() and ua['District'].str.rstrip() (see the sketch below), we could load the data into BigQuery.
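A short, self-contained version of that fix, assuming pandas; the file name and column names are placeholders mirroring the snippet above.

# Hedged sketch of the stripping step described above.
import pandas as pd

ua = pd.read_csv("local_file.csv")  # hypothetical local source file

# Remove leading/trailing whitespace and control characters such as \r
# from the offending string columns before loading into BigQuery.
ua["ColumnName"] = ua["ColumnName"].str.strip()
ua["District"] = ua["District"].str.rstrip()

ua.to_csv("local_file_clean.csv", index=False)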
Thanks

Importing csv file into Cassandra

I am using the COPY command to load data from a csv file into a Cassandra table. The following error occurs when using the command.
Command: COPY quote.emp(alt, high,low) FROM 'test1.csv' WITH HEADER= true ;
The error is:
get_num_processess() takes no keyword argument.
This is caused by CASSANDRA-11574. As mentioned in the ticket comments, there is a workaround:
Move copyutil.c somewhere else. Same thing if you also have a copyutil.so.
You should be able to find these files under pylib/cqlshlib/.

JSON to append in Big Query CLI using write_disposition=writeAppend fails

I could not make the BQ shell append a JSON file using the --write_disposition=WRITE_APPEND flag.
load --sour_format=NEWLINE_DELIMITED_JSON --write_disposition=WRITE_APPEND dataset.tablename /home/file1/one.log /home/file1/jschema.json
I have file named one.log and its schema jschema.json.
While executing the script, it says
FATAL flags parsing error : unknown command line flag 'write_dispostion'
RUN 'bq.py help' to get help.
I believe BigQuery is append-only, so there should be a way to append data to a table, but I am unable to find a workaround. Any assistance, please?
I believe the default operational mode is WRITE_APPEND when using the bq tool, and there is no --write_disposition switch for the bq shell utility.
But there is a --replace flag, which sets the write disposition to truncate. (If you need to set the disposition explicitly, see the sketch below using the Python client.)
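A hedged sketch of appending newline-delimited JSON with an explicit write disposition through the Python BigQuery client rather than the bq shell. The project, dataset, table and file path are placeholders; the existing table's schema is reused for the append.

# Hedged sketch: append newline-delimited JSON with an explicit write disposition.
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.dataset.tablename"  # hypothetical existing table

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,  # append rather than truncate
)

with open("/home/file1/one.log", "rb") as f:
    job = client.load_table_from_file(f, table_id, job_config=job_config)
job.result()  # wait for the append to complete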