Using a sequence in SQL*Loader

I have created a table as follows:
CREATE TABLE TEST2
(Seq varchar2(255 CHAR),
ID varchar2(255 CHAR),
NAME VARCHAR2 (255 CHAR),
DOB TIMESTAMP(3)
);
My control file is:
load data
infile 'C:\Users\sgujar\Documents\CDAR\test2.csv'
append into table TEST2
fields terminated by ","
(ID,
NAME,
DOB "TO_TIMESTAMP (:DOB, 'YYYY-MM-DD HH24:MI:SS.FF')",
seq"TEST2_seq.nextval"
)
I am not able to use the sequence in SQL*Loader.
Can you please help?

Although not a particularly pretty solution, it does what you ask:
CREATE OR REPLACE
FUNCTION get_test2_seq RETURN INTEGER
IS
BEGIN
RETURN TEST2_seq.nextval;
END;
/
Your control file would then be:
load data
infile 'C:\Users\sgujar\Documents\CDAR\test2.csv'
append into table TEST2
fields terminated by ","
(
ID,
NAME,
DOB "TO_TIMESTAMP (:DOB, 'YYYY-MM-DD HH24:MI:SS.FF')",
SEQ "get_test2_seq()"
)

Another option is SQL*Loader's built-in SEQUENCE function, which generates the values during the load without needing a database sequence. SEQUENCE(1,1) starts at 1 and increments by 1:
options (DIRECT=TRUE,readsize=4096000,bindsize=4096000,skip=1,errors=1,rows=50000)
LOAD DATA
CHARACTERSET AL32UTF8 LENGTH SEMANTICS CHARACTER
INFILE '/path/test.csv'
BADFILE '/path/file.bad'
INSERT INTO TABLE test_table
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(
Col1 sequence(1,1),
Col2 constant "N"
)

Related

MySQL SQL Error [1411] [HY000]: Incorrect datetime value: '' for function str_to_date

I'm trying to import data from a CSV file. One column is a date, which is stored in the CSV file as %d-%b-%y (e.g. 12-Aug-20).
The table
create table shows(
ShowID int unique,
Title varchar(255),
TypeID int,
Director varchar(255),
Cast blob,
DateAdded date,
year year,
Violence varchar(255),
Duration varchar(255),
Description blob
);
I tried running this script to populate:
INTO TABLE shows
FIELDS TERMINATED BY ','
optionally enclosed by '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(@ShowID,@type1,shows.Title,shows.Director,shows.`Cast`,@country,@date_added,shows.`year`,shows.Violence,shows.Duration,@c,shows.Description)
SET shows.ShowID =@ShowID ,
shows.TypeID = (select TypeID from typecountry t where t.`Type`=@type1 and t.Country=@country),
shows.DateAdded= STR_TO_DATE(@date_added , '%e-%b-%y');
and this error shows up :
SQL Error [1411] [HY000]: Incorrect datetime value: '' for function str_to_date
Some rows in your CSV file have an empty date value, so you need to replace those before passing them to STR_TO_DATE. The placeholder must be written in the same format as the real dates:
LOAD DATA INFILE 'c:/tmp/myfile.csv'
INTO TABLE shows
FIELDS TERMINATED BY ','
optionally enclosed by '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(@ShowID,@type1,shows.Title,shows.Director,shows.`Cast`,@country,@date_added,shows.`year`,shows.Violence,shows.Duration,@c,shows.Description)
SET shows.ShowID =@ShowID ,
shows.TypeID = (select TypeID from typecountry t where t.`Type`=@type1 and t.Country=@country),
shows.DateAdded= STR_TO_DATE(IF (@date_added = '','01-Jan-99', @date_added) , '%e-%b-%y');
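If a NULL date is acceptable for rows that have no value, a variant is to turn the empty string into NULL with NULLIF before parsing, so no placeholder date is needed. This is only a sketch: it assumes DateAdded may be NULL (the table definition above allows it) and that the date is read into a user variable named @date_added (MySQL user variables are written with @).

```sql
-- Drop-in replacement for the last SET assignment of the LOAD DATA statement:
--   shows.DateAdded = STR_TO_DATE(NULLIF(@date_added, ''), '%e-%b-%y');

-- The expression can be checked on its own, without loading a file:
SELECT STR_TO_DATE(NULLIF('12-Aug-20', ''), '%e-%b-%y') AS parsed,   -- 2020-08-12
       STR_TO_DATE(NULLIF('', ''), '%e-%b-%y')          AS empty_in; -- NULL
```

NULLIF returns NULL when its two arguments are equal, and STR_TO_DATE returns NULL for a NULL input, so the empty field never reaches the date parser.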

Error Code: 1366. Incorrect integer value: '#N/A' for column 'Length' at row 21

In my dataset, there are many nonsense values of "#N/A". I create a new table and load the local CSV file into my new table. But because of the nonsense value, it shows the error as shown in the headline. How can I load a dataset successfully without the nonsense values?
Here is my code:
CREATE TABLE movies (
Yearnum INT NOT NULL,
Length INT,
Title VARCHAR(255) NOT NULL,
Subjct VARCHAR(255) NOT NULL,
Actor VARCHAR(255) NOT NULL,
Actress VARCHAR(255) ,
Director VARCHAR(255) ,
Popularity VARCHAR(255),
Awards VARCHAR(255) NOT NULL
);
LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/movies - movies.csv'
INTO TABLE movies
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
Here is the link to my dataset: https://docs.google.com/spreadsheets/d/1J17LYPJZaW5QWQuQVJQonpJXGUbPGOJjg5bxj3SMupQ/edit?usp=sharing
You can work around this by using a column list with variables in the place of columns whose data might not be valid in the CSV file; the column values can then be made valid using a SET clause. For your example:
LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/movies - movies.csv'
INTO TABLE movies
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(Yearnum, @length, Title, Subjct, Actor, Actress, Director, Popularity, Awards)
SET Length = CASE WHEN @length = '#N/A' THEN 0 ELSE @length END
See the manual for more details.
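If you would rather record a missing length as NULL than as 0, the same pattern works with NULLIF. A minimal sketch, assuming the Length column stays nullable as in the CREATE TABLE above and that the field is read into a user variable named @length:

```sql
LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/movies - movies.csv'
INTO TABLE movies
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(Yearnum, @length, Title, Subjct, Actor, Actress, Director, Popularity, Awards)
-- NULLIF returns NULL when both arguments are equal, otherwise the first argument
SET Length = NULLIF(@length, '#N/A');
```

Storing NULL keeps "unknown" distinguishable from a real length of 0, which matters for aggregates such as AVG(Length).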

How to fix "invalid input syntax for integer: 'NUM'"

I have this CSV file and I want to copy it to the table I created but pgadmin outputs:
ERROR: invalid input syntax for integer: "NUM" CONTEXT: COPY tickets,
line 1, column num: "NUM" SQL state: 22P02
The COPY code :
copy TICKETS(NUM,KIND,LOCATIONS,PRICE,DATES,CAT)
FROM 'C:\tmp\tickets.csv' DELIMITER ',' CSV
The CSV file:
Why don't you try this way:
create table TICKETS(
NUM INT,
KIND INT,
LOCATION VARCHAR(100),
PRICE INT,
DATE DATE,
CAT CHAR(1)
);
LOAD DATA INFILE 'C:/tmp/tickets.csv'
INTO TABLE TICKETS
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
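The important part is the last line: IGNORE 1 ROWS skips the header row, so the column titles are never parsed as data and no error is raised. Note that LOAD DATA INFILE is MySQL syntax and will not run in PostgreSQL.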
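For PostgreSQL itself, which is what the COPY statement and the error message in the question come from, the same effect is available through the HEADER option of COPY. A sketch that reuses the table, column names, and path from the question (none of these are verified against the actual file):

```sql
-- HEADER tells COPY to skip the first line of the file, so the
-- column title "NUM" is never parsed as an integer.
COPY tickets (num, kind, locations, price, dates, cat)
FROM 'C:\tmp\tickets.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',');
```

COPY ... FROM 'file' reads the path on the database server. If the file lives on the client machine, psql's \copy command accepts the same options.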

Hive external table pointing to CSV file with embedded double quotes

I am trying to create an external Hive table pointing to a CSV file.
My CSV file has a column(col2) that could have double quotes and comma as part of the column value.
Data in each column:
Col1 : 150
Col2 : BATWING, ABC "D " TEST DATA
Col3 : 300
Row in CSV:
150,"BATWING, ABC ""D "" TEST DATA",300
Create table DDL :
CREATE EXTERNAL TABLE test (
col1 INT,
col2 STRING,
col3 INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
ESCAPED BY '"'
LOCATION 's3://test-folder/test-file.csv'
When I query the table, I see null values in col3.
What am I missing here while creating the table? Any help is appreciated
Use OpenCSVSerde. Here is an example:
Create table
CREATE TABLE bala (col1 int, col2 string, col3 int)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES(
"separatorChar" = ",", "escapeChar"='\"'
);
Load data
hive>LOAD DATA INPATH '/../test.csv' INTO TABLE bala
Loading data to table bala
Table testing.bala stats: [numFiles=1, totalSize=40]
OK
Time taken: 0.514 seconds
Check if it has loaded
hive> select * from bala;
OK
150 BATWING, ABC "D " TEST DATA 300
Time taken: 0.288 seconds, Fetched: 1 row(s)
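One thing to keep in mind: OpenCSVSerde exposes every column as a string, whatever types the DDL declares. If numeric columns are needed downstream, one option is a view that casts them. A minimal sketch (the view name bala_typed is made up for illustration):

```sql
-- Cast the string columns produced by OpenCSVSerde back to INT.
CREATE VIEW bala_typed AS
SELECT CAST(col1 AS INT) AS col1,
       col2,
       CAST(col3 AS INT) AS col3
FROM bala;
```

Queries against bala_typed then see col1 and col3 as integers, while the underlying table keeps parsing the quoted CSV correctly.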
Create hive external table:
DROP TABLE IF EXISTS ${hiveconf:dbnm}.tblnm ;
CREATE EXTERNAL TABLE ${hiveconf:dbnm}.tblnm (
C1 string,
C2 string
)
PARTITIONED BY (C3 string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = '|' -- change this to your separator
,"quoteChar" = '\"'
)
STORED AS TEXTFILE
LOCATION '/hdfspath'
--tblproperties ("skip.header.line.count"="1")
;
MSCK REPAIR TABLE ${hiveconf:dbnm}.tblnm;

How do I find and replace in a CSV I'm importing using mysql

I'm importing a CSV file into Heidi SQL. I've created the table using this code:
create table all_data
(
Keyword varchar(1000),
Position int,
Previous_Position int,
Search_Volume int,
KW_Difficulty float,
URL varchar(1000),
Post_title varchar(1000),
Post_URL varchar(1000),
Genre varchar(1000),
Location varchar(1000),
Avg_Daily_Visitors float,
pageviews int
)
;
but in the Avg_Daily_visitors column it has "\N" where there is no value. I've been importing the data with this code:
load data local infile 'C:/filepath.../All_Data.csv'
replace into table all_data
fields terminated by ','
enclosed by '"'
escaped by '"'
lines terminated by "\r\n"
ignore 1 rows
set
Avg_Daily_Visitors = replace(Avg_Daily_Visitors,"\N",0),
pageviews = replace(pageviews,"\N", 0)
;
but it's not replacing the values with 0, which is what I want to achieve. How do I make Heidi SQL replace "\N" with "0" on import?
Thanks.
Read the raw value into a user variable first, then transform that variable in the SET clause. To do this, list the columns of the destination table in the order they appear in the file, and put a variable in place of each column whose value you want to rewrite. Note that the backslash must be doubled in the SQL string literal ('\\N'); otherwise MySQL drops it and searches for a plain N.
load data local infile 'C:/filepath.../All_Data.csv'
replace into table all_data
fields terminated by ','
enclosed by '"'
escaped by '"'
lines terminated by "\r\n"
ignore 1 rows
(column_1, column_2, @variable1, @variable2, column_5)
set
Avg_Daily_Visitors = replace(@variable1,'\\N',0),
pageviews = replace(@variable2,'\\N', 0)
;
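Applied to the all_data table from the question, the column list might look like the sketch below. It assumes the CSV columns appear in the same order as in the CREATE TABLE statement, which the question does not confirm. The variable names @avg_daily_visitors and @pageviews are made up for illustration, and the doubled backslash assumes the default SQL mode (NO_BACKSLASH_ESCAPES not enabled).

```sql
load data local infile 'C:/filepath.../All_Data.csv'
replace into table all_data
fields terminated by ','
enclosed by '"'
escaped by '"'
lines terminated by "\r\n"
ignore 1 rows
(Keyword, Position, Previous_Position, Search_Volume, KW_Difficulty, URL,
 Post_title, Post_URL, Genre, Location, @avg_daily_visitors, @pageviews)
set
-- '\\N' is the two-character string backslash + N as it appears in the file
Avg_Daily_Visitors = if(@avg_daily_visitors = '\\N', 0, @avg_daily_visitors),
pageviews = if(@pageviews = '\\N', 0, @pageviews)
;
```

Comparing the whole field with IF, rather than calling REPLACE, leaves any value that merely contains the characters \N untouched.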