How do I find and replace in a CSV I'm importing using MySQL

I'm importing a CSV file using HeidiSQL. I've created the table with this code:
create table all_data
(
Keyword varchar(1000),
Position int,
Previous_Position int,
Search_Volume int,
KW_Difficulty float,
URL varchar(1000),
Post_title varchar(1000),
Post_URL varchar(1000),
Genre varchar(1000),
Location varchar(1000),
Avg_Daily_Visitors float,
pageviews int
)
;
The CSV contains "\N" in the Avg_Daily_Visitors column wherever there is no value. I've been importing the data with this code:
load data local infile 'C:/filepath.../All_Data.csv'
replace into table all_data
fields terminated by ','
enclosed by '"'
escaped by '"'
lines terminated by "\r\n"
ignore 1 rows
set
Avg_Daily_Visitors = replace(Avg_Daily_Visitors,"\N",0),
pageviews = replace(pageviews,"\N", 0)
;
However, this does not replace the values with 0, which is what I want to achieve. How do I make HeidiSQL replace "\N" with "0" on import?
Thanks.

First assign the value you read to a user variable, then work on that variable. To do this, list the columns of the destination table in the order they appear in the CSV, and put a variable in place of each column whose value you want to replace.
load data local infile 'C:/filepath.../All_Data.csv'
replace into table all_data
fields terminated by ','
enclosed by '"'
escaped by '"'
lines terminated by "\r\n"
ignore 1 rows
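(column_1, column_2, @variable1, @variable2, column_5)
set
-- '\\N' is the two-character text \N; with default settings an unescaped "\N" in a string literal is read as just N
Avg_Daily_Visitors = replace(@variable1, '\\N', 0),
pageviews = replace(@variable2, '\\N', 0)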
;
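
To check what the SET clause above is expected to produce, the same substitution can be mimicked outside MySQL. The following is a minimal Python sketch and not part of the original answer: the function name `replace_null_markers` and the sample rows are hypothetical, and it assumes the marker in the file is the literal two-character text \N.

```python
import csv
import io

NULL_MARKER = r"\N"  # literal backslash followed by N, as it appears in the CSV


def replace_null_markers(csv_text, columns, default="0"):
    """Return the data rows as dicts, with NULL_MARKER replaced by `default` in `columns`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        for name in columns:
            if row[name] == NULL_MARKER:
                row[name] = default
        rows.append(row)
    return rows


# Hypothetical sample with the same line endings as the import (\r\n).
sample = (
    "Keyword,Avg_Daily_Visitors,pageviews\r\n"
    "shoes,12.5,40\r\n"
    "boots,\\N,\\N\r\n"
)
cleaned = replace_null_markers(sample, ["Avg_Daily_Visitors", "pageviews"])
for row in cleaned:
    print(row)
```

Only fields that consist entirely of the marker are changed here, which is slightly stricter than replace(), since replace() would also rewrite the marker inside longer text.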

Related

MySQL SQL Error [1411] [HY000]: Incorrect datetime value: '' for function str_to_date

I'm trying to import data from a CSV file. One column is a date, which is stored in the CSV as %d-%b-%y (e.g. 12-Aug-20).
The table:
create table shows(
ShowID int unique,
Title varchar(255),
TypeID int,
Director varchar(255),
Cast blob,
DateAdded date,
year year,
Violence varchar(255),
Duration varchar(255),
Description blob
);
I tried running this script to populate:
INTO TABLE shows
FIELDS TERMINATED BY ','
optionally enclosed by '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(@ShowID,@type1,shows.Title,shows.Director,shows.`Cast`,@country,@date_added,shows.`year`,shows.Violence,shows.Duration,@c,shows.Description)
SET shows.ShowID = @ShowID,
shows.TypeID = (select TypeID from typecountry t where t.`Type`=@type1 and t.Country=@country),
shows.DateAdded = STR_TO_DATE(@date_added, '%e-%b-%y');
and this error shows up:
SQL Error [1411] [HY000]: Incorrect datetime value: '' for function str_to_date
Some of the date values in your file are empty, so you need to replace them:
LOAD DATA INFILE 'c:/tmp/myfile.csv'
INTO TABLE shows
FIELDS TERMINATED BY ','
optionally enclosed by '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(@ShowID,@type1,shows.Title,shows.Director,shows.`Cast`,@country,@date_added,shows.`year`,shows.Violence,shows.Duration,@c,shows.Description)
SET shows.ShowID = @ShowID,
shows.TypeID = (select TypeID from typecountry t where t.`Type`=@type1 and t.Country=@country),
-- the fallback date must itself match the '%e-%b-%y' format
shows.DateAdded = STR_TO_DATE(IF(@date_added = '', '1-Jan-99', @date_added), '%e-%b-%y');
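
The idea of substituting a default before parsing can be tried out on a few sample values before running the import. This is a hedged Python sketch, not the answerer's code: `parse_date_added` and the 1999-01-01 fallback are placeholders for illustration, and `%b` assumes English month abbreviations (it depends on the process locale).

```python
from datetime import date, datetime

FALLBACK = date(1999, 1, 1)  # placeholder default, chosen only for illustration


def parse_date_added(raw, fallback=FALLBACK):
    """Parse values such as '12-Aug-20'; return `fallback` for empty fields."""
    raw = raw.strip()
    if raw == "":
        return fallback
    # %d-%b-%y: day of month, abbreviated month name, two-digit year
    return datetime.strptime(raw, "%d-%b-%y").date()


print(parse_date_added("12-Aug-20"))
print(parse_date_added(""))
```

A value that is neither empty nor in the expected format raises ValueError here, which makes it easy to spot other malformed dates in the file.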

Error Code: 1366. Incorrect integer value: '#N/A' for column 'Length' at row 21

In my dataset, there are many nonsense "#N/A" values. I created a new table and tried to load the local CSV file into it, but because of these values I get the error shown in the title. How can I load the dataset successfully despite the nonsense values?
Here is my code:
CREATE TABLE movies (
Yearnum INT NOT NULL,
Length INT,
Title VARCHAR(255) NOT NULL,
Subjct VARCHAR(255) NOT NULL,
Actor VARCHAR(255) NOT NULL,
Actress VARCHAR(255) ,
Director VARCHAR(255) ,
Popularity VARCHAR(255),
Awards VARCHAR(255) NOT NULL
);
LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/movies - movies.csv'
INTO TABLE movies
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
Here is a link to the dataset: https://docs.google.com/spreadsheets/d/1J17LYPJZaW5QWQuQVJQonpJXGUbPGOJjg5bxj3SMupQ/edit?usp=sharing
You can work around this by using a column list with variables in the place of columns whose data might not be valid in the CSV file; the column values can then be made valid using a SET clause. For your example:
LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/movies - movies.csv'
INTO TABLE movies
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(Yearnum, @length, Title, Subjct, Actor, Actress, Director, Popularity, Awards)
SET Length = CASE WHEN @length = '#N/A' THEN 0 ELSE @length END
See the manual for more details.
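
Before deciding which columns need a variable and a SET clause, it can help to list the values that would fail the integer conversion. The following Python sketch is only an illustration under stated assumptions: `find_invalid_ints` is a hypothetical helper and the sample rows are made up, not taken from the linked dataset.

```python
import csv
import io


def find_invalid_ints(csv_text, column):
    """Return (line_number, value) pairs where `column` is not a valid integer."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    for row in reader:
        value = row[column].strip()
        try:
            int(value)
        except ValueError:
            # reader.line_num is the physical line just read (the header is line 1)
            problems.append((reader.line_num, value))
    return problems


# Hypothetical sample data.
sample = (
    "Yearnum,Length,Title\n"
    "1990,111,Example A\n"
    "1991,#N/A,Example B\n"
    "1992,,Example C\n"
)
print(find_invalid_ints(sample, "Length"))
```

Empty fields are reported as well, since an empty string is not a valid integer either.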

How to fix "invalid syntax for integer: 'NUM'"

I have this CSV file and I want to copy it into the table I created, but pgAdmin outputs:
ERROR: invalid input syntax for integer: "NUM" CONTEXT: COPY tickets,
line 1, column num: "NUM" SQL state: 22P02
The COPY code :
copy TICKETS(NUM,KIND,LOCATIONS,PRICE,DATES,CAT)
FROM 'C:\tmp\tickets.csv' DELIMITER ',' CSV
The CSV file:
Why don't you try this way:
create table TICKETS(
NUM INT,
KIND INT,
LOCATION VARCHAR(100),
PRICE INT,
DATE DATE,
CAT CHAR(1)
);
LOAD DATA INFILE 'C:/tmp/tickets.csv'
INTO TABLE TICKETS
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
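The important point is the last line: IGNORE 1 ROWS skips the header row, so no error is raised. Note that LOAD DATA is MySQL syntax; with PostgreSQL's COPY, as used in the question, the equivalent is to add HEADER after CSV so that the first line is skipped.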

Backslash throws unexpected beginning of statement error

So I have this SQL Script for bulk loading my data:
INSERT INTO DeliveryMethod (deliveryMethod)
VALUES ('Bicycle');
INSERT INTO DeliveryMethod (deliveryMethod)
VALUES ('Car');
INSERT INTO DeliveryMethod (deliveryMethod)
VALUES ('Van');
INSERT INTO DeliveryMethod (deliveryMethod)
VALUES ('None');
SET foreign_key_checks = 0;
# Load data into categories
LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/CategoryData.txt'
REPLACE INTO TABLE CourierDB.Category
FIELDS TERMINATED BY ','
LINES STARTING BY '(' TERMINATED BY ')\r'
(name)
SET categoryID = NULL; # Should trigger auto increment (or \r)
# Load data into package
LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/PackageData.txt'
REPLACE INTO TABLE CourierDB.Package
FIELDS TERMINATED BY ','
LINES STARTING BY '(' TERMINATED BY ')\r'
(price, itemName, category)
SET packageID = NULL; # Should trigger auto increment (or \r)
# Load data into address
LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/AddressData.txt'
REPLACE INTO TABLE CourierDB.Address
FIELDS TERMINATED BY ','
LINES STARTING BY '(' TERMINATED BY ')\r'
(buildingName, streetName, county, postcode)
SET addressID = NULL;
# Load data into packages
LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/PackagesData.txt'
REPLACE INTO TABLE CourierDB.Packages
FIELDS TERMINATED BY ','
LINES STARTING BY '(' TERMINATED BY ')\r'
(packagesID, package);
# Load data into branch
LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/BranchData.txt'
REPLACE INTO TABLE CourierDB.Branch
FIELDS TERMINATED BY ','
LINES STARTING BY '(' TERMINATED BY ')\r'
(branchName, address, headOfficeID, managerID, deliveryMethods)
SET branchID = NULL;
# Load data into consignment
LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/ConsignmentData.txt'
REPLACE INTO TABLE CourierDB.Consignment
FIELDS TERMINATED BY ','
LINES STARTING BY '(' TERMINATED BY ')\r'
(dispatchDate, consignmentType, branch, deliveryAddressID, returnAddressID, packages)
SET trackingID = NULL;
# Load data into DeliveryMethods
LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/DeliveryMethodsData.txt'
REPLACE INTO TABLE CourierDB.DeliveryMethods
FIELDS TERMINATED BY ','
LINES STARTING BY '(' TERMINATED BY ')\r'
(deliveryMethodID, deliveryMethod);
# Load data into Employee
LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/EmployeeData.txt'
REPLACE INTO TABLE CourierDB.Employee
FIELDS TERMINATED BY ','
LINES STARTING BY '(' TERMINATED BY ')\r'
(NIN, firstName, lastName, dateOfBirth, emailAddress, mobileNo, salary, branchID, supervisorID, address)
SET staffNo = NULL;
# Load data into CustomerConsignments
LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/CustomerConsignmentsData.txt'
REPLACE INTO TABLE CourierDB.CustomerConsignments
FIELDS TERMINATED BY ','
LINES STARTING BY '(' TERMINATED BY ')\r'
(consignmentsID, consignment);
# Load data into Customer
LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/CustomerData.txt'
REPLACE INTO TABLE CourierDB.Customer
FIELDS TERMINATED BY ','
LINES STARTING BY '(' TERMINATED BY ')\r'
(firstName, lastName, dateOfBirth, emailAddress, mobileNo, customerBranchID, address, consignments)
SET customerID = NULL;
SET foreign_key_checks = 1;
However, when I run it against my relational schema, I get this error:
Static analysis:
3 errors were found during analysis.
Unexpected character. (near "\" at position 0) Unexpected beginning of
statement. (near "\" at position 0) Unexpected beginning of statement.
(near "r" at position 1) SQL query:
\r LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4/PackageData.txt'
REPLACE INTO TABLE CourierDB.Package FIELDS TERMINATED BY ',' LINES
STARTING BY '(' TERMINATED BY ')\r' (price, itemName, category) SET
packageID = NULL
MySQL said: Documentation
1064 - You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use
near '\r LOAD DATA LOCAL INFILE
'C:/Users/ryank/OneDrive/Documents/GitHub/DBMS-CW/R4' at line 1
Any help would be greatly appreciated! Thanks

using sequence in SQL loader

I have created table as
CREATE TABLE TEST2
(Seq varchar2(255 CHAR),
ID varchar2(255 CHAR),
NAME VARCHAR2 (255 CHAR),
DOB TIMESTAMP(3)
);
my control file is
load data
infile 'C:\Users\sgujar\Documents\CDAR\test2.csv'
append into table TEST2
fields terminated by ","
(ID,
NAME,
DOB "TO_TIMESTAMP (:DOB, 'YYYY-MM-DD HH24:MI:SS.FF')",
seq"TEST2_seq.nextval"
)
I am not able to use the sequence in SQL*Loader.
Can you please help?
Although not a particularly pretty solution, it does what you ask:
CREATE OR REPLACE
FUNCTION get_test2_seq RETURN INTEGER
IS
BEGIN
RETURN TEST2_seq.nextval;
END;
/
And then your control file would be
load data
infile 'C:\Users\sgujar\Documents\CDAR\test2.csv'
append into table TEST2
fields terminated by ","
(
ID,
NAME,
DOB "TO_TIMESTAMP (:DOB, 'YYYY-MM-DD HH24:MI:SS.FF')",
SEQ "get_test2_seq()"
)
Alternatively, SQL*Loader can generate the numbers itself with its SEQUENCE(start, increment) field option. These values are produced by the loader, independently of the database sequence TEST2_seq:
options (DIRECT=TRUE,readsize=4096000,bindsize=4096000,skip=1,errors=1,rows=50000)
LOAD DATA
CHARACTERSET AL32UTF8 LENGTH SEMANTICS CHARACTER
INFILE '/path/test.csv'
BADFILE '/path/file.bad'
INSERT INTO TABLE test_table
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(
Col1 sequence(1,1),
Col2 constant "N"
)