Loading a huge CSV file into an Oracle table with high performance - sql-loader

I have a CSV file whose size is about 20 GB. The file has three Persian columns, and I want to load it into an Oracle table. I searched and found that SQL*Loader has high performance. But when I load the file into the table, the Persian data is not loaded in the right order; in fact, this is because Persian is a right-to-left language.
I use this control file:
OPTIONS (SKIP=0, ERRORS=500, PARALLEL=TRUE, MULTITHREADING=TRUE, DIRECT=TRUE,
SILENT=(ALL))
load data
CHARACTERSET UTF8
infile '/home/oracle/part.csv'
APPEND
into table Fact_test
fields terminated by ','
trailing nullcols(
A_ID INTEGER,
T_ID,
G_ID,
TRTYPE,
ORRETURNED,
MECH,
AMN,
TRAM INTEGER,
USERID INTEGER,
USERS INTEGER,
VERID INTEGER,
TRSTAMP CHAR(4000),
OPR_BRID INTEGER
)
The file looks like this:
A_ID,T_ID,g_id,TrType,ORRETURNED,Mech,Amn,Tram,UserID,UserS,VerID,TRSTAMP,OPR_BRID
276876739075,154709010853,4302,بروفق,اصلی,غیر سبک,بررسی,86617.1,999995,NULL,NULL,1981-11-16 13:23:16,2516
When I export the table in Excel format, I get the following; some numbers have become negative:
(A_ID,T_ID,g_id,TrType,ORRETURNED,Mech,Amn,Tram,UserID,UserS,VerID,TRSTAMP,OPR_BRID) values (276876739075,'154709010853',411662402610,'4302','غیر بررسی','اصلي','سبک',-1344755500,-1445296167,-1311201320,909129772,'77.67',960051513);
The problem is that when the data is loaded, some columns contain negative numbers and the order of some columns changes.
Would you please guide me how to solve the issue?
Any help is really appreciated.

Problem solved:
I changed the control file to this one:
load data
CHARACTERSET UTF8
infile '/home/oracle/test_none.csv'
APPEND
into table Fact_test
FIELDS terminated BY ','
trailing nullcols(
A_ID CHAR(50),
T_ID CHAR(50),
G_ID CHAR(50),
TRTYPE,
ORRETURNED,
MECH,
AMN,
TRAM CHAR(50),
USERID,
USERS CHAR(50),
VERID CHAR(50),
TRSTAMP,
OPR_BRID CHAR(50)
)
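For context, the likely root cause is the SQL*Loader datatype: a bare INTEGER in a control file means a native binary integer, so comma-separated numeric text gets read as raw bytes, which would explain the negative values and shifted columns above. Loading the fields as CHAR works, as shown; an alternative sketch (assuming the same Fact_test columns, with INTEGER/DECIMAL EXTERNAL for the numeric text and NULLIF for the literal NULL strings in the sample row) could look like this:
load data
CHARACTERSET UTF8
infile '/home/oracle/part.csv'
APPEND
into table Fact_test
fields terminated by ','
trailing nullcols
(
A_ID       INTEGER EXTERNAL,  -- numeric text, not the binary INTEGER datatype
T_ID       INTEGER EXTERNAL,
G_ID       INTEGER EXTERNAL,
TRTYPE     CHAR,              -- Persian text columns load as CHAR
ORRETURNED CHAR,
MECH       CHAR,
AMN        CHAR,
TRAM       DECIMAL EXTERNAL,  -- sample value 86617.1 has a decimal point
USERID     INTEGER EXTERNAL,
USERS      INTEGER EXTERNAL NULLIF USERS='NULL',
VERID      INTEGER EXTERNAL NULLIF VERID='NULL',
TRSTAMP    CHAR(4000),
OPR_BRID   INTEGER EXTERNAL
)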

Related

Error in saving csv to MySQL table due to NULL values

I have the data saved in a CSV file exported from Excel. The data has some null values across rows and columns. I want to save this data into my MySQL database, but the null values are causing a problem when saving the data to MySQL. This is the query for creating the table -
create table student (
Std_ID int,
Roll_NO int,
First_Name varchar(10) NOT NULL,
Last_Name varchar(10),
Class int,
constraint test_student primary key (Std_ID)
);
...and it ran successfully. Now I want to save my data from csv to this table using the query -
load data infile 'C:\\ProgramData\\MySQL\\MySQL Server 8.0\\Uploads\\new.csv' into table student fields terminated by ',' lines terminated by '\n' ignore 1 lines;
...and it is giving me this error message -
ERROR 1366 (HY000): Incorrect integer value: '' for column 'XXX' at row X.
For reference, you can use this data. The same can be found below.
LOAD DATA INFILE 'C:\\ProgramData\\MySQL\\MySQL Server 8.0\\Uploads\\new.csv'
INTO TABLE student
FIELDS TERMINATED BY ','
LINES TERMINATED by '\n'
IGNORE 1 LINES
-- specify columns, use variables for the columns where incorrect value may occur
(Std_ID, @Roll_NO, First_Name, Last_Name, @Class)
-- use preprocessing, replace empty string with NULL but save any other value
SET Roll_NO = NULLIF(@Roll_NO, ''),
Class = NULLIF(@Class, '');
If a column in the CSV is an empty string, then NULL will be inserted into the corresponding column of the table.
Std_ID is not preprocessed because it is defined as PRIMARY KEY, and it cannot be NULL.
UPDATE
The OP provided a sample of the source file. Viewing it in HEX mode shows that the file is a Windows-style text file, and hence the line terminator is '\r\n'. After editing the statement accordingly, the file is imported successfully.
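Putting the two fixes together (empty-string handling and the Windows line terminator), a sketch of the full statement, assuming the same file path and table as above, would be:
LOAD DATA INFILE 'C:\\ProgramData\\MySQL\\MySQL Server 8.0\\Uploads\\new.csv'
INTO TABLE student
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'  -- Windows-style line endings, per the hex check
IGNORE 1 LINES
(Std_ID, @Roll_NO, First_Name, Last_Name, @Class)
SET Roll_NO = NULLIF(@Roll_NO, ''),
    Class = NULLIF(@Class, '');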

importing csv file into mysql server but only 1st row is inserting and with wrong entries

Goal: I have to import a CSV file into MySQL server.
Problem: Only the 1st row is inserted, and with wrong entries at that.
The format of my CSV file is:
lifelock,LifeLock,,web,Tempe,AZ,1-May-07,6850000,USD,b
lifelock,LifeLock,,web,Tempe,AZ,1-Oct-06,6000000,USD,a
lifelock,LifeLock,,web,Tempe,AZ,1-Jan-08,25000000,USD,c
mycityfaces,MyCityFaces,7,web,Scottsdale,AZ,1-Jan-08,50000,USD,seed
flypaper,Flypaper,,web,Phoenix,AZ,1-Feb-08,3000000,USD,a
query to create sql table:
create table fund(permalink varchar(20), company varchar(20), numEmps int, category varchar(15),
city varchar(20),state varchar(15),fundedDate Date,raisedAmt int, raisedCurrency longtext,round longtext);
query to import csv file:
load data infile '/var/lib/mysql-files/TechCrunchcontinentalUSA.csv' into table fund fields terminated by ','
lines terminated by '\n' (permalink, company, @numEmps, category, city, state, @funded, @raised, raisedCurrency, round)
set numEmps = cast(@emps as unsigned), raisedAmt = cast(@raised as unsigned), fundedDate = STR_TO_DATE(@funded, '%d-%b-%Y');
Is your primary key set to AUTO_INCREMENT? Also, check whether your CSV file has a newline separator for each line. Sometimes CSV column data can be too big to fit into your defined DB column; say you define your city column to hold varchar(20), but the CSV file has an instance where the city is more than 20 characters.
You can always validate your CSV using this online tool (make sure you don't include any sensitive data, as it is a third-party tool).
Alternatively, try this too if your csv file and table name are same:
mysqlimport --lines-terminated-by='\n' --fields-terminated-by=',' --fields-enclosed-by='"' --verbose --local db_name TechCrunchcontinentalUSA.csv
Provide the username and password flags -uroot -proot if you have them set up.
sql query
load data infile '/var/lib/mysql-files/TechCrunchcontinentalUSA.csv' into table fund fields terminated by ','
lines terminated by '\r' (permalink, company, @numEmps, category, city, state, @funded, @raised, raisedCurrency, round)
set numEmps = cast(@emps as unsigned), fundedDate = STR_TO_DATE(@funded, '%d-%b-%y'), raisedAmt = cast(@raised as unsigned);
I have changed the line terminator from \n to \r and changed the date format from %d-%b-%Y to %d-%b-%y, because %Y expects a four-digit year (e.g., 2007) while %y expects a two-digit year (e.g., 07).
For date format specifier visit here.
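As a quick sanity check, %y matches the two-digit years in the sample rows above:
SELECT STR_TO_DATE('1-May-07', '%d-%b-%y');  -- returns 2007-05-01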

Copy data from one table to another and change datatype

Hello, I would like to copy data from one table to another, but I need to convert a varchar column to an integer column. The table columns are below; I would like to convert the pop column to integer. Thank you.
ea_id int,
yr int,
age varchar,
sex varchar,
eattain varchar,
income varchar,
pop varchar,
-- I just created another table and tried loading data like this; my new
-- table has pop as an integer data type, but I am still getting a syntax error
load data infile 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/file.csv'
into table ca_pop.educ_att
fields terminated by ','
enclosed by '"'
ignore 1 lines
(ea_id, @yr, age, sex, eattain, income, @pop)
set year = YEAR(STR_TO_DATE(@year,'%m/%d/%Y %k:%i'))
set pop = nullif(pop,'')
);
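A corrected sketch of the statement above would use a single SET clause with consistent @-variable names, assign to the actual column yr rather than year, and drop the stray closing parenthesis. This assumes the target table ca_pop.educ_att defines yr and pop as INT and that the CSV's year column holds timestamps such as 1/1/2019 0:00 (a hypothetical sample value):
LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/file.csv'
INTO TABLE ca_pop.educ_att
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
IGNORE 1 LINES
(ea_id, @yr, age, sex, eattain, income, @pop)
SET yr = YEAR(STR_TO_DATE(@yr, '%m/%d/%Y %k:%i')),
    pop = NULLIF(@pop, '');
For the table-to-table copy mentioned in the title, the usual pattern is INSERT ... SELECT with a cast, for example (table names hypothetical):
INSERT INTO educ_att_new (ea_id, yr, age, sex, eattain, income, pop)
SELECT ea_id, yr, age, sex, eattain, income, CAST(NULLIF(pop, '') AS UNSIGNED)
FROM educ_att_old;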

Load csv file into mysql database

I'm trying to load a big CSV file into MySQL but can't figure out why it fails.
My CSV file looks like this:
_id,"event","unit","created","r1","r2","r3","r4","space_id","owner_id","name","display_name","users__email"
565ce313819709476d7eaf0e,"create",3066,"2015-12-01T00:00:19.604Z","563f592dd6f47ae719be8b38","3","13","7","55ecdd4ea970e6665f7e3911","55e6e3f0a856461404a60fc1","household","household","foo.bar#ace.com"
565ce350819709476d7eaf0f,"complete",3067,"2015-12-01T00:01:19.988Z","21","","","","55e6df3ba856461404a5fdc9","55e6e3f0a856461404a60fc1","Ace","Base","foo.bar#ace.com"
565ce350819709476d7eaf0f,"delete",3067,"2015-12-01T00:01:19.988Z","21","","","","55e6df3ba856461404a5fdc9","55e6e3f0a856461404a60fc1","Ace","Base","foo.bar#ace.com"
565ce350819709476d7eaf0f,"update",3067,"2015-12-01T00:01:19.988Z","21","","","","55e6df3ba856461404a5fdc9","55e6e3f0a856461404a60fc1","Ace","Base","foo.bar#ace.com"
And here is my code to load the file into MySQL:
CREATE DATABASE IF NOT EXISTS analys;
USE analys;
CREATE TABLE IF NOT EXISTS event_log (
_id CHAR(24) NOT NULL,
event_log VARCHAR(255),
unit CHAR(4),
created VARCHAR(255),
r1 VARCHAR(255),
r2 VARCHAR(255),
r3 VARCHAR(255),
r4 VARCHAR(255),
space_id VARCHAR(255),
owner_id VARCHAR(255),
name VARCHAR(255),
display_name VARCHAR(255),
users__email VARCHAR(255),
PRIMARY KEY (_id)
);
LOAD DATA INFILE 'audits.export.csv'
INTO TABLE event_log
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n\r'
IGNORE 1 ROWS;
Everything seems fine, including the query, but I get NULL in every column (and only one row).
Here is the Action Output:
22:31:21 LOAD DATA INFILE 'audits.export.csv' INTO TABLE event_log FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n\r' IGNORE 1 ROWS 0 row(s) affected Records: 0 Deleted: 0 Skipped: 0 Warnings: 0 0.156 sec
I tried tweaking the table and the load query, but it doesn't work.
I'm on Windows 7, using Mysql 5.6 and Workbench.
I heard about GUI solutions or Excel connectors here (Excel Connector), but I prefer to do it programmatically as I need to reuse the code.
Any help? I couldn't solve the problem with similar posts on Stack Overflow.
This doesn't seem like a valid newline:
LINES TERMINATED BY '\n\r'
change to:
LINES TERMINATED BY '\r\n'
Or only one of them (it could be a single \n or \r, depending on the system or the software that created the CSV file).
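In full, a corrected sketch of the statement from the question would be (the exact terminator depends on how the CSV was produced):
LOAD DATA INFILE 'audits.export.csv'
INTO TABLE event_log
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'  -- or '\n' for Unix-style files
IGNORE 1 ROWS;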
Because some might come here wanting to know how to do this with MySQL Workbench:
Save CSV data to a file, for example foo.csv
Open MySQL Workbench
Choose/Open a MySQL Connection
Right-click on a schema in the Object Browser (left Navigator)
Choose "Table Data Import Wizard"
Select the CSV file, which is foo.csv in our example
Follow the wizard; many configuration options are available, including the separator
When finished, the CSV data will be in a new or existing table (your choice)
For additional information, see the documentation titled Table Data Export and Import Wizard.
I just tested this with the example data provided in the question, and it worked.

Using LOAD DATA INFILE to upload csv into mysql table

I'm using LOAD DATA INFILE to upload a .csv into a table.
This is the table I have created in my db:
CREATE TABLE expenses (entry_id INT NOT NULL AUTO_INCREMENT, PRIMARY KEY(entry_id),
ss_id INT, user_id INT, cost FLOAT, context VARCHAR(100), date_created DATE);
This is some of the sample data I'm trying to upload (some of the rows have data for every column, some are missing the date column):
1,1,20,Sandwiches after hike,
1,1,45,Dinner at Yama,
1,2,40,Dinner at Murphys,
1,1,40.81,Dinner at Yama,
1,2,1294.76,Flight to Taiwan,1/17/2011
1,2,118.78,Grand Hyatt # Seoul,1/22/2011
1,1,268.12,Seoul cash withdrawal,1/8/2011
Here is the LOAD DATA command which I can't get to work:
LOAD DATA INFILE '/tmp/expense_upload.csv'
INTO TABLE expenses (ss_id, user_id, cost, context, date)
;
This command completes and uploads the correct number of rows into the table, but every field is NULL. Any time I try to add FIELDS ENCLOSED BY ',' or LINES TERMINATED BY '\r\n', I get a syntax error.
Other things to note: the csv was created in MS Excel.
If anyone has tips or can point me in the right direction it would be much appreciated!
First of all, I'd change FLOAT to DECIMAL for cost:
CREATE TABLE expenses
(
entry_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
ss_id INT,
user_id INT,
cost DECIMAL(19,2), -- use DECIMAL instead of FLOAT
context VARCHAR(100),
date_created DATE
);
Now try this
LOAD DATA INFILE '/tmp/sampledata.csv'
INTO TABLE expenses
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n' -- or \r\n
(ss_id, user_id, cost, context, @date_created)
SET date_created = IF(CHAR_LENGTH(TRIM(@date_created)) > 0,
STR_TO_DATE(TRIM(@date_created), '%m/%d/%Y'),
NULL);
What it does:
it uses the correct syntax for specifying the field and line terminators
since your date values in the file are not in a proper format, it first reads each value into a user/session variable; then, if it's not empty, it converts it to a date, otherwise it assigns NULL. The latter prevents you from getting zero dates (0000-00-00).
Here is my advice: load the data into a staging table where all the columns are strings, and then insert into the final table. This allows you to better check the results along the way:
CREATE TABLE expenses_staging (entry_id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(entry_id),
ss_id varchar(255),
user_id varchar(255),
cost varchar(255),
context VARCHAR(100),
date_created varchar(255)
);
LOAD DATA INFILE '/tmp/expense_upload.csv'
INTO TABLE expenses_staging
FIELDS TERMINATED BY ','
(ss_id, user_id, cost, context, date_created);
This will let you see what is really being loaded. You can then load this data into the final table, doing whatever data transformations are necessary.
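A sketch of that second step, assuming the staging load succeeded, the dates are in m/d/Y form, and empty strings should become NULL, might look like this:
INSERT INTO expenses (ss_id, user_id, cost, context, date_created)
SELECT CAST(ss_id AS UNSIGNED),
       CAST(user_id AS UNSIGNED),
       CAST(cost AS DECIMAL(19,2)),
       context,
       STR_TO_DATE(NULLIF(TRIM(date_created), ''), '%m/%d/%Y')
FROM expenses_staging;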