load data infile separating words by "_" - mysql

I'm currently importing a dictionary into MySQL whose words are separated by _. I want to know how to specify that the words are separated by _. For example, the words look like this:
Super_Mario
Stack_Overflow
Another_Word
so that each row would then be stored as:
Super Mario
Stack Overflow
Another Word
I have this query right now:
LOAD DATA LOCAL INFILE
'C:/upload/dictionary.csv'
INTO TABLE dictionary
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
Would I have to use FIELDS TERMINATED BY '_'?

No, you just use the SET clause (just like in an UPDATE) to set the field's value with the result of a string REPLACE() operation that replaces underscores with spaces.
LOAD DATA LOCAL INFILE
'C:/upload/dictionary.csv'
INTO TABLE dictionary (@var1)
SET your_column_name = REPLACE(@var1, '_', ' ')
The (@var1) bit after INTO TABLE dictionary just means "there's only one column in the file I'm interested in, and I want to store it in @var1 so I can use it later in my SET clause instead of putting it directly into a column." Search for "SET clause" in the documentation for LOAD DATA INFILE to see how to use a SET clause when your input file has more than one column.

FIELDS TERMINATED BY '_' will interpret every word separated by _ as a new column,
so Super_Mario, Stack_Overflow and Another_Word would each end up as two columns in each row. If every entry in your dictionary is made up of exactly two words and your dictionary table has two columns, it'll work, but I have the feeling not every entry in your file is going to be two words.
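To make the column-count problem concrete, here is a small Python sketch (not MySQL itself) that mimics splitting each line on "_". The third entry is a made-up example, not taken from the question.

```python
# Mimic what FIELDS TERMINATED BY '_' would do: split every line on "_".
# "New_Super_Mario_Bros" is a hypothetical entry, not from the question.
lines = ["Super_Mario", "Stack_Overflow", "New_Super_Mario_Bros"]


def split_fields(line, terminator="_"):
    """Return the fields one line would be broken into."""
    return line.split(terminator)


field_counts = [len(split_fields(line)) for line in lines]
print(field_counts)  # [2, 2, 4]
```

Any entry with more or fewer than two words no longer lines up with a two-column table.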
If you want to store each line in the file as a single column, but with all the _s replaced with spaces, you could do something like this after the import (or do what @Jordan said and do it as part of the import, which sounds better to me):
UPDATE dictionary SET columnname = replace(columnname,'_',' ')

Related

How can I load blank/NULL values with LOAD DATA INFILE from the MySQL command line

I am using the following command from the MySQL command line to try and import a csv file into one of my tables:
LOAD DATA INFILE 'file path' INTO TABLE table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(BASEID, BIGID, StartDate, EndDate)
Some of my rows have no value for EndDate which I want to be represented as NULL values in the database.
Unfortunately when I execute the above command I get the following error:
for column 'EndDate' at row 141lue:
If I remove the rows with blank cells the command works, so it is clearly the blank values for EndDate which are causing the problem.
I have also tried changing my csv file so that the blank cells say NULL or \N. I have also tried the following command instead but I still get the same error message:
LOAD DATA INFILE 'file path' INTO TABLE Table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(BASEID, BIGID, StartDate, @EndDate)
SET EndDate = nullif(@EndDate, ' ')
How can I load CSV files which have some blank values? The suggested solutions I have seen in other posts don't seem to work, as outlined above.
Is the issue that the value for the end date is missing, or that the column itself is missing? These are not the same thing. For the former case, I think LOAD DATA should be able to handle this, assuming that the target column for the end date can tolerate missing/null values.
What I suspect here is that some of your input lines look like this:
1,1,'2020-10-03'
That is, there is no fourth column present at all. If this is the case, then the most prudent thing to do might be to run a simple regex over your input CSV file to fix the lines where the fourth column is missing. You may try:
Find: ^([^,]+,[^,]+,'[^,]+')$
Replace: $1,
This would turn the sample line above into:
1,1,'2020-10-03',
Now, the date value is still missing, but at least LOAD DATA should detect that the line has four columns, instead of just three.
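If your editor has no regex find-and-replace, the same substitution can be sketched in Python. This is only an illustration: the function names are made up, and it assumes the file uses plain \n line endings and that the third field is always wrapped in single quotes, as in the sample line.

```python
import re

# Same pattern as the Find expression above: exactly three fields,
# the third one wrapped in single quotes.
THREE_FIELDS = re.compile(r"^([^,]+,[^,]+,'[^,]+')$")


def pad_missing_column(line):
    """Append a trailing comma when the line has only three fields."""
    return THREE_FIELDS.sub(r"\1,", line)


def pad_file_text(text):
    """Apply the fix to every line of the CSV text."""
    return "\n".join(pad_missing_column(line) for line in text.split("\n"))


print(pad_missing_column("1,1,'2020-10-03'"))  # 1,1,'2020-10-03',
```

Lines that already have four fields do not match the pattern and are left unchanged.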

Load data from text file to DB

Data:
1|\N|"First\Line"
2|\N|"Second\Line"
3|100|\N
\N represents NULL in MySQL and MariaDB.
I'm trying to load the above data into a table named ID_OPR using LOAD DATA LOCAL INFILE.
Table structure:
CREATE TABLE ID_OPR (
idnt decimal(4),
age decimal(3),
comment varchar(100)
);
My code looks like below:
LOAD DATA LOCAL INFILE <DATA FILE LOCATION> INTO TABLE <TABLE_NAME> FIELDS TERMINATED BY '|' ESCAPED BY '' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n';
The problem with this code is that it aborts with the error Incorrect decimal value: '\\N' for column <Column name>.
Question:
How can I load this data with NULL values in the second (decimal) column without losing the \ (backslash) in the third (string) column?
I'm trying this in MariaDB, which is similar to MySQL in most cases.
Update:
The error I mentioned actually appears as a warning, and the data is getting loaded into the table. But the catch here is with the text data.
For example, the third record above is loaded with the literal text \N in the string column, but I want it to be NULL.
Is there any way to make the software recognize this null value? Something like DECODE in Oracle?
You can't have it both ways - either \ is an escape character or it is not. From MySQL docs:
If the FIELDS ESCAPED BY character is empty, no characters are escaped and NULL is output as NULL, not \N. It is probably not a good idea to specify an empty escape character, particularly if field values in your data contain any of the characters in the list just given.
So, I'd suggest a consistently formatted input file, however that was generated:
use \\ if you want to keep the backslash in the strings
make \ an escape character in your load command
OR
make strings always, not optionally, enclosed in quotes
leave escape character empty, as is
use NULL for nulls, not \N
BTW, this also explains the warnings you were experiencing loading \N in your decimal field.
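For the first option (doubled backslashes, with \ as the escape character), the input file could be rewritten with a short script. The following is a rough Python sketch under a few assumptions: the function name is made up, fields are split naively on |, and no | appears inside a quoted string.

```python
def escape_backslashes(line, separator="|"):
    r"""Double every backslash in a field unless the field is exactly \N."""
    fields = line.split(separator)
    # Leave the NULL marker \N alone; escape backslashes everywhere else.
    fixed = [f if f == r"\N" else f.replace("\\", "\\\\") for f in fields]
    return separator.join(fixed)


sample = r'1|\N|"First\Line"'
print(escape_backslashes(sample))  # 1|\N|"First\\Line"
```

With a file rewritten this way, the load command would keep backslash as the escape character (the default) instead of ESCAPED BY ''.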
Represent the nulls as blanks. That should fix it.
1||"First\Line"
2||"Second\Line"
3|100|
That's how nulls are handled in CSVs and TSVs. And don't expect the decimal datatype to go null, as it stays 0; use int or bigint instead if needed. You should forget about "ESCAPED BY"; as long as string data is enclosed by "", that deals with the escaping problem.
We need three text files and one batch file to load the data:
Suppose your file location is 'D:\loaddata'
and your text file is 'D:\loaddata\abc.txt'
1. D:\loaddata\abc.bad -- empty
2. D:\loaddata\abc.log -- empty
3. D:\loaddata\abc.ctl
a. Write the code below for data with no separator (fixed positions)
OPTIONS ( SKIP=1, DIRECT=TRUE, ERRORS=10000000, ROWS=5000000)
load data
infile 'D:\loaddata\abc.txt'
TRUNCATE
into table Your_table
(
a_column POSITION (1:7) char,
b_column POSITION (8:10) char,
c_column POSITION (11:12) char,
d_column POSITION (13:13) char,
f_column POSITION (14:20) char
)
b. Write the code below for comma-separated data
OPTIONS ( SKIP=1, DIRECT=TRUE, ERRORS=10000000, ROWS=5000000)
load data
infile 'D:\loaddata\abc.txt'
TRUNCATE
into table Your_table
FIELDS TERMINATED BY ","
TRAILING NULLCOLS
(a_column,
b_column,
c_column,
d_column,
e_column,
f_column
)
4. D:\loaddata\abc.bat -- write the code below
sqlldr db_user/db_password@your_tns control=D:\loaddata\abc.ctl log=D:\loaddata\abc.log
After you double-click the "D:\loaddata\abc.bat" file, your data will be loaded into the desired Oracle table. If anything goes wrong, check your "D:\loaddata\abc.bad" and "D:\loaddata\abc.log" files.

Convert Numerical Fields in MySQL

I have imported a CSV file where a specific column has a decimal number.
In the original excel file (before saving it to a CSV), the first number of the column shows up as 218,790. When I choose the cell, the number shows up as 218790.243077911.
In the CSV file the number shows up as 218790 and when I choose the cell it is 218,790.
When I import the file into MySQL and show the table I created, the number shows up as 218.000000000.
Here is the code I used:
create table Apolo_Test(
Leads decimal (15,9)
);
LOAD DATA LOCAL INFILE 'C:/Users/SCRIPTS/file.csv'
INTO TABLE Apolo_Test
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 7 ROWS
;
I tried updating the format with this:
update Apolo_Test set Leads = format(Leads, 10, 'de_DE');
but it did not work. I have never had a case where the numbers in a file contained a comma before. I guess it is the UK format for numerical fields.
How can I make this work in MySQL without using any macros in Excel?
UPD:
It works, but I get some warnings even though I double-checked the CSV file and the fields:
create table Apolo_Test(
Ad_Group varchar(50),
Impacts int,
Leads decimal (10,3)
);
LOAD DATA LOCAL INFILE 'C:/Users/me/Desktop/SCRIPTS/11/Adalyser.csv'
INTO TABLE Apolo_Test
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 7 ROWS
(Ad_Group, Impacts, @Leads)
SET Leads = replace(@Leads, ',', '');
alter table Apolo_Test ADD IPL decimal (10,6) after Leads;
update Apolo_Test set IPL=Impacts/Leads;
select * from Apolo_Test;
You have to use this syntax:
LOAD DATA LOCAL INFILE 'C:/path/to/mytable.txt' IGNORE
INTO TABLE mytable
FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\r\n'
(int_col, @float_col)
SET float_col = replace(@float_col, ',', '.');
For more information read here
The thousands-separator should not matter when moving data around -- Excel internal values and CSV files and MySQL internal values do not include it. Only "formatted" output includes it. And you should not use formatted output for moving numbers around.
Be careful with locale, such as de_DE.
The German "218.790" is the same as English "218,790".
"218790.243077911" is likely to be what Excel had internally for the number.
"218,790" is likely to be the English representation on the screen; note the English thousands separator.
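To illustrate the locale point, here is a minimal Python sketch that strips the thousands separator and normalizes the decimal point before a number is handed to the database. The function name and its parameters are hypothetical; which separators apply depends on how your file was actually exported.

```python
from decimal import Decimal


def to_decimal(text, thousands=",", decimal_point="."):
    """Convert formatted text such as '218,790.24' to a Decimal."""
    # Drop the thousands separator first, then normalize the decimal point.
    cleaned = text.strip().replace(thousands, "")
    if decimal_point != ".":
        cleaned = cleaned.replace(decimal_point, ".")
    return Decimal(cleaned)


print(to_decimal("218,790"))                                    # 218790
print(to_decimal("218.790", thousands=".", decimal_point=","))  # 218790
```

The first call treats the comma as an English thousands separator; the second treats the dot as a German one, so both yield the same value.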
In the CSV file the number shows up as 218790 and when I choose the cell it is 218,790.
What do you mean? Perhaps that there is no comma or dot in the file itself? And what do you mean by "choose the cell"?
I can't see how to get "218.000000000" without truncation going on somewhere.

Row does not contain data for all columns

I'm trying to import a text file containing:
http://pastebin.com/qhzrq3M7
Into my database using the command
Load data local infile 'C:/Users/Gary/Desktop/XML/jobs.txt'
INTO Table jobs
fields terminated by '\t';
But I keep getting the error Row 1-13 doesn't contain data for all columns
Make sure the last field of each row ends with \t. Alternatively, use LINES TERMINATED BY
LOAD DATA LOCAL INFILE 'C:/Users/Gary/Desktop/XML/jobs.txt' INTO TABLE jobs COLUMNS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r';
\r is a carriage return character, similar to the newline character (i.e. \n)
I faced the same issue. Here is how I fixed it:
Open the CSV file in a text editor such as Notepad++.
I saw a blank line at the end of my file and deleted it.
That resolved my issue.
The URL below may also help you resolve the issue.
http://www.thoughtspot.com/blog/5-magic-fixes-most-common-csv-file-problems
If you're on Windows, make sure to use LINES TERMINATED BY '\r\n', as explained in the MariaDB docs.
It sounds like LOAD DATA LOCAL INFILE expects to see a value for each column.
You can edit the file by hand (to delete those rows -- could be blank lines), or you can create a temp table, insert the rows into a single column, and write a mysql command to split the rows on tab and insert the values into the target table
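Before editing the file by hand, it may help to find out which rows are short. The following Python sketch is one way to do that; the function name and the sample rows are invented for illustration, and the expected column count is something you would take from your own table.

```python
def find_short_rows(text, expected_columns, separator="\t"):
    """Return (line_number, field_count) for rows with too few fields."""
    problems = []
    # splitlines() copes with \n, \r\n and bare \r line endings.
    for number, line in enumerate(text.splitlines(), start=1):
        count = len(line.split(separator)) if line else 0
        if count < expected_columns:
            problems.append((number, count))
    return problems


# Hypothetical three-column data with one short row and one blank line.
sample = "1\tclerk\tLondon\r\n2\tdriver\r\n\r\n"
print(find_short_rows(sample, expected_columns=3))  # [(2, 2), (3, 0)]
```

Blank lines show up with a field count of 0, which makes a stray empty line at the end of the file easy to spot.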
Make sure there are no "\"s at the end of any field. In the CSV viewed as text this would look like "\,", which is a problem because the backslash escapes the comma, so it is not treated as a field separator and the row ends up with too few columns.
(This primarily applies when you don't have field encasings like quotes around each field.)

Import CSV to MySQL

I have created a database and a table. I have also created all the fields I will be needing. I have created 46 fields including one that is my ID for the row. The CSV doesn't contain the ID field, nor does it contain the headers for the columns. I am new to all of this but have been trying to figure this out. I'm not on here being lazy asking for the answer, but looking for directions.
I'm trying to figure out how to import the CSV but have it start importing data starting at the 2nd field, since I'm hoping the auto_increment will fill in the ID field, which is the first field I created.
I tried these instructions with no luck. Can anyone offer some insight?
The column names of your CSV file must match those of your table
Browse to your required .csv file
Select CSV using LOAD DATA options
Check box 'ON' for Replace table data with file
In Fields terminated by box, type ,
In Fields enclosed by box, "
In Fields escaped by box, \
In Lines terminated by box, auto
In Column names box, type column name separated by , like column1,column2,column3
Check box ON for Use LOCAL keyword.
Edit:
The CSV file is 32.4kb
The first row of my CSV is:
Test Advertiser,23906032166,119938,287898,,585639051,287898 - Engager - 300x250,88793551,Running,295046551,301624551,2/1/2010,8/2/2010,Active,,Guaranteed,Publisher test,Maintainer test,example-site.com,,All,All,,Interest: Dental; custom geo zones: City,300x250,-,CPM,$37.49 ,"4,415","3,246",3,0,$165.52 ,$121.69 ,"2,895",805,0,0,$30.18 ,$37.49 ,0,$0.00 ,IMPRESSIONBASED,NA,USD
You can have MySQL set values for certain columns during import. If your id field is set to auto increment, you can set it to null during import and MySQL will then assign incrementing values to it. Try putting something like this in the SQL tab in phpMyAdmin:
LOAD DATA INFILE 'path/to/file.csv' INTO TABLE your_table FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' SET id=null;
Please look at this page and see if it has what you are looking for. Should be all you need since you are dealing with just one table. MYSQL LOAD DATA INFILE
So for example you might do something like this:
LOAD DATA INFILE 'filepath' INTO TABLE tablename FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' (column2, column3, column4);
That should give you an idea. There are of course more options that can be added as seen in the above link.
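With 45 data columns, typing the column list by hand is error-prone, so one option is to generate the statement. This Python sketch only builds the SQL text; the column names, path and table name are placeholders, and OPTIONALLY ENCLOSED BY '"' is an assumption added because the sample row contains quoted values such as "4,415".

```python
def build_load_statement(path, table, columns):
    """Build a LOAD DATA statement that lists the target columns explicitly."""
    column_list = ", ".join(columns)
    return (
        f"LOAD DATA LOCAL INFILE '{path}' INTO TABLE {table} "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n' "
        f"({column_list});"
    )


# Hypothetical names: every column except the auto-increment id.
columns = [f"column{i}" for i in range(2, 47)]
statement = build_load_statement("path/to/file.csv", "your_table", columns)
print(statement)
```

Because the id column is left out of the list, MySQL fills it from the auto-increment counter.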
Be sure to use LOAD DATA LOCAL INFILE if the import file is local. :)