I have the following table:
CREATE TABLE `tmp_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `t` bit(1) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1;
And an xml file called "data.xml" that contains 1 line:
<list><row t="0" /></list>
When I run the following command:
LOAD XML LOCAL INFILE 'c:/temp/data.xml' INTO TABLE `tmp_table`
After running this command I get one row with a value of "1" for column t and a warning:
LOAD XML LOCAL INFILE 'c:/temp/data.xml' INTO TABLE `tmp_table` 1 row(s) affected, 1 warning(s):
1264 Out of range value for column 't' at row 1
Records: 1 Deleted: 0 Skipped: 0 Warnings: 1 0.000 sec
How can I load a 0 for a bit field in an xml document?
The MySQL manual suggests the following:
BIT values cannot be loaded using binary notation (for example, b'011010'). To work around this, specify the values as regular integers and use the SET clause to convert them so that MySQL performs a numeric type conversion and loads them into the BIT column properly:
http://dev.mysql.com/doc/refman/5.5/en/load-data.html
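In plain LOAD DATA form, that workaround would look roughly like this (just a sketch: it assumes the value sits in an ordinary text file as a plain integer, which is not the case with the XML file above):
-- 'c:/temp/data.txt' is a hypothetical plain-text file with one integer per line
LOAD DATA LOCAL INFILE 'c:/temp/data.txt'
INTO TABLE `tmp_table`
(@val)
SET t = CAST(@val AS UNSIGNED);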
I have tried this query:
LOAD XML LOCAL INFILE 'data.xml' INTO TABLE `tmp_table`
ROWS IDENTIFIED BY '<row>'
(@var1)
SET t = CAST(@var1 AS SIGNED);
...and I got a strange warning message: 'Column 't' cannot be null'.
Hope this works for you; otherwise I think you should file a report at bugs.mysql.com.
LOAD XML INFILE does not seem to be very good at importing data from arbitrary XML.
I have a blog post about using LOAD DATA INFILE to import from XML. Since the approach uses a user-defined variable to hold the group, you can add an additional function to cast the value.
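For the bit column in this question, that approach would look roughly like this (a sketch only: it reads each <row .../> element as raw text with LOAD DATA, then pulls the attribute out with ExtractValue() and casts it):
-- sketch: the LINES clauses and the XPath are assumptions based on the file shown above
LOAD DATA LOCAL INFILE 'c:/temp/data.xml'
INTO TABLE `tmp_table`
LINES STARTING BY '<row' TERMINATED BY '/>'
(@row)
SET t = CAST(ExtractValue(CONCAT('<row', @row, '/>'), '/row/@t') AS UNSIGNED);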
Alternatively, you can try to export data from MySQL in XML, look at how it represents bit values and adjust your xml before loading with XSLT.
My post was actually inspired by the question: LOAD XML LOCAL INFILE with Inconsistent Column Names
I'm new to MySQL and am using it to make use of several CSV files I have that are very large (some have over a million rows). I'm on Win7-64 Ultimate. I have installed MySQL Workbench v. 6.3.6 build 511 64-bit. I read a similar question, but I cannot comment since I am new, and I am getting a different error anyway.
I have set up a database called crash0715, and created a table called driver_old with five columns. The first column is a report number (set up as INT(20)) that will be keyed to other files. It contains some duplicates depending upon the data in the other columns. The next four columns contain numeric data that is either 1 or 2 digits.
I set up the report_number column as INT(20), primary key, not null.
The other 4 were set up as INT or INT(2)
When I tried to import a little over 1 million rows from a 5-column CSV file (named do.csv in my c:\ root) via the GUI, the program hung. I let it run for over 12 hours, and Task Manager showed the program using 25% CPU.
I next tried the command line. After switching to the database, I used
LOAD DATA LOCAL INFILE 'c:/do.csv' INTO TABLE driver_old FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';
I had removed the header row from the CSV before trying both imports.
I got the following message:
QUERY OK, 111 rows affected, 65535 warnings <3.97 sec> Records: 1070145 Deleted: 0 Skipped: 1070034 Warnings: 2273755
I read the first few lines of SHOW WARNINGS and they were as follows:
1264 Out of range value for column 'report_number' for row 1.
1261 Row 1 doesn't contain data for all columns
These two repeated for all of the other lines.
There was also a
1062 Duplicate entry '123456789' for key 'primary' (123456789 is a representative value)
This warning also recurred, along with the other two codes.
The CSV file has no blanks in the first column; however, there are a few in the other ones.
Any idea what I'm doing wrong here?
I solved this by saving and exporting SQL INSERT statements instead.
I would use BIGINT instead of INT!
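If the table already exists, the column could be widened in place; a minimal sketch using the table and column names from the question:
ALTER TABLE driver_old MODIFY report_number BIGINT NOT NULL;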
Adding IGNORE or REPLACE may help with duplicate primary key values (use one or the other):
LOAD DATA LOCAL INFILE 'c:/do.csv' IGNORE INTO TABLE driver_old FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';
I cannot comment on this question, but it would be great if you could post a URL to a picture showing a few lines from the CSV file, along with the code you used to create the table and insert the data! That would be very helpful for answering the question!
I have now successfully imported the 1045767 records. As suggested by another member here, I imported a small 100-row file that gave the same errors. I then opened the CSV in LibreOffice, saved it, and was able to import it OK.
The problem was the spreadsheet program, GS-Calc. When saving CSV files, it gives three options: UTF-8, UTF-16, and ANSI/OEM/ISO. I had initially saved the file as UTF-8 and that produced the error.
I saved it as ANSI/OEM/ISO instead and it imported OK. I hope this helps others with large CSV files in the future.
I changed MySQL's default field separator to a comma.
I have to move a table from MS SQL Server to MySQL (~8M rows with 8 columns). One of the columns (DECIMAL type) is exported as an empty string by the "bcp" export to a CSV file. When I use this CSV file to load data into the MySQL table, it fails saying "Incorrect decimal value".
I am looking for possible workarounds or suggestions.
I would create a view in MS SQL which converts the decimal column to a varchar column:
CREATE VIEW MySQLExport AS
SELECT [...]
COALESCE(CAST(DecimalColumn AS VARCHAR(50)),'') AS DecimalColumn
FROM SourceTable;
Then, import into a staging table in MySQL, and use a CASE statement for the final INSERT:
INSERT INTO DestinationTable ([...])
SELECT [...]
CASE DecimalColumn
WHEN '' THEN NULL
ELSE CAST(DecimalColumn AS DECIMAL(10,5))
END AS DecimalColumn,
[...]
FROM ImportMSSQLStagingTable;
This is safe because the only way the value can be an empty string in the export file is if it's NULL.
Note that I doubt you can cheat by exporting it with COALESCE(CAST(DecimalColumn AS VARCHAR(50)),'\N'), because LOAD INFILE would see that as '\N', which is not the same as \N.
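If a staging table feels like too much, the empty-string-to-NULL conversion can also be done during the load itself with a user variable; a rough sketch (the file name, the col1..col7 column list, and the target column name here are made-up placeholders, not taken from the question):
-- placeholders: adjust the file name and column list to the real table
LOAD DATA LOCAL INFILE '/tmp/export.csv'
INTO TABLE DestinationTable
FIELDS TERMINATED BY ','
(col1, col2, col3, col4, col5, col6, col7, @dec)
SET DecimalColumn = NULLIF(@dec, '');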
I have twenty pipe-delimited text files that I would like to convert into a MySQL database. The manual that came with the data says
Owing to the difficulty of displaying data for characters outside of
standard Latin Character Sets, all data is displayed using Unicode
(UCS-2) character encoding. All CSV files are structured using
commercial standards with the preferred format being pipe delimiter
(“|”) and carriage return + line feed (CRLF) as row terminators.
I am using MySQL Workbench 6.2.5 on Win 8.1, but the manual provides example SQL Server scripts to create the twenty tables. Here's one.
/****** Object: Table [dbo].[tbl_Company_Profile_Stocks] Script Date:
12/12/2007 08:42:05 ******/
CREATE TABLE [dbo].[tbl_Company_Profile_Stocks](
[BoardID] [int] NULL,
[BoardName] [nvarchar](255) NULL,
[ClientCompanyID] [int] NULL,
[Ticker] [nvarchar](255) NULL,
[ISIN] [nvarchar](255) NULL,
[OrgVisible] [nvarchar](255) NULL
)
Which I adjust as follows for MySQL.
/****** Object: Table dbo.tbl_Company_Profile_Stocks Script Date:
12/12/2007 08:42:05 ******/
CREATE TABLE dbo.tbl_Company_Profile_Stocks
(
BoardID int NULL,
BoardName varchar(255) NULL,
ClientCompanyID int NULL,
Ticker varchar(255) NULL,
ISIN varchar(255) NULL,
OrgVisible varchar(255) NULL
);
Because the manual says that the flat files are UCS-2, I set the dbo schema to UCS-2 default collation when I create it. This works fine AFAIK. It is the LOAD INFILE that fails. Because the data are pipe-delimited with CRLF line endings I try the following.
LOAD DATA LOCAL INFILE 'C:/Users/Richard/Dropbox/Research/BoardEx_data/unzipped/Company_Profile_Stocks20100416.csv'
INTO TABLE dbo.tbl_company_profile_stocks
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;
But in this case no rows are imported and the message is 0 row(s) affected Records: 0 Deleted: 0 Skipped: 0 Warnings: 0. So I try \n line endings instead. This imports something, but my integer values become zeros and the text becomes very widely spaced. The message is 14121 row(s) affected, 64 warning(s): 1366 Incorrect integer value: <snip> Records: 14121 Deleted: 0 Skipped: 0 Warnings: 28257.
If I open the flat text file in Sublime Text 3, the Encoding Helper package suggests that the file has UTF-16 LE with BOM encoding. If I repeat the above with UTF-16 default collation when I create the dbo schema, then my results are the same.
How can I fix this? Encoding drives me crazy!
Probably the main problem is that the LOAD DATA needs this clause (see reference):
CHARACTER SET ucs2
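Slotted into the statement from the question, that clause goes right after INTO TABLE, roughly like this:
LOAD DATA LOCAL INFILE 'C:/Users/Richard/Dropbox/Research/BoardEx_data/unzipped/Company_Profile_Stocks20100416.csv'
INTO TABLE dbo.tbl_company_profile_stocks
CHARACTER SET ucs2
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;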
In case that does not suffice, ...
Can you get a hex dump of a little of the csv file? I want to make sure it is really ucs2. (ucs2 is very rare. Usually text is transferred in utf8.) If it looks readable when you paste text into this forum, then it is probably utf8 instead.
There is no "dbo" ("database owner"), only database, in MySQL.
Please provide SHOW CREATE TABLE tbl_Company_Profile_Stocks
(just a recommendation) Don't prefix table names with "tbl_"; it does more to clutter than to clarify.
Provide a PRIMARY KEY for the table.
@Rick James had the correct answer (i.e., set the encoding for LOAD DATA with the CHARACTER SET option). But in my case this didn't work, because MySQL cannot load data files that use ucs2.
Note
It is not possible to load data files that use the ucs2 character set.
Here are a few approaches that work here. In the end I went with SQLite rather than MySQL, but the last solution should work with MySQL, or any other DB that accepts flat files.
SQLiteStudio
SQLiteStudio was the easiest solution in this case. I prefer command line solutions, but the SQLiteStudio GUI accepts UCS-2 encoding and any delimiter. This keeps the data in UCS-2.
Convert to ASCII in Windows command line
The easiest conversion to ASCII is in the Windows command line with TYPE (the %% variables below assume this runs from a batch file; use single % signs if typing directly at the prompt).
for %%f in (*.csv) do (
echo %%~nf
type "%%~nf.csv" > "%%~nf.txt"
)
This may cause problems with special characters. In my case it left in single and double quotes that caused some problems with the SQLite import. This is the crudest approach.
Convert to ASCII in Python
import codecs
import glob
import os

# Converted files are written to a 'converted' subdirectory, which must exist.
if not os.path.isdir('converted'):
    os.makedirs('converted')

for fileOld in glob.glob('*.csv'):
    print 'Reading: %s' % fileOld
    fileNew = os.path.join('converted', fileOld)
    with codecs.open(fileOld, 'r', encoding='utf-16le') as old, \
         codecs.open(fileNew, 'w', encoding='ascii', errors='ignore') as new:
        print 'Writing: %s' % fileNew
        for line in old:
            # Strip single and double quotes while converting to ASCII.
            new.write(line.replace("\'", '').replace('"', ''))
This is the most extensible approach and would allow you more precisely control which data you convert or retain.
I am trying to do a LOAD DATA INFILE into a database. The only issue I have is that I am getting an Error 1261. I was getting an incorrect datetime value error earlier, but I solved that with the code in the LOAD DATA INFILE below (set date_time = ). My problem now is that it says I don't have enough data for all columns. I know that you are supposed to list the columns after the table name, but I can't seem to get it to work.
There is one table and it has 15 columns. The first column is the primary key, the other fourteen are regular columns.
Here is the load file statement:
load data infile 'c:/proj/test.csv' into table base (@var1,event,failure,ue,mc,mn,cell,durat,cause,ne,ims,hier,hier3,hier32)
set date_time = STR_TO_DATE(@var1, '%Y%m%d %H%i%s')
;
Additional notes: the PK column is called dataId, is an INT, and is auto-increment.
Here is the data from the csv file:
2013-03-20 14:55:22,4098,1,21060800,344,930,4,1000,0,11B,344930000000011,4809532081614990000,8226896360947470000,1150444940909480000
Try this (note the clause order, and that the sample data is comma-separated):
load data infile 'c:/proj/test.csv' into table base
character set latin1
fields terminated by ',' enclosed by '' escaped by '\\'
lines terminated by '\n' starting by ''
ignore 1 lines
(@var1, event, failure, ue, mc, mn, cell, durat, cause, ne, ims, hier, hier3, hier32)
set date_time = STR_TO_DATE(@var1, '%Y-%m-%d %H:%i:%s');
Take a look here.
I also ran into a similar problem.
The error message was:
load data infile 'L:/new_ncbi' into table ncbi
fields terminated by '\t'
lines terminated by '\r\n' 1973 row(s) affected, 3 warning(s):
1261 Row 1629 doesn't contain data for all columns
1261 Row 1630 doesn't contain data for all columns
1261 Row 1630 doesn't contain data for all columns
Records: 1973 Deleted: 0 Skipped: 0 Warnings: 3 0.281 sec
So I went back to look at the data I was loading.
At lines 1629-1630 of the file, I found this problem:
Sphingomonas phage PAU NC_019521 "Sphingomonas paucimobilis
" species
Yes, as you can see, those two lines should be a single line, but they are not.
By the way, my data was originally stored in an Excel file. When I needed to process it, I exported it from Excel to a plain text file. One line of data in Excel can become two lines in the export because a cell contains a special character such as a CRLF.
So I suggest copying your data from the CSV into a plain text file and checking whether it has a similar problem.
I hope this is helpful.
I'm trying to load data into a table (obviously?). My table looks like this:
CREATE TABLE IF NOT EXISTS `condensed` (
`id` bigint(20) NOT NULL,
`src` enum('C_X','C_AH','C_AO','C_B','H_X','H_AH','H_AO','H_B') NOT NULL,
`hash` int(11) default NULL,
`ihash` int(11) default NULL,
`last_updated` datetime default NULL,
PRIMARY KEY (`id`,`src`),
UNIQUE KEY `covering` (`id`,`src`,`hash`)
) ENGINE=MyISAM DEFAULT CHARSET=ascii;
I've got data files which look like this:
320115816,'C_X',692983698,854142703,20120216220954
320124536,'C_X',588472049,1059436251,20100527232845
320120196,'C_X',452117509,855369958,20101118105505
...
But when I load it using
LOAD DATA INFILE '/path/to/data.csv'
IGNORE
INTO TABLE `condensed`
(id, src, hash, ihash, last_updated);
it only loads the first two columns (hash, ihash and last_updated are null).
320115816,'C_X',NULL,NULL,NULL
320124536,'C_X',NULL,NULL,NULL
320120196,'C_X',NULL,NULL,NULL
...
I do get a lot of warnings (presumably because MySQL is discarding the 3 remaining columns from the input set and assigning defaults):
Query OK, 20 rows affected, 100 warnings (0.00 sec)
Records: 20 Deleted: 0 Skipped: 0 Warnings: 100
(I intend to load several million records - not just 20.)
I get the same problem using mysqlimport.
Omitting an explicit field list from the LOAD DATA statement (same fields and order in database as in files) resulted in the same outcome.
MySQL version is 5.0.22; there are no non-printable characters in the input file.
Help!
Just adding a small improvement to Thilo's answer if you are using Windows:
LOAD DATA INFILE '/path/to/data.csv' IGNORE
INTO TABLE `condensed`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '\''
LINES TERMINATED BY '\r\n' -- Windows line terminator
(id, src, hash, ihash, last_updated);
It worked for me and solved all my truncation problems on Windows. Also have a look at this:
http://forums.mysql.com/read.php?79,76131,76871
I never managed to resolve this. I ended up writing a wee PHP script to map the data into the DB.
It's worth noting that, if the field MySQL is complaining about happens to be the final one in the table, there's a chance that you need to fix the LINES TERMINATED BY. On Windows I had to tell it \n instead of \r\n.
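For instance, taking the statement from the earlier answer and changing only the line terminator would give (a sketch; everything else is unchanged):
-- same statement as above, with only the terminator changed to \n
LOAD DATA INFILE '/path/to/data.csv' IGNORE
INTO TABLE `condensed`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '\''
LINES TERMINATED BY '\n'
(id, src, hash, ihash, last_updated);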
I was having similar problems with a CSV file created on an IBM mainframe that was moved to a Windows file server before being loaded. I was getting a truncation warning on all rows except the last. Mainframe file looked okay. Adding '\r\n' cleared the problem.
I think the quotes around only the enum field are confusing the import. Try this:
LOAD DATA INFILE '/path/to/data.csv' IGNORE
INTO TABLE `condensed`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '\''
(id, src, hash, ihash, last_updated);