Import CSV files into HSQLDB

I am trying to convert a set of CSV files into a HSQLDB database. My first attempt was to fire up DatabaseManagerSwing and execute the following code:
* *DSV_COL_SPLITTER = ;
\mq /home/michael/workspaces/rds-surveyor/lt/it/NAMES.DAT
commit;
Which gets rejected with the error message:
java.sql.SQLSyntaxErrorException: unexpected token: *
In order to get at least some response from HSQLDB, I tried removing the first line, but this gives just a slightly different error:
java.sql.SQLSyntaxErrorException: unexpected token:
I then came across SqlTool, and after overcoming its various pitfalls (you need the sqltool JAR as well as the hsqldb JAR of the same version, either in the same directory or somewhere on your classpath), I ran the full code above. The first line got processed as expected, but the \mq command failed with a similar error:
SEVERE Cause: SQLSyntaxErrorException: unknown token:
The file I am trying to import looks like this (first few lines shown):
CID;LID;NID;NAME;NCOMMENT
25;1;165;Europa;
25;1;167;Italia;
25;1;169;Abruzzo;
25;1;171;Chieti;
25;1;173;Passo Di Lanciano;
25;1;175;Valico Castiglione Messer Marino;
25;1;177;Valico Della Forchetta;
What's going wrong here?

The command you are trying to execute belongs to SqlTool, which is a separate command-line client for HSQLDB and ships in a separate JAR in the zip package. The guide is here: http://hsqldb.org/doc/2.0/util-guide/sqltool-chapt.html
In DatabaseManagerSwing, you can instead create TEXT tables for your CSV files: http://hsqldb.org/doc/2.0/guide/texttables-chapt.html
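A minimal sketch of the TEXT-table approach for the NAMES.DAT file from the question (the column types are assumptions based on the sample rows, and note that TEXT tables require a file-based catalog, not an in-memory one):

-- the table columns must match the CSV columns
CREATE TEXT TABLE names (CID INT, LID INT, NID INT, NAME VARCHAR(100), NCOMMENT VARCHAR(100));
-- fs=\semi selects the semicolon separator; ignore_first=true skips the header row
SET TABLE names SOURCE 'NAMES.DAT;fs=\semi;ignore_first=true';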

I have now abandoned the import path for other reasons and am instead doing the CSV import in my application.
While working on the CREATE TABLE statements, whose column lists I built by copying in the first row of each CSV file, I got the same error message for my SQL code. Closer analysis of the SQL file with a hex editor revealed a byte order mark (BOM) at the beginning of the pasted column name. After eliminating the BOM, the SQL code ran without any further nagging.
I remember that some of the files I am trying to import start with a BOM (which had given me quite a headache earlier), so I assume the BOM was the "unknown token" HSQLDB was complaining about all along. Since the BOM is a nonprinting character, that explains why no token was shown in the error message.
Lesson learned: an "unexpected token" or "unknown token" error with no character shown in the message is likely due to a BOM, a control character, or other nonprinting content in the offending input. A hex editor will reveal it.

Related

MySQL import - CSV - file refuses to be properly imported

I'm trying to import the following file into a MySQL Db:
https://drive.google.com/drive/folders/1WbRdNgqVre3wN4DpJZ-08jtGkJtCDJNQ?usp=sharing
Using the "data import wizard" on MySql Workbench, for some reason I'm getting "218\223 lines imported successfully", whereas the file contains close to 100K.
I tried looking for special chars around lines 210-230, also removing all of them, but still the same happens.
The file is a CSV of Microsoft Bing's geo locations, used in Microsoft Advertising campaigns, downloaded from Microsoft's website (using an ad account there).
I've been googling, reading, StackOverflowing, playing with the file and different import options...
I tried cutting the file into small bits, and the newly created file was completely corrupt somehow...
The encoding seems to be UTF-8, and the line breaks are all "\n". I tried changing them all to "\r\n" using Notepad++, but the same thing happens.
File opens normally in Excel, looks normal, passes CSVlint.io...
The only weird thing is that the file contains quotes on some of the values but not on the rest (e.g. line 219). Yeah, I know it sounds like this would be the problem, but I removed that line, and all the other lines with quotes, and it still happens. (I also tried loading with ENCLOSED BY '"'; see below.)
I also tried using SQL statements to import:
LOAD DATA LOCAL INFILE 'c:\\Users\\Gilad\\Downloads\\GeoLocations.csv'
INTO TABLE aw_geo_map_bmsl
FIELDS TERMINATED BY ','
-- (also tried with: ENCLOSED BY '"')
LINES TERMINATED BY '/n'
IGNORE 1 ROWS;
(I had to add OPT_LOCAL_INFILE=1 under Advanced in the MySQL Workbench connection settings for it to be allowed access to local files on my computer.)
This gives 0 rows affected.
Help?
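One thing worth checking in the LOAD DATA attempt above: LINES TERMINATED BY '/n' uses a forward slash, while MySQL expects the escape sequence '\n'. A literal "/n" never occurs in the file, so the whole file reads as one line, which would explain the 0 rows affected. A corrected sketch (table name and path taken from the question; the enclosing clause is optional):

LOAD DATA LOCAL INFILE 'c:\\Users\\Gilad\\Downloads\\GeoLocations.csv'
INTO TABLE aw_geo_map_bmsl
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;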
Epilogue: In the end I just gave up on all these import wizards and did it the old "make your SQL statements from Excel" way.
I imported the CSV data into Excel. Watch out: I found I needed to use Excel's own data import wizard (which worked perfectly) in order to set the encoding to UTF-8; Excel 2010 otherwise guessed a "windows" encoding, which was wrong.
After processing the data a bit to my liking, I used the following Excel code:
=CONCATENATE("INSERT INTO aw_geo_map_bmsl (`Location Id`,Name,`Canonical Name`,`Location Type`,Status,`Adwords Location Id`)
VALUES (",
A2,
",""",B2,"""",
",""",C2,"""",
",""",D2,"""",
",""",E2,"""",
",",F2,");")
to generate an INSERT statement for every line. I then copied the formula down, pasted the results as values only into an editor, removed the extra quotes Excel adds, and ran the whole thing in MySQL Workbench, which executes it line by line (it takes some time, and you can watch the progress).
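For illustration, each generated statement comes out shaped like this (the values here are invented):

INSERT INTO aw_geo_map_bmsl (`Location Id`,Name,`Canonical Name`,`Location Type`,Status,`Adwords Location Id`)
VALUES (1023191,"Tel Aviv","Tel Aviv,Tel Aviv,Israel","City","Active",1012873);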
Saved me hours of unsuccessfully playing around with "automatic tools" that fail for unknown reasons and don't produce proper logs out of the box.
Warning: do NOT do this for unsanitized data, as it is vulnerable to SQL injection. In this case the data came from Microsoft, so I know it's fine.

Import fails from CSV file into SQL Server 2012 table

I am trying to import a rather large (520k rows) .CSV file into a SQL Server 2012 table. The file uses a delimiter of ;.
Please do not edit my delimiter. It is ";". I know that may seem strange, but that is what they used. It is not just a semicolon.
I don't think the delimiter is the issue, because I replaced it with a tab and the file seemed to be okay. When I try importing the file, I get a text-truncation error, even though I set the column length to 255 just to be sure it had plenty of room.
Even when I delete the offending row, the next row causes the error. I don't see any offending characters in the data, so I am at a loss as to what the issue is.
I ended up using EOL Conversion in Notepad++ to switch the file to Windows format, and then created a script to import the data.
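A minimal sketch of such a script, assuming a hypothetical target table dbo.ImportTarget and file path, with the quote-semicolon-quote delimiter described in the question and the Windows line endings produced by the EOL conversion:

BULK INSERT dbo.ImportTarget
FROM 'C:\data\input.csv'
WITH (
    FIELDTERMINATOR = '";"',  -- the literal quote-semicolon-quote delimiter from the question
    ROWTERMINATOR = '\r\n',   -- Windows-style line endings after the EOL conversion
    FIRSTROW = 2              -- skip the header row
);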

Hive query - FAILED SemanticException invalid path

Here is my problem:
I have just gotten my initial Azure subscription converted to a Pay-As-You-Go subscription (the first was a 30-day trial) after it was shut down when I used up the first set of free credits. Now everything is working fine again: I still have the same old resource group, under which I establish a new cluster, and the files with my CSV data are all still present in the container I created last time (not the default container, but one that was established earlier). The only thing I had to recreate was the Hive table needed to load the data into, and that table I was able to establish again. However, when I then try to run a Hive query to actually load data into the Hive table from the CSV file, as follows...
LOAD DATA INPATH '/container1/HdiSamples/user/data-file.csv' OVERWRITE INTO TABLE default.hive_table;
...I constantly receive "Failed" as an error message (I use Data Lake Tools for Visual Studio to upload blobs and run the queries). In the specific error log, the line beginning with 'FAILED: SemanticException' stands out each time, despite my using different locations for the file upload:
16/12/01 04:16:25 WARN conf.HiveConf: HiveConf of name hive.log.dir does not exist
FAILED: SemanticException Line 1:17 Invalid path ''/container1/HdiSamples/user/data-file.csv'': No files matching path wasb://container1#resourcegroup.blob.core.windows.net/container1/HdiSamples/user/data-file.csv
Here is my question:
Can anyone tell me why it doesn't find and load the file at the location where the file actually resides?
I just don't get the cause of this error...
Although it's been a while since I asked this question, I worked out a solution to the issue myself which I thought I'd share with others...
I had problems for about a week, being unable to load data into Hive tables from Azure Blob Storage. I had two CSV files called data-file.csv and data-file-extended-1.CSV in my blob. Please note the capitals in the file extension here!
Hive and Hadoop did NOT accept these files unless...
a) the filename in the query was spelled exactly the same way, including the capitals in the file extension,
AND
b) the filename was shortened drastically, without the hyphens and numbers (in my case I used only six conjoined letters, i.e. "datfil" and "datfix").
Shockingly, there is no mention of these issues in the official Azure documentation, nor did I find anything on the web. However, these two adjustments resolved the error message for me.
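Putting both adjustments together, the working statement looked roughly like this (the storage account name is a placeholder, and datfil.csv stands for the shortened, case-exact file name):

LOAD DATA INPATH 'wasb://container1@youraccount.blob.core.windows.net/HdiSamples/user/datfil.csv'
OVERWRITE INTO TABLE default.hive_table;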
Just to let people know...

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO"; gives an error

I have a problem with a MySQL database: I can't import a database dump I got from my friend.
I need some help.
SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
SET time_zone = "+00:00";
ERROR:
Unexpected beginning of statement. (near "phpMyAdmin" at position 0)
Unrecognized statement type. (near "SQL" at position 11)
#1064 - Something is wrong in your syntax near 'phpMyAdmin SQL Dump
SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO"' at line 1
There's nothing wrong with your syntax, but probably with your file:
most likely the file was edited, and the text editor (of course, Windows notepad.exe) was too clever and added a BOM on saving.
Remove the first 3 bytes (hex: EF BB BF) and save the file without them (either use a hex editor, or use PSPad and switch the format to UNIX); the importer should then have no problem anymore.
The BOM fools the importer: the first "-" gets eaten, and the importer no longer recognizes the first comment as a comment.
I encountered exactly the same problem. Apparently, you are using a version of phpMyAdmin with bugs in its import module (in my case it was phpMyAdmin 4.5.5.1, packaged in Wamp 3.0.4). More precisely, it misinterprets comments (valid syntax with a space after --) as SQL code. This is the case at the beginning of a dump created by phpMyAdmin, which typically starts with
-- phpMyAdmin SQL Dump
which explains your error message.
The import module of phpMyAdmin 4.5.5.1 was not able to parse escaped single quotes either (see https://github.com/phpmyadmin/phpmyadmin/issues/11721).
There are many possible workarounds to this problem:
Update phpMyAdmin
Use another tool to import your DB dump, for example the MySQL command line client or MySQL Workbench (see the sketch after this list)
Less advisable: execute the contents of the .sql file as a query in your current version of phpMyAdmin (the query parser has fewer bugs than the import module)
Less advisable: strip all comments from your .sql file
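For the second workaround, the MySQL command line client can execute the dump directly with its built-in SOURCE command, bypassing the buggy phpMyAdmin parser entirely (the path is a placeholder):

-- from the mysql command line client, connected to the target database:
SOURCE /path/to/dump.sql;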
Windows Notepad and some other editors change the encoding of the file. To change it to UTF-8 (without a BOM), open your file in Notepad++, select UTF-8 from the Encoding menu, and save the file.

SQL Error while importing Data From Excel [closed]

I am importing data from an Excel sheet and am struggling with the following errors:
Executing (Error) Messages
Error 0xc020901c: Data Flow Task 1: There was an error with output column "Intelligence" (21) on output "Excel Source Output" (9). The column status returned was: "Text was truncated or one or more characters had no match in the target code page.". (SQL Server Import and Export Wizard)
Error 0xc020902a: Data Flow Task 1: The "output column "Intelligence" (21)" failed because truncation occurred, and the truncation row disposition on "output column "Intelligence" (21)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component. (SQL Server Import and Export Wizard)
Error 0xc0047038: Data Flow Task 1: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "Source - MainSheetData$" (1) returned error code 0xC020902A. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure. (SQL Server Import and Export Wizard)
I was banging my head against the wall with this same exact error.
Try importing into MS Access and then importing into SQL Server.
It turns out the wizard only checks the first 8 rows or so of the Excel sheet, so if it decides the length is 225 and later encounters more than 225 characters, an error occurs. What I did to solve the problem was add a fake first row containing the worst-case scenario (the maximum of everything), and the problem was solved!
The first error is telling you that your source data for the Intelligence column is either longer than your target column or contains characters that your target column cannot accept.
The second error is telling you that the Intelligence column is longer than your target column and therefore it is failing. I expect this is the true issue.
You can either
expand the size of your target column to cover the larger input (see the sketch after this list)
or
switch the Error Output of the component to "Ignore failure" on Truncation
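A minimal sketch of the first option, using a hypothetical target table dbo.Characters and the Intelligence column named in the error messages:

-- widen the target column so the longer source values fit without truncation
ALTER TABLE dbo.Characters
ALTER COLUMN Intelligence NVARCHAR(MAX);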
I was having the very same issue, and although I tried numerous suggestions found here, what worked for me was to convert the Excel file to CSV and use a BULK INSERT command instead.
This bypassed the need to edit the mappings, which wasn't working for me; I had a field that would not update when I changed its type.
Code below from this answer:
BULK INSERT TableName
FROM 'C:\SomeDirectory\my table.txt'
WITH
(
    FIELDTERMINATOR = '\t',  -- tab-delimited fields
    ROWTERMINATOR = '\n'     -- one record per line
)
GO
Importing from CSV is difficult because the import process doesn't know the maximum length of any field, so when it hits a row longer than the initially guessed column length, it errors out.
Simply save your CSV file as an Excel workbook and re-import it. You'll need to delete any existing tables that were created before the failure last time.
As it's Excel, the wizard can obtain the correct field lengths when creating the table.
I was getting the same error while importing from Excel into SQL Server 2008. I was able to do it by exporting from .xlsx to .csv and then importing the CSV file into SQL Server. Yes, I had to adjust the column lengths by hand, but it worked just fine!
I was having the same problem and had to go through Excel manually to find the issue. One time-saver: if you click Report -> View Report at the bottom, a new window opens. If you scroll all the way to the bottom of the report, it tells you how many rows were processed. That doesn't necessarily mean the problem is in the next row, but at least you can skip going through all the rows before that.
What I did next in Excel was keep only the number of characters that would fit into SQL (i.e. LEFT([Column], 255)) and truncate the rest.
It is not ideal, but it worked in my case.
You need to change the "On Error" option to Ignore and the "On Truncation" option to Ignore under Review Data Type Mapping.
This will solve the problem.
I am not sure if anyone has tried this or not:
Copy the content of the file from Excel (.xls or whatever Excel format it is currently in) and paste it into a new Excel file as values. Save the file in .xlsx format and try importing again with SQL Server.
It will be a success!
It is enough to place the longest values in the first row. Then it works.