SSIS CSV Commas in fields - sql-server-2008

Problem:
I am attempting to load a CSV file into our SQL Server 2008 database and having difficulty parsing the fields.
I have found instructions on reading the file with a Flat File Source, using double quotes as the text qualifier so that fields containing commas are treated as single string values, and I have been able to do this successfully.
Data:
This file has three line items per account record. Every account record has lines 1, 2, and 3; line 1 records have 35 columns, line 2 records have 22, and line 3 records have 37. The columns are not a 1:1 relationship, so for example column 6 on record 1 is NOT the same data (or data type) as column 6 on record 2. Additionally, each line item needs to go into a separate table, so I have to process them individually.
My first step is to separate the 1, 2, and 3 records with a Derived Column transformation. The second step would be to parse the fields of each record type into their individual fields and this is where I encounter issues.
The biggest problem is that one (sometimes two) of the fields in the 1 records will SOMETIMES have an enclosed comma and sometimes not. So... sometimes SSIS parses the 1 records as 36 fields, sometimes 35, sometimes 34.
Attempted solutions:
If I use a Flat File Source with quotes as the text qualifier to import the file, the preview doesn't recognize the {CR}{LF} between lines 1 & 2, the columns don't line up right in the preview (columns 2 & 3 get run together), and I get errors when running. Alternatively, I have tried a Ragged Right file source, reading all the data into one 4000-length string column and then parsing with FINDSTRING and SUBSTRING while counting delimiters, but that is where the enclosed commas cause trouble.
So: with Flat File, the {CR}{LF} terminators aren't recognized correctly; with Ragged Right, I can't count the delimiters correctly because of the enclosed commas. It's a Catch-22!
Any insight???

If I correctly understand that your flat file looks something like this:
"This Is Record 1, line 1, 34 other fields...
This Is Record 1, line 2, 22 other fields...
This Is Record 1, line 3, 37 other fields..."
"This Is Record 2, line 1, 34 other fields...
This Is Record 2, line 2, 22 other fields...
This Is Record 2, line 3, 37 other fields..."
Then you are not going to be able to handle this with a Flat File Source. You will need to use a Script Component configured as a source and write custom code to parse each line of the file and send it to the appropriate destination table.

Related

mysql load data from csv which contains line feed

I am dumping some CSV data into MySQL with this query:
LOAD DATA LOCAL INFILE 'path/LYRIC.csv' INTO TABLE LYRIC CHARACTER SET euckr FIELDS TERMINATED BY '|';
When I did this, I saw the following logs in the console:
...
[2017-09-13 11:24:10] ' for column 'SONG_ID' at row 3
[2017-09-13 11:24:10] [01000][1261] Row 3 doesn't contain data for all columns
[2017-09-13 11:24:10] [01000][1261] Row 3 doesn't contain data for all columns
...
I think the CSV has line feeds inside the column data, which breaks the whole parsing process.
A single record in the CSV looks like this:
000001|2014-11-17 18:10:00|2014-11-17 18:10:00|If I were your September
I''d whisper a sunset to fly through
And if I were your September
|0|dba|asass|2014-11-17 18:10:00||||2014-11-17 18:10:00
So LOAD DATA loads line 1 as a record, then tries line 2, and so on, even though this is all a single record.
How can I fix it? Should I request a different type of file from the client?
P.S. I am quite new to working with CSV files.
Multiline fields in CSV should be surrounded with double quotes, like this:
000001|2014-11-17 18:10:00|2014-11-17 18:10:00|"If I were your September
I''d whisper a sunset to fly through
And if I were your September
"|0|dba|asass|2014-11-17 18:10:00||||2014-11-17 18:10:00
And any double quote inside that field should be escaped with another double quote.
Of course, the parser has to support (and maybe be instructed to use) multiline fields.
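For example, once the fields are quoted, the statement from the question can be told about the qualifier with ENCLOSED BY (a sketch; the LINES TERMINATED BY '\n' clause is an assumption about the file's line endings):
LOAD DATA LOCAL INFILE 'path/LYRIC.csv' INTO TABLE LYRIC
CHARACTER SET euckr
FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
With OPTIONALLY ENCLOSED BY '"', MySQL treats line feeds inside a quoted field as part of the value rather than as record separators.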

Skip invalid data when importing .tsv using MySQL LOAD DATA INFILE

I am trying to import a bunch of .tsv files into a MySQL database. However, some of the files have errors in some of the rows (the files were generated from another system where data is manually inputted, so these are human errors). When I use LOAD DATA INFILE to import them and the command reaches a row of bad data, it writes NULL values for that field and then stops, whereas I need it to keep going.
The bad rows look like this:
value1, value 2, value 3
bob, 3, st
john, 4, rd
dianne4ln
jack, 7, cir
I've made sure the line terminators are correct, and have used the IGNORE and REPLACE keywords to no avail.
Use the IGNORE keyword in your LOAD DATA statement to skip the error lines and proceed.
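A minimal sketch (hypothetical file and table names; note that IGNORE goes between the file name and INTO TABLE):
LOAD DATA INFILE '/path/to/data.tsv' IGNORE
INTO TABLE mytable
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';
With IGNORE, data-interpretation errors are downgraded to warnings instead of aborting the load; run SHOW WARNINGS afterwards to see what was affected.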

Importing csv file in Cassandra Database throws error Record #0 (line 1) has the wrong number of fields 1 instead of 7

I am using the COPY command to copy a CSV file into a Cassandra table, but I am getting a 'wrong number of fields' error on my records.
The query is: COPY activity FROM 'Detail.csv' WITH HEADER=TRUE
I have activity as a column family with 7 fields,
but in my CSV file everything is separated by semicolons.
The error is: Record #0 (line 1) has the wrong number of fields (1 instead of 7)
"in my CSV file everything is separated by semicolons"
The default behavior of the COPY command uses a comma as the delimiter. Since your file is (apparently) semicolon-delimited, it will see the entire row as one field (unless the data contains commas). Try setting the DELIMITER option in your WITH clause:
COPY activity FROM 'Detail.csv' WITH HEADER=TRUE AND DELIMITER=';';
And as a suggestion, I have always had more luck getting COPY to work properly when listing out the columns to import:
COPY airplanes (name, manufacturer, year, mach) FROM 'temp.csv';

Mysql error 1261 (doesn't contain data for all columns) on last row with no empty values

I'm doing a LOAD DATA INFILE in MySQL through MySQL Workbench. I'm pretty new to SQL in general, so this may be a simple fix, but I can't get it to work. It is throwing a 1261 error (doesn't contain data for all columns) on the last row, but the last row (like the rest of the CSV) doesn't have any blank or null values.
I've looked around for help and read the manual, but everything I've seen has been about dealing with null values.
I exported the CSV from Excel, to the extent that matters.
The code I'm using to import is (I've changed the field, file, and table names to be more generic):
load data infile '/temp/filename.csv'
into table table1
fields terminated by ","
lines terminated by '\r'
ignore 1 lines
(Col1,Col2,Col3,Col4,Col5,col6,col7,Col8,Col9);
The first two columns are varchar and char, respectively with the remaining columns all formatted as double.
Here's the last few lines of the csv file:
364,6001.009JR,43.96,0,0,0,0,0,0
364,6001.900FM,0,0,0,0,0,0,0
364,6001.900JR,0,0,0,0,0,0,0
The only thing I can think of is that I'm supposed to have some signal after the last line to indicate that the file is finished, but I haven't found anything to indicate what that would be.
Any help would be appreciated
When I've had similar errors, it's because there were unexpected newlines inside my data (a newline in one row would look like two too-short rows, upon import).
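If that's the cause here, it may help to qualify the fields and match the line terminator to what Excel actually writes (a sketch reusing the generic names from the question; the '\r\n' terminator is an assumption about the export):
load data infile '/temp/filename.csv'
into table table1
fields terminated by "," optionally enclosed by '"'
lines terminated by '\r\n'
ignore 1 lines
(Col1,Col2,Col3,Col4,Col5,col6,col7,Col8,Col9);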

Importing data from a csv file to mysql db

I am trying to load data from a CSV file that was created in Excel. So far this is the statement that I have as my SQL query:
LOAD DATA LOCAL INFILE 'C:\\Documents and Settings\\J03299\\Desktop\\TMETER.csv'
INTO TABLE edata
COLUMNS TERMINATED BY ',' ENCLOSED BY "" LINES TERMINATED BY '\n'
(Year,Month,Day,MJD,xpiles,xstacks,Utilites);
The file is called TMETER and it has 7 columns and 366 rows. I am able to read only the first row, and only the first four columns (through MJD); everything else is NULL after that. Secondly, it puts all the columns from row 124 of my file (TMETER.csv) into the first column of the second row of my edata table. I am confused as to:
Why doesn't it read the data from the xpiles, xstacks, and Utilites columns? (Mind you, the column names in the CSV file are odd and not the same as in my edata table: for example, the table column is piles while in the actual CSV file it is x(piles), and stacks in the table is y(stacks) in the CSV. MySQL doesn't allow me to create column names in that format, so I had to improvise. Could this be why it is not reading and mapping from the CSV file to the table in MySQL?)
Why does my statement put the first row of my CSV file into the first row of the MySQL table, but then skip all the way down to row 124 and insert all of its columns from the CSV file into the first column of the database table?
Sorry my English is not good.
Any assistance will be greatly appreciated.
When you run the command, it should give a message like Records: 1 Deleted: 0 Skipped: 0 Warnings: 0. If there are any warnings, type SHOW WARNINGS to see what they were.
This sounds like behavior I sometimes get when the line ending is wrong. Try opening your document in a code editor and checking to make sure your lines actually end with \n. If that's inconvenient, you could also just try dropping the table and reimporting with LINES TERMINATED BY '\r' or LINES TERMINATED BY '\r\n' and see if that fixes it.
Regarding field names: This command ignores field names in your text file. All it does is match the first column to the first field indicated in parentheses (in your case, Year) the second column to the second field in parentheses (Month), and so on. If you have field names at the top of your file, you should skip over them by adding IGNORE 1 LINES just before the parentheses with the list of fields.
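Putting those together for the statement in the question, a version to try might look like this (a sketch: it assumes the Excel export uses '\r\n' line endings and that the file has a header row; drop the IGNORE 1 LINES if it doesn't):
LOAD DATA LOCAL INFILE 'C:\\Documents and Settings\\J03299\\Desktop\\TMETER.csv'
INTO TABLE edata
COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(Year,Month,Day,MJD,xpiles,xstacks,Utilites);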