Data Services CSV Flat File there should be a column delimiter after column [n] - csv

I'm really struggling with this one. Data services (v14.2.3.549) keeps flagging up an error saying "A column delimiter was seen after column number <80> for row number <1> in file " it says this for what looks like every row it processes.
I've used the same settings as all the last files I imported, which are also CSV files. The files are exported from a web front end as excel then saved as csv. I tried opening the file with excel, clearing empty columns after end of data, in case there was anything in them, and rerunning to no avail.
I don't really know what to look for in the file so can anyone help me find out what I should be looking for so I can map my way to the problem. It seems that this problem is throughout this collection of files, as if I try importing using wild card on end of file name it comes up with same errors in other files.
Many thanks
Andrew

I used "Adaptable Schema" set to "yes" in the file format definition to get around this error.

Related

extract sub-headers in R or Python

desperate newbie here. I have a question for which I just cannot find the right solution. I received a dta file from which I want to extract the sub-headers of each column. Unfortunately, I am not versed in Stata or have access to it. I read my dta file into R and changed it to a data frame and also data table. It displays the column names and sub-headers well. However I cannot extract the sub-headers and they also disappear when I save the data frame or table as a csv or excel file locally. When I call colnames(df) or names(df), I only receive the column names and not the sub-headers. I also tried it with python without luck. Unfortunately, I am not allowed to share the data. So I hope my problem is understandable without an example. Thank you in advance!

Pipeline unable to read field of plain text file

Using Apache Hop latest version I'm trying to read in a plain text file. This text file is old and basically only structured by its lines (it has no delimiter, no seperator, no enclosure, etc.). I would like to read and process the lines of this file as rows in my transformation.
I use the "Text file input" transformation to read the file. Apparently reading it works, but I seem have no field available when trying to retrieve the fields. It simply states that no fields were found.
When I run the "preview records" I do get empty records equal to the number if lines in the file, so that is good. However there is no data shown as there is no field detected.
Curiously enough, when I press "Show file content" I DO get the desired content, nicely structured in the rows as desired, so I know the file is being read correctly.
Does anyone know how to best read these kind of files?
PS: The files can be anywhere from 10 to 100000 lines.
When there is no header row with field names or Hop is not able to detect any fields you can also create a field in the fields tab and it will put content in there.
As we just use a position based approach and split the content using the specified delimiter everything should go in "field1" when no delimiter is found in the data.
Figured it out. The naming is a bit misleading, but you can use the "CSV File input" and then set a TAB as delimited. Then use preview on your file and you should find that the lines are actually being parsed.

SSIS Errors for simple CSV Data Flow

Sorry to darken your day with my troubles, but SSIS has broken me! I am new to SSIS and I just seem to be misunderstanding it.
For background: I have a few versions of a basic package that includes a Foreach Loop container and a Data Flow with a few Derived Columns that imports CSV files into a SQL Server Staging table. It is very straightforward and does include an Execute SQL task and a File Move but those work fine. The issues are with the Foreach loop and the Data Flow.
I have one version of this package (let’s call it “A”) that seemed to be working fine. It would process multiple files in a folder, insert records into the staging table, properly execute the SQL Statements, and move the files to Archive. Everything seemed fine until I carefully QA’d the process. Turns out it was duplicating the data from one file, and never importing the data from a second Source File! Yet, the second/dupe round of data included the Source Filename (via a derived column) of the second file (but the data from the first). So it looked like I had successfully processed BOTH files until I looked at the actual data and saw that none of the values from the second source file were ever written to the Staging table.
Once I discovered this, I figured that the problem was in the Foreach loop and how I setup the different file path & name variables. So, I decided to try to make a new version of the package. I started by copying package A and created package B. In B, I deleted the Source Connection manager and created a new Connection Manager along with all new file & path variables. I then tried to cleanup/fix/replace various elements in my Data Flow and Foreach loop. In the process, I discovered that the Advanced Mappings from A – which DID work – were virtually all setup as String (even the Currency and Date columns). That did not seem right, so I modified each source money column by changing to data type Currency, and changed each date-related column to data type Date.
What followed has been dozens and dozens of Errors and I cannot get Package B to run. I have even changed all of the B data types back to String (mirroring the setup in Package A which DID work). But, still no joy.
This leads me to ask a few questions to those of you smarter than I:
1) Why can’t SSIS interpret Source CSV data using the proper data type? I.e. why do I need to set every Input column as a STRING when some columns are clearly & completely Numeric, Currency or Dates? (Yes, the Source CSV files are VERY clean – most don’t even have NULLS)
a. When I do change the Advanced mapping for a date-related Source column to Date, I get the ever present error message: [Flat File Source [30]] Error: Data conversion failed. The data conversion for column "Settle Date" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
2) When I reset the data types back to String in package B, I still get errors – usually Truncation errors (and Yes – I have adjusted the length to 250 in one of these columns).
a. Error Message: "The value could not be converted because of a potential loss of data.".
b. When I reset the Mappings to ignore the column (as a test), it throws a similar error at the next column.
3) Any ideas why Package A would dupe a file’s data and not process the second file, yet throw no errors and move both to Archive?
4) Why does the Data Viewer appear to have parsing errors (it shows data in the wrong columns) but when you use the Copy data feature in the data viewer and paste it into Excel, all of the data lines up perfectly?
5) Are there any tips & tricks that a rookie SSIS user needs to understand and which might not be apparent through the documentation and searching web articles as well as this site?
I can provide further details if they will help, but these packages are really very simple and should not be causing me this much frustration.
THANKS for any insights.
DGP
Wow seems like you have a lot of ssis issues... I think the reason for the same file being extracted is because of the the way your 'variable mappings' is defined.
Have you had a look and followed this guide:
https://www.simple-talk.com/sql/ssis/ssis-basics-introducing-the-foreach-loop-container/
Hope this helps.
Shaheen
Thanks Tab & Shaheen,
To all SSIS rookies - please learn from my mistakes!
It appears that my issue was actually in how I identified the TEXT QUALIFIER in the Connection Manager. I had entered "" and that was causing problems with how my columns were being parsed. The parsing issues caused unexpected values to appear in some of the columns and that was causing the errors in the package.
When I tried changing the the Text Qualifier to only ONE double quote - " - the whole thing worked!
As I mentioned - and as Shaheen suspected - my initial issues with the duplicate processing was probably due to how I setup the foreach loop. I had already fixed that, bit was still getting errors until I fixed the Text Qualifier.
I have only tested it a few times but it looks like that was the issue.
Thanks for the contributions.
DGP

Ms-Access trying to use "transfer text" to create a csv file with a unique filename

I am trying to use an automated macro to export a Ms-Access table to a csv file. I want the destination file to have a unique name, and I reckoned that using now()yyyymmddhhnn would be a good way to achieve this.
I have got transfer text working ok from my macro, and I have set up an export file spec for the transfer.
I am using ="C:\batchfile_" & Format(Now(),"yyyymmddhhnn") & ".csv" in the filename argument in the macro. This bit works.
But when I try to run the macro, it tells me that the filename doesn't exist and then the export doesn't complete. I am not sure why this is, but I think it is because the export file specification is expecting the destination file to have the same filename and column structure as the source table.
Does anyone know a way around this?
Eric
This is very old thread, I am posting my solution so that it may be usefull for some one else
transfer text works fine, as long as variables are supplied properly, you can check for other options other than filename, datasource alternatively create using file open statement
by opening text file and convert recordset data into CSV format.

Is there any way to reorder fields in an SSIS flat file source?

I have an SSIS package using a tab delimited flat file source with a TON of fields. Recently the provider of the tab delimited flat file has decided to change the format of the flat file by sprinkling a couple dozen new fields at random into the file. Needless to say, this hosed the package.
Rather than rebuild another flat file source and redefine all the fields, types, and lengths all over again, is there a way to reorder the fields in the flat file source? Sure would have been nice if Microsoft allowed you to move the fields around in the Advanced Columns pane, but noooooo.
Any help is appreciated.
If you only need to add columns to your file, you can do that in the Flat File connection editor. In the advanced window, you can select the field next to the new one and click the chevron next to the New button. It will give you the choice insert before or insert after.
If you truly have to move things around, you'll need to edit the XML source. If you use the existing file definition as a guide, you can build the new one in Excel or T-SQL relatively easily. Easier than typing everything in all over again at least.
I had a similar issue: I needed to change the order of columns in my flat file destination. The time-saving approach I settled on:
Delete the FF destination and FF connection manager (note down file name/location!),
Clear the check boxes that enable output columns in the source component
Re-enable the columns in the order you want
Add a new FF destination and FF connection right from the FF destination's connection manager drop-down.
Review/sanity check column sizes in FF connection, as usual
Not a direct answer to the question, but I came here looking for advice on "how to rearrange flat file destination columns", perhaps this will help someone.
I haven't seen an solution for that problem. SSIS isn't very strong in changing metadata. You could try to do it in notepad, but that is very tricky and very buggy. I would not recommand that to you.
In the connection managers below of your IDE you can double click your file name and edit everything you want.
This is still a "feature" of SSIS. To work around this I create a flat file connection called "NULL" with a single column named "NULL". Use the "New" button to add the column. I change the default column name from "Column 0" to "NULL". This column name must not match any column name in the list to be re-populated. If you have a real column named "NULL", pick something else for the column name that's not in use. You can keep the "NULL" flat file connection in the project for later use. (I expect to need it a few more times in this project.)
For this example, I use a flat file destination. Change the Flat File Destination to use the NULL connection.
Check the mapping to see there are no columns mapped. Saving this resets the metadata stored for the mapping.
Finally, change the Flat File Destination back to the correct connection to get a new mapping without metadata interference.
My example is a flat file destination. It should work for a flat file source for resetting the metadata. It is similar to the trick of changing a query to "select 1 as [NULL]" and back to purge metadata when using a ODBC source or such.
you could probably try something, but i havent tested.. use expressions to set everything for your flat file source? turn design time validation off