I have a comma-separated CSV file containing hundreds of thousands of records in the following format:
3212790556,1,0.000000,,0
3212790557,2,0.000000,,0
Now, using the SQL Server Import Flat File method works just dandy. I can edit the SQL so that the table name and column names are something meaningful, and I also edit the data types from the default varchar(50) to int or decimal. This all works fine and the SQL import completes successfully.
However, I am unable to do the same task using a BULK INSERT query, which is as follows:
BULK INSERT temp1
FROM 'c:\filename.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)
GO
This query returns the following three errors, which I have no idea how to resolve:
Msg 4866, Level 16, State 1, Line 1
The bulk load failed. The column is too long in the data file for row 1, column 5. Verify that the field terminator and row terminator are specified correctly.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
The purpose of my application is that there are multiple CSV files in a folder that all need to go into a single table so that I can query for the sum of values. At the moment I am thinking of writing a C# program that executes the BULK INSERT in a loop (once per file) and then returns my results. But I am guessing I don't need to write code for this and that a script can do it all; can anyone guide me to the right path :) (one possible script is sketched below).
Many thanks.
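For what it's worth, here is a pure T-SQL sketch of that loop. It assumes xp_cmdshell is enabled, uses a placeholder folder C:\csv\, targets the temp1 table from the query above, and uses the 0x0A row terminator that ends up working below:

-- collect the file names in the folder (xp_cmdshell must be enabled)
CREATE TABLE #files (fname NVARCHAR(260));
INSERT INTO #files EXEC xp_cmdshell 'dir /b C:\csv\*.csv';
DELETE FROM #files WHERE fname IS NULL;  -- xp_cmdshell appends a NULL row

DECLARE @f NVARCHAR(260), @sql NVARCHAR(MAX);
DECLARE c CURSOR FOR SELECT fname FROM #files;
OPEN c;
FETCH NEXT FROM c INTO @f;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- BULK INSERT does not accept a variable as the file name, so build dynamic SQL
    SET @sql = N'BULK INSERT temp1 FROM ''C:\csv\' + @f +
               N''' WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''0x0a'');';
    EXEC sp_executesql @sql;
    FETCH NEXT FROM c INTO @f;
END
CLOSE c;
DEALLOCATE c;
DROP TABLE #files;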
Edit: I just added
ERRORFILE = 'C:\error.log'
to the query and I am getting 5221 rows inserted. Sometimes it's 5221, sometimes 5222, but it just fails beyond that point. I don't know what the issue is; the CSV is perfectly fine.
SOB. WTF!!!
I can't believe that replacing \n with "0x0A" as the ROWTERMINATOR worked!!! I mean, seriously. I just tried it and it worked. WTF moment!! Totally.
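For reference, the query that finally worked:

BULK INSERT temp1
FROM 'c:\filename.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a'
)
GO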
However, what is a bit interesting is that the SQL Import wizard took only about 10 seconds to import, while the BULK INSERT query took well over a minute. Any guesses??
I am trying to import a PSV (pipe-delimited CSV) into a Microsoft SQL Server 2008 R2 Express database table.
There are only two fields in the PSV; each field has more than 1000 characters.
In the import wizard I used the following settings, double-checked the column mappings, and set the on-error/on-truncation options to Ignore (screenshots omitted). And as usual, I get an error:
Error 0xc02020a1: Data Flow Task 1: Data conversion failed. The data
conversion for column "Comm" returned status value 4 and status text
"Text was truncated or one or more characters had no match in the
target code page.". (SQL Server Import and Export Wizard)
UPDATE:
So, following @Marc's suggestion, though extremely reluctant, I spent about 3 hours finally getting SQL Server 2014 installed on my computer, hoping to import the PSV. As expected, the error shows up again (screenshot omitted).
I really cannot understand why a company like Microsoft does not do thorough QA on its products?!
After being tortured by Microsoft for the whole morning, I finally got this task done. For future readers, you can follow the steps below to import a CSV/PSV data source into your SQL Server:
Import the CSV/PSV into an Access database. Note that it must be saved as the mdb type (yes, the type from the 20th century); you might want to read my story here: how to import psv data into Microsoft Access
In your SQL Server (mine is 2014), start the Import Wizard and select the data source type (Access) and the file. Why do you have to use the mdb type of Access database? Because SQL 2014 offers no option for the accdb type of Access database.
DO NOT forget to select the right Destination (yes, even though you started the wizard by right-clicking the destination database and choosing Import): you want the last option, SQL Server Native Client 11.0. That will bring up the SQL 2014 server and its databases.
Now the import completes as expected.
Thanks to the great design logic in this SQL Server (2014? No, essentially unchanged from 2008), what a humble expectation and requirement: it cost me 4-5 hours to complete.
Alternatively, you can use bulk insert to import any flat file.
if (object_id('dbo.usecase1') is not null)
    drop table dbo.usecase1
go

create table dbo.usecase1
(
    Descr nvarchar(2000) null,
    Comm  nvarchar(2000) null
)
go

bulk insert dbo.usecase1
from 'C:\tmp\usecase0.psv'
with (
    FIELDTERMINATOR = '|',  -- the file is pipe-delimited, so '|' rather than ','
    ROWTERMINATOR = '\n'    -- if rows fail to split, try '0x0a' (see the first question above)
)
go
BULK INSERT (Transact-SQL)
It's my first time asking a question on here, so please bear with me.
I am trying to create a data pipeline to upload a CSV file in an S3 bucket to a MySQL database table (Production1) using the template provided by AWS, but it fails when executing RdsMySqlTableCreateActivity.
The SQL statement that I'm using (all column names match the CSV file) in the myRDSTableInsertSql parameter:
INSERT INTO `Production1` (`API`, `Normalized Month`, `DATE`, `Monthly Liquid`, `Cum Oil`, `BOPD`, `Monthly Gas Mcf/Month`, `Cum Gas`, `MCFPD`) VALUES(?,?,?,?,?,?,?,?,?);
The RdsMySqlTableCreateActivity error:
errorId
ActivityFailed:SQLException
errorMessage
No value specified for parameter 1
errorStackTrace
amazonaws.datapipeline.taskrunner.TaskExecutionException:
private.com.amazonaws.services.datapipeline.redshift.QueryStatementException: Exception No value specified for
parameter 1 while executing INSERT INTO `Production1` (`API`, `Normalized Month`, `DATE`, `Monthly Liquid`, `Cum Oil`, `BOPD`, `Monthly Gas Mcf/Month`, `Cum Gas`, `MCFPD`) VALUES(?,?,?,?,?,?,?,?,?);...
I ran the insert command in MySQL Workbench, replacing the (?,?,?,?,?,?,?,?,?) with (1,2,3,4,5,6,7,8,9), and it worked. The CSV file I'm using has only two rows: the column names, and the values 1 through 9, one per column. I really don't understand what "No value specified for parameter 1" means; any help/guidance would be appreciated!!!
For anyone who runs into the same issue using the "Load S3 data into RDS MySQL table" template:
My values for each parameter were the following.
myRDSTableInsertSql:
INSERT INTO tableName(`col_name1`, `col_name2`, `col_name3`, `col_name4`, `col_name5`, `col_name6`, `col_name7`, `col_name8`, `col_name9`) VALUES(?,?,?,?,?,?,?,?,?);
myRDSTableName: tableName
myRDSCreateTableSql:
CREATE TABLE tableName(`col_name1` type, `col_name2` type, `col_name3` type, `col_name4` type, `col_name5` type, `col_name6` type, `col_name7` type, `col_name8` type, `col_name9` type);
The main issue was with the actual CSV file format: you have to make sure there is no header row and that the types match exactly. Also make sure that your separators are "," and that values are not quoted within your CSV file.
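For example, a minimal sketch with hypothetical column names and types: if the table is created as

CREATE TABLE tableName(`col_name1` INT, `col_name2` INT, `col_name3` INT);

then the CSV must contain only unquoted, comma-separated values and no header row:

1,2,3
4,5,6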
This template is a good starting point, but for more detailed/complex CSV files, making your own data pipeline is a must!
I'm getting bitten trying to dumpdata from a legacy DB that I recently reverse-engineered using Django's inspectdb...
Other than this, every query works fine. In MySQL Workbench the column exists.
But when trying to export the data I get:
CommandError: Unable to serialize database: no such column: af_datper.locnac
The traceback doesn't show any of my own lines involved (pasted at http://dpaste.com/1DASN1V so as not to pollute the question).
The model field already admits null values for that column, and the column does exist in the database (besides seeing it in Workbench, inspectdb wouldn't have picked it up otherwise)...
I honestly don't know what else to do. Any takers?
A bit of digging into your traceback, and I see this:
File "venv/lib/python3.5/site-packages/django/db/backends/utils.py",
line 64, in execute
return self.cursor.execute(sql, params) File "venv/lib/python3.5/site-packages/django/db/backends/sqlite3/base.py",
line 323, in execute
return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: no such column: af_datper.locnac
From the SQLite documentation:
If a schema-name is specified, it must be either "main", "temp", or
the name of an attached database. In this case the new table is
created in the named database. If the "TEMP" or "TEMPORARY" keyword
occurs between the "CREATE" and "TABLE" then the new table is created
in the temp database. It is an error to specify both a schema-name and
the TEMP or TEMPORARY keyword, unless the schema-name is "temp". If no
schema name is specified and the TEMP keyword is not present then the
table is created in the main database.
In short, sqlite doesn't support foo.bar as a column name, unless foo is the name of the database or one of main or temp.
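A minimal illustration in the sqlite3 shell (hypothetical table; the dotted name stands in for af_datper.locnac):

CREATE TABLE t ("af_datper.locnac" TEXT);  -- a column whose name contains a dot
SELECT af_datper.locnac FROM t;            -- fails: parsed as table "af_datper", column "locnac"
SELECT "af_datper.locnac" FROM t;          -- works: the quotes make it a single identifier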
I've created an SSIS package that executes inline SQL queries against our database and is supposed to output the contents to a text file. I originally had the text file comma-delimited, but changed it to pipe-delimited after researching the error further. I also took a SUBSTRING of the FirstName field and ensured that the SSIS placeholder fields matched it in length. The error message is as follows:
[Customers Flat File [196]] Error: Data conversion failed. The data conversion for
column "FirstName" returned status value 4 and status text "Text was truncated or one or more
characters had no match in the target code page.".
The SQL statement I'm using in my OLE DB Source is as follows:
SELECT
dbo.Customer.Email, SUBSTRING(dbo.Customer.FirstName, 1, 100) AS FirstName,
dbo.Customer.LastName, dbo.Customer.Gender,
dbo.Customer.DateOfBirth, dbo.Address.Zip, dbo.Customer.CustomerID, dbo.Customer.IsRegistered
FROM
dbo.Customer INNER JOIN
dbo.Address ON dbo.Customer.CustomerID = dbo.Address.CustomerID
What other fixes should I put in place to ensure the package runs without error?
Have you tried running this query in SSMS? If so, did you get a successful result?
If you haven't tried it yet, paste the query into a new SSMS window and wait for it to complete.
If the query completes, then we don't have a problem with the query; something could be off inside the package.
But if the query does not finish and fails, you know where to look.
EDIT
On second thought, is your Customer source a flat file or something? It looks like there is a value in the Customer table/file which does not match the output metadata of the source. Check your source again.
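If the query itself runs fine in SSMS, one thing worth trying (a sketch, assuming the flat-file column really is defined as 100 characters) is to declare the output type explicitly so the source metadata cannot drift from the destination's, then refresh the OLE DB Source's external columns:

SELECT CAST(SUBSTRING(dbo.Customer.FirstName, 1, 100) AS varchar(100)) AS FirstName
FROM dbo.Customer;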
I have a linked server (Sybase) set up in SQL Server from which I need to draw data. The Sybase server sits on the other side of the world, and connectivity is pretty shoddy. I would like to insert data into one of the SQL Server tables in manageable batches (e.g. 1000 records at a time), i.e. I want to do:
INSERT INTO [SQLServerTable] ([field])
SELECT [field] from [LinkedServer].[DbName].[dbo].[SybaseTable]
but I want to fetch 1000 records at a time and insert them.
Thanks
Karl
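One way to do this in pure T-SQL, as a minimal sketch: it assumes the Sybase table has an ascending integer key column [id] and that you also store it in the local table to track progress (both column names are placeholders):

DECLARE @lastID INT = 0;
WHILE 1 = 1
BEGIN
    -- pull the next 1000 rows past the last key already copied
    INSERT INTO [SQLServerTable] ([id], [field])
    SELECT TOP 1000 [id], [field]
    FROM [LinkedServer].[DbName].[dbo].[SybaseTable]
    WHERE [id] > @lastID
    ORDER BY [id];

    IF @@ROWCOUNT = 0 BREAK;  -- nothing left to copy

    SELECT @lastID = MAX([id]) FROM [SQLServerTable];
END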
I typically use Python with the pyodbc module to perform batches like this against SQL Server. Take a look and see if it is an option; if so, I can provide you an example.
You will need to modify a lot of this code to fit your particular situation, but you should be able to follow the logic. You can comment out the cnxn.commit() line to roll back the transactions until you get everything working.
import pyodbc

# This is an MS SQL2008 connection string
conn = 'DRIVER={SQL Server};SERVER=SERVERNAME;DATABASE=DBNAME;UID=USERNAME;PWD=PWD'

cnxn = pyodbc.connect(conn)
cursor = cnxn.cursor()
rowCount = cursor.execute('SELECT COUNT(*) FROM RemoteTable').fetchone()[0]
cnxn.close()

count = 0
lastID = 0
while count < rowCount:
    # Open a fresh connection for each batch; keeping one connection open the
    # entire time would defeat the purpose of performing the work in batches.
    cnxn = pyodbc.connect(conn)
    cursor = cnxn.cursor()
    # ORDER BY ID is required: without it, TOP 1000 returns arbitrary rows and
    # the lastID bookkeeping below breaks. The ? placeholder lets the driver
    # bind the parameter instead of string formatting.
    rows = cursor.execute(
        'SELECT TOP 1000 ID, Field1, Field2 FROM RemoteTable WHERE ID > ? ORDER BY ID',
        lastID).fetchall()
    if not rows:
        break
    for row in rows:
        # Parameterized insert avoids quoting problems with string values.
        cursor.execute('INSERT INTO LOCALTABLE (FIELD1, FIELD2) VALUES (?, ?)',
                       row.Field1, row.Field2)
    cnxn.commit()  # comment this out to roll back until everything works
    cnxn.close()
    # The [0] assumes the ID is the first field in the select statement.
    lastID = rows[-1][0]
    count += len(rows)
    # Pause after each batch to see if the user wants to continue.
    # (Python 2's raw_input; use input() on Python 3.)
    raw_input("%s down, %s to go! Press enter to continue." % (count, rowCount - count))