S3 to MySQL AWS Data Pipeline Insert table error

It's my first time asking a question on here, so please bear with me.
I am trying to create a data pipeline to upload a CSV file from an S3 bucket to a MySQL database table (Production1) using the template provided by AWS, but it fails when executing RdsMySqlTableCreateActivity.
The SQL statement that I'm using (all column names match the CSV file) in the myRDSTableInsertSql parameter:
INSERT INTO `Production1` (`API`, `Normalized Month`, `DATE`, `Monthly Liquid`, `Cum Oil`, `BOPD`, `Monthly Gas Mcf/Month`, `Cum Gas`, `MCFPD`) VALUES(?,?,?,?,?,?,?,?,?);
The RdsMySqlTableCreateActivity error:
errorId
ActivityFailed:SQLException
errorMessage
No value specified for parameter 1
errorStackTrace
amazonaws.datapipeline.taskrunner.TaskExecutionException:
private.com.amazonaws.services.datapipeline.redshift.QueryStatementException: Exception No value specified for
parameter 1 while executing INSERT INTO `Production1` (`API`, `Normalized Month`, `DATE`, `Monthly Liquid`, `Cum Oil`, `BOPD`, `Monthly Gas Mcf/Month`, `Cum Gas`, `MCFPD`) VALUES(?,?,?,?,?,?,?,?,?);...
I ran the insert command in MySQL Workbench, replacing the (?,?,?,?,?,?,?,?,?) with (1,2,3,4,5,6,7,8,9), and it worked. The CSV file that I'm using only has two rows: the column names and the values 1-9 for each column respectively. I'm really not sure what it means by "No value specified for parameter 1"; any help/guidance would really be appreciated!!!

For anyone who runs into the same issue using the "Load S3 data into RDS MySQL table" template,
my values for each parameter were the following:
myRDSTableInsertSql:
INSERT INTO tableName(`col_name1`, `col_name2`, `col_name3`, `col_name4`, `col_name5`, `col_name6`, `col_name7`, `col_name8`, `col_name9`) VALUES(?,?,?,?,?,?,?,?,?);
myRDSTableName: tableName
myRDSCreateTableSql:
CREATE TABLE tableName(`col_name1` type, `col_name2` type, `col_name3` type, `col_name4` type, `col_name5` type, `col_name6` type, `col_name7` type, `col_name8` type, `col_name9` type);
The main issue was with the actual CSV file format: you have to make sure there is no header row and that the types match the table exactly. Also make sure that your separator is "," and that values are not quoted within your CSV file.
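As an illustration (a minimal sketch with hypothetical column names and types), a table created as:
CREATE TABLE tableName(`col_name1` INT, `col_name2` VARCHAR(50), `col_name3` DECIMAL(10,2));
expects the CSV to contain nothing but unquoted, comma-separated data rows, with no header line:
1,abc,2.50
2,def,3.75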
This template is a good starting point, but for more detailed/complex CSV files, making your own data pipeline is a must!

Related

SAS pass through - Extract from MySQL does not work

I'm trying to build a Data Integration job that uses pass-through to extract data from a view in a MySQL database.
We've been using pass-through a lot in the project, mostly extracting data from Redshift;
however, with MySQL I was not able to make it work properly.
It keeps complaining that a table is missing, even though when pass-through is off, the view is found and the data is extracted...
I tried every trick I know, from enabling case-sensitive DBMS object names to manually removing single/double quotes from the statement, just in case MySQL confuses it with something else...
No luck.
ODBC driver is [MySQL][ODBC 5.3(a) Driver][mysqld-5.5.53].
Ran on a Windows environment.
Any idea how to solve this?
Thank you in advance.
EDIT
So, first of all, one correction (even though it's not that important): I extract from a view, not a table.
This is the code generated by the SAS Create Table transformation, with pass-through enabled. I only put an asterisk instead of the full list of columns:
proc sql;
connect to ODBC
(
READBUFF=10000 DATASRC="cmp.web_api" AUTHDOMAIN="MYSQL_CMP_Auth"
);
create table work."W7ZZZKOC"n as
select
*
from connection to ODBC
(
select
V_BI_ACCOUNT.ACCOUNT_NAME,
V_BI_ACCOUNT.ACQUISITION_SOURCE__C,
V_BI_ACCOUNT.ZUORA__ACTIVE__C,
V_BI_ACCOUNT.ADDRESS_LINE_1__C,
V_BI_ACCOUNT.ADDRESS_LINE_2__C,
V_BI_ACCOUNT.ADDRESS_LINE_3__C,
V_BI_ACCOUNT.AGREEMENT_DATE,
V_BI_ACCOUNT.AGREEMENT_LEGAL_CLAUSE_1__C,
V_BI_ACCOUNT.AGREEMENT_LEGAL_CLAUSE_2__C,
V_BI_ACCOUNT.PERSONBIRTHDATE,
V_BI_ACCOUNT.BLOCKED_REASON__C,
V_BI_ACCOUNT.BRAND__C,
V_BI_ACCOUNT.CPN__C,
V_BI_ACCOUNT.ACCCREATEDBYID,
V_BI_ACCOUNT.ACCCREATEDDATE,
V_BI_ACCOUNT.CURRENCY_PREFERENCE__C,
V_BI_ACCOUNT.CUSTOMER_FULL_NAME__PC,
V_BI_ACCOUNT.ACCOUNTID,
V_BI_ACCOUNT.ZUORA__CUSTOMERPRIORITY__C,
V_BI_ACCOUNT.DELIVERY_SALUTATION__C,
V_BI_ACCOUNT.DISPLAY_NAME,
V_BI_ACCOUNT.PERSONEMAIL,
V_BI_ACCOUNT.EMAILKEY__C,
V_BI_ACCOUNT.FACEBOOKKEY,
V_BI_ACCOUNT.FIRSTNAME,
V_BI_ACCOUNT.GENDER__C,
V_BI_ACCOUNT.PHONE,
V_BI_ACCOUNT.ACCLASTACTIVITYDATE,
V_BI_ACCOUNT.ACCLASTMODIFIEDDATE,
V_BI_ACCOUNT.LASTNAME,
V_BI_ACCOUNT.OTHER_EMAIL__C,
V_BI_ACCOUNT.PI_TYPE__C,
V_BI_ACCOUNT.ACCPARENTID,
V_BI_ACCOUNT.POSTCODE__C,
V_BI_ACCOUNT.PRIMARY_ACCOUNT_OF_THIS_CUSTOMER,
V_BI_ACCOUNT.ACCPRIMARY__C,
V_BI_ACCOUNT.ACCREASON_FOR_STATUS__C,
V_BI_ACCOUNT.ZUORA__SLA__C,
V_BI_ACCOUNT.ZUORA__SLASERIALNUMBER__C,
V_BI_ACCOUNT.SALUTATION,
V_BI_ACCOUNT.ACCSYSTEMMODSTAMP,
V_BI_ACCOUNT.PERSONTITLE,
V_BI_ACCOUNT.ZUORA__UPSELLOPPORTUNITY__C,
V_BI_ACCOUNT.X_CODE__C,
V_BI_ACCOUNT.ZUORA__ACCOUNT_ID__C,
V_BI_ACCOUNT.ZUORA__PAYMENTMETHODID__C,
V_BI_ACCOUNT.CITY,
V_BI_ACCOUNT.ORIGINAL_CREATED_DATE,
V_BI_ACCOUNT.SOURCE_SYSTEM_ID,
V_BI_ACCOUNT.STATUS,
V_BI_ACCOUNT.ZUORA__CONTACT_ID,
V_BI_ACCOUNT.ACCISDELETED,
V_BI_ACCOUNT.BILLING_ACCOUNT_NAME,
V_BI_ACCOUNT.ACZCREATEDDATE,
V_BI_ACCOUNT.ACZSYSTEMMODSTAMP,
V_BI_ACCOUNT.ACZLASTACTIVITYDATE,
V_BI_ACCOUNT.ZUORA__ACCOUNT__C,
V_BI_ACCOUNT.ZUORA__ACCOUNTNUMBER__C,
V_BI_ACCOUNT.ZUORA__AUTOPAY__C,
V_BI_ACCOUNT.ZUORA__BALANCE__C,
V_BI_ACCOUNT.ZUORA__CREDITCARDEXPIRATION__C,
V_BI_ACCOUNT.ZUORA__CURRENCY__C,
V_BI_ACCOUNT.ZUORA__MRR__C,
V_BI_ACCOUNT.ZUORA__PAYMENTTERM__C,
V_BI_ACCOUNT.ZUORA__PURCHASEORDERNUMBER__C,
V_BI_ACCOUNT.ZUORA__LASTINVOICEDATE__C,
V_BI_ACCOUNT.COUNTRY_NAME,
V_BI_ACCOUNT.COUNTRY_CODE,
V_BI_ACCOUNT.FAVOURITE_FOOTBALL_CLUB,
V_BI_ACCOUNT.COUNTY
from
web_api.V_BI_ACCOUNT as V_BI_ACCOUNT
);
%rcSet(&sqlrc);
disconnect from ODBC;
quit;
And again, when I extract the data without pass-through, it works successfully.
I found out the problem was a column name exceeding 32 characters.
As SAS only supports column names up to 32 characters,
the query fails to find PRIMARY_ACCOUNT_OF_THIS_CUSTOMER, as the original column name is PRIMARY_ACCOUNT_OF_THIS_CUSTOMER__C.
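A workaround (a sketch; the shortened alias is a hypothetical name of my choosing) is to rename the column inside the pass-through query so that the name SAS sees stays within 32 characters:
select
V_BI_ACCOUNT.PRIMARY_ACCOUNT_OF_THIS_CUSTOMER__C as PRIMARY_ACCT_OF_THIS_CUSTOMER
from
web_api.V_BI_ACCOUNT as V_BI_ACCOUNT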
EDIT
One more thing I found out is that MySQL doesn't like the schema name or aliases being specified.
Therefore:
use the FROM clause to specify only the table name, i.e. 'from v_bi_account' rather than 'from web_api.v_bi_account',
and do not use aliases, i.e. 'from v_bi_account' rather than 'from v_bi_account as v_bi_account'.
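Putting both fixes together, the inner query ends up shaped like this (a sketch with most columns elided):
select
ACCOUNT_NAME,
PRIMARY_ACCOUNT_OF_THIS_CUSTOMER__C as PRIMARY_ACCT_OF_THIS_CUSTOMER
from
v_bi_account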
Thank you guys so much for your help.

SSIS Import a date and time from a txt to a table datetime

So I want to import a datetime from a txt:
2015-01-22 09:19:59
into a table using a data flow. I have my Flat File Source and my destination DB set up fine. In the advanced settings, and in the input and output properties, I changed the data type for that column of the txt input to:
database timestamp [DT_DBTIMESTAMP]
This is the same data type the DB uses for the table, so this should work.
However, when I execute the package I get an error saying the data conversion failed... How do I make this work?
[Import txt data [1743]] Error: Data conversion failed. The data conversion for column "statdate" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
[Import txt data [1743]] Error: SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR. The "output column "statdate" (2098)" failed because error code 0xC0209084 occurred, and the error row disposition on "output column "statdate" (2098)" specifies failure on error. An error occurred on the specified object of the specified component. There may be error messages posted before this with more information about the failure.
[Import txt data [1743]] Error: An error occurred while processing file "C:\Program Files\Microsoft SQL Server\MON_Datamart\Sourcefiles\tbl_L30T1.txt" on data row 14939.
On the row where it gives the error, the datetime is filled with spaces. That is why "allow nulls" is checked on the table, but my SSIS package still gives the error for some reason... Can I tell the package to allow nulls as well?
I suggest you import the data into a character field and then parse it after entry.
The following function should help you:
SELECT IsDate('2015-01-22 09:19:59')
, IsDate(Current_Timestamp)
, IsDate(' ')
, IsDate('')
The IsDate() function returns 1 when it thinks the value is a date and 0 when it is not; note that both the blank string and the empty string above return 0.
This would allow you to do something like:
SELECT value_as_string
, CASE WHEN IsDate(value_as_string) = 1 THEN
      Cast(value_as_string As datetime)
  ELSE
      NULL
  END As value_as_datetime
FROM ...
I solved it myself. Thank you for your suggestion, gvee, but the way I did it is way easier.
In the Flat File Source, when making a new connection, I fixed all the data types in the Advanced tab according to the table in the database EXCEPT the column with the timestamp (in my case it was called "statdate")! I changed this data type to a string, because otherwise my Flat File Source would give me a conversion error before any scripts could even execute, and the only way around this was setting the error output to ignore failure, which I don't want. (You still have to change the data type after you set it to a string in the advanced settings: right-click the Flat File Source -> Show Advanced Editor -> go to the output columns and change the data type there from date to string.)
After the timestamp was set to a string, I added a Derived Column with this expression to trim the spaces and, if nothing is left, supply a NULL value:
TRIM(<YourColumnName>) == "" ? (DT_STR,4,1252)NULL(DT_STR,4,1252) : <YourColumnName>
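With the statdate column from the error messages above, that becomes:
TRIM(statdate) == "" ? (DT_STR,4,1252)NULL(DT_STR,4,1252) : statdate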
Next I added a Data Conversion to turn the string back into a timestamp. The Data Conversion is finally connected to the OLE DB Destination.
I hope this helps anyone with the same problem in the future.
End result: [picture of the data flow]

SQL Bulk Insert CSV

I have a comma-separated CSV file containing hundreds of thousands of records in the following format:
3212790556,1,0.000000,,0
3212790557,2,0.000000,,0
Now, using the SQL Server Import Flat File method works just dandy. I can edit the SQL so that the table name and column names are something meaningful, plus I also edit the data types from the default varchar(50) to int or decimal. This all works fine, and the SQL import completes successfully.
However, I am unable to do the same task using the BULK INSERT query, which is as follows:
BULK INSERT temp1
FROM 'c:\filename.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)
GO
This query returns the following 3 errors, which I have no idea how to resolve:
Msg 4866, Level 16, State 1, Line 1
The bulk load failed. The column is too long in the data file for row 1, column 5. Verify that the field terminator and row terminator are specified correctly.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
The purpose of my application is that there are multiple CSV files in a folder that all need to go into a single table so that I can query for the sum of values. At the moment I was thinking of writing a program in C# that will execute the BULK INSERT in a loop (once per file) and then return with my results. I am guessing I don't need to write code and that I can just write a script that does all of this; can anyone guide me to the right path? :)
Many thanks.
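For what it's worth, one script-only way to loop over the files is a rough T-SQL sketch like the following, assuming xp_cmdshell is enabled; the folder path is hypothetical:
-- collect the file names in the folder
CREATE TABLE #files (fname varchar(260));
INSERT INTO #files EXEC xp_cmdshell 'dir /b C:\csvfolder\*.csv';
DELETE FROM #files WHERE fname IS NULL;
-- run one BULK INSERT per file via dynamic SQL
DECLARE @f varchar(260), @sql varchar(1000);
DECLARE c CURSOR FOR SELECT fname FROM #files;
OPEN c;
FETCH NEXT FROM c INTO @f;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = 'BULK INSERT temp1 FROM ''C:\csvfolder\' + @f
             + ''' WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'')';
    EXEC (@sql);
    FETCH NEXT FROM c INTO @f;
END;
CLOSE c;
DEALLOCATE c;
DROP TABLE #files;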
Edit: I just added
ERRORFILE = 'C:\error.log'
to the query, and I am getting 5221 rows inserted. Sometimes it's 5221, sometimes 5222, but it just fails beyond this point. Don't know what the issue is??? The CSV is perfectly fine.
SOB. WTF!!!
I can't believe that replacing \n with "0x0A" in the ROWTERMINATOR worked!!! I mean, seriously. I just tried it and it worked. WTF moment!! Totally.
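For reference, the statement that ended up working looks like this (same table and file as above, with the ERRORFILE option from the earlier edit):
BULK INSERT temp1
FROM 'c:\filename.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    ERRORFILE = 'C:\error.log'
)
GO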
However, what is a bit interesting is that the SQL Import wizard took only about 10 seconds to import, while the BULK INSERT query took well over a minute. Any guesses??

SSIS Package Fails on Status Code 4

I've created an SSIS package that executes inline SQL queries from our database and is supposed to output the contents to a text file. I originally had the text file comma-delimited, but changed to pipe-delimited after researching the error further. I also did a substring of the FirstName field and ensured that the SSIS placeholder fields matched in length. The error message is as follows:
[Customers Flat File [196]] Error: Data conversion failed. The data conversion for
column "FirstName" returned status value 4 and status text "Text was truncated or one or more
characters had no match in the target code page.".
The SQL statement I'm using in my OLE DB Source is as follows:
SELECT
dbo.Customer.Email, SUBSTRING(dbo.Customer.FirstName, 1, 100) AS FirstName,
dbo.Customer.LastName, dbo.Customer.Gender,
dbo.Customer.DateOfBirth, dbo.Address.Zip, dbo.Customer.CustomerID, dbo.Customer.IsRegistered
FROM
dbo.Customer INNER JOIN
dbo.Address ON dbo.Customer.CustomerID = dbo.Address.CustomerID
What other fixes should I put in place to ensure the package runs without error?
Have you tried to run this query in SSMS? If so, did you get a successful result?
If you haven't tried it yet, paste this query into a new SSMS window and wait for it to complete.
If the query completes, then we don't have a problem with the query; something could be off inside the package.
But if the query does not finish and fails, you know where to look.
EDIT
On second thought, is your Customer source a flat file or something? It looks like there is a value in the Customer table/file which does not match the output metadata of the source. Check your source again.
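If the query itself runs fine in SSMS, one way to hunt for the offending rows (a sketch, assuming FirstName is stored as nvarchar) is to look for values that change when forced into the non-Unicode target code page:
SELECT CustomerID, FirstName
FROM dbo.Customer
WHERE FirstName <> CONVERT(varchar(100), FirstName);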

Inserting user-defined data in SQL Server

Hi, I am using SQL Server 8.0 for my database. I don't know how to insert user-defined data.
This is my table:
column name         data type                  length   allow_nulls
study_type_id       int                        4
study_type_name     UD_NAME (varchar)          150
study_type_abbrev   UD_NAME_SHORT (varchar)    50
order               int
UD_NAME and UD_NAME_SHORT are user-defined data types in SQL Enterprise Manager; the base type is varchar.
When I used the insert command as below,
INSERT into study_type VALUES (15, 'test', 'TT',100)
it gives me an "Implicit_conversion_error", and I could not see the ASP webpage linked to that table.
And when I tried:
INSERT into study_type (study_type_id, study_type_name, study_type_abbrev, order)
VALUES (15,CAST('test' as UD_NAME),CAST('TT' as UD_NAME_SHORT),100)
Then it said "Type UD_NAME is not a defined system type."
You cannot use user-defined types in CAST and CONVERT functions.
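Instead, pass the base-type values and let them convert implicitly (a sketch against the table above; note that order is a reserved word, so it has to be bracketed):
INSERT INTO study_type (study_type_id, study_type_name, study_type_abbrev, [order])
VALUES (15, 'test', 'TT', 100);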