DataPipeline: Use only first 4 values from CSV in pipeline

I have a CSV with a variable structure, and I only want to take the first 4 values from it. The CSV, stored in S3, has between 7 and 8 fields, and I would like to load just the first 4. I have attempted to use the following prepared statement:
INSERT INTO locations (timestamp, item_id, latitude, longitude) VALUES (?, ?, ?, ?);
However I am getting:
Parameter index out of range (5 > number of parameters, which is 4).
I believe this means it is attempting to load the other fields in the CSV as well. Is it possible to take just the first 4 values, or otherwise deal with a variable-length CSV?

Use the transformSql option. You didn't mention what you are loading into, but from the Redshift docs:
The SQL SELECT expression used to transform the input data. When you copy data from DynamoDB or Amazon S3, AWS Data Pipeline creates a table called staging and initially loads it in there. Data from this table is used to update the target table. If the transformSql option is specified, a second staging table is created from the specified SQL statement. The data from this second staging table is then updated in the final target table. transformSql must be run on the table named staging and the output schema of transformSql must match the final target table's schema.
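In this case, a minimal transformSql along these lines could keep just the first four fields; the column names below are taken from the question's target table and are an assumption about what the staging table actually exposes for the CSV:
-- Hedged sketch: transformSql must select from the table named "staging" and its
-- output schema must match the target (timestamp, item_id, latitude, longitude);
-- the staging column names here are assumed, not confirmed.
SELECT timestamp, item_id, latitude, longitude
FROM staging;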

Related

Trying to insert data into table with lookup output of no match

I'm extracting data from an Excel spreadsheet and transforming the data types inside my SSIS package to prepare the data for a specific destination table. The input data does not have a unique field, and when executing this task more than once, the data from Excel is pushed into the table again. I am using a Lookup transformation and inserting only unmatched records into the table, but in my case, while the lookup succeeds, the data is not inserted into the table.
I've checked the source and destination data types, lengths, and mappings, but the insert is not happening. Please help me resolve the issue.

Apache Drill Hbase Query - Audit Columns

How can I write queries to extract data conditionally based on audit columns like TimeRange, etc.?
I am actually looking to filter data based on timestamp, i.e. basically extracting data for a specific version.
As per Write Drill query output to csv (or some other format):
use dfs.tmp;
alter session set `store.format`='csv';
create table dfs.tmp.my_output as select * from cp.`employee.json`;
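Pointed at an HBase-backed table, the same pattern might look roughly like this; hbase.`my_table`, the `cf` column family, and the `my_col` qualifier are placeholders, and convert_from() is used to turn the binary HBase values into text (the filter here is on an ordinary column value, not on cell versions):
use dfs.tmp;
alter session set `store.format`='csv';
create table dfs.tmp.my_output as
  select convert_from(row_key, 'UTF8')         as rk,
         convert_from(t.`cf`.`my_col`, 'UTF8') as my_col
  from hbase.`my_table` t
  where convert_from(t.`cf`.`my_col`, 'UTF8') = 'some_value';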

How to insert matlab data table into MySQL database

I have a MATLAB table T with more than 40,000 rows. I want to insert this table into a MySQL database. This table T has columns with different data types (char, date, integer). I tried the following:
fastinsert(conn,'tablename',colnames2,T)
I even tried with "insert" and "datainsert". I converted the table to a cell array, but it still didn't work. Then I tried to convert that cell array into a matrix, but I couldn't: it says all the content should be of the same data type to convert it into a matrix.
Is there any way I can insert my MATLAB data table into the MySQL database?
Instead of converting your data from a cell array to a matrix, have you tried converting it to a table using cell2table() and then using insert()? There is an example in the MATLAB documentation that can be found here.
The linked example uses multiple data types in a cell array and then converts them to a table (instead of a matrix), which can then be written to the database with mixed data types.
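A rough sketch of that approach, assuming an existing Database Toolbox connection conn; the table name 'mytable' and the column names are placeholders:
% Sketch only: 'mytable' and the column names are hypothetical, and conn is an
% existing connection created with database().
C = {1, 'item-01', '2016-05-01', 42.5};            % mixed data types in a cell array
colnames = {'id', 'name', 'created_on', 'score'};
T2 = cell2table(C, 'VariableNames', colnames);     % the table keeps the mixed types
insert(conn, 'mytable', colnames, T2);             % write the table to MySQL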

How to replace a Column simultaneously with LOAD INFILE in MySQL

Suppose we have a table with a DECIMAL column with values such as 128.98, 283.98, 21.20.
I want to import some CSV files into this table. However, in the columns of these files, I have values like 235,69 and 23,23, with a comma instead of a point as the decimal separator.
I know I can REPLACE that column, but is there some way of doing that before LOAD INFILE?
I do not believe you can simultaneously replace that column and load the data. It looks like you will have to do multiple steps to get the results you want (a sketch of the full flow follows below):
1. Load the data into a raw table first using the LOAD DATA INFILE command. This table can be identical to the main table; you can use the CREATE TABLE ... LIKE command to create it.
2. Process the data in the raw table (i.e. change the comma to a . where applicable).
3. Select the data from the raw table and insert it into the main table, either with row-by-row processing or a bulk insert.
This can all be done in a stored procedure (SP) or by a 3rd-party script written in Python, PHP, etc.
If you want to know more about SPs in MySQL, here is a useful link.
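Something along these lines; the prices/prices_raw table names, the amount column, the ';' field separator, and the file path are all assumptions:
-- Sketch of the three steps above. The raw copy of the column is widened to
-- VARCHAR so a value like '235,69' survives the load before being cleaned up.
CREATE TABLE prices_raw LIKE prices;
ALTER TABLE prices_raw MODIFY amount VARCHAR(20);

LOAD DATA INFILE '/tmp/prices.csv'
INTO TABLE prices_raw
FIELDS TERMINATED BY ';'
LINES TERMINATED BY '\n';

-- Step 2: change the comma to a decimal point
UPDATE prices_raw SET amount = REPLACE(amount, ',', '.');

-- Step 3: move the cleaned rows into the main table
INSERT INTO prices SELECT * FROM prices_raw;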

Load xml data into sql database using phpmyadmin

I know this is a really basic question, but I am struggling with my first import of data from an XML file. I have created the table "Regions", which has just two columns - ID and Name. The XML file contains the same column names.
In order to bulk import the data, I am using the following SQL command:
LOAD XML LOCAL INFILE 'C:\Users\Dell\Desktop\regions.xml'
INTO TABLE Regions (ID, Name)
but I am getting the error #1148 - The used command is not allowed with this MySQL version.
Having researched this, allowing the command requires a change to one of the server's configuration files, but my service provider doesn't give me access to it. Is there an alternative way to write the SQL that does exactly the same thing as the code above, which is basically just importing the data from an XML file?
Many thanks
Since LOAD XML LOCAL INFILE isn't enabled for you, it appears you have only one more option, and that's to create an INSERT statement for each row. If you convert your XML file to CSV using Excel, that's an easy step. Assuming you have rows of data like this
A | B
-----|-------------------------
1 | Region 1
2 | Region 2
I would create a formula like this in column C
=CONCATENATE("INSERT INTO Regions(ID,Name) VALUES(",A1,",'",B1,"');")
This will result in INSERT INTO Regions(ID,Name) VALUES(1,'Region 1'); for your first row. Fill this down to the last row of your spreadsheet. Select all the INSERT statements and copy them into the SQL query box inside phpMyAdmin, and you should be able to insert your values.
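For the two sample rows above, the generated statements pasted into phpMyAdmin would simply be:
INSERT INTO Regions(ID,Name) VALUES(1,'Region 1');
INSERT INTO Regions(ID,Name) VALUES(2,'Region 2');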
I've used this method many times when I needed to import data into a database.