Read csv file in SQL - mysql

I have a csv file which I want to directly use without creating a table. Is there a way to read and manipulate it directly?

As long as you can connect to the server, you can load the file into a temporary table.
For Microsoft SQL Server:
CREATE TABLE #csvtable
(
    firstCol varchar(50) NOT NULL,
    secondCol varchar(50) NOT NULL
)
BULK INSERT #csvtable FROM 'PathToCSVFile' WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')
GO
For MySQL:
CREATE TEMPORARY TABLE csvtable
(
    firstCol varchar(50) NOT NULL,
    secondCol varchar(50) NOT NULL
);

LOAD DATA INFILE 'PathToCSVFile'
INTO TABLE csvtable
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
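Once the data is loaded, the temporary table can be read and manipulated like any other table for the rest of the session, and it is dropped automatically when the connection closes. A minimal sketch, assuming the two-column layout above:
-- read and filter the imported rows
SELECT firstCol, secondCol
FROM csvtable
WHERE secondCol <> '';

-- or manipulate them in place
UPDATE csvtable
SET secondCol = UPPER(secondCol)
WHERE firstCol LIKE 'A%';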

I know this is tagged as MySQL, but I think the real question is that you just want to run SQL queries against a CSV file. If you are open to using Python, you can use FugueSQL to do that.
A sample Python snippet would be:
from fugue_sql import fsql
query = """
df = LOAD "/path/to/myfile.csv"
SELECT *
FROM df
WHERE col > 1
PRINT
"""
fsql(query).run()
This will use Pandas to run the query by default. There is also a SAVE keyword, so you can write the output to another file.
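As a rough, untested sketch of the SAVE keyword (the output path is a placeholder), the same query string could end with a SAVE instead of PRINT:
df = LOAD "/path/to/myfile.csv"
SELECT *
FROM df
WHERE col > 1
SAVE OVERWRITE "/path/to/output.csv"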

Related

Insert records from array in MySQL query

I have an array of 1,500,000 records, as below:
Array = ["2","3","6","7","A5057",......]
How can I insert all these records into a table (which has only one field, XXX_id) directly from MySQL? I tried the query below:
INSERT INTO TABLE_NAME (XXX_id) VALUES (["2","3","6","7","A5057",......]);
If we have to insert from a PHP script, no doubt we can follow this question from the community.
You can format the values like this:
INSERT INTO `TABLE_NAME`(`XXX_id`) VALUES (1),(2),(3),(4)
There are actually two ways you can do this
1. LOAD DATA INFILE
LOAD DATA LOCAL INFILE
'path/file.csv'
INTO TABLE giata_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
2. Use a multi-row INSERT
INSERT INTO `TABLE_NAME`(`XXX_id`) VALUES (1),(2),(3),(4)
You can use a multi-row insert to achieve what you want.
Just format your array as VALUES (X), (X), ... with PHP.
Put the array data into a file, with one ID per line. Then you can use LOAD DATA INFILE:
LOAD DATA INFILE 'filename'
INTO TABLE table_name (XXX_id)
Your query should be like this:
INSERT INTO TABLE_NAME (XXX_id) VALUES ('2'),('3'),('6'),('7'),('A5057'),......;
But you are going to insert a large number of rows with a single query, so you may exceed MySQL's query size limit. Every query is limited by max_allowed_packet.
1) Execute the following command in MySQL to view the default value of max_allowed_packet:
show variables like 'max_allowed_packet';
2) A standard MySQL installation has a default value of 1048576 bytes (1MB). This can be increased, for example to 500MB or even more:
SET GLOBAL max_allowed_packet=524288000;
3) Check the max_allowed_packet value again using the command from step 1).
Hope this helps you.
We can use LOAD DATA LOCAL INFILE:
LOAD DATA LOCAL INFILE
'path/file.csv'
INTO TABLE giata_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
This will load all the records into the table. Note that LOAD DATA maps fields by position, not by header name, so make sure the CSV columns are in the same order as the table columns; the header line itself will be imported unless you skip it (see the sketch below).
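If the first line of the CSV is a header row, one option is to skip it and map the remaining fields to columns explicitly; col1, col2, col3 below are placeholders for the real column names:
LOAD DATA LOCAL INFILE
'path/file.csv'
INTO TABLE giata_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(col1, col2, col3);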
If this didn't work for you, then you can follow @Plotisateur's and @Aiyaz khorajia's answer:
INSERT INTO `TABLE_NAME`(`XXX_id`) VALUES (1),(2),(3),(4)
$string = ' (" '.implode(' "),(" ',$array).' ") ';
$query = "INSERT INTO TABLE_NAME (XXX_id) VALUES {$string}";
If you have a large number of records, I would suggest inserting them in batches of 100-200:
$string = "";
for ($i = 0; $i < sizeof($array); $i++) {
    // quote each value, since some IDs (e.g. "A5057") are not numeric
    $value  = "('" . $array[$i] . "')";
    $string = $string == "" ? $value : $string . ',' . $value;
    if (($i + 1) % 100 == 0) {
        $query = "INSERT INTO TABLE_NAME (XXX_id) VALUES $string;";
        // execute the query
        $string = "";
    }
}
if ($string != "") {
    $query = "INSERT INTO TABLE_NAME (XXX_id) VALUES $string;";
    // execute the query
}
Do the following in your procedure to get rid of your problem:
-- temporary table to hold the split row values of the string
drop temporary table if exists temp_convert;
create temporary table temp_convert(split_data varchar(2056));

-- dynamic query to break the comma-separated string into rows and insert them into the column
set @sql = concat("
insert into temp_convert (split_data)
values ('", replace(
    (select group_concat(in_string) as converted_data), ",", "'),('"), "'
);"
);

prepare stmt1 from @sql;
execute stmt1;
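For illustration, if that snippet lives in a procedure with an IN parameter named in_string (the procedure name split_to_rows here is hypothetical), it could be used like this:
CALL split_to_rows('2,3,6,7,A5057');

-- then move the split values into the target table
INSERT INTO TABLE_NAME (XXX_id)
SELECT split_data FROM temp_convert;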

output hive query result as csv enclosed in quotes

I have to export data from a Hive table into a CSV file in which fields are enclosed in double quotes.
So far I am able to generate a CSV without quotes using the following query:
INSERT OVERWRITE DIRECTORY '/user/vikas/output'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
SELECT QUERY
The output generated looks like
1,Vikas Saxena,Banking,JL5
However, I need the output as
"1","Vikas Saxena","Banking","JL5"
I tried changing the query to
INSERT OVERWRITE DIRECTORY '/user/vikas/output'
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = ",",
"quoteChar" = "\"",
"escapeChar" = "\\"
)
SELECT QUERY
But it fails with the following error:
Error while compiling statement: FAILED: ParseException line 1:0 cannot recognize input near 'ROW' 'FORMAT' 'SERDE'
Create an external table:
CREATE EXTERNAL TABLE new_table(field1 type1, ...)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = ",",
"quoteChar" = "\""
)
STORED AS TEXTFILE
LOCATION '/user/vikas/output';
Then select into that table:
insert into new_table select * from original_table;
Your CSV is then on disk at /user/vikas/output
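If creating an external table is not an option, a hedged workaround is to build the quoted line yourself in the SELECT; the column names below are placeholders for whatever your query returns, and note that any double quotes inside the data would not be escaped:
INSERT OVERWRITE DIRECTORY '/user/vikas/output'
SELECT CONCAT('"', CAST(id AS STRING), '","', name, '","', department, '","', grade, '"')
FROM source_table;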

Transfer data from MySQL table to a different table

I have a database, let's simply call it 'db', on my computer, with a few tables that have multiple columns and data inside those tables.
I have a software using this database to store configuration elements and some other stuff.
Now, I am releasing a new version of my software, with only slight modifications in the database, i.e. some columns may have been added to tables, or removed (but no column renamed).
I must keep all data, so I would like to transfer it to the new "version" of my database.
What I thought of:
Rename 'db' into 'db_old'.
Install the new database as 'db_new', with the default values in the new columns.
For each table, get the list of all the columns from 'db_old' that are also present in 'db_new'.
Use an INSERT INTO ... SELECT to put the old data back into 'db_new' (see the sketch right after this list).
Drop the old db and use the new one.
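For a single table, that last INSERT step might look something like this (the column names are placeholders for the list produced in the previous step):
INSERT INTO db_new.configuration (id, setting_a, setting_b)
SELECT id, setting_a, setting_b
FROM db_old.configuration;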
Do you think it can work? Do you have an easier solution?
Also, I'm absolutely not an SQL expert... and I tried this (without yet checking whether a column has been removed):
SELECT
    GROUP_CONCAT(COLUMN_NAME SEPARATOR ',')
INTO @colList
FROM INFORMATION_SCHEMA.COLUMNS
WHERE
    TABLE_SCHEMA = 'db_old'
        AND TABLE_NAME = 'configuration';

INSERT INTO db_new.configuration (SELECT @colList)
SELECT @colList FROM db_old.configuration;
But it fails to replace the second @colList with the actual column list... Can you also help me with this issue?
Thank you everyone and have a nice day!
You should first take a dump of your database and create a .sql file. Depending on your data, this file can even run to several GB. This SQL file will contain all your tables and all the data inside those tables. I suggest you open the file and take a look at it.
Then you should use this newly created file to import all the data into the new DB. It will put all those tables and data into the new DB.
Here is how to do that. First create the SQL file:
mysqldump -h [ServerIpAddress] -u [UserName] -p[password] YourDbname > db_backup.sql
Use -h [ServerIpAddress] in the case of a remote server. If the database resides on your own system, you don't need it.
Then you should create your new DB, let's say DB_new. Once created, switch to it using the USE command:
use DB_new;
Once done, import the .sql file we created before, using the SOURCE command:
source YourSQLFilePath
In your case: source db_backup.sql
OK. If anyone ever encounters the same problem, here is the solution.
First, assume you have a database called 'myDatabase', with a table called 'myTable' that you want to "upgrade", i.e. you want to modify the table structure by adding/removing columns but keep the data inside.
The first step is to drop the foreign keys (if any) and to rename "myTable":
USE `myDatabase`;
ALTER TABLE `myTable` DROP FOREIGN KEY `my_fk_constraint`;
ALTER TABLE `myTable` RENAME TO `old_myTable`;
The second step is to import the new table structure, for example by using SOURCE:
SOURCE C:/new_table_structure.sql
The third step is optional, but you may need it if your table has a lot of columns:
USE `myDatabase`;
SET GLOBAL group_concat_max_len = 4294967295;
The fourth step is to store the following routine:
delimiter //

DROP PROCEDURE IF EXISTS updateConf//
CREATE PROCEDURE updateConf(IN dbName TEXT, IN old_table TEXT, IN new_table TEXT, IN primary_key_name TEXT)
BEGIN
    -- get column count in old table
    SELECT count(*)
    INTO @colNb
    FROM information_schema.COLUMNS
    WHERE TABLE_SCHEMA = dbName
    AND TABLE_NAME = old_table;

    -- get string with all column names from old_table
    SELECT GROUP_CONCAT(COLUMN_NAME)
    INTO @colNames1
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = dbName
    AND TABLE_NAME = old_table;

    SET @colNames1 = CONCAT(@colNames1, ',');

    -- get string with all column names from new_table
    SELECT GROUP_CONCAT(COLUMN_NAME)
    INTO @colNames2
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = dbName
    AND TABLE_NAME = new_table;

    -- variables initialization
    SET @cpt = 1; -- column number counter
    SET @pos = 1; -- position of column name first char
    SET @vir = 1; -- next comma position

    -- start of loop
    label: LOOP
        IF @cpt <= @colNb THEN
            SET @vir = LOCATE(',', @colNames1, @pos);                -- locate next comma
            SET @colName = SUBSTRING(@colNames1, @pos, @vir - @pos); -- get column name
            SET @pos = @vir + 1;                                     -- update next column position
            -- if column is in both tables
            IF FIND_IN_SET(@colName, @colNames2) AND @colName != primary_key_name THEN
                SET @execut = CONCAT("INSERT INTO ", new_table, " (", primary_key_name, ",", @colName, ") SELECT ", primary_key_name, ",", @colName, " FROM ", old_table, " ON DUPLICATE KEY UPDATE ", new_table, ".", @colName, " = ", old_table, ".", @colName);
                PREPARE stmt FROM @execut;
                EXECUTE stmt;
            END IF;
            SET @cpt = @cpt + 1; -- counter increment
        -- when all columns parsed
        ELSE
            LEAVE label; -- end of loop
        END IF;
    END LOOP label;
END //

delimiter ;
The final step is to call the procedure on your tables and then drop the renamed old table:
CALL updateConf( 'myDatabase', 'old_myTable', 'myTable', 'primaryKeyName' );
DROP TABLE `old_myTable`;
And voila! Just don't forget to put back the foreign keys you dropped :)
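For example, re-adding the dropped constraint could look like this (the column and referenced table are placeholders for whatever the original foreign key pointed to):
ALTER TABLE `myTable`
    ADD CONSTRAINT `my_fk_constraint`
    FOREIGN KEY (`other_table_id`) REFERENCES `otherTable` (`id`);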
It can surely be done in better ways, but I got this to work correctly.
Thank you everyone!

How to check the number of rows before exporting data to a CSV file from a MySQL table

The query below works fine for me to export data from a table to a CSV file, but I want to handle the case where the query returns no records: in that case 'filename.csv' should contain a 'no data found' message for users.
-- file name as timestamp
SET @fileName = DATE_FORMAT(NOW(),'%Y-%m-%d-%H:%i:%s');
SET @FOLDER = '/tmp/';
SET @EXT = '.csv';
SET @CMD = CONCAT("SELECT id,name,salary,salaryDate FROM emp1 where name ='some_name' INTO OUTFILE '"
    ,@FOLDER,@fileName,@EXT,
    "' FIELDS ENCLOSED BY '\"' TERMINATED BY ',' ESCAPED BY '\"'",
    " LINES TERMINATED BY '\n';");
PREPARE statement FROM @CMD;
EXECUTE statement;
Where do I need to change it? Can anyone help me?
You should create a stored procedure. Check the row count using the COUNT function, then output whichever result you need, for example:
CREATE PROCEDURE procedure1()
BEGIN
    IF (SELECT COUNT(*) FROM emp1 WHERE name = 'some_name') = 0 THEN
        SELECT 'no data found' INTO OUTFILE 'file_name.csv';
    ELSE
        -- your code here: the SELECT ... INTO OUTFILE from the question
    END IF;
END
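A fuller sketch of the procedure body, reusing the timestamped file name and the emp1 query from the question (an untested assumption of how the pieces fit together; INTO OUTFILE needs a literal path, hence the prepared statement in both branches):
SET @fileName = DATE_FORMAT(NOW(),'%Y-%m-%d-%H:%i:%s');
SET @outFile = CONCAT('/tmp/', @fileName, '.csv');

IF (SELECT COUNT(*) FROM emp1 WHERE name = 'some_name') = 0 THEN
    SET @CMD = CONCAT("SELECT 'no data found' INTO OUTFILE '", @outFile, "';");
ELSE
    SET @CMD = CONCAT("SELECT id,name,salary,salaryDate FROM emp1 WHERE name = 'some_name' INTO OUTFILE '",
        @outFile,
        "' FIELDS ENCLOSED BY '\"' TERMINATED BY ',' ESCAPED BY '\"'",
        " LINES TERMINATED BY '\n';");
END IF;

PREPARE statement FROM @CMD;
EXECUTE statement;
DEALLOCATE PREPARE statement;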

Bulk Insert with Named Field Parameter

Through my ASP.NET / SQL Server 2008 app I need to do a bulk insert of records from a CSV file (maybe a million records). I will import them into a staging table first, so I can manipulate some of the data before moving it to a permanent table.
This will happen on a regular basis, and multiple imports may happen simultaneously. I also have to be able to tell each import apart from the others.
My original plan was to use a column that had an Import_ID in it. But I see that Bulk Insert won't allow me to set a field value.
Doing a search, I see that I can do a Bulk Insert into a view. And I'm guessing that the view can have a named parameter (Import_ID). But I haven't really learned setting up parameters yet, so I don't know if this is possible, or how to do it.
Can someone please tell me how to do this, or let me know another solution?
Thanks
You could bulk insert into a temporary staging table. For example, since you know your SPID (and assuming you can trust the schema of some static table), you can say something like the following, specifying the @filepath for the CSV file and the @ImportID, creating a table with your session id as a suffix, and doing all your work in a single dynamic SQL batch:
DECLARE @sql NVARCHAR(MAX), @spid VARCHAR(12) = RTRIM(@@SPID);

SET @sql = N'SELECT * INTO dbo.Stage' + @spid
    + ' FROM dbo.RealStagingTable WHERE 1 = 0;';

SET @sql += N'BULK INSERT dbo.Stage' + @spid + ' FROM '''
    + @filepath + ''' WITH (options);';

SET @sql += N'INSERT dbo.RealTable(ImportID, other columns)
    SELECT ' + RTRIM(@ImportID) + ', other columns
    FROM dbo.Stage' + @spid + ';';

SET @sql += N'DROP TABLE dbo.Stage' + @spid + ';';

EXEC sp_executesql @sql;
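For the batch above to compile, @filepath and @ImportID have to exist in the same scope; one hedged way to wire that up (the values here are made up) is to declare them first, or pass them in as stored procedure parameters:
DECLARE @filepath NVARCHAR(260) = N'C:\imports\batch1.csv'; -- path to the uploaded CSV (placeholder)
DECLARE @ImportID INT = 42;                                 -- identifier for this particular import (placeholder)
-- ...then build and run the dynamic SQL exactly as shown above.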