Import CSV Pulling One Column Field from Existing Table - mysql

I'm learning MySQL and PHP (running XAMPP and also using HeidiSQL), but I have a live project at work where I'm trying to use them instead of the gazillion spreadsheets the information currently lives in.
I want to import 1,000+ rows into a table (tbl_searches) where one of the columns is a string (contract_no). Information required by tbl_searches that is not in the spreadsheet includes search_id (the PK, which is AUTO_INCREMENT) and contract_id. So the only field I am really missing is contract_id. I have a table (tbl_contracts) that contains contract_id and contract_no. So I think I can have the import use the string contract_no to look up the contract_id for that contract_no in that table, but I don't know how.
[EDIT] I forgot to mention I have successfully imported the info using HeidiSQL after I exported tbl_contracts to Excel and then used the Excel VLOOKUP function, but that ended up yielding incorrect data somehow.

You can do it like this:
LOAD DATA LOCAL INFILE '/path/to/your/file.csv'
INTO TABLE table1
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n' -- or '\r\n' if the file was prepared on Windows
(@field1, @contract_no, @field2, @field3, ...)
SET column1 = @field1,
    contract_id = (SELECT contract_id
                     FROM tbl_contracts
                    WHERE contract_no = @contract_no
                    LIMIT 1),
    column2 = @field2,
    column3 = @field3
    ...
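If you want to double-check the VLOOKUP result the question mentions, the same lookup can be reproduced in a short script before importing. A minimal Python sketch; the inline CSV strings are made-up stand-ins for an export of tbl_contracts and the search spreadsheet, which in practice would be files on disk:

```python
import csv
import io

# Hypothetical stand-ins for an export of tbl_contracts and the search
# spreadsheet; in practice these would be files rather than inline strings.
contracts_csv = "contract_id,contract_no\n1,HOA-100\n2,HOA-200\n"
searches_csv = "contract_no,notes\nHOA-200,first call\nHOA-100,letter sent\n"

# Build a contract_no -> contract_id map (the same job VLOOKUP was doing).
lookup = {r["contract_no"]: r["contract_id"]
          for r in csv.DictReader(io.StringIO(contracts_csv))}

# Annotate each search row with its contract_id before importing.
out = io.StringIO()
reader = csv.DictReader(io.StringIO(searches_csv))
writer = csv.DictWriter(out, fieldnames=reader.fieldnames + ["contract_id"])
writer.writeheader()
for row in reader:
    row["contract_id"] = lookup.get(row["contract_no"], "")  # blank if no match
    writer.writerow(row)

annotated = out.getvalue()
```

Rows whose contract_no has no match come out with a blank contract_id, which makes lookup failures easy to spot before the data reaches the database.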

try something like this (I am assuming that you have data in tbl_contracts):
<?php
$handle = fopen("data_for_table_searches.csv", "r");
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) { // read one CSV row at a time
    // whatever the equivalent is in HeidiSQL, to get the contract id
    $contract_id = query("SELECT contract_id FROM tbl_contracts WHERE contract_no = '" . $data[<row for contract number>] . "'");
    // whatever the equivalent is in HeidiSQL, to insert the data, including the contract id, into tbl_searches
    query("INSERT INTO tbl_searches VALUES ($contract_id, $data[0], $data[1], $data[2], ...)");
}
fclose($handle);
?>

Thanks for everyone's input. peterm's guidance helped me get the data imported. Rahul, I should have mentioned that I was not using PHP for this task, but rather just trying to get the data into the tables using HeidiSQL. user4035 asked for more detail and so that's here too.
I have three tables in the database.
tbl_status has two fields, status_ID (AUTO_INCREMENT) and status_name.
tbl_contracts has two columns, contract_ID (AUTO_INCREMENT) and contract_no (a string).
The last table (tbl_searches) will be the active(?) table in that this is where the users' actions will be recorded.
The first two of these tables were easily populated. tbl_status has 11 rows that will describe the status of the contract and these were just typed into an Excel spreadsheet and imported via CSV through HeidiSQL.
For the second table I had 1,000+ "contracts" to import and so I left the first column in Excel blank and the second column containing the string of the contract and imported them the same way.
The third table has seven fields: search_id (AUTO_INCREMENT), contract_id, contract_no, status_id, notes, initials and search_date (I forgot about that one until just now).
I wanted to insert the spreadsheet that had the search information on it into tbl_searches. It has the contract_no, but not the contract_id. I needed to insert the rows and have the query grab the contract_id from tbl_contracts. It took me a bit to get it right without errors and some unexpected results. (The following query omits the need for search_date.)
LOAD DATA LOCAL INFILE '\\\\PATH\\PATH\\PATH\\PATH\\FILENAME.csv'
INTO TABLE `hoa_work`.`tbl_searches`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"' LINES TERMINATED BY '\r\n'
IGNORE 1 LINES -- because the first row of the CSV has column headers
(@search_id, @contract_id, @contract_no, @status_id, @notes, @initials)
SET
search_id = NULL, -- it is an AUTO_INCREMENT field
contract_id = (SELECT contract_id
                 FROM tbl_contracts
                WHERE contract_no = @contract_no
                LIMIT 1),
contract_no = @contract_no,
status_id = @status_id,
notes = @notes,
initials = @initials;
/* Affected rows: 1,011  Found rows: 0  Warnings: 0  Duration for 1 query: 0.406 sec. */
I learned here that the @blah tokens are user variables. If I run the following query it will show me the value currently assigned to the variable. Since I was inserting 1,000+ rows from the CSV file, it gave me the value from the last row that was inserted.
SELECT @contract_no;
If you have any suggested improvements on the way I ultimately wrote the query please do tell me.
-Matt

Related

Create a Pandas table via string matching

I have a 3-column table where two columns are URLs and one column is a string that might be contained in the URLs. The first 100,000 rows can be found at this link:
https://raw.githubusercontent.com/Slusks/LeagueDataAnalysis/main/short_urls.csv
In theory, values in eurl and surl should be the same, and for every value of each, there should be a gameid that matches both, i.e.:
https://datapoint/Identity1/foobar.com | Identity1 | https://datapoint/Identity1/foobar.com
I've tried some SQL queries on the data and can't get them to line up:
SELECT *
FROM table
WHERE eurl = surl;
since the values started out in different tables, I also tried joining on table1.url = table2.url and that hasn't worked either. It just shows up blank:
SELECT s.url, e.gameid
FROM elixerdata e
JOIN scrapeddata s ON e.url = s.url;
I'm trying to get the gameids to match up to the surl column, using the eurl column as validation to confirm that it worked correctly. I'm probably not providing enough code or steps to get good feedback, but I figured I might as well ask since I am low on ideas myself.
EDIT1:
I cleaned the quotes off by loading the table into python and then re-writing it to a csv with pandas. The data in the csv appears to not have any quotes, then I load it into SQL with the following:
drop table if exists urltable;
create table urltable(
eurl varchar(255),
gameid varchar(20),
surl varchar(255));
LOAD DATA LOCAL INFILE 'csvfile.csv' into table urltable
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
When I read the table in MySQL Workbench there are no quotes, but if I export that table back to a csv, all the quotes are back, only for the surl column though.
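Since the cleanup is already going through Python, the leftover quotes can also be stripped from that one column before the load. A hedged sketch; the column name surl comes from the question, and the doubled quotes in the made-up sample reproduce the symptom of literal quote characters surviving inside a field:

```python
import csv
import io

# Made-up sample reproducing the symptom: the surl values come back wrapped
# in literal quote characters (doubled quotes inside a quoted CSV field).
raw = ('eurl,gameid,surl\n'
       'http://a/x,G1,"""http://a/x"""\n'
       'http://b/y,G2,"""http://b/y"""\n')

cleaned_rows = []
for row in csv.DictReader(io.StringIO(raw)):
    row["surl"] = row["surl"].strip('"')  # drop the leftover quote characters
    cleaned_rows.append(row)
```

After this, joining on eurl = surl has a chance of matching, since both columns hold the bare URL text.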

Multiple input file loading to single mysql table

I am kind of new here. I have been searching for 2 days with no luck, so I am posting my question here. Simply put, I need to load data into a table in MySQL. The thing is, the input data for this table will come from two different sources.
For example, below is how the two input files will look.
Input_file1
Field Cust_ID1, Acct_ID1, MODIFIED, Cust_Name, Acct_name, Comp_name1, Add2, Comp_name2, Add4
Sample value C1001, A1001, XXXXXX, JACK, TIM KFC, SINGAPORE, YUM BRAND, SINGAPORE
Input_file2
Field ID, MODIFIEDBY, Ref_id, Sys_id
Sample value 3001, TONY, 4001, 5001
Sorry, I was not able to copy the data as it is in Excel, so I improvised. The ',' is to show separate values. "Field" specifies the column name and its corresponding value is under "Sample value".
And the table that the above data needs to be loaded into is as such
Sample _table_structure
ID
Cust_ID1
Acct_ID1
Ref_id
Sys_id
MODIFIED
MODIFIEDBY
Cust_Name
Acct_name
Comp_name1
Add2
Comp_name2
Add4
What I need to do is load data into this table from the input data that comes to me, in one single go. Is this possible? As you can see, the column order does not match either, so I cannot simply append the files and load them. That is the main issue for me.
And no, changing the input sequence is not an option. The data is huge, so that would take too much effort. I would appreciate any help with this. I would also like to know if we could use a shell or Perl script to do this.
Thanks in advance for the help & time.
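On the scripting question: yes, a short script can remap both files to the table's column order before a single import. A minimal Python sketch; the field names are taken from the question, the sample values are made up, and in practice the two inputs would be files rather than inline strings:

```python
import csv
import io

# Target column order of the table (taken from the question).
table_cols = ["ID", "Cust_ID1", "Acct_ID1", "Ref_id", "Sys_id", "MODIFIED",
              "MODIFIEDBY", "Cust_Name", "Acct_name", "Comp_name1", "Add2",
              "Comp_name2", "Add4"]

# The two input files, each with its own column order (inlined for illustration).
file1 = ("Cust_ID1,Acct_ID1,MODIFIED,Cust_Name,Acct_name,Comp_name1,Add2,Comp_name2,Add4\n"
         "C1001,A1001,XXXXXX,JACK,TIM,KFC,SINGAPORE,YUM BRAND,SINGAPORE\n")
file2 = ("ID,MODIFIEDBY,Ref_id,Sys_id\n"
         "3001,TONY,4001,5001\n")

def reorder(src_text):
    """Yield rows remapped to the table's column order; missing columns stay empty."""
    for row in csv.DictReader(io.StringIO(src_text)):
        yield [row.get(col, "") for col in table_cols]

# Both files end up as rows with one consistent column order, ready to import.
rows = list(reorder(file1)) + list(reorder(file2))
```

The output could then be written back out as one CSV and loaded with a single LOAD DATA statement, since every row now follows the table's column order.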
load data local infile 'c:\\temp\\file.csv'
into table table_name fields terminated by ',' lines terminated by '\r\n' ignore 1 lines
(@col1,@col2,@col3,@col4,@col5,@col6,@col7,@col8,@col9)
set Cust_ID1 = @col1,
    Acct_ID1 = @col2,
    MODIFIED = @col3,
    Cust_Name = @col4, ...;

load data local infile 'c:\\temp\\file2.csv'
into table table_name fields terminated by ',' lines terminated by '\r\n' ignore 1 lines
(@col1,@col2,@col3,@col4) -- number the user variables to match the columns in that file
set ID = @col1,
    MODIFIEDBY = @col2,
    Ref_id = @col3,
    Sys_id = @col4;

This way you can import each file into the table.
Note: please save the Excel files in CSV format before importing them.

mysql infile can't access @user variable in sub query

I am trying to read a CSV file and insert rows into a table. I am able to insert them without any problem when I assign the value directly. But the same SQL stops working when I try to use a @user variable.
All help is appreciated.
This works:
LOAD DATA INFILE 'tmp/test.csv'
INTO TABLE T1
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
( @var1 )
set create_timestamp = now(),
col1 = ( select max(id) from T2 where
col2 = 1234 );
This doesn't work:
LOAD DATA INFILE 'tmp/test.csv'
INTO TABLE T1
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
( @var1 )
set create_timestamp = now(),
col1 = ( select max(id) from T2 where
col2 = @var1 );
I suspect that the subquery (select max(id) from T2 ...) is being executed before a value is assigned to @var1.
I'm actually surprised that the first example works at all; I've never tried running a subquery in a LOAD DATA.
When you say "This doesn't work" do you mean that the LOAD DATA throws an error? Or do you mean that the LOAD DATA completes successfully, but with unexpected results?
In the latter case, I'd recommend a simple test: do a separate SET @var1 = 1234; statement before the LOAD DATA, and see what happens for the first row.
If it's throwing an error, it may be that a subquery isn't supported in that context. (We'd need to consult the MySQL Reference Manual, to see if that is supported.)
Those are my guesses: 1) unexpected order of operations (@var1 is being evaluated before the value from the row is assigned to @var1), or 2) a subquery isn't valid in that context.
EDIT
According to the MySQL Reference Manual 5.7
http://dev.mysql.com/doc/refman/5.7/en/load-data.html
It looks like a subquery is supported in the context of a LOAD DATA statement.
I'm having the same problems. Part of my problem was that, using your table definition as an example, T1.col1 is set to NOT NULL, but the expression "select max(id) from T2 where col2 = @var1" returned nothing, and so MySQL attempted to assign a null value to col1. That didn't solve everything, but it was a start.
Here's what I'm working with:
LOAD DATA LOCAL INFILE 'streets.csv'
INTO TABLE streets
CHARACTER SET latin1
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(pre_dir, name, suffix, post_dir, @community, @state)
SET community_id = (SELECT community_id FROM communities_v WHERE community_name = @community AND state_abbr = @state);
EDIT: I had originally put a bunch of other stuff here that I thought was related to the problem, but as it turns out, it wasn't. It really was a data problem. The subquery I'm using had certain combinations of @community and @state that had no value for community_id, so I would double-check your own subquery to see if it returns a value in all cases.

mysql insert update LOAD DATA LOCAL INFILE

I am using LOAD DATA LOCAL INFILE to load data into a temp table, mid. Then I use an update query to update matching rows in the products table. The only matching field in both is the model.
$q = "LOAD DATA LOCAL INFILE 'Mid.csv' INTO TABLE mid
FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 1 LINES
(@col1,@col2,@col3,@col4,@col5,@col6) set model=@col1, price=@col3, stock=@col6";
mysql_query($q, $db);
mysql_query('UPDATE mid m, products p set p.products_price= m.price,p.products_quantity= m.stock where p.products_model= m.model');
It works and updates the products table. The issue I am having is that there are new records in the mid table which don't get inserted, since I am only using the update statement.
I have looked at INSERT ... ON DUPLICATE KEY UPDATE. I have seen loads of examples where it works on one table, but none where I have to match against another table.
Either I am searching for the wrong thing or there is another way to do this.
I would appreciate any help.
Regards,
naf
I'm not sure what the other columns in the product table are, but here's a basic approach that should work for you based on the 3 columns in your example, assuming the products_model column is unique in the products table:
insert into products (products_price,products_quantity,products_model)
select price, stock, model
from mid
on duplicate key update
products_price = values(products_price),
products_quantity = values(products_quantity)

Import CSV to Update only one column in table

I have a table that looks like this:
products
--------
id, product, sku, department, quantity
There are approximately 800,000 entries in this table. I have received a new CSV file that updates all of the quantities of each product, for example:
productA, 12
productB, 71
productC, 92
So there are approximately 750,000 updates (50,000 products had no change in quantity).
My question is, how do I import this CSV to update only the quantity based on the product (which is unique), but leave the sku, department, and other fields alone? I know how to do this in PHP by looping through the CSV and executing an update for each line, but this seems inefficient.
You can use LOAD DATA INFILE to bulk load the 800,000 rows of data into a temporary table, then use multiple-table UPDATE syntax to join your existing table to the temporary table and update the quantity values.
For example:
CREATE TEMPORARY TABLE your_temp_table LIKE your_table;
LOAD DATA INFILE '/tmp/your_file.csv'
INTO TABLE your_temp_table
FIELDS TERMINATED BY ','
(id, product, sku, department, quantity);
UPDATE your_table
INNER JOIN your_temp_table on your_temp_table.id = your_table.id
SET your_table.quantity = your_temp_table.quantity;
DROP TEMPORARY TABLE your_temp_table;
I would load the update data into a separate table UPDATE_TABLE and perform an update within MySQL using:
UPDATE PRODUCTS P SET P.QUANTITY=(
SELECT UPDATE_QUANTITY
FROM UPDATE_TABLE
WHERE UPDATE_PRODUCT=P.PRODUCT
)
I don't have a MySQL instance at hand right now, so I can't check the syntax perfectly; you might need to add a LIMIT 0,1 to the inner SELECT.
The answer from @ike-walker is indeed correct, but also remember to double-check how your CSV data is formatted. Frequently, for example, CSV files can have string fields enclosed in double quotes ("), and lines ending with \r\n if the file was prepared on Windows.
By default it is assumed that no enclosing character is used and the line ending is \n.
More info and examples here https://mariadb.com/kb/en/importing-data-into-mariadb/
This can be fixed by using additional options for FIELDS and LINES
CREATE TEMPORARY TABLE your_temp_table LIKE your_table;
LOAD DATA INFILE '/tmp/your_file.csv'
INTO TABLE your_temp_table
FIELDS
TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' -- new option
LINES TERMINATED BY '\r\n' -- new option
(id, product, sku, department, quantity);
UPDATE your_table
INNER JOIN your_temp_table on your_temp_table.id = your_table.id
SET your_table.quantity = your_temp_table.quantity;
DROP TEMPORARY TABLE your_temp_table;
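If you are unsure which delimiter and quoting the file actually uses, Python's csv.Sniffer can inspect a sample before you write the FIELDS and LINES options. A small sketch; the sample data is made up to mimic a quoted, Windows-style export:

```python
import csv
import io

# Made-up sample in the problematic style: quoted fields, Windows line endings.
sample = ('"id","product","sku","department","quantity"\r\n'
          '"1","productA","S-1","toys","12"\r\n'
          '"2","productB","S-2","toys","71"\r\n')

# Sniff the dialect from the sample text.
dialect = csv.Sniffer().sniff(sample)

# These tell you which options the LOAD DATA statement needs.
delimiter = dialect.delimiter  # ','
quotechar = dialect.quotechar  # '"'
```

Here the sniffed delimiter maps to FIELDS TERMINATED BY and the quote character to OPTIONALLY ENCLOSED BY; the \r\n endings would still need to be checked separately (for example with a hex viewer), since the sniffer does not reliably report line terminators.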