Large table insertion issue in MySQL

I am a developer facing an issue while managing a table with a large number of records.
A cron job fills a primary table (TABLE A), which has 5-6 columns and roughly 400,000 to 500,000 rows, and then builds a second table from it; the data will continue to grow over time.
TABLE A contains the raw data and my output table is TABLE B.
My cron script truncates TABLE B and then repopulates it with an INSERT ... SELECT:
TRUNCATE TABLE_B;
INSERT INTO TABLE_B (field1, field2)
SELECT DISTINCT(t1.field2), t2.field2
FROM TABLE_A AS t1
INNER JOIN TABLE_A t2 ON t2.field1=t1.field1
WHERE t1.field2 <> t2.field2
GROUP BY t1.field2, t2.field2
ORDER BY COUNT(t1.field2) DESC;
The SELECT above produces roughly 150,000 to 200,000 rows.
Populating TABLE B now takes too long, and if my application tries to read TABLE B in the meantime, its SELECT queries fail.
EXPLAIN on the query returns the following:
'1','PRIMARY','T1','ALL','field1_index',NULL,NULL,NULL,'431743','Using temporary;Using filesort'
'1','PRIMARY','T2','ref','field1_index','field1_index','767','DBNAME.T1.field1','1','Using where'
Can someone please help me improve this process, or suggest alternatives to it?
Thanks
Suketu

You should do the whole process in a stored procedure.
Do not truncate such a large table. Instead, follow these steps (a SQL sketch follows the list):
Copy the TableB structure to TableB_Copy.
Disable the indexes on TableB_Copy.
Insert the data from TableA into TableB_Copy.
Re-enable the indexes on TableB_Copy.
Drop TableB and rename TableB_Copy to TableB.
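A minimal SQL sketch of these steps, using the table and column names from the question. Note that ALTER TABLE ... DISABLE KEYS only speeds up MyISAM non-unique indexes (it is a no-op on InnoDB), and the final swap uses a single atomic RENAME TABLE in place of a separate drop and rename, so readers never catch TABLE_B missing:
CREATE TABLE TABLE_B_COPY LIKE TABLE_B;
ALTER TABLE TABLE_B_COPY DISABLE KEYS;   -- MyISAM only; harmless elsewhere
INSERT INTO TABLE_B_COPY (field1, field2)
SELECT t1.field2, t2.field2
FROM TABLE_A AS t1
INNER JOIN TABLE_A AS t2 ON t2.field1 = t1.field1
WHERE t1.field2 <> t2.field2
GROUP BY t1.field2, t2.field2
ORDER BY COUNT(t1.field2) DESC;
ALTER TABLE TABLE_B_COPY ENABLE KEYS;    -- rebuild the disabled indexes in bulk
-- Both renames happen in one atomic operation, so TABLE_B stays available:
RENAME TABLE TABLE_B TO TABLE_B_OLD, TABLE_B_COPY TO TABLE_B;
DROP TABLE TABLE_B_OLD;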

In my view, the solution would be like this:
SELECT DISTINCT
    t1.field2, t2.field2
FROM
    TABLE_A AS t1
INNER JOIN
    TABLE_A t2 ON t2.field1 = t1.field1
WHERE
    t1.field2 <> t2.field2
GROUP BY
    t1.field2, t2.field2
ORDER BY
    COUNT(t1.field2) DESC
INTO OUTFILE 'PATH-TO-FILE';
For instance, "C:\TEMP\DATA1.SQL". This query writes the result set to a new tab-delimited file that can then be loaded into any table.
Now, to import the data into the table:
LOAD DATA INFILE 'PATH-TO-FILE'
INTO TABLE table_name;
With this query the data is inserted quickly, and in the meantime the table you are loading into stays available to your application.

Related

MySQL RDS Stored Procedure Update query is slow

I have an update query in the Stored Procedure that updates TABLE1 based on the IDs present in TABLE2. It is written using a subquery, as follows:
update TABLE1 A
set status = 'ABC'
where A.ID in (
select ID
from TABLE2 B
where B.SE_ID = V_ID
and B.LOAD_DT = V_DT
);
I have rewritten this using:
a JOIN
masking the subquery from the main query
a temp table and a join
The standalone UPDATE is fast, but placing it in the Stored Procedure makes it very slow.
TABLE1 needs to be updated with 2000 records, matching the 2000 IDs from TABLE2.
Can someone please help with this?
Avoid using subqueries in place of joins. MySQL optimizes this kind of subquery very poorly; it may re-run the subquery once per row, up to 2000 times here.
Use a join:
UPDATE TABLE1 A
INNER JOIN TABLE2 B
ON A.ID = B.ID
SET A.status = 'ABC'
WHERE B.SE_ID = V_ID
AND B.LOAD_DT = V_DT;
You'll want to create an index to optimize this.
ALTER TABLE TABLE2 ADD INDEX (SE_ID, LOAD_DT, ID);
No need to create an index on TABLE1, if my assumption is correct that its ID column is its primary key. That is itself an index.
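To confirm the optimizer actually picks the new index, you can run EXPLAIN on the statement (MySQL supports EXPLAIN for UPDATE from 5.6 on). A sketch using the names from the question, with literal placeholder values standing in for the V_ID and V_DT procedure variables:
EXPLAIN
UPDATE TABLE1 A
INNER JOIN TABLE2 B ON A.ID = B.ID
SET A.status = 'ABC'
WHERE B.SE_ID = 42                 -- placeholder for V_ID
  AND B.LOAD_DT = '2020-01-01';    -- placeholder for V_DT
The row for B should list the new (SE_ID, LOAD_DT, ID) index in the key column.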

Move data from one table to another in MySQL and drop the table

I want to write a MySQL query for the following scenario:
1. Check if a table (e.g. tableA) exists.
2. Check if there is data in the table.
3. If tableA exists and contains data, move all of its data to another table (e.g. tableB; tableB already exists in the db and both tables have the same structure).
4. Drop tableA.
Is it possible to write this as MySQL queries, avoiding a stored procedure?
I was able to do the first three with the query below, but the drop was not possible.
Hope it helps!
There are two tables: table_1 (old) and table_2 (new). Both have one column, "ID".
insert into table_2 (id)    # inserts into the new table
select case when
    (
        select count(*) from table_1 a    # checks if there are any records in the table
        where exists                      # the whole statement errors out if table_1 does not exist
            (select 1 from table_1 b where a.id = b.id)
    ) > 0 then id
    else 0 end
from table_1
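For step 4, plain SQL cannot branch on whether a table exists, but DROP TABLE IF EXISTS handles the drop without a stored procedure. A minimal two-statement sketch, assuming both tables really do share the same structure:
insert into tableB
select * from tableA;          # errors out (and aborts a script) if tableA is missing
drop table if exists tableA;   # safe to run even if tableA was never created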

Import CSV to Update rows in table

There are approximately 26K products (posts), and each product has meta values stored as meta_key / meta_value rows.
The post_id column is the product id in the db, and the _sku meta_key holds the unique id for each product.
I've received a new CSV file that updates all of the values (meta_value) for the _sale_price meta_key of each product. The CSV file has two columns:
SKU, Sale Price
How do I import this CSV to update only the _sale_price row of each product, matched via the post_id (product id) and _sku value?
I know how to do this in PHP by looping through the CSV and selecting and executing an update for each single product, but that seems inefficient.
Preferably this would be done with phpMyAdmin and LOAD DATA INFILE.
You can use a temporary table to hold the update data and then run a single UPDATE statement.
CREATE TEMPORARY TABLE temp_update_table (
    meta_key VARCHAR(255),
    meta_value VARCHAR(255)
);
LOAD DATA INFILE 'your_csv_pathname'
INTO TABLE temp_update_table
FIELDS TERMINATED BY ','
(meta_key, meta_value);
UPDATE `table`
INNER JOIN temp_update_table ON temp_update_table.meta_key = `table`.meta_key
SET `table`.meta_value = temp_update_table.meta_value;
DROP TEMPORARY TABLE temp_update_table;
If product_id is the unique column of that table, you can do the update using a CSV import:
Prepare a CSV file of the rows you want to import, with their unique IDs. The CSV columns must be in the same order as the table's columns; include every column and no header row.
Then in phpMyAdmin, open the table and click Import.
Select CSV in the drop-down of the Format field.
Make sure "Update data when duplicate keys found on import (add ON DUPLICATE KEY UPDATE)" is checked (the SQL this generates is sketched below).
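Behind that checkbox, phpMyAdmin emits INSERT ... ON DUPLICATE KEY UPDATE statements. A hand-written equivalent for illustration, using a hypothetical products(product_id, sale_price) table whose primary key is product_id:
INSERT INTO products (product_id, sale_price)
VALUES (101, 9.99), (102, 19.50)   -- hypothetical rows taken from the CSV
ON DUPLICATE KEY UPDATE sale_price = VALUES(sale_price);
Rows whose product_id already exists get their sale_price updated in place; new ids are inserted.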
You can import the new data into another table (table2). Then update your primary table (table1) using an UPDATE with a sub-select:
UPDATE table1 t1 set
sale_price = (select meta_value from table2 t2 where t2.post_id = t1.product_id)
WHERE
(select count(*) from table2 t2 where t1.product_id = t2.post_id) > 0
This is obviously a simplification and you will most likely need to constrain your query a little further.
Make sure to backup your full database before attempting. I recommend you work on a non-production database until the process works flawlessly.
It seems to me that rAndom69's answer does not work on PostgreSQL 12, but the join expressed with a WHERE clause does:
UPDATE tableA
SET fieldToPopulateInTableA = temp_update_table.fieldPopulated
FROM temp_update_table
WHERE tableA.correspondingField = temp_update_table.correspondingField

Why would a SQL MERGE have a duplicate key error, even with HOLDLOCK declared?

I could find a lot of information on SQL MERGE, but I can't seem to get it working for me. Here's what's happening.
Each day I'll be getting an Excel file uploaded to a web server with a few thousand records, each record containing 180 columns. These records contain both new information which would have to use INSERT, and updated information which will have to use UPDATE. To get the information to the database, I'm using C# to do a Bulk Copy to a temp SQL 2008 table. My plan was to then perform a Merge to get the information into the live table. The temp table doesn't have a Primary Key set, but the live table does. In the end, this is how my Merge statement would look:
MERGE Table1 WITH (HOLDLOCK) AS t1
USING (SELECT * FROM Table2) AS t2
ON t1.id = t2.id
WHEN MATCHED THEN
UPDATE SET t1.col1=t2.col1, t1.col2=t2.col2, ...t1.colx=t2.colx
WHEN NOT MATCHED BY TARGET THEN
INSERT (col1,col2,...colx)
VALUES(t2.col1,t2.col2,...t2.colx);
Even when including the HOLDLOCK, I still get the error Cannot insert duplicate key in object. From what I've read online, HOLDLOCK should allow SQL to read primary keys, but not perform any insert or update until after the task has been executed. I'm basically learning how to use MERGE on the fly, but is there something I have to enable for SQL 2008 to pick up on MERGE Locks?
I found a way around the problem and wanted to post the answer here, in case it helps anyone else. It looks like MERGE wouldn't work for what I needed since the temporary table being used had duplicate records that would be used as a Primary Key in the live table. The solution I came up with was to create the below stored procedure.
-- Start with the insert
INSERT INTO LiveTable (A, B, C, D, id)
SELECT A, B, C, D, id
FROM (
    -- Filter the rows so each id appears only once
    SELECT A, B, C, D, id,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS row_number
    FROM TempTable
    WHERE NOT EXISTS (
        SELECT id FROM LiveTable WHERE LiveTable.id = TempTable.id
    )
) AS ROWS
WHERE row_number = 1;

-- Continue with the update
-- Covers the ids skipped during the insert
UPDATE L
SET
    L.A = T.A,
    L.B = T.B,
    L.C = T.C,
    L.D = T.D
FROM LiveTable L
INNER JOIN TempTable T
    ON L.id = T.id;
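An alternative that keeps the original MERGE is to deduplicate the source inside the USING clause, since the duplicate ids in the temp table were the root cause of the key violation. A sketch using the same hypothetical column names as the stored procedure above:
MERGE LiveTable WITH (HOLDLOCK) AS t1
USING (
    SELECT A, B, C, D, id
    FROM (
        SELECT A, B, C, D, id,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS rn
        FROM TempTable
    ) s
    WHERE rn = 1    -- one row per id, so the INSERT branch cannot collide
) AS t2
ON t1.id = t2.id
WHEN MATCHED THEN
    UPDATE SET t1.A = t2.A, t1.B = t2.B, t1.C = t2.C, t1.D = t2.D
WHEN NOT MATCHED BY TARGET THEN
    INSERT (A, B, C, D, id)
    VALUES (t2.A, t2.B, t2.C, t2.D, t2.id);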

Transferring all data from one table to another in the same database

I am trying to move a complete row of records, including a BLOB column, from one table (TABLE1) to another table (TABLE2) in the same database using SQL. I have tried what little I know, but it failed.
What I used, which failed, is:
INSERT INTO TABLE2
SELECT * FROM TABLE1
WHERE
staff_id = '0010002';
How should I do this instead?
Try listing the columns explicitly; INSERT INTO ... SELECT * fails when the two tables' column lists don't line up exactly (for example, a different column order or an extra auto-increment key):
INSERT INTO TABLE2 (colname1, colname2)
SELECT colename1, colename2
FROM TABLE1
WHERE staff_id = '0010002'
For more details, see http://dev.mysql.com/doc/refman/5.0/en/insert-select.html