SQL Server 2008 bulk update using stored procedure - sql-server-2008

I have 2 tables in the DB, each with a "Name" column and a "Count" column.
I would like to update the "Count" column in the second table from the "Count" in the first table, but only where the "Name" columns are equal.
Example:
First table:
Name Count
jack 25
mike 44
Second table:
Name Count
jack 23
mike 9
david 88
Result (the second table should look like this):
Name Count
jack 25
mike 44
david 88
NOTES:
1. Both tables are huge (although the second table is bigger).
2. The update must be as fast as possible. (If there are options other than stored procedures, I would love to hear them.)
3. "Count" is defined as bigint, while "Name" is nvarchar(100).
4. The "Count" field in the first table is always bigger than the equivalent field in the second table.
I think there are more options (other than a stored procedure), maybe with MERGE or a TRANSACTION, as long as it is the fastest way...
Thanks!

The best way would be to keep it simple:
UPDATE Table2
SET Count = Table1.Count
FROM Table1
WHERE Table2.Name = Table1.Name
AND Table2.Count <> Table1.Count
If the performance of this query is not satisfactory due to the size of your tables, the best solution would be to partition the tables on the Name field. The query can then be run from different threads at the same time, with an extra filter on Name to match the partition function.
For example (assuming Name is a varchar(20) column):
UPDATE Table2
SET Count = Table1.Count
FROM Table1
WHERE Table2.Name = Table1.Name
AND Table2.Count <> Table1.Count
AND Table2.Name BETWEEN CAST('Jack' AS varchar(20))
AND CAST('Mike' AS varchar(20))
(The CAST on the string literals helps SQL Server do partition elimination properly.)
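To sanity-check the update-only-changed-rows pattern outside SQL Server, here is a small sketch using SQLite via Python's sqlite3. Table and column names are simplified (cnt stands in for Count), and a portable correlated subquery replaces the T-SQL UPDATE ... FROM syntax:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Table1 (name TEXT PRIMARY KEY, cnt INTEGER);
CREATE TABLE Table2 (name TEXT PRIMARY KEY, cnt INTEGER);
INSERT INTO Table1 VALUES ('jack', 25), ('mike', 44);
INSERT INTO Table2 VALUES ('jack', 23), ('mike', 9), ('david', 88);
""")

# Pull cnt from Table1 into Table2 where names match,
# touching only rows whose cnt actually differs.
con.execute("""
UPDATE Table2
SET cnt = (SELECT cnt FROM Table1 WHERE Table1.name = Table2.name)
WHERE EXISTS (
    SELECT 1 FROM Table1
    WHERE Table1.name = Table2.name AND Table1.cnt <> Table2.cnt
)
""")
print(con.execute("SELECT name, cnt FROM Table2 ORDER BY name").fetchall())
# → [('david', 88), ('jack', 25), ('mike', 44)]
```

The EXISTS filter plays the same role as the `Table2.Count <> Table1.Count` predicate above: rows that already hold the right value are never written.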

Related

mysql stored procedure, select max value and insert the value and assign to variable

I am porting an MSSQL stored procedure to MySQL. I have a stored procedure that does this:
1. Get the last value in a table by SELECT MAX.
2. Add the new value to the table (along with other record values).
3. Get the last value and store it in a variable for other processing.
So far I have the following:
DECLARE lastSeq INT DEFAULT 0;
DECLARE newSeq INT DEFAULT 0;
SELECT MAX(seq) INTO lastSeq FROM mytable;
SET newSeq = lastSeq + 1;
INSERT INTO mytable (seq, value1, value2, value3) VALUES (newSeq, 1, 2, 3);
Unfortunately this is NOT thread safe: if I SELECT MAX(seq) and another thread inserts a new record before I reach the INSERT, the value is already wrong.
In MSSQL I did this by locking the table while querying MAX(seq).
BUT
MySQL does not allow locking tables in a stored procedure, so I cannot directly port that approach.
I haven't had any luck finding a solution while searching; maybe I am not putting the right keywords into the search.
How can I do this in MySQL, thread safe, inside a stored procedure?
Update: I cannot use auto_increment for this column, as it is not unique and we allow duplicates. Maybe my sample is misleading, since I used "seq", which sounds like it should be auto increment, but in my real code I use it for a different column that allows duplication.
Example:
record_id  userid  name  seq  status
1          1       adam  1    A
2          1       adam  2    C
3          2       bob   1    C
In the records above, we have 2 rows for Adam, but only one is valid: the one with status "C" (current); "A" marks an archived or old value.
So my table has 2 valid records.
This is a bit long for a comment.
So change the column to be auto_increment. There is no need to re-implement this logic in a trigger when the database basically does it for you.
If you are concerned about gaps, then just use row_number() over (order by seq) when you query the table.
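If the per-user sequence really cannot be an auto_increment column, another common workaround (my suggestion, not from the answer above) is to compute MAX(seq) + 1 and insert it in a single INSERT ... SELECT statement, so there is no window between the read and the write. A minimal sketch in SQLite via Python's sqlite3 (in MySQL/InnoDB you would also want an index on (userid, seq) and an appropriate isolation level):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE mytable (
    record_id INTEGER PRIMARY KEY AUTOINCREMENT,
    userid INTEGER,
    seq INTEGER
)
""")

def insert_with_next_seq(con, userid):
    # A single INSERT ... SELECT computes MAX(seq)+1 for this user
    # and inserts it in one statement -- no separate read step.
    with con:  # wraps the statement in a transaction
        con.execute("""
            INSERT INTO mytable (userid, seq)
            SELECT ?, COALESCE(MAX(seq), 0) + 1 FROM mytable WHERE userid = ?
        """, (userid, userid))

insert_with_next_seq(con, 1)
insert_with_next_seq(con, 1)
insert_with_next_seq(con, 2)
print(con.execute("SELECT userid, seq FROM mytable ORDER BY record_id").fetchall())
# → [(1, 1), (1, 2), (2, 1)]
```

COALESCE handles the first insert for a user, when MAX(seq) is NULL.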

How to update millions of records in MySql?

I have two tables, tableA and tableB. tableA has 2 million records and tableB has over 10 million records. tableA has more than thirty columns, whereas tableB has only two. I need to update a column in tableA from tableB by joining both tables.
UPDATE tableA a
INNER JOIN tableB b ON a.colA=b.colA
SET a.colB= b.colB
colA in both tables has been indexed.
Now when I execute the query it takes hours. Honestly, I have never seen it complete; the longest I have waited is 5 hours. Is there any way to complete this query within 20-30 minutes? What approach should I take?
EXPLAIN on SQL Query
id | select_type | table | type | possible_keys | key       | key_len | ref        | rows    | Extra
1  | SIMPLE      | a     | ALL  | INDX_DESC     | NULL      | NULL    | NULL       | 2392270 | Using where
1  | SIMPLE      | b     | ref  | indx_desc     | indx_desc | 133     | cis.a.desc | 1       | Using where
Your UPDATE operation is performing a single transaction on ten million rows of a large table. (The DBMS holds enough data to roll back the entire UPDATE query if it does not complete for any reason.) A transaction of that size is slow for your server to handle.
When you process entire tables, the operation can't use indexes as well as it can when it has highly selective WHERE clauses.
A few things to try:
1) Don't update rows unless they need it. Skip the rows that already have the correct value. If most rows already have the correct value this will make your update much faster.
UPDATE tableA a
INNER JOIN tableB b ON a.colA=b.colA
SET a.colB = b.colB
WHERE a.colB <> b.colB
2) Do the update in chunks of a few thousand rows, and repeat the update operation until the whole table is updated. I guess tableA contains an id column. You can use it to organize the chunks of rows to update.
UPDATE tableA a
INNER JOIN tableB b ON a.colA = b.colA
SET a.colB = b.colB
WHERE a.id IN (
    SELECT id FROM (
        SELECT a2.id
        FROM tableA a2
        INNER JOIN tableB b2 ON a2.colA = b2.colA
        WHERE a2.colB <> b2.colB
        LIMIT 5000
    ) AS chunk
)
The subquery finds the id values of 5000 rows that haven't yet been updated, and the UPDATE query updates them. (The extra derived table is needed because MySQL allows neither LIMIT directly inside an IN subquery nor selecting from the table being updated; wrapping the SELECT materializes it first.) Repeat this query until it changes no rows, and you're done. This makes things faster because the server only has to handle smaller transactions.
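The chunked loop can be sketched end-to-end. This is a SQLite illustration via Python's sqlite3 (a correlated subquery stands in for MySQL's UPDATE ... JOIN; table and column names follow the question, and the row counts are toy-sized), repeating until an update touches no rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tableA (id INTEGER PRIMARY KEY, colA INTEGER, colB INTEGER);
CREATE TABLE tableB (colA INTEGER PRIMARY KEY, colB INTEGER);
""")
# 2000 rows in tableA (colB starts wrong), 50 lookup rows in tableB
con.executemany("INSERT INTO tableA VALUES (?, ?, ?)",
                [(i, i % 50, 0) for i in range(1, 2001)])
con.executemany("INSERT INTO tableB VALUES (?, ?)",
                [(k, k * 10) for k in range(50)])

CHUNK = 500
while True:
    with con:  # each chunk is its own small transaction
        cur = con.execute("""
            UPDATE tableA
            SET colB = (SELECT colB FROM tableB WHERE tableB.colA = tableA.colA)
            WHERE id IN (
                SELECT a.id FROM tableA a
                JOIN tableB b ON a.colA = b.colA
                WHERE a.colB <> b.colB
                LIMIT ?
            )
        """, (CHUNK,))
    if cur.rowcount == 0:  # nothing left to fix -> done
        break
```

Each pass commits at most CHUNK rows, so no single transaction has to cover the whole table.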
3) Don't do the update at all. Instead, whenever you need to retrieve your colB value, simply join to tableB in your select query.
Chunking is the right way to go. However, chunk on the PRIMARY KEY of tableA.
I suggest only 1000 rows at a time.
Follow the tips given here
Did you say that the PK of tableA is a varchar? No problem. See the second flavor of code in that link; it uses ORDER BY id LIMIT 1000,1 to find the end of the next chunk, regardless of the datatype of id (the PK).
You could also do this with a cron job.
Process: add one more field to tableA, for example is_update, with a default value of 0, and set the cron job to run every minute. Each run picks the next 10000 records having is_update = 0, updates them, and sets is_update = 1; the next run picks the next 10000 with is_update = 0, and so on.
Hope this helps.
For updating around 70 million records of a single MySQL table, I wrote a stored procedure to update the table in chunks of 5000. It took approximately 3 hours to complete.
DELIMITER $$
DROP PROCEDURE IF EXISTS update_multiple_example_proc$$
CREATE PROCEDURE update_multiple_example_proc()
BEGIN
    DECLARE x BIGINT;
    SET x = 1;
    WHILE x <= <MAX_PRIMARY_KEY_TO_REACH> DO
        UPDATE tableA A
        JOIN tableB B ON A.col1 = B.col1
        SET A.col2_to_be_updated = B.col2_to_be_updated
        WHERE A.id BETWEEN x AND x + 4999;
        SET x = x + 5000;
    END WHILE;
END$$
DELIMITER ;
Look at the oak-chunk-update tool. It is one of the best tools if you want to update billions of rows, too ;)

improve a SELECT SQL query

My data scheme is really simple; let's say it's about farms:
- tableA is the main one, with an important field "is_active" indicating the farm is trusted (kind of)
- tableB is a data store of serialized arrays of statistics about farms
I want to retrieve all data about active farms, so I just do something like this:
SELECT * FROM tableA LEFT JOIN tableB ON id_tableA=id_tableB WHERE is_active=1 ORDER BY id_tableA DESC;
Right now the query takes 15 sec to execute straight from a SQL shell. For example, if I want to retrieve all data from tableB, like:
SELECT * FROM tableB ORDER BY id_tableB DESC;
it takes less than 1 sec (approx. 1200 rows)...
Any ideas how to improve the original query?
Thanks
Create indexes on the keys joining the two tables.
Check this link on how to create indexes in MySQL:
http://dev.mysql.com/doc/refman/5.0/en/create-index.html
You'll have to create an index.
You could create the following indexes:
mysql> create index ix_a_active_id on tableA (is_active, id_tableA);
mysql> create index ix_b_id on tableB (id_tableB);
The first creates a composite index on is_active + id; listing is_active first lets the WHERE is_active = 1 filter use the index's leading column.
The second creates an index on the id for tableB, which speeds up the join.
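A quick way to confirm an index is actually picked up is to inspect the query plan. Here is a SQLite sketch via Python's sqlite3 (column and index names follow the answer; the payload column is made up to stand in for tableB's serialized data):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tableA (id_tableA INTEGER, is_active INTEGER);
CREATE TABLE tableB (id_tableB INTEGER, payload TEXT);
CREATE INDEX ix_a_active_id ON tableA (is_active, id_tableA);
CREATE INDEX ix_b_id ON tableB (id_tableB);
""")

# EXPLAIN QUERY PLAN shows which indexes the join will use
plan = con.execute("""
    EXPLAIN QUERY PLAN
    SELECT * FROM tableA
    LEFT JOIN tableB ON id_tableA = id_tableB
    WHERE is_active = 1
    ORDER BY id_tableA DESC
""").fetchall()
for row in plan:
    print(row[-1])  # e.g. "SEARCH tableB USING INDEX ix_b_id (id_tableB=?)"
```

In MySQL the equivalent check is EXPLAIN SELECT ...; if the key column shows NULL for one of the tables, that table is still being scanned.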

Find column that contains a given value in MySQL

I have a table in a MySQL database. I am given a value that occurs as a cell value in that table, but I do not know which cell it is in, i.e. the row and column of that cell. What is the most efficient way to find the column to which that value belongs? Thanks in advance.
Example:
Column_1 | Column_2 | Column_3
1 | 2 | 3
4 | 5 | 6
7 | 8 | 9
Now I am given an input value of "8". I want to know if there is an efficient way to find out that the value "8" belongs to Column_2.
It's a bit strange that you don't know which column the data is in, since columns are meant to have a well-defined function.
[Original response scrubbed.]
EDIT: Your updated post just asks for the column. In that case, you don't need the view, and can just run this query
SELECT col FROM (
SELECT "Column_1" AS col, Column_1 AS value FROM YourTable
UNION ALL SELECT "Column_2", Column_2 FROM YourTable
UNION ALL SELECT "Column_3", Column_3 FROM YourTable
) allValues
WHERE value=8;
When you run this query against your table, it will return "Column_2"
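Here is the same query run against a throwaway SQLite table from Python's sqlite3, to show the shape of the result. Note the single quotes around the column-name literals: in SQLite (unlike MySQL), double-quoted "Column_1" would resolve to the column itself rather than a string:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE YourTable (Column_1 INTEGER, Column_2 INTEGER, Column_3 INTEGER);
INSERT INTO YourTable VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9);
""")

# Unpivot the three columns into (column-name, value) pairs, then filter
cols = [r[0] for r in con.execute("""
    SELECT col FROM (
        SELECT 'Column_1' AS col, Column_1 AS value FROM YourTable
        UNION ALL SELECT 'Column_2', Column_2 FROM YourTable
        UNION ALL SELECT 'Column_3', Column_3 FROM YourTable
    ) allValues
    WHERE value = 8
""").fetchall()]
print(cols)  # → ['Column_2']
```

The UNION ALL scans the table once per column, so for wide tables the cost grows with the column count.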
Without knowing more about your app, you have several options:
- Use MySQL's built-in full-text search; see the MATCH function in the MySQL documentation.
- Depending on the needs of your app, index your whole table with an external full-text search engine like Solr or Sphinx. This provides fast response times, but you'll need to keep the index updated.
- Loop through all the columns in the table doing a LIKE query in MySQL (very expensive in CPU and time).
You're designing this table with repeating groups, which violates First Normal Form.
You should create a second table and store the values of column1, column2, and column3 in a single column, on three rows.
Learn about the rules of database normalization for more details.

i have two tables in my sql database of 1 million records, is there a way to find out the non matching data

I have two large tables in a database.
Table1 has 2 fields: rank, name.
Table2 has 2 fields: rank, name.
Both have 1 million records.
Can you write PHP/SQL code to fetch those records which exist in table2 but do not exist in table1?
SELECT *
FROM Table2
WHERE NOT EXISTS (SELECT 1 FROM Table1 WHERE Table1.Rank = Table2.Rank
AND Table1.Name = Table2.Name)
You didn't state what the key is or what criteria you want, but this should get you going down the right path. This could be slow for large record sets, but again, you didn't say why/what you want this for.
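Here is a runnable sketch of the NOT EXISTS anti-join, using SQLite via Python's sqlite3 with a few toy rows (the names are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Table1 (rank INTEGER, name TEXT);
CREATE TABLE Table2 (rank INTEGER, name TEXT);
INSERT INTO Table1 VALUES (1, 'alice'), (2, 'bob');
INSERT INTO Table2 VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
""")

# Rows of Table2 with no matching (rank, name) pair in Table1
missing = con.execute("""
    SELECT rank, name FROM Table2
    WHERE NOT EXISTS (
        SELECT 1 FROM Table1
        WHERE Table1.rank = Table2.rank AND Table1.name = Table2.name
    )
""").fetchall()
print(missing)  # → [(3, 'carol')]
```

At a million rows per table, a composite index on Table1 (rank, name) keeps each NOT EXISTS probe from scanning; a LEFT JOIN ... WHERE Table1.rank IS NULL formulation is a common equivalent.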