The following bit of code is causing me no end of headaches. It works, but it takes upwards of 30 minutes with only 16k rows, and the final database will hold 1-5M rows.
So I'm looking to make things more efficient:
INSERT INTO r (aKey, bKey, c, d, e, fKey, g, h, i, j, k, l, mKey, nKey, o, pKey, qKey, r, sKey)
SELECT aKey, bKey, c, d, e, fKey, g, h, i, j, k, l, mKey, nKey, o, pKey, qKey, r, sKey
FROM t2
WHERE NOT EXISTS
(
SELECT i,g
FROM r
WHERE t2.i = r.i AND t2.g = r.g
);
The fields i and g together must be unique. This is a requirement, but it is not currently declared in the DB (should it be?).
Anything with Key in its name is a foreign key referencing another table.
After the above statement I then loop through each non-unique field and update it if it is different:
UPDATE r
INNER JOIN t2 ON t2.i = r.i
SET r.d=t2.d;
I have looked into:
INSERT ... ON DUPLICATE KEY UPDATE
but I was never able to get it to work, as none of the examples I found dealt with an INSERT ... SELECT.
Any suggestions to improve the SQL will be most helpful.
Conclusion: given the size of the request, it was best to go back to basics, declare the unique database fields as UNIQUE, and then do an INSERT IGNORE.
Code below (including Python variable inserts):
INSERT IGNORE INTO """ + tblName + """ (""" + dupeString + """)
SELECT """ + dupeString + """
FROM t2
ON DUPLICATE KEY UPDATE """ + updateString + """;
The query dropped from 30-40 minutes to less than 3 seconds.
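For reference, here is a minimal Python sketch of how the two interpolated strings above could be assembled. The helper name `build_upsert`, its parameters, and the sample column list are hypothetical. It assumes a UNIQUE index already exists on the key columns (i, g) and that all identifiers come from trusted code. It emits a plain INSERT rather than INSERT IGNORE, on the assumption that ON DUPLICATE KEY UPDATE already handles the duplicate rows. The `VALUES(col)` form is the classic MySQL syntax; newer MySQL 8.0 releases may flag it as deprecated.

```python
def build_upsert(table, columns, key_columns, source="t2"):
    """Build a MySQL INSERT ... SELECT ... ON DUPLICATE KEY UPDATE statement.

    Identifiers are interpolated directly into the SQL text, so they must
    come from trusted code and never from user input.
    """
    dupe_string = ", ".join(columns)
    # Only non-key columns need refreshing when the unique key already exists.
    update_string = ", ".join(
        f"{col} = VALUES({col})" for col in columns if col not in key_columns
    )
    return (
        f"INSERT INTO {table} ({dupe_string})\n"
        f"SELECT {dupe_string}\n"
        f"FROM {source}\n"
        f"ON DUPLICATE KEY UPDATE {update_string};"
    )


# Hypothetical, shortened column list for illustration.
sql = build_upsert("r", ["i", "g", "c", "d"], key_columns={"i", "g"})
print(sql)
```

Leaving the key columns out of the update list keeps the statement short and avoids rewriting values that, by definition, already match.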
So let's say you have columns A, B and C. On duplicate key, let's say you do A = {some statement here}, B = {some statement here}, and C = New_A + New_B. Can I use what would be the new values of A and B to determine the new value of C, or do I have to retype the expressions for the new A and B? Thanks!
I think you can do it. If you do:
ON DUPLICATE KEY UPDATE
A = A + 1, B = B * 2,
C = A + B
I believe the updates are executed left to right. So when it gets to C = A + B, A and B contain the new values.
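To make the difference concrete, here is a small Python model of the two evaluation orders. It does not run against MySQL; the function names and sample row are hypothetical, and it only illustrates what "left to right" would mean compared with evaluating every expression against the original row.

```python
def apply_left_to_right(row, assignments):
    """Apply assignments in order; each one sees the results of earlier ones."""
    current = dict(row)
    for column, expr in assignments:
        current[column] = expr(current)
    return current


def apply_simultaneously(row, assignments):
    """Evaluate every expression against the original, unmodified row."""
    return {**row, **{column: expr(row) for column, expr in assignments}}


# Mirrors: A = A + 1, B = B * 2, C = A + B
assignments = [
    ("A", lambda r: r["A"] + 1),
    ("B", lambda r: r["B"] * 2),
    ("C", lambda r: r["A"] + r["B"]),
]
row = {"A": 1, "B": 2, "C": 0}

left_to_right = apply_left_to_right(row, assignments)
simultaneous = apply_simultaneously(row, assignments)
print(left_to_right)  # C is computed from the new A and B
print(simultaneous)   # C is computed from the old A and B
```

If the left-to-right behaviour holds, C ends up as 6 for this row; under simultaneous evaluation it would be 3, so a quick test on a scratch table tells you which one your server uses.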
For instance I have table A and table B
a.data = {1,2,3,4,5,6}
b.data = {4,5,7}
If you want to lookup one value in a.data or b.data you can use FIND_IN_SET(3, b.data).
But I want to know if at least all the values of b.data are in a.data, or else if I can find
at least the intersection between b.data and a.data. So in this case {4,5}.
WHERE INTERSECT(a.data, b.data) ... something like that. How should I do this in MySQL?
update
The b.data value {4,5,7} is the column data of a single record, so joining a.data on b.data won't work.
table A
=======
ID DATA
1 {1,2,3,4,5,6}
2 {7,9,12}
table B
=======
ID DATA
1 {4,5,7}
2 {9,10,11,12}
You can take the intersection of tables using INNER JOIN;
have a look at the visual explanation of joins.
Alternatively, you can write a user-defined function and call it like this:
SELECT fn_intersect_string(a.data, b.data) AS result FROM table_name;
The function can be defined as:
CREATE FUNCTION fn_intersect_string(arg_str1 VARCHAR(255), arg_str2 VARCHAR(255))
RETURNS VARCHAR(255)
BEGIN
-- Append a trailing comma so every item, including the last, ends with one.
SET arg_str1 = CONCAT(arg_str1, ",");
SET @var_result = "";
WHILE(INSTR(arg_str1, ",") > 0)
DO
-- Take the first item, then drop it from the front of the list.
SET @var_val = SUBSTRING_INDEX(arg_str1, ",", 1);
SET arg_str1 = SUBSTRING(arg_str1, INSTR(arg_str1, ",") + 1);
-- Keep the item only if it also appears in the second list.
IF(FIND_IN_SET(@var_val, arg_str2) > 0)
THEN
SET @var_result = CONCAT(@var_result, @var_val, ",");
END IF;
END WHILE;
RETURN TRIM(BOTH "," FROM @var_result);
END;
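If the lists can be handled in application code instead, the same logic is much shorter. The following is a minimal Python sketch, assuming the values are stored as plain comma-separated strings without the surrounding braces; both function names are hypothetical.

```python
def intersect_csv(left, right):
    """Return the items of `left` that also occur in `right`, in `left`'s order."""
    right_items = set(right.split(","))
    return ",".join(item for item in left.split(",") if item in right_items)


def is_subset_csv(candidate, container):
    """Return True if every item of `candidate` occurs in `container`."""
    return set(candidate.split(",")) <= set(container.split(","))


common = intersect_csv("1,2,3,4,5,6", "4,5,7")
print(common)
print(is_subset_csv("4,5,7", "1,2,3,4,5,6"))
```

For the sample rows in the question, the first pair shares 4 and 5, and b is not a subset of a because 7 is missing.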
You get the intersection from an inner join:
SELECT a.data FROM a, b WHERE a.data = b.data
To decide whether b is a subset of a, you can do
SELECT b.data FROM b LEFT JOIN a ON a.data = b.data WHERE a.data IS NULL
This will compute the difference: all values from b which are not contained in a. If it is empty, then b is a subset of a.
You can use both of these approaches as subqueries inside a larger query.
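Both queries can be tried out quickly from Python. This sketch uses an in-memory SQLite database as a stand-in for MySQL and assumes a normalised layout with one value per row, which is what the two queries above require; the table contents are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (data INTEGER);
    CREATE TABLE b (data INTEGER);
    INSERT INTO a (data) VALUES (1), (2), (3), (4), (5), (6);
    INSERT INTO b (data) VALUES (4), (5), (7);
""")

# Intersection: values present in both tables.
intersection = sorted(
    row[0]
    for row in conn.execute("SELECT a.data FROM a, b WHERE a.data = b.data")
)

# Difference: values of b that have no match in a.
missing = sorted(
    row[0]
    for row in conn.execute(
        "SELECT b.data FROM b LEFT JOIN a ON a.data = b.data WHERE a.data IS NULL"
    )
)

# b is a subset of a exactly when nothing is missing.
b_is_subset = not missing
print(intersection, missing, b_is_subset)
conn.close()
```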
If your column is of type SET, then it is stored as a number internally, and will auto-convert to that number where appropriate. The operations you describe correspond to bit-wise logical operations on those numbers. For example, the intersection can be computed using the bit-wise and of the values from two columns.
a.data & b.data AS intersection,
(a.data & b.data) <> 0 AS aAndBIntersect,
(a.data & b.data) = b.data AS bIsSubsetOfA
This requires that both columns have the same type, so that the same strings correspond to the same bits. To turn the result back into a string, you could use ELT, but with all the possible combinations that is likely to get ugly. As an alternative, you could save the result in a temporary table with the same data type, storing it as a number and later retrieving it as a string.
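The bit arithmetic can be illustrated outside the database. This Python sketch assumes a hypothetical SET definition whose members are listed in the order shown, with the first member mapped to the lowest bit; the helper names are made up for the example.

```python
# Hypothetical member list, in the order the SET column would declare it.
MEMBERS = ("1", "2", "3", "4", "5", "6", "7")


def to_mask(values):
    """Encode a collection of members as a bitmask (first member = bit 0)."""
    return sum(1 << MEMBERS.index(value) for value in set(values))


def from_mask(mask):
    """Decode a bitmask back into the list of members it contains."""
    return [member for bit, member in enumerate(MEMBERS) if mask & (1 << bit)]


a = to_mask(["1", "2", "3", "4", "5", "6"])
b = to_mask(["4", "5", "7"])

intersection = a & b                # bitwise AND keeps the shared members
a_and_b_intersect = (a & b) != 0    # any shared member at all?
b_is_subset_of_a = (a & b) == b     # does a contain every member of b?
print(from_mask(intersection), a_and_b_intersect, b_is_subset_of_a)
```

The decoding step in `from_mask` plays the role that the temporary table or ELT would play on the database side.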
I have an existing table where, for some reason, the designer decided to control the Primary Key value manually by storing the last used value in a separate table (changing the table to use Identity is not an option right now).
I now need to do a mass update to this table as follows:
DECLARE @NeedFieldID int
SET @NeedFieldID = 62034
INSERT INTO T_L_NeedField (NeedID, NeedFieldID, FieldName, Sequence, DisplayAs, FieldPrompt, DataType, ValidOptions, IsRequiredForSale)
(
SELECT
DISTINCT n.NeedID,
@NeedFieldID + 1,
'DetailedOutcome',
999,
'Detailed Outcome',
'Select appropriate reason for a No Sale outcome',
'T',
'Pricing, Appointment Date / Time Not Available, Will Call Back, Declined',
0
FROM T_L_Need n
INNER JOIN T_L_NeedField nf
ON n.NeedID = nf.NeedID
WHERE (n.Need LIKE 'Schedule%' AND n.Disabled = 0)
)
Obviously '@NeedFieldID + 1' doesn't work (I'm just using it to show what I want to do). How can I increment @NeedFieldID as SQL Server inserts the values for each of the distinct NeedIDs? I am using SQL Server 2008.
You want row_number(). Because window functions are evaluated before DISTINCT is applied, deduplicate the NeedIDs in a derived table first and number the result:
DECLARE @NeedFieldID int
SET @NeedFieldID = 62034
INSERT INTO T_L_NeedField (NeedID, NeedFieldID, FieldName, Sequence, DisplayAs, FieldPrompt, DataType, ValidOptions, IsRequiredForSale)
SELECT
d.NeedID,
@NeedFieldID + row_number() over (order by d.NeedID),
'DetailedOutcome',
999,
'Detailed Outcome',
'Select appropriate reason for a No Sale outcome',
'T',
'Pricing, Appointment Date / Time Not Available, Will Call Back, Declined',
0
FROM (
SELECT DISTINCT n.NeedID
FROM T_L_Need n
INNER JOIN T_L_NeedField nf
ON n.NeedID = nf.NeedID
WHERE (n.Need LIKE 'Schedule%' AND n.Disabled = 0)
) d
However, your best bet is to make NeedFieldID an identity column and just let SQL Server do the work for you.
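As a rough illustration of what the row_number() approach computes, here is a Python sketch that numbers distinct NeedIDs starting just after the last used key. The function name and the sample NeedIDs are hypothetical; only the starting value 62034 comes from the question.

```python
def assign_need_field_ids(need_ids, last_used_id):
    """Map each distinct NeedID to a new key, counting up from last_used_id."""
    distinct_ids = sorted(set(need_ids))
    return {
        need_id: last_used_id + offset
        for offset, need_id in enumerate(distinct_ids, start=1)
    }


# Duplicate NeedIDs, as the join to T_L_NeedField would produce them.
new_ids = assign_need_field_ids([7, 3, 3, 9, 7], last_used_id=62034)
print(new_ids)

# The separate "last used value" table would then need this new maximum.
new_last_used_id = max(new_ids.values())
print(new_last_used_id)
```

Since the design keeps the last used value in a separate table, that table still has to be updated to the new maximum after the insert, ideally in the same transaction.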
I am working on a SQL Server 2008 machine and cannot seem to get the query to work.
My SQL query is:
select q.Document DOC from references q, equiprates e where e.MachineID=q.UnitID
The rows returned for q.Document are:
5570_RESTAURANT.pdf
5650_RESTAURANT.pdf
5110_RESTAURANT.pdf
However, I need the table rows to be as follows:
Restaurant Document
5570_RESTAURANT.pdf
5570_RESTAURANT.pdf
5570_RESTAURANT.pdf
So I am trying to format my select string as follows:
Select @sSQL = 'select q.Document DOC, ''''+q.Document+'''' ''Restaurant Document'',
from references q, equiprates e
where e.MachineID=q.UnitID'
My error message is:
Msg 4104, Level 16, State 1, Line 3
The multi-part identifier "q.Document" could not be bound.
Any ideas how to resolve this?
I tried google, but no luck.
Your single quotes are just wrong (I also recommend shifting to more modern INNER JOIN syntax). But why can't the application simply add the HTML around the DOC column? Seems wasteful (never mind more complex, obviously) to add all that HTML at the server, and send all those bytes over the wire.
DECLARE @sSQL NVARCHAR(MAX);
SET @sSQL = N'SELECT
DOC = q.Document,
[Restaurant Document] = ''<a href="Javascript:ViewFile(''''''
+ q.Document + '''''');" class="Link">''
+ q.Document + ''</a>''
FROM references AS q
INNER JOIN equiprates AS e
ON q.UnitID = e.MachineID';
PRINT @sSQL;
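If the application can add the markup itself, the query only needs to return the DOC column. Here is a minimal Python sketch of that idea, assuming the rows arrive as plain file-name strings; the function name `document_link` is hypothetical, while `ViewFile` and the `Link` class come from the query above.

```python
from html import escape


def document_link(document):
    """Wrap a document file name in the ViewFile link markup."""
    # Escape backslashes and single quotes for the JavaScript string literal.
    js_arg = document.replace("\\", "\\\\").replace("'", "\\'")
    # Then escape the whole attribute value and the link text for HTML.
    href = escape(f"Javascript:ViewFile('{js_arg}');", quote=True)
    return f'<a href="{href}" class="Link">{escape(document)}</a>'


link = document_link("5570_RESTAURANT.pdf")
print(link)
```

Escaping in two layers (JavaScript first, HTML second) keeps file names with quotes or angle brackets from breaking the page, which the string concatenation in SQL does not guard against.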
Try just:
select
'' + q.Document + ''
from
references q, equiprates e
where
e.MachineID=q.UnitID
But remember, this is very bad programming style. It is better to keep the data model and the data view separate from each other.
select TO_CHAR(TRUNC(SYSDATE),'DD MONTH,YYYY'),a.appl_no,a.assigned_to,c.trading_name co_name, ' ' co_name2, d.bank_acct_no credit_acct_no, d.bank_no credit_bank_no, d.bank_branch_no credit_branch_no,a.service_id
from newappl a, newappl_hq b, newappl_ret c, newappl_ret_bank d where a.appl_no = c.appl_no and c.ret_id= d.ret_id and a.appl_no=(select appl_no from newappl where appl_no='224') and c.outlet_no in ('1','2') and rownum=1
Why does the statement above return only one row, while the following statement returns both 1 and 2?
select c.outlet_no from newappl_ret c where appl_no = '224'
It's hard to say without seeing the data stored in the DB, but try this one:
select c.outlet_no from mss_t_newappl_ret c where appl_no = 224
Also check whether the appl_no column contains any stray spaces.
Maybe this?
and a.appl_no IN (select appl_no from newappl where appl_no='224')
Or delete this expression:
and rownum=1
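The effect of that last condition can be modelled in a few lines of Python. This is only a sketch with made-up rows, assuming two outlets exist for application 224: the filter on outlet_no matches both, and the rownum=1 condition then keeps just the first row produced.

```python
from itertools import islice

# Hypothetical rows standing in for newappl_ret.
rows = [
    {"appl_no": "224", "outlet_no": "1"},
    {"appl_no": "224", "outlet_no": "2"},
    {"appl_no": "225", "outlet_no": "1"},
]

# Equivalent of: appl_no = '224' and outlet_no in ('1', '2')
matching = [
    row for row in rows
    if row["appl_no"] == "224" and row["outlet_no"] in ("1", "2")
]

# Equivalent of adding: and rownum = 1 (stop after the first row produced)
limited = list(islice(matching, 1))

print(len(matching), len(limited))
```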