Fix mysql query - mysql

DELETE FROM keywords
WHERE NOT EXISTS
(SELECT keywords_relations.k_id FROM keywords_relations WHERE keywords.k_id = keywords_relations.k_id)
This query is taking too long. I have 583,000 keywords (utf_unicode) and 1 million keywords_relations. In the past the query used to finish in 20-60 seconds, but when I run it now it has not finished after half an hour.
Could you please suggest what might be wrong? Also, are there any alternatives to this query?
I am trying to delete the keywords from the keywords table whose ids don't exist in the keywords_relations table.
Thanks
The site is http://domainsoutlook.com/
You can visit it and see that all the queries are running slowly.
PS. The server crashed a few days ago, and an fsck check or something similar was carried out on the disk by my server maintenance support.
Indexes on keywords = k_id(primary), keywords(unique)
indexes on keywords_relations = k_id(index)

Try this instead and see if it makes a difference:
DELETE keywords
FROM keywords
LEFT JOIN keywords_relations
ON keywords.k_id = keywords_relations.k_id
WHERE keywords_relations.k_id IS NULL

WHERE NOT EXISTS (subquery) is known to cause performance issues in MySQL < 5.4. Use LEFT JOIN instead.
WARNING: Test it before running on live database. I claim no responsibility for data lost.
DELETE
k
FROM
keywords AS k
LEFT JOIN
keywords_relations AS kr
USING (k_id)
WHERE
kr.k_id IS NULL
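Before running either DELETE, it may help to run the same anti-join as a SELECT to see how many rows would be removed and whether the k_id indexes are actually used. This is only a sketch using the table and column names from the question. The CHECK TABLE and ANALYZE TABLE statements are a suggestion prompted by the crash mentioned in the question, not a confirmed diagnosis.

```sql
-- Count the keywords that have no row in keywords_relations
-- (these are the rows the DELETE would remove).
SELECT COUNT(*)
FROM keywords AS k
LEFT JOIN keywords_relations AS kr USING (k_id)
WHERE kr.k_id IS NULL;

-- Show the execution plan; the keywords_relations row should list
-- the k_id index in the "key" column if the index is being used.
EXPLAIN
SELECT k.k_id
FROM keywords AS k
LEFT JOIN keywords_relations AS kr USING (k_id)
WHERE kr.k_id IS NULL;

-- Assumption: after a crash, tables or index statistics may be
-- damaged or stale, so check the tables and refresh the statistics.
CHECK TABLE keywords, keywords_relations;
ANALYZE TABLE keywords, keywords_relations;
```

If the count looks right and the plan uses the index, the DELETE with the same join condition should touch the same rows.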

I assume that you don't want to delete a keyword if one or more keywords_relations rows belong to it. So the first thing you could do is add a LIMIT 1 to your SELECT query.
You have set indexes on k_id, right? If so, this really shouldn't be a problem for MySQL.

Related

Inner joining within an inner join

I tried to find an existing answer but couldn't find one. I'm trying to join four tables, but one of the joins is not on the table that the other two joins are from. I've successfully joined three of the tables; I'm just not sure of the syntax for joining the fourth.
SELECT * FROM
nc_booking
INNER JOIN
nc_customer ON nc_booking.c_id = nc_customer.id
INNER JOIN
nc_propertys ON nc_booking.p_id = nc_propertys.id
How would I now join nc_propertys to another table, nc_owner?
Building on the code from @GordonLinoff, to add your extra table you need to do something like:
SELECT *
FROM nc_booking b INNER JOIN
nc_customer c
ON b.c_id = c.id INNER JOIN
nc_propertys p
ON b.p_id = p.id INNER JOIN
nc_owner o
ON o.id = p.o_id;
You haven't shared the column names we need to use to connect the extra table, so the last line might not be right. A few things to note ...
(1) The SELECT * is not ideal. If you only need particular columns here, list them. I've stuck with your * because I don't know what you want from the tables. Where a column with the same name exists in more than one table, you'll have to "fully qualify" the field name as follows ...
SELECT c.id as customer_id,
-- more fields can go here, with a comma after each
...
Several of the joined tables have an id field, so the c. is necessary to tell the database which one we want. Notice that as with the tables, we can also give the fields an 'alias', which in this case is 'customer_id'. This can be very helpful for presentation, and is often essential when using the output from a query as part of a larger piece of code.
(2) Since all the joins are INNER JOINS it makes little (if any) difference what order the tables are listed as long as the connections between them remain the same.
(3) For MySQL, it technically shouldn't matter whether you have lots of new-lines or none at all. SQL is designed to ignore "white space" (except within data). What matters is simply laying out your code so it is easy to read ... especially for other users who later might need to figure out what you were doing (although in my experience also for you, when you return to a piece of code several years later and can't remember it at all).
(4) In each ON clause it doesn't actually matter whether you write, say, a = b or b = a. That's because you aren't setting one to equal the other; you are requiring that they already be equal, so it amounts to the same thing either way.
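Putting points (1) and (2) together, a version of the four-table join with explicit, aliased columns might look like the sketch below. It assumes that every table has an id column and that nc_propertys has an o_id column pointing at nc_owner, neither of which the question confirms.

```sql
-- Each id is qualified with its table alias and given its own output name,
-- so the four id columns do not collide in the result.
SELECT b.id AS booking_id,
       c.id AS customer_id,
       p.id AS property_id,
       o.id AS owner_id
FROM nc_booking b
INNER JOIN nc_customer c  ON b.c_id = c.id
INNER JOIN nc_propertys p ON b.p_id = p.id
INNER JOIN nc_owner o     ON o.id = p.o_id;  -- o_id is an assumed column name
```

Because all the joins are INNER JOINs, listing the tables in a different order should return the same rows as long as each ON condition stays the same.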
My advice to a SQL beginner would be when you are writing a SELECT query (which only reads and doesn't change any data): if you aren't too sure then write some code and set it to run. If it's completely invalid, your software should give you some idea of what is wrong and no harm will be done. If it's valid but wrong, the very worst that can happen is that you put some unnecessary load on your database server ... if it takes a long time to run and you weren't expecting it to, then you should be able to cancel the query. As long as you have some idea of what you expect the results to look like, and roughly how many rows to expect, you won't go too far wrong. If you get completely stuck come back here to Stack Overflow.
Things get a bit different if you are writing code which DELETEs or UPDATEs data. Then you want to know exactly what you're up to. Normally you can write a closely related SELECT statement first to make sure you're going to be making all and only the changes you were expecting. It's also best to make sure you've got a way to undo your changes should the worst happen. Backups are obviously good, and you can often create your own backup copy of a table before you make any alterations. You don't necessarily need to rely on backup software or your in house IT guys for that ... in my experience they don't like databases anyway.
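As a rough illustration of the "SELECT first, keep your own copy" habit described above, the following sketch uses the nc_booking table from the question. The backup table name and the c_id value 42 are made up for the example.

```sql
-- Make a private copy of the table (data only; indexes and keys are not copied).
CREATE TABLE nc_booking_backup AS
SELECT * FROM nc_booking;

-- Preview exactly which rows the DELETE would touch.
SELECT *
FROM nc_booking
WHERE c_id = 42;

-- Same WHERE clause, now as a DELETE.
DELETE FROM nc_booking
WHERE c_id = 42;

-- If something went wrong, the removed rows can be copied back from the backup.
INSERT INTO nc_booking
SELECT * FROM nc_booking_backup
WHERE c_id = 42;
```

Keeping the WHERE clause identical between the preview SELECT and the DELETE is the point of the exercise: the rows you saw are the rows that go.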
Also there are some great books out there. For a beginner, I'd recommend anything by Ben Forta, including his SQL in 10 Minutes (that's a per chapter figure), or his MySQL Crash Course (the latter is a little old though, so won't have anything on the more recently added features of MySQL).
Your syntax looks okay. I am providing an answer because you really should learn to use table aliases. They make a query easier to write and to read:
SELECT *
FROM nc_booking b INNER JOIN
nc_customer c
ON b.c_id = c.id INNER JOIN
nc_propertys p
ON b.p_id = p.id;

Improve delete with IN performance

I am struggling to write a DELETE query in a MariaDB 5.5.44 database.
The first of the two following code samples works great, but I need to add another condition to its WHERE clause. That is shown in the second code sample.
I need to delete rows from polozkyTransakci only where puvodFaktury <> 'FAKTURA VO CZ' in the transakce_tmp table. I thought the WHERE clause in the second sample would work with the inner SELECT, but it takes forever to process (about 40 minutes in my cloud-based ETL tool), and even then it does not leave the rows I want untouched.
1.
DELETE FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy';
2.
DELETE FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy'
AND idTransakce NOT IN (
SELECT idTransakce
FROM transakce_tmp
WHERE puvodFaktury = 'FAKTURA VO CZ');
Thanks a million for any help
David
IN can be very bad for performance. Try using NOT EXISTS() instead:
DELETE FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy'
AND NOT EXISTS (SELECT 1
FROM transakce_tmp r
WHERE r.puvodFaktury = 'FAKTURA VO CZ'
AND r.idTransakce = polozkyTransakci.idTransakce );
Before you can performance tune, you need to figure out why it is not deleting the correct rows.
So start by doing selects until you have the right rows identified. Build your select a bit at a time, checking the results at each stage to see if you are getting the results you want.
Once you have the select, you can convert it to a delete. When testing the delete, do it in a transaction and run some tests on the data that is left behind to ensure it deleted properly before rolling back or committing. Since you likely want to performance tune, I would suggest rolling back, so that you can then try again with the performance-tuned version to ensure you get the same results. Of course, you only want to do this on a dev server!
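The select-then-delete routine described above could look roughly like this for the tables in the question. It is a sketch that assumes the tables use a transactional engine such as InnoDB; on MyISAM the ROLLBACK would not undo the DELETE.

```sql
-- Step 1: all rows of this type.
SELECT COUNT(*)
FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy';

-- Step 2: rows of this type with no matching 'FAKTURA VO CZ' transaction,
-- i.e. the rows the DELETE is supposed to remove.
SELECT COUNT(*)
FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy'
  AND NOT EXISTS (SELECT 1
                  FROM transakce_tmp r
                  WHERE r.puvodFaktury = 'FAKTURA VO CZ'
                    AND r.idTransakce = polozkyTransakci.idTransakce);

-- Step 3: run the DELETE inside a transaction and inspect what is left.
START TRANSACTION;

DELETE FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy'
  AND NOT EXISTS (SELECT 1
                  FROM transakce_tmp r
                  WHERE r.puvodFaktury = 'FAKTURA VO CZ'
                    AND r.idTransakce = polozkyTransakci.idTransakce);

-- This count should equal (step 1 count) minus (step 2 count).
SELECT COUNT(*)
FROM polozkyTransakci
WHERE typPolozky = 'odpocetZalohy';

-- Undo the test run; use COMMIT instead once the result is what you want.
ROLLBACK;
```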
Now while I agree that not exists may be faster, some of the other things you want to look at are:
Do you have cascade deletes happening? If you end up deleting many child records, that could be part of the problem.
Are there triggers affecting the delete? Especially look to see if someone set one up to run through things row by row instead of as a set. Row by row triggers are a very bad thing when you delete many records. For instance suppose you are deleting 50K records and you have a delete trigger to an audit table. If it inserts to that table one record at a time, it is being executed 50K times. If it inserts all the deleted records in one step, that insert individually might take a bit longer but the total execution is much shorter.
What indexing do you have, and is it helping the delete?
You will want to examine the explain plan for each of your queries to see if they are improving the details of how the query will be performed.
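The checks in this list can be started from the SQL prompt. The statements below are a sketch for the tables named in the question; they only read metadata and do not change any data.

```sql
-- Triggers defined on the table being deleted from.
SHOW TRIGGERS LIKE 'polozkyTransakci';

-- Foreign keys that reference the table, with their ON DELETE rule
-- (CASCADE means child rows are deleted along with the parent).
SELECT CONSTRAINT_NAME, TABLE_NAME, DELETE_RULE
FROM information_schema.REFERENTIAL_CONSTRAINTS
WHERE CONSTRAINT_SCHEMA = DATABASE()
  AND REFERENCED_TABLE_NAME = 'polozkyTransakci';

-- Existing indexes on both tables involved in the delete.
SHOW INDEX FROM polozkyTransakci;
SHOW INDEX FROM transakce_tmp;
```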
Performance tuning is a complex thing, and it is best to read up on it in detail in some of the performance tuning books available for your specific database.
I might be inclined to write the query as a LEFT JOIN, although I'm guessing this would have the same performance plan as NOT EXISTS:
DELETE pt
FROM polozkyTransakci pt LEFT JOIN
transakce_tmp tt
ON pt.idTransakce = tt.idTransakce AND
tt.puvodFaktury = 'FAKTURA VO CZ'
WHERE pt.typPolozky = 'odpocetZalohy' AND tt.idTransakce IS NULL;
I would recommend indexes, if you don't have them: polozkyTransakci(typPolozky, idTransakce) and transakce_tmp(idTransakce, puvodFaktury). These would work on the NOT EXISTS version as well.
You can test the performance of these queries using SELECT:
SELECT pt.*
FROM polozkyTransakci pt LEFT JOIN
transakce_tmp tt
ON pt.idTransakce = tt.idTransakce AND
tt.puvodFaktury = 'FAKTURA VO CZ'
WHERE pt.typPolozky = 'odpocetZalohy' AND tt.idTransakce IS NULL;
The DELETE should be slower (due to the cost of logging transactions), but the performance should be comparable.
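The indexes recommended above could be created as follows. The index names are hypothetical, and the sketch assumes both columns are short enough to be indexed in full; a long text column would need a prefix length.

```sql
-- Lets the delete find rows by type and read idTransakce from the index.
CREATE INDEX idx_pt_typ_trans
    ON polozkyTransakci (typPolozky, idTransakce);

-- Lets the anti-join look up each idTransakce and filter on puvodFaktury.
CREATE INDEX idx_tt_trans_puvod
    ON transakce_tmp (idTransakce, puvodFaktury);

-- Check whether the optimizer picks the new indexes up
-- (look at the "key" column of the plan).
EXPLAIN
SELECT pt.*
FROM polozkyTransakci pt LEFT JOIN
     transakce_tmp tt
     ON pt.idTransakce = tt.idTransakce AND
        tt.puvodFaktury = 'FAKTURA VO CZ'
WHERE pt.typPolozky = 'odpocetZalohy' AND tt.idTransakce IS NULL;
```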

MSQL alter table with JOIN

I am trying to update a table with a column from another table. I don't want to view the join; I want to alter the table.
However, this is failing:
UPDATE
a_dataset
SET
a_dataset.lang_flag = b_dataset.language
FROM
a_dataset
INNER JOIN
b_dataset
ON
a_dataset.ID = b_dataset.ID
I keep getting a syntax error and cannot work out what I am missing.
I am guessing that you mean to update your records when you say alter the table. If so, you can simply rewrite your update statement with join like this:
UPDATE a_dataset a
JOIN b_dataset b ON a.ID = b.ID
SET a.lang_flag = b.`language`;
As Uueerdo and I said: starting table names with numbers is a bad[TM] idea. The same goes for letters, which you have now chosen to use; a is no better than 1 in this regard. Also, calling tables just "dataset" isn't really helpful either. What is the table storing? Users? Then call it users. Articles on a news web site? Then call it articles. And so on. Everything in a database is a dataset; there is no need to tell anyone that.
I guess you're new to SQL, am I right? Because another issue is: unless you're going to drop table b_dataset after this command, you're probably doing something you're not supposed to do in relational databases. The whole idea is to store each piece of data only once. If you can automagically copy the column from b to a, then you could also select it with a join of a and b whenever you need it, instead of copying it.
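A minimal sketch of the "join when you need it" alternative, using the table and column names from the question. The output name lang_flag is reused here only for illustration.

```sql
-- Read the language through the join each time it is needed,
-- instead of storing a second copy in a_dataset.
SELECT a.ID,
       b.`language` AS lang_flag
FROM a_dataset a
JOIN b_dataset b ON a.ID = b.ID;
```

With this approach a later change to b_dataset.language shows up in the query result straight away, with no second UPDATE needed to keep the copy in sync.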
For learning SQL (or anything else), Stack Overflow is probably a bad place (it's good for asking questions in the process, though), so I recommend that you go get someone who has some experience in SQL to teach you, or get some book / tutorial on SQL. From first glance, this seems to be a good on-line book: http://sql.learncodethehardway.org/ - but I didn't read it.

MySQL CONCAT '*' symbol toasts the database

I have a table used for lookups which stores the human-readable value in one column and the same text stripped of special characters and spaces in another, e.g. the value "Children's Shows" would appear in the lookup column as "childrens-shows".
Unfortunately the corresponding main table isn't quite that simple - for historical reasons I didn't create myself and now would be difficult to undo, the lookup value is actually stored with surrounding asterisks, e.g. '*childrens-shows*'.
So, while trying to join the lookup table sans-asterisks with the main table that has asterisks, I figured CONCAT would help me add them on the fly, e.g.:
SELECT *
FROM main_table m
INNER JOIN lookup_table l
ON l.value = CONCAT('*',m.value,'*')
... and then the table was toast. Not sure if I created an infinite loop or really screwed the data, but it required an ISP backup to get the table responding again. I suspect it's because the '*' symbol is probably reserved, like a wildcard, and I've asked the database to do the equivalent of licking its own elbow. Either way, I'm hesitant to 'experiment' to find the answer given the spectacular way it managed to kill the database.
Thanks in advance to anyone who can (a) tell me what the above actually did to the database, and (b) how I should actually join the tables?
When a join condition wraps a column in CONCAT, MySQL won't use the index on that column. Use EXPLAIN to check this; I recently had a problem on a large table where the indexed column was there, but the key was not used. This should not bork the whole table, however, just make the query slow. Possibly the server ran out of memory, started to swap and then crashed halfway, but you'd need to check the logs to find out.
However, the root cause is clearly bad table design and that's where the solution lies. Any answer you get that allows you to work around this can only be temporary at best.
Best solution is to move this data into a separate table. 'Childrens shows' sounds like a category and therefore repeated data in many rows. This should really be an id for a 'categories' table, which would prevent the DB from having to run CONCAT on every single row in the table, as you could do this:
SELECT *
FROM main_table m
INNER JOIN lookup_table l
ON l.value = m.value
/* and optionally */
INNER JOIN categories cat
ON l.value = cat.id
WHERE cat.name = 'whatever'
I know this is not something you may be able to do given the information you supplied in your question, but really the reason for not being able to make such a change to a badly normalised DB is more important than the code here. Without either the resources or political backing to do things the right way, you will end up with even more headaches like this, which will end up costing more in the long term. Time for a word with the boss perhaps :)
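Until the schema can be changed, one possible stopgap is to apply CONCAT to the lookup side of the comparison, so that the column of the large main table is left bare and an index on it can still be used. Going by the question's description (the main table stores '*childrens-shows*', the lookup table stores 'childrens-shows'), the asterisks are added to l.value here. The sketch assumes main_table.value is indexed, which the question does not state; the index name is hypothetical.

```sql
-- Assumption: an index like this exists or can be added.
-- CREATE INDEX idx_main_value ON main_table (value);

-- Check the plan first, without running the join itself.
EXPLAIN
SELECT *
FROM lookup_table l
INNER JOIN main_table m
    ON m.value = CONCAT('*', l.value, '*');

-- The join itself: for each lookup row, build the asterisk-wrapped
-- string once and look it up in main_table.
SELECT *
FROM lookup_table l
INNER JOIN main_table m
    ON m.value = CONCAT('*', l.value, '*');
```

The asterisk has no wildcard meaning in an equality comparison or in CONCAT; it is treated as an ordinary character.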

MySQL How to efficiently compare multiple fields between tables?

My expertise is not in MySQL. I wrote this query and it is running increasingly slowly: about 5 minutes with 100k rows in EquipmentData and 30k or so in EquipmentDataStaging (which to me is very little data):
CREATE TEMPORARY TABLE dataCompareTemp
SELECT eds.eds_id FROM equipmentdatastaging eds
INNER JOIN equipment e ON e.e_id_string = eds.eds_e_id_string
INNER JOIN equipmentdata ed ON e.e_id = ed.ed_e_id
AND eds.eds_ed_log_time=ed.ed_log_time
AND eds.eds_ed_unit_type=ed.ed_unit_type
AND eds.eds_ed_value = ed.ed_value
I am using this query to compare data rows pulled from a client's device to the current data sitting in their database. From here I take the temp table and use the IDs from it to make conditional decisions. I have e_id_string indexed and e_id indexed, and nothing else. I know it looks stupid that I have to compare all this information, but the client's system is spitting out redundant data and I am using this query to find it. Any help on this would be greatly appreciated, whether it is a different approach in SQL or in MySQL management. I feel like when I do things like this in MSSQL it handles the requests much better, but that is probably because I have something set up incorrectly.
TIPS
Index all the columns that are used in ON or WHERE conditions.
Here you need to index eds_ed_log_time, eds_e_id_string, eds_ed_unit_type, eds_ed_value, ed_e_id, ed_log_time, ed_unit_type and ed_value.
Change the syntax to SELECT STRAIGHT_JOIN ..., which forces MySQL to join the tables in the order they are listed.
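One way to apply the indexing tip is sketched below. Using one composite index per table rather than a separate index per column is an assumption of this sketch, not something the tip specifies. The index names are hypothetical, and the column types are unknown, so a long text column might need a prefix length.

```sql
-- Covers every equipmentdata column used in the join condition.
CREATE INDEX idx_ed_compare
    ON equipmentdata (ed_e_id, ed_log_time, ed_unit_type, ed_value);

-- Covers the staging-side columns used in the join conditions.
CREATE INDEX idx_eds_compare
    ON equipmentdatastaging (eds_e_id_string, eds_ed_log_time,
                             eds_ed_unit_type, eds_ed_value);

-- Check that the plan now uses the indexes (see the "key" column).
EXPLAIN
SELECT eds.eds_id
FROM equipmentdatastaging eds
INNER JOIN equipment e ON e.e_id_string = eds.eds_e_id_string
INNER JOIN equipmentdata ed ON e.e_id = ed.ed_e_id
    AND eds.eds_ed_log_time = ed.ed_log_time
    AND eds.eds_ed_unit_type = ed.ed_unit_type
    AND eds.eds_ed_value = ed.ed_value;
```

If the plan still shows a full scan of equipmentdata, STRAIGHT_JOIN can be tried next to see whether a fixed join order helps.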