When can I say that it's the best-optimized query? - mysql

I tried to optimize a query, but it is still a tad slow. Here is the EXPLAIN output for reference. I am also adding a JSON file with execution and cost-evaluation information. Can you tell me whether I can improve something, or whether this is the best I can do?
Exaplain.json
EDIT
What I really want to know is whether there is a way to tell that a query is fully optimized, so that I should start looking somewhere else.
Anyway, please tell me for this query and I will learn something more. I am adding the query and a diagram of the table structure.
SELECT o.object, b.baseline, s.testType, ut.suite,
JSON_EXTRACT(ut.failTestsData, '$.failButBaselinePassesTests[*]',
'$.baselineDataNotAvailableTests[*]',
'$.failDifferentThanBaselineTests[*]') AS failTests FROM objects as o
LEFT JOIN baselines as b ON b.baselineID = o.baselineID
LEFT JOIN instances AS i ON o.objectID = i.objectID
LEFT JOIN buildOSs as os ON i.osID = os.osID
LEFT JOIN unittestsdetails AS ut ON ut.instanceID = i.instanceID
LEFT JOIN suites AS s ON s.suiteID = ut.suiteID
WHERE o.objectID IN ( 20836, 20210, 20201, 20202, 20370, 21138, 20731,
22242, 21168, 21476, 23384, 22043, 20548, 20289, 20777, 21324, 20545,
20682, 20266, 21184, 21202, 20741, 20918, 20261, 20516, 20291, 20619,
21438, 20351, 22047, 20264, 20265, 21181, 20988, 20842, 21429, 20643,
20570, 20775, 21904, 20923........... )
If you need something else please let me know.

a way to know that the query is fully optimized and I should start looking somewhere else
This doesn't really exist, for a simple reason: if your query is even a bit complex, then depending on the data in your tables, what looks "fully optimized" may turn out to be a pretty bad choice.
Working on a single row versus thousands upon thousands of rows is not the same.
For a less complex query, I'd say: if every WHERE and JOIN clause uses an index, then you're probably as close to "optimized" as you can get (you could maybe try function-based indexes or different kinds of index, but that's about it).
Looking at your query, it seems you're already done ;)
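One rough way to check the "every WHERE and JOIN clause uses an index" rule is to compare the query plan before and after adding an index. The following is only a minimal sketch: it assumes Python's built-in sqlite3 module as a stand-in for MySQL (MySQL's EXPLAIN output is formatted differently), the instances table borrows its name from the question but its columns and data are made up, and idx_instances_object is a hypothetical index name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instances ("
    "instanceID INTEGER PRIMARY KEY, objectID INTEGER, osID INTEGER)"
)
# Made-up sample data, only there so the planner has something to plan for.
conn.executemany(
    "INSERT INTO instances (objectID, osID) VALUES (?, ?)",
    [(n % 100, n % 3) for n in range(1000)],
)

QUERY = "SELECT * FROM instances WHERE objectID = ?"


def plan(sql, params):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


plan_before = plan(QUERY, (42,))
conn.execute("CREATE INDEX idx_instances_object ON instances (objectID)")
plan_after = plan(QUERY, (42,))

print(plan_before)  # expected to describe a full scan of the table
print(plan_after)   # expected to describe a search using the new index
```

If the plan still shows a full scan on a large table after the relevant indexes exist, that is a hint the query itself is worth another look; if every step already uses an index, the bottleneck is more likely elsewhere.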

Related

Why isn't my SQL LEFT JOIN query working?

It is returning 0 rows when there should be some results.
This is my first attempt at using JOIN but from what I've read this looks pretty much right, right?
"SELECT pending.paymentid, pending.date, pending.ip, pending.payer, pending.type, pending.amount, pending.method, pending.institution, payment.number, _uploads_log.log_filename
FROM pending
LEFT JOIN _uploads_log
ON pending.paymentid='".$_GET['vnum']."'
AND _uploads_log.linkid = pending.paymentid"
I need to return the specified values from each table where both pending.paymentid and _uploads_log.log_filename are equal to $_GET['vnum'].
What is the correct way to do this? Why am I not getting any results?
If someone more experienced than me could point me in the right direction I would be much obliged.
EDIT
For pending the primary key is paymentid; for _uploads_log the primary key is a column called log_id, and log_filename is listed as an index.
Try this
SELECT pending.paymentid,
pending.date,
pending.ip,
pending.payer,
pending.type,
pending.amount,
pending.method,
pending.institution,
payment.number,
_uploads_log.log_filename
FROM pending
LEFT JOIN _uploads_log
ON _uploads_log.linkid = pending.paymentid
WHERE _uploads_log.log_filename = '" . $_GET['vnum'] . "'
Your current query is vulnerable to SQL injection. Please take the time to read the article below.
Best way to prevent SQL injection in PHP?
The ON clause should contain only the condition that links the two tables, especially for a LEFT JOIN. The WHERE clause then holds the actual filter. With a LEFT JOIN, a condition in the ON clause only decides which _uploads_log rows get attached to each pending row; it does not remove pending rows from the result. Keeping the filter in WHERE is also easier to read, in my opinion.
One more remark: it is always better to work with bind parameters to avoid SQL injection.
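The difference between filtering in ON and filtering in WHERE, and the use of a bind parameter, can be sketched as follows. This is an illustration only: it uses Python's built-in sqlite3 module rather than PHP and MySQL, the table names come from the question, and the payer column values and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pending (paymentid TEXT PRIMARY KEY, payer TEXT);
    CREATE TABLE _uploads_log (
        log_id INTEGER PRIMARY KEY, linkid TEXT, log_filename TEXT);
    INSERT INTO pending VALUES ('v1', 'alice'), ('v2', 'bob');
    INSERT INTO _uploads_log (linkid, log_filename) VALUES ('v1', 'receipt.pdf');
""")

vnum = "v1"  # stands in for $_GET['vnum']; it is bound, never concatenated

# Filter inside ON: every pending row survives, unmatched rows get NULL.
filter_in_on = conn.execute(
    """SELECT p.paymentid, u.log_filename
       FROM pending AS p
       LEFT JOIN _uploads_log AS u
         ON u.linkid = p.paymentid AND p.paymentid = ?
       ORDER BY p.paymentid""",
    (vnum,),
).fetchall()

# Filter in WHERE: only the requested pending row is returned.
filter_in_where = conn.execute(
    """SELECT p.paymentid, u.log_filename
       FROM pending AS p
       LEFT JOIN _uploads_log AS u
         ON u.linkid = p.paymentid
       WHERE p.paymentid = ?""",
    (vnum,),
).fetchall()

print(filter_in_on)     # [('v1', 'receipt.pdf'), ('v2', None)]
print(filter_in_where)  # [('v1', 'receipt.pdf')]
```

The `?` placeholder plays the role that a bound parameter in a PHP prepared statement would play: the value travels separately from the SQL text, so it cannot change the structure of the query.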

Rails way of writing this MySQL query

I have a MySQL query which is this:
actors_to_delete = find_by_sql("SELECT * FROM `dvd_role` a
LEFT JOIN `dvd_actor2role` b ON a.id = b.`roleId`
LEFT JOIN `dvd_actor` c ON b.`actorId` = c.`id`
WHERE role LIKE '%uncredited%'
GROUP BY c.id")
I've written it the Railsy way like this:
actors_to_delete = Role.joins("LEFT JOIN `dvd_actor2role` ON dvd_role.id = dvd_actor2role.`roleId`").joins("LEFT JOIN `dvd_actor` ON dvd_actor2role.`actorId` = dvd_actor.`id`").where("dvd_role.role LIKE '%uncredited%'").group("dvd_actor.id")
What I'm wondering is how to write it using the relations that are already established in Rails. (I'm also wondering what the real difference between these 2 queries is: they both do the same thing, so why favour the Railsy way over the more straightforward SQL way?)
If I try do something like this:
Actor.actor2role.role.where("role LIKE ?", '%uncredited%')
I'll get undefined method actor2role because, even though the relationships have been established in Rails, they work for an instance of Actor, not the model itself.
So, in conclusion, I'm just wondering what the best way to do it is. Coming from PHP and MySQL, I tend to write a lot of these things out the SQL way and am trying to change my evil ways :)
Edit
I also have another problem: the SQL query gives me all the info from all 3 tables, while the Rails way gives me only the dvd_role table for some reason.
How can I get the data from the other 2 tables as well?
I was able to do the latter by adding .select("*") at the beginning. Is this the appropriate way?
actors_to_delete = Role.select("*").joins("LEFT JOIN `dvd_actor2role` ON dvd_role.id = dvd_actor2role.`roleId`").joins("LEFT JOIN `dvd_actor` ON dvd_actor2role.`actorId` = dvd_actor.`id`").where("dvd_role.role LIKE '%uncredited%'").group("dvd_actor.id")

mysql group_concat in where

I am having a problem with the following query (if this is a duplicate question then I'm terribly sorry, but I can't seem to find anything yet that can help me):
SELECT d.*, GROUP_CONCAT(g.name ORDER BY g.name SEPARATOR ", ") AS members
FROM table_d AS d LEFT OUTER JOIN table_g AS g ON (d.eventid = g.id)
WHERE members LIKE '%p%';
MySQL apparently can't handle a comparison of GROUP_CONCAT columns in a WHERE clause.
So my question is very simple. Is there a workaround for this, like using subqueries or something similar? I really need this piece of code to work, and there is not really any alternative other than handling this in the query itself.
EDIT 1:
I won't show the actual code as it might be confidential; I'll have to check with my peers. Anyway, I just wrote this code to give you an impression of what the statement looks like, although I agree with you that it doesn't make a lot of sense. I'm going to check the answers below in a minute and get back to you then. Again, thanks for all the help already!
EDIT 2:
I tried using HAVING, but that only works when I'm not using GROUP BY. When I try it, it gives me a syntax error, but when I remove the GROUP BY the query works perfectly. The thing is, I need the GROUP BY; otherwise the query would be meaningless to me.
EDIT 3:
Ok, so I made a stupid mistake and put HAVING before GROUP BY, which obviously doesn't work. Thanks for all the help, it works now!
Use HAVING instead of WHERE.
... HAVING members LIKE '%peter%'
WHERE applies the filter before the GROUP_CONCAT is evaluated; HAVING applies it later.
Edit: I find your query a bit confusing. It looks like it's going to return only one row with all of your names in a single string, unless there's nobody in your database named Peter, in which case the query will return nothing.
Perhaps HAVING isn't really what you need here...
Try
SELECT ...
...
WHERE g.name = 'peter'
instead. Since you're just doing a simple name lookup, there's no need to search the derived field - just match on the underlying original field.
GROUP_CONCAT is an aggregate function. You have to GROUP BY something. If you just want all the rows that have %peter% in them try
SELECT d.*, g.name
FROM table_d AS d
LEFT OUTER JOIN table_g AS g
ON (d.eventid = g.id)
WHERE g.name LIKE '%peter%';
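The HAVING approach, with GROUP BY placed before HAVING as the asker eventually found, can be sketched like this. It is an illustration under assumptions: Python's built-in sqlite3 module stands in for MySQL (SQLite's GROUP_CONCAT takes the separator as a second argument instead of the SEPARATOR keyword), and the events and attendees tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE attendees (eventid INTEGER, name TEXT);
    INSERT INTO events VALUES (1, 'meetup'), (2, 'workshop');
    INSERT INTO attendees VALUES (1, 'peter'), (1, 'anna'), (2, 'bob');
""")

# WHERE cannot see the aggregate, so the filter on the concatenated
# string goes into HAVING, which runs after the groups are built.
rows = conn.execute(
    """SELECT e.id, e.title, GROUP_CONCAT(a.name, ', ') AS member_names
       FROM events AS e
       LEFT JOIN attendees AS a ON a.eventid = e.id
       GROUP BY e.id
       HAVING member_names LIKE ?""",
    ("%peter%",),
).fetchall()

for event_id, title, member_names in rows:
    print(event_id, title, member_names)
```

Only the event that has an attendee matching the pattern comes back, and its member_names string still lists every attendee of that event. Filtering with WHERE a.name LIKE ... instead would keep the event but drop the other names from the concatenated string.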

Linq2sql Optimizing Left join to get items that exist only in 1 container

I want to get items from one container that don't exist in another. One container is IEnumerable, and another is an entity in DB. For example
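// List<int> implements IEnumerable<int> and, unlike the interface, has Add.
List<int> ids = new List<int>();
ids.Add(1);
ids.Add(2);
ids.Add(3);
using (MyObjectContext ctx = new MyObjectContext())
{
    // Except fetches every Users.id from the database and filters in memory.
    var filtered_ids = ids.Except(from u in ctx.Users select u.id);
}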
This approach works, but I realized that underlying sql is something like SELECT id FROM [Users]. That is not what I want. Changing it to
var filtered_ids = ids.Except(from u in ctx.Users
where ids.Contains(u.id)
select u.id);
improves the underlying query by adding WHERE [id] IN (...), which seems way better.
I have 2 questions:
Is it possible to improve performance any further for this query?
As far as I remember, there is a limit on how many parameters can be in IN. Will my current query work if I exceed the limit (which is not very likely to happen, but it's better to be prepared)?
That query should be fine, provided proper indexes/primary keys are in place.
The upper limit on SQL parameters accepted by SQL Server is around 2100. If you exceed the limit, you will get a SQL exception instead of results.
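If the id list could ever grow past the parameter limit, one option is to split it into batches and combine the results on the client. The sketch below is an assumption-laden illustration rather than LINQ code: it uses Python's built-in sqlite3 module, a hypothetical users table, a made-up helper name missing_ids, and an arbitrary batch size of 500 chosen only because it is well below the roughly 2100 parameters mentioned above.

```python
import sqlite3


def missing_ids(conn, ids, batch_size=500):
    """Return the ids that have no row in users, querying in batches."""
    ids = list(ids)
    found = set()
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        # One placeholder per id keeps each statement under the limit.
        placeholders = ", ".join("?" for _ in batch)
        query = f"SELECT id FROM users WHERE id IN ({placeholders})"
        found.update(row[0] for row in conn.execute(query, batch))
    # Same result as ids.Except(...), preserving the input order.
    return [i for i in ids if i not in found]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
# Hypothetical data: only the even ids exist in the table.
conn.executemany("INSERT INTO users VALUES (?)", [(n,) for n in range(0, 3000, 2)])

missing = missing_ids(conn, range(3000))
print(len(missing), missing[:5])
```

Each round trip still benefits from the primary key lookup, and the number of round trips grows only with the size of the input list divided by the batch size.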

Practical limit to length of SQL query (specifically MySQL)

Is it particularly bad to have a very, very large SQL query with lots of (potentially redundant) WHERE clauses?
For example, here's a query I've generated from my web application with everything turned off, which should be the largest possible query for this program to generate:
SELECT *
FROM 4e_magic_items
INNER JOIN 4e_magic_item_levels
ON 4e_magic_items.id = 4e_magic_item_levels.itemid
INNER JOIN 4e_monster_sources
ON 4e_magic_items.source = 4e_monster_sources.id
WHERE (itemlevel BETWEEN 1 AND 30)
AND source!=16 AND source!=2 AND source!=5
AND source!=13 AND source!=15 AND source!=3
AND source!=4 AND source!=12 AND source!=7
AND source!=14 AND source!=11 AND source!=10
AND source!=8 AND source!=1 AND source!=6
AND source!=9 AND type!='Arms' AND type!='Feet'
AND type!='Hands' AND type!='Head'
AND type!='Neck' AND type!='Orb'
AND type!='Potion' AND type!='Ring'
AND type!='Rod' AND type!='Staff'
AND type!='Symbol' AND type!='Waist'
AND type!='Wand' AND type!='Wondrous Item'
AND type!='Alchemical Item' AND type!='Elixir'
AND type!='Reagent' AND type!='Whetstone'
AND type!='Other Consumable' AND type!='Companion'
AND type!='Mount' AND (type!='Armor' OR (false ))
AND (type!='Weapon' OR (false ))
ORDER BY type ASC, itemlevel ASC, name ASC
It seems to work well enough, but it's also not particularly high traffic (a few hundred hits a day or so), and I wonder if it would be worth the effort to try and optimize the queries to remove redundancies and such.
Reading your query makes me want to play an RPG.
This is definitely not too long. As long as they are well formatted, I'd say a practical limit is about 100 lines. After that, you're better off breaking subqueries into views just to keep your eyes from crossing.
I've worked with some queries that are 1000+ lines, and that's hard to debug.
By the way, may I suggest a reformatted version? This is mostly to demonstrate the importance of formatting; I trust this will be easier to understand.
select *
from
4e_magic_items mi
,4e_magic_item_levels mil
,4e_monster_sources ms
where mi.id = mil.itemid
and mi.source = ms.id
and itemlevel between 1 and 30
and source not in(16,2,5,13,15,3,4,12,7,14,11,10,8,1,6,9)
and type not in(
'Arms' ,'Feet' ,'Hands' ,'Head' ,'Neck' ,'Orb' ,
'Potion' ,'Ring' ,'Rod' ,'Staff' ,'Symbol' ,'Waist' ,
'Wand' ,'Wondrous Item' ,'Alchemical Item' ,'Elixir' ,
'Reagent' ,'Whetstone' ,'Other Consumable' ,'Companion' ,
'Mount'
)
and ((type != 'Armor') or (false))
and ((type != 'Weapon') or (false))
order by
type asc
,itemlevel asc
,name asc
/*
Some thoughts:
==============
0 - Formatting really matters, in SQL even more than most languages.
1 - consider selecting only the columns you need, not "*"
2 - use of table aliases makes it short & clear ("MI", "MIL" in my example)
3 - joins in the WHERE clause will un-clutter your FROM clause
4 - use NOT IN for long lists
5 - logically, the last two lines can be added to the "type not in" section.
I'm not sure why you have the "or false", but I'll assume some good reason
and leave them here.
*/
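Since the question's query is generated by a web application, the NOT IN form from point 4 can be built from a list instead of chaining one != per value. A minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL; the items table, its rows, and the not_in helper are hypothetical, and the column names passed to the helper are expected to come from the code, never from user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, type TEXT, source INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("Flame Wand", "Wand", 1), ("Iron Ring", "Ring", 2),
     ("Plate", "Armor", 3), ("Dagger", "Weapon", 16)],
)

excluded_types = ["Wand", "Ring"]  # the filters that are "turned off"
excluded_sources = [16]


def not_in(column, values):
    """Build 'column NOT IN (?, ?, ...)', or a no-op for an empty list."""
    if not values:
        return "1 = 1"
    return f"{column} NOT IN ({', '.join('?' for _ in values)})"


where = " AND ".join(
    [not_in("type", excluded_types), not_in("source", excluded_sources)]
)
query = f"SELECT name FROM items WHERE {where} ORDER BY name"
names = [row[0] for row in conn.execute(query, excluded_types + excluded_sources)]

# The chained form from the question, for comparison.
chained = [row[0] for row in conn.execute(
    "SELECT name FROM items "
    "WHERE type != 'Wand' AND type != 'Ring' AND source != 16 ORDER BY name"
)]

print(names, chained)  # ['Plate'] ['Plate']
```

Both forms return the same rows; the generated one is shorter and keeps the values out of the SQL text.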
Default MySQL 5.0 server limitation is "1MB", configurable up to 1GB.
This is configured via the max_allowed_packet setting on both client and server, and the effective limit is the lesser of the two.
Caveats:
It's likely that this "packet" limit does not map directly to characters in a SQL statement. You will want to take into account character encoding within the client, some packet metadata, etc.
SELECT @@global.max_allowed_packet
This is the only real limit, and it's adjustable on the server, so there is no real straight answer.
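To get a feel for how far a generated statement is from that limit, its encoded size can be compared against the configured value. This is a rough sketch only: the 1 MB figure is the default quoted above, the 1024-byte headroom for protocol overhead is a guess rather than a documented number, and the helper names are made up.

```python
DEFAULT_MAX_ALLOWED_PACKET = 1024 * 1024  # the 1 MB default quoted above, in bytes


def statement_size(sql, encoding="utf-8"):
    """Size of the statement text in bytes once encoded."""
    return len(sql.encode(encoding))


def fits(sql, limit=DEFAULT_MAX_ALLOWED_PACKET, headroom=1024):
    """Rough check that leaves headroom for overhead (the 1024 is a guess)."""
    return statement_size(sql) + headroom <= limit


# A statement with a thousand ids in its IN list, similar in shape
# to the generated queries discussed in this thread.
ids = ", ".join(str(n) for n in range(20000, 21000))
sql = f"SELECT * FROM objects WHERE objectID IN ({ids})"

size = statement_size(sql)
print(size, fits(sql))
```

Even a thousand-element IN list is only a few kilobytes, so for queries like the ones here the packet limit is unlikely to be the practical constraint; readability and maintainability are.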
From a practical perspective, I generally consider any SELECT that ends up taking more than 10 lines to write (putting each clause/condition on a separate line) to be too long to easily maintain. At this point, it should probably be done as a stored procedure of some sort, or I should try to find a better way to express the same concept--possibly by creating an intermediate table to capture some relationship I seem to be frequently querying.
Your mileage may vary, and there are some exceptionally long queries that have a good reason to be. But my rule of thumb is 10 lines.
Example (mildly improper SQL):
SELECT x, y, z
FROM a, b
WHERE fiz = 1
AND foo = 2
AND a.x = b.y
AND b.z IN (SELECT q, r, s, t
FROM c, d, e
WHERE c.q = d.r
AND d.s = e.t
AND c.gar IS NOT NULL)
ORDER BY b.gonk
This is probably too large; optimizing, however, would depend largely on context.
Just remember, the longer and more complex the query, the harder it's going to be to maintain.
Most databases support stored procedures to avoid this issue. If your code is fast enough to execute and easy to read, you don't want to have to change it in order to get the compile time down.
An alternative is to use prepared statements, so you take the parsing hit only once per client connection and then pass in only the parameters for each call.
I'm assuming you mean by 'turned off' that a field doesn't have a value?
Instead of checking if something is not this, and it's also not that etc. can't you just check if the field is null? Or set the field to 'off', and check if type or whatever equals 'off'.