Practical limit to length of SQL query (specifically MySQL) - mysql

Is it particularly bad to have a very, very large SQL query with lots of (potentially redundant) WHERE clauses?
For example, here's a query I've generated from my web application with everything turned off, which should be the largest possible query for this program to generate:
SELECT *
FROM 4e_magic_items
INNER JOIN 4e_magic_item_levels
ON 4e_magic_items.id = 4e_magic_item_levels.itemid
INNER JOIN 4e_monster_sources
ON 4e_magic_items.source = 4e_monster_sources.id
WHERE (itemlevel BETWEEN 1 AND 30)
AND source!=16 AND source!=2 AND source!=5
AND source!=13 AND source!=15 AND source!=3
AND source!=4 AND source!=12 AND source!=7
AND source!=14 AND source!=11 AND source!=10
AND source!=8 AND source!=1 AND source!=6
AND source!=9 AND type!='Arms' AND type!='Feet'
AND type!='Hands' AND type!='Head'
AND type!='Neck' AND type!='Orb'
AND type!='Potion' AND type!='Ring'
AND type!='Rod' AND type!='Staff'
AND type!='Symbol' AND type!='Waist'
AND type!='Wand' AND type!='Wondrous Item'
AND type!='Alchemical Item' AND type!='Elixir'
AND type!='Reagent' AND type!='Whetstone'
AND type!='Other Consumable' AND type!='Companion'
AND type!='Mount' AND (type!='Armor' OR (false ))
AND (type!='Weapon' OR (false ))
ORDER BY type ASC, itemlevel ASC, name ASC
It seems to work well enough, but the site is also not particularly high-traffic (a few hundred hits a day or so), and I wonder if it would be worth the effort to try to optimize the queries to remove redundancies and such.

Reading your query makes me want to play an RPG.
This is definitely not too long. As long as they are well formatted, I'd say a practical limit is about 100 lines. After that, you're better off breaking subqueries into views just to keep your eyes from crossing.
I've worked with some queries that are 1000+ lines, and that's hard to debug.
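The "break subqueries into views" advice can be sketched with a toy example (SQLite via Python here purely for illustration; the table and view names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE magic_items (id INTEGER PRIMARY KEY, name TEXT, source INTEGER);
    CREATE TABLE item_levels (itemid INTEGER, itemlevel INTEGER);
    INSERT INTO magic_items VALUES (1, 'Flaming Sword', 3), (2, 'Healing Potion', 16);
    INSERT INTO item_levels VALUES (1, 5), (2, 1);

    -- Capture the join once as a view; long queries then select from it
    CREATE VIEW item_details AS
        SELECT mi.id, mi.name, mi.source, il.itemlevel
        FROM magic_items mi
        JOIN item_levels il ON mi.id = il.itemid;
""")

# The outer query stays short because the join lives in the view
rows = conn.execute(
    "SELECT name FROM item_details WHERE itemlevel BETWEEN 1 AND 30 AND source != 16"
).fetchall()
print(rows)  # -> [('Flaming Sword',)]
```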
By the way, may I suggest a reformatted version? This is mostly to demonstrate the importance of formatting; I trust this will be easier to understand.
select *
from
4e_magic_items mi
,4e_magic_item_levels mil
,4e_monster_sources ms
where mi.id = mil.itemid
and mi.source = ms.id
and itemlevel between 1 and 30
and source not in(16,2,5,13,15,3,4,12,7,14,11,10,8,1,6,9)
and type not in(
'Arms' ,'Feet' ,'Hands' ,'Head' ,'Neck' ,'Orb' ,
'Potion' ,'Ring' ,'Rod' ,'Staff' ,'Symbol' ,'Waist' ,
'Wand' ,'Wondrous Item' ,'Alchemical Item' ,'Elixir' ,
'Reagent' ,'Whetstone' ,'Other Consumable' ,'Companion' ,
'Mount'
)
and ((type != 'Armor') or (false))
and ((type != 'Weapon') or (false))
order by
type asc
,itemlevel asc
,name asc
/*
Some thoughts:
==============
0 - Formatting really matters, in SQL even more than most languages.
1 - consider selecting only the columns you need, not "*"
2 - use of table aliases makes it short & clear ("MI", "MIL" in my example)
3 - joins in the WHERE clause will un-clutter your FROM clause
4 - use NOT IN for long lists
5 - logically, the last two lines can be added to the "type not in" section.
I'm not sure why you have the "or false", but I'll assume some good reason
and leave them here.
*/

The default MySQL 5.0 server limit is 1MB, configurable up to 1GB.
This is configured via the max_allowed_packet setting on both client and server, and the effective limit is the lesser of the two.
Caveats:
It's likely that this "packet" limit does not map directly to characters in a SQL statement; you also want to take into account character encoding within the client, some packet metadata, etc.

SELECT @@global.max_allowed_packet
This is the only real limit, and since it's adjustable on a server, there is no single straight answer.
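If you do need to raise it, the setting lives in the server configuration and must also be raised on the client side; a minimal sketch of the relevant my.cnf sections (file location and the 64M value are assumptions, not recommendations):

```ini
# my.cnf -- raise the packet limit
[mysqld]
max_allowed_packet=64M   ; server-side limit

[mysql]
max_allowed_packet=64M   ; client-side limit; the effective cap is the lesser of the two
```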

From a practical perspective, I generally consider any SELECT that ends up taking more than 10 lines to write (putting each clause/condition on a separate line) to be too long to easily maintain. At this point, it should probably be done as a stored procedure of some sort, or I should try to find a better way to express the same concept--possibly by creating an intermediate table to capture some relationship I seem to be frequently querying.
Your mileage may vary, and there are some exceptionally long queries that have a good reason to be. But my rule of thumb is 10 lines.
Example (mildly improper SQL):
SELECT x, y, z
FROM a, b
WHERE fiz = 1
AND foo = 2
AND a.x = b.y
AND b.z IN (SELECT q, r, s, t
FROM c, d, e
WHERE c.q = d.r
AND d.s = e.t
AND c.gar IS NOT NULL)
ORDER BY b.gonk
This is probably too large; optimizing, however, would depend largely on context.
Just remember, the longer and more complex the query, the harder it's going to be to maintain.

Most databases support stored procedures to avoid this issue. If your code is fast enough to execute and easy to read, you don't want to have to change it in order to get the compile time down.
An alternative is to use prepared statements so you get the hit only once per client connection and then pass in only the parameters for each call
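The prepared-statement suggestion looks roughly like this in application code (sketched with Python's sqlite3 module; a MySQL connector's parameterized API works the same way, and the table here is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, type TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(1, 'Armor'), (2, 'Weapon'), (3, 'Ring')])

# The statement text stays constant; only the parameters change per call,
# so the driver/server can reuse the parsed statement.
query = "SELECT id FROM items WHERE type = ?"
for wanted in ('Armor', 'Ring'):
    rows = conn.execute(query, (wanted,)).fetchall()
    print(wanted, rows)
```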

I'm assuming that by 'turned off' you mean a field doesn't have a value?
Instead of checking that something is not this and also not that, etc., can't you just check whether the field is null? Or set the field to 'off', and check whether type (or whatever) equals 'off'.

Related

How to get the time that a value remains above a limit

I have a thermometer which is storing all the reading data.
Example:
http://sqlfiddle.com/#!9/c0aab3/9
The idea is to obtain the time that remained with temperature above 85 fahrenheit.
I have tried everything I know and have not been able to find the solution.
Currently, what I'm doing is getting the time when the reading went above 85 and then finding the next reading below 85 to calculate the difference in time.
If the temperature stays at 85 for 5 consecutive minutes, that approach can fail.
Please, what would be the way to calculate this?
In the sqlfiddle example the readings shown are greater than or equal to 85, but in some cases the temperature dropped below 85 and was not maintained.
Each run, from the first reading above 85 to the first reading below it, must be taken and its duration calculated in seconds, successively until the end.
Then add up all the seconds to get the total time.
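In outline this is a "gaps and islands" problem: find each contiguous run of readings at or above the threshold and sum the run durations. A minimal sketch of that logic in plain Python (invented sample data; timestamps in seconds):

```python
# Each reading: (timestamp_in_seconds, temperature)
readings = [(0, 80), (60, 86), (120, 90), (180, 84), (240, 87), (300, 88), (360, 82)]

THRESHOLD = 85
total = 0          # total seconds spent at/above the threshold
run_start = None   # timestamp where the current hot run began

for ts, temp in readings:
    if temp >= THRESHOLD and run_start is None:
        run_start = ts                 # a hot run begins
    elif temp < THRESHOLD and run_start is not None:
        total += ts - run_start        # run ended at this cooler reading
        run_start = None

print(total)  # 60->180 is 120s, 240->360 is 120s: 240 seconds total
```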
Base answer (no modification on the table)
I could find a way around with variables and some IF functions that manipulate them. See if this works for you :
SET @currIndex = 1;
SET @indicator = FALSE;
SET @prevIndex = 0;
SELECT Q2.sen,
       MIN(Q2.subTime) AS 'From',
       MAX(Q2.subTime) AS 'To',
       TIMEDIFF(MAX(Q2.subTime), MIN(Q2.subTime)) AS diff
FROM (SELECT IF(Q1.temp < 85,
                IF(@prevIndex = @currIndex,
                   (@currIndex := @currIndex + 1) - 1,
                   @currIndex),
                @prevIndex := @currIndex) AS 'Ind',
             Q1.subTime,
             Q1.temp,
             Q1.sen
      FROM (SELECT IF(sen_temp.temp < 85,
                      (@indicator XOR (@indicator := FALSE)),
                      @indicator := TRUE) AS ind,
                   s_time AS subTime,
                   temp,
                   sen
            FROM sen_temp
           ) AS Q1
      WHERE Q1.ind
     ) AS Q2
GROUP BY Q2.`Ind`
ORDER BY Q2.subTime;
Here's an SQL fiddle based on your own.
The main problem of this query is its performance. Since there is no ID on the table, data had to be carried through the queries.
Performance answer (table modification required)
After a lot of optimization work, I ended up adding an ID to the table. It allowed me to have only one sub query instead of 2 and to carry less data in the sub query.
SET @indicator = FALSE;
SET @currentIndex = 0;
SELECT T1.sen, MIN(T1.s_time) AS 'From', MAX(T1.s_time) AS 'To',
       TIMEDIFF(MAX(T1.s_time), MIN(T1.s_time)) AS diff
FROM (SELECT id, (CASE WHEN (temp >= 85) THEN
                          @currentIndex + (@indicator := TRUE)
                       WHEN (@indicator) THEN
                          @currentIndex := @currentIndex + (@indicator := FALSE) + 1
                       ELSE
                          @indicator := FALSE
                  END) AS ind
      FROM sen_temp
      ORDER BY id, s_date, s_time) AS Q1
INNER JOIN sen_temp AS T1 ON Q1.id = T1.id
WHERE Q1.ind > 0
GROUP BY T1.sen, Q1.ind
Please check this fiddle for this more efficient version.
Performance difference explanation
When creating a MySQL Query, performance is always key. If it is simple, the query will execute efficiently and you should not have any problem unless you get into some syntax error or other optimization problems like filtering or ordering data on a field that's not indexed.
When we create sub-queries, it's harder for the database to handle the query. The reason is simple: sub-queries potentially use more RAM. When a query containing sub-queries is executed, the sub-queries are executed first ("obviously!" you might say). The server then needs to keep their results available for the outer query, so it effectively creates a temporary table in RAM, allowing itself to consult that data while running the outer query. Even though RAM is quite fast, things can slow down considerably when a ludicrous amount of data is involved. It is even worse when the query makes the database server handle so much data that it won't even fit in RAM, forcing the server to fall back on the much slower disk storage.
If we limit the amount of data generated in sub-queries to the minimum and then only recover wanted data in the main query, the amount of RAM the server uses for the sub-queries is more negligible and the query runs faster.
Therefore, the smaller the data amount is in sub-queries, the faster the whole query will execute. This much is true for pretty much all "SQL like" databases.
Here's a nice reading explaining how queries are optimized

Error: MySQL client ran out of memory

Can anyone please advise me on this error...
The database has 40,000 news stories, but only the 'story' field is large;
'old' is a numeric value, 0 or 1,
and 'title' and 'shortstory' are very short or NULL.
Any advice appreciated. This is the result of running a search query against the database:
Error: MySQL client ran out of memory
Statement: SELECT news30_access.usehtml, old, title, story, shortstory, news30_access.name AS accessname, news30_users.user AS authorname, timestamp, news30_story.id AS newsid FROM news30_story LEFT JOIN news30_users ON news30_story.author = news30_users.uid LEFT JOIN news30_access ON news30_users.uid = news30_access.uid WHERE title LIKE ? OR story LIKE ? OR shortstory LIKE ? OR news30_users.user LIKE ? ORDER BY timestamp DESC
The simple answer is: don't use story in the SELECT clause.
If you want the story, then limit the number of results being returned. Start with, say, 100 results by adding:
limit 100
to the end of the query. This will get the 100 most recent stories.
I also note that you are using like with story as well as other string columns. You probably want to be using match with a full text index. This doesn't solve your immediate problem (which is returning too much data to the client). But, it will make your queries run faster.
To learn about full text search, start with the documentation.
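The "return less data" advice amounts to dropping the big column from the SELECT list and capping the row count; a sketch of the effect (SQLite via Python purely for illustration, with invented table contents):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE news30_story (id INTEGER, title TEXT, story TEXT)")
# 1000 rows, each with a ~10 KB story column
conn.executemany("INSERT INTO news30_story VALUES (?, ?, ?)",
                 [(i, f"title {i}", "x" * 10_000) for i in range(1000)])

# Leaving the huge 'story' column out of the SELECT list and capping the
# result set keeps the client-side memory footprint small.
rows = conn.execute(
    "SELECT id, title FROM news30_story WHERE title LIKE ? ORDER BY id DESC LIMIT 100",
    ("%title%",),
).fetchall()
print(len(rows), rows[0])  # 100 rows, newest first
```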

mySQL: Can one rely on the implicit ORDER BY done by mySQL when using an IN-Statement?

I just noticed that, when I execute the following query:
SELECT * FROM tbl WHERE some_key = 1 AND some_foreign_key IN (2,5,23,8,9);
the results come back in the same order they were given in the IN list,
e.g. the row with some_foreign_key = 2 is the first row returned,
and the one with some_foreign_key = 9 is the last, and so on.
This is exactly the opposite behaviour of what this guy describes:
MySQL WHERE IN - Ordering
Can one rely on this behaviour or modify it via some mySQL Server setting?
I know common wisdom is "no ORDER BY Clause" == "RDBMS can sort however it pleases",
but in my current Task at hand this behaviour is quite helpful (really large import)
and it would be great if i could rely on it.
EDIT: I know about the ORDER BY FIELD Trick already, just wanted to know if i can safely avoid the ORDER BY Clause by setting some config somewhere.
ORDER BY FIELD(some_foreign_key, 2, 5, 23, 8, 9)
isn't really that tough to implement - unless you're really simplifying this example. And as you already know it's the only way to be 100% sure of the output ordering.
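If you'd rather not rely on server-side ordering at all, you can also reorder the fetched rows client-side to match the IN list, which is safe regardless of server behaviour. A sketch in Python (the row data is invented):

```python
# Rows as fetched (order not guaranteed without an ORDER BY clause)
rows = [{"some_foreign_key": 23, "val": "c"},
        {"some_foreign_key": 2,  "val": "a"},
        {"some_foreign_key": 9,  "val": "e"},
        {"some_foreign_key": 5,  "val": "b"},
        {"some_foreign_key": 8,  "val": "d"}]

wanted = [2, 5, 23, 8, 9]              # same list as used in the IN (...) clause
rank = {key: i for i, key in enumerate(wanted)}
rows.sort(key=lambda r: rank[r["some_foreign_key"]])

print([r["some_foreign_key"] for r in rows])  # -> [2, 5, 23, 8, 9]
```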

MySQL order by problems

I have the following code..
echo "<form><center><input type=submit name=subs value='Submit'></center></form>";
$val = $_POST['resulta']; // this is from a textarea name='resulta'
if (isset($_POST['subs'])) // from submit name='subs'
{
    $aa = mysql_query("select max(reservno) as 'maxr' from reservation") or die(mysql_error()); // select maximum reservno
    $bb = mysql_fetch_array($aa);
    $cc = $bb['maxr'];
    $lines = explode("\n", $val);
    foreach ($lines as $line) {
        mysql_query("insert into location_list (reservno, location) values ('$cc', '$line')")
            or die(mysql_error()); // insert each line of the textarea as a separate location_list row
    }
}
If I input the following data on the textarea (assume that I have maximum reservno '00014' from reservation table),
Davao - Cebu
Cebu - Davao
then submit it, I'll have these data in my location_list table:
loc_id || reservno || location
00001 || 00014 || Davao - Cebu
00002 || 00014 || Cebu - Davao
Then this code:
$gg=mysql_query("SELECT GROUP_CONCAT(IF((@var_ctr := @var_ctr + 1) = @cnt,
                        location,
                        SUBSTRING_INDEX(location,' - ', 1)
                       )
                    ORDER BY loc_id ASC
                    SEPARATOR ' - ') AS locations
                 FROM location_list,
                      (SELECT @cnt := COUNT(1), @var_ctr := 0
                       FROM location_list
                       WHERE reservno='$cc'
                      ) dummy
                 WHERE reservno='$cc'") or die(mysql_error()); //QUERY IN QUESTION
$hh=mysql_fetch_array($gg);
$ii=$hh['locations'];
mysql_query("update reservation set itinerary = '$ii' where reservno = '$cc'")
or die(mysql_error());
is supposed to update reservation table with 'Davao - Cebu - Davao' but it's returning this instead, 'Davao - Cebu - Cebu'. I was previously helped by this forum to have this code working but now I'm facing another difficulty. Just can't get it to work. Please help me. Thanks in advance!
I got it working (without ORDER BY loc_id ASC) as long as I set loc_id to ascending under phpMyAdmin's table operations. But whenever I delete all the data it goes back to loc_id descending, so I have to reset it. It doesn't entirely solve the problem, but I guess this is as far as I can go. :)) I just have to make sure that the loc_id column is always in ascending order. Thank you everyone for your help! I really appreciate it! But if you have any better answer, like how to keep the table column always in ascending order, or a better query, etc., feel free to post it here. May God bless you all!
The database server is allowed to rewrite your query to optimize its execution. This might affect the order of the individual parts, in particular the order in which the various assignments are executed. I assume that some such reordering causes the result of the query to become undefined, in such a way that it works on sqlfiddle but not on your actual production system.
I can't put my finger on the exact location where things go wrong, but I believe that the core of the problem is the fact that SQL is intended to work on relations, but you try to abuse it for sequential programming. I suggest you retrieve the data from the database using portable SQL without any variable hackery, and then use PHP to perform any post-processing you might need. PHP is much better suited to express the ideas you're formulating, and no optimization or reordering of statements will get in your way there. And as your query currently only results in a single value, fetching multiple rows and combining them into a single value in the PHP code shouldn't increase complexity too much.
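The suggested post-processing is straightforward in application code (sketched in Python rather than PHP; the row data mirrors the location_list example above): take the part before ' - ' from every leg except the last, which is kept whole.

```python
# location_list rows for one reservation, in loc_id order
locations = ["Davao - Cebu", "Cebu - Davao"]

# Keep only the origin of each leg except the last, which is kept whole,
# then join everything with ' - '
parts = [loc.split(" - ")[0] for loc in locations[:-1]] + [locations[-1]]
itinerary = " - ".join(parts)
print(itinerary)  # -> Davao - Cebu - Davao
```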
Edit:
While discussing another answer using a similar technique (by Omesh as well, just as the answer your code is based upon), I found this in the MySQL manual:
As a general rule, you should never assign a value to a user variable
and read the value within the same statement. You might get the
results you expect, but this is not guaranteed. The order of
evaluation for expressions involving user variables is undefined and
may change based on the elements contained within a given statement;
in addition, this order is not guaranteed to be the same between
releases of the MySQL Server.
So there are no guarantees about the order in which these variable assignments are evaluated, and therefore no guarantees that the query does what you expect. It might work, but it might fail suddenly and unexpectedly. Therefore I strongly suggest you avoid this approach unless you have some reliable mechanism to check the validity of the results, or really don't care whether they are valid.

How to tune the following MySQL query?

I am using the following MySQL query, which is working fine, I mean giving me the desired output, but... let's first see the query:
select
    fl.file_ID,
    length(fl.filedesc) as l,
    case
        when fl.part_no is null
             and l > 60
            then concat(fl.fileno, ' ', left(fl.filedesc, 60), '...')
        when fl.part_no is null
             and length(fl.filedesc) <= 60
            then concat(fl.fileno, ' ', fl.filedesc)
        when fl.part_no is not null
             and length(fl.filedesc) > 60
            then concat(fl.fileno, '(', fl.part_no, ')', left(fl.filedesc, 60), '...')
        when fl.part_no is not null
             and length(fl.filedesc) <= 60
            then concat(fl.fileno, '(', fl.part_no, ')', fl.filedesc)
    end as filedesc
from filelist fl
I don't want to call the length function repeatedly because I guess it would hit the database every time, causing a performance issue. Please suggest whether I can store the length once and use it several times.
Once you have accessed a given row, what you do with the columns has only a small impact on performance. So your guess that repeated use of that length function "hits the database" more is less of a problem than you think.
The analogy I would use is a postal carrier delivering mail to your house, which is miles outside of town. He drives for 20 minutes to your mailbox, and then he worries that it takes too much time to insert one letter at a time into your mailbox, instead of all the letters at once. The cost of that inefficiency is insignificant compared to the long drive.
That said, you can make the query more concise or easier to code or to look at. But this probably won't have a big benefit for performance.
select
fl.file_ID,
concat(fl.fileno,
ifnull(concat('(',fl.part_no,')'), ' '),
left(fl.filedesc,60),
if(length(fl.filedesc)>60,'...','')
) as filedesc
from filelist fl
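For comparison, here is the same formatting rule expressed in plain Python (hypothetical sample values), which makes it easy to check that the consolidated expression matches the original four-branch CASE:

```python
def format_filedesc(fileno, part_no, filedesc, limit=60):
    # part number in parentheses if present, otherwise a single space separator
    prefix = f"({part_no})" if part_no is not None else " "
    # truncate long descriptions and mark them with an ellipsis
    body = filedesc[:limit] + ("..." if len(filedesc) > limit else "")
    return f"{fileno}{prefix}{body}"

print(format_filedesc("F-100", None, "short description"))  # -> F-100 short description
print(format_filedesc("F-101", 2, "x" * 70))                # truncated to 60 chars plus "..."
```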