I need to return a value in preference order: if a value starting with 'a' exists, it should be returned first; if not, return the value starting with 'b', and so on.
SELECT value FROM table_one
WHERE value LIKE 'a%' OR value LIKE 'b%'
   OR value LIKE 'c%' OR value LIKE 'd%' LIMIT 1;
Does the query return the values in that order? That is, will the value which starts with 'a' be returned first, and if no such value exists, the value starting with 'b', and so on?
The conditions in your WHERE clause do not determine the order of rows in your result set. Unless your query includes an ORDER BY clause, the order of rows is unpredictable: the server is free to return the rows to you in whatever order is convenient for it.
Many programmers fall into a trap here. We experiment with writing the query different ways on tiny test datasets and see different orders, so we get lulled into thinking the order is predictable. Then our code goes into production on big, and hopefully growing, datasets, and BAM! The server suddenly starts using a different query plan because the data grew, the rows come back in a different order, and some program starts to fail. Usually in the middle of the night.
It would be great if servers returned rows in random order when we don't give an ORDER BY clause. That way we'd have a chance of catching this kind of bug in testing.
If you find yourself writing a query with no ORDER BY, be afraid. Stop and think about unpredictable ordering.
Try something like this:
SELECT *
FROM [Table]
ORDER BY
    CASE
        WHEN [Value] LIKE 'a%' THEN 1
        WHEN [Value] LIKE 'b%' THEN 2
        WHEN [Value] LIKE 'c%' THEN 3
        ELSE 99
    END
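Applied to the query in the question (a sketch, using the table and column names from above), that could look like:
SELECT value
FROM table_one
WHERE value LIKE 'a%' OR value LIKE 'b%'
   OR value LIKE 'c%' OR value LIKE 'd%'
ORDER BY
    CASE
        WHEN value LIKE 'a%' THEN 1
        WHEN value LIKE 'b%' THEN 2
        WHEN value LIKE 'c%' THEN 3
        ELSE 4
    END
LIMIT 1;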
I am currently experiencing (to me) very strange behaviour in one of my MySQL 5.6 queries.
I have a given system I am trying to optimize. One step is to only select the fields necessary for the next operation.
The given query looks as follows:
SELECT oxv_oxcategories_6_fr.*
FROM oxv_oxobject2category_6 AS oxobject2category
LEFT JOIN oxv_oxcategories_6_fr ON oxv_oxcategories_6_fr.oxid =
oxobject2category.oxcatnid
WHERE oxobject2category.oxobjectid = '<hashed id>'
AND oxv_oxcategories_6_fr.oxid IS NOT NULL
AND (oxv_oxcategories_6_fr.oxactive = 1
AND oxv_oxcategories_6_fr.oxhidden = '0')
ORDER BY oxobject2category.oxtime
I have taken the liberty of using more sensible naming in my own query:
SELECT
category_view.*
FROM oxv_oxobject2category_6 category_mapping_view
LEFT JOIN oxv_oxcategories_6_fr category_view ON category_view.OXID =
category_mapping_view.OXCATNID
WHERE category_mapping_view.OXOBJECTID = '<hashed id>'
AND category_view.OXID IS NOT NULL
AND (category_view.OXACTIVE = 1
AND category_view.OXHIDDEN = '0')
ORDER BY category_mapping_view.OXTIME
As you can see, there is not much difference, only the naming is different. So far, everything works as expected. Now I am trying to only select the values I need. So the query looks like this:
SELECT
category_view.OXID,
category_view.OXTITLE
FROM oxv_oxobject2category_6 category_mapping_view
LEFT JOIN oxv_oxcategories_6_fr category_view ON category_view.OXID =
category_mapping_view.OXCATNID
WHERE category_mapping_view.OXOBJECTID = '<hashed id>'
AND category_view.OXID IS NOT NULL
AND (category_view.OXACTIVE = 1
AND category_view.OXHIDDEN = '0')
ORDER BY category_mapping_view.OXTIME;
This also works as expected. But I also need the field OXPARENTID, so I change the SELECT statement to
category_view.OXID,
category_view.OXTITLE,
category_view.OXPARENTID
Now the order of the items is different and I cannot seem to find out why that is. The new query as well as the original one both sort by OXTIME, without that field being present in the final result set. There are about 10 entries where OXTIME is 0, and it is those items that get turned around (ordering-wise) as soon as I query for OXPARENTID.
In the original query, OXPARENTID is present as well, so why does it make a difference now? I am guessing that there is some sort of ordering logic going on I do not yet know about.
Mind that both joined tables are actually views; maybe that has something to do with it. Also, OXID and OXPARENTID are both MD5-hashed values.
Any help would be greatly appreciated.
EDIT
In order to clarify: I know that the fact that multiple entries have OXTIME equal to 0 makes it impossible to predict beforehand which entry will be the top one. However, I still expected the order of the entries to be the same every time I call the query (regardless of what I am selecting).
One answer (@GordonLinoff) explains that
[...] the same query can return the results in different order on different runs
Where does this "randomness" come from?
Your ordering is:
ORDER BY category_mapping_view.OXTIME;
And then you state:
There are about 10 entries where OXTIME is 0, and it is those items that get turned around (ordering-wise) as soon as I query for OXPARENTID.
What you have are ties in the keys. The results can be in any order -- and the same query can return the results in different order on different runs. Technically, the ordering in SQL is unstable.
You can fix this by including another column in the ORDER BY so each row is uniquely defined by the ORDER BY keys. Perhaps that is OXID:
ORDER BY category_mapping_view.OXTIME, category_view.OXID;
By the way, it is "obvious" that sorting in SQL is unstable. Why? SQL tables represent unordered sets. There is no ordering to fall back on when the keys are the same.
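Applied to the query in the question (using its renamed aliases), the full statement would look something like this:
SELECT
    category_view.OXID,
    category_view.OXTITLE,
    category_view.OXPARENTID
FROM oxv_oxobject2category_6 category_mapping_view
LEFT JOIN oxv_oxcategories_6_fr category_view ON category_view.OXID =
    category_mapping_view.OXCATNID
WHERE category_mapping_view.OXOBJECTID = '<hashed id>'
AND category_view.OXID IS NOT NULL
AND (category_view.OXACTIVE = 1
AND category_view.OXHIDDEN = '0')
ORDER BY category_mapping_view.OXTIME, category_view.OXID;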
I've got a complex query I have to run in an application that is giving me some performance trouble. I've simplified it here. The database is MySQL 5.6.35 on CentOS.
SELECT a.`po_num`,
Count(*) AS item_count,
Sum(b.`quantity`) AS total_quantity,
Group_concat(`web_sku` SEPARATOR ' ') AS web_skus
FROM `order` a
INNER JOIN `order_item` b
ON a.`order_id` = b.`order_key`
WHERE `store` LIKE '%foobar%'
LIMIT 200 offset 0;
The key part of this query is where I've placed "foobar" as a placeholder. If this value is something like big_store, the query takes much longer (roughly 0.4 seconds in the query provided here, much longer in the query I'm actually using) than if the value is small_store (roughly 0.1 seconds in the query provided). big_store would return significantly more results if there were no limit.
But there is a limit, and that's what surprises me. Both datasets have more rows than the LIMIT, which is only 200. It appears to me that MySQL is performing the select functions COUNT, SUM, and GROUP_CONCAT for all big_store/small_store rows and then applies the LIMIT retroactively. I would imagine it would be best to stop once you get to 200.
Could it not perform the COUNT, SUM, and GROUP_CONCAT functions after grabbing the 200 rows it will use, making my query much, much quicker? This seems feasible to me except in cases where there's an ORDER BY on one of those columns.
Does MySQL not use LIMIT to optimize a query's select functions? If not, is there a good reason for that? If so, did I make a mistake in my thinking above?
It can stop short due to the LIMIT, but that is not a reasonable query since there is no ORDER BY.
Without ORDER BY, it will pick whatever 200 rows it feels like and stop short.
With an ORDER BY, it will have to scan the entire table that contains store (please qualify columns with which table they come from!). This is because of the leading wildcard. Only then can it trim to 200 rows.
Another problem -- without a GROUP BY, aggregates (SUM, etc) are performed across the entire table (or at least across the rows that remain after filtering). The LIMIT does not apply until after that.
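If the intent was one aggregate row per po_num rather than a single row overall, here is a sketch with a GROUP BY added (assuming store lives in order and web_sku in order_item):
SELECT a.`po_num`,
       COUNT(*) AS item_count,
       SUM(b.`quantity`) AS total_quantity,
       GROUP_CONCAT(b.`web_sku` SEPARATOR ' ') AS web_skus
FROM `order` a
INNER JOIN `order_item` b
        ON a.`order_id` = b.`order_key`
WHERE a.`store` LIKE '%foobar%'
GROUP BY a.`po_num`
LIMIT 200 OFFSET 0;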
Perhaps what you are asking about is MariaDB 5.5.21's "LIMIT_ROWS_EXAMINED".
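If that MariaDB feature is what's meant, its syntax looks roughly like this (MariaDB only, not available in MySQL 5.6; shown purely as a sketch):
SELECT a.`po_num`
FROM `order` a
WHERE a.`store` LIKE '%foobar%'
LIMIT 200 ROWS EXAMINED 10000;   -- give up after examining 10000 rows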
Think of it this way ... All of the components of a SELECT are done in the order specified by the syntax. Since LIMIT is last, it does not apply until after the other stuff is performed.
(There are a couple of exceptions: (1) SELECT col... must be done after FROM ..., since it would not know which table(s); (2) The optimizer readily reorders JOINed table and clauses in WHERE ... AND ....)
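A rough sketch of that logical order, annotated on a simplified query (assuming store lives in the order table, as discussed below):
SELECT store, COUNT(*) AS orders   -- 5. evaluate the select list and aggregates
FROM `order`                       -- 1. read the table(s)
WHERE store LIKE '%foobar%'        -- 2. filter rows
GROUP BY store                     -- 3. form groups
HAVING COUNT(*) > 1                -- 4. filter the groups
ORDER BY orders DESC               -- 6. sort the result
LIMIT 200;                         -- 7. finally, trim to 200 rows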
More details on that query.
The optimizer peeks ahead, and sees that the WHERE is filtering on order (that is where store is, yes?), so it decides to start with the table order.
It fetches all rows from order that match %foobar%.
For each such row, find the row(s) in order_item. Now it has some number of rows (possibly more than 200) with which to do the aggregates.
Perform the aggregates - COUNT, SUM, GROUP_CONCAT. (Actually this will probably be done as it gathers the rows -- another optimization.)
There is now 1 row (with an unpredictable value for a.po_num).
Skip 0 rows for the OFFSET part of the LIMIT. (OK, another out-of-order thingie.)
Deliver up to 200 rows. (There is only 1.)
Add ORDER BY (but no GROUP BY) -- big deal, sort the 1 row.
Add GROUP BY (but no ORDER BY) in, now you may have more than 200 rows coming out, and it can stop short.
Add GROUP BY and ORDER BY and they are identical, then it may have to do a sort for the grouping, but not for the ordering, and it may stop at 200 (sketched below).
Add GROUP BY and ORDER BY and they are not identical, then it may have to do a sort for the grouping, and will have to re-sort for the ordering, and cannot stop at 200 until after the ORDER BY. That is, virtually all the work is performed on all the data.
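To illustrate the identical GROUP BY/ORDER BY case, a sketch (same column assumptions as above):
SELECT a.`po_num`,
       COUNT(*) AS item_count,
       SUM(b.`quantity`) AS total_quantity
FROM `order` a
INNER JOIN `order_item` b
        ON a.`order_id` = b.`order_key`
WHERE a.`store` LIKE '%foobar%'
GROUP BY a.`po_num`
ORDER BY a.`po_num`
LIMIT 200;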
Oh, and all of this gets worse if you don't have the optimal index. Oh, did I fail to insist that you provide SHOW CREATE TABLE?
I apologize for my tone. I have thrown quite a few tips in your direction; please learn from them.
Is there any standard way to find which clause has limited the result to zero records?
For example, I have this query:
SELECT * FROM `tb` WHERE `room` > 2 AND `keywords` LIKE 'Apartment'
If this query does not return any records, how can I find which condition limited the result to zero records?
When you search for something and there is no result, some search engines show you a message like this:
Try to search without keywords
Or if you are using MATCH(city) AGAINST('tegas'), it shows you:
Did you mean texas?
During query execution, all criteria are evaluated. To determine whether one specific criterion caused the query to return zero records, you must run a separate statement for each criteria scenario.
I would suggest starting with all possible criteria and then working back based on the importance of the remaining items. This way you limit the processing in the most effective manner.
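For the example query, those separate checks could look like this (table and column names taken from the question):
SELECT COUNT(*) FROM `tb`;                                                   -- any rows at all?
SELECT COUNT(*) FROM `tb` WHERE `room` > 2;                                  -- rows surviving the room filter
SELECT COUNT(*) FROM `tb` WHERE `keywords` LIKE 'Apartment';                 -- rows surviving the keyword filter
SELECT COUNT(*) FROM `tb` WHERE `room` > 2 AND `keywords` LIKE 'Apartment';  -- both filters together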
Good Afternoon
Please can someone help me; I'm nearly a total noob. I have a very simple DB which has thousands of rows and very few columns. I have an ID, Name, Image, Information, and Date Added. Really basic!
Now I'm trying to display only a single row of data at a time, so there is no need for loops and things in this request. Sounds very simple in theory.
I can display a row in date order, by the most recent or oldest, ascending or descending. But I want to be able to display, for example:
The 6th newest entry. And perhaps somewhere else on my site the 16th most recent entry, and so on. This could even be the 1232nd most recent entry.
Sounds to me like it would be a common task, but I can't find the answer anywhere. Can someone provide me with the very short command for doing this? I'm probably missing something really daft and fundamental.
Thanks
Leah
The LIMIT clause can be used to constrain the number of rows returned by
the SELECT statement. LIMIT takes one or two numeric arguments, which
must both be nonnegative integer constants (except when using prepared
statements).
With two arguments, the first argument specifies the offset of the first
row to return, and the second specifies the maximum number of rows to
return. The offset of the initial row is 0 (not 1):
SELECT * FROM tbl LIMIT 5,10; # Retrieve rows 6-15
http://dev.mysql.com/doc/refman/5.1/en/select.html
So if you want the 1232nd most recent row from your table, you can do something like this:
SELECT * FROM tbl ORDER BY date_added DESC LIMIT 1231,1;
In your query use LIMIT, e.g.
LIMIT 5,1 -- skips the first 5 rows and retrieves one result, the 6th row.
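Applied to the question (assuming the table is tbl and the "Date Added" column is date_added), a sketch:
SELECT * FROM tbl ORDER BY date_added DESC LIMIT 5,1;    -- the 6th newest entry
SELECT * FROM tbl ORDER BY date_added DESC LIMIT 15,1;   -- the 16th newest entry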
This is going to be one of those questions but I need to ask it.
I have a large table which may or may not contain a certain unique row. I therefore need a MySQL query that will just tell me TRUE or FALSE.
With my current knowledge, I see two options (pseudo code):
[id = primary key]
OPTION 1:
SELECT id FROM table WHERE x=1 LIMIT 1
... and then determine in PHP whether a result was returned.
OPTION 2:
SELECT COUNT(id) FROM table WHERE x=1
... and then just use the count.
Is either of these preferable for any reason, or is there perhaps an even better solution?
Thanks.
If the selection criterion is truly unique (i.e. yields at most one result), you are going to see massive performance improvement by having an index on the column (or columns) involved in that criterion.
create index my_unique_index on table(x)
If you want to enforce the uniqueness, a plain index is not even an option; you must have
create unique index my_unique_index on table(x)
Having this index, querying on the unique criterion will perform very well, regardless of minor SQL tweaks like count(*), count(id), count(x), limit 1 and so on.
For clarity, I would write
select count(*) from table where x = ?
I would avoid LIMIT 1 for two other reasons:
It is non-standard SQL. I am not religious about that, use the MySQL-specific stuff where necessary (e.g. for paging data), but it is not necessary here.
If for some reason, you have more than one row of data, that is probably a serious bug in your application. With LIMIT 1, you are never going to see the problem. This is like counting dinosaurs in Jurassic Park with the assumption that the number can only possibly go down.
AFAIK, if you have an index on your ID column, both queries will have more or less equal performance. The second query will need one less line of code in your program, but that's not going to have any performance impact either.
Personally, I typically go with the first option: selecting the id and limiting to 1 row. I like this better from a coding perspective. Instead of having to actually retrieve the data, I just check the number of rows returned.
If I were to compare speeds, I would say not doing a count in MySQL would be faster. I don't have any proof, but my guess would be that MySQL has to get all of the rows and then count how many there are. Although, on second thought, it would have to do that in the first option as well so the code knows how many rows there are. But since you have COUNT(id) vs COUNT(*), I would say it might be slightly slower.
Intuitively, the first one could be faster since it can abort the table (or index) scan when it finds the first value. But you should retrieve x, not id: if the engine is using an index on x, it doesn't need to go to the block where the row actually is.
Another option could be:
select exists(select 1 from mytable where x = ?) from dual
Which already returns a boolean.
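In MySQL the FROM dual part is optional, so a minimal sketch would be:
SELECT EXISTS(SELECT 1 FROM mytable WHERE x = ?);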
Typically, you use a GROUP BY ... HAVING clause to determine if there are duplicate rows in a table. Say you have a table with an id and a name (assuming id is the primary key, and you want to know whether name is unique or repeated). You would use
select name, count(*) as total from mytable group by name having total > 1;
The above will return the names which are repeated, along with the number of times each one occurs.
If you just want one query to get your answer as true or false, you can use a nested query, e.g.
select if(count(*) >= 1, True, False) from (select name, count(*) as total from mytable group by name having total > 1) a;
The above should return true if your table has duplicate rows, otherwise false.