IN() is usually applied like this:
SELECT eid FROM comments WHERE id IN (1,2,3,4,5,6)
Would this generate an error or is it just syntactically bad?
SELECT eid FROM comments WHERE id IN (6)
It will work as expected. Most probably under the hood it will be optimised as WHERE id = 6 anyway.
No, it won't generate an error; it will work correctly,
because the point of the IN clause is to check whether a value exists in a defined list.
In your case this list contains only one value (6).
No, it will not raise an error. The MySQL optimizer knows that id IN (6) is equivalent to id = 6 and handles it that way.
SELECT eid FROM comments WHERE id IN (6)
Will be rewritten/handled after optimizing as
/* select#1 */ select test.comments.eid AS eid from test.comments where (
test.comments.id = 6)
see demo
MySQL IN() function finds a match in the given arguments.
Syntax:
expr IN (value,...)
The function returns 1 if expr is equal to any of the values in the IN
list, otherwise, returns 0. If all values are constants, they are
evaluated according to the type of expr and sorted. The search for the
item then is done using a binary search. This means IN is very quick
if the IN value list consists entirely of constants. Otherwise, type
conversion takes place according to the rules.
In your case, if you are concerned about the performance of IN() with one element versus =, there is no significant difference between the two statements: the MySQL optimizer transforms the IN into = when the list has just one element.
Something like-
SELECT eid FROM comments WHERE id IN (6)
to
SELECT eid FROM comments WHERE id = 6
Performance can become an issue when IN() contains many elements. You can try EXPLAIN to see the difference. See HERE
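As a quick sanity check of the equivalence described above, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are made up for illustration. It only shows that both forms return the same rows; it says nothing about how the MySQL optimizer rewrites the query.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, eid INTEGER)")
conn.executemany(
    "INSERT INTO comments (id, eid) VALUES (?, ?)",
    [(5, 50), (6, 60), (7, 70)],  # made-up sample rows
)


def fetch(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# A one-element IN list and a plain equality comparison.
rows_in = fetch("SELECT eid FROM comments WHERE id IN (6)")
rows_eq = fetch("SELECT eid FROM comments WHERE id = 6")

print(rows_in)
print(rows_eq)
```

Both queries are accepted without error and return the same single row.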
Related
When I run the following query, I am returned two entries with duplicate results. Why are duplicate results returned when I’m using distinct here? The primary keys are the house number, street name, and unit number.
SELECT distinct
house_num,
Street_name,
Unit_Designator,
Unit_Num
FROM voterinfo.voter_info
WHERE house_num = 420
AND street_name = "PARK"
AND Unit_Num = ''
AND Unit_Designator = '';
select distinct is a statement that ensures that the result set has no duplicate rows. That is, it filters out rows where every column is the same (and NULL values are considered equal).
It does not look at a subset of columns.
Sometimes, people use select distinct and don't realize that it applies to all columns. It is rather amusing when the first column is in parentheses -- as if parentheses make a difference (they don't).
Then, you might also have situations where values look the same but are not.
Consider this simple example, where the values differ only by a space at the end of the string:
select distinct x
from (select 'a' as x union all
select 'a '
) y;
Here is a db<>fiddle with this example.
This returns two rows, not 1.
Without sample data it is hard to say which of these situations you are referring to. But the rows that you think are "identical" really are not.
For fields with a CHAR or similar datatype (Street_name, Unit_Designator), there may be spaces that aren't visible in the query editor; remove them by applying appropriate trimming logic. Please refer to the link below:
MySQL select fields containing leading or trailing whitespace
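One way to look for invisible whitespace is to compare each value with its trimmed form. The sketch below assumes Python's sqlite3 module as a stand-in for MySQL, with a cut-down, made-up version of the voter_info table. Comparison rules for trailing spaces can differ between engines and collations, so treat the DISTINCT result here as illustrative only.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE voter_info (house_num INTEGER, street_name TEXT)")
conn.executemany(
    "INSERT INTO voter_info VALUES (?, ?)",
    [(420, "PARK"), (420, "PARK ")],  # second value has a trailing space
)

# DISTINCT compares whole rows, so the padded value counts as different here.
distinct_rows = conn.execute(
    "SELECT DISTINCT house_num, street_name FROM voter_info"
).fetchall()

# Find values that change when trimmed, i.e. have leading/trailing spaces.
padded_rows = conn.execute(
    "SELECT street_name, LENGTH(street_name) FROM voter_info "
    "WHERE street_name <> TRIM(street_name)"
).fetchall()

# Trimming inside the query collapses the two rows into one.
trimmed_rows = conn.execute(
    "SELECT DISTINCT house_num, TRIM(street_name) FROM voter_info"
).fetchall()

print(len(distinct_rows), padded_rows, trimmed_rows)
```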
I have a simple query with a few rows and multiple criteria in the where clause but it is only returning one row instead of 13. No joins and the syntax was triple checked and appears to be free of errors.
Query:
select column1, column2, column3
from mydb
where onecolumn in (number1, number2....number13)
Results:
returns one row of data associated with a random number in the where clause
I spent a big part of the day trying to figure this one out and am now out of ideas. Please help...
Absent a more detailed test case, and the actual SQL statement that is actually running, this question cannot be answered. Here are some "ideas"...
Our first guess is that the rows you think are going to satisfy the predicates aren't actually satisfying all of the conditions.
Our second guess is that you've got an aggregate expression (COUNT(), MAX(), SUM()) in the SELECT list that's causing an implicit GROUP BY. This is a common "gotcha" with the non-standard MySQL extension to GROUP BY, which allows non-aggregates to appear in the SELECT list without also being included as expressions in the GROUP BY clause. The same gotcha appears when the GROUP BY clause is omitted entirely and an aggregate is included in the SELECT list.
But the question doesn't make any mention of an aggregate expression in the SELECT list.
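To illustrate the second guess, here is a small sketch of how an aggregate in the SELECT list collapses the result to one row. It assumes Python's sqlite3 module as a stand-in (SQLite also tolerates non-aggregated columns next to an aggregate), and the table contents are made up. Recent MySQL versions may reject such a query when the ONLY_FULL_GROUP_BY SQL mode is enabled.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; 13 made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydb (onecolumn INTEGER, column1 TEXT)")
conn.executemany(
    "INSERT INTO mydb VALUES (?, ?)",
    [(n, f"row{n}") for n in range(1, 14)],
)

# Without an aggregate: one result row per matching table row.
plain = conn.execute(
    "SELECT column1 FROM mydb WHERE onecolumn IN (1, 2, 3)"
).fetchall()

# With an aggregate and no GROUP BY: the whole result is a single group,
# and column1 comes from an arbitrary row of that group.
with_agg = conn.execute(
    "SELECT column1, COUNT(*) FROM mydb WHERE onecolumn IN (1, 2, 3)"
).fetchall()

print(len(plain), len(with_agg))
```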
Our third guess is another issue that beginners frequently overlook: the order of precedence of operations, especially AND and OR. For example, consider the expressions:
a AND b OR c
a AND ( b OR c )
( a AND b ) OR c
consider those while we sing-along, Sesame Street style,...: "One of these things is not like the others, one of these things just doesn't belong..."
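For anyone who wants to check which of the three expressions is the odd one out, the following sketch evaluates them for every combination of truth values. It assumes Python's sqlite3 module as a stand-in for MySQL; the helper name evaluate is made up for this example.

```python
import sqlite3
from itertools import product

# Assumption: SQLite stands in for MySQL; AND binds tighter than OR in both.
conn = sqlite3.connect(":memory:")


def evaluate(expr, a, b, c):
    """Evaluate a boolean SQL expression with a, b, c bound as 0 or 1."""
    sql = f"SELECT {expr} FROM (SELECT ? AS a, ? AS b, ? AS c)"
    return conn.execute(sql, (a, b, c)).fetchone()[0]


combos = list(product([0, 1], repeat=3))
bare = [evaluate("a AND b OR c", *v) for v in combos]
and_first = [evaluate("(a AND b) OR c", *v) for v in combos]
or_first = [evaluate("a AND (b OR c)", *v) for v in combos]

# The unparenthesized form matches (a AND b) OR c, not a AND (b OR c).
print(bare == and_first, bare == or_first)
```

The forms differ, for instance, when a and b are false and c is true.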
A fourth guess... if it weren't for the row being returned having a value of onecolumn that is a random number in the IN list... if it were instead the first number in the IN list, we'd be very suspicious that the IN list actually contains a single string value that looks like a list of values, but is actually not.
The two expressions in the SELECT list look very similar, but they are very different:
SELECT t.n IN (2,3,5,7) AS n_in_list
, t.n IN ('2,3,5,7') AS n_in_string
FROM ( SELECT 2 AS n
UNION ALL SELECT 3
UNION ALL SELECT 5
) t
The first expression is comparing n to each value in a list of four values.
The second expression is equivalent to t.n IN (2).
This is a frequent trip up when neophytes are dynamically creating SQL text, thinking that they can pass in a string value and that MySQL will see the commas within the string as part of the SQL statement.
(But this doesn't explain how the row returned matches a random value in the list.)
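When the SQL text is built dynamically, one common way to avoid the single-string trap is to generate one placeholder per value. The sketch below assumes Python's sqlite3 module as a stand-in for a MySQL driver; the helper select_in and the table t are made up for illustration. Note that SQLite matches nothing for the single string, whereas the MySQL behavior described above matches the leading number. Either way, the string is not treated as a list.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table t mirrors the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(2,), (3,), (5,)])


def select_in(values):
    """Return rows of t whose n is in values, using one placeholder per value."""
    if not values:
        return []  # an empty IN () list is rejected by many engines
    placeholders = ", ".join("?" for _ in values)
    sql = f"SELECT n FROM t WHERE n IN ({placeholders}) ORDER BY n"
    return conn.execute(sql, list(values)).fetchall()


# Four separate values: compared one by one against n.
as_list = select_in([2, 3, 5, 7])

# One string that merely looks like a list: bound as a single value.
as_string = conn.execute(
    "SELECT n FROM t WHERE n IN (?) ORDER BY n", ("2,3,5,7",)
).fetchall()

print(as_list, as_string)
```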
Those are all just guesses. Those are some of the most frequent trip ups we see, but we're just guessing. It could be something else entirely. In its current form, there is no definitive "answer" to the question.
I faced this question in an interview. They asked whether there is any hierarchy.
Ex: SELECT * FROM invoice WHERE invoiceID=100 AND grossAmount>2000 AND customerName= 'Adam'
Is there a special hierarchy to add those 3 conditions? Something Like check numeric condition first?
Please give me your opinion.
The query optimizer will look at the conditions in the WHERE clause and evaluate them in whatever order it finds that ensures correctness and takes advantage of indexes and other information about the DB.
For example, if you had an index on invoiceID it might evaluate that first so that it had fewer rows to examine in checking customerName and grossAmount.
Your example is all 'AND' clauses so there is no precedence involved.
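A small sketch of the point that the written order of AND-ed conditions does not affect the result. It assumes Python's sqlite3 module as a stand-in for the database, and the invoice rows are made up. It checks results only and does not show which condition the optimizer evaluates first.

```python
import sqlite3
from itertools import permutations

# Assumption: SQLite stands in for the real database; rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (invoiceID INTEGER, grossAmount REAL, customerName TEXT)"
)
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [
        (100, 2500.0, "Adam"),  # satisfies all three conditions
        (100, 1500.0, "Adam"),  # grossAmount too low
        (101, 3000.0, "Adam"),  # wrong invoiceID
        (100, 2600.0, "Eve"),   # wrong customerName
    ],
)

conditions = ["invoiceID = 100", "grossAmount > 2000", "customerName = 'Adam'"]


def run(order):
    """Run the query with the conditions AND-ed together in the given order."""
    where = " AND ".join(order)
    return conn.execute(f"SELECT * FROM invoice WHERE {where}").fetchall()


# Try all six orderings of the three conditions.
results = [run(order) for order in permutations(conditions)]
print(all(r == results[0] for r in results))
```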
Here is the official documentation on Oracle's website
In your case, the query will run as it's written, since = and > have the same operator precedence.
SELECT * FROM invoice WHERE (invoiceID=100 AND grossAmount>2000 AND customerName= 'Adam')
If it was an OR clause
SELECT * FROM invoice WHERE (invoiceID=100) OR (grossAmount>2000 AND customerName= 'Adam')
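Then the AND would be evaluated first and then the OR, because AND has higher precedence than OR. Only when the logical operators are all the same does the precedence of operators such as = and + come into play. Check the documentation for the full order.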
This is a simple question about efficiency specifically related to the MySQL implementation. I want to just check if a table is empty (and if it is empty, populate it with the default data). Would it be best to use a statement like SELECT COUNT(*) FROM `table` and then compare to 0, or would it be better to do a statement like SELECT `id` FROM `table` LIMIT 0,1 then check if any results were returned (the result set has next)?
Although I need this for a project I am working on, I am also interested in how MySQL works with those two statements and whether the reason people seem to suggest using COUNT(*) is because the result is cached or whether it actually goes through every row and adds to a count as it would intuitively seem to me.
You should definitely go with the second query rather than the first.
When using COUNT(*), MySQL is scanning at least an index and counting the records. Even if you would wrap the call in a LEAST() (SELECT LEAST(COUNT(*), 1) FROM table;) or an IF(), MySQL will fully evaluate COUNT() before evaluating further. I don't believe MySQL caches the COUNT(*) result when InnoDB is being used.
Your second query results in only one row being read, furthermore an index is used (assuming id is part of one). Look at the documentation of your driver to find out how to check whether any rows have been returned.
By the way, the id field may be omitted from the query (MySQL will use an arbitrary index):
SELECT 1 FROM `table` LIMIT 1;
However, I think the simplest and most performant solution is the following (as indicated in Gordon's answer):
SELECT EXISTS (SELECT 1 FROM `table`);
EXISTS returns 1 if the subquery returns any rows, otherwise 0. Because of this semantic MySQL can optimize the execution properly.
Any fields listed in the subquery are ignored, thus 1 or * is commonly written.
See the MySQL Manual for more info on the EXISTS keyword and its use.
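Here is a minimal sketch of the "check if empty, then populate defaults" flow using EXISTS. It assumes Python's sqlite3 module as a stand-in for a MySQL driver; the settings table, DEFAULT_ROWS, and both helper functions are hypothetical names chosen for this example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table and defaults are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (id INTEGER PRIMARY KEY, name TEXT)")

DEFAULT_ROWS = [(1, "default-a"), (2, "default-b")]


def is_empty(connection):
    """Return True if the settings table has no rows."""
    (has_rows,) = connection.execute(
        "SELECT EXISTS (SELECT 1 FROM settings)"
    ).fetchone()
    return has_rows == 0


def populate_if_empty(connection):
    """Insert the default rows only when the table is empty."""
    if is_empty(connection):
        connection.executemany(
            "INSERT INTO settings (id, name) VALUES (?, ?)", DEFAULT_ROWS
        )
        return True
    return False


first = populate_if_empty(conn)   # table was empty, defaults inserted
second = populate_if_empty(conn)  # table now has rows, nothing inserted
row_count = conn.execute("SELECT COUNT(*) FROM settings").fetchone()[0]
print(first, second, row_count)
```

EXISTS yields a single 0 or 1, so the caller reads one value instead of checking whether a result set has a next row.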
It is better to do the second method or just exists. Specifically, something like:
if exists (select id from table)
should be the fastest way to do what you want. You don't need the limit; the SQL engine takes care of that for you.
By the way, never put identifiers (table and column names) in single quotes.
What would be the difference between doing:
SELECT person FROM population WHERE id = 1 or id = 2 or id = 3
and -
SELECT person FROM population WHERE id IN (1,2,3)
Are they executed the exact same way? What difference is there? Would there ever be a reason why one would use IN rather than multiple ='s?
There is no difference in the result; they do the same thing. IN simply shortens the query string, and such statements can help with query optimization.
One difference in these two comparison operators would be that IN uses a SET of values to compare, unlike the "=" or "<>" which takes a single value.
According to the manual:
Returns 1 if expr is equal to any of the values in the IN list, else returns 0.
If all values are constants, they are evaluated according to the type
of expr and sorted. The search for the item then is done using a
binary search. This means IN is very quick if the IN value list
consists entirely of constants.
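A short sketch comparing the two forms from the question. It assumes Python's sqlite3 module as a stand-in for MySQL, with made-up rows in the population table. It confirms only that the results match, not that the two queries are executed the same way internally.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; names in the table are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE population (id INTEGER PRIMARY KEY, person TEXT)")
conn.executemany(
    "INSERT INTO population VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Cy"), (4, "Dee")],
)

# Chained equality comparisons joined with OR.
with_or = conn.execute(
    "SELECT person FROM population WHERE id = 1 OR id = 2 OR id = 3 ORDER BY id"
).fetchall()

# The same filter written as an IN list.
with_in = conn.execute(
    "SELECT person FROM population WHERE id IN (1, 2, 3) ORDER BY id"
).fetchall()

print(with_or == with_in)
```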