I faced this question in an interview. They asked whether there is any hierarchy among the conditions in a WHERE clause.
Ex: SELECT * FROM invoice WHERE invoiceID=100 AND grossAmount>2000 AND customerName= 'Adam'
Is there a special hierarchy in which those 3 conditions are applied? Something like checking the numeric conditions first?
Please give me your opinion.
The query optimizer will look at the conditions in the WHERE clause and evaluate them in whatever order it finds that ensures correctness and takes advantage of indexes and other information about the DB.
For example, if you had an index on invoiceID it might evaluate that first so that it had fewer rows to examine in checking customerName and grossAmount.
Your example is all 'AND' clauses so there is no precedence involved.
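To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database (the question does not say which one is in use). The index name idx_invoice_id and the generated rows are made up for illustration. It asks the planner how it would run the query with the conditions written in two different orders.

```python
import sqlite3

# SQLite is only a stand-in here; the rows and the index name are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (invoiceID INTEGER, grossAmount REAL, customerName TEXT)"
)
conn.execute("CREATE INDEX idx_invoice_id ON invoice (invoiceID)")
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [(i, 1000.0 + i * 20, "Adam" if i % 2 == 0 else "Eve") for i in range(1, 201)],
)


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row is the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


as_written = (
    "SELECT * FROM invoice WHERE invoiceID = 100 "
    "AND grossAmount > 2000 AND customerName = 'Adam'"
)
reordered = (
    "SELECT * FROM invoice WHERE customerName = 'Adam' "
    "AND grossAmount > 2000 AND invoiceID = 100"
)

plan_as_written = plan(as_written)
plan_reordered = plan(reordered)
print(plan_as_written)
print(plan_reordered)

# Both spellings return the same rows.
rows_as_written = conn.execute(as_written).fetchall()
rows_reordered = conn.execute(reordered).fetchall()
```

In this sketch both plans mention the index on invoiceID, whichever condition is written first. Other engines have their own planners, so check the plan with your database's own EXPLAIN facility.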
Here is the official documentation on Oracle's website
In your case, the query will run as it is written, since = and > have the same operator precedence.
SELECT * FROM invoice WHERE (invoiceID=100 AND grossAmount>2000 AND customerName= 'Adam')
If it was an OR clause
SELECT * FROM invoice WHERE (invoiceID=100) OR (grossAmount>2000 AND customerName= 'Adam')
Then the AND would be evaluated first and then the OR. Only when the logical operators are the same does it come down to the precedence of operators such as = and +. Check the documentation for the full order.
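A small sketch of the AND-before-OR rule, using Python's sqlite3 module as a stand-in (AND binds tighter than OR in standard SQL, so Oracle should behave the same way). The three sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (invoiceID INTEGER, grossAmount REAL, customerName TEXT)"
)
# Hypothetical sample rows chosen so the two groupings give different results.
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [(100, 500.0, "Bob"), (101, 3000.0, "Adam"), (102, 3000.0, "Eve")],
)


def ids(where):
    """Return the invoice IDs matching the given WHERE expression."""
    sql = "SELECT invoiceID FROM invoice WHERE " + where + " ORDER BY invoiceID"
    return [row[0] for row in conn.execute(sql)]


# No parentheses: AND is applied before OR.
unparenthesized = ids("invoiceID = 100 OR grossAmount > 2000 AND customerName = 'Adam'")
# Explicit version of the default grouping.
and_first = ids("invoiceID = 100 OR (grossAmount > 2000 AND customerName = 'Adam')")
# Forcing the OR to be applied first changes the result.
or_first = ids("(invoiceID = 100 OR grossAmount > 2000) AND customerName = 'Adam'")

print(unparenthesized)  # [100, 101]
print(and_first)        # [100, 101]
print(or_first)         # [101]
```

When AND and OR are mixed, writing the parentheses out is usually worth it for the next reader.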
I have a question regarding aliases in SQL. Can I use an alias elsewhere in the same query in which it is defined? For example:
Consider Table name xyz with column a and b
select (a/b) as temp , temp/5 from xyz
Is this possible in some way?
You are talking about giving an identifier to an expression in a query and then reusing that identifier in other parts of the query?
That is not possible in Microsoft SQL Server, which is where nearly all of my SQL experience lies. You can, however, do the following.
SELECT temp, temp / 5
FROM (
SELECT (a/b) AS temp
FROM xyz
) AS T1
Obviously that example isn't particularly useful, but if you were using the expression in several places it may be more useful. It can come in handy when the expressions are long and you want to group on them too because the GROUP BY clause requires you to re-state the expression.
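Here is a minimal sketch of the derived-table approach with a GROUP BY on the aliased expression. It runs on SQLite through Python's sqlite3 module rather than SQL Server, and the rows and the extra names temp_fifth and n are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyz (a REAL, b REAL)")
# Hypothetical rows: a / b gives 5.0, 5.0 and 3.0.
conn.executemany(
    "INSERT INTO xyz VALUES (?, ?)", [(10.0, 2.0), (20.0, 4.0), (9.0, 3.0)]
)

# The expression a / b is written once, inside the derived table; the outer
# query reuses it by name in the SELECT list and in GROUP BY.
rows = conn.execute(
    """
    SELECT temp, temp / 5 AS temp_fifth, COUNT(*) AS n
    FROM (SELECT a / b AS temp FROM xyz) AS t1
    GROUP BY temp
    ORDER BY temp
    """
).fetchall()

for row in rows:
    print(row)
```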
In MSSQL you also have the option of creating computed columns which are specified in the table schema and not in the query.
You can use Oracle's WITH clause too. Similar constructs are available in other DBs. Here is the one we use for Oracle.
with t
as (select a/b as temp
from xyz)
select temp, temp/5
from t
/
This can have a performance advantage, particularly if you have complex queries involving several nested queries, because the WITH subquery can be evaluated once and then reused in the statements that reference it.
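The same idea as a runnable sketch, using SQLite through Python's sqlite3 module instead of Oracle (so no trailing / terminator). The rows and the diff_from_avg column are made up. The point shown is only that the named subquery can be referenced more than once; whether an engine evaluates it once is up to that engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyz (a REAL, b REAL)")
# Hypothetical rows: a / b gives 5.0 and 3.0.
conn.executemany("INSERT INTO xyz VALUES (?, ?)", [(10.0, 2.0), (9.0, 3.0)])

# t is defined once and referenced twice: in the main FROM clause and in the
# scalar subquery that computes the average.
rows = conn.execute(
    """
    WITH t AS (SELECT a / b AS temp FROM xyz)
    SELECT temp,
           temp / 5,
           temp - (SELECT AVG(temp) FROM t) AS diff_from_avg
    FROM t
    ORDER BY temp
    """
).fetchall()

for row in rows:
    print(row)
```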
Not possible in the same SELECT clause, assuming your SQL product is compliant with entry level Standard SQL-92.
Expressions (and their correlation names) in the SELECT clause come into existence 'all at once'; there is no left-to-right evaluation that you seem to hope for.
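A quick way to see this is to run the query from the question against an engine that follows the standard on this point. The sketch below uses SQLite through Python's sqlite3 module as an example; other products may report the problem differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyz (a REAL, b REAL)")
conn.execute("INSERT INTO xyz VALUES (10.0, 2.0)")

error_message = ""
try:
    # temp is defined and used in the same SELECT list.
    conn.execute("SELECT (a / b) AS temp, temp / 5 FROM xyz").fetchall()
    alias_visible = True
except sqlite3.OperationalError as exc:
    # The alias is not in scope for its sibling expressions.
    alias_visible = False
    error_message = str(exc)

print(alias_visible, error_message)
```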
As per @Josh Einstein's answer here, you can use a derived table as a workaround (hopefully using a more meaningful name than 'temp' and providing one for the temp/5 expression; bear in mind the person who will inherit your code).
Note that the code you posted would work on the MS Access Database Engine (and would assign a meaningless correlation name such as Expr1 to your second expression), but then again it is not a real SQL product.
It's possible, I guess:
SELECT t.temp, (t.temp/5)
FROM (SELECT (numerator_field/denominator_field) AS temp FROM xyz) t;
This is now available in Amazon Redshift
E.g.
select clicks / impressions as probability, round(100 * probability, 1) as percentage from raw_data;
Ref:
https://aws.amazon.com/about-aws/whats-new/2018/08/amazon-redshift-announces-support-for-lateral-column-alias-reference/
You might find W3Schools "SQL Alias" to be of good help.
Here is an example from their tutorial:
SELECT po.OrderID, p.LastName, p.FirstName
FROM Persons AS p,
Product_Orders AS po
WHERE p.LastName='Hansen' AND p.FirstName='Ola'
Regarding using the Alias further in the query, depending on the database you are using it might be possible.
My requirement is this: I have a table, and I need to group by one of its fields and get the latest record in each group. I searched the Internet and found this solution:
SELECT
* FROM(
SELECT
*
FROM
record r
WHERE
r.id in (xx,xx,xx) HAVING 1
ORDER BY
r.time DESC
) a
GROUP BY
a.id
The result is correct, but I can't understand the meaning of "HAVING 1" after the WHERE clause. I hope someone can explain it. Thank you very much.
It does nothing, just like having true would. Presumably it is a placeholder where sometimes additional conditions are applied? But since there is no group by or use of aggregate functions in the subquery, any having conditions are going to be treated no differently than where conditions.
Normally you select rows and apply where conditions, then any grouping (explicit, or implicit as in select count(*)) occurs, and the having clause can specify further constraints after the grouping.
Note that your query is not guaranteed to give the results you want; the ORDER BY in the subquery in theory has no effect on the outer query, and the optimizer may skip it. It is possible the presence of HAVING makes a difference to the optimizer, but that is not something you should rely on, certainly not from one version of MySQL to another.
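One way to avoid depending on the subquery's ORDER BY is to compute the latest time per id explicitly and join back to the table. This is a different technique from the query in the question, sketched here on SQLite through Python's sqlite3 module. The payload column and the sample rows are made up, and the literal ids stand in for the xx placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (id INTEGER, time TEXT, payload TEXT)")
# Hypothetical rows; payload is an illustrative column, not from the question.
conn.executemany(
    "INSERT INTO record VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "old"),
        (1, "2024-03-01", "new"),
        (2, "2024-02-01", "only"),
        (3, "2024-05-01", "not requested"),
    ],
)

# The inner query finds the latest time per id; the join then picks the full
# row that carries that time. No ordering of a subquery is relied upon.
rows = conn.execute(
    """
    SELECT r.id, r.time, r.payload
    FROM record r
    JOIN (
        SELECT id, MAX(time) AS latest
        FROM record
        WHERE id IN (1, 2)
        GROUP BY id
    ) m ON m.id = r.id AND m.latest = r.time
    ORDER BY r.id
    """
).fetchall()

for row in rows:
    print(row)
```

If two rows in a group share the same latest time, this returns both, so add a tie-breaker if you need exactly one row per group.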
This is a simple question about efficiency specifically related to the MySQL implementation. I want to just check if a table is empty (and if it is empty, populate it with the default data). Would it be best to use a statement like SELECT COUNT(*) FROM `table` and then compare to 0, or would it be better to do a statement like SELECT `id` FROM `table` LIMIT 0,1 then check if any results were returned (the result set has next)?
Although I need this for a project I am working on, I am also interested in how MySQL works with those two statements and whether the reason people seem to suggest using COUNT(*) is because the result is cached or whether it actually goes through every row and adds to a count as it would intuitively seem to me.
You should definitely go with the second query rather than the first.
When using COUNT(*), MySQL is scanning at least an index and counting the records. Even if you would wrap the call in a LEAST() (SELECT LEAST(COUNT(*), 1) FROM table;) or an IF(), MySQL will fully evaluate COUNT() before evaluating further. I don't believe MySQL caches the COUNT(*) result when InnoDB is being used.
Your second query results in only one row being read; furthermore, an index is used (assuming id is part of one). Look at the documentation of your driver to find out how to check whether any rows have been returned.
By the way, the id field may be omitted from the query (MySQL will use an arbitrary index):
SELECT 1 FROM table LIMIT 1;
However, I think the simplest and most performant solution is the following (as indicated in Gordon's answer):
SELECT EXISTS (SELECT 1 FROM table);
EXISTS returns 1 if the subquery returns any rows, otherwise 0. Because of these semantics, MySQL can optimize the execution properly.
Any fields listed in the subquery are ignored, thus 1 or * is commonly written.
See the MySQL Manual for more info on the EXISTS keyword and its use.
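Putting the EXISTS check together with the "populate if empty" step might look like the sketch below. It uses SQLite through Python's sqlite3 module rather than MySQL, and the settings table and its default rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table that should receive default data when it is empty.
conn.execute("CREATE TABLE settings (id INTEGER PRIMARY KEY, name TEXT, value TEXT)")


def is_empty(connection):
    """Return True if the settings table has no rows."""
    (has_rows,) = connection.execute(
        "SELECT EXISTS (SELECT 1 FROM settings)"
    ).fetchone()
    return has_rows == 0


was_empty = is_empty(conn)
if was_empty:
    # Populate the default data only when nothing is there yet.
    conn.executemany(
        "INSERT INTO settings (name, value) VALUES (?, ?)",
        [("theme", "light"), ("language", "en")],
    )

still_empty = is_empty(conn)
row_count = conn.execute("SELECT COUNT(*) FROM settings").fetchone()[0]
print(was_empty, still_empty, row_count)  # True False 2
```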
It is better to do the second method or just exists. Specifically, something like:
if exists (select id from table)
should be the fastest way to do what you want. You don't need the limit; the SQL engine takes care of that for you.
By the way, never put identifiers (table and column names) in single quotes.
Can math calculations be done in the WHERE portion of a MySQL statement?
For example, let's say I have the following SQL statement:
SELECT
employee_id,
max_hours,
sum(hours) AS total_hours
FROM
some_table
WHERE
total_hours < (max_hours * 1.5)
I looked around and found that MySQL does have math functions, but all the examples are in the SELECT portion of the statement.
You can use any (supported) arithmetic you like in a WHERE or JOIN clause, as long as the final result is a boolean: true, false or NULL (where NULL is treated as false).
This will usually mean indexes can not be used as their structure only allows their use for direct equality, inequality, or range lookups. In the example you gave there will be no useful index you could define so the query runner would be forced to perform a table scan. For simple filtering clauses referring to one table an index will only get used if one side is a constant (or a variable that is constant for the run time of the query).
With joining clauses an index might be used for one side of the match, if that side is a direct column reference (i.e. no arithmetic), though if the join is likely to cover many rows a scan may still be used, as an index (or even table) scan can be quicker than a great many index seeks.
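The effect of arithmetic on index use can be observed by comparing query plans. This is a rough sketch on SQLite through Python's sqlite3 module, not MySQL; the index name idx_hours, the note column and the rows are made up. MySQL's EXPLAIN output looks different, but the same comparison can be made there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table "
    "(id INTEGER PRIMARY KEY, employee_id INTEGER, hours REAL, note TEXT)"
)
conn.execute("CREATE INDEX idx_hours ON some_table (hours)")
# Hypothetical rows.
conn.executemany(
    "INSERT INTO some_table (employee_id, hours, note) VALUES (?, ?, ?)",
    [(i % 10, float(i % 12), "n/a") for i in range(500)],
)


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# The column is compared directly with a constant: a range lookup is possible.
plan_plain = plan("SELECT * FROM some_table WHERE hours > 5")
# The column is wrapped in arithmetic: the index on hours cannot be used for
# a lookup, even though the condition is logically equivalent.
plan_arith = plan("SELECT * FROM some_table WHERE hours * 2 > 10")

print(plan_plain)
print(plan_arith)
```

Where possible, move the arithmetic to the constant side (hours > 10 / 2) so the column stays bare.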
You might try something like this...
SELECT
employee_id,
max_hours,
SUM(hours)
FROM
some_table
GROUP BY
employee_id,
max_hours
HAVING
SUM(hours) < (max_hours * 1.5)
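Here is that HAVING approach as a runnable sketch on SQLite through Python's sqlite3 module, with made-up rows. It groups by max_hours as well as employee_id so that the non-aggregated column is valid in engines that enforce strict GROUP BY rules; that assumes max_hours is the same on every row for a given employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table (employee_id INTEGER, max_hours REAL, hours REAL)"
)
# Hypothetical rows: employee 1 logs 65 hours in total, employee 2 logs 45.
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?, ?)",
    [(1, 40.0, 30.0), (1, 40.0, 35.0), (2, 40.0, 20.0), (2, 40.0, 25.0)],
)

# The aggregate is filtered in HAVING, after grouping, because WHERE is
# applied before SUM(hours) exists.
rows = conn.execute(
    """
    SELECT employee_id, max_hours, SUM(hours) AS total_hours
    FROM some_table
    GROUP BY employee_id, max_hours
    HAVING SUM(hours) < (max_hours * 1.5)
    ORDER BY employee_id
    """
).fetchall()

print(rows)  # [(2, 40.0, 45.0)]
```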
Here is the query I'm trying to execute, and it's supposed to return a table containing data for the pools that are not full (members_nr < members_max).
SELECT id, name,
(
SELECT COUNT(*) FROM pools_entries WHERE pool_id=p.id AND pending=0
) AS members_nr,
members_max, open
FROM pools p
WHERE id IN(1,2,3,4) AND members_nr < members_max;
The problem is that MySQL won't recognize members_nr as a field, since it's the result of a subquery. Is there a logical solution to this little issue?
Any help will be much appreciated :)
N.B. is correct, you need the HAVING clause. But for the sake of the Googlers, I'll share a little knowledge.
The WHERE clause is used for restricting the result set to specific records; it is also used for optimisation. MySQL uses the WHERE clause to identify which indexes it can use to speed up the query.
The HAVING clause is executed right at the end of the query. It is used for filtering the recordset. So imagine you have a list of stuff from the database that matches your WHERE clause. You can then use HAVING to filter that list down further on some set conditions.
My basic rule of thumb is: if you need to select based on a column's value, use WHERE; if you need to select based on the value of something which is not a column in the table, use HAVING.
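If you would rather not rely on HAVING without a GROUP BY (which not every database accepts), a derived table gives the computed column a name that an ordinary WHERE can filter on. This is an alternative to the HAVING approach, sketched on SQLite through Python's sqlite3 module; the rows are made up and the open column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pools (id INTEGER PRIMARY KEY, name TEXT, members_max INTEGER)")
conn.execute("CREATE TABLE pools_entries (pool_id INTEGER, pending INTEGER)")
# Hypothetical data: pool 1 is full, pool 2 only has a pending entry,
# pool 3 has one confirmed member out of three.
conn.executemany("INSERT INTO pools VALUES (?, ?, ?)", [(1, "A", 2), (2, "B", 1), (3, "C", 3)])
conn.executemany(
    "INSERT INTO pools_entries VALUES (?, ?)", [(1, 0), (1, 0), (2, 1), (3, 0)]
)

# The inner query computes members_nr; the outer WHERE can then treat it
# like any other column.
rows = conn.execute(
    """
    SELECT id, name, members_nr, members_max
    FROM (
        SELECT p.id, p.name, p.members_max,
               (SELECT COUNT(*) FROM pools_entries e
                WHERE e.pool_id = p.id AND e.pending = 0) AS members_nr
        FROM pools p
        WHERE p.id IN (1, 2, 3, 4)
    ) AS counted
    WHERE members_nr < members_max
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```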