What is B.N in a MySQL query?

SELECT N, IF(P IS NULL,'Root',IF((SELECT COUNT(*) FROM BST WHERE P=B.N)>0,'Inner','Leaf'))
FROM BST AS B
ORDER BY N;
Here N and P are the column names, where N is the node and P is its parent, and BST is the name of the table. The query above finds the node type of each node in the BST, but I am not able to understand what P=B.N means.

First, let me start by saying I really hope these are not the actual names you are using. If they are, do your future self a huge favor and replace them with readable names that actually describe the data the columns and tables hold.
That being said, B.N is the N column in the row of the outer query, since it's using B as an alias to the table name.
In the WHERE clause of the subquery, you are comparing the value of P in the subquery's rows with the value of N from the current row of the main query. The subquery runs once for each row of the main query, so for each row you get the number of rows that have that row's N as their parent.
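To see that row-by-row evaluation in action, here is a minimal sketch that runs the query on a small tree. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, so the IF(...) calls are written as an equivalent CASE expression; the sample rows and the node_type column name are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BST (N INTEGER, P INTEGER)")

# Hypothetical sample tree: 5 is the root, 2 and 8 are inner nodes.
conn.executemany(
    "INSERT INTO BST (N, P) VALUES (?, ?)",
    [(1, 2), (3, 2), (6, 8), (9, 8), (2, 5), (8, 5), (5, None)],
)

# Same logic as the question's query, with IF() rewritten as CASE.
# Inside the subquery, P belongs to the inner BST and B.N to the outer row.
query = """
SELECT N,
       CASE
           WHEN P IS NULL THEN 'Root'
           WHEN (SELECT COUNT(*) FROM BST WHERE P = B.N) > 0 THEN 'Inner'
           ELSE 'Leaf'
       END AS node_type
FROM BST AS B
ORDER BY N
"""
node_types = conn.execute(query).fetchall()
for n, node_type in node_types:
    print(n, node_type)
```

With this data, nodes 2 and 8 come out as Inner because other rows list them in P, node 5 is the Root, and the rest are Leaf.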

In
WHERE P=B.N
P is the "parent" column of BST of the inner most SELECT statement.
B.N refers to the N ("node") column of the BST table referred to in the outer SELECT statement.
The clause
FROM BST AS B
creates B as the alias for the outer BST.

"FROM BST AS B" declares B as an alias for the table BST within this query, and N must be a column of that table, so B.N means:
the value of column N in the current row of table BST

For each value of N in the outer table B, count the rows whose P equals that N. If the count is greater than 0, the node is labeled Inner; otherwise it is labeled Leaf. (Rows whose P is NULL are labeled Root before this check.)

Related

Why is dot notation required here in this SQL query?

You are given a table, BST, containing two columns: N and P, where N represents the value of a node in Binary Tree, and P is the parent of N.
Question - Write a query to find the node type of Binary Tree ordered by the value of the node. Output one of the following for each node:
Root: If node is root node.
Leaf: If node is leaf node.
Inner: If node is neither root nor leaf node.
Solution 1 - I am using dot(.) notation on alias B.N=P
SELECT N,
CASE
WHEN P IS NULL THEN 'Root'
WHEN (SELECT COUNT(*) FROM BST WHERE B.N=P)>0 THEN 'Inner'
ELSE 'Leaf'
END AS PLACE
FROM BST B
ORDER BY N;
Solution 2 - Not using any dot operator
SELECT N,
CASE
WHEN P IS NULL THEN 'Root'
WHEN (SELECT COUNT(*) FROM BST WHERE N=P)>0 THEN 'Inner'
ELSE 'Leaf'
END AS PLACE
FROM BST
ORDER BY N;
My question is -
Why does Solution 1 generate the correct answer? Is it due to the . (dot)? If it is due to the dot operator, why didn't we use it on P as well (B.N = P)?
Even if I modify Solution 2 and write (BST.N = P), it still gives me the wrong answer. Why is that?
I am confused about the usage of the . (dot).
You use BST twice in your query. The . tells the DBMS which instance you are using. When omitted, the DBMS has to choose it implicitly.
The table that is implicitly chosen happens not to be the same throughout your query.
With more explicit aliases, your query is:
SELECT N,
CASE
WHEN P IS NULL THEN 'Root'
WHEN (SELECT COUNT(*) FROM BST InsideAlias WHERE OutsideAlias.N=InsideAlias.P)>0 THEN 'Inner'
ELSE 'Leaf'
END AS PLACE
FROM BST OutsideAlias
When you remove the aliases, the implicitly chosen instance of BST is:
Inside the subquery SELECT COUNT(*) FROM BST InsideAlias : InsideAlias
In the rest of your query: OutsideAlias (InsideAlias is out of scope for the rest of the query anyway).
Which means:
(SELECT COUNT(*) FROM BST InsideAlias WHERE N=P)
is equivalent to
(SELECT COUNT(*) FROM BST InsideAlias WHERE InsideAlias.N=InsideAlias.P)
Therefore, you are getting the wrong results because it requires a node to be its own parent for the COUNT(*) to be greater than 0.
Instead, OutsideAlias.N=InsideAlias.P translates to: is my node the parent of some other node? Another way to do the test is with EXISTS (SELECT * FROM BST WHERE OutsideAlias.N = P), although that was not your question.
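The difference between the three conditions can be checked side by side. The sketch below is only an illustration: it assumes SQLite via Python's sqlite3 as a stand-in for MySQL, uses a made-up four-row tree, and the classify helper is a hypothetical convenience for swapping the condition into the CASE expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BST (N INTEGER, P INTEGER)")
# Hypothetical data: 5 is the root, 2 is an inner node, 1 and 3 are leaves.
conn.executemany("INSERT INTO BST VALUES (?, ?)", [(1, 2), (3, 2), (2, 5), (5, None)])


def classify(condition_sql):
    """Run the node-type query with the given 'Inner' condition (helper for this sketch)."""
    sql = f"""
        SELECT N,
               CASE
                   WHEN P IS NULL THEN 'Root'
                   WHEN {condition_sql} THEN 'Inner'
                   ELSE 'Leaf'
               END
        FROM BST OutsideAlias
        ORDER BY N
    """
    return conn.execute(sql).fetchall()


# Fully qualified: is the outer row's N the parent of some inner row?
correlated = classify(
    "(SELECT COUNT(*) FROM BST InsideAlias WHERE OutsideAlias.N = InsideAlias.P) > 0"
)
# Unqualified: both N and P resolve to InsideAlias, so nothing ties it to the outer row.
uncorrelated = classify("(SELECT COUNT(*) FROM BST InsideAlias WHERE N = P) > 0")
# EXISTS variant of the correlated test.
with_exists = classify("EXISTS (SELECT * FROM BST WHERE OutsideAlias.N = P)")

print("correlated:  ", correlated)
print("uncorrelated:", uncorrelated)
print("exists:      ", with_exists)
```

On this data the unqualified version never reports an Inner node, because no node is its own parent, while the qualified and EXISTS versions agree with each other.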
This is about correlated subquery. A correlated subquery is a subquery that contains a reference to a table that also appears in the outer query. And how do we distinguish the subquery table and the table from the outer query? There are two cases.
The inner table and the outer table are different tables (having different names): In this case, we simply use their table names, as they are distinct.
The inner table and the outer table are the same table (same table name): In this case, in order to distinguish them, we need to give the outer table an alias (if you give the inner table an alias, e.g. i, and use i.n=n, it means innertable.n=innertable.n, NOT innertable.n=outertable.n).
Therefore, to answer your first question,please check the comments besides the query:
SELECT N,
CASE
WHEN P IS NULL THEN 'Root'
WHEN (SELECT COUNT(*) FROM BST -- this is the table name of the subquery table which does not need an alias
WHERE B.N=P /* B.N=P requires that the value of P from this subquery's table BST equals column N of its outer (parent) table, which is aliased as B */)>0 THEN 'Inner'
ELSE 'Leaf'
END AS PLACE
FROM BST B -- this is the table name of the main query which needs an alias so it can be distinguishable in the correlated subquery
ORDER BY N;
Before answering your second question: how do we make two tables of the same name distinguishable? We need to give one of them a different name, which calls for an alias. You did not say where you would put the condition BST.N = P; from the context I presume you mean the subquery's WHERE clause. There, BST refers to the inner table, which makes the expression BST.N = P the same as N = P (both columns come from the inner table). To fix the issue, give the outer table an alias and use that alias as the prefix for the outer table's columns that are used in the subquery.
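A quick way to check this is to print the per-row count that each form of the subquery produces. This is a sketch under assumptions: SQLite via Python's sqlite3 stands in for MySQL, and the data and the children column name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BST (N INTEGER, P INTEGER)")
conn.executemany("INSERT INTO BST VALUES (?, ?)", [(1, 2), (3, 2), (2, 5), (5, None)])

# BST.N inside the subquery resolves to the subquery's own BST,
# so this counts rows that are their own parent: always 0 here.
table_name_prefix = conn.execute("""
    SELECT N, (SELECT COUNT(*) FROM BST WHERE BST.N = P) AS children
    FROM BST
    ORDER BY N
""").fetchall()

# Aliasing the outer table lets the subquery reach the outer row's N.
outer_alias_prefix = conn.execute("""
    SELECT N, (SELECT COUNT(*) FROM BST WHERE B.N = P) AS children
    FROM BST AS B
    ORDER BY N
""").fetchall()

print(table_name_prefix)
print(outer_alias_prefix)
```

Only the second query reports children for nodes 2 and 5, which is what the Inner test needs.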
It has to do with namespaces -- where columns and expressions "live".
A SQL query may include multiple namespaces where columns and expressions are named and can be accessible.
In your queries there are two namespaces:
one for the main query.
another for the inner scalar subquery.
In the first query, columns in the main query can be referenced by prepending the namespace B, as in B.<column> while columns in the inner namespace can be referenced using the namespace BST as in BST.<column>.
If a column name (or expression name) does not explicitly include a namespace, then the closest accessible one wins.
In your second query you don't specify a namespace and, therefore, the columns N and P in the expression N = P reference the same inner table, and so the subquery is not correlated to the main one. In the first query B.N references a column on the main query, and therefore the expression B.N = P compares columns from different tables, and then the query is correlated.

How to add a column of a constant value from a query to another query result?

Basically, I have two tables. From table A, I want to calculate the total number of rows from it. I can use SELECT COUNT(*) FROM A as the first query to get it. From other table B, I want to select all things(columns) from it. I can use SELECT * FROM B as the second query. My question is how to use a single query to add the result from the first query as a column to the second query result. In other words, I want to have an extra column with the value of total number of rows from Table A to all things from Table B, by using a single query.
CROSS JOIN it:
SELECT * FROM
(SELECT COUNT(*) as cnt FROM A) a
CROSS JOIN
B
Join makes the resultset wider. Union makes the resultset taller. Any time you want to grow the number of columns you have to join, but if you haven't got anything to join ON you can use a CROSS JOIN, as it doesn't require any ON predicates.
You could alternatively use an INNER JOIN with a predicate that is always true, an old-style join syntax without any associated WHERE, or you can put a select that returns a single value as a subquery in the select list without any correlating predicates. Most DBAs would probably assert that none of these are preferable to the CROSS JOIN syntax, because CROSS JOIN is an explicit statement of your intent, whereas the others might just look like you forgot something.
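As a rough check that the CROSS JOIN and the scalar-subquery alternative give the same rows, here is a small sketch. It assumes SQLite via Python's sqlite3 rather than MySQL, and the column names (id, label) and the a_count alias are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER, label TEXT);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (10, 'x'), (20, 'y');
""")

# One-row derived table cross joined to B: every B row gains the count column.
cross_join_rows = conn.execute("""
    SELECT B.*, a_count.cnt
    FROM (SELECT COUNT(*) AS cnt FROM A) a_count
    CROSS JOIN B
    ORDER BY B.id
""").fetchall()

# Alternative: an uncorrelated scalar subquery in the select list.
scalar_subquery_rows = conn.execute("""
    SELECT B.*, (SELECT COUNT(*) FROM A) AS cnt
    FROM B
    ORDER BY B.id
""").fetchall()

print(cross_join_rows)
print(scalar_subquery_rows)
```

Both queries return each row of B with the row count of A appended; the CROSS JOIN form simply states the intent more explicitly.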

a comma between SELECT statements

I have this query:
SELECT (@a:=@a+1) AS priority
FROM (SELECT t1.name FROM t1 LIMIT 100) x, (SELECT @a:=0) r
a few questions:
1 - What is the comma doing between the SELECTS? I have never seen a comma between commands, and I don't know what it means
2 - why is the second SELECT given a name?
3 - why is the second SELECT inside brackets?
4 - Performance-wise: Does it select the first 100 rows from t1 and then assign them a number? What is going on here?
It is performing a CROSS JOIN (a Cartesian product of the rows) but without the explicit syntax. The following 2 queries produce identical results:
SELECT *
FROM TableA, TableB
SELECT *
FROM TableA
CROSS JOIN TableB
The query in the question uses 2 "derived tables" instead. I would encourage you to use the explicit join syntax CROSS JOIN and never use just commas. The biggest issue with using just commas is you have no idea if the Cartesian product is deliberate or accidental.
Both "derived tables" have been given an alias - and that is a good thing. How else would you reference some item of the first or second "derived table"? e.g. Imagine they were both queries that had the column ID in them, you would then be able to reference x.ID or r.ID
Regarding what the overall query is doing: first note that the second query returns just a single row. So even though the syntax produces a CROSS JOIN, it does not expand the total number of rows, because 100 * 1 = 100. In effect the subquery "r" adds a "placeholder" @a (initially zero) to every row. Once @a is available on each row, you can increment it by 1 for each row, and as a result that column produces a row number.
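The two points above (a comma behaves like CROSS JOIN, and joining to a one-row derived table does not multiply rows) can be sketched as follows. This assumes SQLite via Python's sqlite3, which has no MySQL-style user variables, so the counter itself is left out and a constant column named seed (a hypothetical name) plays the part of the one-row derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (name TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?)", [("a",), ("b",), ("c",)])

# Comma between two derived tables: an implicit cross join.
comma_rows = conn.execute("""
    SELECT x.name, r.seed
    FROM (SELECT t1.name FROM t1 LIMIT 100) x, (SELECT 0 AS seed) r
    ORDER BY x.name
""").fetchall()

# The same join spelled out explicitly.
cross_join_rows = conn.execute("""
    SELECT x.name, r.seed
    FROM (SELECT t1.name FROM t1 LIMIT 100) x
    CROSS JOIN (SELECT 0 AS seed) r
    ORDER BY x.name
""").fetchall()

print(comma_rows)
print(len(comma_rows), "rows: 3 rows from x times 1 row from r")
```

Each row of x simply picks up the single row of r, so the row count stays at the size of x.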
x and r are effectively anonymous views produced by the SELECT statements. If you imagine that instead of using SELECTs in brackets, you defined a view using the select statement and then referred to the view, the syntax would be clear.
The selects are given names so that you can refer to these names in WHERE conditions, joins or in the list of fields to select.
That is the syntax. You have to have brackets.
Yes, it selects the first 100 rows. I am not sure what you mean by "gives them a number".

SQL duplicated rows selection without using "having"

I have this type of table:
A.code A.name
1. X
2. Y
3. X
4. Z
5. Y
And i need to write a query that gives me all duplicated names like this:
A.name
X
Y
Z
Without using "group by".
The correlated subquery is your friend here. The subquery is evaluated for every row of the table referenced in the outer query, because the subquery refers to the outer table through its alias.
In the subquery, the outer table is queried again without the alias to determine the row's compliance with the condition.
SELECT DISTINCT name FROM Names AS CorrelatedNamesTable
WHERE
(
SELECT COUNT(Name) FROM Names WHERE Name = CorrelatedNamesTable.Name
) > 1
Try using DISTINCT for the column. Please note in tables with a large number of rows, this is not the best performance option.
SELECT DISTINCT A.Name FROM A
SELECT a1.name FROM A a1, A a2 WHERE a1.name=a2.name AND a1.code<>a2.code
This assumes code is unique ;).
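To compare what the three suggestions return on the sample table from the question, here is a small sketch. It assumes SQLite via Python's sqlite3 instead of MySQL, and the outer_a alias is a made-up name. The self-join is run with DISTINCT added, since without it each duplicated name would appear once per matching pair of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (code INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [(1, "X"), (2, "Y"), (3, "X"), (4, "Z"), (5, "Y")],
)

# Correlated subquery: keep names that occur more than once.
correlated = conn.execute("""
    SELECT DISTINCT name FROM A AS outer_a
    WHERE (SELECT COUNT(*) FROM A WHERE name = outer_a.name) > 1
    ORDER BY name
""").fetchall()

# Self-join on equal names with different codes (relies on code being unique).
self_join = conn.execute("""
    SELECT DISTINCT a1.name
    FROM A a1, A a2
    WHERE a1.name = a2.name AND a1.code <> a2.code
    ORDER BY a1.name
""").fetchall()

# Plain DISTINCT: every name once, duplicated or not.
plain_distinct = conn.execute("SELECT DISTINCT name FROM A ORDER BY name").fetchall()

print(correlated, self_join, plain_distinct)
```

On this data the first two queries return only X and Y, while plain DISTINCT also returns Z, which matches the X, Y, Z output shown in the question.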

Optimize sql query by cover index

Query:
SELECT a, b, c FROM table WHERE a = .. and b like 'example%' and c = '..'
Does this query use index (a,b,c) or (a,b)?
For a covering index to even begin to help this query, it needs to be
a,c,b
That's because the query wants a specific single value for a and c and a range of values (LIKE 'string%') for b.
The compound BTREE index gets random-accessed to the specific a,c value and the starting b value. It scans (in a so-called tight scan) to the last eligible b value.
Note that
c,a,b
will also work.
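The reasoning about column order can be pictured with a toy model. The sketch below is not how MySQL stores or scans an index; it assumes only that a compound BTREE index behaves like a list of key tuples kept in sorted order, and the sample rows and the prefix_scan helper are invented for illustration.

```python
import bisect

# Hypothetical table rows as (a, b, c).
rows = [
    (1, "example1", "x"),
    (1, "example2", "y"),
    (1, "example3", "x"),
    (1, "other", "x"),
    (2, "example1", "x"),
]

# Toy "index" on (a, c, b): equality columns first, range column last.
index_acb = sorted((a, c, b) for a, b, c in rows)


def prefix_scan(index, a, c, b_prefix):
    """Seek to the first candidate key, then scan until a key stops matching."""
    start = bisect.bisect_left(index, (a, c, b_prefix))
    matches = []
    for key in index[start:]:
        if key[0] != a or key[1] != c or not key[2].startswith(b_prefix):
            break  # past the last eligible b value
        matches.append(key)
    return matches


result = prefix_scan(index_acb, 1, "x", "example")
print(result)

# With (a, b, c) order the matching keys are not adjacent, so one tight scan
# over the b range would also touch keys whose c does not match.
index_abc = sorted(rows)
positions_abc = [
    i for i, (a, b, c) in enumerate(index_abc)
    if a == 1 and b.startswith("example") and c == "x"
]
print(positions_abc)
```

In the (a, c, b) ordering the two matching keys sit next to each other, so one seek plus a short scan finds them; in the (a, b, c) ordering a non-matching key sits between them.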