Q1: Which query produces the following output from table marks?
Table name: marks
rnk: 1, 2, 3, 4
Output
rnk: 1, 3, 6, 10
select rnk from (select b.rnk as alpha,sum(a.rnk) as rnk from (select * from marks) a join (select * from marks) b on a.rnk <= b.rnk group by 1 )
select rnk from (select b.rnk as alpha,sum(a.rnk) as rnk from (select * from marks) a join (select * from marks) b on a.rnk > b.rnk group by 1 )
select rnk from (select b.rnk as alpha,sum(a.rnk) as rnk from (select * from marks) a join (select * from marks) b on a.rnk = b.rnk group by 1 )
select rnk from (select b.rnk as alpha,avg(a.rnk) as rnk from (select * from marks) a join (select * from marks) b on a.rnk <= b.rnk group by 1 )
This question was asked in an interview, and I didn't even know the topic it relates to.
I failed the test, but I really want to know which topics I should cover so I can be better prepared in future. I guess the answer is the first query, but I don't understand what is going on in it.
Sorry for the bad title; I was unable to even express my thoughts.
Thanks in advance, and sorry for my bad English.
Which query produces the following output from table marks
Correct answer - none.
A subquery in the FROM clause must be given an alias. None of these queries aliases its outer subquery, so all 4 queries will produce a syntax error like 'Every derived table must have its own alias'.
If you fix these errors there is another problem: the queries do not contain an ORDER BY clause. So the ordering of the output rows is not defined (not deterministic), and even when a query produces the needed rows, their order may not match the one shown.
If you fix this problem too, then query #1 will produce the desired output.
The answer is #1.
(We had to add an alias for the sub-query: Z)
We use 2 aliases for the same table and join them so that each row in b is joined to all rows in a whose rnk is less than or equal to its own.
We then group by b.rnk, which acts like an id, and take the sum of a.rnk, which is therefore a running total. The outer query returns only that sum.
Akina is right that there is no ORDER BY in the query, so there is no guarantee that the order will be the same. (The question was not "how can we guarantee this result" but "which query could produce this output".)
As you had a problem with this question I suggest that you need to find an SQL tutorial and start from the basics. There are a number of good tutorials out there.
create table marks (rnk int);
insert into marks values (1),(2),(3),(4);
select rnk from
( select
b.rnk as alpha,
sum(a.rnk) as rnk
from
(select * from marks) a
join (select * from marks) b
on a.rnk <= b.rnk group by 1
)z;
| rnk |
| --: |
| 1 |
| 3 |
| 6 |
| 10 |
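To see why the first query yields a running total, it can help to look at the pairs the join produces before they are grouped. The sketch below uses Python's built-in sqlite3 module purely as a convenient way to run the SQL (an assumption; the question does not name a database). It drops the redundant `(select * from marks)` wrappers, names the sum `total` instead of reusing `rnk`, and adds an explicit ORDER BY so the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marks (rnk INTEGER)")
conn.executemany("INSERT INTO marks VALUES (?)", [(1,), (2,), (3,), (4,)])

# Every row of b is paired with each row of a whose rnk is <= b.rnk.
pairs = conn.execute(
    """
    SELECT b.rnk, a.rnk
    FROM marks a
    JOIN marks b ON a.rnk <= b.rnk
    ORDER BY b.rnk, a.rnk
    """
).fetchall()

# Grouping by b.rnk and summing a.rnk turns those pairs into a running total.
running = conn.execute(
    """
    SELECT b.rnk AS alpha, SUM(a.rnk) AS total
    FROM marks a
    JOIN marks b ON a.rnk <= b.rnk
    GROUP BY b.rnk
    ORDER BY b.rnk
    """
).fetchall()

for alpha, total in running:
    print(alpha, total)
```

For b.rnk = 3, for example, the join contributes the a rows 1, 2 and 3, and their sum is 6.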
Related
Here I have this table:
Copies
nInv | Subject | LoanDate | BookCode |MemberCode|
1 |Storia |15/04/2019 00:00:00 |7844455544| 1 |
2 |Geografia |12/09/2020 00:00:00 |8004554785| 4 |
4 |Francese |17/05/2006 00:00:00 |8004894886| 3 |
5 |Matematica |17/06/2014 00:00:00 |8004575185| 3 |
I'm trying to find the value of the highest number of duplicates in the MemberCode column. So in this case I should get 3 as result, as its value appears two times in the table. Also, MemberCode is PK in another table, so ideally I should select all rows of the second table that match the MemberCode in both tables. For the second part I guess I should write something like SELECT * FROM Table2, Copies WHERE Copies.MemberCode = Table2.MemberCode but I'm missing out almost everything on the first part. Can you guys help me?
Use group by and limit:
select membercode, count(*) as num
from t
group by membercode
order by count(*) desc
limit 1;
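As a rough illustration of both parts of the question, the sketch below runs the GROUP BY / LIMIT idea against the sample data and then joins the winning MemberCode back to the second table. It uses Python's sqlite3 module only as a test bed; the `Name` column of `Table2` and the member names are hypothetical, and ties are broken by the lowest MemberCode, which is an added assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Table2 (MemberCode INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Copies (nInv INTEGER, Subject TEXT, MemberCode INTEGER);
    INSERT INTO Table2 VALUES (1, 'member one'), (3, 'member three'), (4, 'member four');
    INSERT INTO Copies VALUES
        (1, 'Storia', 1), (2, 'Geografia', 4),
        (4, 'Francese', 3), (5, 'Matematica', 3);
    """
)

# Part 1: the member code that appears most often, with its count.
top = conn.execute(
    """
    SELECT MemberCode, COUNT(*) AS num
    FROM Copies
    GROUP BY MemberCode
    ORDER BY num DESC, MemberCode
    LIMIT 1
    """
).fetchone()

# Part 2: the matching row(s) of the second table.
members = conn.execute(
    """
    SELECT t2.*
    FROM Table2 t2
    JOIN (
        SELECT MemberCode
        FROM Copies
        GROUP BY MemberCode
        ORDER BY COUNT(*) DESC, MemberCode
        LIMIT 1
    ) AS busiest ON busiest.MemberCode = t2.MemberCode
    """
).fetchall()

print(top, members)
```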
SELECT MAX(counted) FROM
(SELECT COUNT(MemberCode) AS counted
FROM table_name GROUP BY MemberCode) AS counts
Using analytic functions, we can assign a rank to each member code based on its count. Then, we can figure out what its count is.
WITH cte AS (
SELECT t2.MemberCode, COUNT(*) AS cnt,
RANK() OVER (ORDER BY COUNT(*) DESC, t2.MemberCode) rnk
FROM Table2 t2
INNER JOIN Copies c ON c.MemberCode = t2.MemberCode
GROUP BY t2.MemberCode
)
SELECT cnt
FROM cte
WHERE rnk = 1;
Something like this
with top_dupe_member_cte as (
select top(1) MemberCode, Count(*) as cnt
from MemberTable
group by MemberCode
order by cnt desc)
select /* columns from your other table */
from OtherTable ot
join top_dupe_member_cte dmc on ot.MemberCode=dmc.MemberCode;
Does a relational database exist that has a GROUP BY aggregate function such as DISTINCT EXISTS that returns TRUE if there is more than one distinct value for the group and FALSE otherwise? I am looking for something that would iterate through the values in the group until the current value is not the same as the previous value, instead of counting ALL of the distinct values.
Example:
pv_name | time_stamp | value
A | 1 | 1
B | 2 | 1
C | 3 | 1
A | 4 | 2
C | 5 | 2
B | 6 | 3
SELECT pv_name
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING DISTINCT_EXISTS(value);
Result: A, C
SELECT pv_name
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING MIN(value)<>MAX(value);
Might get you there quicker depending on indexes. I don't think you'll do much better than this or COUNT(DISTINCT value) though.
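A quick way to check that the MIN/MAX test and COUNT(DISTINCT value) agree on the sample data is sketched below. It uses Python's sqlite3 module as a stand-in for whatever database is actually in use (an assumption), and it says nothing about which form is faster on a real engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (pv_name TEXT, time_stamp INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO example VALUES (?, ?, ?)",
    [("A", 1, 1), ("B", 2, 1), ("C", 3, 1), ("A", 4, 2), ("C", 5, 2), ("B", 6, 3)],
)

QUERY = """
SELECT pv_name
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING {condition}
ORDER BY pv_name
"""

# A group has more than one distinct value exactly when its min and max differ.
min_max = [row[0] for row in conn.execute(QUERY.format(condition="MIN(value) <> MAX(value)"))]
count_distinct = [row[0] for row in conn.execute(QUERY.format(condition="COUNT(DISTINCT value) > 1"))]

print(min_max, count_distinct)
```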
Have you tried joining to example twice?
Example:
with Q as
(
SELECT pv_name, value
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
)
select distinct Q1.pv_name
from Q as Q1 inner join Q as Q2 on
Q1.pv_name=Q2.pv_name and
Q1.value<>Q2.value
You probably know about the COUNT(DISTINCT) function and you want to avoid it to prevent unnecessary computations.
It is hard to know why you are looking for this, but I assume that it takes a long time to find these groups using the most obvious query:
SELECT type, COUNT(DISTINCT product)
FROM aTable
GROUP BY type
HAVING COUNT(DISTINCT product) > 1
I recommend trying window functions, for example the LAST_VALUE and FIRST_VALUE functions that are new in T-SQL:
with c as (
SELECT type
,LAST_VALUE(product) OVER (PARTITION BY type ORDER BY product) lv
,FIRST_VALUE(product) OVER (PARTITION BY type ORDER BY product) pv
FROM aTable
)
SELECT * from c where lv <> pv
If the DB engine is smart enough it will find the first/last value for the group and will not try to count all the values, and therefore perform better.
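One detail worth checking when trying this: with an ORDER BY in the window and no explicit frame, the default frame ends at the current row, so LAST_VALUE returns the current row's value and not the partition's last one. The sketch below spells out a full-partition frame. It runs on Python's sqlite3 module (assuming the bundled SQLite is 3.25 or newer, which added window functions) and uses made-up rows for `aTable`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE aTable (type TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO aTable VALUES (?, ?)",
    [("t1", "p1"), ("t1", "p2"), ("t2", "p1"), ("t2", "p1"), ("t3", "p3")],
)

rows = conn.execute(
    """
    WITH c AS (
        SELECT type,
               FIRST_VALUE(product) OVER w AS fv,
               LAST_VALUE(product)  OVER w AS lv
        FROM aTable
        WINDOW w AS (
            PARTITION BY type
            ORDER BY product
            -- Frame covers the whole partition, so lv really is the last value.
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        )
    )
    SELECT DISTINCT type
    FROM c
    WHERE fv <> lv
    ORDER BY type
    """
).fetchall()

print(rows)
```

Only `t1` has two different products, so it is the only type reported.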
For MySQL you can use helper variables to get the row_number per group based on the distinct values, something like this:
SELECT type, product
FROM (
SELECT @row_num := IF(@prev_type=type, IF(@prev_prod=product,@row_num,@row_num+1), 1) AS RowNumber
,type
,product
,@prev_type := type
,@prev_prod := product
FROM Person,
(SELECT @row_num := 1) x,
(SELECT @prev_type := '') y,
(SELECT @prev_prod := '') z
ORDER BY type, product
) as a
WHERE RowNumber > 1
I think the having min (value) <> max (value) will be most efficient here. An alternative is:
Select distinct pv_name
From example e
Left join (
Select value
From example
Where ...
Group by value
Having count (*) = 1
) s on e.value = s.value
Where s.value is null
Or you could use NOT EXISTS against that subquery instead.
Include the relevant where clause in the sub query.
I have a table with the following structure:
id name
1 X
1 X
1 Y
2 A
2 A
2 B
Basically what I am trying to do is to write a query that returns X for 1 because X has repeated more than Y (2 times) and returns A for 2. So if a value occurs more than the other one my query should return that. Sorry if the title is confusing but I could not find a better explanation. This is what I have tried so far:
SELECT MAX(counted) FROM(
SELECT COUNT(B) AS counted
FROM table
GROUP BY A
) AS counts;
The problem is that my query should return the actual value rather than its count.
Thanks
This should work:
SELECT count(B) as occurrence, A, B
FROM table
GROUP BY B
ORDER BY occurrence DESC
LIMIT 1;
Please check: http://sqlfiddle.com/#!9/dfa09/3
You can try it like this, using a GROUP BY clause:
select *, max(occurence) as Maximum_Occurence from
(
select B, count(B) as occurence
from table1
group by B
) xxx
This is how I finally handled my problem. It is not the most efficient way, but it gets the job done:
select A,B from
(select A,B, max(cnt) from
(select A ,B ,count(B) as cnt
from myTable
group by A,B
order by cnt desc
) as x group by A
) as xx
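For comparison, here is a sketch of the same idea that avoids relying on how MySQL picks non-aggregated columns in a GROUP BY: count each (A, B) pair once, then keep the pairs whose count equals the maximum for their A. It is run through Python's sqlite3 module as a neutral test bed (an assumption), and if two values tie for the maximum, both are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (A INTEGER, B TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(1, "X"), (1, "X"), (1, "Y"), (2, "A"), (2, "A"), (2, "B")],
)

rows = conn.execute(
    """
    WITH counts AS (
        -- How often each value B occurs within each A.
        SELECT A, B, COUNT(*) AS cnt
        FROM myTable
        GROUP BY A, B
    )
    SELECT c.A, c.B
    FROM counts c
    JOIN (
        -- The highest count per A.
        SELECT A, MAX(cnt) AS max_cnt
        FROM counts
        GROUP BY A
    ) m ON m.A = c.A AND m.max_cnt = c.cnt
    ORDER BY c.A
    """
).fetchall()

print(rows)
```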
I have a table tbl_patient and I want to fetch the last 2 visits of each patient in order to compare whether the patient's condition is improving or degrading.
tbl_patient
id | patient_ID | visit_ID | patient_result
1 | 1 | 1 | 5
2 | 2 | 1 | 6
3 | 2 | 3 | 7
4 | 1 | 2 | 3
5 | 2 | 3 | 2
6 | 1 | 3 | 9
I tried the query below to fetch the last visit of each patient:
SELECT MAX(id), patient_result FROM `tbl_patient` GROUP BY `patient_ID`
Now I want to fetch the 2nd-to-last visit of each patient with the query below, but it gives me an error
(#1242 - Subquery returns more than 1 row)
SELECT id, patient_result FROM `tbl_patient` WHERE id <(SELECT MAX(id) FROM `tbl_patient` GROUP BY `patient_ID`) GROUP BY `patient_ID`
Where am I going wrong?
select p1.patient_id, p2.maxid id1, max(p1.id) id2
from tbl_patient p1
join (select patient_id, max(id) maxid
from tbl_patient
group by patient_id) p2
on p1.patient_id = p2.patient_id and p1.id < p2.maxid
group by p1.patient_id
id1 is the ID of the last visit, id2 is the ID of the 2nd to last visit.
Your first query doesn't get the last visits, since it gives results 5 and 6 instead of 2 and 9.
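To make the result concrete, the sketch below runs that join against the sample rows from the question. Python's sqlite3 module is used only as a test bed (an assumption, since the question is about MySQL), and `p2.maxid` is added to the GROUP BY so the query does not depend on MySQL's relaxed grouping rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_patient ("
    "id INTEGER PRIMARY KEY, patient_ID INTEGER, visit_ID INTEGER, patient_result INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 5), (2, 2, 1, 6), (3, 2, 3, 7), (4, 1, 2, 3), (5, 2, 3, 2), (6, 1, 3, 9)],
)

rows = conn.execute(
    """
    SELECT p1.patient_ID, p2.maxid AS id1, MAX(p1.id) AS id2
    FROM tbl_patient p1
    JOIN (
        -- Latest row id per patient.
        SELECT patient_ID, MAX(id) AS maxid
        FROM tbl_patient
        GROUP BY patient_ID
    ) p2 ON p1.patient_ID = p2.patient_ID AND p1.id < p2.maxid
    GROUP BY p1.patient_ID, p2.maxid
    ORDER BY p1.patient_ID
    """
).fetchall()

print(rows)
```

Patient 1 has ids 1, 4 and 6, so the last visit is row 6 and the one before it is row 4; patient 2 gives rows 5 and 3.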
You can try this query:
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
union
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
where id not in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
GROUP BY patient_ID)
order by 1,2
SELECT id, patient_result FROM `tbl_patient` t1
JOIN (SELECT MAX(id) as max, patient_ID FROM `tbl_patient` GROUP BY `patient_ID`) t2
ON t1.patient_ID = t2.patient_ID
WHERE id <max GROUP BY t1.`patient_ID`
There are a couple of approaches to getting the specified resultset returned in a single SQL statement.
Unfortunately, most of those approaches yield rather unwieldy statements.
The more elegant-looking statements tend to come with poor (or unbearable) performance when dealing with large sets, and the statements that tend to perform better look less elegant.
Three of the most common approaches make use of:
correlated subquery
inequality join (nearly a Cartesian product)
two passes over the data
Here's an approach that uses two passes over the data, using MySQL user variables, which basically emulates the analytic RANK() OVER(PARTITION ...) function available in other DBMS:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM (
SELECT p.id
, p.patient_id
, p.visit_id
, p.patient_result
, @rn := if(@prev_patient_id = patient_id, @rn + 1, 1) AS rn
, @prev_patient_id := patient_id AS prev_patient_id
FROM tbl_patients p
JOIN (SELECT @rn := 0, @prev_patient_id := NULL) i
ORDER BY p.patient_id DESC, p.id DESC
) t
WHERE t.rn <= 2
Note that this involves an inline view, which means there's going to be a pass over all the data in the table to create a "derived table". Then, the outer query will run against the derived table. So, this is essentially two passes over the data.
This query can be tweaked a bit to improve performance, by eliminating the duplicated value of the patient_id column returned by the inline view. But I show it as above, so we can better understand what is happening.
This approach can be rather expensive on large sets, but is generally MUCH more efficient than some of the other approaches.
Note also that this query will return a row for a patient_id even if only one id value exists for that patient; it does not restrict the return to just those patients that have at least two rows.
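Where analytic functions are available, the numbering that the user variables emulate can be written directly. The sketch below uses ROW_NUMBER() in place of RANK(), which is a swapped-in choice that gives every row a distinct number within its patient. It runs on Python's sqlite3 module, assuming the bundled SQLite is 3.25 or newer, and uses the table name and sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_patient ("
    "id INTEGER PRIMARY KEY, patient_ID INTEGER, visit_ID INTEGER, patient_result INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 5), (2, 2, 1, 6), (3, 2, 3, 7), (4, 1, 2, 3), (5, 2, 3, 2), (6, 1, 3, 9)],
)

rows = conn.execute(
    """
    SELECT id, patient_ID, visit_ID, patient_result
    FROM (
        -- Number each patient's rows from newest (rn = 1) to oldest.
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY patient_ID ORDER BY id DESC) AS rn
        FROM tbl_patient p
    ) t
    WHERE rn <= 2
    ORDER BY patient_ID, id DESC
    """
).fetchall()

for row in rows:
    print(row)
```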
It's also possible to get an equivalent resultset with a correlated subquery:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patients t
WHERE ( SELECT COUNT(1) AS cnt
FROM tbl_patients p
WHERE p.patient_id = t.patient_id
AND p.id >= t.id
) <= 2
ORDER BY t.patient_id ASC, t.id ASC
Note that this is making use of a "dependent subquery", which basically means that for each row returned from t, MySQL is effectively running another query against the database. So, this will tend to be very expensive (in terms of elapsed time) on large sets.
As another approach, if there are relatively few id values for each patient, you might be able to get by with an inequality join:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patients t
LEFT
JOIN tbl_patients p
ON p.patient_id = t.patient_id
AND t.id < p.id
GROUP
BY t.id
, t.patient_id
, t.visit_id
, t.patient_result
HAVING COUNT(p.id) <= 1 -- count only matched later rows; COUNT(1) would also count the NULL row from the LEFT JOIN
Note that this will create a nearly Cartesian product for each patient. For a limited number of id values for each patient, this won't be too bad. But if a patient has hundreds of id values, the intermediate result can be huge, on the order of O(n**2).
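As a sanity check on small data, the correlated-subquery form and the inequality-join form can be run side by side. The sketch below does that with Python's sqlite3 module (a stand-in for MySQL, which is an assumption) on the sample rows from the question. In the join version it counts `p.id` so that only genuinely later rows are counted: the unmatched row a LEFT JOIN produces for the latest visit contributes 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_patient ("
    "id INTEGER PRIMARY KEY, patient_ID INTEGER, visit_ID INTEGER, patient_result INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 5), (2, 2, 1, 6), (3, 2, 3, 7), (4, 1, 2, 3), (5, 2, 3, 2), (6, 1, 3, 9)],
)

# Keep a row if at most 2 rows of the same patient have an id >= its own.
CORRELATED = """
SELECT t.id
FROM tbl_patient t
WHERE (SELECT COUNT(*)
       FROM tbl_patient p
       WHERE p.patient_ID = t.patient_ID AND p.id >= t.id) <= 2
ORDER BY t.patient_ID, t.id
"""

# Keep a row if at most 1 later row exists for the same patient.
INEQUALITY_JOIN = """
SELECT t.id
FROM tbl_patient t
LEFT JOIN tbl_patient p
       ON p.patient_ID = t.patient_ID AND t.id < p.id
GROUP BY t.id, t.patient_ID
HAVING COUNT(p.id) <= 1
ORDER BY t.patient_ID, t.id
"""

correlated_ids = [row[0] for row in conn.execute(CORRELATED)]
join_ids = [row[0] for row in conn.execute(INEQUALITY_JOIN)]
print(correlated_ids, join_ids)
```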
Try this..
SELECT id, patient_result FROM tbl_patient AS tp WHERE id < ((SELECT MAX(id) FROM tbl_patient AS tp_max WHERE tp_max.patient_ID = tp.patient_ID) - 1) GROUP BY patient_ID
Why not use simply...
GROUP BY `patient_ID` DESC LIMIT 2
... and do the rest in the next step?
I googled a bit and looked on SO but I didn't find anything that helped me.
I have a working MySQL query that selects some columns (across three tables, with two JOIN statements) and I am looking to do something extra on the result set.
I would like to SELECT all rows from the 3 most recent groups. (I can only assume I have to use a GROUP BY on that column) I'm having a hard time explaining this clearly so I'll use an example:
id | group
--------------
1 | 1
2 | 2
3 | 2
4 | 2
5 | 3
6 | 3
7 | 4
8 | 4
Of course, I dumbed it down a lot for the sake of simplicity (and my current query doesn't include an id column).
Right now my ideal query would return, in order (that's the id field):
8, 7, 6, 5, 4, 3, 2
If I were to add the following 9th element:
id | group
--------------
9 | 5
My ideal query would then return, in order:
9, 8, 7, 6, 5
Because these are all the rows from the 3 most recent groups. Also, when two rows have the same group (and are still in the results set), I would like to ORDER them BY another field (which I have not included in my dumbed down example).
In my search I only found how to do actions on elements of GROUPS (MAX of each, AVG of group elements, etc.) and not GROUPS themselves (first 3 groups ordered by a field).
Thank you in advance for your help!
Edit: Here is what my real query looks like.
SELECT t1.f1, t1.f2, t2.f1, t2.f2, t2.f3, t3.f1, t3.f2, t3.f3, t3.f4
FROM t1
LEFT JOIN t2 ON t2.f1=t1.f3
LEFT JOIN t3 ON t2.f1=t3.f5
WHERE t1.f4='some_constant' AND t2.f4='some_other_constant'
ORDER BY t1.f2 DESC
SELECT `table`.* FROM
(SELECT DISTINCT `group`
FROM `table`
ORDER BY `group` DESC LIMIT 3) t1
INNER JOIN `table` ON `table`.`group` = t1.`group`
The subquery returns the three groups with the largest values, and the INNER JOIN ensures that no rows outside these groups are included.
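The simplified example from the question can be replayed to confirm the behaviour, including the case where a 9th row opens a new group. The sketch below uses Python's sqlite3 module as a test bed (an assumption; the question is about MySQL), with a hypothetical table `items` and the column named `grp` to sidestep the reserved word. The final ORDER BY is an addition, since the join alone does not fix the row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, grp INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 2), (4, 2), (5, 3), (6, 3), (7, 4), (8, 4)],
)

QUERY = """
SELECT i.id
FROM items i
JOIN (
    -- The three largest group values.
    SELECT DISTINCT grp
    FROM items
    ORDER BY grp DESC
    LIMIT 3
) recent ON recent.grp = i.grp
ORDER BY i.grp DESC, i.id DESC
"""

before = [row[0] for row in conn.execute(QUERY)]

# Adding a row in a new, higher group pushes group 2 out of the top three.
conn.execute("INSERT INTO items VALUES (9, 5)")
after = [row[0] for row in conn.execute(QUERY)]

print(before, after)
```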
Assuming t1.f2 is your group column:
SELECT a,b,c,d,e,f,g,h,i
FROM
(
SELECT t1.f1 as a, t1.f2 as b, t2.f1 as c, t2.f2 as d, t2.f3 as e, t3.f1 as f, t3.f2 as g, t3.f3 as h, t3.f4 as i
FROM t1
LEFT JOIN t2 ON t2.f1=t1.f3
LEFT JOIN t3 ON t2.f1=t3.f5
WHERE t1.f4='some_constant' AND t2.f4='some_other_constant'
ORDER BY t1.f2 DESC
) first_table
INNER JOIN
(
SELECT DISTINCT `f2`
FROM `t1`
ORDER BY `f2` DESC LIMIT 3
) second_table
ON first_table.b = second_table.f2
Note that this may be very inefficient depending on your table structure, but is the best I can do without more information.
How about this way... (I use groupId instead of 'group')
[QUERY] => something like (SELECT id, groupId from tables.....) (your query with 2 joins).
-- with this query you have the last three groups.
[QUERY2] => SELECT distinct(groupId) as groupId FROM ([QUERY]) q ORDER BY groupId DESC LIMIT 0,3
and finally you will have:
SELECT id, groupId from tables----...... WHERE groupId in ([QUERY2]) order by groupId DESC, id DESC