SQL limiting the output to a number of results - mysql

I have this query.
SELECT * FROM Customer WHERE TAGID IN ('UK', 'Germany');
This outputs the results to show all customers in UK and Germany, but as you can imagine, there will be loads of outputs, and I just require it to output 2 results for UK and 2 results for Germany?
Is this possible?

UNION ALL comes to mind:
(SELECT c.*
FROM Customer c
WHERE TAGID = 'UK'
LIMIT 2
) UNION ALL
(SELECT c.*
FROM Customer c
WHERE TAGID = 'Germany'
LIMIT 2
);
Note that this can take advantage of an index onCustomer(TagID). Also, this returns two arbitrary rows. You can use an ORDER BY to return the newest, oldest, biggest, smallest, reddest, bluest or whatever.
Here is a db<>fiddle.

Yes It is possible. Please check below methods.
Method 1 :
SELECT TOP 2 * FROM Customer WHERE TAGID ='UK'
UNION ALL
SELECT TOP 2 * FROM Customer WHERE TAGID = 'Germany'
Method 2:
(SELECT * FROM Customer WHERE TAGID ='UK' LIMIT 2)
UNION ALL
(SELECT * FROM Customer WHERE TAGID = 'Germany' LIMIT 2)
Method 3:
(SELECT * FROM Customer WHERE TAGID ='UK' AND ROWNUM = 2)
UNION ALL
(SELECT * FROM Customer WHERE TAGID = 'Germany' AND ROWNUM = 2)

Related

Random elements inside JOIN

I have this code here
INSERT INTO Directory.CatalogTaxonomy (`CatalogId`, `TaxonomyId`, `TaxonomyTypeId`, `IsApprovalRelevant`)
SELECT cat.CatalogId, dep.Id, #department_type, false
FROM Directory.Catalog cat
JOIN (SELECT * FROM (
SELECT * FROM Taxonomy.Department LIMIT 10
) as dep_tmp ORDER BY RAND() LIMIT 3) AS dep
WHERE cat.CatalogId NOT IN (SELECT CatalogId FROM Directory.CatalogTaxonomy WHERE TaxonomyTypeId = #department_type)
AND cat.UrlStatus = #url_status_green
AND (cat.StatusId = #status_published
OR cat.StatusId = #status_review_required);
And the problem is that, it should for each catalog take the first 10 elements from Department and randomly choose 3 of them, then add to CatalogDepartment 3 rows, each containing the catalog id and a taxonomy id. But instead it randomly chooses 3 Department elements and then adds those 3 elements to each catalog.
The current result looks like this:
1 000de9d7-af8b-4bac-bdbd-e6e361e5bc5e
1 001d4060-2924-4c75-b304-d780454f261b
1 001bc4b8-c1bc-498d-9aee-3825a40587d5
2 000de9d7-af8b-4bac-bdbd-e6e361e5bc5e
2 001d4060-2924-4c75-b304-d780454f261b
2 001bc4b8-c1bc-498d-9aee-3825a40587d5
3 000de9d7-af8b-4bac-bdbd-e6e361e5bc5e
3 001d4060-2924-4c75-b304-d780454f261b
3 001bc4b8-c1bc-498d-9aee-3825a40587d5
As you can see, there are only 3 departments chosen and repeated for every catalog
If you think that the query:
SELECT * FROM (
SELECT * FROM Taxonomy.Department LIMIT 10
) as dep_tmp
ORDER BY RAND() LIMIT 3
that you join to Directory.Catalog returns 3 different departments for each catalog then you are wrong.
This query is executed only once and returns 3 random departments which are joined (always the same 3) to Directory.Catalog.
What you can do is after you CROSS JOIN 10 departments to Directory.Catalog, choose randomly 3 of them for each catalog.
Try this:
INSERT INTO Directory.CatalogTaxonomy (`CatalogId`, `TaxonomyId`, `TaxonomyTypeId`, `IsApprovalRelevant`)
WITH cte AS (
SELECT cat.CatalogId, dep.Id AS TaxonomyId, #department_type AS TaxonomyTypeId, false AS IsApprovalRelevant
FROM Directory.Catalog AS cat
CROSS JOIN (SELECT * FROM Taxonomy.Department LIMIT 10) AS dep
WHERE cat.CatalogId NOT IN (SELECT CatalogId FROM Directory.CatalogTaxonomy WHERE TaxonomyTypeId = department_type)
AND cat.UrlStatus = #url_status_green
AND (cat.StatusId = #status_published OR cat.StatusId = #status_review_required);
)
SELECT t.CatalogId, t.TaxonomyId, t.TaxonomyTypeId, t.IsApprovalRelevant
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY CatalogId ORDER BY RAND()) rn
FROM cte
) t
WHERE t.rn <= 3
Note that this:
SELECT * FROM Taxonomy.Department LIMIT 10
does not guarantee that you get the first 10 elements from Department because a table is not ordered.

Get lowest shared value in a one to many relationship table

I have a one to many relationship table.
I want to get the lowest other_id that is shared between multiple id's.
id other_id
5 5
5 6
5 7
6 6
6 7
7 7
I can do this by building an SQL statement dynamically with parts added for each additional id I want to query on. For example:
select * from (
select other_id from SomeTable where id = 5
) as a
inner join (
select other_id from SomeTable where id = 6
) as b
inner join (
select other_id from SomeTable where id = 7
) as c
on a.other_id = b.other_id
and a.other_id = c.other_id
Is there a better way to do this? More specifically, is there a way to do this that doesn't require a variable number of joins? I feel like this problem probably already has a name and better solutions.
My query gives me the number 7, which is what I want.
i dont have a mysql server to test it out at the moment, but try a "group by" statement:
select
other_id
from
SomeTable
group by
other_id having count(*) > 1
order by
other_id asc
limit 1
the lowest other_id shared between all ids
select other_id
from SomeTable
group by other_id
having count(distinct id) = 3
order by other_id limit 1
or dynamically
select other_id
from SomeTable
group by other_id
having count(distinct id) = (select count(distinct id) from SomeTable)
order by other_id limit 1
or if you want to look for the lowest other_id shared between specific ids
select other_id
from SomeTable
where id in (5,6,7)
group by other_id
having count(distinct id) = 3
order by other_id limit 1
Use MIN for simple query
SELECT MIN(other_id) FROM table
Though, there seems to be working solutions I want to add my version, just to show, how smart I am ;-)
SELECT other_id FROM SomeTable a INNER JOIN SomeTable b on a.other_id=b.other_id ORDER by other_id ASC LIMIT 1

Fetch 2nd Higest value from MySql DB with GROUP BY

I have a table tbl_patient and I want to fetch last 2 visit of each patient in order to compare whether patient condition is improving or degrading.
tbl_patient
id | patient_ID | visit_ID | patient_result
1 | 1 | 1 | 5
2 | 2 | 1 | 6
3 | 2 | 3 | 7
4 | 1 | 2 | 3
5 | 2 | 3 | 2
6 | 1 | 3 | 9
I tried the query below to fetch the last visit of each patient as,
SELECT MAX(id), patient_result FROM `tbl_patient` GROUP BY `patient_ID`
Now i want to fetch the 2nd last visit of each patient with query but it give me error
(#1242 - Subquery returns more than 1 row)
SELECT id, patient_result FROM `tbl_patient` WHERE id <(SELECT MAX(id) FROM `tbl_patient` GROUP BY `patient_ID`) GROUP BY `patient_ID`
Where I'm wrong
select p1.patient_id, p2.maxid id1, max(p1.id) id2
from tbl_patient p1
join (select patient_id, max(id) maxid
from tbl_patient
group by patient_id) p2
on p1.patient_id = p2.patient_id and p1.id < p2.maxid
group by p1.patient_id
id11 is the ID of the last visit, id2 is the ID of the 2nd to last visit.
Your first query doesn't get the last visits, since it gives results 5 and 6 instead of 2 and 9.
You can try this query:
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
union
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
where id not in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
GROUP BY patient_ID)
order by 1,2
SELECT id, patient_result FROM `tbl_patient` t1
JOIN (SELECT MAX(id) as max, patient_ID FROM `tbl_patient` GROUP BY `patient_ID`) t2
ON t1.patient_ID = t2.patient_ID
WHERE id <max GROUP BY t1.`patient_ID`
There are a couple of approaches to getting the specified resultset returned in a single SQL statement.
Unfortunately, most of those approaches yield rather unwieldy statements.
The more elegant looking statements tend to come with poor (or unbearable) performance when dealing with large sets. And the statements that tend to have better performance are more un-elegant looking.
Three of the most common approaches make use of:
correlated subquery
inequality join (nearly a Cartesian product)
two passes over the data
Here's an approach that uses two passes over the data, using MySQL user variables, which basically emulates the analytic RANK() OVER(PARTITION ...) function available in other DBMS:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM (
SELECT p.id
, p.patient_id
, p.visit_id
, p.patient_result
, #rn := if(#prev_patient_id = patient_id, #rn + 1, 1) AS rn
, #prev_patient_id := patient_id AS prev_patient_id
FROM tbl_patients p
JOIN (SELECT #rn := 0, #prev_patient_id := NULL) i
ORDER BY p.patient_id DESC, p.id DESC
) t
WHERE t.rn <= 2
Note that this involves an inline view, which means there's going to be a pass over all the data in the table to create a "derived tabled". Then, the outer query will run against the derived table. So, this is essentially two passes over the data.
This query can be tweaked a bit to improve performance, by eliminating the duplicated value of the patient_id column returned by the inline view. But I show it as above, so we can better understand what is happening.
This approach can be rather expensive on large sets, but is generally MUCH more efficient than some of the other approaches.
Note also that this query will return a row for a patient_id if there is only one id value exists for that patient; it does not restrict the return to just those patients that have at least two rows.
It's also possible to get an equivalent resultset with a correlated subquery:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patients t
WHERE ( SELECT COUNT(1) AS cnt
FROM tbl_patients p
WHERE p.patient_id = t.patient_id
AND p.id >= t.id
) <= 2
ORDER BY t.patient_id ASC, t.id ASC
Note that this is making use of a "dependent subquery", which basically means that for each row returned from t, MySQL is effectively running another query against the database. So, this will tend to be very expensive (in terms of elapsed time) on large sets.
As another approach, if there are relatively few id values for each patient, you might be able to get by with an inequality join:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patients t
LEFT
JOIN tbl_patients p
ON p.patient_id = t.patient_id
AND t.id < p.id
GROUP
BY t.id
, t.patient_id
, t.visit_id
, t.patient_result
HAVING COUNT(1) <= 2
Note that this will create a nearly Cartesian product for each patient. For a limited number of id values for each patient, this won't be too bad. But if a patient has hundreds of id values, the intermediate result can be huge, on the order of (O)n**2.
Try this..
SELECT id, patient_result FROM tbl_patient AS tp WHERE id < ((SELECT MAX(id) FROM tbl_patient AS tp_max WHERE tp_max.patient_ID = tp.patient_ID) - 1) GROUP BY patient_ID
Why not use simply...
GROUP BY `patient_ID` DESC LIMIT 2
... and do the rest in the next step?

Union - Same table, excluding previous results MySQL

I'm trying to write a query that will:
Run a query, give me (x) number of rows (limit 4)
If that query didn't give me the 4 I need, run a second query limit 4-(x) and exclude the ids from the first query
A third query that acts like the second
I have this:
(SELECT *, 1 as SORY_QUERY1 FROM xbamZ where state = 'Minnesota' and industry = 'Miscellaneous' and id != '229' limit 4)
UNION
(SELECT *, 2 FROM xbamZ where state = 'Minnesota' limit 2)
UNION
(SELECT *, 3 FROM xbamZ where industry = 'Miscellaneous' limit 1)
How (or is?) do I do that? Am I close? This query gives me duplicates
I think there is no need for union and three selects. One will work as well
SELECT a.*
FROM
(
SELECT xbamZ.*,
CASE
WHEN state = 'Minnesota' and industry = 'Miscellaneous' and id != '229' THEN 1
WHEN state = 'Minnesota' THEN 2
WHEN industry = 'Miscellaneous' THEN 3
END as rnk
FROM xbamZ
where state = 'Minnesota' or industry = 'Miscellaneous'
)a
ORDER BY rnk
LIMIT 4;

MySQL select rows until fixed number of condition is reached

I have this table
id fruit
---------
1 apple
2 banana <--
3 apple
4 apple
5 apple
6 apple
7 banana <----
8 apple
9 banana
10 apple
And I want to select rows until 2 bananas are found, like
SELECT id FROM table_fruit UNTIL number_of_bananas = 2
So the result would be 1,2,3,4,5,6,7
How could I achieve this?
thanks
I wish I could give credits to all of you who answered my question. I'v tested all of them, and they all work perfectly (got the expected result).
Though answers of Devart and ypercube seem a little bit complex and difficult for me to understand.
And since AnandPhadke was the first one provided a working solution, I'll choose his answer as accepted.
You guys are awesome, thanks!
Try this query -
SELECT id, fruit FROM (
SELECT
b.*, #b:=IF(b.fruit = 'banana', 1, 0) + #b AS banana_number
FROM
bananas b,
(SELECT #b := 0) t
ORDER BY id) t2
WHERE
banana_number < 2 OR banana_number = 2 AND fruit = 'banana'
SQLFiddle demo
select * from tables where id <=
(
select id from (
select id from tables where fruit='banana'
order by id limit 2) a order by id desc limit 1
)
SQLFIDDLE DEMO
#Devart's answer is perfect but it's an alternative option to we can use:
SELECT * FROM table_fruit WHERE id <=
(
SELECT id FROM
(SELECT id FROM table_fruit WHERE fruit='banana' ORDER BY id LIMIT 2) a
ORDER BY ID DESC LIMIT 1
);
Or using MAX
SELECT * FROM table_fruit WHERE id <=
(
SELECT MAX(id) FROM
(SELECT id FROM table_fruit WHERE fruit='banana' ORDER BY id LIMIT 2) a
);
See this SQLFiddle
select * from table_fruit where id <=
(
select max(id) from
(select id from table_fruit where fruit='banana' order by id limit 2) t
)
If there are less than 2 rows with 'banana', this will return all rows of the table:
SELECT t.*
FROM table_fruit AS t
JOIN
( SELECT MAX(id) AS id
FROM
( SELECT id
FROM table_fruit
WHERE fruit = 'banana'
ORDER BY id
LIMIT 1 OFFSET 1
) AS lim2
) AS lim
ON t.id <= lim.id
OR lim.id IS NULL ;