I have the following sql statement:
select val, count(*) as tally from sometable group by val order by tally desc
Which produces the following (example) table:
val | tally
----+-------
4 | 20
5 | 10
6 | 5
7 | 3
8 | 2
Now, I want to only display the rows where tally > 5, so the result will be:
val | tally
----+-------
4 | 20
5 | 10
I tried this statement, but it does not work (it says "tally" is unknown):
select val, count(*) as tally from sometable where tally > 5 group by val order by tally desc
Try HAVING:
select val, count(*) as tally from sometable group by val having tally > 5 order by tally desc
There are two approaches you can take:
1: Use HAVING. This means that the query results will be filtered as per the HAVING clause and only some of the original result rows will be returned to you:
SELECT val, COUNT( * ) AS tally
FROM sometable
GROUP BY val
HAVING tally > 5
ORDER BY tally DESC
2: Use a sub-select. This may give you better performance (I'm no SQL guru though, so I can't say with confidence):
SELECT val, tally
FROM (SELECT val, COUNT( * ) AS tally
FROM sometable
GROUP BY val)
temp
WHERE tally > 5
ORDER BY tally DESC
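Here is a minimal sketch of both approaches, run through Python's built-in sqlite3 module rather than the asker's database (that is an assumption, and the rows are made up to reproduce the tallies from the question). It spells the filter as HAVING COUNT(*) > 5 because not every database accepts the tally alias inside HAVING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (val INTEGER)")

# Hypothetical data chosen to reproduce the tallies shown in the question.
tallies = {4: 20, 5: 10, 6: 5, 7: 3, 8: 2}
for val, n in tallies.items():
    conn.executemany("INSERT INTO sometable (val) VALUES (?)", [(val,)] * n)

# Approach 1: HAVING filters the groups after they have been aggregated.
having_rows = conn.execute(
    "SELECT val, COUNT(*) AS tally FROM sometable "
    "GROUP BY val HAVING COUNT(*) > 5 ORDER BY tally DESC"
).fetchall()

# Approach 2: aggregate in a derived table, then filter it with a plain WHERE.
subselect_rows = conn.execute(
    "SELECT val, tally FROM ("
    "  SELECT val, COUNT(*) AS tally FROM sometable GROUP BY val"
    ") AS temp WHERE tally > 5 ORDER BY tally DESC"
).fetchall()

print(having_rows)
print(subselect_rows)
```

Both queries should return the same two rows (val 4 and val 5), so which one to pick is mostly a matter of readability and of what the optimizer does with it.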
How to get the MAX value in every albumID(45, 12, 22, 8) in the following table?
I tried with this query.
But it returned me the first value, not max value.(3, 6, 5, 6)
SELECT
*
FROM
(
SELECT
*
FROM
contentnew
WHERE
genreID = 1
ORDER BY
albumID DESC,
reg_count DESC
) AS newTB
GROUP BY
albumID;
Look this
If I use the
Once you group by, you can apply aggregate functions such as max on each group. In your example try:
SELECT albumID, max(reg_count) as max_count
FROM contentnew
GROUP BY albumID
This will return each albumID with the max_count of its group. In the select list of a grouped query you can only use aggregate functions and the columns you grouped by, which is why we are able to project (or print) albumID.
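A small sketch of that grouped MAX, using Python's sqlite3 module as a stand-in for the real database. The rows are hypothetical: the albumID values 1 to 4 and the two-rows-per-album layout are assumptions, chosen only so that the first and max values match the numbers mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contentnew (albumID INTEGER, genreID INTEGER, reg_count INTEGER)"
)

# Hypothetical rows: for each album the first value is small, the max comes later.
rows = [
    (1, 1, 3), (1, 1, 45),
    (2, 1, 6), (2, 1, 12),
    (3, 1, 5), (3, 1, 22),
    (4, 1, 6), (4, 1, 8),
]
conn.executemany("INSERT INTO contentnew VALUES (?, ?, ?)", rows)

# One output row per albumID, carrying the largest reg_count of that group.
max_per_album = conn.execute(
    "SELECT albumID, MAX(reg_count) AS max_count "
    "FROM contentnew WHERE genreID = 1 "
    "GROUP BY albumID ORDER BY albumID"
).fetchall()

print(max_per_album)
```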
Following comments:
SELECT *
FROM contentnew as c1
WHERE c1.reg_count < (
SELECT max(c2.reg_count)
FROM contentnew as c2
WHERE c1.albumID = c2.albumID
GROUP BY c2.albumID)
You can try
select max(reg_count) from contentnew group by albumID
You are almost there. One thing that might be helpful, if you want every column from the table, is the row_number() function.
with contentnew_test
as
(
-- ROW is a reserved word in MySQL 8, so the alias is quoted
select row_number() over (partition by albumId order by reg_count desc) as `row`
,contentnew.* from
contentnew
)
select * from contentnew_test where `row` = 1 order by reg_count desc;
I used this as a reference as not sure about the syntax
https://www.mysqltutorial.org/mysql-window-functions/mysql-row_number-function/
Subquery will give you a result set something like this:
row albumId reg_count ...
1 1 8 ...
2 1 7 ...
3 1 3 ...
4 1 1 ...
1 2 22 ...
2 2 9 ...
3 2 6 ...
4 2 1 ...
...and so on.
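To see the numbering and the filter on row 1 work end to end, here is a sketch using Python's sqlite3 module (an assumption; it needs a SQLite build with window functions, 3.25 or later). The id column and the sample rows are hypothetical, loosely following the result set above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contentnew ("
    "id INTEGER PRIMARY KEY, albumID INTEGER, reg_count INTEGER)"
)
# Hypothetical rows; ids are assigned 1..6 in insertion order.
conn.executemany(
    "INSERT INTO contentnew (albumID, reg_count) VALUES (?, ?)",
    [(1, 8), (1, 7), (1, 3), (2, 22), (2, 9), (2, 6)],
)

# Number the rows inside each album from highest reg_count down,
# then keep only the first row of every album.
top_rows = conn.execute(
    """
    WITH ranked AS (
        SELECT c.*,
               ROW_NUMBER() OVER (
                   PARTITION BY albumID ORDER BY reg_count DESC
               ) AS rn
        FROM contentnew AS c
    )
    SELECT id, albumID, reg_count
    FROM ranked
    WHERE rn = 1
    ORDER BY reg_count DESC
    """
).fetchall()

print(top_rows)
```

Unlike the plain GROUP BY version, this keeps the whole row (here including id), not just the aggregated value.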
Does a relational database exist that has a GROUP BY aggregate function such as DISTINCT EXISTS that returns TRUE if there is more than one distinct value for the group and FALSE otherwise? I am looking for something that would iterate through the values in the group until the current value is not the same as the previous value, instead of counting ALL of the distinct values.
Example:
pv_name | time_stamp | value
A | 1 | 1
B | 2 | 1
C | 3 | 1
A | 4 | 2
C | 5 | 2
B | 6 | 3
SELECT pv_name
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING DISTINCT_EXISTS(value);
Result: A, C
SELECT pv_name
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
GROUP BY pv_name
HAVING MIN(value)<>MAX(value);
Might get you there quicker depending on indexes. I don't think you'll do much better than this or COUNT(DISTINCT value) though.
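As a quick check that the two HAVING conditions agree, here is a sketch that loads the example table into SQLite through Python's sqlite3 module (an assumption, the question does not name a database) and runs both variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (pv_name TEXT, time_stamp INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO example VALUES (?, ?, ?)",
    [("A", 1, 1), ("B", 2, 1), ("C", 3, 1), ("A", 4, 2), ("C", 5, 2), ("B", 6, 3)],
)

QUERY = """
    SELECT pv_name
    FROM example
    WHERE time_stamp > 0 AND time_stamp < 6
    GROUP BY pv_name
    HAVING {condition}
    ORDER BY pv_name
"""

def groups_with_changes(condition):
    """Return the pv_names whose group satisfies the given HAVING condition."""
    return [name for (name,) in conn.execute(QUERY.format(condition=condition))]

# A group has more than one distinct value exactly when its min and max differ.
by_min_max = groups_with_changes("MIN(value) <> MAX(value)")
by_count_distinct = groups_with_changes("COUNT(DISTINCT value) > 1")

print(by_min_max, by_count_distinct)
```

B drops out because its second value sits at time_stamp 6, outside the WHERE range.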
Have you tried joining to example twice?
Pseudo-code example:
with Q as
(
SELECT pv_name, value
FROM example
WHERE time_stamp > 0 AND time_stamp < 6
)
select distinct Q1.pv_name
from Q as Q1 inner join Q as Q2 on
Q1.pv_name=Q2.pv_name and
Q1.value<>Q2.value
You probably know about the COUNT(DISTINCT) function and you want to avoid it to prevent unnecessary computations.
It is hard to know why you are looking for this, but I assume that it takes a long time to find these groups using the most obvious query:
SELECT type, COUNT(DISTINCT product)
FROM aTable
GROUP BY type
HAVING COUNT(DISTINCT product) > 1
I can recommend you try the window functions. Try for example the new T-SQL's LAST_VALUE and FIRST_VALUE functions:
with c as (
SELECT type
,LAST_VALUE(product) OVER (PARTITION BY type ORDER BY product) lv
,FIRST_VALUE(product) OVER (PARTITION BY type ORDER BY product) pv
FROM aTable
)
SELECT * from c where lv <> pv
If the DB engine is smart enough it will find the first/last value for the group and will not try to count all the values, and therefore perform better.
For MySQL you can use helper variables to get the row_number per group based on the distinct values, something like this:
SELECT type, product
FROM (
SELECT @row_num := IF(@prev_type=type and @prev_prod=product,@row_num+1,1) AS RowNumber
,type
,product
,@prev_type := type
,@prev_prod := product
FROM Person,
(SELECT @row_num := 1) x,
(SELECT @prev_type := '') y,
(SELECT @prev_prod := '') z
ORDER BY type, product
) as a
WHERE RowNumber > 1
I think HAVING MIN(value) <> MAX(value) will be the most efficient here. An alternative is:
Select distinct pv_name
From example e
Left join (
Select value
From example
Where ...
Group by value
Having count (*) = 1
) s on e.value = s.value
Where s.value is null
Or you could use NOT EXISTS against that subquery instead.
Include the relevant where clause in the sub query.
Could you help me calculate SUM and COUNT on a simple table?
I have a simple table 'test':
id name value
1 a 4
2 a 5
3 b 3
4 b 7
5 b 1
I need to calculate SUM and COUNT for "a" and "b". I tried this SQL query:
SELECT name, SUM( value ) AS val, COUNT( * ) AS count FROM `test`
result:
name val count
a 20 5
But it should be:
name val count
a 9 2
b 11 3
Could you help me with the correct SQL query?
Add GROUP BY. That will cause the query to return a count and sum per group you defined (in this case, per name).
Without GROUP BY you just get the totals and an arbitrary one of the names (in your case 'a', but it could just as well have been 'b').
SELECT name, SUM( value ) AS val, COUNT( * ) AS count
FROM `test`
GROUP BY name
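Here is a sketch that runs the query with and without GROUP BY on the table from the question. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption), so the ungrouped query selects only the aggregates and leaves out the arbitrary name column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, "a", 4), (2, "a", 5), (3, "b", 3), (4, "b", 7), (5, "b", 1)],
)

# Without GROUP BY the whole table is a single group.
totals = conn.execute("SELECT SUM(value), COUNT(*) FROM test").fetchone()

# With GROUP BY there is one sum and one count per distinct name.
per_name = conn.execute(
    "SELECT name, SUM(value) AS val, COUNT(*) AS count "
    "FROM test GROUP BY name ORDER BY name"
).fetchall()

print(totals)
print(per_name)
```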
You need group by
select
name,
sum(value) as value,
count(*) as `count`
from test group by name ;
I'm trying to query a database but excluding the first and last rows from the table. Here's a sample table:
id | val
--------
1 1
2 9
3 3
4 1
5 2
6 6
7 4
In the above example, I'd first like to order it by val and then exclude the first and last rows for the query.
id | val
--------
4 1
5 2
3 3
7 4
6 6
This is the resulting set I would like. Note row 1 and 2 were excluded as they had the lowest and highest val respectively.
I've considered LIMIT, TOP, and a couple of other things but can't get my desired result. If there's a method to do it (even better with first/last % rather than first/last n), I can't figure it out.
You can try this mate:
SELECT * FROM numbers
WHERE id NOT IN (
SELECT id FROM numbers
WHERE val IN (
SELECT MAX(val) FROM numbers
) OR val IN (
SELECT MIN(val) FROM numbers
)
);
You can try this:
Select *
from table
where
val!=(select val from table order by val asc LIMIT 1)
and
val!=(select val from table order by val desc LIMIT 1)
order by val asc;
You can also use UNION and avoid the 2 val!=(query)
;WITH cte (id, val, rnum, qty) AS (
SELECT id
, val
, ROW_NUMBER() OVER(ORDER BY val, id)
, COUNT(*) OVER ()
FROM t
)
SELECT id
, val
FROM cte
WHERE rnum BETWEEN 2 AND qty - 1
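A sketch of the same ROW_NUMBER / COUNT(*) OVER () idea on the sample data, run through Python's sqlite3 module (an assumption; it needs SQLite 3.25 or later for window functions). The trimmed helper and its n parameter are hypothetical additions that generalize "first and last row" to "first and last n rows".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 1), (2, 9), (3, 3), (4, 1), (5, 2), (6, 6), (7, 4)],
)

def trimmed(n):
    """Return (id, val) ordered by val, without the first and last n rows."""
    return conn.execute(
        """
        WITH cte AS (
            SELECT id,
                   val,
                   ROW_NUMBER() OVER (ORDER BY val, id) AS rnum,
                   COUNT(*) OVER () AS qty
            FROM t
        )
        SELECT id, val
        FROM cte
        WHERE rnum > :n AND rnum <= qty - :n
        ORDER BY rnum
        """,
        {"n": n},
    ).fetchall()

print(trimmed(1))
print(trimmed(2))
```

Because the row number breaks ties on id, exactly one row is dropped at each end even when the lowest val occurs twice, which matches the result set asked for in the question. The versions that compare against MIN(val) and MAX(val) would drop both rows with val 1.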
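What if you use UNION and exclude the val you don't want? Something like below (each TOP 1 ... ORDER BY is wrapped in a derived table, because ORDER BY is not allowed directly inside a UNION operand):
select * from your_table
where val not in (
select val from (select top 1 val from your_table order by val) as lowest
union
select val from (select top 1 val from your_table order by val desc) as highest)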
I have a table tbl_patient and I want to fetch the last 2 visits of each patient in order to compare whether the patient's condition is improving or degrading.
tbl_patient
id | patient_ID | visit_ID | patient_result
1 | 1 | 1 | 5
2 | 2 | 1 | 6
3 | 2 | 3 | 7
4 | 1 | 2 | 3
5 | 2 | 3 | 2
6 | 1 | 3 | 9
I tried the query below to fetch the last visit of each patient:
SELECT MAX(id), patient_result FROM `tbl_patient` GROUP BY `patient_ID`
Now I want to fetch the 2nd-to-last visit of each patient, but the query gives me an error
(#1242 - Subquery returns more than 1 row)
SELECT id, patient_result FROM `tbl_patient` WHERE id <(SELECT MAX(id) FROM `tbl_patient` GROUP BY `patient_ID`) GROUP BY `patient_ID`
Where am I going wrong?
select p1.patient_id, p2.maxid id1, max(p1.id) id2
from tbl_patient p1
join (select patient_id, max(id) maxid
from tbl_patient
group by patient_id) p2
on p1.patient_id = p2.patient_id and p1.id < p2.maxid
group by p1.patient_id
id1 is the ID of the last visit, id2 is the ID of the 2nd to last visit.
Your first query doesn't get the last visits, since it gives results 5 and 6 instead of 2 and 9.
You can try this query:
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
union
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
where id not in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
GROUP BY patient_ID)
order by 1,2
SELECT id, patient_result FROM `tbl_patient` t1
JOIN (SELECT MAX(id) as max, patient_ID FROM `tbl_patient` GROUP BY `patient_ID`) t2
ON t1.patient_ID = t2.patient_ID
WHERE id <max GROUP BY t1.`patient_ID`
There are a couple of approaches to getting the specified resultset returned in a single SQL statement.
Unfortunately, most of those approaches yield rather unwieldy statements.
The more elegant-looking statements tend to come with poor (or unbearable) performance when dealing with large sets. And the statements that tend to have better performance are less elegant.
Three of the most common approaches make use of:
correlated subquery
inequality join (nearly a Cartesian product)
two passes over the data
Here's an approach that uses two passes over the data, using MySQL user variables, which basically emulates the analytic RANK() OVER(PARTITION ...) function available in other DBMS:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM (
SELECT p.id
, p.patient_id
, p.visit_id
, p.patient_result
, @rn := if(@prev_patient_id = patient_id, @rn + 1, 1) AS rn
, @prev_patient_id := patient_id AS prev_patient_id
FROM tbl_patient p
JOIN (SELECT @rn := 0, @prev_patient_id := NULL) i
ORDER BY p.patient_id DESC, p.id DESC
) t
WHERE t.rn <= 2
Note that this involves an inline view, which means there's going to be a pass over all the data in the table to create a "derived table". Then, the outer query will run against the derived table. So, this is essentially two passes over the data.
This query can be tweaked a bit to improve performance, by eliminating the duplicated value of the patient_id column returned by the inline view. But I show it as above, so we can better understand what is happening.
This approach can be rather expensive on large sets, but is generally MUCH more efficient than some of the other approaches.
Note also that this query will return a row for a patient_id even if only one id value exists for that patient; it does not restrict the return to just those patients that have at least two rows.
It's also possible to get an equivalent resultset with a correlated subquery:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patient t
WHERE ( SELECT COUNT(1) AS cnt
FROM tbl_patient p
WHERE p.patient_id = t.patient_id
AND p.id >= t.id
) <= 2
ORDER BY t.patient_id ASC, t.id ASC
Note that this is making use of a "dependent subquery", which basically means that for each row returned from t, MySQL is effectively running another query against the database. So, this will tend to be very expensive (in terms of elapsed time) on large sets.
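For a feel of what the correlated subquery returns, here is a sketch on the sample rows from the question, run through Python's sqlite3 module instead of MySQL (an assumption). It uses the question's table name, tbl_patient.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_patient ("
    "id INTEGER PRIMARY KEY, patient_ID INTEGER, "
    "visit_ID INTEGER, patient_result INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 5), (2, 2, 1, 6), (3, 2, 3, 7),
     (4, 1, 2, 3), (5, 2, 3, 2), (6, 1, 3, 9)],
)

# For every row t, count the rows of the same patient whose id is >= t.id.
# The latest visit counts 1, the one before it counts 2, older visits count more.
last_two = conn.execute(
    """
    SELECT t.id, t.patient_ID, t.patient_result
    FROM tbl_patient t
    WHERE (
        SELECT COUNT(*)
        FROM tbl_patient p
        WHERE p.patient_ID = t.patient_ID
          AND p.id >= t.id
    ) <= 2
    ORDER BY t.patient_ID, t.id
    """
).fetchall()

print(last_two)
```

With the sample data this keeps ids 4 and 6 for patient 1 and ids 3 and 5 for patient 2, which are the two rows per patient needed for the comparison.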
As another approach, if there are relatively few id values for each patient, you might be able to get by with an inequality join:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patient t
LEFT
JOIN tbl_patient p
ON p.patient_id = t.patient_id
AND t.id < p.id
GROUP
BY t.id
, t.patient_id
, t.visit_id
, t.patient_result
HAVING COUNT(p.id) < 2
Note that this will create a nearly Cartesian product for each patient. For a limited number of id values for each patient, this won't be too bad. But if a patient has hundreds of id values, the intermediate result can be huge, on the order of O(n**2).
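To put numbers on that growth, here is a sketch that counts the rows produced by the inequality self-join for a single patient with n visits. It uses Python's sqlite3 module and a stripped-down, hypothetical two-column version of the table (both are assumptions), and an inner join instead of the LEFT JOIN, which would add one more row for the latest visit.

```python
import sqlite3

def joined_row_count(visits_per_patient):
    """Count the intermediate rows of the self-join for one patient."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_patient (id INTEGER PRIMARY KEY, patient_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tbl_patient (patient_id) VALUES (?)",
        [(1,)] * visits_per_patient,
    )
    # Every row is paired with each later row of the same patient.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM tbl_patient t "
        "JOIN tbl_patient p ON p.patient_id = t.patient_id AND t.id < p.id"
    ).fetchone()
    conn.close()
    return count

for n in (10, 100, 200):
    print(n, joined_row_count(n))
```

The count is n * (n - 1) / 2, so 10 visits give 45 joined rows while 200 visits already give 19900, all of which have to be grouped before the HAVING clause can discard them.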
Try this..
SELECT id, patient_result FROM tbl_patient AS tp WHERE id < ((SELECT MAX(id) FROM tbl_patient AS tp_max WHERE tp_max.patient_ID = tp.patient_ID) - 1) GROUP BY patient_ID
Why not use simply...
GROUP BY `patient_ID` DESC LIMIT 2
... and do the rest in the next step?