SQL: Get the most frequent value for each group - mysql

Let's say that I have a table (MS Access / MySQL) with two columns (Time 'hh:mm:ss', Value) and I want to get the most frequent value for each group of rows.
For example, I have:
Time | Value
4:35:49 | 122
4:35:49 | 122
4:35:50 | 121
4:35:50 | 121
4:35:50 | 111
4:35:51 | 122
4:35:51 | 111
4:35:51 | 111
4:35:51 | 132
4:35:51 | 132
And I want to get the most frequent value for each Time:
Time | Value
4:35:49 | 122
4:35:50 | 121
4:35:51 | 132
Thanks in advance
Remark
I need to get the same result of this Excel solution : Get the most frequent value for each group
** MY SQL Solution **
I found a solution (Source) that works fine with MySQL, but I can't get it to work in MS Access:
select cnt1.`Time`,MAX(cnt1.`Value`)
from (select COUNT(*) as total, `Time`,`Value`
from `my_table`
group by `Time`,`Value`) cnt1,
(select MAX(total) as maxtotal from (select COUNT(*) as total,
`Time`,`Value` from `my_table` group by `Time`,`Value`) cnt3 ) cnt2
where cnt1.total = cnt2.maxtotal GROUP BY cnt1.`Time`

Consider an INNER JOIN to match the two derived table subqueries rather than a list of subquery select statements matched with WHERE clause. This has been tested in MS Access:
SELECT MaxCountSub.`Time`, CountSub.`Value`
FROM
(SELECT myTable.`Time`, myTable.`Value`, Count(myTable.`Value`) AS CountOfValue
FROM myTable
GROUP BY myTable.`Time`, myTable.`Value`) As CountSub
INNER JOIN
(SELECT dT.`Time`, Max(CountOfValue) As MaxCountOfValue
FROM
(SELECT myTable.`Time`, myTable.`Value`, Count(myTable.`Value`) AS CountOfValue
FROM myTable
GROUP BY myTable.`Time`, myTable.`Value`) As dT
GROUP BY dT.`Time`) As MaxCountSub
ON CountSub.`Time` = MaxCountSub.`Time`
AND CountSub.CountOfValue = MaxCountSub.MaxCountOfValue
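One thing to watch with the join above: when two values tie for the highest count within a Time (as 111 and 132 do at 4:35:51), both rows come back. Below is a minimal sketch, assuming SQLite via Python's sqlite3 module as a stand-in for Access/MySQL and a hypothetical table name my_table, that wraps the same join in MAX(Value) so the tie resolves to the question's expected output.

```python
import sqlite3

# In-memory table mirroring the question's sample data (my_table is a hypothetical name).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (Time TEXT, Value INTEGER)")
rows = [
    ("4:35:49", 122), ("4:35:49", 122),
    ("4:35:50", 121), ("4:35:50", 121), ("4:35:50", 111),
    ("4:35:51", 122), ("4:35:51", 111), ("4:35:51", 111),
    ("4:35:51", 132), ("4:35:51", 132),
]
conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)

# Same shape as the join above, plus MAX(Value) to break ties within a Time.
query = """
SELECT c.Time, MAX(c.Value) AS Value
FROM (SELECT Time, Value, COUNT(*) AS cnt
      FROM my_table GROUP BY Time, Value) AS c
INNER JOIN
     (SELECT Time, MAX(cnt) AS max_cnt
      FROM (SELECT Time, Value, COUNT(*) AS cnt
            FROM my_table GROUP BY Time, Value) AS d
      GROUP BY Time) AS m
  ON c.Time = m.Time AND c.cnt = m.max_cnt
GROUP BY c.Time
ORDER BY c.Time
"""
most_frequent = conn.execute(query).fetchall()
print(most_frequent)
```

Picking the largest tied value is only one possible tie-break; MIN(Value) would work the same way if the smaller value is preferred.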

You can do this with a query like this:
select time, value
from (select value, time from your_table
group by value , time
order by count(time) desc
) temp where temp.value = value
group by value

Related

sql query group by with aggregate

Have the following table as an example:
zipcode | zipsource | patientcount
-----------------------------------
81501 | CMHSP | 10
81503 | CMHSP | 20
81505 | CMHSP | 30
81501 | SMHRMC | 15
81503 | SMHRMC | 25
81505 | SMHRMC | 35
Trying to show only the zipcodes where the patient count is above 20% of total for Source and Source = SMHRMC(normally a parameter but for the example I've selected SMHRMC). Output table as follows:
zipcode | zipsource | patientcount | Total | Percent
--------------------------------------------------------
81503 | SMHRMC | 25 | 75 | 33%
81505 | SMHRMC | 35 | 75 | 47%
I've tried multiple queries but at this point I don't think I'm close. Any ideas?
The query that works is as follows:
select zipcode,
zip_source,
patient_count,
total_count,
patient_count *100/total_count as percentage
from Zip_Count_Source
cross join (select sum(patient_count) as total_count
from zip_count_Source
where zip_source = 'COMHSP') as X
where zip_source = 'COMHSP' and patient_count*100/total_count > 1
But the issue I have now is that zip_source can be a multi-value parameter, so I changed the clauses to zip_source in ('COMHSP', 'SMHRMC'). That works, but I want total_count computed for each source separately, not combined across both sources. GROUP BY did not work after the WHERE clause. Thanks to all who have helped.
Try this:
SELECT zipcode, zipsource, patientcount * 100 / total_count
FROM mytable
CROSS JOIN (SELECT SUM(patientcount) AS total_count
FROM mytable
WHERE zipsource = 'SMHRMC') AS x
WHERE zipsource = 'SMHRMC' AND patientcount / total_count > 0.2
The query uses a CROSS JOIN in order to link the total count of patientcount with the table. Using this count we can calculate the percentage, and filter out any rows not exceeding the desired value.
This should do the trick
select t1.zipcode,
t1.zipsource,
t1.patientcount,
t2.total,
t1.patientcount / t2.total * 100 as percent
from yourTable t1
join (
select zipsource, sum(patientcount) as total
from yourTable
group by zipsource
) t2
on t1.zipsource = t2.zipsource
where t1.patientcount / t2.total > 0.2
To filter for a single zipsource, you can add a condition to the where clause
where t1.patientcount / t2.total > 0.2 and
t1.zipsource = 'SMHRMC'
When joining to another table, there is commonly a join condition to filter out rows preventing a Cartesian product (the contents of one table multiplied by another). Since the derived table (total) returns a single value, the join condition is unneeded (the contents of table1 multiplied by 1 = table1).
This is acceptable as long as the value represents what you want to express (i.e., remove the where condition and it will produce the grand total of all patients).
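A single cross-joined total works for one source, but the question's follow-up asks for a separate total per source when several sources are selected. A minimal sketch of one way to do that, assuming SQLite via Python's sqlite3 module and a hypothetical table name zip_counts: group the totals by zipsource and join on that column instead of cross joining.

```python
import sqlite3

# Sample data from the question; zip_counts is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zip_counts (zipcode TEXT, zipsource TEXT, patientcount INTEGER)"
)
conn.executemany(
    "INSERT INTO zip_counts VALUES (?, ?, ?)",
    [
        ("81501", "CMHSP", 10), ("81503", "CMHSP", 20), ("81505", "CMHSP", 30),
        ("81501", "SMHRMC", 15), ("81503", "SMHRMC", 25), ("81505", "SMHRMC", 35),
    ],
)

# One total per source, joined back on zipsource, so each row is compared
# with its own source's total rather than a combined total.
query = """
SELECT z.zipcode, z.zipsource, z.patientcount, t.total,
       CAST(ROUND(z.patientcount * 100.0 / t.total) AS INTEGER) AS pct
FROM zip_counts AS z
JOIN (SELECT zipsource, SUM(patientcount) AS total
      FROM zip_counts
      GROUP BY zipsource) AS t
  ON z.zipsource = t.zipsource
WHERE z.zipsource IN ('CMHSP', 'SMHRMC')
  AND z.patientcount * 100 > 20 * t.total   -- strictly above 20%, integer-safe
ORDER BY z.zipsource, z.zipcode
"""
per_source = conn.execute(query).fetchall()
for row in per_source:
    print(row)
```

The comparison multiplies both sides instead of dividing, which sidesteps integer division in engines that truncate patientcount / total to 0.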
select zipcode,
zipsource,
sum(patientcount) as patientcount,
Total,
concat(round(100*sum(patientcount)/Total),'%') as `SMHRMC_%`
from table1,
(select sum(patientcount) as Total
from table1
where zipsource='SMHRMC') as t
where zipsource='SMHRMC'
group by zipcode, zipsource, Total
having sum(patientcount)/Total >= .2

SQL to display rows greater than or equal to a row value by keeping the same id and loop for all id's

I have 3 columns production number(int) , op number(int) and value(float). No column is distinct by itself. I need to look for the values <= 0 and display everything that's within that production number(int)
Example :
PO# | OP# | values
5247 | 100 | 12.0
5247 | 200 | 22.0
5247 | 300 | -12.0
5247 | 400 | 52.0
6328 | 100 | 11.0
6328 | 300 | 55.0
I need to get these two rows
5247, 300 , -12.0 and
5247, 400 , 52.0
not any other rows. How do I do that?
Just another guess to improve @DarshanMehta's query:
http://sqlfiddle.com/#!9/0a3567/2
SELECT t.*
FROM prodOps t
INNER JOIN
(SELECT po, op
FROM prodOps
WHERE `values` < 0) f
ON t.OP >= f.op AND f.po=t.po
But what will happen if you have several records with negative values?
Check this fiddle:
http://sqlfiddle.com/#!9/138f44/1
and a guess at a solution is:
SELECT t.*
FROM prodOps t
INNER JOIN
(SELECT po, MAX(op) op
FROM prodOps
WHERE `values` < 0
GROUP BY po) f
ON t.OP >= f.op AND f.po=t.po
But I would suggest that you rethink the problem, redesign the table, and refactor your app. This approach leads nowhere.
Using exists():
select *
from t
where exists (
select 1
from t as i
where t.PO = i.PO
and t.OP >= i.OP
and i.value < 0
)
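To check the exists() approach against the question's sample rows, here is a small sketch, assuming SQLite via Python's sqlite3 module and hypothetical column names po, op, and value on a table called prodOps. A row is kept when some row in the same PO with an equal or lower OP has a negative value.

```python
import sqlite3

# Sample rows from the question; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prodOps (po INTEGER, op INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO prodOps VALUES (?, ?, ?)",
    [
        (5247, 100, 12.0), (5247, 200, 22.0), (5247, 300, -12.0),
        (5247, 400, 52.0), (6328, 100, 11.0), (6328, 300, 55.0),
    ],
)

# Keep a row if the same PO has a negative value at this OP or an earlier one.
query = """
SELECT t.po, t.op, t.value
FROM prodOps AS t
WHERE EXISTS (
    SELECT 1
    FROM prodOps AS i
    WHERE i.po = t.po
      AND i.op <= t.op
      AND i.value < 0
)
ORDER BY t.po, t.op
"""
from_first_negative = conn.execute(query).fetchall()
print(from_first_negative)
```

With several negative values in one PO, this keeps everything from the first negative OP onward, which differs from the MAX(op) variants that start at the last negative OP.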
If I understand your requirements:
Select A.*
From YourTable A
Join (
Select [PO#]
,min([Values]) as minV
,max([Values]) as maxV
From YourTable
Group by [PO#]
Having min([Values])<0
) B
on A.[PO#]=B.[PO#] and A.[Values] in (B.minV,B.maxV)
Returns
PO# OP# values
5247 300 -12
5247 400 52
Can you try the below query:
SELECT *
FROM table t
WHERE t.OP >=
(SELECT MAX(OP)
FROM table
WHERE PO = t.PO AND value < 0);

MySQL - select rows under an ID, group by column value that has the latest timestamp

Table:
----------------------------------------------------
ID | field_name | field_value | timestamp
----------------------------------------------------
2 | postcode | LS1 | 2016-11-09 16:45:15
2 | age | 34 | 2016-11-09 16:45:22
2 | job | Scientist | 2016-11-09 16:45:27
2 | age | 38 | 2016-11-09 16:46:40
7 | postcode | LS5 | 2016-11-09 16:47:05
7 | age | 24 | 2016-11-09 16:47:44
I wonder if anyone could give me a few pointers, based on the above data, I would like to query by ID 2, return a row for each unique field_name (if more than one row exists under the same id with the same field_name then just return the row with the latest timestamp).
I have managed to almost achieve this by grouping the field_name, which will return a list of unique rows but not necessarily the latest row.
SELECT * FROM fragment WHERE (id = :id) GROUP BY field_name
I would really be grateful for any pointers on what exactly I should do here, and how I could fit something along the lines of MAX(timestamp) into this query.
Many thanks!
First, you need a set of rows holding the max timestamp for each ID and field_name; generate that set as an inline view (B below). Then join this set (B) back to your base table, letting the inner join eliminate the unwanted rows.
SELECT A.ID, A.field_name, A.field_value, A.timestamp
FROM Table A
INNER JOIN (SELECT ID, field_name, MAX(timestamp) TS
FROM table
GROUP BY ID, field_name) B
on A.ID = B.ID
and A.field_name = B.field_name
and A.timestamp = B.TS
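A quick way to see the join-back in action is the sketch below. It assumes SQLite via Python's sqlite3 module, uses the question's table name fragment, and renames the timestamp column to ts for illustration. The :id placeholder mirrors the one in the question.

```python
import sqlite3

# Sample rows from the question; the ts column name is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fragment (id INTEGER, field_name TEXT, field_value TEXT, ts TEXT)"
)
conn.executemany(
    "INSERT INTO fragment VALUES (?, ?, ?, ?)",
    [
        (2, "postcode", "LS1", "2016-11-09 16:45:15"),
        (2, "age", "34", "2016-11-09 16:45:22"),
        (2, "job", "Scientist", "2016-11-09 16:45:27"),
        (2, "age", "38", "2016-11-09 16:46:40"),
        (7, "postcode", "LS5", "2016-11-09 16:47:05"),
        (7, "age", "24", "2016-11-09 16:47:44"),
    ],
)

# Inline view b holds the latest ts per (id, field_name); the inner join
# keeps only the rows of a that carry that latest ts.
query = """
SELECT a.field_name, a.field_value, a.ts
FROM fragment AS a
INNER JOIN (SELECT id, field_name, MAX(ts) AS max_ts
            FROM fragment
            GROUP BY id, field_name) AS b
  ON a.id = b.id
 AND a.field_name = b.field_name
 AND a.ts = b.max_ts
WHERE a.id = :id
ORDER BY a.field_name
"""
latest_fields = conn.execute(query, {"id": 2}).fetchall()
for row in latest_fields:
    print(row)
```

If two rows for the same field share the exact same timestamp, both survive the join, so a tie-break column would be needed in that case.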
Outside of MySQL (or in MySQL 8.0 and later) this could be done using window/analytic functions: assign a row number to each record within its group and eliminate those > 1, something like:
SELECT B.*
FROM (SELECT A.ID
, A.field_name
, A.field_value
, A.timestamp
, ROW_NUMBER() OVER (PARTITION BY A.ID, A.field_name ORDER BY A.timestamp DESC) RN
FROM Table A ) B
WHERE B.RN = 1
or using a cross apply with a limit or top.
The simplest way to do it:
SELECT *
FROM fragment fra1
WHERE (id = :id)
and timestamp = (select max(timestamp)
from fragment fra2
where fra2.id = fra1.id
and fra2.field_name = fra1.field_name)
GROUP BY field_name

Using ORDER BY and GROUP BY together

My table looks like this (and I'm using MySQL):
m_id | v_id | timestamp
------------------------
6 | 1 | 1333635317
34 | 1 | 1333635323
34 | 1 | 1333635336
6 | 1 | 1333635343
6 | 1 | 1333635349
My target is to take each m_id one time, and order by the highest timestamp.
The result should be:
m_id | v_id | timestamp
------------------------
6 | 1 | 1333635349
34 | 1 | 1333635336
And I wrote this query:
SELECT * FROM table GROUP BY m_id ORDER BY timestamp DESC
But, the results are:
m_id | v_id | timestamp
------------------------
34 | 1 | 1333635323
6 | 1 | 1333635317
I think this happens because it first does the GROUP BY and then orders the results.
Any ideas? Thank you.
One way to do this that correctly uses group by:
select l.*
from table l
inner join (
select
m_id, max(timestamp) as latest
from table
group by m_id
) r
on l.timestamp = r.latest and l.m_id = r.m_id
order by timestamp desc
How this works:
selects the latest timestamp for each distinct m_id in the subquery
only selects rows from table that match a row from the subquery (this operation -- where a join is performed, but no columns are selected from the second table, it's just used as a filter -- is known as a "semijoin" in case you were curious)
orders the rows
If you really don't care about which timestamp you'll get and your v_id is always the same for a given m_id, you can do the following:
select m_id, v_id, max(timestamp) from table
group by m_id, v_id
order by max(timestamp) desc
Now, if the v_id changes for a given m_id then you should do the following
select t1.* from table t1
left join table t2 on t1.m_id = t2.m_id and t1.timestamp < t2.timestamp
where t2.timestamp is null
order by t1.timestamp desc
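The left join works as an anti-join: a row survives only when no other row with the same m_id has a later timestamp. A minimal sketch, assuming SQLite via Python's sqlite3 module, a hypothetical table name events, a column renamed to ts, and made-up sample rows where v_id varies within an m_id:

```python
import sqlite3

# Hypothetical data: v_id differs from row to row within each m_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (m_id INTEGER, v_id INTEGER, ts INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(6, 1, 11), (34, 2, 12), (34, 3, 13), (6, 4, 14), (6, 5, 15)],
)

# t2 matches any later row for the same m_id; rows with no such match
# (t2.ts IS NULL) are the latest row of their group.
query = """
SELECT t1.m_id, t1.v_id, t1.ts
FROM events AS t1
LEFT JOIN events AS t2
  ON t1.m_id = t2.m_id AND t1.ts < t2.ts
WHERE t2.ts IS NULL
ORDER BY t1.ts DESC
"""
latest_per_m_id = conn.execute(query).fetchall()
print(latest_per_m_id)
```

Because whole rows are returned, v_id comes from the same row as the latest timestamp, which a plain GROUP BY m_id cannot guarantee.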
Here is the simplest solution
select m_id,v_id,max(timestamp) from table group by m_id;
Group by m_id but get max of timestamp for each m_id.
You can try this
SELECT tbl.* FROM (SELECT * FROM table ORDER BY timestamp DESC) as tbl
GROUP BY tbl.m_id
SQL>
SELECT interview.qtrcode QTR, interview.companyname "Company Name", interview.division Division
FROM interview
JOIN jobsdev.employer
ON (interview.companyname = employer.companyname AND employer.zipcode like '100%')
GROUP BY interview.qtrcode, interview.companyname, interview.division
ORDER BY interview.qtrcode;
I felt confused when I tried to understand the question and answers at first. I spent some time reading and I would like to make a summary.
The OP's example is a little bit misleading.
At first I didn't understand why the accepted answer is the accepted answer. I thought that the OP's request could be simply fulfilled with
select m_id, v_id, max(timestamp) as max_time from table
group by m_id, v_id
order by max_time desc
Then I took a second look at the accepted answer. And I found that actually the OP wants to express that, for a sample table like:
m_id | v_id | timestamp
------------------------
6 | 1 | 11
34 | 2 | 12
34 | 3 | 13
6 | 4 | 14
6 | 5 | 15
he wants to select all columns based only on (group by)m_id and (order by)timestamp.
Then the above SQL won't work. If you still don't get it, imagine you have more columns than m_id | v_id | timestamp, e.g. m_id | v_id | timestamp | columnA | columnB | columnC | .... With group by, you can only select those "group by" columns and aggregate functions in the result.
By now, you should understand the accepted answer.
What's more, check row_number function introduced in MySQL 8.0:
https://www.mysqltutorial.org/mysql-window-functions/mysql-row_number-function/
Finding top N rows of every group
It does a similar thing to the accepted answer.
Some answers are wrong. My MySQL gives me an error.
select m_id,v_id,max(timestamp) from table group by m_id;
@abinash sahoo
SELECT m_id,v_id,MAX(TIMESTAMP) AS TIME
FROM table_name
GROUP BY m_id
@Vikas Garhwal
Error message:
[42000][1055] Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'testdb.test_table.v_id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
Why make it so complicated? This worked.
SELECT m_id,v_id,MAX(TIMESTAMP) AS TIME
FROM table_name
GROUP BY m_id
You just need to replace desc with asc. Write the query like below. It will return the values in ascending order.
SELECT * FROM table GROUP BY m_id ORDER BY m_id asc;

Query to Segment Results Based on Equal Sets of Column Value

I'd like to construct a single query (or as few as possible) to group a data set. So given a number of buckets, I'd like to return results based on a specific column.
So given a column called score which is a double which contains:
90.00
91.00
94.00
96.00
98.00
99.00
I'd like to be able to use a GROUP BY clause with a function like:
SELECT MIN(score), MAX(score), SUM(score) FROM table GROUP BY BUCKETS(score, 3)
Ideally this would return 3 rows (grouping the results into 3 buckets with as close to equal count in each group as is possible):
90.00, 91.00, 181.00
94.00, 96.00, 190.00
98.00, 99.00, 197.00
Is there some function that would do this? I'd like to avoid returning all the rows and figuring out the bucket segments myself.
Dave
create table test (
id int not null auto_increment primary key,
val decimal(4,2)
) engine = myisam;
insert into test (val) values
(90.00),
(91.00),
(94.00),
(96.00),
(98.00),
(99.00);
select min(val) as lower,max(val) as higher,sum(val) as total from (
select id,val,@row:=@row+1 as row
from test,(select @row:=0) as r order by id
) as t
group by ceil(row/2)
+-------+--------+--------+
| lower | higher | total |
+-------+--------+--------+
| 90.00 | 91.00 | 181.00 |
| 94.00 | 96.00 | 190.00 |
| 98.00 | 99.00 | 197.00 |
+-------+--------+--------+
3 rows in set (0.00 sec)
Unfortunately, MySQL (before 8.0) has no analytic function such as ROW_NUMBER(), so you have to emulate it with a user variable. Once you have a row number, you can simply use the ceil() function to group every N rows as you like. Hope that helps.
set @r = (select count(*) from test);
select min(val) as lower,max(val) as higher,sum(val) as total from (
select id,val,@row:=@row+1 as row
from test,(select @row:=0) as r order by id
) as t
group by ceil(row/ceil(@r/3))
or, with a single query
select min(val) as lower,max(val) as higher,sum(val) as total from (
select id,val,@row:=@row+1 as row,tot
from test,(select count(*) as tot from test) as t2,(select @row:=0) as r order by id
) as t
group by ceil(row/ceil(tot/3))
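The bucket arithmetic in these queries can be checked outside the database. Below is a minimal Python sketch of the same idea, with hypothetical names (bucket_stats, bucket_count): number the rows from 1, compute the rows per bucket as ceil(total / buckets), and assign each row to bucket ceil(row / size). It orders by value rather than by id, which is an assumption that happens to match the sample data.

```python
import math

scores = [90.0, 91.0, 94.0, 96.0, 98.0, 99.0]
bucket_count = 3


def bucket_stats(values, buckets):
    """Return (min, max, sum) per bucket, using ceil(row / ceil(total / buckets))."""
    ordered = sorted(values)
    size = math.ceil(len(ordered) / buckets)  # rows per bucket, like ceil(tot/3)
    groups = {}
    for row, val in enumerate(ordered, start=1):  # row plays the role of @row
        groups.setdefault(math.ceil(row / size), []).append(val)
    return [(min(g), max(g), sum(g)) for _, g in sorted(groups.items())]


result = bucket_stats(scores, bucket_count)
for lower, higher, total in result:
    print(lower, higher, total)
```

When the row count does not divide evenly, the last bucket simply ends up smaller, and very uneven counts can yield fewer buckets than requested, so this is closer to "fixed-size chunks" than to a true equal-count split.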