Finding the entry with the most occurrences per group - mysql

I have the following (simplified) schema:
CREATE TABLE TEST_Appointment(
Appointment_id INT AUTO_INCREMENT PRIMARY KEY,
Property_No INT NOT NULL,
Property_Type varchar(10) NOT NULL
);
INSERT INTO TEST_Appointment(Property_No, Property_Type) VALUES
(1, 'House'),
(1, 'House'),
(1, 'House'),
(2, 'Flat'),
(2, 'Flat'),
(3, 'Flat'),
(4, 'House'),
(5, 'House'),
(6, 'Studio');
I am trying to write a query to get the properties that have the most appointments in each property type group. An example output would be:
Property_No | Property_Type | Number of Appointments
-----------------------------------------------------
1 | House | 3
2 | Flat | 2
6 | Studio | 1
I have the following query to get the number of appointments per property, but I am not sure how to go from there:
SELECT Property_No, Property_Type, COUNT(*)
from TEST_Appointment
GROUP BY Property_Type, Property_No;

If you are running MySQL 8.0, you can use aggregation and window functions:
select *
from (
select property_no, property_type, count(*) no_appointments,
rank() over(partition by property_type order by count(*) desc) rn
from test_appointment
group by property_no, property_type
) t
where rn = 1
In earlier versions, one option uses a having clause and a row-limiting correlated subquery:
select property_no, property_type, count(*) no_appointments
from test_appointment t
group by property_no, property_type
having count(*) = (
select count(*)
from test_appointment t1
where t1.property_type = t.property_type
group by t1.property_no
order by count(*) desc
limit 1
)
Note that both queries allow ties: if several properties share the top count within a property type, all of them are returned.
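To see the tie behaviour concretely, here is a minimal sketch that runs a tie-preserving query against the sample data plus one extra appointment for property 3, so that properties 2 and 3 tie in the Flat group. The extra row is hypothetical. Python's built-in sqlite3 module stands in for MySQL here (an assumption made only to keep the snippet self-contained), and the query swaps in a join against the per-type maximum rather than either query above; it uses only derived tables, so it should also be valid on MySQL 5.7.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TEST_Appointment(
    Appointment_id INTEGER PRIMARY KEY AUTOINCREMENT,
    Property_No INT NOT NULL,
    Property_Type VARCHAR(10) NOT NULL
);
INSERT INTO TEST_Appointment(Property_No, Property_Type) VALUES
    (1, 'House'), (1, 'House'), (1, 'House'),
    (2, 'Flat'), (2, 'Flat'),
    (3, 'Flat'), (3, 'Flat'),   -- second row for property 3 is hypothetical, added to force a tie
    (4, 'House'), (5, 'House'), (6, 'Studio');
""")

# Count appointments per property, then keep the properties whose count
# equals the maximum count within their property type.
query = """
SELECT c.Property_No, c.Property_Type, c.no_appointments
FROM (
    SELECT Property_No, Property_Type, COUNT(*) AS no_appointments
    FROM TEST_Appointment
    GROUP BY Property_No, Property_Type
) AS c
JOIN (
    SELECT Property_Type, MAX(no_appointments) AS top
    FROM (
        SELECT Property_No, Property_Type, COUNT(*) AS no_appointments
        FROM TEST_Appointment
        GROUP BY Property_No, Property_Type
    ) AS per_property
    GROUP BY Property_Type
) AS b
  ON b.Property_Type = c.Property_Type
 AND b.top = c.no_appointments
ORDER BY c.Property_Type, c.Property_No
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Both Flat properties come back because they share the top count. If you need exactly one row per type, you would have to pick a tiebreaker yourself (for example the lowest Property_No).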

Related

MYSQL sum of max score taken users list

I have following table
CREATE TABLE Table1
(`userid` varchar(11), `score` int, `type` varchar(22));
INSERT INTO Table1
(`userid`, `score`,`type`)
VALUES
(11, 2,'leader'),
(11, 6,'leader'),
(13, 6,'leader'),
(15, 4,'leader'),
(15, 4,'leader'),
(12, 1,'leader'),
(14, 1,'leader');
I need to get the userid of the user with the maximum total score.
If two or more users share the maximum score, I need their userids as well.
I have tried the following query:
SELECT userid, sum(score) as totalScore
FROM Table1 WHERE type = "leader" GROUP BY userid
ORDER BY totalScore DESC;
But it returns every user; I can't restrict it to the users with the maximum score.
I need only the first two rows of the data.
Please help.
On MySQL 8+, I suggest using the RANK() analytic function:
WITH cte AS (
SELECT userid, SUM(score) AS totalScore,
RANK() OVER (ORDER BY SUM(score) DESC) rnk
FROM Table1
WHERE type = 'leader'
GROUP BY userid
)
SELECT userid, totalScore
FROM cte
WHERE rnk = 1;
If you need just the top 2 records, add LIMIT to your query (note that this returns at most two rows regardless of ties):
SELECT userid, sum(score) as totalScore
FROM Table1 WHERE type = "leader" GROUP BY userid
ORDER BY totalScore DESC LIMIT 2;
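For servers without RANK(), one way to keep every tied top scorer is to compare each user's total with the overall maximum total. The sketch below is a hedged illustration, not either answer's query: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption), stores the userids as strings to match the varchar column, and relies only on GROUP BY, HAVING and a derived table.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (userid VARCHAR(11), score INT, type VARCHAR(22));
INSERT INTO Table1 (userid, score, type) VALUES
    ('11', 2, 'leader'), ('11', 6, 'leader'), ('13', 6, 'leader'),
    ('15', 4, 'leader'), ('15', 4, 'leader'),
    ('12', 1, 'leader'), ('14', 1, 'leader');
""")

# Keep the users whose total equals the highest total of any user.
top_users = conn.execute("""
SELECT userid, SUM(score) AS totalScore
FROM Table1
WHERE type = 'leader'
GROUP BY userid
HAVING SUM(score) = (
    SELECT MAX(total)
    FROM (
        SELECT SUM(score) AS total
        FROM Table1
        WHERE type = 'leader'
        GROUP BY userid
    ) AS s
)
ORDER BY userid
""").fetchall()
print(top_users)
```

With the sample data, users 11 and 15 both total 8, so both are returned; a third user tied at 8 would be returned too, which LIMIT 2 cannot do.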

count without invalid use of group function mysql

I have a table like this,
CREATE TABLE order_match
(`order_buyer_id` int, `createdby` int, `createdAt` datetime, `quantity` decimal(10,2))
;
INSERT INTO order_match
(`order_buyer_id`, `createdby`, `createdAt`, `quantity`)
VALUES
(19123, 19, '2017-02-02', 5),
(193241, 19, '2017-02-03', 5),
(123123, 20, '2017-02-03', 1),
(32242, 20, '2017-02-04', 4),
(32434, 20, '2017-02-04', 5),
(2132131, 12, '2017-02-02', 6)
;
In this table, order_buyer_id is the id of the transaction, createdby is the buyer, createdAt is the time of each transaction, and quantity is the quantity of the transaction.
I want to find the maximum, minimum, median and average number of orders across repeat buyers (buyers with more than one transaction).
For this table, the expected results are:
+-----+-----+---------+--------+
| MAX | MIN | Average | Median |
+-----+-----+---------+--------+
| 3 | 2 | 2.5 | 3 |
+-----+-----+---------+--------+
Note: I'm using MySQL 5.7.
I am using this query:
select -- om.createdby, om.quantity, x1.count_
MAX(count(om.createdby)) AS max,
MIN(count(om.createdby)) AS min,
AVG(count(om.createdby)) AS average
from (select count(xx.count_) as count_
from (select count(createdby) as count_ from order_match
group by createdby
having count(createdby) > 1) xx
) x1,
(select createdby
from order_match
group by createdby
having count(createdby) > 1) yy,
order_match om
where yy.createdby = om.createdby
and om.createdAt <= '2017-02-04'
and EXISTS (select 1 from order_match om2
where om.createdby = om2.createdby
and om2.createdAt >= '2017-02-02'
and om2.createdAt <= '2017-02-04')
but it fails with:
Invalid use of group function
We can try aggregating by createdby, and then taking the aggregates you want:
SELECT
MAX(cnt) AS MAX,
MIN(cnt) AS MIN,
AVG(cnt) AS Average
FROM
(
SELECT createdby, COUNT(*) AS cnt
FROM order_match
GROUP BY createdby
HAVING COUNT(*) > 1
) t
Simulating the median in MySQL 5.7 takes a lot of work, and the result is ugly. If you have a long-term need for the median, consider upgrading to MySQL 8+.
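If upgrading is not an option, one workaround is to let the database produce the per-buyer counts and compute the statistics in application code. This is only a sketch under assumptions: Python's built-in sqlite3 module stands in for MySQL 5.7, and the standard statistics module does the arithmetic. It keeps buyers with more than one transaction, as the question asks.

```python
import sqlite3
import statistics

# In-memory SQLite database used as a stand-in for MySQL 5.7 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_match (
    order_buyer_id INT, createdby INT, createdAt TEXT, quantity REAL
);
INSERT INTO order_match (order_buyer_id, createdby, createdAt, quantity) VALUES
    (19123, 19, '2017-02-02', 5),
    (193241, 19, '2017-02-03', 5),
    (123123, 20, '2017-02-03', 1),
    (32242, 20, '2017-02-04', 4),
    (32434, 20, '2017-02-04', 5),
    (2132131, 12, '2017-02-02', 6);
""")

# Number of transactions per repeat buyer (more than one transaction).
counts = [row[0] for row in conn.execute("""
    SELECT COUNT(*) AS cnt
    FROM order_match
    GROUP BY createdby
    HAVING COUNT(*) > 1
""")]

stats = {
    "max": max(counts),
    "min": min(counts),
    "average": statistics.mean(counts),
    # median() averages the two middle values when the list has even length
    "median": statistics.median(counts),
    # median_high() picks the larger of the two middle values instead
    "median_high": statistics.median_high(counts),
}
print(stats)
```

The expected table in the question shows a median of 3, which matches median_high rather than the averaged median of 2.5; which definition you want is your call.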

MySQL behaviour when using ANY_VALUE multiple times

I want to get a random row for each group when using GROUP BY in MySQL 5.7. From my research, the cleanest way to do it is something like this:
SELECT ANY_VALUE(column_1), ANY_VALUE(column_2), ..., ANY_VALUE(column_n)
FROM table
GROUP BY column
Since there is no syntax for something like ANY_VALUE(*) or ANY_VALUE(column_1, column2, ..., column_n), I am unsure whether, with the above query, each value can come from a different row or whether all ANY_VALUE fields come from the same row.
If you want a random row, use row_number() (this needs MySQL 8.0+):
select t.*
from (select t.*,
row_number() over (partition by column order by rand()) as seqnum
from t
) t
where seqnum = 1;
I am guessing that this is also faster than group by, but you can check if that is the case.
In MySQL 5.7, you can use variables:
select t.*
from (select t.*,
(@rn := if(@c = column, @rn + 1,
if(@c := column, 1, 1)
)
) as rn
from (select t.* from t order by column, rand()) t cross join
(select @c := '', @rn := 0) params
) t
where rn = 1;
Assuming the following schema and sample data:
create table tbl(
id int auto_increment primary key,
grp int not null,
val int not null,
index (grp)
);
insert into tbl (grp, val) values (1, 1);
insert into tbl (grp, val) values (1, 2);
insert into tbl (grp, val) values (1, 3);
insert into tbl (grp, val) values (2, 1);
insert into tbl (grp, val) values (2, 2);
Get the distinct groups in a derived table (or use a base table of groups, if you have one). In the SELECT clause, pick a random primary key for each group with a subquery that uses ORDER BY rand() LIMIT 1. Then join that result, as a derived table, with the base table.
select t.*
from (
select (
select id
from tbl t
where t.grp = g.grp
order by rand()
limit 1
) as id
from (select distinct grp from tbl) g
) r
join tbl t using (id);
Result would be something like
| id | grp | val |
| --- | --- | --- |
| 2 | 1 | 2 |
| 4 | 2 | 1 |
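The same idea can be tried out locally with the sketch below. It is an approximation under assumptions: Python's built-in sqlite3 module stands in for MySQL, SQLite's random() replaces rand(), and the pick and the final lookup are split into two statements so that the random ids are fixed before the rows are fetched. Because every column of a returned row comes from one looked-up id, the values within a row always belong together.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    grp INT NOT NULL,
    val INT NOT NULL
);
CREATE INDEX idx_grp ON tbl(grp);
INSERT INTO tbl (grp, val) VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 2);
""")

# Step 1: one random primary key per distinct group.
picked_ids = [row[0] for row in conn.execute("""
    SELECT (
        SELECT t0.id
        FROM tbl t0
        WHERE t0.grp = g.grp
        ORDER BY random()   -- rand() in MySQL
        LIMIT 1
    ) AS id
    FROM (SELECT DISTINCT grp FROM tbl) g
""")]

# Step 2: fetch the full rows for the picked keys.
placeholders = ", ".join("?" * len(picked_ids))
rows = conn.execute(
    f"SELECT id, grp, val FROM tbl WHERE id IN ({placeholders}) ORDER BY grp",
    picked_ids,
).fetchall()
print(rows)
```

The output changes from run to run, but there is always exactly one row per group.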

How to get most occurrences of rows for every user in mysql

user_id  category    suburb  dated       walk_time
1        experience  US      2016-04-09  5
1        discovery   US      2016-04-09  5
1        experience  UK      2016-04-09  5
1        experience  AUS     2016-04-23  10
2        actions     IND     2016-04-15  2
2        actions     IND     2016-04-15  1
2        discovery   US      2016-04-21  2
3        discovery   FR      2016-04-12  3
3        Emotions    IND     2016-04-23  3
3        discovery   UK      2016-04-12  4
3        experience  IND     2016-04-12  3
I am trying to get every user's most used category, suburb, dated and walk_time,
so the resulting table would be:
user_id  category    suburb  dated       walk_time
1        experience  US      2016-04-09  5
2        actions     IND     2016-04-15  2
3        discovery   IND     2016-04-12  3
The query I am trying here is
select user_id,
substring_index(group_concat(suburb order by cnt desc), ',', 1) as suburb_visited,
substring_index(group_concat(category order by cct desc), ',', 1) as category_used,
substring_index(group_concat(walk_time order by wct desc), ',', 1) as walked,
substring_index(group_concat(dated order by nct desc), ',', 1) as dated_at
from (select user_id, suburb, count(*) as cnt,category, count(*) cct, walk_time, count(*) wct, dated,count(*) nct
from temp_user_notes
group by user_id, suburb,category,walk_time,dated
) upv
group by user_id;
SELECT user_id,
(SELECT category FROM temp_user_notes t1
WHERE t1.user_id = T.user_id
GROUP BY category ORDER BY count(*) DESC LIMIT 1) as category,
(SELECT suburb FROM temp_user_notes t2
WHERE t2.user_id = T.user_id
GROUP BY suburb ORDER BY count(*) DESC LIMIT 1) as suburb,
(SELECT dated FROM temp_user_notes t3
WHERE t3.user_id = T.user_id
GROUP BY dated ORDER BY count(*) DESC LIMIT 1) as dated,
(SELECT walk_time FROM temp_user_notes t4
WHERE t4.user_id = T.user_id
GROUP BY walk_time ORDER BY count(*) DESC LIMIT 1) as walk_time
FROM (SELECT DISTINCT user_id FROM temp_user_notes) T
http://sqlfiddle.com/#!9/8aac6a/19
Try this. It looks a little complicated, but I hope it helps ;)
MySQL schema:
CREATE TABLE table1
(`user_id` int, `category` varchar(10), `suburb` varchar(3), `dated` datetime, `walk_time` int)
;
INSERT INTO table1
(`user_id`, `category`, `suburb`, `dated`, `walk_time`)
VALUES
(1, 'experience', 'US', '2016-04-09 00:00:00', 5),
(1, 'discovery', 'US', '2016-04-09 00:00:00', 5),
(1, 'experience', 'UK', '2016-04-09 00:00:00', 5),
(1, 'experience', 'AUS', '2016-04-23 00:00:00', 10),
(2, 'actions', 'IND', '2016-04-15 00:00:00', 2),
(2, 'actions', 'IND', '2016-04-15 00:00:00', 1),
(2, 'discovery', 'US', '2016-04-21 00:00:00', 2),
(3, 'discovery', 'FR', '2016-04-12 00:00:00', 3),
(3, 'Emotions', 'IND', '2016-04-23 00:00:00', 3),
(3, 'discovery', 'UK', '2016-04-12 00:00:00', 4),
(3, 'experience', 'IND', '2016-04-12 00:00:00', 3)
;
Query SQL:
select c.user_id, c.category, s.suburb, d.dated, w.walk_time
from (
select user_id, substring_index(group_concat(category order by cnt desc), ',', 1) as category
from (
select
user_id, category, count(1) as cnt
from table1
group by user_id, category
) t
group by user_id
) c
inner join (
select user_id, substring_index(group_concat(suburb order by cnt desc), ',', 1) as suburb
from (
select
user_id, suburb, count(1) as cnt
from table1
group by user_id, suburb
) t
group by user_id
) s on c.user_id = s.user_id
inner join (
select user_id, substring_index(group_concat(dated order by cnt desc), ',', 1) as dated
from (
select
user_id, dated, count(1) as cnt
from table1
group by user_id, dated
) t
group by user_id
) d on c.user_id = d.user_id
inner join (
select user_id, substring_index(group_concat(walk_time order by cnt desc), ',', 1) as walk_time
from (
select
user_id, walk_time, count(1) as cnt
from table1
group by user_id, walk_time
) t
group by user_id
) w on c.user_id = w.user_id
Result:
| user_id | category | suburb | dated | walk_time |
+---------+------------+--------+---------------------+-----------+
| 1 | experience | US | 2016-04-09 00:00:00 | 5 |
| 2 | actions | IND | 2016-04-15 00:00:00 | 2 |
| 3 | discovery | IND | 2016-04-12 00:00:00 | 3 |
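One thing neither answer settles is what happens when two values are equally frequent for a user: the pick is then arbitrary. A hedged sketch of the correlated-subquery approach with an explicit tiebreaker (the value itself, ascending) follows. Python's built-in sqlite3 module stands in for MySQL, the most_frequent helper is a hypothetical name introduced only to avoid writing the same subquery four times, and the dates are stored as plain text.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE temp_user_notes (
    user_id INT, category VARCHAR(10), suburb VARCHAR(3), dated TEXT, walk_time INT
);
INSERT INTO temp_user_notes (user_id, category, suburb, dated, walk_time) VALUES
    (1, 'experience', 'US',  '2016-04-09', 5),
    (1, 'discovery',  'US',  '2016-04-09', 5),
    (1, 'experience', 'UK',  '2016-04-09', 5),
    (1, 'experience', 'AUS', '2016-04-23', 10),
    (2, 'actions',    'IND', '2016-04-15', 2),
    (2, 'actions',    'IND', '2016-04-15', 1),
    (2, 'discovery',  'US',  '2016-04-21', 2),
    (3, 'discovery',  'FR',  '2016-04-12', 3),
    (3, 'Emotions',   'IND', '2016-04-23', 3),
    (3, 'discovery',  'UK',  '2016-04-12', 4),
    (3, 'experience', 'IND', '2016-04-12', 3);
""")


def most_frequent(col):
    """Hypothetical helper: scalar subquery for a user's most frequent value of col.

    Ties on the count are broken by the value itself, in ascending order.
    """
    return (
        f"(SELECT {col} FROM temp_user_notes t1 "
        f"WHERE t1.user_id = T.user_id "
        f"GROUP BY {col} ORDER BY COUNT(*) DESC, {col} LIMIT 1) AS {col}"
    )


columns = ["category", "suburb", "dated", "walk_time"]
query = (
    "SELECT T.user_id, "
    + ", ".join(most_frequent(c) for c in columns)
    + " FROM (SELECT DISTINCT user_id FROM temp_user_notes) T"
    + " ORDER BY T.user_id"
)
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The sample data has no ties at the top, so the tiebreaker does not change the result here; it only matters once two values share the highest count.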

MySQL query, MAX() + GROUP BY

Daft SQL question. I have a table like so ('pid' is auto-increment primary col)
CREATE TABLE theTable (
`pid` INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
`timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`cost` INT UNSIGNED NOT NULL,
`rid` INT NOT NULL
) Engine=InnoDB;
Actual table data:
INSERT INTO theTable (`pid`, `timestamp`, `cost`, `rid`)
VALUES
(1, '2011-04-14 01:05:07', 1122, 1),
(2, '2011-04-14 00:05:07', 2233, 1),
(3, '2011-04-14 01:05:41', 4455, 2),
(4, '2011-04-14 01:01:11', 5566, 2),
(5, '2011-04-14 01:06:06', 345, 1),
(6, '2011-04-13 22:06:06', 543, 2),
(7, '2011-04-14 01:14:14', 5435, 3),
(8, '2011-04-14 01:10:13', 6767, 3)
;
I want to get the PID of the latest row for each rid (1 result per unique RID). For the sample data, I'd like:
pid | MAX(timestamp) | rid
-----------------------------------
5 | 2011-04-14 01:06:06 | 1
3 | 2011-04-14 01:05:41 | 2
7 | 2011-04-14 01:14:14 | 3
I've tried running the following query:
SELECT MAX(timestamp),rid,pid FROM theTable GROUP BY rid
and I get:
max(timestamp) ; rid; pid
----------------------------
2011-04-14 01:06:06; 1 ; 1
2011-04-14 01:05:41; 2 ; 3
2011-04-14 01:14:14; 3 ; 7
The PID returned is always the first occurrence of PID for an RID (row / pid 1 is the first time rid 1 is used, row / pid 3 is the first time RID 2 is used, row / pid 7 is the first time rid 3 is used). Although the query returns the max timestamp for each rid, the pids are not the pids for those timestamps in the original table. What query would give me the results I'm looking for?
(Tested in PostgreSQL 9.something)
First, identify the latest timestamp for each rid.
select rid, max(timestamp) as ts
from test
group by rid;
1 2011-04-14 18:46:00
2 2011-04-14 14:59:00
Then join back to it.
select test.pid, test.cost, test.timestamp, test.rid
from test
inner join
(select rid, max(timestamp) as ts
from test
group by rid) maxt
on (test.rid = maxt.rid and test.timestamp = maxt.ts)
select *
from (
select `pid`, `timestamp`, `cost`, `rid`
from theTable
order by `timestamp` desc
) as mynewtable
group by mynewtable.`rid`
order by mynewtable.`timestamp`
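Hope this helps! Note that this relies on MySQL picking the first row of each group from the ordered derived table, which is not guaranteed, and the query is rejected when ONLY_FULL_GROUP_BY is enabled.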
SELECT t.pid, t.cost, t.timestamp, t.rid
FROM test AS t
JOIN (
SELECT rid, MAX(timestamp) AS maxtimestamp
FROM test GROUP BY rid
) AS tmax
ON t.rid = tmax.rid AND t.timestamp = tmax.maxtimestamp
I created an index on rid and timestamp.
SELECT test.pid, test.cost, test.timestamp, test.rid
FROM theTable AS test
LEFT JOIN theTable maxt
ON maxt.rid = test.rid
AND maxt.timestamp > test.timestamp
WHERE maxt.rid IS NULL
Showing rows 0 - 2 (3 total, Query took 0.0104 sec)
This method selects the desired rows from theTable (aliased test) and left joins the table to itself (aliased maxt) on rows with the same rid and a higher timestamp. When test already holds the highest timestamp for its rid, nothing matches on maxt, so every maxt column is NULL; those are the rows we are looking for. The WHERE clause therefore filters on maxt.rid IS NULL (any other column of maxt would work too).
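The self-join can be checked against the question's sample data with the sketch below. Python's built-in sqlite3 module stands in for MySQL (an assumption made to keep the snippet self-contained), the timestamps are stored as ISO-formatted text so that string comparison orders them correctly, and an ORDER BY is added only to make the output stable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE theTable (
    pid INTEGER PRIMARY KEY,
    timestamp TEXT,
    cost INT NOT NULL,
    rid INT NOT NULL
);
INSERT INTO theTable (pid, timestamp, cost, rid) VALUES
    (1, '2011-04-14 01:05:07', 1122, 1),
    (2, '2011-04-14 00:05:07', 2233, 1),
    (3, '2011-04-14 01:05:41', 4455, 2),
    (4, '2011-04-14 01:01:11', 5566, 2),
    (5, '2011-04-14 01:06:06', 345, 1),
    (6, '2011-04-13 22:06:06', 543, 2),
    (7, '2011-04-14 01:14:14', 5435, 3),
    (8, '2011-04-14 01:10:13', 6767, 3);
""")

# Anti-join: keep a row only if no other row with the same rid is newer.
rows = conn.execute("""
SELECT test.pid, test.timestamp, test.rid
FROM theTable AS test
LEFT JOIN theTable AS maxt
       ON maxt.rid = test.rid
      AND maxt.timestamp > test.timestamp
WHERE maxt.rid IS NULL
ORDER BY test.rid
""").fetchall()
for row in rows:
    print(row)
```

This returns pids 5, 3 and 7, as the question asks. If two rows of the same rid shared the latest timestamp, both would be returned, since neither has a strictly newer partner.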
You could also use subqueries like this:
SELECT ( SELECT MIN(t2.pid)
FROM test t2
WHERE t2.rid = t.rid
AND t2.timestamp = maxtimestamp
) AS pid
, MAX(t.timestamp) AS maxtimestamp
, t.rid
FROM test t
GROUP BY t.rid
But this way, you'll need one more subquery for each extra column (such as cost) that you want to show.
So the GROUP BY with a JOIN is the better solution.
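If you want to avoid a JOIN and pid always increases with timestamp (which is not the case for the sample data above, where pid 6 is older than pid 3), you can use: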
SELECT pid, rid FROM theTable t1 WHERE t1.pid IN ( SELECT MAX(t2.pid) FROM theTable t2 GROUP BY t2.rid);
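Whether the MAX(pid) shortcut is safe depends on pid growing together with timestamp. A quick way to check is to compare it with the join on MAX(timestamp). The sketch below does that for the question's sample data, with Python's built-in sqlite3 module standing in for MySQL (an assumption) and the timestamps stored as ISO-formatted text.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE theTable (
    pid INTEGER PRIMARY KEY,
    timestamp TEXT,
    cost INT NOT NULL,
    rid INT NOT NULL
);
INSERT INTO theTable (pid, timestamp, cost, rid) VALUES
    (1, '2011-04-14 01:05:07', 1122, 1),
    (2, '2011-04-14 00:05:07', 2233, 1),
    (3, '2011-04-14 01:05:41', 4455, 2),
    (4, '2011-04-14 01:01:11', 5566, 2),
    (5, '2011-04-14 01:06:06', 345, 1),
    (6, '2011-04-13 22:06:06', 543, 2),
    (7, '2011-04-14 01:14:14', 5435, 3),
    (8, '2011-04-14 01:10:13', 6767, 3);
""")

# Shortcut: the highest pid per rid.
by_pid = dict(conn.execute("""
    SELECT rid, pid
    FROM theTable t1
    WHERE t1.pid IN (SELECT MAX(t2.pid) FROM theTable t2 GROUP BY t2.rid)
"""))

# Reference: the row carrying the latest timestamp per rid.
by_timestamp = dict(conn.execute("""
    SELECT t.rid, t.pid
    FROM theTable t
    JOIN (
        SELECT rid, MAX(timestamp) AS ts
        FROM theTable
        GROUP BY rid
    ) maxt ON maxt.rid = t.rid AND maxt.ts = t.timestamp
"""))

print("highest pid per rid:      ", by_pid)
print("latest timestamp per rid: ", by_timestamp)
```

For this data the two disagree on rid 2 and rid 3, so the shortcut only fits tables where rows are always inserted in timestamp order.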
Try:
select pid,cost, timestamp, rid from theTable order by timestamp DESC limit 2;