MySQL wrong result using MIN() function - mysql

Question:
In this example, we have the grades of 5 students from school 1. We want to know which student had the lowest grade.
We were expecting to get student number 4, but the query returns student 1.
Can someone help me?
Thanks in advance
Table 1:
CREATE TABLE `table1` (
`school_id` int(11) unsigned NOT NULL,
`student_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`grade` int(11) unsigned NOT NULL,
PRIMARY KEY (`student_id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8;
Data:
INSERT INTO `table1` (`school_id`, `student_id`, `grade`)
VALUES
(1, 1, 20),
(1, 2, 15),
(1, 3, 18),
(1, 4, 12),
(1, 5, 15);
SQL Query:
SELECT t1.`school_id`, t1.`student_id`, MIN(t1.grade)
FROM table1 as t1
WHERE t1.`school_id`=1
GROUP BY t1.`school_id`;

SELECT * FROM table1 ORDER BY grade LIMIT 1
If you want the worst performing student in each school, then that's...
SELECT x.*
FROM table1 x
JOIN
( SELECT school_id
, MIN(grade) grade
FROM table1
GROUP
BY school_id
) y
ON y.school_id = x.school_id
AND y.grade = x.grade;
http://sqlfiddle.com/#!9/f44cb2/1
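For what it's worth, the original query returns student 1 because student_id is neither aggregated nor listed in GROUP BY, so MySQL is free to pick its value from any row of the group. On MySQL 8.0 or later (an assumption; window functions are not available in 5.x), a sketch of the same "lowest grade per school" lookup with RANK() could look like this:

```sql
-- Assumes MySQL 8.0+ and table1 exactly as defined in the question.
-- grade_rank and ranked are illustrative names, not part of the original schema.
SELECT school_id, student_id, grade
FROM (
    SELECT school_id,
           student_id,
           grade,
           -- rank 1 = lowest grade within each school; ties share the same rank
           RANK() OVER (PARTITION BY school_id ORDER BY grade ASC) AS grade_rank
    FROM table1
) AS ranked
WHERE grade_rank = 1;
```

Like the JOIN version, this keeps ties. For the sample data it returns student 4 with grade 12.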

With @tadman's tip, we came up with a solution.
You can find it below in case you come across the same issue.
We don't understand why the LIMIT is needed: if we take out the limit line, we get a wrong result.
SELECT t2.`school_id`, t2.`student_id`, t2.grade
FROM
(
SELECT t1.`school_id`, t1.`student_id`, t1.grade
FROM table1 as t1
WHERE t1.`school_id`=1
ORDER BY t1.`grade` ASC
limit 4294967295
)
as t2
GROUP BY t2.`school_id`;
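As far as I know, an ORDER BY inside a derived table does not guarantee which row an outer GROUP BY picks, and the optimizer may drop the inner ORDER BY altogether when there is no LIMIT, which would explain the behaviour described above. A sketch that avoids relying on this for a single school (the student_id tie-breaker is my own addition):

```sql
-- Lowest grade in school 1, without GROUP BY.
-- The secondary sort on student_id only makes the result deterministic when grades tie.
SELECT school_id, student_id, grade
FROM table1
WHERE school_id = 1
ORDER BY grade ASC, student_id ASC
LIMIT 1;
```

For the sample data this returns student 4 with grade 12.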

Unless there are some more requirements for your problem, I guess you would be good with just:
select t1.school_id, t1.student_id, t1.grade
from table1 as t1,
(select school_id, min(grade) as grade from table1 group by school_id) as t2
where t1.school_id=t2.school_id
and t1.grade=t2.grade;

Related

Problem MYSQL Query join with output want

I have a problem with a MySQL query. I have tried many times, but I still don't get the result I want. Maybe someone can help.
This is the table structure and the output I want:
This is what I tried, but when @IDPERIODS=2 it does not show what I want:
SET @IDPERIODS:=2;
SELECT billing.*
FROM _t_data_user
LEFT JOIN (
SELECT user_id as iduser,
IF(a.id_bill_type=b.id_bill_type,a.id_setting_bill,ifnull(b.id_setting_bill,a.id_setting_bill)) as idsettingbill,
id_user_group as group_user,
IF(a.id_bill_type=b.id_bill_type,a.id_bill_type, ifnull(b.id_bill_type,a.id_bill_type)) as idbilltype,
IF(a.id_bill_type=b.id_bill_type,a.id_period, ifnull(b.id_period,a.id_period)) as period,
IF(a.id_bill_type=b.id_bill_type,a.amount_bill, ifnull(b.amount_bill,a.amount_bill)) as amount_billing
FROM _t_data_user
LEFT JOIN _t_setting_bill_user b ON b.id_group_user=id_user_group and b.id_period=@IDPERIODS
LEFT JOIN _t_setting_bill_user a ON a.id_user=user_id and a.id_period=@IDPERIODS
WHERE IFNULL(a.id_period, b.id_period) = @IDPERIODS
) billing ON iduser = user_id
WHERE period = @IDPERIODS
GROUP BY user_id, idbilltype
Table structure and sample data:
CREATE TABLE `_t_data_user` (
`user_id` int(4) unsigned NOT NULL AUTO_INCREMENT,
`id_user_group` int(4) DEFAULT NULL,
PRIMARY KEY (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8;
INSERT INTO `_t_data_user` (`user_id`, `id_user_group`)
VALUES
(1, 1),
(2, 1),
(3, 1),
(4, 2);
CREATE TABLE `_t_setting_bill_user` (
`id_setting_bill` int(11) unsigned NOT NULL AUTO_INCREMENT,
`id_group_user` int(4) DEFAULT NULL,
`id_user` int(4) DEFAULT NULL,
`id_period` int(4) DEFAULT NULL,
`id_bill_type` int(4) DEFAULT NULL,
`amount_bill` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id_setting_bill`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=utf8;
INSERT INTO `_t_setting_bill_user`
(`id_setting_bill`, `id_group_user`, `id_user`,
`id_period`, `id_bill_type`, `amount_bill`)
VALUES
(1, 1, 0, 1, 1, 1000),
(2, 1, 0, 1, 2, 500),
(3, 0, 1, 1, 1, 900),
(4, 0, 1, 2, 1, 1000),
(5, 1, 0, 2, 2, 500),
(6, 2, 0, 1, 1, 1100);
This gives you the raw data you want:
SELECT *
FROM
_t_setting_bill_user s
JOIN _t_data_user d
ON
s.id_group_user = d.id_user_group OR
s.id_user = d.user_id
You just have to choose and alias the columns appropriately.
I can't work out why your desired output is missing the 1000 row for user id 1/setting bill 1, but I'm sure you can add a WHERE clause to cover that, whatever the reason may be.
It seems that removing a few parts from the subquery will return the result that you're looking for:
SELECT billing.*
FROM _t_data_user
JOIN (
SELECT user_id AS iduser,
IF(a.id_bill_type=b.id_bill_type,a.id_setting_bill,IFNULL(b.id_setting_bill,a.id_setting_bill)) AS idsettingbill,
id_user_group AS group_user,
IF(a.id_bill_type=b.id_bill_type,a.id_bill_type, IFNULL(b.id_bill_type,a.id_bill_type)) AS idbilltype,
IF(a.id_bill_type=b.id_bill_type,a.id_period, IFNULL(b.id_period,a.id_period)) AS period,
IF(a.id_bill_type=b.id_bill_type,a.amount_bill, IFNULL(b.amount_bill,a.amount_bill)) AS amount_billing
FROM _t_data_user t
LEFT JOIN _t_setting_bill_user b ON b.id_group_user=t.id_user_group
LEFT JOIN _t_setting_bill_user a ON a.id_user=t.user_id
WHERE IFNULL(a.id_period, b.id_period) = @IDPERIODS
) billing ON iduser = user_id
GROUP BY user_id, idbilltype;
I've removed AND id_period=@IDPERIODS from the ON condition of both LEFT JOINs in the subquery.
I've changed LEFT JOIN to JOIN in the outer query because you were doing a LEFT JOIN with a WHERE condition on data from the right-hand reference (the subquery), which in practice is just a normal JOIN, so the LEFT JOIN is unnecessary. Therefore, I also removed WHERE period = @IDPERIODS from the outer query.
And that's it. Other than that, most of your original query structure is still intact.
Hello desugha,
I think you should do this programmatically: rearrange the data and build an object that combines data_user into setting_bill_user.
SELECT compacted_usr_bill.* FROM (
SELECT billusr.*,
usr.user_id AS usr_id_usr, usr.id_user_group AS usr_id_group
FROM (
SELECT bill.id_setting_bill,
CASE
WHEN bill.id_user > 0 AND bill.id_group_user = 0 THEN bill.id_user
WHEN bill.id_user = 0 AND bill.id_group_user > 0 THEN bill.id_group_user
ELSE bill.id_user
END AS grouped_id_usr,
bill.id_period, bill.id_bill_type, bill.amount_bill
FROM _t_setting_bill_user AS bill)
AS billusr
LEFT JOIN _t_data_user AS usr
ON billusr.grouped_id_usr IN(usr.user_id, usr.id_user_group)
) AS compacted_usr_bill
This query will combine them; you can then filter the result further with grouping or in application code.

MySQL: ORDER BY and GROUP BY together

I recently upgraded to MySQL 5.7.22 and my query stopped working. I have two tables "items" and "packages" where I'm trying to output a row for each item including a column for the package with the minimum price per unit, but ignore packages that have a price per unit set to 0.
Here's a minimal sample of tables and data:
CREATE TABLE `items` (
`id` int(11) NOT NULL
);
CREATE TABLE `packages` (
`item_id` int(11) NOT NULL,
`price_per_unit` float(16,6) DEFAULT 0
);
INSERT INTO `items` (`id`) VALUES
(1),
(2),
(3);
INSERT INTO `packages` (`item_id`, `price_per_unit`) VALUES
(1, 0.45),
(1, 0),
(1, 0.56),
(1, 0.34);
Here's the query:
SELECT
*
FROM
(
SELECT
items.id,
NULLIF(pkgs.ppu, 0) AS mppu
FROM
items
LEFT JOIN
(
SELECT
item_id,
price_per_unit AS ppu
FROM
packages
) AS pkgs ON pkgs.item_id = items.id
ORDER BY
IFNULL(mppu, 9999)
) X
GROUP BY
X.id
I was setting the zero values to null and then bumping their values to be much higher during the ordering. There must be a better way (especially since this method doesn't work any longer).
The expected output for this data is:
id mppu
1 0.34
2 null
3 null
I think your query is a bit too complex. What about this?
SELECT i.id,IFNULL(Min(p.price_per_unit), 'None')
FROM items i
LEFT JOIN packages p
ON ( i.id = p.item_id )
WHERE p.price_per_unit > 0
OR p.price_per_unit IS NULL
GROUP BY i.id
See this fiddle. I used this data:
INSERT INTO `items` (`id`) VALUES
(1),(2),(3);
INSERT INTO `packages` (`item_id`, `price_per_unit`) VALUES
(1, 0.45),
(1, 0),
(1, 0.56),
(1, 0.34),
(2, 9.45),
(2, 0),
(2, 0.56),
(2, 0.14);
And got this result:
id IFNULL(min(p.price_per_unit),'None')
1 0.340000
2 0.140000
3 None
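Note that the WHERE clause above drops any item whose packages are all priced at 0, because its joined rows are neither > 0 nor NULL. A possible variant, sketched here under the assumption that such items should still appear with NULL, moves the price filter into the ON clause:

```sql
-- Filter zero-priced packages while joining, not afterwards,
-- so items without a usable package still produce one row with NULL.
-- The alias mppu is borrowed from the question.
SELECT i.id,
       MIN(p.price_per_unit) AS mppu
FROM items i
LEFT JOIN packages p
       ON p.item_id = i.id
      AND p.price_per_unit > 0
GROUP BY i.id
ORDER BY i.id;
```

With the question's original sample data this should give 0.34 for item 1 and NULL for items 2 and 3.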
I agree with GL: SELECT * FROM ... GROUP BY is not predictable.
I would rewrite the query as:
SELECT a.*,b.min_price_per_unit
FROM items a
LEFT JOIN (
SELECT item_id
,min(CASE
WHEN price_per_unit = 0
THEN 9999
ELSE price_per_unit
END) AS min_price_per_unit
FROM packages
GROUP BY item_id
) b ON a.id = b.item_id;
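One caveat with the 9999 placeholder: an item whose only packages are priced at 0 would report 9999 as its minimum. A minimal sketch of an alternative, relying on the fact that MIN() ignores NULL values, turns zero prices into NULL with NULLIF instead:

```sql
-- NULLIF(price_per_unit, 0) yields NULL for zero prices, and MIN() skips NULLs,
-- so an item with only zero-priced (or no) packages ends up with NULL.
SELECT a.id,
       b.min_price_per_unit
FROM items a
LEFT JOIN (
    SELECT item_id,
           MIN(NULLIF(price_per_unit, 0)) AS min_price_per_unit
    FROM packages
    GROUP BY item_id
) b ON a.id = b.item_id;
```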

How to optimize a query with multiple GROUP BYs, subqueries and WHERE IN over a large table?

I am working on a scraping project to crawl items and their scores over different schedules. A schedule is a user-defined period (date) when the script is intended to run.
Table structure is as follows:
--
-- Table structure for table `test_join`
--
CREATE TABLE IF NOT EXISTS `test_join` (
`schedule_id` int(11) NOT NULL,
`player_name` varchar(50) NOT NULL,
`type` enum('celebrity','sportsperson') NOT NULL,
`score` int(11) NOT NULL,
PRIMARY KEY (`schedule_id`,`player_name`,`type`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
--
-- Dumping data for table `test_join`
--
INSERT INTO `test_join` (`schedule_id`, `player_name`, `type`, `score`) VALUES
(1, 'sachin', 'sportsperson', 100),
(1, 'ganguly', 'sportsperson', 80),
(1, 'dravid', 'sportsperson', 60),
(1, 'sachin', 'celebrity', 100),
(2, 'sachin', 'sportsperson', 120),
(2, 'ganguly', 'sportsperson', 100),
(2, 'sachin', 'celebrity', 120);
The scraping is done over periods, and each schedule is expected to have about 10k+ entries. Schedules could be created on a daily basis, so the data would grow to around 2 million rows in 5-6 months.
Over this data I need to run queries that aggregate the players who appear in every schedule of a selected range of schedules.
For example:
I need to aggregate the players who appear in multiple schedules. If schedules 1 and 2 are selected, only items that fall under both schedules will be selected.
I am using the following query to aggregate results based on the type.
For schedule 1:
SELECT fullt.type,COUNT(*) as count,SUM(fullt.score) FROM
(SELECT tj.*
FROM `test_join` tj
RIGHT JOIN
(SELECT `player_name`,`type`,COUNT(`schedule_id`) as c FROM `test_join` WHERE `schedule_id` IN (1,2) GROUP BY `player_name`,`type` HAVING c=2) stj
on tj.player_name = stj.player_name
WHERE tj.`schedule_id`=1
GROUP BY tj.`type`,tj.`player_name`)AS fullt
GROUP BY fullt.type
Reason for c = 2:
WHERE `schedule_id` IN (1,2) GROUP BY `player_name`,`type` HAVING c=2
Here we are selecting two schedules, 1 and 2. Hence the count 2 is used so that the query fetches only records which belong to both schedules and therefore occur twice.
It generates the following results:
Schedule 1: Expected Results
Schedule 2: Expected Results
This is my expected result, and the query returns the results shown above.
(In the real case I have to work across pretty big MySQL tables.)
As I understand it, subqueries, WHERE IN, comparisons on varchar fields, and multiple GROUP BYs all hurt query performance.
I need the aggregate results in real time, so query speed as well as standards compliance are a concern. How could this be optimized for better performance in this context?
EDIT:
I have reduced the subqueries now:
SELECT fullt.type,COUNT(*) as count,SUM(fullt.score) FROM (
SELECT t.*
FROM `test_join` t
INNER JOIN test_join t1 ON t.`player_name` = t1.player_name AND t1.schedule_id = 1
INNER JOIN test_join t2 ON t.player_name = t2.player_name AND t2.schedule_id = 2
WHERE t.schedule_id = 2
GROUP BY t.`player_name`,t.`type`) AS fullt
GROUP BY fullt.type
Is this a better way to do it? I have replaced WHERE IN with JOINs.
Any advice would be highly appreciated. I would be happy to provide any supporting information if needed.
Try the SQL query below in MySQL:
SELECT tj.`type`,COUNT(*) as count,SUM(tj.`score`) FROM
`test_join` tj
where tj.`schedule_id`=1
and `player_name` in
(
select tj1.`player_name` from `test_join` tj1
group by tj1.`player_name` having count(tj1.`player_name`) > 1
)
group by tj.`type`
Actually, I tried the same data in Sybase as I don't have MySQL installed on my machine. It worked as expected!
CREATE TABLE #test_join
(
schedule_id int NOT NULL,
player_name varchar(50) NOT NULL,
type1 varchar(15) NOT NULL,
score int NOT NULL,
)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES
(1, 'sachin', 'sportsperson', 100)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(1, 'ganguly', 'sportsperson', 80)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(1, 'dravid', 'sportsperson', 60)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(1, 'sachin', 'celebrity', 100)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(2, 'sachin', 'sportsperson', 120)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(2, 'ganguly', 'sportsperson', 100)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(2, 'sachin', 'celebrity', 120)
select * from #test_join
Print 'Solution #1 : Inner join'
select type1,count(*),sum(score) from
#test_join
where schedule_id=1 and player_name in (select player_name from #test_join t1 group by player_name having count(player_name) > 1 )
group by type1
select player_name,type1,sum(score) Score into #test_join_temp
from #test_join
group by player_name,type1
having count(player_name) > 1
Print 'Solution #2 using Temp Table'
--select * from #test_join_temp
select type1,count(*),sum(score) from
#test_join
where schedule_id=1 and player_name in (select player_name from #test_join_temp )
group by type1
I hope This Helps :)
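For reference, a MySQL sketch that keeps the IN (1,2) / HAVING idea from the question but joins on both player_name and type (that extra join condition is my assumption about the intent) and drops the intermediate GROUP BY over the joined rows:

```sql
-- both_schedules lists every (player_name, type) pair present in schedules 1 AND 2.
-- Because (schedule_id, player_name, type) is the primary key, a pair can occur
-- at most once per schedule, so COUNT(*) = 2 means "present in both".
SELECT tj.type,
       COUNT(*)      AS player_count,
       SUM(tj.score) AS total_score
FROM test_join tj
JOIN (
    SELECT player_name, type
    FROM test_join
    WHERE schedule_id IN (1, 2)
    GROUP BY player_name, type
    HAVING COUNT(*) = 2
) AS both_schedules
  ON both_schedules.player_name = tj.player_name
 AND both_schedules.type = tj.type
WHERE tj.schedule_id = 1
GROUP BY tj.type;
```

For the sample data this yields celebrity with 1 player and a score of 100, and sportsperson with 2 players and a score of 180. Whether it is actually faster on the real table is something only EXPLAIN and a measurement on production-sized data can tell.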

MySQL multiple COUNTs

I have a table like this:
Fiddle: http://sqlfiddle.com/#!2/44d9e/14
CREATE TABLE IF NOT EXISTS `mytable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(20) NOT NULL,
`money_earned` int(20) NOT NULL,
PRIMARY KEY (`id`)
) ;
INSERT INTO mytable (user_id,money_earned) VALUES ("111","10");
INSERT INTO mytable (user_id,money_earned) VALUES ("111","6");
INSERT INTO mytable (user_id,money_earned) VALUES ("111","40");
INSERT INTO mytable (user_id,money_earned) VALUES ("222","45");
INSERT INTO mytable (user_id,money_earned) VALUES ("222","1");
INSERT INTO mytable (user_id,money_earned) VALUES ("333","5");
INSERT INTO mytable (user_id,money_earned) VALUES ("333","19");
I need to know how many rows the table has, how many different users there are, and how many times each user has earned money.
I need this result:
TOTAL_ROWS: 7
TOTAL_INDIVIDUAL_USERS: 3
USER_ID USER_TIMES
111 3
222 2
333 2
Is your problem that you want the total as well? If so, then you can get this using rollup:
SELECT coalesce(cast(user_id as char(20)), 'TOTAL USER_TIMES'),
COUNT(*) as times
FROM mytable
GROUP BY user_id with rollup;
You can get the user counts in a separate column with this trick:
SELECT coalesce(cast(user_id as char(20)), 'TOTAL USER_TIMES'),
COUNT(*) as times, count(distinct user_id) as UserCount
FROM mytable
GROUP BY user_id with rollup;
You realize that a SQL query just returns a table of values. You are asking for very specific formatting, which is typically done better at the application level. That said, you can get close to what you want with something like this:
select user, times
from ((SELECT 3 as ord, cast(user_id as char(20)) as user, COUNT(*) as times
FROM mytable
GROUP BY user_id
)
union all
(select 1, 'Total User Count', count(*)
from mytable
)
union all
(select 2, 'Total Users', count(distinct user_id)
from mytable
)
) t
order by ord;
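On MySQL 8.0+ (an assumption, since window functions are required), the totals can also be attached to every per-user row in one pass. A sketch, with column aliases of my own choosing:

```sql
-- The window functions run over the grouped result:
--   SUM(COUNT(*)) OVER () adds up the per-user counts  -> total number of rows
--   COUNT(*) OVER ()      counts the groups themselves -> number of distinct users
SELECT user_id,
       COUNT(*)               AS user_times,
       SUM(COUNT(*)) OVER ()  AS total_rows,
       COUNT(*) OVER ()       AS total_individual_users
FROM mytable
GROUP BY user_id
ORDER BY user_id;
```

For the sample data each of the three rows carries total_rows = 7 and total_individual_users = 3, next to the per-user counts 3, 2 and 2.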
I think this could be a typo. Anyway, you are trying to sum your COUNT() times; simply replace it with money_earned:
SELECT user_id,
COUNT(*) AS 'times',
SUM(money_earned) AS 'sum_money'
FROM mytable GROUP BY user_id;

MySQL query, MAX() + GROUP BY

Daft SQL question. I have a table like so ('pid' is auto-increment primary col)
CREATE TABLE theTable (
`pid` INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
`timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`cost` INT UNSIGNED NOT NULL,
`rid` INT NOT NULL
) Engine=InnoDB;
Actual table data:
INSERT INTO theTable (`pid`, `timestamp`, `cost`, `rid`)
VALUES
(1, '2011-04-14 01:05:07', 1122, 1),
(2, '2011-04-14 00:05:07', 2233, 1),
(3, '2011-04-14 01:05:41', 4455, 2),
(4, '2011-04-14 01:01:11', 5566, 2),
(5, '2011-04-14 01:06:06', 345, 1),
(6, '2011-04-13 22:06:06', 543, 2),
(7, '2011-04-14 01:14:14', 5435, 3),
(8, '2011-04-14 01:10:13', 6767, 3)
;
I want to get the PID of the latest row for each rid (1 result per unique RID). For the sample data, I'd like:
pid | MAX(timestamp) | rid
-----------------------------------
5 | 2011-04-14 01:06:06 | 1
3 | 2011-04-14 01:05:41 | 2
7 | 2011-04-14 01:14:14 | 3
I've tried running the following query:
SELECT MAX(timestamp),rid,pid FROM theTable GROUP BY rid
and I get:
max(timestamp) ; rid; pid
----------------------------
2011-04-14 01:06:06; 1 ; 1
2011-04-14 01:05:41; 2 ; 3
2011-04-14 01:14:14; 3 ; 7
The PID returned is always the first occurrence of PID for an RID (row / pid 1 is the first time rid 1 is used, row / pid 3 the first time RID 2 is used, row / pid 7 is the first time rid 3 is used). Although the query returns the max timestamp for each rid, the pids are not the pids belonging to those timestamps in the original table. What query would give me the results I'm looking for?
(Tested in PostgreSQL 9.something)
Identify the rid and timestamp.
select rid, max(timestamp) as ts
from test
group by rid;
1 2011-04-14 18:46:00
2 2011-04-14 14:59:00
Join to it.
select test.pid, test.cost, test.timestamp, test.rid
from test
inner join
(select rid, max(timestamp) as ts
from test
group by rid) maxt
on (test.rid = maxt.rid and test.timestamp = maxt.ts)
select *
from (
select `pid`, `timestamp`, `cost`, `rid`
from theTable
order by `timestamp` desc
) as mynewtable
group by mynewtable.`rid`
order by mynewtable.`timestamp`
Hope I helped !
SELECT t.pid, t.cost, t.timestamp, t.rid
FROM test as t
JOIN (
SELECT rid, max(timestamp) AS maxtimestamp
FROM test GROUP BY rid
) AS tmax
ON t.rid = tmax.rid and t.timestamp = tmax.maxtimestamp
I created an index on rid and timestamp.
SELECT test.pid, test.cost, test.timestamp, test.rid
FROM theTable AS test
LEFT JOIN theTable maxt
ON maxt.rid = test.rid
AND maxt.timestamp > test.timestamp
WHERE maxt.rid IS NULL
Showing rows 0 - 2 (3 total, Query took 0.0104 sec)
This method selects all the desired values from theTable (test), left joining it to itself (maxt) on all rows with the same rid and a higher timestamp. When a row on test already has the highest timestamp for its rid, there are no matches on maxt, so the values on maxt become NULL, which is exactly the case we are looking for. The WHERE clause then keeps only those rows by testing maxt.rid IS NULL (any other non-nullable column on maxt would work too).
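The index mentioned in this answer could be created roughly as follows; the index name idx_rid_timestamp is made up for illustration:

```sql
-- Hypothetical index name; the column order (rid, timestamp) follows the self-join condition.
CREATE INDEX idx_rid_timestamp ON theTable (rid, `timestamp`);

-- EXPLAIN shows whether the optimizer actually uses the index for the anti-join.
EXPLAIN
SELECT test.pid, test.cost, test.`timestamp`, test.rid
FROM theTable AS test
LEFT JOIN theTable AS maxt
       ON maxt.rid = test.rid
      AND maxt.`timestamp` > test.`timestamp`
WHERE maxt.rid IS NULL;
```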
You could also use a subquery like this:
SELECT ( SELECT MIN(t2.pid)
FROM test t2
WHERE t2.rid = t.rid
AND t2.timestamp = maxtimestamp
) AS pid
, MAX(t.timestamp) AS maxtimestamp
, t.rid
FROM test t
GROUP BY t.rid
But this way, you'll need one more subquery for each extra column, such as cost, that you want to show.
So the GROUP BY and JOIN approach is the better solution.
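If MySQL 8.0 or later is available (an assumption; the question predates it), the "latest row per rid" lookup can also be sketched with ROW_NUMBER(), which returns every column of the winning row without an extra subquery per column. The pid tie-breaker and the names rn and ranked are my own additions:

```sql
-- rn = 1 marks the newest row within each rid; pid breaks ties between equal timestamps.
SELECT pid, `timestamp`, cost, rid
FROM (
    SELECT pid,
           `timestamp`,
           cost,
           rid,
           ROW_NUMBER() OVER (PARTITION BY rid
                              ORDER BY `timestamp` DESC, pid DESC) AS rn
    FROM theTable
) AS ranked
WHERE rn = 1
ORDER BY rid;
```

For the sample data this returns pid 5 for rid 1, pid 3 for rid 2 and pid 7 for rid 3, which matches the desired output in the question.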
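If you want to avoid a JOIN, you can use the query below. Note that it assumes the highest pid within each rid is also the latest row, which does not hold for the sample data (for rid 2 the latest row is pid 3, but the highest pid is 6):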
SELECT pid, rid FROM theTable t1 WHERE t1.pid IN ( SELECT MAX(t2.pid) FROM theTable t2 GROUP BY t2.rid);
Try:
select pid,cost, timestamp, rid from theTable order by timestamp DESC limit 2;