Wrong data output in SQL request - mysql

I have a table named payments
CREATE TABLE payments (
`id` INT AUTO_INCREMENT PRIMARY KEY NOT NULL,
`student_id` INT NOT NULL,
`datetime` DATETIME NOT NULL,
`amount` FLOAT DEFAULT 0,
INDEX `student_id` (`student_id`)
);
I need a query that finds every student_id whose total payment is less than the largest total. (More than one student can share the largest total.)
Assume the following test data:
-- Dumping data for table payments
id | student_id | datetime            | amount
1  | 4          | 2015-06-11 00:00:00 | 2
2  | 5          | 2015-06-01 00:00:00 | 6
3  | 1          | 2015-06-03 00:00:00 | 8
4  | 2          | 2015-06-02 00:00:00 | 9
5  | 4          | 2015-06-09 00:00:00 | 6
6  | 5          | 2015-06-06 00:00:00 | 3
7  | 2          | 2015-06-05 00:00:00 | 6
8  | 3          | 2015-06-09 00:00:00 | 12
14 | 1          | 2015-06-01 00:00:00 | 0
15 | 1          | 2015-06-03 00:00:00 | 7
16 | 6          | 2015-06-02 00:00:00 | 0
17 | 6          | 2015-06-07 00:00:00 | 0
18 | 6          | 2015-06-05 00:00:00 | 0
The following query shows every student with their total payments:
SELECT `student_id`, SUM(amount) as `sumamount`
FROM `payments`
GROUP BY `student_id`
ORDER BY `sumamount` DESC
Here is the output of this query, ordered by sumamount:
student_id sumamount
1 15
2 15
3 12
5 9
4 8
6 0
The problem is that when I try to get the students who paid less than the largest total, I get the wrong answer.
Here is the query I use for that:
SELECT `student_id`, SUM(amount) as `sumamount`
FROM `payments`
GROUP BY `student_id`
HAVING `sumamount` < MAX(sumamount)
ORDER BY `sumamount` DESC
Here is the result
student_id sumamount
3 12
4 8
6 0
As you can see, student_id = 5 is missing, and I have no idea why.

You need to calculate MAX(sumamount) in a subquery, so that MAX is computed over all students instead of within each student_id group.
SELECT `student_id`, SUM(amount) as `sumamount`, maxsum
FROM `payments`
CROSS JOIN (SELECT MAX(sumamount) AS maxsum
FROM (SELECT SUM(amount) AS sumamount
FROM payments
GROUP BY student_id) t1) t2
GROUP BY `student_id`
HAVING `sumamount` < maxsum
ORDER BY `sumamount` DESC
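If the server runs with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7), selecting the non-aggregated maxsum column next to GROUP BY may be rejected. A minimal sketch of the same idea, assuming that setting, moves the maximum into a scalar subquery inside HAVING. The alias total and the derived-table name per_student are introduced here only for illustration:

```sql
-- Sketch: compare each student's total with the overall maximum total.
-- The inner derived table computes one total per student; MAX() then
-- runs over those totals, independently of the outer GROUP BY.
SELECT `student_id`, SUM(amount) AS `sumamount`
FROM `payments`
GROUP BY `student_id`
HAVING `sumamount` < (
    SELECT MAX(total)
    FROM (SELECT SUM(amount) AS total
          FROM payments
          GROUP BY student_id) AS per_student
)
ORDER BY `sumamount` DESC;
```

With the test data from the question, the totals are 15, 15, 12, 9, 8 and 0, so this should keep students 3, 5, 4 and 6 and drop the two students tied at 15.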

Related

How to select last created record in a group by clause in mysql?

I am using mysql 8.0.23
I have three tables, chats, chat_users and chat_messages
I want to select the chat_id, the last message (the one with the latest created_at within each chat), and that message's from_user_id, for every chat in which the user with id 1 is a member.
The table DDL is below:
create table chats
(
id int unsigned auto_increment primary key,
created_at timestamp default CURRENT_TIMESTAMP not null
);
create table if not exists chat_users
(
id int unsigned auto_increment
primary key,
chat_id int unsigned not null,
user_id int unsigned not null,
constraint chat_users_user_id_chat_id_unique
unique (user_id, chat_id),
constraint chat_users_chat_id_foreign
foreign key (chat_id) references chats (id)
);
create index chat_users_chat_id_index
on chat_users (chat_id);
create index chat_users_user_id_index
on chat_users (user_id);
create table chat_messages
(
id int unsigned auto_increment primary key,
chat_id int unsigned not null,
from_user_id int unsigned not null,
content varchar(500) collate utf8mb4_unicode_ci not null,
created_at timestamp default CURRENT_TIMESTAMP not null,
constraint chat_messages_chat_id_foreign
foreign key (chat_id) references chats (id)
);
create index chat_messages_chat_id_index
on chat_messages (chat_id);
create index chat_messages_from_user_id_index
on chat_messages (from_user_id);
The query I have tried so far, which does not work properly, is:
SET @userId = 1;
select
c.id as chat_id,
content,
chm.from_user_id
from chat_users
inner join chats c on chat_users.chat_id = c.id
inner join chat_messages chm on c.id = chm.chat_id
where chat_users.user_id = @userId
group by c.id
order by c.id desc, max(chm.created_at) desc
My query above does not return the content of the most recently created message, even though I order by max(chm.created_at) desc. I think this ORDER BY is applied after the grouping, to the grouped rows, and not to the rows within each group.
I know I could select max(chm.created_at) as last_created_at_msg_within_group in the SELECT list, but that gives me only the date. What I want is the content value of the last message within the group.
I don't know how to select the content field from the row with the highest chm.created_at within each group formed by grouping on c.id.
Example test data
chats
1 2021-07-23 20:51:01
2 2021-07-23 20:51:01
3 2021-07-23 20:51:01
chat_users
1 1 1
2 1 2
3 2 1
4 2 2
5 3 1
6 3 2
chat_messages
1 1 1 lastmsg 2021-07-28 21:50:31
1 1 2 themsg 2021-07-23 20:51:01
The logic in this case should return
chat_id content from_user_id
1 lastmsg 1
PS:
Before posting here I did my homework and studied similar questions in the forum, but they were trying to get last inserted row from a group and were not like mine.
Here's what I came up with, for a solution for MySQL 8.0 with window functions:
select * from (
select
c.id as chat_id,
content,
chm.from_user_id,
chm.created_at,
row_number() over (partition by c.id order by chm.created_at desc) as rownum
from chat_users
inner join chats c on chat_users.chat_id = c.id
inner join chat_messages chm on c.id = chm.chat_id
where chat_users.user_id = @userId
) as t
where rownum = 1;
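For servers older than MySQL 8.0, where window functions are not available, one possible alternative is a correlated subquery that picks the newest message per chat. This is only a sketch, assuming that ties on created_at should be broken by the higher message id. The aliases cu and m are introduced here for illustration:

```sql
SET @userId = 1;

-- Sketch: for each chat the user belongs to, keep only the message whose
-- id is the one returned by the correlated "newest first" subquery.
select
    chm.chat_id,
    chm.content,
    chm.from_user_id
from chat_users cu
inner join chat_messages chm on chm.chat_id = cu.chat_id
where cu.user_id = @userId
  and chm.id = (
      select m.id
      from chat_messages m
      where m.chat_id = chm.chat_id
      order by m.created_at desc, m.id desc
      limit 1
  );
```

Chats without any messages do not appear in the result, which matches the inner joins in the window-function version.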

Trying to write a query that will pull highest and lowest revenue and also print the account health?

We are trying to write a query that shows us:
Programs with highest and lowest revenue and print their account health
This is what we have to start:
Top
Select TOP 5 * From Health, Revenue
From Program_T, Account_T
Order by Revenue;
Bottom
Select BOTTOM 5 * From Health, Revenue
From Program_T, Account_T
Order by Revenue;
Below are the tables:
Program_T Table:
create table Program_T
(AccountName varchar(150) not null unique,
ProgramID int not null,
Revenue int,
Advocates int,
Shares int,
Conversions int,
Impressions int,
LaunchDate date,
CSMID int not null,
constraint Program_PK primary key (AccountName, CSMID),
constraint Program_FK1 foreign key (AccountName) references Account_T(AccountName),
constraint Program_FK2 foreign key (CSMID) references CSM_T(CSMID));
Account_T Table:
create table Account_T
(AccountName varchar(150) not null unique,
Health varchar(10) not null,
EcommercePlatform varchar(50),
CSMID int not null,
Industry varchar(50),
Amount int not null,
constraint Accounts_PK primary key (AccountName),
constraint Accounts_FK foreign key (CSMID) references CSM_T(CSMID));
MySQL has no TOP clause; the equivalent is LIMIT. You order ascending or descending to get either the bottom or the top rows. Since you want both, you need two queries whose results you glue together with UNION ALL. Because each query has its own ORDER BY clause, you need parentheses to tell the DBMS which query each ORDER BY belongs to. Finally, you want an outer ORDER BY clause, because the result of a UNION ALL is not guaranteed to be ordered.
select revenue, health
from
(
(
select p.revenue, a.health
from program_t p
join account_t a using (accountname)
order by p.revenue asc limit 5
)
union all
(
select p.revenue, a.health
from program_t p
join account_t a using (accountname)
order by p.revenue desc limit 5
)
) glued
order by revenue;
As of MySQL 8 you could also use ROW_NUMBER to rank your rows, which may or may not be faster:
select revenue, health
from
(
select
p.revenue, a.health,
row_number() over (order by p.revenue asc) as rn1,
row_number() over (order by p.revenue desc) as rn2
from program_t p
join account_t a using (accountname)
) numbered
where rn1 <= 5 or rn2 <= 5
order by revenue;
As to the joins, you can either use the USING clause as shown above or use ON:
from program_t p
join account_t a on a.accountname = p.accountname
I think you are looking for UNION ALL:
select * from (
select * from
(Select Health, Revenue
From Program_T p join Account_T a on p.AccountName =a.AccountName
Order by Revenue desc limit 5
)a
union all
select * from
(
Select Health, Revenue
From Program_T p join Account_T a on p.AccountName =a.AccountName
Order by Revenue limit 5
)b) as w
You want UNION ALL. MySQL does not support TOP, so use LIMIT only, and wrap each branch in parentheses so that its ORDER BY and LIMIT apply to that branch alone:
(Select AT.Health, PT.Revenue
From Program_T PT INNER JOIN
Account_T AT
ON PT.AccountName = AT.AccountName
Order by PT.Revenue
LIMIT 5)
UNION ALL
(Select AT.Health, PT.Revenue
From Program_T PT INNER JOIN
Account_T AT
ON PT.AccountName = AT.AccountName
Order by PT.Revenue DESC
LIMIT 5);

Fast group rank() function

There are various ways people try to emulate MSSQL RANK() or ROW_NUMBER() functions in MySQL, but all of them I've tried so far are slow.
I have a table that looks like this:
CREATE TABLE ratings
(`id` int, `category` varchar(1), `rating` int)
;
INSERT INTO ratings
(`id`, `category`, `rating`)
VALUES
(3, '*', 54),
(4, '*', 45),
(1, '*', 43),
(2, '*', 24),
(2, 'A', 68),
(3, 'A', 43),
(1, 'A', 12),
(3, 'B', 22),
(4, 'B', 22),
(4, 'C', 44)
;
Except it has 220,000 records. There are about 90,000 unique id's.
I wanted to rank the id's first by looking at the categories which were not * where a higher rating is a lower rank.
SELECT g1.id,
g1.category,
g1.rating,
Count(*) AS rank
FROM ratings AS g1
JOIN ratings AS g2 ON (g2.rating, g2.id) >= (g1.rating, g1.id)
AND g1.category = g2.category
WHERE g1.category != '*'
GROUP BY g1.id,
g1.category,
g1.rating
ORDER BY g1.category,
rank
Output:
id category rating rank
2 A 68 1
3 A 43 2
1 A 12 3
4 B 22 1
3 B 22 2
4 C 44 1
Then I wanted to take the smallest rank an id had, and average that with the rank they have within the * category. Giving a total query of:
SELECT X1.id,
(X1.rank + X2.minrank) / 2 AS OverallRank
FROM
(SELECT g1.id,
g1.category,
g1.rating,
Count(*) AS rank
FROM ratings AS g1
JOIN ratings AS g2 ON (g2.rating, g2.id) >= (g1.rating, g1.id)
AND g1.category = g2.category
WHERE g1.category = '*'
GROUP BY g1.id,
g1.category,
g1.rating
ORDER BY g1.category,
rank) X1
JOIN
(SELECT id,
Min(rank) AS MinRank
FROM
(SELECT g1.id,
g1.category,
g1.rating,
Count(*) AS rank
FROM ratings AS g1
JOIN ratings AS g2 ON (g2.rating, g2.id) >= (g1.rating, g1.id)
AND g1.category = g2.category
WHERE g1.category != '*'
GROUP BY g1.id,
g1.category,
g1.rating
ORDER BY g1.category,
rank) X
GROUP BY id) X2 ON X1.id = X2.id
ORDER BY overallrank
Giving me
id OverallRank
3 1.5000
4 1.5000
2 2.5000
1 3.0000
This query is correct and the output I want, but it just hangs on my real table of 220,000 records. How can I optimize it? My real table has an index on id,rating and category and id,category
Edit:
Result of SHOW CREATE TABLE ratings:
CREATE TABLE `rating` (
`id` int(11) NOT NULL,
`category` varchar(255) NOT NULL,
`rating` int(11) NOT NULL DEFAULT '1500',
`rd` int(11) NOT NULL DEFAULT '350',
`vol` float NOT NULL DEFAULT '0.06',
`wins` int(11) NOT NULL,
`losses` int(11) NOT NULL,
`streak` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`streak`,`rd`,`id`,`category`),
UNIQUE KEY `id_category` (`id`,`category`),
KEY `rating` (`rating`,`rd`),
KEY `streak_idx` (`streak`),
KEY `category_idx` (`category`),
KEY `id_rating_idx` (`id`,`rating`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
The PRIMARY KEY is the most common use case of queries to this table, that is why it's the clustered key. It's worth noting that the server is a raid 10 of SSDs with a 9GB/s FIO random read. So I don't suspect the indices not being clustered will affect much.
Output of (select count(distinct category) from ratings) is 50
In case this comes down to how the data is distributed or to an oversight on my part, I have included an export of the entire table. It is only 200KB zipped: https://www.dropbox.com/s/p3iv23zi0uzbekv/ratings.zip?dl=0
The first query takes 27 seconds to run
You can use temporary tables with an AUTO_INCREMENT column to generate ranks (row number).
For example - to generate ranks for the '*' category:
drop temporary table if exists tmp_main_cat_rank;
create temporary table tmp_main_cat_rank (
rank int unsigned auto_increment primary key,
id int NOT NULL
) engine=memory
select null as rank, id
from ratings r
where r.category = '*'
order by r.category, r.rating desc, r.id desc;
This runs in something like 30 msec, while your self-join approach takes 45 seconds on my machine. Even with a new index on (category, rating, id), the self-join still takes 14 seconds to run.
Generating ranks per group (per category) is a bit more complicated. We can still use an AUTO_INCREMENT column, but we need to calculate and subtract an offset per category:
drop temporary table if exists tmp_pos;
create temporary table tmp_pos (
pos int unsigned auto_increment primary key,
category varchar(50) not null,
id int NOT NULL
) engine=memory
select null as pos, category, id
from ratings r
where r.category <> '*'
order by r.category, r.rating desc, r.id desc;
drop temporary table if exists tmp_cat_offset;
create temporary table tmp_cat_offset engine=memory
select category, min(pos) - 1 as `offset`
from tmp_pos
group by category;
select t.id, min(t.pos - o.offset) as min_rank
from tmp_pos t
join tmp_cat_offset o using(category)
group by t.id
This runs in about 220 msec. The selfjoin solution takes 42 sec or 13 sec with the new index.
Now you just need to combine the last query with the first temp table, to get your final result:
select t1.id, (t1.min_rank + t2.rank) / 2 as OverallRank
from (
select t.id, min(t.pos - o.offset) as min_rank
from tmp_pos t
join tmp_cat_offset o using(category)
group by t.id
) t1
join tmp_main_cat_rank t2 using(id);
Overall runtime is ~280 msec without an additional index and ~240 msec with an index on (category, rating, id).
A note on the self-join approach: it's an elegant solution and performs fine with a small group size. It's fast with an average group size <= 2, and it can be acceptable for a group size of 10. But you have an average group size of 447 (count(*) / count(distinct category)), which means every row is joined with 447 other rows on average. You can see the impact by removing the group by clause:
SELECT Count(*)
FROM ratings AS g1
JOIN ratings AS g2 ON (g2.rating, g2.id) >= (g1.rating, g1.id)
AND g1.category = g2.category
WHERE g1.category != '*'
The result is more than 10M rows.
However - with an index on (category, rating, id) your query runs in 33 seconds on my machine.
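On MySQL 8.0 or later, which postdates the timings above, window functions offer another way to avoid both the self-join and the temporary tables. The following is a sketch only, not something measured against the real table. It assumes the same ordering as the self-join (higher rating first, then higher id), and the names ranked, rn and min_rn are made up for this example:

```sql
-- Sketch: number the rows inside each category once, then combine the
-- rank in the '*' category with the best rank in any other category.
with ranked as (
    select id,
           category,
           row_number() over (partition by category
                              order by rating desc, id desc) as rn
    from ratings
)
select m.id,
       (m.rn + o.min_rn) / 2 as OverallRank
from ranked m
join (
    select id, min(rn) as min_rn
    from ranked
    where category <> '*'
    group by id
) o on o.id = m.id
where m.category = '*'
order by OverallRank;
```

On the ten sample rows from the question this should give ids 3 and 4 an OverallRank of 1.5, id 2 a rank of 2.5 and id 1 a rank of 3, the same values as the original query.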

mysql query with subquery loop?

I have a query
SELECT
count(product) as amount,
product,
sum(price) AS price
FROM `products`
WHERE
brid = 'broker'
AND
cancelled is null
GROUP BY product
WITH ROLLUP
Is it possible to query a table to get a brokers id and then for each broker run the query above written as 1 query?
Almost like:
SELECT brid FROM membership
THEN
SELECT
count(product) as amount,
product,
sum(price) AS price
FROM `products`
WHERE
brid = membership.brid
AND
cancelled is null
GROUP BY product
WITH ROLLUP
THEN
SELECT NEXT brid
Is this possible? I know how to do it in PHP, but I would prefer one query that returns everything rather than a separate query for each broker.
Thanks
Adam.
Sure, you can GROUP BY both the 'brid' field and the 'product' field. As noted below, WITH ROLLUP will cause it to sort by 'brid' and then by 'product':
SELECT
brid,
count(product) as amount,
product,
sum(price) AS price
FROM `products`
WHERE
brid IN (SELECT brid FROM membership)
AND
cancelled is null
GROUP BY brid, product
WITH ROLLUP
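WITH ROLLUP adds super-aggregate rows in which the rolled-up columns are NULL. As a sketch, assuming MySQL 8.0.1 or later where the GROUPING() function is available, the two extra columns below flag those rows so that application code can tell subtotals from ordinary groups. The column names brid_rolled_up and product_rolled_up are made up for this example:

```sql
-- Sketch: same grouping as above, plus flags for the ROLLUP rows.
-- product_rolled_up = 1 marks a per-broker subtotal (and the grand total);
-- brid_rolled_up = 1 marks only the grand-total row.
SELECT
    brid,
    product,
    COUNT(product) AS amount,
    SUM(price) AS price,
    GROUPING(brid) AS brid_rolled_up,
    GROUPING(product) AS product_rolled_up
FROM `products`
WHERE
    brid IN (SELECT brid FROM membership)
    AND cancelled IS NULL
GROUP BY brid, product
WITH ROLLUP;
```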
SELECT
count(product) as amount,
product,
sum(price) AS price
FROM `products`
INNER JOIN membership on membership.brid = products.brid
WHERE
cancelled is null
GROUP BY product
WITH ROLLUP
As far as I can understand from your example, all you need is inner join between membership and products on brid
Take the following example:
products table:
CREATE TABLE `products` (
`price` int(11) DEFAULT NULL,
`product` varchar(20) DEFAULT NULL,
`brid` varchar(20) DEFAULT NULL,
`cancelled` varchar(20) DEFAULT NULL
)
membership table:
CREATE TABLE `membership` (
`brid` varchar(20) DEFAULT NULL
)
And the following is my query, grouped by broker and product as you required:
SELECT
t.brid, count(t.product) as amount,
t.product,
sum(t.price) AS price
FROM products t, membership m
WHERE
t.brid = m.`brid`
AND
t.cancelled is null
GROUP BY t.brid, t.product
Hope that helps!

SELECT newest record of any GROUP of records (ignoring records with one record)

Having trouble with a query to return the newest order of any grouped set of orders having more than 1 order. CREATE & INSERTs for the test data are below.
This query returns the unique customer id's I want to work with, along with the grouped order_id's. Of these records, I only need the most recent order (based on date_added).
SELECT COUNT(customer_id), customer_id, GROUP_CONCAT(order_id) FROM orderTable GROUP BY customer_id HAVING COUNT(customer_id)>1 LIMIT 10;
mysql> SELECT COUNT(customer_id), customer_id, GROUP_CONCAT(order_id) FROM orderTable GROUP BY customer_id HAVING COUNT(customer_id)>1 LIMIT 10;
+--------------------+-------------+------------------------+
| COUNT(customer_id) | customer_id | GROUP_CONCAT(order_id) |
+--------------------+-------------+------------------------+
|                  2 | 0487        | F9,Z33                 |
|                  3 | 1234        | 3A,5A,88B              |
+--------------------+-------------+------------------------+
2 rows in set (0.00 sec)
I'm looking for order Z33 (customer_id 0487) and 3A (customer_id 1234).
For clarification, I do not want orders for customers that have only ordered once.
Any help or tips to get me pointed in the right direction appreciated.
Sample table data:
--
-- Table structure for table orderTable
CREATE TABLE IF NOT EXISTS orderTable (
customer_id varchar(10) NOT NULL,
order_id varchar(4) NOT NULL,
date_added date NOT NULL,
PRIMARY KEY (customer_id,order_id)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
--
-- Dumping data for table orderTable
INSERT INTO orderTable (customer_id, order_id, date_added) VALUES
('1234', '5A', '1997-01-22'),
('1234', '88B', '1992-05-09'),
('0487', 'F9', '2002-01-23'),
('5799', 'A12F', '2007-01-23'),
('1234', '3A', '2009-01-22'),
('3333', '7FHS', '2009-01-22'),
('0487', 'Z33', '2004-06-23');
==========================================================
Clarification of the query.
The question was to include only customers with more than one order, so my query puts that condition inside the PreQuery, together with the GROUP BY. That way the PreQuery returns only the customers that had multiple orders, along with the date of each one's most recent order. The PreQuery is then joined back to the orders table on the common customer ID, but only for the order whose date matches the last date found in the PreQuery. If a customer had only a single order, the count in the PreQuery would be 1, and that customer would be excluded from the PreQuery result set.
select ot.*
from
( select
customer_id,
max( date_added ) as LastOrderDate
from
orderTable
group by
customer_id
having
count(*) > 1 ) PreQuery
join orderTable ot
on PreQuery.customer_id = ot.customer_id
and PreQuery.LastOrderDate = ot.date_added
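If two orders of the same customer share the latest date_added, the join above returns both of them. On MySQL 8.0 or later, a window-function variant can guarantee one row per customer. This is a sketch under the assumption that such ties should be broken by the higher order_id. The names ranked, order_count and rn are introduced here for illustration:

```sql
-- Sketch: count the orders per customer and number them newest first,
-- then keep the newest order of every customer with more than one order.
select customer_id, order_id, date_added
from (
    select customer_id,
           order_id,
           date_added,
           count(*) over (partition by customer_id) as order_count,
           row_number() over (partition by customer_id
                              order by date_added desc, order_id desc) as rn
    from orderTable
) ranked
where order_count > 1
  and rn = 1;
```

With the sample rows this should return order Z33 for customer 0487 and order 3A for customer 1234.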