MySQL - Find MAX of grouped SUM (without LIMIT)

I would like to get the user_id and the summed amount for the users who have the largest summed amount. I cannot use LIMIT because that returns only one record, and several users may share the same summed amount.
Here is my schema and some sample records:
CREATE TABLE transactions (
id BIGINT(20) NOT NULL AUTO_INCREMENT,
user_id BIGINT(20) NOT NULL,
amount FLOAT NOT NULL, PRIMARY KEY (id)
);
INSERT INTO transactions (user_id, amount) VALUES
(1, 1000),
(1, 1000),
(1, 1000),
(2, 2000),
(2, 1000),
(3, 1000);
Here is the expected result:
+---------+------+
| user_id | sum |
+---------+------+
| 1 | 3000 |
| 2 | 3000 |
+---------+------+
I can get the above result with the following SQL, but I don't know whether there is a better approach. Is it necessary to repeat the same subquery twice? Thanks.
SELECT T1.user_id, T1.sum
FROM (
SELECT user_id, SUM(amount) as sum
FROM transactions
GROUP BY user_id
) T1
WHERE T1.sum = (
SELECT MAX(T2.sum)
FROM (
SELECT user_id, SUM(amount) as sum
FROM transactions
GROUP BY user_id
) T2
)
GROUP BY T1.user_id;

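Well, you can simplify your query as follows. The LIMIT 1 in the subquery only picks the largest sum; the outer HAVING still returns every user tied at that value.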
SELECT user_id, SUM(amount) as sum
FROM transactions
GROUP BY user_id
HAVING SUM(amount) = (
SELECT SUM(amount) as sum
FROM transactions
GROUP BY user_id
ORDER BY SUM(amount) DESC
LIMIT 1
)
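To check the simplified query against the sample data without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. Using SQLite is an assumption made for convenience (its dialect differs from MySQL's in places), the function name `top_users_by_sum` is hypothetical, and the trailing ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

def top_users_by_sum(rows):
    """Return (user_id, total) for every user tied for the largest total."""
    conn = sqlite3.connect(":memory:")
    # SQLite stand-in for the question's MySQL schema (assumption).
    conn.execute(
        "CREATE TABLE transactions ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "user_id INTEGER NOT NULL, "
        "amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO transactions (user_id, amount) VALUES (?, ?)", rows
    )
    query = """
        SELECT user_id, SUM(amount) AS total
        FROM transactions
        GROUP BY user_id
        HAVING SUM(amount) = (
            -- LIMIT 1 only selects the largest sum, not the final rows
            SELECT SUM(amount)
            FROM transactions
            GROUP BY user_id
            ORDER BY SUM(amount) DESC
            LIMIT 1
        )
        ORDER BY user_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample rows from the question: (user_id, amount)
sample = [(1, 1000), (1, 1000), (1, 1000), (2, 2000), (2, 1000), (3, 1000)]
print(top_users_by_sum(sample))
```

With the question's data, users 1 and 2 both come back with a total of 3000, and user 3 is dropped.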

Related

Determine growth in value in a specific time range

Given the following example data set, how should I structure my SQL query to determine whether the value has grown over time (given a time range in the query) for a specific UserId, returning either a positive/negative growth percentage or a true/false result?
| UserId | timestamp | value
|-----: | ---------: | ----:
| 1 | 1617711825 | 350
| 1 | 1617711829 | 400
| 1 | 1617711830 | 450
| 5 | 1617711831 | 560
Given the above example, we can observe that the value for UserId=1 has grown by a certain percentage.
The expected result would be:
| UserId | growthPercentage | hasValueIncreased
|-----: | ---------------: | ----------------:
| 1 | 50% | 1
You can get the first and last values and then do whatever calculation you like. One method is:
select userId, value_first, value_last,
(value_first < value_last) as is_growing,
100 * ((value_last / value_first) - 1) as increase_percentage
from (select t.*,
first_value(value) over (partition by userId order by timestamp) as value_first,
first_value(value) over (partition by userId order by timestamp desc) as value_last
from t
) t
group by userId, value_first, value_last;
Schema and insert statements:
create table mytable(UserId int,timestamps timestamp,value int);
insert into mytable values(1, FROM_UNIXTIME(1617711825), 350);
insert into mytable values(1, FROM_UNIXTIME(1617711829), 400);
insert into mytable values(1, FROM_UNIXTIME(1617711830), 450);
insert into mytable values(5, FROM_UNIXTIME(1617711831), 560);
Query:
select usermin.userid, 100*(usermax.value-usermin.value)/usermin.value as growthPercentage,
(case when usermax.value>usermin.value then 1 else 0 end)hasValueIncreased
from
(SELECT USERID, VALUE FROM mytable where userid=1
order by timestamps
limit 1) usermin
inner join
(SELECT USERID, VALUE FROM mytable where userid=1
order by timestamps desc
limit 1) usermax
on usermin.userid=usermax.userid
Output:
|userid | growthPercentage | hasValueIncreased
|-----: | ---------------: | ----------------:
| 1 | 28.5714 | 1
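The query above is hard-coded to userid=1. As a rough sketch of the same first-versus-last comparison for every user at once, the following uses Python's built-in sqlite3 module in place of MySQL (an assumption made so the example is self-contained). The column name `ts`, the function name `growth_by_user`, and storing the timestamps as plain integers are illustrative choices, and the sketch assumes each user's earliest and latest timestamps are unique.

```python
import sqlite3

def growth_by_user(rows):
    """Return (UserId, growthPercentage, hasValueIncreased) per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (UserId INTEGER, ts INTEGER, value INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    query = """
        SELECT f.UserId,
               100.0 * (l.value - f.value) / f.value AS growthPercentage,
               CASE WHEN l.value > f.value THEN 1 ELSE 0 END AS hasValueIncreased
        FROM (
            -- earliest and latest timestamp for each user
            SELECT UserId, MIN(ts) AS first_ts, MAX(ts) AS last_ts
            FROM mytable
            GROUP BY UserId
        ) b
        JOIN mytable f ON f.UserId = b.UserId AND f.ts = b.first_ts
        JOIN mytable l ON l.UserId = b.UserId AND l.ts = b.last_ts
        ORDER BY f.UserId
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample rows from the question: (UserId, unix timestamp, value)
sample_rows = [
    (1, 1617711825, 350),
    (1, 1617711829, 400),
    (1, 1617711830, 450),
    (5, 1617711831, 560),
]
for row in growth_by_user(sample_rows):
    print(row)
```

A time range could be added as a WHERE clause on `ts` inside the inner grouped query and on both joined copies of the table; that part is left out here.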

Selecting all rows with only one value in column with another common value

my table:
drop table if exists new_table;
create table if not exists new_table(
obj_type int(4),
user_id varchar(30),
payer_id varchar(30)
);
insert into new_table (obj_type, user_id, payer_id) values
(1, 'user1', 'payer1'),
(1, 'user2', 'payer1'),
(2, 'user3', 'payer1'),
(1, 'user1', 'payer2'),
(1, 'user2', 'payer2'),
(2, 'user3', 'payer2'),
(3, 'user1', 'payer3'),
(3, 'user2', 'payer3');
I am trying to select all the payer ids whose obj_type is only one value and not any other value. In other words, even though each payer has multiple users, I only want the payers who use exactly one obj_type.
I have tried using a query like this:
select * from new_table
where obj_type = 1
group by payer_id;
But this returns rows whose payers also have other users with other obj_types. I am trying to get a result that looks like:
obj | user | payer
----|-------|--------
3 | user1 | payer3
3 | user2 | payer3
Thanks in advance.
That is actually easy:
SELECT payer_id
FROM new_table
GROUP BY payer_id
HAVING COUNT(DISTINCT obj_type) = 1
HAVING filters rows just like WHERE, but it does so after the aggregation.
The difference is best explained by an example:
SELECT dept_id, SUM(salary)
FROM employees
WHERE salary > 100000
GROUP BY dept_id
This will give you the sum of the salaries of people earning more than 100000 each.
SELECT dept_id, SUM(salary)
FROM employees
GROUP BY dept_id
HAVING SUM(salary) > 100000
The second query will give you the departments where all employees together earn more than 100000 even if no single employee earns that much.
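To make the WHERE versus HAVING difference concrete, here is a small sketch that runs both forms on made-up data using Python's built-in sqlite3 module. The `employees` rows, the function name `where_vs_having`, and the use of SQLite instead of MySQL are all assumptions for illustration.

```python
import sqlite3

def where_vs_having(employees, threshold):
    """Run the WHERE form and the HAVING form and return both results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (dept_id INTEGER, salary INTEGER)")
    conn.executemany("INSERT INTO employees VALUES (?, ?)", employees)
    # WHERE drops individual rows before they are grouped.
    filtered_rows = conn.execute(
        "SELECT dept_id, SUM(salary) FROM employees "
        "WHERE salary > ? GROUP BY dept_id ORDER BY dept_id",
        (threshold,),
    ).fetchall()
    # HAVING drops whole groups after the aggregate has been computed.
    filtered_groups = conn.execute(
        "SELECT dept_id, SUM(salary) FROM employees "
        "GROUP BY dept_id HAVING SUM(salary) > ? ORDER BY dept_id",
        (threshold,),
    ).fetchall()
    conn.close()
    return filtered_rows, filtered_groups

# Hypothetical data: (dept_id, salary)
staff = [(1, 60000), (1, 70000), (2, 120000), (2, 50000), (3, 40000)]
by_where, by_having = where_vs_having(staff, 100000)
print(by_where)   # only department 2 has an individual salary above the threshold
print(by_having)  # departments 1 and 2 exceed the threshold in total
```

Department 1 appears only in the HAVING result: nobody there earns more than 100000, but the two salaries together do.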
If you want to return all rows without grouping them you can use analytic functions:
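-- MySQL 8.0 does not allow DISTINCT inside a window function,
-- so compare MIN and MAX per payer instead of COUNT(DISTINCT ...) OVER
SELECT obj_type, user_id, payer_id FROM (
SELECT obj_type, user_id,
payer_id,
MIN(obj_type) OVER (PARTITION BY payer_id) AS min_obj_type,
MAX(obj_type) OVER (PARTITION BY payer_id) AS max_obj_type
FROM new_table) t
WHERE min_obj_type = max_obj_type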
Or you can use IN with the grouped query above:
SELECT *
FROM new_table
WHERE payer_id IN (SELECT payer_id
FROM new_table
GROUP BY payer_id
HAVING COUNT(DISTINCT obj_type) = 1)
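As a quick check of the IN form against the question's sample rows, here is a sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here as an assumption, the function name `single_type_payer_rows` is hypothetical, and the ORDER BY is added only for a stable output order.

```python
import sqlite3

def single_type_payer_rows(rows):
    """Return all rows whose payer uses exactly one distinct obj_type."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE new_table (obj_type INTEGER, user_id TEXT, payer_id TEXT)"
    )
    conn.executemany("INSERT INTO new_table VALUES (?, ?, ?)", rows)
    query = """
        SELECT obj_type, user_id, payer_id
        FROM new_table
        WHERE payer_id IN (
            -- payers that have exactly one distinct obj_type
            SELECT payer_id
            FROM new_table
            GROUP BY payer_id
            HAVING COUNT(DISTINCT obj_type) = 1
        )
        ORDER BY payer_id, user_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample rows from the question: (obj_type, user_id, payer_id)
payer_rows = [
    (1, "user1", "payer1"), (1, "user2", "payer1"), (2, "user3", "payer1"),
    (1, "user1", "payer2"), (1, "user2", "payer2"), (2, "user3", "payer2"),
    (3, "user1", "payer3"), (3, "user2", "payer3"),
]
print(single_type_payer_rows(payer_rows))
```

Only the two payer3 rows come back, which matches the result the question asks for.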

get the id of the row with the least value, group by an other column

I ran into a problem trying to pull one action per user with the lowest priority. The priority is an integer derived from the content of other columns.
This is the initial query:
SELECT
CASE
...
END AS dummy_priority,
id,
user_id
FROM
actions
Result :
id user_id priority
1 2345 1
2 2345 3
3 2999 5
4 2999 2
5 3000 10
Desired result :
id user_id priority
1 2345 1
4 2999 2
5 3000 10
Following that, I tried:
SELECT x.id, x.user_id, MIN(x.priority)
FROM (
SELECT
CASE
...
END AS priority,
id,
user_id
FROM
actions
) x
GROUP BY x.user_id
Which didn't work
Error Code: 1055. Expression #1 of SELECT list is not in GROUP BY
clause and contains nonaggregated column 'x.id' which is not
functionally dependent on columns in GROUP BY clause;
Most examples I found extract just the user_id and priority and then do an inner join on both to get the row, but I can't do that since (priority, user_id) isn't unique.
A simple verifiable example would be
CREATE TABLE `actions` (
`id` int(11) NOT NULL,
`user_id` int(11) DEFAULT NULL,
`priority` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `actions` (`id`, `user_id`, `priority`) VALUES
(1, 2345, 1),
(2, 2345, 3),
(3, 2999, 5),
(4, 2999, 2),
(5, 3000, 10);
How can I extract the desired result (please keep in mind that this table is a subquery)?
The proper way to do this would involve a subquery of some sort . . . and that would require repeating the case definition.
Here is another method, using the substring_index()/group_concat() trick:
SELECT SUBSTRING_INDEX(GROUP_CONCAT(x.id ORDER BY x.priority), ',', 1) as id,
x.user_id, MIN(x.priority)
FROM (SELECT (CASE ...
END) AS priority,
id, user_id
FROM actions a
) x
GROUP BY x.user_id;
And that proper way in full...
SELECT x...
, CASE...x... priority
FROM my_table x
JOIN
( SELECT user_id
, MIN(CASE...) priority
FROM my_table
GROUP
BY user_id
) y
ON y.user_id = x.user_id
AND y.priority = CASE...x...;
This should work ...
SELECT act.id, act.user_id, act.priority
FROM actions act
INNER JOIN
(SELECT
user_id, MIN(priority) AS priority
FROM
actions
GROUP BY user_id) pri
ON act.user_id = pri.user_id AND act.priority = pri.priority
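Because (priority, user_id) is not unique, a plain join on the minimum priority can return more than one row per user. One possible way around that, sketched below with Python's built-in sqlite3 module, is to group the joined rows again and keep the lowest id among the ties. Breaking ties on the lowest id is an assumption, as is the extra tied row (6, 3000, 10) added to the sample data, the function name, and the use of SQLite in place of MySQL. In the real case `actions` would be the CASE subquery from the question.

```python
import sqlite3

def lowest_priority_action_per_user(rows):
    """Return one (id, user_id, priority) row per user, lowest priority first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE actions (id INTEGER NOT NULL, user_id INTEGER, priority INTEGER)"
    )
    conn.executemany("INSERT INTO actions VALUES (?, ?, ?)", rows)
    query = """
        SELECT MIN(a.id) AS id, a.user_id, a.priority
        FROM actions a
        JOIN (
            -- lowest priority for each user
            SELECT user_id, MIN(priority) AS priority
            FROM actions
            GROUP BY user_id
        ) m ON m.user_id = a.user_id AND m.priority = a.priority
        -- collapse ties on (user_id, priority) to the lowest id
        GROUP BY a.user_id, a.priority
        ORDER BY a.user_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Question's sample rows plus one hypothetical tie for user 3000
actions = [
    (1, 2345, 1), (2, 2345, 3), (3, 2999, 5),
    (4, 2999, 2), (5, 3000, 10), (6, 3000, 10),
]
print(lowest_priority_action_per_user(actions))
```

Every selected column is either grouped or aggregated, so this form should also avoid error 1055 under ONLY_FULL_GROUP_BY.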

Limit count in sql

I have a query that looks like the below
SELECT
venueid as VENUES, venue2.venue AS LOCATION,
(SELECT COUNT(*) FROM events WHERE (VENUES = venueid) AND eventdate < CURDATE()) AS number
FROM events
INNER JOIN venues as venue2 ON events.venueid=venue2.id
GROUP BY VENUES
ORDER BY number DESC
I want to limit the count to the last 5 rows in the table (sorting by id); however, when I add LIMIT 0,5 the results don't seem to change. When counting, where do you add the LIMIT to restrict the number of rows being counted?
CREATE TABLE venues (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
venue VARCHAR(255)
) DEFAULT CHARACTER SET utf8 ENGINE=InnoDB;
CREATE TABLE categories (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
category VARCHAR(255)
) DEFAULT CHARACTER SET utf8 ENGINE=InnoDB;
CREATE TABLE events (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
eventdate DATE NOT NULL,
title VARCHAR(255),
venueid INT,
categoryid INT
) DEFAULT CHARACTER SET utf8 ENGINE=InnoDB;
INSERT INTO venues (id, venue) VALUES
(1, 'USA'),
(2, 'UK'),
(3, 'Japan');
INSERT INTO categories (id, category) VALUES
(1, 'Jazz'),
(2, 'Rock'),
(3, 'Pop');
INSERT INTO events (id, eventdate, title, venueid, categoryid) VALUES
(1,20121003,'Title number 1',1,3),
(2,20121010,'Title number 2',2,1),
(3,20121015,'Title number 3',3,2),
(4,20121020,'Title number 4',1,3),
(5,20121022,'Title number 5',2,1),
(6,20121025,'Title number 6',3,2),
(7,20121030,'Title number 7',1,3),
(8,20121130,'Title number 8',1,1),
(9,20121230,'Title number 9',1,2),
(10,20130130,'Title number 10',1,3);
The expected result should look like the below
|VENUES |LOCATION |NUMBER |
|1 | USA | 3 |
|2 | UK | 1 |
|3 | Japan | 1 |
As of the time of posting, ids 9, 8, 7, 6 and 5 are the last 5 events before the current date.
See SQL Fiddle link below for full table details.
http://sqlfiddle.com/#!2/21ad85/32
This query gives you the five rows that you are trying to group and count:
SELECT *
FROM events
WHERE eventdate < CURDATE()
ORDER BY eventdate DESC
LIMIT 5
Now you can use this query as a subquery. You can join with the result of a subquery just as if it were an ordinary table:
SELECT
venueid as VENUES,
venue2.venue AS LOCATION,
COUNT(*) AS number
FROM
(
SELECT *
FROM events
WHERE eventdate < CURDATE()
ORDER BY eventdate DESC
LIMIT 5
) AS events
INNER JOIN venues as venue2 ON events.venueid=venue2.id
GROUP BY VENUES
ORDER BY number DESC
http://sqlfiddle.com/#!2/21ad85/37
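The limit-then-count idea can be tried out with the sketch below, which uses Python's built-in sqlite3 module rather than MySQL (an assumption). SQLite has no CURDATE(), so a hypothetical `cutoff` parameter stands in for it, and dates are stored as ISO text so that string comparison follows date order. The `id DESC` tie-break, the `event_count` alias, and the function name are also illustrative choices.

```python
import sqlite3

VENUES = [(1, "USA"), (2, "UK"), (3, "Japan")]
EVENTS = [  # (id, eventdate, venueid), taken from the question's sample data
    (1, "2012-10-03", 1), (2, "2012-10-10", 2), (3, "2012-10-15", 3),
    (4, "2012-10-20", 1), (5, "2012-10-22", 2), (6, "2012-10-25", 3),
    (7, "2012-10-30", 1), (8, "2012-11-30", 1), (9, "2012-12-30", 1),
    (10, "2013-01-30", 1),
]

def recent_event_counts(cutoff, last_n=5):
    """Count, per venue, the last_n events dated before cutoff."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE venues (id INTEGER PRIMARY KEY, venue TEXT)")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, eventdate TEXT NOT NULL, venueid INTEGER)"
    )
    conn.executemany("INSERT INTO venues VALUES (?, ?)", VENUES)
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", EVENTS)
    query = """
        SELECT recent.venueid, v.venue, COUNT(*) AS event_count
        FROM (
            -- pick the rows first, then count them in the outer query
            SELECT id, venueid
            FROM events
            WHERE eventdate < ?
            ORDER BY eventdate DESC, id DESC
            LIMIT ?
        ) AS recent
        JOIN venues v ON recent.venueid = v.id
        GROUP BY recent.venueid, v.venue
        ORDER BY event_count DESC, recent.venueid
    """
    result = conn.execute(query, (cutoff, last_n)).fetchall()
    conn.close()
    return result

print(recent_event_counts("2013-01-01"))
```

With a cutoff of 2013-01-01 the inner query selects events 9, 8, 7, 6 and 5, giving three events for USA and one each for UK and Japan, as in the expected result.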

SELECT newest record of any GROUP of records (ignoring records with one record)

Having trouble with a query to return the newest order of any grouped set of orders having more than 1 order. CREATE & INSERTs for the test data are below.
This query returns the unique customer id's I want to work with, along with the grouped order_id's. Of these records, I only need the most recent order (based on date_added).
mysql> SELECT COUNT(customer_id), customer_id, GROUP_CONCAT(order_id) FROM orderTable GROUP BY customer_id HAVING COUNT(customer_id)>1 LIMIT 10;
+--------------------+-------------+------------------------+
| COUNT(customer_id) | customer_id | GROUP_CONCAT(order_id) |
+--------------------+-------------+------------------------+
| 2 | 0487 | F9,Z33 |
| 3 | 1234 | 3A,5A,88B |
+--------------------+-------------+------------------------+
2 rows in set (0.00 sec)
I'm looking for order Z33 (customer_id 0487) and 3A (customer_id 1234).
For clarification, I do not want orders for customers that have only ordered once.
Any help or tips to get me pointed in the right direction appreciated.
Sample table data:
--
-- Table structure for table orderTable
CREATE TABLE IF NOT EXISTS orderTable (
customer_id varchar(10) NOT NULL,
order_id varchar(4) NOT NULL,
date_added date NOT NULL,
PRIMARY KEY (customer_id,order_id)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
--
-- Dumping data for table orderTable
INSERT INTO orderTable (customer_id, order_id, date_added) VALUES
('1234', '5A', '1997-01-22'),
('1234', '88B', '1992-05-09'),
('0487', 'F9', '2002-01-23'),
('5799', 'A12F', '2007-01-23'),
('1234', '3A', '2009-01-22'),
('3333', '7FHS', '2009-01-22'),
('0487', 'Z33', '2004-06-23');
==========================================================
Clarification of the query.
The question was to include only those customers that had more than one order, hence the HAVING inside the inner query alongside the GROUP BY. This way the inner query (PreQuery) returns only the customers that had multiple orders and, for each of them, the date of the most recent order. The PreQuery is then re-joined to the orders table on the common customer ID, but only for the order that matches the last date found in the PreQuery. If a customer had only a single order, its count in the PreQuery would be 1 and it would be excluded from the PreQuery result set.
select ot.*
from
( select
customer_id,
max( date_added ) as LastOrderDate
from
orderTable
group by
customer_id
having
count(*) > 1 ) PreQuery
join orderTable ot
on PreQuery.customer_id = ot.customer_id
and PreQuery.LastOrderDate = ot.date_added
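To see the PreQuery approach run against the sample data, here is a sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL as an assumption, the function name `newest_order_of_repeat_customers` is hypothetical, and the ORDER BY is added only for a stable output order. customer_id is kept as text so that the leading zero in 0487 survives.

```python
import sqlite3

def newest_order_of_repeat_customers(orders):
    """Return the most recent order of each customer with more than one order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orderTable ("
        "customer_id TEXT NOT NULL, order_id TEXT NOT NULL, "
        "date_added TEXT NOT NULL, PRIMARY KEY (customer_id, order_id))"
    )
    conn.executemany("INSERT INTO orderTable VALUES (?, ?, ?)", orders)
    query = """
        SELECT ot.customer_id, ot.order_id, ot.date_added
        FROM (
            -- one row per repeat customer with the date of the latest order
            SELECT customer_id, MAX(date_added) AS last_order_date
            FROM orderTable
            GROUP BY customer_id
            HAVING COUNT(*) > 1
        ) pre
        JOIN orderTable ot
          ON pre.customer_id = ot.customer_id
         AND pre.last_order_date = ot.date_added
        ORDER BY ot.customer_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample rows from the question: (customer_id, order_id, date_added)
orders = [
    ("1234", "5A", "1997-01-22"), ("1234", "88B", "1992-05-09"),
    ("0487", "F9", "2002-01-23"), ("5799", "A12F", "2007-01-23"),
    ("1234", "3A", "2009-01-22"), ("3333", "7FHS", "2009-01-22"),
    ("0487", "Z33", "2004-06-23"),
]
print(newest_order_of_repeat_customers(orders))
```

This returns Z33 for customer 0487 and 3A for customer 1234. If a customer placed two orders on the same latest date, both rows would be returned.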