mysql finding the sum of subgroup maximums


If I have the following table in MySQL:
date type amount
2017-12-01 3 2
2018-01-01 1 100
2018-02-01 1 50
2018-03-01 2 2000
2018-04-01 2 4000
2018-05-01 3 2
2018-06-01 3 1
...is there a way to find the sum of the amounts corresponding to the latest dates of each type? There are guaranteed to be no duplicate dates for any given type.
The answer I'd be looking to get from the data above could be broken down like this:
The latest date for type 1 is 2018-02-01, where the amount is 50;
The latest date for type 2 is 2018-04-01, where the amount is 4000;
The latest date for type 3 is 2018-06-01, where the amount is 1;
50 + 4000 + 1 = 4051
Is there a way to arrive directly at 4051 in a single query? This is for a Django project using MySQL if that makes a difference; I wasn't able to find an ORM-related solution either, so figured a raw SQL query might be a better place to start.
Thanks!

I'm not sure about Django, but in raw SQL you could use a self join to pick the latest row for each type (the row with no later date for that type) and then aggregate to get the sum of those amounts:
select sum(a.amount)
from your_table a
left join your_table b on a.type = b.type
and a.date < b.date
where b.type is null
Or
select sum(a.amount)
from your_table a
join (
select type, max(date) max_date
from your_table
group by type
) b on a.type = b.type
and a.date = b.max_date
Or by using a correlated subquery:
select sum(a.amount)
from your_table a
where a.date = (
select max(date)
from your_table
where type = a.type
)
For MySQL 8 you can use window functions to get your desired result:
select sum(amount)
from (select *, row_number() over (partition by type order by date desc) as seq
from your_table
) t
where seq = 1;
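To check that the approaches above agree, here is a minimal sketch that loads the question's sample rows and runs three of the queries. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL, with dates stored as ISO text; the table name your_table follows the answer.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; dates are ISO-formatted text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (date TEXT, type INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("2017-12-01", 3, 2),
        ("2018-01-01", 1, 100),
        ("2018-02-01", 1, 50),
        ("2018-03-01", 2, 2000),
        ("2018-04-01", 2, 4000),
        ("2018-05-01", 3, 2),
        ("2018-06-01", 3, 1),
    ],
)

# Variant 1: anti-join, keeping rows that have no later row of the same type.
anti_join = """
    SELECT SUM(a.amount)
    FROM your_table a
    LEFT JOIN your_table b ON a.type = b.type AND a.date < b.date
    WHERE b.type IS NULL
"""

# Variant 2: join against the per-type maximum date.
derived = """
    SELECT SUM(a.amount)
    FROM your_table a
    JOIN (SELECT type, MAX(date) AS max_date FROM your_table GROUP BY type) b
      ON a.type = b.type AND a.date = b.max_date
"""

# Variant 3: correlated subquery picking each type's latest date.
correlated = """
    SELECT SUM(a.amount)
    FROM your_table a
    WHERE a.date = (SELECT MAX(date) FROM your_table WHERE type = a.type)
"""

totals = [conn.execute(q).fetchone()[0] for q in (anti_join, derived, correlated)]
print(totals)  # each variant should give 50 + 4000 + 1
```

All three rely on the guarantee from the question that a type never has two rows with the same date; with duplicates, each tied row would be counted.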

Related

sql optimization: count all rows through subquery or own query / other improvements

I'm trying to improve my MySQL query. First, I'm trying to optimize this simple one:
SELECT * ,
(
SELECT COUNT(id)
FROM animal
WHERE type = :type AND timestampadopt > 0 AND (date BETWEEN DATE_FORMAT(CURDATE() , '%Y-%m-%d') - INTERVAL 1 YEAR AND DATE_FORMAT(CURDATE(),'%Y-%m-%d'))
) AS countanimals
FROM animal
WHERE type = :type AND timestampadopt > 0 AND (date BETWEEN DATE_FORMAT(CURDATE() , '%Y-%m-%d') - INTERVAL 1 YEAR AND DATE_FORMAT(CURDATE(),'%Y-%m-%d'))
ORDER BY timestamp DESC
LIMIT 1, 20;
COLUMNS:
id | timestampadd | timestampadopt | dateborn | animaltype | gender | chipped | smalldescger | smalldesceng | imagepath
On the affected page I loop over all animals, with pagination: you see 20 animals, and for the next 20 you use the next button.
For the pagination I need to know how many pages have to be displayed, so I have to count how many animals there are in total; that is what the subquery does.
I measured the times with profiling and got the following results:
0.0047s for the total query,
0.0023s for the subquery
In the database are only 5 rows!
On that page I offer some filters, such as age +/- 1 year and whether the animal is already adopted. Because of that I need the WHERE clause on both queries, which probably costs the most performance, followed by the ORDER BY clause, which is necessary to display the newest animals first.
P.S. I need all columns from the table. I did some testing, and SELECT * had the same runtime as selecting all 10 columns manually, as some people recommend.
EDIT:
Would it be worth moving the smalltext (varchar 250) and imagepath (varchar 50) columns into their own table and inner joining them? The other columns I will probably need for later filters, and type, gender, and chipped are tinyints.
Any improvement tips for me?
Should I run the subquery as its own query outside of the main one?
Edit: 31.07
SELECT a.* , c.cnt AS countanimals
FROM animal a
JOIN (
Select a1.date AS date1, a1.tmstmpadopt AS tmstmpadopt1, a1.type AS type1, COUNT(a1.id) as cnt
FROM animal a1
GROUP BY date1, tmstmpadopt1, type1
) c on (a.date = c.date1 AND a.tmstmpadopt = c.tmstmpadopt1 AND a.type = c.type1)
WHERE a.type = 1 AND tmstmpadopt = 0 AND (date BETWEEN DATE_FORMAT(CURDATE() , '%Y-%m-%d') - INTERVAL 100 YEAR AND DATE_FORMAT(CURDATE(),'%Y-%m-%d')- INTERVAL 1 YEAR)
ORDER BY a.timestamp DESC
LIMIT 1, 20;

An inline view may help. Try this:
SELECT a.*,c.cnt AS countanimals
FROM animal a
join (Select a1.dateborn, a1.timestampadopt, count(a1.id) as cnt
from animal a1
Where a1.timestampadopt > 0
and a1.type = :type
group by a1.dateborn, a1.timestampadopt) c on (a.dateborn = c.dateborn and a.timestampadopt = c.timestampadopt)
WHERE a.type = :type
AND a.timestampadopt > 0
AND a.dateborn BETWEEN DATE_FORMAT(CURDATE(),'%Y-%m-%d')-INTERVAL 1 YEAR AND DATE_FORMAT(CURDATE(),'%Y-%m-%d')
ORDER BY a.timestamp DESC
LIMIT 1, 20;

Why not do the count in your script? You can count the rows as you process them.
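One way to move the count out of the main query, as the question asks about, is to run a separate COUNT(*) with the same filter and compute the page count in the script. The sketch below is only an illustration: it assumes SQLite in place of MySQL, a cut-down animal table with 45 made-up rows, and a hypothetical helper fetch_page. Note that in MySQL `LIMIT 1, 20` means offset 1, so it skips the first row; the sketch uses offset 0 for the first page.

```python
import math
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the schema is a reduced,
# hypothetical version of the animal table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE animal ("
    "id INTEGER PRIMARY KEY, type INTEGER, timestampadopt INTEGER, timestampadd INTEGER)"
)
conn.executemany(
    "INSERT INTO animal (type, timestampadopt, timestampadd) VALUES (1, 1, ?)",
    [(i,) for i in range(45)],
)

PAGE_SIZE = 20
FILTER = "type = ? AND timestampadopt > 0"  # shared by both queries


def fetch_page(page, animal_type=1):
    """Return (rows, page_count) for a 1-based page number."""
    # The count runs once per request instead of once per returned row.
    total = conn.execute(
        f"SELECT COUNT(*) FROM animal WHERE {FILTER}", (animal_type,)
    ).fetchone()[0]
    offset = (page - 1) * PAGE_SIZE  # page 1 starts at offset 0
    rows = conn.execute(
        f"SELECT * FROM animal WHERE {FILTER} "
        "ORDER BY timestampadd DESC LIMIT ? OFFSET ?",
        (animal_type, PAGE_SIZE, offset),
    ).fetchall()
    return rows, math.ceil(total / PAGE_SIZE)


rows, page_count = fetch_page(3)
print(len(rows), page_count)
```

Counting the fetched rows in the script would only see the current page, which is why this sketch issues the count as its own query.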

sql basic join table

ID Date Spend
1 01/01/1990 $x1
2 01/01/1990 $x2
2 01/03/1990 $x3
I'm a SQL beginner; could someone help me solve this question? I'd appreciate it!
If we only consider dates after the year 2000, how can we find the ID with the tenth-highest spend using a join?
Using my basic SQL knowledge, this is what I have coded:
select ID, SUM(Spend)
From(
select ID,SUM(Spend)
from table A, table B
WHERE A.ID=B.ID
AND Date => 01/01/2000;
)
Order by SUM(Spend) DESC
LIMIT 10;

Here's an example:
select a.ID
, SUM(Spend) as SumSpend
from tableA a
join tableB b
on a.ID = b.ID
where '2000-01-01' <= date
group by
a.ID
order by
SumSpend desc
limit 1
offset 2
group by tells MySQL which groups of rows to sum
You can use limit 1 offset 9 to get the tenth row
If you select a column that is in two tables, you have to specify which table (a.ID above)
In MySQL a date literal is written as a string in single quotes, such as '2000-01-01'
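The notes above can be tried out with a small sketch. It assumes SQLite in place of MySQL and a hypothetical layout in which tableA holds the IDs and tableB holds the dated spend rows; the data is made up so that ID i spends 20 * i in total after 2000.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (ID INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE tableB (ID INTEGER, Date TEXT, Spend INTEGER)")

for i in range(1, 13):
    conn.execute("INSERT INTO tableA VALUES (?)", (i,))
    conn.executemany(
        "INSERT INTO tableB VALUES (?, ?, ?)",
        [
            (i, "2001-01-01", i * 10),  # counted
            (i, "2001-06-01", i * 10),  # counted, so ID i totals 20 * i
            (i, "1990-01-01", 1000),    # excluded by the date filter
        ],
    )

tenth = conn.execute(
    """
    SELECT a.ID, SUM(b.Spend) AS SumSpend
    FROM tableA a
    JOIN tableB b ON a.ID = b.ID
    WHERE b.Date >= '2000-01-01'
    GROUP BY a.ID
    ORDER BY SumSpend DESC
    LIMIT 1 OFFSET 9          -- skip nine rows, return the tenth
    """
).fetchone()
print(tenth)
```

With twelve IDs ranked from 12 down to 1, skipping nine rows lands on ID 3.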

Joining two tables by date MySQL

I have this:
SELECT * FROM history JOIN value WHERE history.the_date >= value.the_date
Is it possible to phrase this so that history.the_date is greater than or equal to the largest possible value of value.the_date?
HISTORY
the_date amount
2014-02-27 200
2015-02-26 2000
VALUE
the_date interest
2010-02-10 2
2015-01-01 3
I need to pair the correct interest with the amount!

So value.the_date is the date since when the interest is valid. Interest 2 was valid from 2010-02-10 till 2014-12-31, because since 2015-01-01 the new interest 3 applies.
To get the current interest for a date you'd use a subquery where you select all interest records with a valid-from date up to then and only keep the latest:
select
the_date,
amount,
(
select v.interest
from value v
where v.the_date <= h.the_date
order by v.the_date desc
limit 1
) as interest
from history h;
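Running that query against the question's sample rows shows the pairing. This is a minimal sketch assuming SQLite as a stand-in for MySQL, with dates stored as ISO text; an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; dates are ISO-formatted text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (the_date TEXT, amount INTEGER)")
conn.execute("CREATE TABLE value (the_date TEXT, interest INTEGER)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?)",
    [("2014-02-27", 200), ("2015-02-26", 2000)],
)
conn.executemany(
    "INSERT INTO value VALUES (?, ?)",
    [("2010-02-10", 2), ("2015-01-01", 3)],
)

paired = conn.execute(
    """
    SELECT h.the_date,
           h.amount,
           (SELECT v.interest
              FROM value v
             WHERE v.the_date <= h.the_date   -- rates already valid on that date
             ORDER BY v.the_date DESC         -- newest valid rate first
             LIMIT 1) AS interest
    FROM history h
    ORDER BY h.the_date
    """
).fetchall()
print(paired)
```

The 2014 amount picks up interest 2 and the 2015 amount picks up interest 3, since the new rate applies from 2015-01-01.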

Put the join condition after ON, not in the WHERE clause:
SELECT * FROM history JOIN (select max(value.the_date) as d from value) as x on history.the_date >= x.d
WHERE 1=1

Presumably, you want this:
select h.*
from history h
where h.the_date >= (select max(v.the_date) from value v);

Get greatest common value in a column across tables

I have 4 tables (say A, B, C and D) all with the column 'date'. I need to find the greatest common date value across all four tables. That is, the greatest value of date that exists in all four tables. How can I do this?
For now, I'm making do with finding the MIN of the MAX date values of all four tables, but this fails in the cases where the MIN exists in one table but not in the second.
Here is an example to make things clearer :
A.date
------
2015-03-31
2015-03-30
2015-03-29
2015-03-27
B.date
------
2015-03-30
2015-03-29
2015-03-28
2015-03-27
C.date
------
2015-03-29
2015-03-27
2015-03-26
2015-03-25
D.date
------
2015-03-28
2015-03-27
2015-03-26
2015-03-25
What I was doing to find the highest common date was :
SELECT MIN(max_date) FROM (
SELECT MAX(date) AS max_date FROM A
UNION
SELECT MAX(date) AS max_date FROM B
UNION
SELECT MAX(date) AS max_date FROM C
UNION
SELECT MAX(date) AS max_date FROM D
) T;
This gives me 2015-03-28, but then I realized that some tables might not have this date at all. The date I actually want to get is 2015-03-27.

Here is one method:
select date
from (select date, 'a' as which from a union all
select date, 'b' as which from b union all
select date, 'c' as which from c union all
select date, 'd' as which from d
) x
group by date
having count(distinct which) = 4
order by date desc
limit 1;
The following version might perform a bit better, especially if you have an index on date in each table:
select date
from (select distinct date, 'a' as which from a union all
select distinct date, 'b' as which from b union all
select distinct date, 'c' as which from c union all
select distinct date, 'd' as which from d
) x
group by date
having count(*) = 4
order by date desc
limit 1;
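The first method can be checked against the question's sample dates with a short sketch. It assumes SQLite in place of MySQL and stores the dates as ISO text; the lowercase table names a to d follow the answer.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; dates are ISO-formatted text.
conn = sqlite3.connect(":memory:")
sample = {
    "a": ["2015-03-31", "2015-03-30", "2015-03-29", "2015-03-27"],
    "b": ["2015-03-30", "2015-03-29", "2015-03-28", "2015-03-27"],
    "c": ["2015-03-29", "2015-03-27", "2015-03-26", "2015-03-25"],
    "d": ["2015-03-28", "2015-03-27", "2015-03-26", "2015-03-25"],
}
for name, dates in sample.items():
    conn.execute(f"CREATE TABLE {name} (date TEXT)")
    conn.executemany(f"INSERT INTO {name} VALUES (?)", [(d,) for d in dates])

greatest_common = conn.execute(
    """
    SELECT date
    FROM (SELECT date, 'a' AS which FROM a UNION ALL
          SELECT date, 'b' AS which FROM b UNION ALL
          SELECT date, 'c' AS which FROM c UNION ALL
          SELECT date, 'd' AS which FROM d
         ) x
    GROUP BY date
    HAVING COUNT(DISTINCT which) = 4   -- present in all four tables
    ORDER BY date DESC
    LIMIT 1
    """
).fetchone()[0]
print(greatest_common)
```

2015-03-29 appears in only three of the tables, so the query falls through to 2015-03-27, the date the question asks for.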

You need to get an intersection of all date values across the 4 separate tables. Then, select the MAX of these values:
SELECT MAX(date)
FROM A
WHERE date IN (
SELECT date
FROM B
WHERE date IN (
SELECT date
FROM C
WHERE date IN (
SELECT date
FROM D)))

mysql : Get latest value and sum of values from previous hour

I would like to return a product together with its latest value and values from last hour.
I have a product-table :
id, name, type (and so on)...
I have a values-table :
id_prod, timestamp, value
Something like :
12:00:00 = 10
12:15:00 = 10
12:30:00 = 10
12:45:00 = 10
13:00:00 = 10
13:15:00 = 10
13:30:00 = 10
I would like a query that returns the latest value (13:30:00) together with the sum of values one hour back. This should return:
time = 13:30:00
latestread = 10
lasthour = 40
What I almost got working was:
SELECT *,
(SELECT value FROM values S WHERE id_prod=P.id
ORDER BY timestamp DESC LIMIT 1) as latestread,
(SELECT sum(value) FROM values WHERE id_prod=D.id and
date_created>SUBTIME(S.date_created,'01:00:00')) as trendread
FROM prod P ORDER BY name
But this fails with "Unknown column 'S.date_created' in 'where clause'"
Any suggestions?

If I understand correctly what you're trying to do, then you would have something like:
SELECT p.id, max(v.date_created), sum(v.value), mv.max_value
FROM product p
JOIN values v on p.id = v.product_id
JOIN (SELECT product_id, value as max_value
FROM values v2
WHERE date_created = (SELECT max(date_created) FROM values WHERE product_id=v2.product_id)) mv on mv.product_id=p.id
WHERE v.date_created between DATE_SUB(now(), INTERVAL 1 HOUR) and now()
GROUP BY p.id, mv.max_value
ORDER BY p.id

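Aleks G and mhasan gave solutions, but not the reason why this fails. It fails because the alias S is declared inside the first subquery, so it is not known inside the second one. A subquery can reference tables of the query that encloses it (as P.id does here), but it has no knowledge of aliases declared in a sibling subquery.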

You missed providing an alias for the values table in the second subquery:
SELECT *,
(SELECT value FROM values S WHERE id_prod=P.id
ORDER BY timestamp DESC LIMIT 1) as latestread,
(SELECT sum(value) FROM values S WHERE id_prod=P.id and
date_created>SUBTIME(S.date_created,'01:00:00')) as trendread
FROM prod P ORDER BY name

I think this is the query that you are trying to write:
SELECT p.*,
(SELECT v.value
FROM values v
WHERE v.id_prod = p.id
ORDER BY v.timestamp DESC
LIMIT 1
) as latestread,
(SELECT sum(v.value)
FROM values v
WHERE v.id_prod = p.id and
v.timestamp > SUBTIME(now(), '01:00:00')
) as trendread
FROM prod p
ORDER BY p.name;
This changes all the aliases to be abbreviations for the table name. It also fixes the expression for the last hour by using now() and gets rid of date_created which doesn't seem to be in either table based on the question. The query conveniently assumes that timestamp is a datetime. If it is a unix timestamp, then somewhat different time logic is necessary.
This should be reasonably efficient with an index on values(id_prod, timestamp, value).
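The question's expected output (lasthour = 40) measures the hour back from the latest reading rather than from now(). A sketch of that variation follows; it is not the query above. It assumes SQLite in place of MySQL, a hypothetical readings table with a ts column (standing in for values and timestamp), a made-up date, and SQLite's datetime(..., '-1 hour') where MySQL would use SUBTIME or DATE_SUB.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; "readings" and "ts" are
# hypothetical names, and the date 2020-01-01 is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prod (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE readings (id_prod INTEGER, ts TEXT, value INTEGER)")
conn.execute("INSERT INTO prod VALUES (1, 'widget')")

times = ["12:00:00", "12:15:00", "12:30:00", "12:45:00",
         "13:00:00", "13:15:00", "13:30:00"]
conn.executemany(
    "INSERT INTO readings VALUES (1, ?, 10)",
    [("2020-01-01 " + t,) for t in times],
)

row = conn.execute(
    """
    SELECT p.name,
           (SELECT MAX(r.ts) FROM readings r
             WHERE r.id_prod = p.id) AS latest_ts,
           (SELECT r.value FROM readings r
             WHERE r.id_prod = p.id
             ORDER BY r.ts DESC LIMIT 1) AS latestread,
           (SELECT SUM(r.value) FROM readings r
             WHERE r.id_prod = p.id
               -- window starts one hour before this product's latest reading
               AND r.ts > (SELECT datetime(MAX(r2.ts), '-1 hour')
                             FROM readings r2
                            WHERE r2.id_prod = p.id)) AS lasthour
    FROM prod p
    ORDER BY p.name
    """
).fetchone()
print(row)
```

The strict comparison excludes the 12:30:00 reading, leaving the four readings from 12:45:00 to 13:30:00.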
