Find MAX of a summed value on multiple group by clauses - mysql

As the title indicates, I am trying to find the maximum summed value in column C for an object in column A based on a subset of column B over a period of time (let's say column D). My current query looks something like this in which I return the summed values greater than 10,000.
select id_a, id_b, sum(column_c) from master_table where id_b in (1,2,3,4,5)
and ymdh >= '2017-11-01' group by 1,2 having sum(column_c) > 10000 order by 2,3
desc;
What I'm trying to get returned is the greatest value from sum(column_c). I tried using both the max() and distinct() functions, specifically max(sum(imps)), but aggregate function calls may not be nested. Would anyone be able to provide guidance here?

You can use a subquery in the FROM clause, i.e. FROM ( select ... ) T:
select max(my_sum)
from (
select id_a
, id_b
, sum(column_c) my_sum
from master_table
where id_b in (1,2,3,4,5)
and ymdh >= '2017-11-01'
group by 1,2 having my_sum > 10000
) T;

Does this do what you want?
select id_a, id_b, sum(column_c)
from master_table
where id_b in (1,2,3,4,5) and
ymdh >= '2017-11-01'
group by id_a, id_b
having sum(column_c) > 10000
order by sum(column_c) desc
limit 1;
That is, use order by and limit to get the value you want. (This query includes the group by keys as well, but that is not necessary.)

scaisEdge has the answer (and my +1) - but I just wanted to add a bit about the thought process when designing an SQL statement like you're working on.
Don't feel you need to compose the whole thing - that it's one big statement, or that it's one single query.
Instead, you'll often need to break up the problem into steps, solve the individual steps, and then use those steps as sources for a query - because you don't have to use tables in the FROM clause; you can use your own subqueries instead.
So for this problem? You've got the first step done - you figured out how to write the query that gets the Sum over a particular grouping:
select someCol, sum(otherCol) as groupSum from myTable
group by someCol
Great! Now, you can effectively use this like it's a table:
select someCol, groupSum
from
(
select someCol, sum(otherCol) as groupSum from myTable
group by someCol
) mySubquery
And in your case, you want to get the maximum sum?
select max(groupSum)
from
(
select someCol, sum(otherCol) as groupSum from myTable
group by someCol
) mySubquery
Not only will this help while composing the full SQL statement, it'll actually help the person trying to read/debug it down the line, especially if you name your subqueries/columns well:
select max(totalHitsForWeek) as maxWeeklyUsage
from
(
select week, sum(hits) as totalHitsForWeek
from requestsTable
group by week
) hitsPerWeekSubquery
Hope that helps add to scaisEdge's answer! :-)
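For anyone who wants to try both suggestions side by side, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; the column names follow the question.
rows = [
    ("a1", 1, "2017-11-02", 6000),
    ("a1", 1, "2017-11-03", 7000),   # a1/1 sums to 13000
    ("a2", 2, "2017-11-02", 20000),  # a2/2 sums to 20000
    ("a3", 3, "2017-11-05", 500),    # below the 10000 threshold
    ("a2", 2, "2017-10-01", 90000),  # before the date cutoff, ignored
    ("a4", 9, "2017-11-02", 99999),  # id_b not in the list, ignored
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table master_table (id_a text, id_b integer, ymdh text, column_c integer)"
)
conn.executemany("insert into master_table values (?, ?, ?, ?)", rows)

# Step 1: the grouped query from the question, with the sum given a name.
grouped = """
    select id_a, id_b, sum(column_c) as my_sum
    from master_table
    where id_b in (1, 2, 3, 4, 5) and ymdh >= '2017-11-01'
    group by id_a, id_b
    having sum(column_c) > 10000
"""

# Approach 1: use the grouped query as a derived table and take max() of the sums.
max_sum = conn.execute(f"select max(my_sum) from ({grouped}) t").fetchone()[0]

# Approach 2: sort the grouped rows and keep only the top one.
top_row = conn.execute(f"{grouped} order by my_sum desc limit 1").fetchone()

print(max_sum)
print(top_row)
```

The derived-table form returns only the number, while the order by / limit form also tells you which id_a / id_b pair produced it.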

Related

SQL code in order to select second value after first one

Help needed. Could someone help me write a query that returns only the second IncurredAmount value (the one after the first) for each policyid?
SELECT claims.claimid, claims.policyid, claims.IncurredAmount
FROM claims
GROUP BY claims.claimid, claims.policyid, claims.IncurredAmount
HAVING (((claims.policyid)=62));
That's what I have. I restricted it to one policyid (62) in order to have fewer entries, but that is where I got stuck. I have no clue which clause can be used to return only the second entry for every policy.
Try this, though whether it will work depends on the version of your database:
SELECT claimid, policyid, IncurredAmount
FROM (
SELECT *,
row_number() over (partition by policyid order by claimid) rn
FROM claims
) t
WHERE t.rn = 2
A solution also exists for older MySQL versions (pre-8.0):
select *
from claims t
where exists (
select 1
from claims t2
where t2.policyid = t.policyid
and t2.claimid <= t.claimid
having count(distinct t2.claimid) = 2
)
order by policyid, claimid
Note that this behaves more like DENSE_RANK than ROW_NUMBER.
I.e. if several rows share the second-lowest claimid for a policy, all of them are returned.
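The counting idea behind the pre-8.0 query can be tried out with a small sketch. This one uses Python's sqlite3 module rather than MySQL, the claim rows are invented, and the correlated count is written as a scalar subquery compared to 2 (an equivalent formulation I chose for portability, not the exact query above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table claims (claimid integer, policyid integer, IncurredAmount real)"
)
# Hypothetical rows: policy 62 has three claims, 63 has one, 64 has two.
conn.executemany(
    "insert into claims values (?, ?, ?)",
    [
        (10, 62, 100.0),
        (11, 62, 250.0),
        (12, 62, 75.0),
        (20, 63, 40.0),
        (30, 64, 500.0),
        (31, 64, 900.0),
    ],
)

# A row is "second" when exactly two distinct claimids of the same policy
# are less than or equal to its own claimid.
second_claims = conn.execute(
    """
    select t.claimid, t.policyid, t.IncurredAmount
    from claims t
    where (
        select count(distinct t2.claimid)
        from claims t2
        where t2.policyid = t.policyid
          and t2.claimid <= t.claimid
    ) = 2
    order by t.policyid, t.claimid
    """
).fetchall()

print(second_claims)
```

Policies with only one claim simply drop out of the result, which is usually what you want here.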

Adding Serial Number in mysql successfully but unknown column when use in where clause

I have a table from which I fetch results ordered by wins, descending. But I need an id for a Load More function.
I have successfully added a serial number, and for the Load More function I want to use that serial number to fetch the next results without repeating earlier ones. But when I use it in the WHERE clause I get an "unknown column" error.
Any help would be appreciated.
I have removed my original query now that I have posted my answer.
Try this:
select * from (
SELECT rank()over(order by pic_wins desc) as serial_number,
pic_id,
pic_caption,
pic_image,
pic_wins
FROM pics
WHERE pic_status = '1'
order by pic_wins desc
) as foo
where serial_number=1;
You can't use a column alias in the WHERE clause of the same query, but you can use it from an outer query when the alias is defined in a subquery.
And if you want a continuous series in serial_number, use dense_rank().
Thanks, everyone. I got my answer by slightly modifying @dgnk's answer:
select * from (
SELECT @s:=@s+1 serial_number, pic_id, pic_caption, pic_image, pic_wins
FROM pics, (SELECT @s:= 0) AS s
WHERE pic_status = '1'
order by pic_wins desc
) as foo
where serial_number > 8

SELECT count of subquery before applying LIMIT (clickhouse)

I have a subquery that aggregates some UNION ALL selects. Over that I prepare the SELECT to create cross-tab and limit it to let's say 20. I would like to be able to retrieve the total COUNT of sub query results before I am limiting them in main query. This is for the purpose of trying to build a pagination that receives the total number of records and then the specific page record grid.
Sample query:
SELECT
name,
sumIf(metric_value, metric_name = 'data') AS data,
sumif(....
FROM
(SELECT
name, metric_name, SUM(metric_value) as metric_value
FROM
(SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table2
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table3
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
.
.
.)
GROUP BY
name, metric_name)
GROUP BY
name
ORDER BY
name ASC
LIMIT 0,20;
The first subselect returns tons of data, so I thought I could count it and return the count as one column value or row, which would propagate to the main select that limits the output to 20 results. I need to know the size of the entire result set, but I don't want to run the same query twice (once without the limit and once with it) just to get the COUNT. There are at least 12 UNION ALL third-level subselects, so why waste resources? I am open to generic SQL solutions, not necessarily specific to ClickHouse.
I was thinking of using count(*) OVER (), however that is not supported, so if that's the only option I know I need to run the query twice.
The first thing to mention is that usually nobody is interested in the exact number of pages of a query result. It can easily be estimated, and almost no one will care how exact the estimate is. However, if you have a link to the last page in your GUI, people will often click it just to see whether it works.
Nevertheless, there are cases when an analyst has to visit all the pages, and then the GUI should display the exact amount of work. The good news is that in this latter case a better strategy is to cache a snapshot of the whole results table, and then counting the rows in that table is no longer a problem.
I mean, it makes sense to discuss with the customers whether they really need it, because unneeded full scans many times per day may affect the database load and the bill.
Anyway, if you still need to estimate the number of rows, you can simplify the query so that it only counts rows. As I understand it, this is something like:
SELECT SUM(cnt) as row_count
FROM (
SELECT COUNT(DISTINCT name) as cnt FROM table1 WHERE date > ...
UNION ALL
SELECT COUNT(DISTINCT name) as cnt FROM table2 WHERE date > ...
...
) as counts;
or, if data is a constant metric name:
SELECT COUNT(DISTINCT name) as row_count
FROM (
SELECT DISTINCT name FROM table1 WHERE date > ...
UNION ALL
SELECT DISTINCT name FROM table2 WHERE date > ...
...
) as names;
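To see how the two counting queries differ, here is a small sketch in Python with sqlite3 (not ClickHouse). The tables table1 and table2 and their rows are hypothetical. My understanding is that the first form counts a name once per table, so it overestimates when the same name appears in several tables, while the second form counts each name once overall, which matches the one-row-per-name output of the paginated query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("table1", "table2"):
    conn.execute(f"create table {table} (name text, date text, data integer)")

# Hypothetical rows: "beta" appears in both tables, "delta" is before the cutoff.
conn.executemany(
    "insert into table1 values (?, ?, ?)",
    [
        ("alpha", "2017-02-01 00:00:00", 1),
        ("beta", "2017-03-01 00:00:00", 2),
        ("alpha", "2017-04-01 00:00:00", 3),
    ],
)
conn.executemany(
    "insert into table2 values (?, ?, ?)",
    [
        ("beta", "2017-02-01 00:00:00", 5),
        ("gamma", "2017-02-01 00:00:00", 7),
        ("delta", "2016-01-01 00:00:00", 9),
    ],
)

cutoff = "2017-01-01 00:00:00"

# Form 1: add up the per-table distinct counts.
per_table_total = conn.execute(
    """
    select sum(cnt) from (
        select count(distinct name) as cnt from table1 where date > :cutoff
        union all
        select count(distinct name) as cnt from table2 where date > :cutoff
    ) as counts
    """,
    {"cutoff": cutoff},
).fetchone()[0]

# Form 2: collect the names first, then count distinct names across all tables.
distinct_total = conn.execute(
    """
    select count(distinct name) from (
        select distinct name from table1 where date > :cutoff
        union all
        select distinct name from table2 where date > :cutoff
    ) as names
    """,
    {"cutoff": cutoff},
).fetchone()[0]

print(per_table_total, distinct_total)
```

If the names never overlap between tables, both forms give the same number and the first one is cheaper.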

SQL Distinct - Get all values

Thanks for looking, I'm trying to get 20 entries from the database randomly and unique, so the same one doesn't appear twice. But I also have a questionGroup field, which should also not appear twice. I want to make that field distinct, but then get the ID of the field selected.
Below is my script, which does NOT work because it applies DISTINCT to the id as well:
SELECT DISTINCT `questionGroup`,`id`
FROM `questions`
WHERE `area`='1'
ORDER BY rand() LIMIT 20
Any advice is greatly appreciated!
Thanks
Try doing the group by/distinct first in a subquery:
select *
from (select distinct `questionGroup`,`id`
from `questions`
where `area`='1'
) qc
order by rand()
limit 20
I see . . . What you want is to select a random row from each group, and then limit it to 20 groups. This is a harder problem. I'm not sure if you can do this accurately with a single query in mysql, not using variables or outside tables.
Here is an approximation:
select *
from (select q.`questionGroup`,
coalesce(max(case when rand()*qg.num < 1 then q.id end), min(q.id)) as id
from `questions` q join
(select questionGroup, count(*) as num
from questions
group by questionGroup
) qg
on qg.questionGroup = q.questionGroup
where q.`area`='1'
group by q.`questionGroup`
) qc
order by rand()
limit 20
This uses rand() to select ids, taking on average about one per grouping, since each row is picked with probability 1/num (but it is random, so sometimes 0, 1, 2, etc.). It chooses the max() of these. If none are picked, it takes the minimum.
This will be slightly biased away from the maximum id (or minimum, if you switch the min's and max's in the equation). For most applications, I'm not sure that this bias would make a big difference. In other databases that support ranking functions, you can solve the problem directly.
Something like this
SELECT DISTINCT *
FROM (
SELECT `questionGroup`,`id`
FROM `questions`
WHERE `area`='1'
ORDER BY rand()
) As q
LIMIT 20
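For databases that do support ranking functions (MySQL 8.0+, or SQLite 3.25+ as used here), the "one random row per group, then 20 random groups" idea can be sketched directly. This example runs on Python's sqlite3 module, so it uses random() where MySQL would use rand(); the question rows are generated purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table questions (id integer primary key, questionGroup integer, area text)"
)

# Hypothetical data: 30 groups with 3 questions each in area '1',
# plus a few rows in another area that must never be picked.
rows = [(g * 10 + k, g, "1") for g in range(1, 31) for k in range(3)]
rows += [(1000 + g, g, "2") for g in range(1, 6)]
conn.executemany("insert into questions values (?, ?, ?)", rows)

picked = conn.execute(
    """
    select questionGroup, id
    from (
        select questionGroup, id,
               -- number the rows of each group in random order
               row_number() over (partition by questionGroup order by random()) as rn
        from questions
        where area = '1'
    ) ranked
    where rn = 1          -- keep one random row per group
    order by random()     -- then shuffle the groups
    limit 20
    """
).fetchall()

print(picked)
```

Each run returns a different selection, but never two rows from the same questionGroup.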

ORDER BY date and time BEFORE GROUP BY name in mysql

I have a table like this:
name | date | time
tom | 2011-07-04 | 01:09:52
tom | 2011-07-04 | 01:09:52
mad | 2011-07-04 | 02:10:53
mad | 2009-06-03 | 00:01:01
I want the oldest name first:
SELECT *
ORDER BY date ASC, time ASC
GROUP BY name
(->doesn't work!)
Now it should give me mad first (it has the earlier date), then tom.
But GROUP BY name ORDER BY date ASC, time ASC gives me the newer mad row first, because it groups before it sorts!
Again: the problem is that I can't sort by date and time before I group, because GROUP BY must come before ORDER BY!
Another method:
SELECT *
FROM (
SELECT * FROM table_name
ORDER BY date ASC, time ASC
) AS sub
GROUP BY name
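GROUP BY groups on the first matching result it hits. If that first matching hit happens to be the one you want then everything should work as expected. Be aware that MySQL does not guarantee which row of a group supplies the non-aggregated columns, and with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5) this query is rejected.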
I prefer this method as the subquery makes logical sense rather than peppering it with other conditions.
As I am not allowed to comment on user1908688's answer, here is a hint for MariaDB users:
SELECT *
FROM (
SELECT * FROM table_name
ORDER BY date ASC, time ASC
LIMIT 18446744073709551615
) AS sub
GROUP BY sub.name
https://mariadb.com/kb/en/mariadb/why-is-order-by-in-a-from-subquery-ignored/
I think this is what you are seeking :
SELECT name, min(date)
FROM myTable
GROUP BY name
ORDER BY min(date)
To take the time into account, you have to build a MySQL datetime via STR_TO_DATE (use CONCAT, since + is numeric addition in MySQL, and %H for 24-hour times):
STR_TO_DATE(CONCAT(date, ' ', time), '%Y-%m-%d %H:%i:%s')
So:
SELECT name, min(STR_TO_DATE(CONCAT(date, ' ', time), '%Y-%m-%d %H:%i:%s'))
FROM myTable
GROUP BY name
ORDER BY min(STR_TO_DATE(CONCAT(date, ' ', time), '%Y-%m-%d %H:%i:%s'))
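Here is a quick way to check the "aggregate, then order by the aggregate" idea against the sample rows from the question. It is only a sketch: it runs on Python's sqlite3 module, where strings are concatenated with || (MySQL would use CONCAT), and it relies on ISO-formatted date and time strings sorting correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table myTable (name text, date text, time text)")
# The four rows shown in the question.
conn.executemany(
    "insert into myTable values (?, ?, ?)",
    [
        ("tom", "2011-07-04", "01:09:52"),
        ("tom", "2011-07-04", "01:09:52"),
        ("mad", "2011-07-04", "02:10:53"),
        ("mad", "2009-06-03", "00:01:01"),
    ],
)

# One row per name, carrying its earliest date+time, sorted oldest first.
oldest_first = conn.execute(
    """
    select name, min(date || ' ' || time) as first_seen
    from myTable
    group by name
    order by first_seen
    """
).fetchall()

print(oldest_first)
```

This gives mad before tom, which is the order the question asks for.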
This worked for me:
SELECT *
FROM your_table
WHERE id IN (
SELECT MAX(id)
FROM your_table
GROUP BY name
);
Use a subselect:
select name, date, time
from mytable main
where timestamp(date, time) = (select min(timestamp(sub.date, sub.time)) from mytable sub where sub.name = main.name)
order by timestamp(date, time);
If you want the max date per name, grouped by name, you can use this query:
SELECT name,MAX(date) FROM table group by name ORDER BY name
where date may be a date or a datetime string. It returns the maximum date for each name.
Another way to solve this would be with a LEFT JOIN, which could be more efficient. I'll first start with an example that considers only the date field, as probably it is more common to store date + time in one datetime column, and I also want to keep the query simple so it's easier to understand.
So, with this particular example, if you want to show the oldest record based on the date column, and assuming that your table name is called people you can use the following query:
SELECT p.* FROM people p
LEFT JOIN people p2 ON p.name = p2.name AND p.date > p2.date
WHERE p2.date is NULL
GROUP BY p.name
What the LEFT JOIN does, is when the p.date column is at its minimum value, there will be no p2.date with a smaller value on the left join and therefore the corresponding p2.date will be NULL. So, by adding WHERE p2.date is NULL, we make sure to show only the records with the oldest date.
And similarly, if you want to show the newest record instead, you can just change the comparison operator in the LEFT JOIN:
SELECT p.* FROM people p
LEFT JOIN people p2 ON p.name = p2.name AND p.date < p2.date
WHERE p2.date is NULL
GROUP BY p.name
Now, for this particular example where date+time are separate columns, you would need to add them in some way if you want to query based on the datetime of two columns combined, for example:
SELECT p.* FROM people p
LEFT JOIN people p2 ON p.name = p2.name AND p.date + INTERVAL TIME_TO_SEC(p.time) SECOND > p2.date + INTERVAL TIME_TO_SEC(p2.time) SECOND
WHERE p2.date is NULL
GROUP BY p.name
You can read more about this (and also see some other ways to accomplish it) on the MySQL manual page "The Rows Holding the Group-wise Maximum of a Certain Column".
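The LEFT JOIN technique can be tried on the question's sample rows with a short sketch. It uses Python's sqlite3 module instead of MySQL, combines date and time by string concatenation (an assumption that works because both are ISO-formatted text), and uses DISTINCT instead of GROUP BY to collapse the duplicate tom rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (name text, date text, time text)")
# The four rows shown in the question.
conn.executemany(
    "insert into people values (?, ?, ?)",
    [
        ("tom", "2011-07-04", "01:09:52"),
        ("tom", "2011-07-04", "01:09:52"),
        ("mad", "2011-07-04", "02:10:53"),
        ("mad", "2009-06-03", "00:01:01"),
    ],
)

# For each row p, look for an older row p2 with the same name.
# Rows for which no older row exists (p2 is all NULL) are the oldest per name.
oldest_rows = conn.execute(
    """
    select distinct p.name, p.date, p.time
    from people p
    left join people p2
      on p.name = p2.name
     and (p.date || ' ' || p.time) > (p2.date || ' ' || p2.time)
    where p2.name is null
    order by p.date, p.time
    """
).fetchall()

print(oldest_rows)
```

Unlike the aggregate approach, this returns the complete oldest row, so any other columns in the table come along unchanged.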
I had a different variation on this question where I only had a single DATETIME field and needed a limit after a group by or distinct after sorting descending based on the datetime field, but this is what helped me:
select distinct (column) from
(select column from database.table
order by date_column DESC) as hist limit 10
In this instance with the split fields, if you can sort on a concat, then you might be able to get away with something like:
select name,date,time from
(select name,date,time from table order by concat(date,' ',time) ASC)
as sorted
Then if you wanted to limit you would simply add your limit statement to the end:
select name,date,time from
(select name,date,time from table order by concat(date,' ',time) ASC)
as sorted limit 10
In Oracle, this works for me:
SELECT name, min(date), min(time)
FROM table_name
GROUP BY name
This works for me in MySQL:
select * from (SELECT number,max(date_added) as datea FROM sms_chat group by number) as sup order by datea desc
This is not an exact answer, but it might help people trying to solve a problem by ordering rows before a GROUP BY in MySQL.
I came to this thread when I wanted to find the latest row (that is, order by date desc, but return only one result per value of a particular column, i.e. group by that column).
Another approach to this kind of problem is to use aggregation.
We can let the query run as usual, sorted ascending, and introduce a new field such as max(doc) as latest_doc, which gives the latest date for each group of the same column.
Now suppose you want the data of some other column for which a max aggregation cannot be used.
In general, to get the data of a particular column you can use the GROUP_CONCAT aggregate with a unique separator that cannot appear in that column, like GROUP_CONCAT(string SEPARATOR ' ') as new_column, and then split/explode the new_column field when you access it.
Again, this might not sound right to everyone. I did it and liked it, because I had already written a few functions and couldn't run subqueries. I am working with the CodeIgniter framework for PHP.
I'm not sure about the complexity either; maybe someone can shed some light on that.
Regards :)