Sort on a WITH ROLLUP - mysql

I can do the following fine:
SELECT provider_id, count(*) cnt
FROM title
GROUP BY provider_id WITH ROLLUP
However, it doesn't seem to support ordering after a rollup:
SELECT provider_id, count(*) cnt
FROM title
GROUP BY provider_id WITH ROLLUP
ORDER BY count(*) DESC
Incorrect usage of CUBE/ROLLUP and ORDER BY
Two questions related to this:
Is ORDER BY simply not allowed in the same SELECT statement as WITH ROLLUP? Or is this specific to MySQL 5.7, which doesn't support it, while other databases do?
Given this constraint, is the only way to do the sort by using a subselect?
SELECT * FROM (<rollup query>) _ ORDER BY cnt DESC
Related, but doesn't really answer the above question (as I already have the query above): https://stackoverflow.com/a/1768565/651174.
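A minimal sketch of the subselect idea from the question, run against SQLite through Python's sqlite3 module purely so it is self-contained. SQLite has no WITH ROLLUP, so the total row is emulated with UNION ALL; that emulation, the sample rows, and the alias r are assumptions for illustration, not MySQL behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE title (id INTEGER PRIMARY KEY, provider_id TEXT)")
conn.executemany(
    "INSERT INTO title (provider_id) VALUES (?)",
    [("a",), ("a",), ("a",), ("b",), ("c",), ("c",)],
)

# Stand-in for the rollup query: per-provider counts plus one grand-total row
# whose provider_id is NULL (the shape WITH ROLLUP produces in MySQL).
rollup_query = """
    SELECT provider_id, COUNT(*) AS cnt FROM title GROUP BY provider_id
    UNION ALL
    SELECT NULL, COUNT(*) FROM title
"""

# Wrap the rollup in a derived table and sort in the outer query.
rows = conn.execute(
    f"SELECT * FROM ({rollup_query}) AS r ORDER BY cnt DESC"
).fetchall()
print(rows)
```

Because the sort happens in the outer query, the total row is ordered by cnt like any other row and no longer sits at the end.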

Related

The last value using GROUP BY

I need to take the last value from the table for each can_id.
So I've tried this SQL query
SELECT com.text, com.can_id
FROM (SELECT * FROM comments ORDER BY id DESC) as com
GROUP BY com.can_id
But changing ASC to DESC in the inner select makes no difference: the outer select groups without regard to that ordering and takes the value with the first id.
This select will be used as a LEFT JOIN in a larger query.
Example:
I need to get the com.text value "text2" (the last one).
If you are on MySql 8, you can use row_number:
SELECT com.text, com.can_id
FROM (SELECT comments.*,
row_number() over (partition by can_id order by id desc) rn
FROM comments) as com
WHERE rn = 1;
If you are on MySql 5.6+, you can (ab)use group_concat:
SELECT SUBSTRING_INDEX(group_concat(text order by id desc), ',', 1),
can_id
FROM comments
GROUP BY can_id;
In any version of MySQL, the following will work:
SELECT c.*
FROM comments c
WHERE c.id = (SELECT MAX(c2.id)
FROM comments c2
WHERE c2.can_id = c.can_id
);
With an index on comments(can_id, id), this should also have the best performance.
This is better than a group by approach because it can make use of an index and is not limited by some internal limitation on intermediate string lengths.
This should have better performance than row_number() because it does not assign a row number to every row only to filter most of them out afterwards.
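The correlated-subquery approach can be sketched as follows. SQLite (via Python's sqlite3) stands in for MySQL here, and the sample rows and the index name are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, can_id INTEGER, text TEXT)"
)
conn.executemany(
    "INSERT INTO comments (id, can_id, text) VALUES (?, ?, ?)",
    [(1, 10, "text1"), (2, 10, "text2"), (3, 20, "hello"), (4, 20, "a, b")],
)
# The composite index suggested above; the index name is hypothetical.
conn.execute("CREATE INDEX comments_can_id_id ON comments (can_id, id)")

# Keep only the row whose id is the maximum id for its can_id.
latest = conn.execute("""
    SELECT c.can_id, c.text
    FROM comments c
    WHERE c.id = (SELECT MAX(c2.id)
                  FROM comments c2
                  WHERE c2.can_id = c.can_id)
    ORDER BY c.can_id
""").fetchall()
print(latest)
```

The last sample row contains a comma on purpose: this form returns the text untouched, whereas splitting a group_concat result on ',' would cut it short.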
The order by clause in the inner select is redundant since it's being used as a table, and tables in a relational database are unordered by nature.
While other databases such as SQL Server will treat it as an error, I guess MySQL simply ignores it.
I think you are looking for something like this:
SELECT text, can_id
FROM comments
ORDER BY id DESC
LIMIT 1
This way you get the text and can_id associated with the highest id value.

mysql/postgres window function limit result without subquery

Is it possible to limit the result of a window function, with partitioning, without a subquery? I'm looking for a solution that works in both MySQL and Postgres.
For example (the join is irrelevant to the point of the question):
select acct.name, we.channel, count(*) as cnt,
max(count(*)) over (partition by name order by count(*) desc) as max_cnt
from web_events we join accounts acct
on we.account_id=acct.id
group by acct.name, we.channel
order by name, max_cnt desc;
The result of this query gives:
I only want to show the first line of each of the window's partition.
For example: lines with cnt: [3M,19],[Abbott Labortories,20]
I tried the following that doesn't work (added limit 1 to the window function):
select acct.name, we.channel, count(*) as cnt,
max(count(*)) over (partition by name order by count(*) desc limit 1) as max_cnt
from web_events we join accounts acct
on we.account_id=acct.id
group by acct.name, we.channel
order by name, max_cnt desc;
You don't actually need a window function here, since the first row's max_cnt will always equal cnt. Instead use DISTINCT ON in combination with the GROUP BY.
From the postgresql documentation
SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the “first row” of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first
SELECT DISTINCT ON(acct.name)
acct.name
, we.channel
, COUNT(*) cnt
FROM web_events we
JOIN accounts acct
ON we.account_id=acct.id
GROUP BY 1, 2
ORDER BY name, cnt DESC;
Here's a quick demo in sqlfiddle. http://sqlfiddle.com/#!17/57694/8
One thing I always got wrong when I first started using DISTINCT ON: the ORDER BY clause must start with the DISTINCT ON expressions. In the above example, the ORDER BY starts with acct.name.
If there is a tie for first position, only one of the tied rows is returned, and which one is non-deterministic. Additional expressions in the ORDER BY can control which row is returned in that case.
Example:
ORDER BY name, cnt DESC, channel = 'direct'
will return the row containing facebook if, for a given account, both facebook and direct yield the same cnt.
However, note that with this approach, it is not possible to return all the rows that are tied for first position, i.e. both rows containing facebook & direct (without using a subquery).
DISTINCT ON may be combined in the same statement with GROUP BYs (above example) and WINDOW FUNCTIONS (example below). The DISTINCT ON clause is logically evaluated just before the LIMIT.
For instance, the following query (however pointless) combines DISTINCT ON with a window function. It returns one row per distinct mxcnt value.
SELECT DISTINCT ON(mxcnt)
acct.name
, we.channel
, COUNT(*) cnt
, MAX(COUNT(*)) OVER (PARTITION BY acct.name) mxcnt
FROM web_events we
JOIN accounts acct
ON we.account_id=acct.id
GROUP BY 1, 2
ORDER BY mxcnt, cnt DESC;
Use a subquery. If you want exactly one row (even if there are ties), then use row_number():
select name, channel, cnt
from (select acct.name, we.channel, count(*) as cnt,
row_number() over (partition by acct.name order by count(*) desc) as seqnum
from web_events we join
accounts acct
on we.account_id = acct.id
group by acct.name, we.channel
) wea
where seqnum = 1
order by name;
You can use rank() if you want multiple rows for an account, in the event of ties.
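To make the difference between row_number() and rank() concrete, here is a small sketch. It assumes SQLite 3.25+ (for window functions) through Python's sqlite3 as a stand-in for MySQL 8 or Postgres, uses invented sample data, and aggregates in an inner derived table before numbering the rows. The channel tie-breaker in the row_number() ordering is my addition to make the result deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE web_events (id INTEGER PRIMARY KEY, account_id INTEGER, channel TEXT);
    INSERT INTO accounts VALUES (1, '3M'), (2, 'Abbott Laboratories');
    INSERT INTO web_events (account_id, channel) VALUES
        (1, 'direct'), (1, 'direct'), (1, 'direct'), (1, 'facebook'),
        (2, 'direct'), (2, 'direct'), (2, 'facebook'), (2, 'facebook');
""")

# Count per (account, channel) first, then number rows within each account
# and keep only the rows numbered 1.
QUERY = """
    SELECT name, channel, cnt
    FROM (SELECT name, channel, cnt, {window} AS seqnum
          FROM (SELECT acct.name AS name, we.channel AS channel, COUNT(*) AS cnt
                FROM web_events we
                JOIN accounts acct ON we.account_id = acct.id
                GROUP BY acct.name, we.channel) AS counts) AS ranked
    WHERE seqnum = 1
    ORDER BY name, channel
"""

# row_number(): exactly one row per account, ties broken by channel name.
one_per_account = conn.execute(QUERY.format(
    window="ROW_NUMBER() OVER (PARTITION BY name ORDER BY cnt DESC, channel)"
)).fetchall()

# rank(): every row tied for the top count is kept.
with_ties = conn.execute(QUERY.format(
    window="RANK() OVER (PARTITION BY name ORDER BY cnt DESC)"
)).fetchall()

print(one_per_account)
print(with_ties)
```

In the sample data the second account has two channels tied at 2 events, so row_number() keeps one of them and rank() keeps both.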

GROUP BY in subquery to get accurate ranking

I'm trying to get the rank of a particular lap time of a specific track owned by a particular user.
There are multiple rows (laps) in this table for a specific user. So I'm trying to GROUP BY as seen in the subquery of FIND_IN_SET.
Right now MySQL (latest version) is complaining that my session_id, user_id, track_id, duration are not aggregated for the GROUP BY.
I don't understand why it's complaining, since the GROUP BY is in a subquery.
session_lap_times schema:
session_id, int
user_id, int
track_id, int
duration, decimal
This is what I've got so far.
SELECT
session_id,
user_id,
track_id,
duration,
FIND_IN_SET( duration,
(SELECT GROUP_CONCAT( duration ORDER BY duration ASC ) FROM
(SELECT user_id,track_id,min(duration)
FROM session_lap_times GROUP BY user_id,track_id) AS aa WHERE track_id=s1.track_id)
) as ranking
FROM session_lap_times s1
WHERE user_id=1
It seems like it's trying to enforce the GROUP BY rules on the parent query as well.
For reference, this is the error I'm getting: http://imgur.com/a/ILufE
Any help is greatly appreciated.
If I'm not mistaken, the problem is here (broken out for clarity):
SELECT user_id,track_id,any_value(duration)
FROM session_lap_times
GROUP BY user_id
The query is probably barfing because track_id is in the select and not in the group by. That means the subselect doesn't stand on its own and makes the whole thing fail.
Try adding track_id to your group by and adjust from there.
You are grouping by user_id, but you do not aggregate anything in the SELECT or HAVING of the following sub-query:
SELECT
user_id,any_value(track_id),any_value(duration)
FROM session_lap_times GROUP BY user_id
You are using GROUP_CONCAT in the wrong context in the following sub-query, because you do not group by any column in the derived table used for ranking:
(SELECT GROUP_CONCAT( duration ORDER BY duration ASC ) FROM
(SELECT user_id,track_id,any_value(duration)
FROM session_lap_times GROUP BY user_id,track_id) AS aa WHERE track_id=s1.track_id)
) as ranking
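For comparison, here is a sketch of one way to get the ranking with a grouped subquery that stands on its own. It deliberately swaps FIND_IN_SET/GROUP_CONCAT for a counting subquery, uses SQLite via Python's sqlite3 rather than MySQL, and runs on invented lap times. Note that every non-aggregated column of the grouped query appears in its GROUP BY and that MIN(duration) is aliased so the outer query can refer to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE session_lap_times (
        session_id INTEGER, user_id INTEGER, track_id INTEGER, duration REAL)
""")
conn.executemany(
    "INSERT INTO session_lap_times VALUES (?, ?, ?, ?)",
    [(1, 1, 100, 62.0), (1, 1, 100, 60.5), (2, 2, 100, 59.0),
     (3, 3, 100, 61.0), (4, 1, 200, 45.0), (5, 2, 200, 47.5)],
)

# best: each user's fastest lap per track.
# ranking: 1 + the number of users with a strictly faster best lap on that track.
ranking = conn.execute("""
    WITH best AS (
        SELECT user_id, track_id, MIN(duration) AS duration
        FROM session_lap_times
        GROUP BY user_id, track_id
    )
    SELECT b.track_id, b.duration,
           1 + (SELECT COUNT(*)
                FROM best o
                WHERE o.track_id = b.track_id
                  AND o.duration < b.duration) AS ranking
    FROM best b
    WHERE b.user_id = 1
    ORDER BY b.track_id
""").fetchall()
print(ranking)
```

Users with the same best time share a rank under this scheme, which may or may not be what you want.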

Do I need inner ORDER BY when there is an outer ORDER BY?

Here is my query:
( SELECT id, table_code, seen, date_time FROM events
WHERE author_id = ? AND seen IS NULL
) UNION
( SELECT id, table_code, seen, date_time FROM events
WHERE author_id = ? AND seen IS NOT NULL
LIMIT 2
) UNION
( SELECT id, table_code, seen, date_time FROM events
WHERE author_id = ?
ORDER BY (seen IS NULL) desc, date_time desc -- inner ORDER BY
LIMIT 15
)
ORDER BY (seen IS NULL) desc, date_time desc; -- outer ORDER BY
As you can see, there is an outer ORDER BY, and one of the subqueries also has its own ORDER BY. I believe the ORDER BY in the subquery is useless because the final result will be sorted by the outer one. Am I right? Or does the inner ORDER BY have any effect on the sorting?
My second question about the query above: in reality I just need id and table_code. I've selected seen and date_time only for the outer ORDER BY. Can I do that better?
You need the inner order by when you have a limit in the query. So, the third subquery is choosing 15 rows based on the order by.
In general, when you have limit, you should be using order by. This is particularly true if you are learning databases. You might seem to get the right answer -- and then be very surprised when it doesn't work at some later point in time. Just because something seems to work doesn't mean that it is guaranteed to work.
The outer order by just sorts all the rows returned by the subqueries.
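A small sketch of that distinction: the inner ORDER BY decides which rows the LIMIT keeps, and the outer ORDER BY only arranges whatever was kept. SQLite via Python's sqlite3 is used as a stand-in, with a cut-down, hypothetical events table and a helper pick_two that exists only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, date_time TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i, f"2020-01-0{i}") for i in range(1, 6)],
)

def pick_two(inner_direction):
    """Keep two rows chosen by the inner sort, then present them oldest first."""
    # inner_direction must be the literal "ASC" or "DESC".
    sql = f"""
        SELECT id FROM (
            SELECT id, date_time FROM events
            ORDER BY date_time {inner_direction}   -- decides WHICH rows survive
            LIMIT 2
        ) AS picked
        ORDER BY date_time ASC                     -- decides the final order only
    """
    return [row[0] for row in conn.execute(sql)]

newest = pick_two("DESC")
oldest = pick_two("ASC")
print(newest, oldest)
```

Both calls use the same outer ORDER BY, yet they return different ids, because the inner sort changed which two rows the LIMIT let through.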

Converting sub-query to join to create a view

I have a nested SQL query :
SELECT *
FROM
(
SELECT *
FROM asset_status
ORDER BY session_id DESC
) tmp
GROUP BY asset_id, workflow_element_id
I would like to create a view from this query but MySQL doesn't seem to allow subqueries in views. How can I convert this to a join?
SQL Server does allow sub-queries in views. What you can't do is SELECT * and GROUP BY a, b.
Have you tried... (I'll assume this isn't your whole query so I'll make the minimum possible changes)
SELECT asset_id, workflow_element_id
FROM
(
SELECT *
FROM asset_status
-- ORDER BY session_id DESC (removed as ineffective in a view)
) tmp
GROUP BY asset_id, workflow_element_id
Also, note that the ORDER BY in the inner query is ineffective (and possibly even disallowed), as the outer query is allowed to re-order its rows (they won't always come back in a different order, but this layout doesn't guarantee the order you seem to want). Even an ORDER BY in the outer query of the view may happen to order your results, but again the optimiser is allowed to re-order them. Unless the ORDER BY is in the query using the view, the order is never absolutely guaranteed:
SELECT * FROM view ORDER BY x
Finally, you tagged this as a LEFT JOIN question. If you have a more complete example of the code, I'm sure someone will suggest an alternative layout. But I'm off out for a few days now. Good luck! :)
There is no need for a subquery, since the inner ORDER BY is not guaranteed to be used at all. You can write:
SELECT DISTINCT asset_id, workflow_element_id
FROM asset_status
If you need to order by session_id, you have to include it in an aggregate (MAX, for example) or in the GROUP BY.
SELECT asset_id, workflow_element_id
FROM asset_status
GROUP BY asset_id, workflow_element_id
ORDER BY MAX(session_id) DESC
According to the MySQL reference manual, you can create views that use sub-queries, but not in the FROM clause (a restriction that applied before MySQL 5.7.7).
Therefore, I think you need to create your view like the following:
select a.*
from asset_status a
join (select asset_id, workflow_element_id, MAX(session_id) session_id
from asset_status
group by asset_id, workflow_element_id) sq
on a.asset_id = sq.asset_id
and a.workflow_element_id = sq.workflow_element_id
and a.session_id = sq.session_id
However, it probably won't perform as well as your original query.
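A sketch of the join-based view, run in SQLite through Python's sqlite3 only so that it is self-contained; whether your MySQL version accepts a derived table inside a view is a separate matter. The view name latest_asset_status, the status column, and the sample rows are hypothetical, and the join here matches on asset_id and workflow_element_id as well as session_id so that rows from different assets sharing a session_id are not cross-matched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE asset_status (
        asset_id INTEGER, workflow_element_id INTEGER,
        session_id INTEGER, status TEXT);
    INSERT INTO asset_status VALUES
        (1, 1, 1, 'queued'), (1, 1, 2, 'done'),
        (1, 2, 2, 'running'), (2, 1, 3, 'failed');

    -- One row per (asset_id, workflow_element_id): the one with the highest session_id.
    CREATE VIEW latest_asset_status AS
    SELECT a.*
    FROM asset_status a
    JOIN (SELECT asset_id, workflow_element_id, MAX(session_id) AS session_id
          FROM asset_status
          GROUP BY asset_id, workflow_element_id) sq
      ON  a.asset_id = sq.asset_id
      AND a.workflow_element_id = sq.workflow_element_id
      AND a.session_id = sq.session_id;
""")

# Ordering is applied by the query that reads the view, not by the view itself.
rows = conn.execute(
    "SELECT * FROM latest_asset_status ORDER BY asset_id, workflow_element_id"
).fetchall()
print(rows)
```

In the sample data two different workflow elements share session_id 2, which is the case where joining on session_id alone would return duplicate rows.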