Limit by different field than order in MySQL statement? - mysql

Okay, let's say I have the following MySQL query:
SELECT table1.*, COUNT(table2.link_id) AS count
FROM table1
LEFT JOIN table2 on (table1.key = table2.link_id)
GROUP BY table1.key
ORDER BY table1.name ASC
LIMIT 20
Simple, right? It returns the table1 info, with the number of times each row is linked in table2.
However, you'll notice that it limits the resulting rows to 20 and sorts them by table1.name. What this does is return the first 20 results in alphabetical order.
What I'm wondering is whether there is a way to limit to the top 20 results based on count in descending order, while still getting those 20 results in alphabetical order. I know I can simply sort the returned array in follow-up code, but I'm wondering if there is a way to do this in a single query.

Use a subselect for the limit, and sort in the outer select:
SELECT * FROM (SELECT table1.*, COUNT(table2.link_id) AS count
FROM table1
LEFT JOIN table2 on (table1.key = table2.link_id)
GROUP BY table1.key
ORDER BY count DESC
LIMIT 20 ) t
ORDER BY name ASC
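A minimal sketch of the same pattern, run against an in-memory SQLite database from Python so it can be tried without a MySQL server. The sample rows, the `id` column name (standing in for `key`), the `link_count` alias and the limit of 2 instead of 20 are all made up for illustration; only the shape of the query follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (link_id INTEGER);
    INSERT INTO table1 VALUES (1, 'delta'), (2, 'alpha'), (3, 'charlie'), (4, 'bravo');
    -- id 1 is linked three times, id 3 twice, id 2 once, id 4 never
    INSERT INTO table2 VALUES (1), (1), (1), (3), (3), (2);
""")

# Inner query: pick the top 2 rows by link count.
# Outer query: re-sort just those rows alphabetically.
rows = conn.execute("""
    SELECT name, link_count FROM (
        SELECT table1.id, table1.name, COUNT(table2.link_id) AS link_count
        FROM table1
        LEFT JOIN table2 ON table1.id = table2.link_id
        GROUP BY table1.id
        ORDER BY link_count DESC
        LIMIT 2
    ) t
    ORDER BY name ASC
""").fetchall()

print(rows)
```

The two most-linked rows are selected first, and only then put into name order, which is the behaviour the question asks for.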

Related

MySQL 5.7 How to do GROUP BY with sorting?

Similar to this issue: MySQL 5.7 group by latest record
I'm not sure how to do this properly in 5.7. It should also allow for a second sort column. This is the query that works in 5.6 and that I'm trying to replicate in 5.7:
SELECT id FROM test
GROUP BY category
ORDER BY sort1 DESC, sort2 DESC
id is not always the highest, so MAX(id) does not work.
Looking into the link above, the solution for single sort should be:
SELECT t1.*
FROM test t1
INNER JOIN (
SELECT category, max(sort) AS sort FROM test GROUP BY category
) t2 ON t2.category = t1.category AND t2.sort = t1.sort
But how would it work with two sort columns?
You are using GROUP BY the wrong way.
Think of GROUP BY as a way to separate rows into different groups. Each group contains one or more rows that share the same value of the GROUP BY column.
Once you have those groups, selecting plain table columns (as in SELECT *) is like picking an arbitrary row from each group. That is rarely useful.
Usually, once we group rows, we want meta information about each group: for example, the number of rows in the group (SELECT COUNT(*)), the sum of a specific column (SELECT SUM(price)), or the MIN, MAX or AVG values.
So in a nutshell, when you use GROUP BY you should use one of the aggregate functions with it; otherwise it's not going to do you any good.
Why don't you have the ORDER BY at your outer query, instead?
SELECT *
FROM (
SELECT 100 AS id, 1 AS category, NULL AS sort
UNION
SELECT 200 AS id, 1 AS category, 2 AS sort
) dt
GROUP BY category
ORDER BY sort DESC;
It seems that when the data was grouped, MySQL took the first row of each group and ignored the ORDER BY ... DESC. In your first query, it ordered descending first and then GROUP BY took the first record, which is 200. And yes, this isn't how GROUP BY should be used; it is meant to be used in conjunction with aggregate functions.
When you select a column in a GROUP BY query that is not one of the columns you are grouping by (i.e., your id), you have no control over the value unless you use an aggregate function. If you want to sort, use MIN or MAX:
SELECT MAX(id), category FROM `test2`
GROUP BY category; -- always returns 200
SELECT MIN(id), category FROM `test2`
GROUP BY category; -- always returns 100
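For the two-column case the question asks about, one possible approach (not taken from the answers above) is an anti-join: keep a row only if no other row in the same category sorts ahead of it on (sort1, sort2). Below is a rough sketch using SQLite from Python as a stand-in for MySQL. The sample data is hypothetical, and the sort columns are assumed to be non-NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (
        id INTEGER PRIMARY KEY, category INTEGER, sort1 INTEGER, sort2 INTEGER
    );
    INSERT INTO test VALUES
        (100, 1, 5, 1),
        (200, 1, 5, 9),   -- ties with id 100 on sort1, wins on sort2
        (300, 1, 2, 7),
        (400, 2, 1, 1),
        (500, 2, 3, 0);   -- highest sort1 in category 2
""")

# A row survives only if no row in its category has a larger sort1,
# or an equal sort1 together with a larger sort2.
rows = conn.execute("""
    SELECT t1.id, t1.category
    FROM test t1
    WHERE NOT EXISTS (
        SELECT 1 FROM test t2
        WHERE t2.category = t1.category
          AND (t2.sort1 > t1.sort1
               OR (t2.sort1 = t1.sort1 AND t2.sort2 > t1.sort2))
    )
    ORDER BY t1.category
""").fetchall()

print(rows)
```

If two rows in a category tie on both sort columns, this returns both of them, so a further tie-breaker (for example on id) would be needed to guarantee a single row per category.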

SELECT all but the last 5 items in MySQL

I'm trying to run a query that will SELECT all but the last 5 items in my table.
I'm currently using the following query to get the last 5 items.
SELECT * FROM articles ORDER BY id DESC LIMIT 5
And I would like another query to get all the other items, so excluding the last 5.
You select the last 5 items by conveniently sorting them in the reverse order.
SELECT * FROM articles ORDER BY id DESC LIMIT 5
LIMIT 5 is, in fact, a short form of LIMIT 0, 5.
You can use the same trick to skip the first 5 items and select the rest of them:
SELECT * FROM articles ORDER BY id DESC LIMIT 5, 1000000
Unfortunately MySQL doesn't provide a way to get all the rows after it skips the first 5 rows. You have to always tell it how many rows to return. I put a big number (1 million) in the query instead.
For both queries, the returned articles will be sorted in the descending order. If you need them in the ascending order you can save the smallest value of id returned by the first query and use it in the second query:
SELECT * FROM articles WHERE id < [put the saved id here] ORDER BY id ASC
There is no need for limit on the second query and you can even sort the records by other columns if you need.
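The two-step approach described above could look roughly like this. It is only a sketch: it uses Python's sqlite3 module in place of a MySQL client, the eight sample articles are invented, and it assumes the table holds at least one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [(i, f"article {i}") for i in range(1, 9)],  # ids 1..8
)

# Step 1: the last 5 items, newest first.
last_five = conn.execute(
    "SELECT id, title FROM articles ORDER BY id DESC LIMIT 5"
).fetchall()

# Step 2: remember the smallest id from step 1 and fetch everything below it.
cutoff = min(row[0] for row in last_five)
the_rest = conn.execute(
    "SELECT id, title FROM articles WHERE id < ? ORDER BY id ASC",
    (cutoff,),
).fetchall()

last_five_ids = [row[0] for row in last_five]
rest_ids = [row[0] for row in the_rest]
print(last_five_ids, rest_ids)
```

Because the second query filters on id instead of using an offset, it needs no LIMIT and can be ordered by any column.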
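LIMIT does not accept a subquery in MySQL, only integer literals or prepared-statement placeholders, so compute the count first and pass it in:
SET @n = (SELECT GREATEST(COUNT(*) - 5, 0) FROM articles);
PREPARE stmt FROM 'SELECT * FROM articles ORDER BY id ASC LIMIT ?';
EXECUTE stmt USING @n;
DEALLOCATE PREPARE stmt;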
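You can also use NOT EXISTS() or NOT IN(), but I'd have to see the column names to adjust the SQL for you. Something like this (the inner LIMIT is wrapped in a derived table because MySQL rejects LIMIT directly inside an IN subquery):
SELECT * FROM articles a
WHERE a.id NOT IN (SELECT id FROM (SELECT id FROM articles ORDER BY id DESC LIMIT 5) AS last5)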
Can also be done with a left join:
SELECT t.* FROM articles t
LEFT JOIN (SELECT id FROM articles ORDER BY id DESC LIMIT 5) s
ON(t.id = s.id)
WHERE s.id is null
Note that if the table's key consists of more than one column (not just the ID column), you have to add the other columns to the ON clause as well.
Try
SELECT * FROM articles a WHERE NOT EXISTS (SELECT 1 FROM (SELECT id FROM articles ORDER BY id DESC LIMIT 5) b WHERE a.id = b.id);
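To see the single-query anti-join idea in action, here is a small sketch of the LEFT JOIN ... IS NULL variant, run through SQLite from Python as a stand-in for MySQL. The eight sample rows are made up, and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [(i, f"article {i}") for i in range(1, 9)],  # ids 1..8
)

# The derived table s holds the 5 newest ids; rows of articles that find
# no partner in s (s.id IS NULL) are exactly "all but the last 5".
rows = conn.execute("""
    SELECT t.id
    FROM articles t
    LEFT JOIN (SELECT id FROM articles ORDER BY id DESC LIMIT 5) s
      ON t.id = s.id
    WHERE s.id IS NULL
    ORDER BY t.id ASC
""").fetchall()

remaining_ids = [r[0] for r in rows]
print(remaining_ids)
```

Putting the LIMIT inside a derived table in the FROM clause also sidesteps MySQL's restriction on LIMIT inside IN subqueries.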

MySQL: First select 10 newest rows in descending PK order, then the rest of records alphabetically

I have a simple table USERS:
id | name
----+------
Can you help me with the query that would fetch all rows from the table and:
a) Place 10 rows with highest PK values on top, in id DESC order;
b) Place all remaining rows ordered by name ASC order.
Thank you!
This is a bit of a tricky question. The approach I would take is a join approach. Identify the primary keys for the first group using a join (this is happily fast because you are working with primary keys). Then use the match to that table for the order by:
select t.*
from USERS t left outer join
(select id
from USERS
order by id desc
limit 10
) t10
on t.id = t10.id
order by t10.id desc,
t.name asc;
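Here is a small sketch of the join approach, using SQLite from Python in place of MySQL. The five sample users, the `newest` alias and the limit of 2 (instead of 10) are hypothetical. The extra `newest.id IS NULL` sort key spells out what the query above relies on implicitly, namely that the unmatched rows (NULL `id` from the join) sort after the matched ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users VALUES
        (1, 'erin'), (2, 'bob'), (3, 'dave'), (4, 'alice'), (5, 'carol');
""")

rows = conn.execute("""
    SELECT u.id, u.name
    FROM users u
    LEFT JOIN (SELECT id FROM users ORDER BY id DESC LIMIT 2) newest
      ON u.id = newest.id
    ORDER BY newest.id IS NULL,   -- 0 for the newest rows, 1 for the rest
             newest.id DESC,      -- newest rows by id, highest first
             u.name ASC           -- everything else alphabetically
""").fetchall()

print(rows)
```

The two newest ids come first in descending order, and the remaining rows follow sorted by name.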
First question would be: do you really need this in one single query? I'm really not seeing the use case for such a query to be honest.
It'd be easier to just fetch the 10 biggest ids (storing somewhere the 10th biggest), and then fetch the rest in ascending name order (with a restriction on ids being smaller than the 10th biggest).
Otherwise in a single query, something like this would work, but it doesn't seem very efficient to me (maybe someone will have a better idea).
(
SELECT
id, name
from
USERS
ORDER BY id DESC LIMIT 0,10
)
UNION
(
SELECT
id, name
from
USERS
WHERE
id NOT IN (
SELECT id FROM (SELECT id FROM USERS ORDER BY id DESC LIMIT 0,10) AS top10
)
ORDER BY name ASC
)
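(or maybe with a NOT EXISTS instead of the NOT IN; the inner query would be different). Note that MySQL does not guarantee the row order across the branches of a UNION, and an ORDER BY without a LIMIT inside a parenthesized branch may be ignored, so check the final ordering before relying on it.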

Is this an inefficient query?

Assuming table1 and table2 both have a large number of rows (i.e., several hundred thousand), is the following an inefficient query?
Edit: Order by field added.
SELECT * FROM (
SELECT title, updated FROM table1
UNION
SELECT title, updated FROM table2
) AS query
ORDER BY updated DESC
LIMIT 25
If you absolutely need distinct results, another possibility is to use union all and a group by clause instead:
SELECT title FROM (
SELECT title FROM table1 group by title
UNION ALL
SELECT title FROM table2 group by title
) AS query
group by title
LIMIT 25;
Testing this without the limit clause on an indexed ID column from two tables with ~920K rows each in a test database (at $work) resulted in a bit over a second with the query above and about 17 seconds via a union.
This should be even faster, but I see no ORDER BY, so which 25 records do you actually want?
SELECT * FROM (
(SELECT title FROM table1 LIMIT 25)
UNION
(SELECT title FROM table2 LIMIT 25)
) AS query
LIMIT 25
UNION must make an extra pass to fetch the distinct records, so you should use UNION ALL.
Yes, use order by and limits in the inner queries.
SELECT * FROM (
(SELECT title FROM table1 ORDER BY title ASC LIMIT C)
UNION
(SELECT title FROM table2 ORDER BY title ASC LIMIT C)
) AS query
LIMIT 25
This will only go through C rows instead of N (hundreds of thousands). The ORDER BY is necessary and should be on an indexed column.
C is a heuristic constant that should be tuned according to the domain. If you only expect a few duplicates, C=50-100 is probably ok.
You can also find out this for yourself by using EXPLAIN.
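As a rough check of the "limit inside each branch" idea, the sketch below compares it with the plain union on a tiny data set, using SQLite from Python. The six rows are invented and the limit is 2 instead of 25. SQLite does not accept parenthesized SELECTs inside a UNION, so each branch is wrapped in a derived table here; in MySQL the parenthesized form shown above would be used instead. UNION ALL is used, so duplicates are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (title TEXT, updated INTEGER);
    CREATE TABLE table2 (title TEXT, updated INTEGER);
    INSERT INTO table1 VALUES ('a', 10), ('b', 30), ('c', 50);
    INSERT INTO table2 VALUES ('d', 20), ('e', 40), ('f', 60);
""")

# Plain version: union everything, then sort and limit.
plain = conn.execute("""
    SELECT title, updated FROM (
        SELECT title, updated FROM table1
        UNION ALL
        SELECT title, updated FROM table2
    ) AS q
    ORDER BY updated DESC
    LIMIT 2
""").fetchall()

# Limited version: each branch contributes at most its own top 2 rows.
limited = conn.execute("""
    SELECT title, updated FROM (
        SELECT * FROM (SELECT title, updated FROM table1 ORDER BY updated DESC LIMIT 2)
        UNION ALL
        SELECT * FROM (SELECT title, updated FROM table2 ORDER BY updated DESC LIMIT 2)
    ) AS q
    ORDER BY updated DESC
    LIMIT 2
""").fetchall()

print(plain, limited)
```

With UNION ALL an inner limit equal to the outer limit is enough, since the overall top N can only come from each table's own top N. If duplicates must be removed with UNION, the inner limit has to be larger, which is where the constant C above comes in.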

MySQL query to get the longest texts that occur most often

I have a MySQL table t1 with a text field f1. I have this query to find the top 100 most common values of f1 along with their frequency:
SELECT COUNT(*) AS c, f1 FROM t1 GROUP BY f1 ORDER BY c DESC LIMIT 100;
What I need now is a query to find the longest values of f1 that occur most often. That is, I want to first order the records of the table by frequency (like the query above does), then order them by length, and grab the top 100. I tried doing that with this query, but it doesn't return what I want; it simply returns the records with the longest values of f1 (most of them with only 1 occurrence):
SELECT f1, LENGTH(f1) AS l, COUNT(*) AS c FROM t1 GROUP BY f1, LENGTH(f1) ORDER BY l DESC, c DESC LIMIT 100;
My table has more than 44M records in case that matters.
Thanks.
You said you want to order by the frequency then the length, but you ask for the order by length then frequency. Reverse your ORDER BY clause.
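The effect of swapping the two sort keys can be seen on a toy table. This is only a sketch using SQLite from Python as a stand-in for MySQL, and the sample strings are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (f1 TEXT)")
values = ["aa"] * 3 + ["bbbb"] * 3 + ["cccccccc"] + ["d"] * 2
conn.executemany("INSERT INTO t1 VALUES (?)", [(v,) for v in values])

base = "SELECT f1, LENGTH(f1) AS l, COUNT(*) AS c FROM t1 GROUP BY f1 "

# Original ordering: length first, so a long one-off value wins.
by_length_first = conn.execute(
    base + "ORDER BY l DESC, c DESC LIMIT 100"
).fetchall()

# Reversed ordering: frequency first, length only breaks ties.
by_count_first = conn.execute(
    base + "ORDER BY c DESC, l DESC LIMIT 100"
).fetchall()

print(by_length_first[0])
print(by_count_first[0])
```

With frequency as the leading key, the long value that occurs only once drops to the bottom, and length only decides between values that occur equally often.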