Assuming table1 and table2 both have a large number of rows (ie several hundred thousand), is the following an inefficient query?
Edit: Order by field added.
SELECT * FROM (
SELECT title, updated FROM table1
UNION
SELECT title, updated FROM table2
) AS query
ORDER BY updated DESC
LIMIT 25
If you absolutely need distinct results, another possibility is to use union all and a group by clause instead:
SELECT title FROM (
SELECT title FROM table1 group by title
UNION ALL
SELECT title FROM table2 group by title
) AS query
group by title
LIMIT 25;
Testing this without the limit clause on an indexed ID column from two tables with ~920K rows each in a test database (at $work) resulted in a bit over a second with the query above and about 17 seconds via a union.
this should be even faster - but then I see no ORDER BY so what 25 records do you actually want?
SELECT * FROM (
SELECT title FROM table1 LIMIT 25
UNION
SELECT title FROM table2 LIMIT 25
) AS query
LIMIT 25
UNION must make an extra pass to fetch the distinct records, so you should use UNION ALL.
Yes, use order by and limits in the inner queries.
SELECT * FROM (
(SELECT title FROM table1 ORDER BY title ASC LIMIT C)
UNION
(SELECT title FROM table2 ORDER BY title ASC LIMIT C)
) AS query
LIMIT 25
This will only go through C rows instead of N (hundreds of thousands). The ORDER BY is necessary and should be on an indexed column.
C is a heuristic constant that should be tuned according to the domain. If you only expect a few duplicates, C=50-100 is probably ok.
You can also find out this for yourself by using EXPLAIN.
Related
I want to use a same subquery multiple times into UNION. This subquery is time consumed and I think that using it a lot of times may will be increased the total time of execution.
For example
(SELECT * FROM (SELECT * FROM A INNER JOIN B ... AND SOME COMPLEX WHERE CONDITIONS) as T ORDER BY column1 DESC LIMIT 10)
UNION
(SELECT * FROM (SELECT * FROM A INNER JOIN B ... AND SOME COMPLEX WHERE CONDITIONS) as T ORDER BY column2 DESC LIMIT 10)
UNION
(SELECT * FROM (SELECT * FROM A INNER JOIN B ... AND SOME COMPLEX WHERE CONDITIONS) as T ORDER BY column3 DESC LIMIT 10)
Does the (SELECT * FROM A INNER JOIN B ... AND SOME COMPLEX WHERE CONDITIONS) executed 3 times ?
If mysql is smart enough the internal subquery will be executed only one so I don't need any optimization, but if not I have to use something else to optimize it (like using a temporary table, but I want to avoid it)
Do I have to optimize this query by other syntax ? Any suggestion ?
In practice I want to filter some data from huge records and get some of them in 3 group-sections, each section in different order
Plan A:
A TEMPORARY TABLE cannot be referenced more than once. So, build a permanent table and DROP it when finished. (If you might have multiple connections doing the same thing, it will be a hassle to make sure you are not using the same table name.)
Plan B:
With MySQL 8.0, you can do
WITH T AS ( SELECT ... )
SELECT ... FROM T ORDER BY col1
UNION ...
Plan C:
If it is possible to do this:
SELECT id FROM A
ORDER BY col1 LIMIT 10
You could use that as a 'derived' table inside
(SELECT * FROM A INNER JOIN B ... AND SOME COMPLEX WHERE CONDITIONS)
Something like
SELECT A.*, B.*
FROM ( SELECT id FROM A
ORDER BY col1 LIMIT 10 ) AS x1
JOIN A USING(id)
JOIN B ... AND SOME COMPLEX WHERE CONDITIONS
Similarly for the other two SELECTs, then UNION them together.
Better yet, UNION together the 3 sets of ids, then JOIN to A and B once.
This may have the advantage of dealing with fewer rows.
I'm trying to run a query that will SELECT all but the 5 items in my table.
I'm currently using the following query to get the last 5 items.
SELECT * FROM articles ORDER BY id DESC LIMIT 5
And I would like another query to get all the other items, so excluding the last 5.
You select the last 5 items by conveniently sorting them in the reverse order.
SELECT * FROM articles ORDER BY id DESC LIMIT 5
LIMIT 5 is, in fact, a short form of LIMIT 0, 5.
You can use the same trick to skip the first 5 items and select the rest of them:
SELECT * FROM articles ORDER BY id DESC LIMIT 5, 1000000
Unfortunately MySQL doesn't provide a way to get all the rows after it skips the first 5 rows. You have to always tell it how many rows to return. I put a big number (1 million) in the query instead.
For both queries, the returned articles will be sorted in the descending order. If you need them in the ascending order you can save the smallest value of id returned by the first query and use it in the second query:
SELECT * FROM articles WHERE id < [put the saved id here] ORDER BY id ASC
There is no need for limit on the second query and you can even sort the records by other columns if you need.
You can do it like this:
SELECT * FROM articles
ORDER BY id ASC
LIMIT (SELECT count(*)-5 FROM articles)
You can also use NOT EXISTS() or NOT IN() but I'll have to see the columns names to adjust the sql for you, something like this:
SELECT * FROM articles a
WHERE a.id NOT IN(SELECT id FROM articles ORDER BY id DESC LIMIT 5)
Can also be done with a left join:
SELECT t.* FROM articles t
LEFT JOIN (SELECT id FROM articles ORDER BY id DESC LIMIT 5) s
ON(t.id = s.id)
WHERE s.id is null
Note that if the table has more then one key(the ID column) you have to add it to the relations of the ON clause.
Try
SELECT * FROM articles a NOT EXIST (SELECT * FROM articles b WHERE a.id=b.id ORDER BY id DESC LIMIT 5);
I am trying to get my head around using GROUP_CONCAT within MYSQL.
Basically I have the following table, table1:
id, field1, field2, active
I want to bring back 5 rows within the table but in random order. So I'm using this:
SELECT GROUP_CONCAT(id ORDER BY rand()) FROM table1 WHERE active=1
This behaves as I would expect. I then want to use the output to select the other columns (field1, field2) from the table and display the results.
So I've tried using:
SELECT *
FROM table1
WHERE id IN
(
SELECT GROUP_CONCAT(id ORDER BY rand()) as id FROM table1 WHERE active=1
);
I expected something like the above to work but I cant figure out why it doesn't. It DOES bring back results but not all of them, (i.e.) my table contains 10 rows. 6 rows are set to active=1. Therefore I would expect 6 rows to be returned ... this isn't happening I may get 1,2 or 0.
Additionally if it helps I'd like to limit the number of results returned by the sub-query to 3 but adding LIMIT doesn't seem to have any affect on the results returned.
Thank you in advance for your help
I think this is what you are looking for. This will bring back 5 random active rows.
SELECT *
FROM table1
WHERE active=1
ORDER BY RAND()
LIMIT 5;
why not use this :
SELECT *, GROUP_CONCAT(id ORDER BY rand()) as randoms FROM table1 WHERE active=1
If I understand correctly, you are trying to build a query like this:
select *
from table1
where id in (1,2,3,4,5) -- Just an example
and you are trying to "fill" the in condition with a group_concat() result.
That's not the way to do it.
You only need to specify the subquery in the parenthesis:
select *
from table1
where id in (select id from table1 where active=1)
Notice some additional things:
The order by rand() is irrelevant, because the in () will be evaluated regardless of the order of the values.
In this particular scenario, I would recommend to use a join instead of in.
Using join:
select t1.*
from
table1 as t1
inner join table1 as t2 on t1.id = t2.id
where t2.active=1
It is not clear to me where a LIMIT applies to a UNION
If I have:
SELECT * From table A
where conditions
UNION
SELECT * From table B
where conditions
LIMIT 10
Does the LIMIT 10 apply to the result of the UNION? Or to the select of table B?
What I need is to apply to the result of the UNION
Where is the UNION operator? Anyway the LIMIT of 10 probably applies on the result of the UNION at least in the place you put it.
Straight from the manual:
To use an ORDER BY or LIMIT clause to sort or limit the entire UNION result, parenthesize the individual SELECT statements and place the ORDER BY or LIMIT after the last one. The following example uses both clauses:
(SELECT a FROM t1 WHERE a=10 AND B=1)
UNION
(SELECT a FROM t2 WHERE a=11 AND B=2)
ORDER BY a LIMIT 10;
A statement without parentheses is equivalent to one parenthesized as just shown.
In your case:
SELECT * From table A
where conditions
UNION --assuming the union is here
SELECT * From table B
where conditions
LIMIT 10
The limit will be applied to the result of the union.
I'am trying to understand what causes the following, maybe you could help me:
I have a query like:
select field1,fieldDate from table1
union all
select field1,fieldDate from table2
order by fieldDate desc
and the another one like this:
select field1,field2,fieldDate from table1
union all
select field1,field2,fieldDate from table2
order by fieldDate desc
So basically they are the same with the exception that in the second I retrieve an extra field.
Now, both results come with a diferent ordering, but just for the cases that the dates are exacly the same. For example there are 2 rows (row1,row2) with date 2009-11-25 09:41:55. For query 1 row1 comes before row2 and for query 2 row2 comes before row1.
Does somebody knows why this happens?
Thanks,
Regards
The ordering based on any fields that you don't explicitly order by is undefined, and the optimizer can change the ordering if it thinks that results in a better execution plan. Given two rows with the exact same value in the order by field you can not depend on them being in any particularly order in relation to each other unless you explicitly order by another field with different values.
Can you do this
select * from ( select
field1,field2,fieldDate, 0 as ordercol from table1
union all select
field1,field2,fieldDate, 1 as ordercol from table2) t1
order by fieldDate desc, ordercol asc
Straight from the MySQl manual, to user order by on a union you have to parenthesis the individual tables.
(select field1,fieldDate from table1)
union all
(select field1,fieldDate from table2)
order by fieldDate desc
This is not SQL standards compliant! The code you entered should order the union of both tables but to my surprise MySQL has the above syntax.
The order in which rows with the same fieldDate are returned can differ for each query execution. Usually this order will be the same but you should not count on it. If you want any extra ordering state more order by fields.
EDIT: This answer is wrong: the order by works on the entire union. I'll leave it here to save others the trouble :)
Your order by only works on the second part of the union. You can use a subquery to make the order by work on the entire union:
select field1,field2,fieldDate
from (
select field1,field2,fieldDate
from table1
union all
select field1,field2,fieldDate
from table2
) SubQueryName
order by fieldDate desc