I have 3 tables (e.g. a, b, c) that record activities on different items (e.g. commenting, liking, etc.) along with the time of each activity. I am essentially trying to build a news feed that shows the most recent activities first. I constructed a UNION ALL over all three tables to combine the activities, then a GROUP BY so that activities for the same item are not shown twice, and an ORDER BY time DESC. The feed uses infinite scroll, so the query must also be able to page through the results.
I am wondering if there is any way to optimize this (each table is about 500-900K and growing). Truncated code is shown below.
SELECT time,item_id FROM (
SELECT a.time AS time, a.item_id FROM a
UNION ALL
SELECT b.time AS time, b.item_id FROM b
UNION ALL
SELECT c.time AS time, c.item_id FROM c
) temp
GROUP BY item_id
ORDER BY time DESC
LIMIT 10
The query you've written will create a very large temporary table. You're then sorting by a column in that temporary table. You should try to limit each table, perhaps like this:
SELECT time,item_id FROM (
(SELECT a.time AS time, a.item_id FROM a ORDER BY time DESC LIMIT 10)
UNION ALL
(SELECT b.time AS time, b.item_id FROM b ORDER BY time DESC LIMIT 10)
UNION ALL
(SELECT c.time AS time, c.item_id FROM c ORDER BY time DESC LIMIT 10)
) temp
GROUP BY item_id
ORDER BY time DESC
LIMIT 10
You'll want to make sure time has an index on each table.
I don't really like doing this though, as it may be difficult to "scroll" through the results accurately.
When going to the "next page" you may want to consider adding a WHERE clause like WHERE a/b/c.item_id > num instead of LIMIT offset, length. That will help with the accuracy.
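The paging idea above can be sketched as a keyset ("seek") condition. This sketch pages on time rather than item_id, because the feed is ordered by time; that substitution is an assumption, as are the sample rows, the `fetch_page` helper, and the `FIRST_PAGE` sentinel. SQLite (through Python's sqlite3 module) stands in for MySQL here.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL tables a, b, c.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (time INTEGER, item_id INTEGER);
    CREATE TABLE b (time INTEGER, item_id INTEGER);
    CREATE TABLE c (time INTEGER, item_id INTEGER);
    INSERT INTO a VALUES (10, 1), (40, 2);
    INSERT INTO b VALUES (20, 1), (30, 3);
    INSERT INTO c VALUES (50, 4), (5, 2);
""")

# One row per item with its latest activity time; the HAVING clause keeps
# only items whose latest activity is older than the last row already shown.
FEED_SQL = """
    SELECT MAX(time) AS last_time, item_id FROM (
        SELECT time, item_id FROM a
        UNION ALL
        SELECT time, item_id FROM b
        UNION ALL
        SELECT time, item_id FROM c
    ) AS temp
    GROUP BY item_id
    HAVING MAX(time) < ?
    ORDER BY last_time DESC
    LIMIT ?
"""

FIRST_PAGE = 2**62  # hypothetical sentinel: larger than any stored time


def fetch_page(before_time, page_size):
    """Return up to page_size (last_time, item_id) rows older than before_time."""
    return conn.execute(FEED_SQL, (before_time, page_size)).fetchall()


first_page = fetch_page(FIRST_PAGE, 2)
# The cursor for the next page is the time of the last row on this page.
second_page = fetch_page(first_page[-1][0], 2)
```

This only addresses paging accuracy; it does not shrink the temporary table. It also assumes activity times are unique, since two items sharing the cursor time could be skipped.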
When writing the query, prefix it with EXPLAIN to see how it is being handled. This will give you a better idea of what's happening: Are temporary tables being created? How large are they? Which indexes are being used? etc...
Another approach could be to use a MySQL trigger to populate a single "feed" table.
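A rough sketch of the trigger idea, using SQLite through Python's sqlite3 module as a stand-in for MySQL. The `feed` table, its columns, the index, and the trigger name are all hypothetical. Only table a is wired up; b and c would need the same trigger. MySQL's trigger syntax differs, and there an INSERT ... ON DUPLICATE KEY UPDATE could replace the two statements in the trigger body.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (time INTEGER, item_id INTEGER);

    -- Hypothetical feed table: one row per item with its latest activity time.
    CREATE TABLE feed (item_id INTEGER PRIMARY KEY, last_time INTEGER);
    CREATE INDEX feed_last_time ON feed (last_time);

    CREATE TRIGGER a_after_insert AFTER INSERT ON a
    BEGIN
        -- Create the feed row the first time an item is seen ...
        INSERT OR IGNORE INTO feed (item_id, last_time)
        VALUES (NEW.item_id, NEW.time);
        -- ... and move it forward when a newer activity arrives.
        UPDATE feed SET last_time = NEW.time
        WHERE item_id = NEW.item_id AND last_time < NEW.time;
    END;
""")

conn.executemany("INSERT INTO a VALUES (?, ?)", [(10, 1), (40, 2), (20, 1)])

# The news feed query becomes a plain indexed read on a single table.
feed = conn.execute(
    "SELECT item_id, last_time FROM feed ORDER BY last_time DESC LIMIT 10"
).fetchall()
```

The trade-off is extra work on every insert in exchange for a feed query that needs no UNION or GROUP BY.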
I have a table called ticket_log which has millions of records in it. Each ticket log is for a ticket, so the ticket_log table has a ticket_id column. I need to find out which ticket has the maximum number of logs.
If the table had only a couple of thousand entries, then the following query would have easily worked -
select ticket_id, count(ticket_log_id) as myCount
from ticket_log
group by ticket_id
order by myCount desc limit 1
However, when I try running this on a table that has millions of records in it, the query takes forever. Some optimization techniques suggest that we can add a filter of sorts to the query like where ticket_created > '2014' for example, but that is not an option.
Given this scenario, how can the query be optimized for a very large number of records?
Update: the query took slightly over an hour to run for the table with millions of records.
If you have a tickets table, then the following might be faster:
select ticket_id,
(select count(*)
from ticket_log tl
where t.ticket_id = tl.ticket_id
) as mycount
from tickets t
order by myCount desc
limit 1;
This can take advantage of an index on ticket_log(ticket_id).
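As a minimal sketch of that setup, the following creates the index and runs the correlated-subquery form on a few sample rows. SQLite (through Python's sqlite3 module) stands in for MySQL, and the index name and sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (ticket_id INTEGER PRIMARY KEY);
    CREATE TABLE ticket_log (
        ticket_log_id INTEGER PRIMARY KEY,
        ticket_id INTEGER
    );
    -- The index that lets each per-ticket count be answered from the index.
    CREATE INDEX idx_ticket_log_ticket_id ON ticket_log (ticket_id);

    INSERT INTO tickets VALUES (1), (2), (3);
    INSERT INTO ticket_log (ticket_id) VALUES (1), (2), (2), (3), (2), (3);
""")

# One indexed count per ticket, then keep the ticket with the largest count.
busiest = conn.execute("""
    SELECT t.ticket_id,
           (SELECT COUNT(*)
            FROM ticket_log tl
            WHERE tl.ticket_id = t.ticket_id) AS mycount
    FROM tickets t
    ORDER BY mycount DESC
    LIMIT 1
""").fetchone()
```

Whether this beats the GROUP BY version on millions of rows depends on the data and the optimizer, so it is worth comparing both with EXPLAIN.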
I have a simple table USERS:
id | name
----+------
Can you help me with the query that would fetch all rows from the table and:
a) Place 10 rows with highest PK values on top, in id DESC order;
b) Place all remaining rows ordered by name ASC order.
Thank you!
This is a bit of a tricky question. The approach I would take is a join approach. Identify the primary keys for the first group using a join (this is happily fast because you are working with primary keys). Then use the match to that table for the order by:
select t.*
from USERS t left outer join
(select id
from USERS
order by id desc
limit 10
) t10
on t.id = t10.id
order by t10.id desc,
t.name asc;
First question would be: do you really need this in one single query? I'm really not seeing the use case for such a query to be honest.
It'd be easier to just fetch the 10 biggest ids (storing somewhere the 10th biggest), and then fetch the rest in ascending name order (with a restriction on ids being smaller than the 10th biggest).
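The two-step approach just described could look roughly like this. It is only a sketch: SQLite (through Python's sqlite3 module) stands in for MySQL, and the sample rows and variable names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE USERS (id INTEGER PRIMARY KEY, name TEXT)")

# Sample data: ids 1..13, so ids 13..4 form the top 10 and ids 1..3 remain.
rows = [(1, "zoe"), (2, "adam"), (3, "mia")]
rows += [(i, f"user{i}") for i in range(4, 14)]
conn.executemany("INSERT INTO USERS VALUES (?, ?)", rows)

# Step 1: the 10 rows with the highest ids, in id DESC order.
top = conn.execute(
    "SELECT id, name FROM USERS ORDER BY id DESC LIMIT 10"
).fetchall()

# Step 2: everything below the 10th biggest id, ordered by name.
rest = []
if top:
    tenth_biggest = top[-1][0]
    rest = conn.execute(
        "SELECT id, name FROM USERS WHERE id < ? ORDER BY name ASC",
        (tenth_biggest,),
    ).fetchall()

result = top + rest
```

Both queries can use the primary key, and the ordering of the combined list is decided in application code rather than left to the database.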
Otherwise in a single query, something like this would work, but it doesn't seem very efficient to me (maybe someone will have a better idea).
(
SELECT
id, name
from
USERS
ORDER BY id DESC LIMIT 0,10
)
UNION
(
SELECT
id, name
from
USERS
WHERE
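id NOT IN (
-- one column only; the derived table sidesteps MySQL's restriction on LIMIT directly inside an IN subquery
SELECT id FROM (SELECT id from USERS ORDER BY id DESC LIMIT 0,10) AS top10
)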
ORDER BY name ASC
)
(or maybe with a NOT EXISTS - the inner query will be different - instead of the NOT IN)
Do you think a query like this will create problems when my software runs?
I need to delete everything from the table except the last 2 groups of entries, where a group is the set of rows sharing the same insert time.
delete from tableA WHERE time not in
(
SELECT time FROM
(select distinct time from tableA order by time desc limit 2
) AS tmptable
);
Do you have a better solution? I'm using MySQL 5.5.
I don't see anything wrong with your query, but I prefer using an OUTER JOIN/NULL check (plus it alleviates the need for one of the nested subqueries):
delete a
from tableA a
left join
(
select distinct time
from tableA
order by time desc
limit 2
) b on a.time = b.time
where b.time is null
I want to get the last two rows of a table in one query, as the new data and the previous data.
So far I have:
select tbl.x , tbl2.x as last_x
from tbl left join tbl tbl2 ON tbl.id!= tbl2.id
order by tbl.id desc , tbl2.id desc limit 1
It works fine, but I think it might get slow on a big DB.
Is there any way to make this faster?
A LIMIT should work in a basic subquery, and so the following will possibly be more efficient
SELECT Sub1.x , Sub2.x as last_x
FROM (SELECT x FROM tbl ORDER BY tbl.id DESC LIMIT 1) Sub1
CROSS JOIN (SELECT x FROM tbl ORDER BY tbl.id DESC LIMIT 1, 1) Sub2
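A small check of how `LIMIT offset, count` picks rows in this kind of query. The offset is zero-based, so offset 1 selects the second-newest row. SQLite (through Python's sqlite3 module, which accepts the same `LIMIT offset, count` form) stands in for MySQL, and the three sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, x TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, "first"), (2, "second"), (3, "third")],
)

# Sub1 is the newest row; Sub2 skips one row (offset 1) and takes the next,
# which is the row just before the newest one.
row = conn.execute("""
    SELECT Sub1.x, Sub2.x AS last_x
    FROM (SELECT x FROM tbl ORDER BY id DESC LIMIT 1) Sub1
    CROSS JOIN (SELECT x FROM tbl ORDER BY id DESC LIMIT 1, 1) Sub2
""").fetchone()
```

Each subquery reads at most two rows through the primary key, so the cost should stay flat as the table grows.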
You can look at the execution plan and try to optimize your query, but usually you do this once you face a problem, so you can determine which parts are taking long.
Check this thread too: How to optimise MySQL queries based on EXPLAIN plan
But as I said, I would not try to solve a problem that does not exist yet; I do not actually see a problem with your query.
I had 3 tables that are not identical to each other. For one of my requirements I had to copy all the records from these tables into another table.
That part is okay. My problem is that the records I inserted are now grouped by source table.
Like
first 100 records from table1
second 100 records from table2
third 100 records from table3
What I want to do is mix the record positions, so that if I select the first 100 records there are records from all three tables.
Selecting the data with ORDER BY RAND() is not what I want. I just need to select the data and display it.
Is there any way I can solve this? Thanks
A great post handling several cases, from simple, to gaps, to non-uniform with gaps.
http://jan.kneschke.de/projects/mysql/order-by-rand/
For the most general case, here's how you do it:
SELECT name
FROM random AS r1 JOIN
(SELECT (RAND() *
(SELECT MAX(id)
FROM random)) AS id)
AS r2
WHERE r1.id >= r2.id
ORDER BY r1.id ASC
LIMIT 1
This supposes that the ids are evenly distributed, and that there can be gaps in the id list. See the article for more advanced examples.
If you don't want to query later on with rand() you could create the table by inserting from a union select ordered by rand() in the first place:
INSERT INTO merged (a, b)
SELECT a, b FROM (
SELECT a, b, rand() AS r FROM t1
UNION ALL
SELECT a, b, rand() AS r FROM t2
) AS u ORDER BY r
However, also consider this post I just came across: INSERT INTO SELECT strange order using UNION, perhaps someone can comment.