I need a unique list of parent_thread values ordered by postID descending. postID is always unique, but the parent_thread field is often the same for multiple posts.
So what I need is a list of posts in the order they were replied to.
For example, in the image below I need to disregard posts 400 and 399 because they're repeats. I've got a query that works using a subquery, but it can sometimes take up to 1 second to run, so I was wondering if there is a more efficient way to do this. I've tried GROUP BY and DISTINCT but keep getting the wrong results.
[image of the table]
Here is the query that produces the results I want, but it is often slow.
SELECT `postID`
FROM `posts`
ORDER BY
(
SELECT MAX(`postID`)
FROM `posts` `sub`
WHERE `sub`.`parent_thread` = `posts`.postID
)
DESC
Your subquery is known as a dependent (correlated) subquery. These can make queries very slow because the subquery can be re-run for every row of the outer query.
JOIN to your subquery instead. That way it is run just once, and things will speed up. Try this subquery to generate a list of max post IDs, one for each parent thread.
SELECT MAX(postID) maxPostID, parent_thread
FROM posts
GROUP BY parent_thread
Then use it in your main query like this:
SELECT posts.postID
FROM posts
LEFT JOIN (
SELECT MAX(postID) maxPostID, parent_thread
FROM posts
GROUP BY parent_thread
) m ON posts.postID = m.parent_thread
ORDER BY m.maxPostID DESC
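As a rough check of the join approach above, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module, used here only as a stand-in for MySQL. The sample rows are invented, and the assumption that a thread's first post has parent_thread equal to its own postID is mine, not something the question states.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (postID INTEGER PRIMARY KEY, parent_thread INTEGER)")

# Hypothetical rows: posts 1, 2 and 3 start threads (parent_thread = own postID),
# the others are replies to those threads.
conn.executemany(
    "INSERT INTO posts (postID, parent_thread) VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 3), (398, 2), (399, 1), (400, 1), (401, 3)],
)

rows = conn.execute(
    """
    SELECT posts.postID, m.maxPostID
    FROM posts
    LEFT JOIN (
        SELECT MAX(postID) AS maxPostID, parent_thread
        FROM posts
        GROUP BY parent_thread
    ) m ON posts.postID = m.parent_thread
    ORDER BY m.maxPostID DESC
    """
).fetchall()

# Replies have no matching row in m, so their maxPostID is NULL; keep only the threads.
thread_order = [post_id for post_id, max_post_id in rows if max_post_id is not None]
print(thread_order)
```

With this data, thread 3 comes first because it holds the newest reply (401). If only thread rows are wanted from SQL itself, switching the LEFT JOIN to an inner JOIN would drop the rows that have no match in m.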
I have this query in my program. When I sort by the select count(*) field from the query, it runs very slowly, and I don't know why.
The problem is that ordering by posts_count is much slower than ordering by any other field.
Here's the query:
select `tags`.*, (select count(*) from `posts` inner join `post_tag` on `posts`.`id` = `post_tag`.`post_id` where `tags`.`id` = `post_tag`.`tag_id`) as `posts_count` from `tags` order by `posts_count` asc limit 15 offset 0;
Here's the execution time:
Can someone please help me improve this query? Thank you.
What I expect is that the query can run faster.
SELECT t.*, COUNT(pt.tag_id) AS count  -- counts matched rows only, so a tag with no posts gets 0, not 1
FROM tags AS t
LEFT OUTER JOIN post_tag AS pt ON t.id = pt.tag_id
GROUP BY t.id
ORDER BY count ASC LIMIT 15 OFFSET 0;
You should make sure post_tag has an index starting with the tag_id column. You didn't include your table definition in your question, so I must assume the index is there. If the primary key starts with tag_id, that's okay too.
You don't need to join to posts, if I can assume that every row in post_tag references an existing row in posts. You can get the information you need by joining to post_tag alone.
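To illustrate the join-and-group approach, here is a small sketch using Python's sqlite3 module in place of MySQL. The name column and the sample rows are made up for the example, and the composite primary key starting with tag_id is one possible way to get the index mentioned above. It counts pt.post_id rather than *, so that a tag with no posts is reported as 0.

```python
import sqlite3

# SQLite in memory, standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    -- Primary key starts with tag_id, so lookups by tag_id can use it.
    CREATE TABLE post_tag (post_id INTEGER, tag_id INTEGER, PRIMARY KEY (tag_id, post_id));
    INSERT INTO tags (id, name) VALUES (1, 'sql'), (2, 'php'), (3, 'unused');
    INSERT INTO post_tag (post_id, tag_id) VALUES (10, 1), (11, 1), (12, 1), (10, 2);
    """
)

tag_counts = conn.execute(
    """
    SELECT t.id, t.name, COUNT(pt.post_id) AS posts_count
    FROM tags AS t
    LEFT JOIN post_tag AS pt ON t.id = pt.tag_id
    GROUP BY t.id
    ORDER BY posts_count ASC
    LIMIT 15 OFFSET 0
    """
).fetchall()

for tag_id, name, posts_count in tag_counts:
    print(tag_id, name, posts_count)
```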
Suppose I have two tables, users and posts. posts has the fields userid, postid, etc., and a userid can appear multiple times because one user can write multiple posts. I'm just trying to sort the users table by the number of occurrences of each userid in the posts table. I can get the number of occurrences per user using this:
SELECT userid, COUNT(*)
FROM posts
GROUP BY userid;
I would like to use the values in the COUNT(*) column, maybe by adding them to my other table, because then I could simply do something like this:
SELECT * FROM users
ORDER BY newcolumn ASC;
but I'm having trouble doing that. Or can I do it without having to add an extra column? Hints please. Thanks.
Left join is the key here!
SELECT users.userid,count(posts.userid) AS total_count
FROM users
LEFT JOIN posts on posts.userid = users.userid
GROUP BY users.userid
ORDER BY total_count DESC;
We left join the two tables on userid and count the total number of posts per user using GROUP BY. Finally, we sort by the count and show the results.
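The left join query can be tried out with a sketch like the following, which uses Python's sqlite3 module as a stand-in for MySQL. The name column and the sample rows are hypothetical; the point is that the user without posts still appears, with a count of 0.

```python
import sqlite3

# SQLite in memory, standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (postid INTEGER PRIMARY KEY, userid INTEGER);
    INSERT INTO users (userid, name) VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO posts (postid, userid) VALUES (100, 2), (101, 2), (102, 1);
    """
)

users_by_posts = conn.execute(
    """
    SELECT users.userid, COUNT(posts.userid) AS total_count
    FROM users
    LEFT JOIN posts ON posts.userid = users.userid
    GROUP BY users.userid
    ORDER BY total_count DESC
    """
).fetchall()

print(users_by_posts)
```

COUNT(posts.userid) ignores the NULL produced for a user with no posts, which is why that user ends up with 0 rather than 1.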
Try a left join:
select users.userid, [user fields], count(postid) as posts_count
from users
left join posts on posts.userid = users.userid
group by users.userid, [user fields]
order by posts_count desc;
You want to select users (FROM users) but you want to sort based on criteria in another table (COUNT(*) FROM posts) -- therefore you need to use a JOIN
Off-hand I can't seem to recall if "JOIN" or "RIGHT JOIN" or "FULL JOIN" is what you need if you wanted to get a cartesian product of the tables then group and aggregate on a single field, but I can avoid the need to remember with a subquery (hopefully someone will soon post a smaller and smarter answer):
SELECT users.* FROM users
JOIN (
SELECT userid, COUNT(*) as count
FROM posts
GROUP BY userid
) as subquery ON users.id = subquery.userid
ORDER BY subquery.count
Note: I haven't tested this query, but it looks good to me. Again: hopefully someone will post a better answer soon as I'm not doing my due diligence, but you definitely need a JOIN :)
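Since the subquery version is untested, here is a hedged sketch of how it behaves, run through Python's sqlite3 module instead of MySQL. The sample data is invented, and the users key is assumed to be named userid. A plain JOIN against the grouped subquery leaves out users who have no posts; a LEFT JOIN with COALESCE keeps them with a count of 0.

```python
import sqlite3

# SQLite in memory, standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (userid INTEGER PRIMARY KEY);
    CREATE TABLE posts (postid INTEGER PRIMARY KEY, userid INTEGER);
    INSERT INTO users (userid) VALUES (1), (2), (3);
    INSERT INTO posts (postid, userid) VALUES (100, 2), (101, 2), (102, 1);
    """
)

# Inner join: user 3 has no row in the subquery, so it is dropped.
inner_rows = conn.execute(
    """
    SELECT users.userid
    FROM users
    JOIN (
        SELECT userid, COUNT(*) AS post_count
        FROM posts
        GROUP BY userid
    ) AS sub ON users.userid = sub.userid
    ORDER BY sub.post_count DESC
    """
).fetchall()

# Left join: user 3 is kept, and COALESCE turns the missing count into 0.
left_rows = conn.execute(
    """
    SELECT users.userid, COALESCE(sub.post_count, 0) AS post_count
    FROM users
    LEFT JOIN (
        SELECT userid, COUNT(*) AS post_count
        FROM posts
        GROUP BY userid
    ) AS sub ON users.userid = sub.userid
    ORDER BY post_count DESC
    """
).fetchall()

print(inner_rows)
print(left_rows)
```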
You could add a post_count column to the users table, but you would also have to update that count column every time a user creates a new post and you would have to build that logic into your application.
Otherwise, it looks like the answer from FallAndLearn will get you what you need.
I know this has been discussed many times, but my research did not help me with my problem.
I have a table (InnoDB) with about 3k records. I need to pick one random row with some filters, which I do like this:
select id, title, topic_id
from posts
where id not in
(select post_id from records where user_id='$my_id' and checked='1')
and topic_id='$topic_id' and status='1'
order by RAND() limit 1
This gives me the result I want. The problem is that it takes too much time even with 3k records, and it will get slower as the number of records increases.
I have to find a solution for this. Any suggestions?
Update: Both tables are indexed with id columns.
Instead of using where id not in, I would use a LEFT JOIN:
SELECT p.id,
p.title,
p.topic_id
FROM posts p
LEFT JOIN records r
ON p.id = r.post_id
AND r.user_id='$my_id'
AND r.checked = '1'
WHERE p.topic_id='$topic_id'
AND p.status='1'
AND r.post_id IS NULL
ORDER BY RAND()
LIMIT 1;
With this, you will want an index on posts.id and a composite index on records covering post_id, user_id, and checked.
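The LEFT JOIN ... IS NULL pattern can be sketched as follows with Python's sqlite3 module standing in for MySQL. The sample rows, the index name, and the helper function pick_random_unchecked are all hypothetical. SQLite spells the random function RANDOM() where MySQL uses RAND(), and the sketch passes the user and topic as bound parameters instead of interpolating them into the SQL string.

```python
import sqlite3

# SQLite in memory, standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, topic_id INTEGER, status INTEGER);
    CREATE TABLE records (id INTEGER PRIMARY KEY, post_id INTEGER, user_id INTEGER, checked INTEGER);
    -- Composite index covering the columns used in the join condition.
    CREATE INDEX idx_records_lookup ON records (post_id, user_id, checked);
    INSERT INTO posts (id, title, topic_id, status) VALUES
        (1, 'a', 5, 1), (2, 'b', 5, 1), (3, 'c', 5, 1), (4, 'd', 6, 1), (5, 'e', 5, 0);
    INSERT INTO records (post_id, user_id, checked) VALUES (1, 42, 1), (2, 42, 0), (3, 7, 1);
    """
)


def pick_random_unchecked(conn, user_id, topic_id):
    """Return one random active post in the topic that the user has not checked."""
    return conn.execute(
        """
        SELECT p.id, p.title, p.topic_id
        FROM posts p
        LEFT JOIN records r
               ON p.id = r.post_id AND r.user_id = ? AND r.checked = 1
        WHERE p.topic_id = ? AND p.status = 1 AND r.post_id IS NULL
        ORDER BY RANDOM()
        LIMIT 1
        """,
        (user_id, topic_id),
    ).fetchone()


# For user 42 and topic 5, only posts 2 and 3 qualify in this sample data.
picked_ids = {pick_random_unchecked(conn, 42, 5)[0] for _ in range(50)}
print(picked_ids)
```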
I have a nested subquery that selects a random AlbumID that the selected video is in (videos can be in multiple albums), and the outer query then returns the videos and album information based on that AlbumID.
The problem is that the query is returning mixed results; sometimes it gives me some of the videos from one album, sometimes it gives videos from multiple albums, sometimes it returns nothing.
The outer query works if I specify a specific AlbumID instead of the subquery, and the subquery by itself correctly returns 1 random AlbumID. But put together, it's giving me mixed results. What am I missing? Why is it returning varying amounts of rows, and multiple albums?
I've replicated the issue with test data, you can find the CREATE queries here: http://pastebin.com/raw.php?i=e6HaaSGK
The SELECT SQL:
SELECT
Videos_Demo.VideoID,
VideosInAlbums_Demo.AlbumID
FROM
VideosInAlbums_Demo
LEFT JOIN
Videos_Demo
ON Videos_Demo.VideoID = VideosInAlbums_Demo.VideoID
WHERE
VideosInAlbums_Demo.AlbumID = (
SELECT
AlbumID
FROM
VideosInAlbums_Demo
WHERE
VideoID = '1'
ORDER BY
RAND()
LIMIT 1
)
Try this. Moving the subquery to the JOIN seems to fix the problem. I think the problem has to do with having the subquery in the WHERE clause: there, the subquery and the RAND function are probably being executed for each record, which would explain why the results vary.
SELECT a.AlbumID,
Videos_Demo.VideoID,
VideosInAlbums_Demo.AlbumID
FROM VideosInAlbums_Demo
LEFT JOIN Videos_Demo
ON Videos_Demo.VideoID = VideosInAlbums_Demo.VideoID
JOIN
(
SELECT AlbumID
FROM VideosInAlbums_Demo
WHERE VideoID = '1'
ORDER BY RAND()
LIMIT 1
) AS a ON VideosInAlbums_Demo.AlbumID = a.AlbumID
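The joined form can be exercised with a sketch like this one, which uses Python's sqlite3 module instead of MySQL. The table definitions and rows below are guesses made for illustration (the real CREATE statements are in the pastebin link), RANDOM() replaces MySQL's RAND(), and videos_in_random_album is a hypothetical helper. Because the random album is chosen once in the derived table, every returned row should belong to the same album.

```python
import sqlite3

# SQLite in memory, standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Videos_Demo (VideoID INTEGER PRIMARY KEY);
    CREATE TABLE VideosInAlbums_Demo (VideoID INTEGER, AlbumID INTEGER);
    INSERT INTO Videos_Demo (VideoID) VALUES (1), (2), (3), (4);
    -- Video 1 is in albums 10 and 20; album 30 does not contain it.
    INSERT INTO VideosInAlbums_Demo (VideoID, AlbumID) VALUES
        (1, 10), (2, 10), (1, 20), (3, 20), (4, 20), (4, 30);
    """
)


def videos_in_random_album(conn):
    """Pick one random album containing video 1 and list that album's videos."""
    return conn.execute(
        """
        SELECT Videos_Demo.VideoID, VideosInAlbums_Demo.AlbumID
        FROM VideosInAlbums_Demo
        LEFT JOIN Videos_Demo
               ON Videos_Demo.VideoID = VideosInAlbums_Demo.VideoID
        JOIN (
            SELECT AlbumID
            FROM VideosInAlbums_Demo
            WHERE VideoID = 1
            ORDER BY RANDOM()
            LIMIT 1
        ) AS a ON VideosInAlbums_Demo.AlbumID = a.AlbumID
        """
    ).fetchall()


# Expected contents per album in this sample data.
expected = {10: {1, 2}, 20: {1, 3, 4}}
runs = [videos_in_random_album(conn) for _ in range(30)]
print(runs[0])
```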
I have two tables, one for downloads and one for uploads. They are almost identical, but each has some columns that the other lacks. I want to generate a list of stats for each date for each item in the table.
I use these two queries but have to merge the data in PHP after running them. I would like to run them as a single query instead, where each row would contain the columns from both queries, grouped by date. Sometimes there isn't any download data, only upload data, and in all my previous attempts the row was skipped if log data couldn't be found in both tables.
How do I merge these two queries into one, where it would display data even if it's just available in one of the tables?
SELECT DATE(upload_date_added) as upload_date, SUM(upload_size) as upload_traffic, SUM(upload_files) as upload_files
FROM packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY upload_date
ORDER BY upload_date DESC
SELECT DATE(download_date_added) as download_date, SUM(download_size) as download_traffic, SUM(download_files) as download_files
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY download_date
ORDER BY download_date DESC
I want to get result rows like this:
date, upload_traffic, upload_files, download_traffic, download_files
All help appreciated!
Your two queries can be executed and then combined with the UNION clause, along with an extra field to identify Uploads and Downloads on separate lines:
SELECT
'Uploads' TransmissionType,
DATE(upload_date_added) as TransmissionDate,
SUM(upload_size) as TransmissionTraffic,
SUM(upload_files) as TransmittedFileCount
FROM
packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY TransmissionDate
UNION
SELECT
'Downloads',
DATE(download_date_added),
SUM(download_size),
SUM(download_files)
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY DATE(download_date_added)
-- A single ORDER BY at the end sorts the combined result
ORDER BY TransmissionDate DESC;
Give it a try!
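The UNION approach keeps uploads and downloads on separate rows. If the single row per date from the question is wanted (date, upload_traffic, upload_files, download_traffic, download_files), one possible variation is to pad each side with zero columns, combine them with UNION ALL, and sum per date in an outer query. The sketch below is not taken from any of the answers; it uses Python's sqlite3 module as a stand-in for MySQL, with invented sample rows and a hypothetical alias stat_date.

```python
import sqlite3

# SQLite in memory, standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE packages_uploads (upload_date_added TEXT, upload_size INTEGER, upload_files INTEGER);
    CREATE TABLE packages_downloads (download_date_added TEXT, download_size INTEGER, download_files INTEGER);
    INSERT INTO packages_uploads VALUES
        ('2011-10-27 09:00:00', 100, 1), ('2011-10-27 15:30:00', 50, 2), ('2011-10-28 08:00:00', 10, 1);
    INSERT INTO packages_downloads VALUES
        ('2011-10-27 10:00:00', 7, 1), ('2011-10-29 12:00:00', 20, 4);
    """
)

daily_stats = conn.execute(
    """
    SELECT stat_date,
           SUM(upload_traffic)   AS upload_traffic,
           SUM(upload_files)     AS upload_files,
           SUM(download_traffic) AS download_traffic,
           SUM(download_files)   AS download_files
    FROM (
        -- Upload rows, padded with zeros in the download columns.
        SELECT DATE(upload_date_added) AS stat_date,
               upload_size AS upload_traffic, upload_files AS upload_files,
               0 AS download_traffic, 0 AS download_files
        FROM packages_uploads
        WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
        UNION ALL
        -- Download rows, padded with zeros in the upload columns.
        SELECT DATE(download_date_added), 0, 0, download_size, download_files
        FROM packages_downloads
        WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
    ) AS combined
    GROUP BY stat_date
    ORDER BY stat_date DESC
    """
).fetchall()

for row in daily_stats:
    print(row)
```

Dates that appear in only one table still produce a row, with zeros on the side that has no data.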
What you're asking can only work for rows that have the same add date for upload and download. In this case I think this SQL should work:
SELECT
DATE(u.upload_date_added) as date,
SUM(u.upload_size) as upload_traffic,
SUM(u.upload_files) as upload_files,
SUM(d.download_size) as download_traffic,
SUM(d.download_files) as download_files
FROM
packages_uploads u, packages_downloads d
WHERE u.upload_date_added = d.download_date_added
AND u.upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY date
ORDER BY date DESC
Without knowing the schema it is hard to give an exact answer, so please treat the following as a concept, not a direct answer.
You could try a left join. I'm not sure whether a package table exists, but the following may be food for thought:
SELECT
p.id,
up.date as upload_date,
dwn.date as download_date
FROM
package p
LEFT JOIN package_uploads up ON
( up.package_id = p.id AND up.upload_date = 'etc' )
LEFT JOIN package_downloads dwn ON
( dwn.package_id = p.id AND dwn.download_date = 'etc' )
The above selects all the packages and attempts each join; where a package has no matching row, the joined columns are returned as NULL.
There are a number of ways you can do this. You can join using a primary key and foreign key. In case you do not have a relationship between the tables,
you can use:
LEFT JOIN / LEFT OUTER JOIN
Returns all records from the left table and the matched
records from the right table. The result is NULL from the
right side when there is no match.
RIGHT JOIN / RIGHT OUTER JOIN
Returns all records from the right table and the matched
records from the left table. The result is NULL from the left
side when there is no match.
FULL OUTER JOIN
Returns all records when there is a match in either the left or the right table.
UNION
Is used to combine the result sets of two or more SELECT statements.
Each SELECT statement within UNION must have the same number of
columns. The columns must also have similar data types. The columns in
each SELECT statement must also be in the same order.
INNER JOIN
Selects records that have matching values in both tables. This is good for your situation.
INTERSECT
Not supported by MySQL before version 8.0.31.
NATURAL JOIN
Joins on all columns whose names match in both tables.
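To make the difference between INNER JOIN and LEFT JOIN from the list above concrete, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The tables uploads_by_day and downloads_by_day and their rows are hypothetical. The inner join returns only the day present in both tables, while the left join also keeps the upload-only day, with NULL (None in Python) on the download side.

```python
import sqlite3

# SQLite in memory, standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE uploads_by_day (day TEXT PRIMARY KEY, upload_traffic INTEGER);
    CREATE TABLE downloads_by_day (day TEXT PRIMARY KEY, download_traffic INTEGER);
    INSERT INTO uploads_by_day VALUES ('2011-10-27', 150), ('2011-10-28', 10);
    INSERT INTO downloads_by_day VALUES ('2011-10-27', 7), ('2011-10-29', 20);
    """
)

# INNER JOIN: only days that exist in both tables.
inner_join_rows = conn.execute(
    """
    SELECT u.day, u.upload_traffic, d.download_traffic
    FROM uploads_by_day u
    INNER JOIN downloads_by_day d ON d.day = u.day
    ORDER BY u.day
    """
).fetchall()

# LEFT JOIN: every upload day, with NULL where no download row matches.
left_join_rows = conn.execute(
    """
    SELECT u.day, u.upload_traffic, d.download_traffic
    FROM uploads_by_day u
    LEFT JOIN downloads_by_day d ON d.day = u.day
    ORDER BY u.day
    """
).fetchall()

print(inner_join_rows)
print(left_join_rows)
```

Neither join returns the download-only day (2011-10-29), which is the kind of row a UNION-based query is able to keep.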
Since you don't need to update this data, you can create a view from the joined tables, so that you need fewer queries in your PHP. Note that a view like this cannot be updated. You also did not mention a relationship between the tables, so I have to go with UNION.
Like this:
CREATE VIEW checkStatus
AS
SELECT
DATE(upload_date_added) as upload_date,
SUM(upload_size) as upload_traffic,
SUM(upload_files) as upload_files
FROM packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY upload_date
UNION
SELECT
DATE(download_date_added) as download_date,
SUM(download_size) as download_traffic,
SUM(download_files) as download_files
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY download_date
-- One ORDER BY at the end sorts the whole UNION; column names come from the first SELECT
ORDER BY upload_date DESC
Then anywhere you want to select you just need one line:
SELECT * FROM checkStatus