MySQL query with RAND() subquery condition - mysql

I have a nested subquery that selects a random AlbumID that the selected video is in (videos can be in multiple albums), and the outer query then returns the videos and album information based on that AlbumID.
The problem is that the query is returning mixed results; sometimes it gives me some of the videos from one album, sometimes it gives videos from multiple albums, sometimes it returns nothing.
The outer query works if I specify a specific AlbumID instead of the subquery, and the subquery by itself correctly returns 1 random AlbumID. But put together, it's giving me mixed results. What am I missing? Why is it returning varying amounts of rows, and multiple albums?
I've replicated the issue with test data, you can find the CREATE queries here: http://pastebin.com/raw.php?i=e6HaaSGK
The SELECT SQL:
SELECT
Videos_Demo.VideoID,
VideosInAlbums_Demo.AlbumID
FROM
VideosInAlbums_Demo
LEFT JOIN
Videos_Demo
ON Videos_Demo.VideoID = VideosInAlbums_Demo.VideoID
WHERE
VideosInAlbums_Demo.AlbumID = (
SELECT
AlbumID
FROM
VideosInAlbums_Demo
WHERE
VideoID = '1'
ORDER BY
RAND()
LIMIT 1
)

Try this. Moving the subquery into the JOIN seems to fix the problem. I think the issue is that, in the WHERE clause, the subquery and its RAND() call are executed once per row rather than once for the whole query, so each row is compared against a different random AlbumID. That is probably why the results vary.
SELECT a.AlbumID,
Videos_Demo.VideoID,
VideosInAlbums_Demo.AlbumID
FROM VideosInAlbums_Demo
LEFT JOIN Videos_Demo
ON Videos_Demo.VideoID = VideosInAlbums_Demo.VideoID
JOIN
(
SELECT AlbumID
FROM VideosInAlbums_Demo
WHERE VideoID = '1'
ORDER BY RAND()
LIMIT 1
) AS a ON VideosInAlbums_Demo.AlbumID = a.AlbumID
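A minimal sketch of the derived-table form, run against SQLite from Python so it can be tried without a MySQL server. The table contents are made up, and SQLite spells the function RANDOM() rather than RAND(), so this only illustrates that the random album is picked once and then joined; it does not reproduce MySQL's per-row re-evaluation.

```python
import sqlite3

def videos_from_one_random_album(conn, video_id):
    """Return (VideoID, AlbumID) rows for one randomly chosen album that contains video_id."""
    sql = """
        SELECT v.VideoID, via.AlbumID
        FROM VideosInAlbums_Demo AS via
        LEFT JOIN Videos_Demo AS v ON v.VideoID = via.VideoID
        JOIN (
            SELECT AlbumID
            FROM VideosInAlbums_Demo
            WHERE VideoID = ?
            ORDER BY RANDOM()   -- SQLite spelling; MySQL uses RAND()
            LIMIT 1
        ) AS a ON via.AlbumID = a.AlbumID
    """
    return conn.execute(sql, (video_id,)).fetchall()

# Made-up demo data: video 1 is in albums 10 and 20.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Videos_Demo (VideoID INTEGER PRIMARY KEY);
    CREATE TABLE VideosInAlbums_Demo (VideoID INTEGER, AlbumID INTEGER);
    INSERT INTO Videos_Demo VALUES (1), (2), (3), (4);
    INSERT INTO VideosInAlbums_Demo VALUES
        (1, 10), (2, 10),
        (1, 20), (3, 20), (4, 20),
        (4, 30);
""")

rows = videos_from_one_random_album(conn, 1)
albums = {album_id for _, album_id in rows}
print(rows, albums)
```

Each call should return every video of exactly one album, and that album is always one that contains video 1.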

Related

MySQL Left-Join vs Join with subquery return different results

I wrote 2 queries expecting them to yield same results, yet they turned out to be different.
I would like to ask why they return different results?
I am more confident that the 1st query returns what I want, so how should I amend the 2nd query? Thx!
1st SQL query:
SELECT
Product.*,
Status.*,
Price.*
FROM Product
LEFT JOIN Status
ON Product.MarketplaceId = Status.ListingId
LEFT JOIN Price
ON Product.ProductId = Price.Id
LIMIT 15;
2nd SQL query:
SELECT
Product.*,
Status.*,
Price.*
FROM Product
LEFT JOIN Status
ON Product.MarketplaceId IN
(
SELECT ListingId FROM Status
)
LEFT JOIN Price
ON Product.ProductId IN
(
SELECT Id FROM Price
)
LIMIT 15;
Without seeing the data, I can't say exactly why the results differ. However, if you intend to have only one row per product, I would change the second query to use DISTINCT. If the subquery returns multiple rows for the condition, the join returns that many rows even for a single product.
Don't use IN ( SELECT ... ) if there is an alternative; it is often slower.
Don't use LEFT JOIN if the matching row in the 'right' table will always be there. It confuses readers.
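One reason for the different results is the lack of ORDER BY: without it, LIMIT 15 returns an arbitrary 15 rows. (As Akina mentioned.) Note also that the ON conditions in the second query never reference a column of Status or Price, so every Product row that passes the IN test is paired with every row of the joined table. The two queries are therefore not equivalent even with the LIMIT removed.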
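One way to compare the two join conditions is to count rows on a tiny dataset. The sketch below uses SQLite from Python with made-up Product and Status rows, and leaves out the Price join for brevity; it is an illustration under those assumptions, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (ProductId INTEGER, MarketplaceId INTEGER);
    CREATE TABLE Status (ListingId INTEGER);
    INSERT INTO Product VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO Status VALUES (100), (200);
""")

# Join condition compares a Product column with a Status column.
equality_join = conn.execute("""
    SELECT p.ProductId, s.ListingId
    FROM Product AS p
    LEFT JOIN Status AS s ON p.MarketplaceId = s.ListingId
""").fetchall()

# Join condition only tests the Product row; it says nothing about which
# Status row to pair it with, so every Status row matches.
in_join = conn.execute("""
    SELECT p.ProductId, s.ListingId
    FROM Product AS p
    LEFT JOIN Status AS s ON p.MarketplaceId IN (SELECT ListingId FROM Status)
""").fetchall()

print(len(equality_join), len(in_join))
```

With this data the equality join yields one row per product, while the IN form pairs products 1 and 2 with both Status rows.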

select unique values from column but order based on another

I need a unique list of parent_thread values ordered by postID descending. postID is always unique, but the parent_thread field is often the same for multiple posts.
So what I need is a list of posts in the order they were replied to.
For example, in the image below I need to disregard posts 400 and 399 because they're repeats. I've got this working with a subquery, but the query can sometimes take up to 1 second to run, so I was wondering if there is a more efficient way to do it. I've tried GROUP BY and DISTINCT but keep getting the wrong results.
[image of the table]
Here is the query I have. It produces the results I want, but it is often slow.
SELECT `postID`
FROM `posts`
ORDER BY
(
SELECT MAX(`postID`)
FROM `posts` `sub`
WHERE `sub`.`parent_thread` = `posts`.postID
)
DESC
Your subquery is known as a dependent subquery. They can make queries very slow because they get repeated a lot.
JOIN to your subquery instead. That way it will be used just once, and things will speed up. Try this subquery to generate a list of max post ids, one for each parent thread.
SELECT MAX(postID) maxPostID, parent_thread
FROM posts
GROUP BY parent_thread
Then use it in your main query like this
SELECT posts.postID
FROM posts
LEFT JOIN (
SELECT MAX(postID) maxPostID, parent_thread
FROM posts
GROUP BY parent_thread
) m ON posts.postID = m.parent_thread
ORDER BY m.maxPostID DESC
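The grouped subquery above can be checked on its own. This is a small sketch using SQLite from Python with made-up rows, assuming every post (including the first one in a thread) stores its thread id in parent_thread; the real table may differ.

```python
import sqlite3

def threads_by_latest_post(conn):
    """Return (parent_thread, maxPostID) pairs, most recently replied-to thread first."""
    sql = """
        SELECT parent_thread, MAX(postID) AS maxPostID
        FROM posts
        GROUP BY parent_thread
        ORDER BY maxPostID DESC
    """
    return conn.execute(sql).fetchall()

# Made-up demo data: thread 5 has the three newest posts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (postID INTEGER PRIMARY KEY, parent_thread INTEGER);
    INSERT INTO posts VALUES
        (396, 7), (397, 3), (398, 7),
        (399, 5), (400, 5), (401, 5);
""")

latest = threads_by_latest_post(conn)
print(latest)
```

Because the aggregation runs once over the whole table instead of once per outer row, an index on parent_thread is usually all it needs to stay fast.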

SQL INNER JOIN and AVG() returning wrong data

I am trying to select all rows from a table containing data about videos and then join each video's ratings, as an AVG(), from another table.
There is only 1 row for each video but many ratings for each video, so I have to get all the ratings and find the average for each video.
I have this piece of SQL
SELECT t1.video_id,
t1.video_title,
t1.video_url,
t1.video_views,
AVG(t2.videos_rating_rating) AS rating
FROM videos_approved t1
INNER JOIN videos_rating t2
ON t1.video_id = t2.videos_rating_video_fk
WHERE 1
ORDER BY video_id
DESC LIMIT 12
The SQL returns a result, but it only returns 1 row with a wrong average value.
Can someone explain to me why this is happening and what I could do instead?
You need to use GROUP BY here. In your current query you are taking an average over the entire table.
SELECT
t1.video_id,
t1.video_title,
t1.video_url,
t1.video_views,
AVG(t2.videos_rating_rating) AS rating
FROM videos_approved t1
INNER JOIN videos_rating t2
ON t1.video_id = t2.videos_rating_video_fk
GROUP BY
t1.video_id
ORDER BY
t1.video_id DESC
LIMIT 12
Note that my answer assumes that video_id is the primary key of the videos_approved table, in which case we may select any column from that table even when grouping by the video_id. If not, then strictly speaking we would have to do another join.
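To see the difference GROUP BY makes, here is a small sketch using SQLite from Python. The rows are made up and only two of the video columns are included; it assumes video_id is the primary key of videos_approved, as the note above does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videos_approved (video_id INTEGER PRIMARY KEY, video_title TEXT);
    CREATE TABLE videos_rating (videos_rating_video_fk INTEGER, videos_rating_rating INTEGER);
    INSERT INTO videos_approved VALUES (1, 'a'), (2, 'b');
    INSERT INTO videos_rating VALUES (1, 4), (1, 2), (2, 5);
""")

# Without GROUP BY the aggregate collapses the whole join into a single row.
ungrouped = conn.execute("""
    SELECT AVG(t2.videos_rating_rating) AS rating
    FROM videos_approved t1
    INNER JOIN videos_rating t2 ON t1.video_id = t2.videos_rating_video_fk
""").fetchall()

# With GROUP BY there is one average per video.
grouped = conn.execute("""
    SELECT t1.video_id, t1.video_title, AVG(t2.videos_rating_rating) AS rating
    FROM videos_approved t1
    INNER JOIN videos_rating t2 ON t1.video_id = t2.videos_rating_video_fk
    GROUP BY t1.video_id
    ORDER BY t1.video_id DESC
    LIMIT 12
""").fetchall()

print(ungrouped)
print(grouped)
```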
Try replacing INNER JOIN with LEFT JOIN.
See more details in this answer and on this page (search for AVG + GROUP BY + JOINS)

How can I make these two queries into one?

I have two tables, one for downloads and one for uploads. They are almost identical, but each has a few columns the other lacks. I want to generate a list of stats for each date for each item in the table.
I use these two queries but have to merge the data in PHP after running them. I would like to run them as a single query instead, where each row contains the columns from both queries, grouped by date. Sometimes there isn't any download data, only upload data, and in all my previous attempts the row was skipped if log data wasn't found in both tables.
How do I merge these two queries into one, where it would display data even if it's just available in one of the tables?
SELECT DATE(upload_date_added) as upload_date, SUM(upload_size) as upload_traffic, SUM(upload_files) as upload_files
FROM packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY upload_date
ORDER BY upload_date DESC
SELECT DATE(download_date_added) as download_date, SUM(download_size) as download_traffic, SUM(download_files) as download_files
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY download_date
ORDER BY download_date DESC
I want to get result rows like this:
date, upload_traffic, upload_files, download_traffic, download_files
All help appreciated!
Your two queries can be executed and then combined with the UNION clause, along with an extra field to identify Uploads and Downloads on separate lines:
SELECT
'Uploads' TransmissionType,
DATE(upload_date_added) as TransmissionDate,
SUM(upload_size) as TransmissionTraffic,
SUM(upload_files) as TransmittedFileCount
FROM
packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY DATE(upload_date_added)
UNION
SELECT
'Downloads',
DATE(download_date_added),
SUM(download_size),
SUM(download_files)
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY DATE(download_date_added)
-- One ORDER BY after the last SELECT sorts the combined result
ORDER BY TransmissionDate DESC;
Give it a Try !!!
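If you want the single-row-per-date shape from the question instead of separate upload and download lines, one option (a variation, not what the answer above does) is to put both tables into a common column layout with UNION ALL and aggregate once in an outer query. A sketch using SQLite from Python with made-up rows; the column names follow the question, everything else is an assumption:

```python
import sqlite3

def daily_stats(conn, start, end):
    """Rows of (date, upload_traffic, upload_files, download_traffic, download_files)."""
    sql = """
        SELECT d AS date,
               SUM(up_size)    AS upload_traffic,
               SUM(up_files)   AS upload_files,
               SUM(down_size)  AS download_traffic,
               SUM(down_files) AS download_files
        FROM (
            SELECT DATE(upload_date_added) AS d,
                   upload_size AS up_size, upload_files AS up_files,
                   0 AS down_size, 0 AS down_files
            FROM packages_uploads
            WHERE upload_date_added BETWEEN ? AND ?
            UNION ALL
            SELECT DATE(download_date_added), 0, 0, download_size, download_files
            FROM packages_downloads
            WHERE download_date_added BETWEEN ? AND ?
        ) AS t
        GROUP BY d
        ORDER BY d DESC
    """
    return conn.execute(sql, (start, end, start, end)).fetchall()

# Made-up demo data: 2011-10-28 has uploads only, 2011-10-29 downloads only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE packages_uploads (upload_date_added TEXT, upload_size INTEGER, upload_files INTEGER);
    CREATE TABLE packages_downloads (download_date_added TEXT, download_size INTEGER, download_files INTEGER);
    INSERT INTO packages_uploads VALUES
        ('2011-10-27 10:00:00', 100, 2),
        ('2011-10-27 12:00:00', 50, 1),
        ('2011-10-28 09:00:00', 10, 1);
    INSERT INTO packages_downloads VALUES
        ('2011-10-27 11:00:00', 30, 3),
        ('2011-10-29 08:00:00', 70, 7);
""")

stats = daily_stats(conn, '2011-10-26', '2011-11-16')
print(stats)
```

Dates that appear in only one table still produce a row, with zeros in the other table's columns.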
What you're asking can only work for rows that have the same add date for upload and download. In this case I think this SQL should work:
SELECT
DATE(u.upload_date_added) as date,
SUM(u.upload_size) as upload_traffic,
SUM(u.upload_files) as upload_files,
SUM(d.download_size) as download_traffic,
SUM(d.download_files) as download_files
FROM
packages_uploads u, packages_downloads d
WHERE u.upload_date_added = d.download_date_added
AND u.upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY date
ORDER BY date DESC
Without knowing the schema it is hard to give an exact answer, so please treat the following as a concept rather than a direct answer.
You could try a LEFT JOIN. I'm not sure whether a package table exists, but the following may be food for thought:
SELECT
p.id,
up.date as upload_date,
dwn.date as download_date
FROM
package p
-- Extra filters go in the ON clause with AND; WHERE is not allowed inside ON
LEFT JOIN package_uploads up ON
( up.package_id = p.id AND up.upload_date = 'etc' )
LEFT JOIN package_downloads dwn ON
( dwn.package_id = p.id AND dwn.download_date = 'etc' )
The above will select all the packages and attempt to join and where the value does not join it will return null.
There are a number of ways to do this. You can join using a primary key and foreign key. If there is no relationship between the tables, you can use one of the following:
LEFT JOIN / LEFT OUTER JOIN
Returns all records from the left table and the matched records from the right table. The result is NULL on the right side when there is no match.
RIGHT JOIN / RIGHT OUTER JOIN
Returns all records from the right table and the matched records from the left table. The result is NULL on the left side when there is no match.
FULL OUTER JOIN
Returns all records when there is a match in either the left or the right table. (MySQL does not support this directly.)
UNION
Combines the result sets of two or more SELECT statements. Each SELECT statement within the UNION must have the same number of columns, the columns must have similar data types, and the columns in each SELECT statement must be in the same order.
INNER JOIN
Selects records that have matching values in both tables. This is good for your situation.
INTERSECT
Not supported by MySQL (it was only added in MySQL 8.0.31).
NATURAL JOIN
Joins on all columns that have the same name in both tables.
Since you don't need to update this data, you can create a view over the tables and then use fewer queries in your PHP. Note that a view like this cannot be updated. You did not mention any relationship between the tables, so I have to go with UNION.
Like this:
CREATE VIEW checkStatus
AS
SELECT
DATE(upload_date_added) as upload_date,
SUM(upload_size) as upload_traffic,
SUM(upload_files) as upload_files
FROM packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY upload_date
UNION
SELECT
DATE(download_date_added) as download_date,
SUM(download_size) as download_traffic,
SUM(download_files) as download_files
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY download_date
-- A single ORDER BY after the last SELECT sorts the whole UNION result
ORDER BY upload_date DESC
Then anywhere you want to select you just need one line:
SELECT * FROM checkStatus

mysql intersection, comparison, opposite of UNION?

I'm trying to compare two sets of results and am having a hard time understanding how subqueries work and whether they are efficient. I'm not going to explain all my tables; just think of it as a pair of arrays. I could do it in PHP, but I wonder if I can do it directly in MySQL.
This is my query to check how many items user 1 has in the lists he owns:
SELECT DISTINCT *
FROM list_tb
INNER JOIN item_to_list_tb
ON list_tb.list_id = item_to_list_tb.list_id
WHERE list_tb.user_id = 1
ORDER BY item_to_list_tb.item_id DESC
This is my query to check how many items user 2 has in the lists he owns:
SELECT DISTINCT *
FROM list_tb
INNER JOIN item_to_list_tb
ON list_tb.list_id = item_to_list_tb.list_id
WHERE list_tb.user_id = 2
ORDER BY item_to_list_tb.item_id DESC
Now the problem is that I would like to intersect those results to check how many item_id values they have in common.
thanks!!!
Unfortunately, MySQL does not support the Intersect operator (it was only added in MySQL 8.0.31). However, one way to accomplish that goal would be to exclude List_Tb.User_Id from your Select and Group By and then count by distinct User_Id:
Select ... -- everything except List_Tb.User_Id
From List_Tb
Inner Join Item_To_List_Tb
On List_Tb.List_Id = Item_To_List_Tb.List_Id
Where List_Tb.User_Id In(1,2)
Group By ... -- everything except List_Tb.User_Id
Having Count( Distinct List_Tb.User_Id ) = 2
Order By item_to_list_tb.item_id Desc
Obviously you would replace the ellipses with the actual columns you want to return and on which you wish to group.
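As an illustration of the Having Count(Distinct ...) approach, here is a sketch using SQLite from Python that selects and groups by item_id only. The rows are made up, and choosing item_id as the single grouping column is an assumption about what "in common" should mean here.

```python
import sqlite3

def common_items(conn, user_a, user_b):
    """Return item_ids that appear in lists owned by both users, highest id first."""
    sql = """
        SELECT i.item_id
        FROM list_tb AS l
        INNER JOIN item_to_list_tb AS i ON l.list_id = i.list_id
        WHERE l.user_id IN (?, ?)
        GROUP BY i.item_id
        HAVING COUNT(DISTINCT l.user_id) = 2
        ORDER BY i.item_id DESC
    """
    return [row[0] for row in conn.execute(sql, (user_a, user_b))]

# Made-up demo data: user 1 owns lists 1 and 2, user 2 owns list 3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE list_tb (list_id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE item_to_list_tb (item_id INTEGER, list_id INTEGER);
    INSERT INTO list_tb VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO item_to_list_tb VALUES
        (10, 1), (11, 1), (12, 2),
        (11, 3), (12, 3), (13, 3);
""")

shared = common_items(conn, 1, 2)
print(shared)
```

Counting distinct users rather than rows matters: an item that sits in two lists of the same user would otherwise be counted twice and reported as shared.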