not showing DISTINCT record on 3 table join - mysql

I tried the following query to display DISTINCT records from 3 tables, but it still displays repeated user records.
SELECT DISTINCT users.sid, users.username, users.registration_date, users.FirstName,
users.LastName, users.phoneNumber, listings.Resume, uploaded_files.saved_file_name
FROM users JOIN listings
ON users.sid = listings.user_sid
JOIN uploaded_files
ON listings.Resume=uploaded_files.id
WHERE listings.listing_type_sid = '7' AND listings.Resume != 'NULL'

I am making some big assumptions about the structure of your tables (in particular, that listings has a resume_date column).
The obvious way is to join against a subquery that gets the latest listing date for each user, and then join that against listings to get the listing fields for that date.
SELECT users.sid,
users.username,
users.registration_date,
users.FirstName,
users.LastName,
users.phoneNumber,
listings.Resume,
uploaded_files.saved_file_name
FROM users
INNER JOIN
(
SELECT user_sid, MAX(resume_date) AS latest_resume
FROM listings
GROUP BY user_sid
) sub0
ON users.sid = sub0.user_sid
INNER JOIN listings
ON sub0.user_sid = listings.user_sid
AND sub0.latest_resume = listings.resume_date
INNER JOIN uploaded_files
ON listings.Resume=uploaded_files.id
WHERE listings.listing_type_sid = '7'
AND listings.Resume != 'NULL'
A bit of a fiddle is to use GROUP_CONCAT to get all the saved files ordered by date, then use SUBSTRING_INDEX to get the first one (I have just used the default comma to split the files up, but you should really use a delimiter that will never appear in any of the file names).
SELECT users.sid,
users.username,
users.registration_date,
users.FirstName,
users.LastName,
users.phoneNumber,
SUBSTRING_INDEX(GROUP_CONCAT(listings.Resume ORDER BY listings.resume_date DESC), ',', 1) AS Resume,
SUBSTRING_INDEX(GROUP_CONCAT(uploaded_files.saved_file_name ORDER BY listings.resume_date DESC), ',', 1) AS saved_file_name
FROM users
INNER JOIN listings
ON users.sid = listings.user_sid
INNER JOIN uploaded_files
ON listings.Resume=uploaded_files.id
WHERE listings.listing_type_sid = '7'
AND listings.Resume != 'NULL'
GROUP BY users.sid,
users.username,
users.registration_date,
users.FirstName,
users.LastName,
users.phoneNumber
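To see the "latest row per user" join in action, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here, and the table contents (and the resume_date column) are made up for illustration; your real schema may differ.

```python
import sqlite3

# Throwaway in-memory database; all rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (sid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE listings (user_sid INTEGER, Resume INTEGER, resume_date TEXT);
    CREATE TABLE uploaded_files (id INTEGER PRIMARY KEY, saved_file_name TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO listings VALUES
        (1, 10, '2014-01-01'), (1, 11, '2014-03-01'), (2, 12, '2014-02-01');
    INSERT INTO uploaded_files VALUES
        (10, 'alice_old.pdf'), (11, 'alice_new.pdf'), (12, 'bob.pdf');
""")

# sub0 finds the newest resume_date per user; joining back to listings on
# (user_sid, resume_date) keeps only that newest listing row.
latest_per_user = conn.execute("""
    SELECT users.username, uploaded_files.saved_file_name
    FROM users
    INNER JOIN (
        SELECT user_sid, MAX(resume_date) AS latest_resume
        FROM listings
        GROUP BY user_sid
    ) sub0 ON users.sid = sub0.user_sid
    INNER JOIN listings
        ON sub0.user_sid = listings.user_sid
       AND sub0.latest_resume = listings.resume_date
    INNER JOIN uploaded_files ON listings.Resume = uploaded_files.id
    ORDER BY users.sid
""").fetchall()

print(latest_per_user)
```

Note that if a user has two listings with exactly the same latest date, this approach still returns both rows, so a tie-breaker may be needed.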

Related

Optimize Query Mysql to count data in each district

I have this query to calculate the success total in each district. The query works, but it takes up to 2 minutes to output data, and I only have 15k rows in orders.
SELECT
nsf.id,
nsf.province,
nsf.city,
nsf.district,
nsf.shipping_fee,
IFNULL((SELECT COUNT(orders.id) FROM orders
JOIN users ON orders.customer_id = users.id
JOIN addresses ON addresses.user_id = users.id
JOIN subdistricts ON subdistricts.id = addresses.subdistrict_id
WHERE orders.status_tracking IN ("Completed","Successful Delivery")
AND subdistricts.ninja_fee_id = nsf.id
AND orders.transfer_to = "cod"),0) as success_total
from ninja_shipping_fees nsf
GROUP BY nsf.id
ORDER BY nsf.province;
Can you help me improve the performance? Thanks.
Try performing the grouping/calculation in a joined "derived table" instead of a "correlated subquery":
SELECT
nsf.id
, nsf.province
, nsf.city
, nsf.district
, nsf.shipping_fee
, COALESCE( g.order_count, 0 ) AS success_total
FROM ninja_shipping_fees nsf
LEFT JOIN (
SELECT
subdistricts.ninja_fee_id
, COUNT( orders.id ) AS order_count
FROM orders
JOIN users ON orders.customer_id = users.id
JOIN addresses ON addresses.user_id = users.id
JOIN subdistricts ON subdistricts.id = addresses.subdistrict_id
WHERE orders.status_tracking IN ('Completed', 'Successful Delivery')
AND orders.transfer_to = 'cod'
GROUP BY subdistricts.ninja_fee_id
) AS g ON g.ninja_fee_id = nsf.id
ORDER BY nsf.province;
"Correlated subqueries" are often a source of poor performance.
Other notes: I prefer to use COALESCE() because it is ANSI standard and available in most SQL implementations. Single quotes are the standard way to denote string literals.
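As a rough check that the two formulations really are interchangeable, the sketch below runs both against a tiny in-memory SQLite database from Python. The fees and orders tables are simplified, hypothetical stand-ins for ninja_shipping_fees and the orders/users/addresses/subdistricts chain, and SQLite stands in for MySQL, so this says nothing about timing on your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema: one fee row per district.
    CREATE TABLE fees (id INTEGER PRIMARY KEY, district TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, fee_id INTEGER, status TEXT);
    INSERT INTO fees VALUES (1, 'north'), (2, 'south'), (3, 'east');
    INSERT INTO orders VALUES
        (1, 1, 'Completed'), (2, 1, 'Completed'),
        (3, 2, 'Cancelled'), (4, 2, 'Completed');
""")

# Correlated subquery: the inner SELECT refers to f.id from the outer row.
correlated = conn.execute("""
    SELECT f.id,
           (SELECT COUNT(*) FROM orders o
             WHERE o.fee_id = f.id AND o.status = 'Completed') AS success_total
    FROM fees f
    ORDER BY f.id
""").fetchall()

# Derived table: aggregate once, then LEFT JOIN so districts with no
# matching orders still appear (COALESCE turns their NULL into 0).
derived = conn.execute("""
    SELECT f.id, COALESCE(g.order_count, 0) AS success_total
    FROM fees f
    LEFT JOIN (
        SELECT fee_id, COUNT(*) AS order_count
        FROM orders
        WHERE status = 'Completed'
        GROUP BY fee_id
    ) AS g ON g.fee_id = f.id
    ORDER BY f.id
""").fetchall()

print(correlated)
print(derived)
```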

MySQL Group By with multi-join

here is my SQL query:
SELECT subject, threadpost.date, threadpost.idThreadPost as id,
threadcategories.category, users.userName, COUNT(idThreadSubs) AS subs, COUNT(idThreadReplies) as replies
FROM threadpost
JOIN threadcategories
ON idthreadcategories = threadpost.category
JOIN users
ON idUsers = UserId
LEFT JOIN threadsubs
ON threadpost.idThreadPost = threadsubs.ThreadId
LEFT JOIN threadreplies
ON threadpost.idThreadPost = threadreplies.ThreadId
WHERE idthreadcategories LIKE ?
GROUP BY idThreadPost
ORDER BY date desc
LIMIT 20;
The problem comes with adding COUNT(idThreadReplies). As you can see, I'm grouping by idThreadPost, because I want to retrieve both the count of subscriptions to the thread and the count of replies.
However, the result gives me an incorrect number of replies (the same number as subscriptions).
How would I formulate this query correctly?
Figured it out. The solution is to use subqueries in the joins that require GROUP BY:
SELECT subject, threadpost.date, threadpost.idThreadPost as id,
threadcategories.category, users.userName, tsubs.subs AS subs, trep.replies as replies
FROM threadpost
JOIN threadcategories
ON idthreadcategories = threadpost.category
JOIN users
ON idUsers = UserId
LEFT JOIN (
SELECT threadsubs.ThreadId as tsubId, COUNT(idThreadSubs) as subs
FROM threadsubs
GROUP BY threadsubs.ThreadId
) as tsubs
ON tsubs.tsubId = threadpost.idThreadPost
LEFT JOIN (
SELECT threadreplies.ThreadId as tId, COUNT(threadreplies.idThreadReplies) as replies
FROM threadreplies
GROUP BY threadreplies.ThreadId
) AS trep
ON trep.tId = threadpost.idThreadPost
WHERE idthreadcategories LIKE ?
ORDER BY date desc
LIMIT 20;
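The reason the first query miscounts is that two LEFT JOINs on the same key multiply rows: a thread with 2 subscriptions and 3 replies becomes 6 joined rows, and both COUNTs then see 6. A minimal sketch of this, using Python with an in-memory SQLite database as a stand-in for MySQL (the rows are invented, and only the columns needed for the counts are kept):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE threadpost (idThreadPost INTEGER PRIMARY KEY);
    CREATE TABLE threadsubs (idThreadSubs INTEGER PRIMARY KEY, ThreadId INTEGER);
    CREATE TABLE threadreplies (idThreadReplies INTEGER PRIMARY KEY, ThreadId INTEGER);
    INSERT INTO threadpost VALUES (1), (2);
    INSERT INTO threadsubs (ThreadId) VALUES (1), (1);                -- thread 1: 2 subs
    INSERT INTO threadreplies (ThreadId) VALUES (1), (1), (1), (2);   -- 3 replies, 1 reply
""")

# Joining both child tables directly pairs every sub with every reply.
naive = conn.execute("""
    SELECT p.idThreadPost, COUNT(s.idThreadSubs), COUNT(r.idThreadReplies)
    FROM threadpost p
    LEFT JOIN threadsubs s ON p.idThreadPost = s.ThreadId
    LEFT JOIN threadreplies r ON p.idThreadPost = r.ThreadId
    GROUP BY p.idThreadPost
    ORDER BY p.idThreadPost
""").fetchall()

# Aggregating each child table per thread first keeps the counts independent.
fixed = conn.execute("""
    SELECT p.idThreadPost, COALESCE(ts.subs, 0), COALESCE(tr.replies, 0)
    FROM threadpost p
    LEFT JOIN (SELECT ThreadId, COUNT(*) AS subs
               FROM threadsubs GROUP BY ThreadId) ts
           ON ts.ThreadId = p.idThreadPost
    LEFT JOIN (SELECT ThreadId, COUNT(*) AS replies
               FROM threadreplies GROUP BY ThreadId) tr
           ON tr.ThreadId = p.idThreadPost
    ORDER BY p.idThreadPost
""").fetchall()

print(naive)  # thread 1 shows 6 and 6 instead of 2 and 3
print(fixed)
```

The COALESCE calls are an addition for threads with no subscriptions; without them the LEFT JOIN would yield NULL rather than 0.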

SQL query that limits the results to one when using count inside count

I am trying to select the count of likes on a specific project. The idea I came up with is
CAST(count(uploads.ID in (SELECT uploadID from votes)) as decimal) as numberoflikes
This works, but the query then returns only one row.
Entire query
SELECT DISTINCT users.NAME AS username
,users.ID AS userID
,subjects.NAME AS subjectname
,uploads.TIME
,uploads.description
,uploads.NAME
,uploads.ID
,CASE
WHEN uploads.ID IN (
SELECT uploadID
FROM votes
WHERE userID = 2
)
THEN CAST(1 AS DECIMAL)
ELSE CAST(0 AS DECIMAL)
END AS liked
,CASE
WHEN uploads.ID IN (
SELECT uploadID
FROM bookmarks
WHERE userID = 2
)
THEN CAST(1 AS DECIMAL)
ELSE CAST(0 AS DECIMAL)
END AS bookmarked
,CAST(count(uploads.ID IN (
SELECT uploadID
FROM votes
)) AS DECIMAL) AS numberoflikes
FROM uploads
INNER JOIN subjects ON (subjects.ID = uploads.subjectID)
INNER JOIN users ON (users.ID = uploads.userID)
INNER JOIN uploadGrades ON (uploads.ID = uploadGrades.uploadID)
INNER JOIN grades ON (grades.ID = uploadGrades.gradeID)
WHERE uploads.active = 1
AND subjects.ID IN (
SELECT subjectID
FROM userSubjects
INNER JOIN users ON (users.ID = userSubjects.userID)
WHERE userSubjects.userID = 2
)
AND grades.ID IN (
SELECT userGrades.gradeID
FROM uploadGrades
INNER JOIN userGrades ON (uploadGrades.gradeID = userGrades.gradeID)
WHERE userGrades.userID = 2
)
ORDER BY uploads.trueRating DESC;
Let's try a reduced version of your query; a minimal query is the best basis for getting better answers.
I reduced the initial query to just users and uploads to start, and removed the fields you already know how to calculate.
SELECT DISTINCT users.NAME AS username
,users.ID AS userID
,uploads.NAME
,uploads.ID
,CAST(count(uploads.ID IN (
SELECT uploadID
FROM votes
)) AS DECIMAL) AS numberoflikes
FROM uploads
INNER JOIN users ON (users.ID = uploads.userID)
WHERE uploads.active = 1
ORDER BY uploads.trueRating DESC;
Then add votes with a LEFT JOIN to replace the SELECT inside the COUNT. That way, if there is no match you get NULL and, as I said in my comment, COUNT doesn't count NULLs.
SELECT users.NAME AS username
,users.ID AS userID
,uploads.NAME
,uploads.ID
,CAST(count(votes.uploadID) AS DECIMAL) AS numberoflikes
FROM uploads
INNER JOIN users ON (users.ID = uploads.userID)
LEFT JOIN votes ON (uploads.ID = votes.uploadID)
WHERE uploads.active = 1
GROUP BY users.NAME, users.ID, uploads.NAME, uploads.ID
ORDER BY uploads.trueRating DESC;
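Two behaviours are at play here: an aggregate without GROUP BY collapses the whole result into a single row, and COUNT(column) skips the NULLs that a LEFT JOIN produces for unmatched rows. A small sketch of both, run from Python against in-memory SQLite as a stand-in for MySQL; the two tables are cut down to the minimum and the rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE uploads (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE votes (uploadID INTEGER, userID INTEGER);
    INSERT INTO uploads VALUES (1, 'essay'), (2, 'notes');
    INSERT INTO votes VALUES (1, 2), (1, 3);   -- upload 2 has no votes
""")

# No GROUP BY: the aggregate folds every upload into one row.
collapsed = conn.execute("""
    SELECT COUNT(votes.uploadID)
    FROM uploads
    LEFT JOIN votes ON uploads.ID = votes.uploadID
""").fetchall()

# GROUP BY upload: one row each. COUNT(votes.uploadID) ignores the NULL
# produced for upload 2, while COUNT(*) still counts that joined row.
per_upload = conn.execute("""
    SELECT uploads.ID,
           COUNT(votes.uploadID) AS numberoflikes,
           COUNT(*) AS joined_rows
    FROM uploads
    LEFT JOIN votes ON uploads.ID = votes.uploadID
    GROUP BY uploads.ID
    ORDER BY uploads.ID
""").fetchall()

print(collapsed)
print(per_upload)
```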
Try something like this...
SELECT users.name as username, users.ID as userID, subjects.name as subjectname,
uploads.time, uploads.description, uploads.name, uploads.ID,
count(userVotes.userId) as liked, count(bookmarksMade.userId) as bookmarked
FROM uploads
join subjects on(subjects.ID = uploads.subjectID)
join users on(users.ID = uploads.userID)
join uploadGrades on(uploads.ID = uploadGrades.uploadID)
join grades on(grades.ID = uploadGrades.gradeID)
left join (select userId, uploadId from votes where userId = 2) as userVotes on uploads.id = userVotes.uploadId
left join (select userId, uploadId from bookmarks where userId = 2) as bookmarksMade on uploads.id = bookmarksMade.uploadId
join userSubjects on subjects.id = userSubjects.subjectID
WHERE uploads.active = 1 AND
userSubjects.userID = 2
GROUP BY users.name, users.ID, subjects.name,
uploads.time, uploads.description, uploads.name, uploads.ID
ORDER BY uploads.trueRating DESC;
But, I am leaving out the userGrades thing, because you are doing a funky join there that I don't really understand (joining two tables on something that looks like it is not the whole primary key on either table).
Anyway, you really need to go to something more like this or what Oropeza suggests in his answer. Get more direct about what you want. This query looks like a monster that has been growing and getting things added in with "IN" clauses, as you needed them. Time to go back to the drawing board and think about what you want and how to get at it directly.
count(uploads.ID in (SELECT uploadID from votes)) as numberoflikes
group by uploads.Id ORDER BY uploads.trueRating DESC
I managed to do it like this. When I added the GROUP BY, it split numberoflikes into per-upload rows and returned more than one row. Thanks for the help!

sql counts wrong number of likes

I have written an SQL statement that, besides all the other columns, should return the number of comments and the number of likes of a certain post. It works perfectly when I don't also try to get the number of times it has been shared. When I do try to get the number of shares, it returns a wrong number of likes that seems to be some combination of the shares and likes. Here is the code:
SELECT
[...],
count(CS.commentId) as shares,
count(CL.commentId) as numberOfLikes
FROM
(SELECT *
FROM accountSpecifics
WHERE institutionId= '{$keyword['id']}') `AS`
INNER JOIN
account A ON A.id = `AS`.accountId
INNER JOIN
comment C ON C.accountId = A.id
LEFT JOIN
commentLikes CL ON C.commentId = CL.commentId
LEFT JOIN
commentShares CS ON C.commentId = CS.commentId
GROUP BY
C.time
ORDER BY
year, month, hour, month
Could you also tell me if you think this is an efficient SQL statement or if you would do it differently? thank you!
Do this instead:
SELECT
[...],
(select count(*) from commentShares CS where C.commentId = CS.commentId) as shares,
(select count(*) from commentLikes CL where C.commentId = CL.commentId) as numberOfLikes
FROM
(SELECT *
FROM accountSpecifics
WHERE institutionId= '{$keyword['id']}') `AS`
INNER JOIN account A ON A.id = `AS`.accountId
INNER JOIN comment C ON C.accountId = A.id
GROUP BY C.time
ORDER BY year, month, hour, month
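If you use JOINs, you get back one combined result set in which every like row is paired with every share row for the same comment. COUNT(any field) simply counts the rows of that result where the field is not NULL, so when a comment has both likes and shares, both counts come out as likes times shares, which is the wrong thing here. Subqueries are what you need. Good luck!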
EDIT: as posted below, count(distinct something) can also work, but it's making the database do more work than necessary for the answer you want to end up with.
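Quick fix (count a column that is unique per share or like row; the id columns below are an assumption about your schema, since COUNT(DISTINCT commentId) can only count distinct comments, not their shares or likes):
SELECT
[...],
count(DISTINCT CS.id) as shares,
count(DISTINCT CL.id) as numberOfLikes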
Better approach:
SELECT [...]
, Coalesce(shares.numberOfShares, 0) As numberOfShares
, Coalesce(likes.numberOfLikes , 0) As numberOfLikes
FROM [...]
LEFT
JOIN (
SELECT commentId
, Count(*) As numberOfShares
FROM commentShares
GROUP
BY commentId
) As shares
ON shares.commentId = c.commentId
LEFT
JOIN (
SELECT commentId
, Count(*) As numberOfLikes
FROM commentLikes
GROUP
BY commentId
) As likes
ON likes.commentId = c.commentId
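For the quick fix, it matters which column goes inside COUNT(DISTINCT ...). The sketch below, run from Python against in-memory SQLite as a stand-in for MySQL, compares the plain count, a distinct count on a per-row key, and a distinct count on commentId. The id columns on commentLikes and commentShares are hypothetical (the original schema is not shown), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comment (commentId INTEGER PRIMARY KEY);
    -- The id columns are an assumption: any per-row unique key would do.
    CREATE TABLE commentLikes (id INTEGER PRIMARY KEY, commentId INTEGER);
    CREATE TABLE commentShares (id INTEGER PRIMARY KEY, commentId INTEGER);
    INSERT INTO comment VALUES (1), (2);
    INSERT INTO commentLikes (commentId) VALUES (1), (1), (1), (2);  -- 3 likes, 1 like
    INSERT INTO commentShares (commentId) VALUES (1), (1);           -- 2 shares, none
""")

counts = conn.execute("""
    SELECT C.commentId,
           COUNT(CL.id)                 AS naive_likes,      -- inflated by the shares join
           COUNT(DISTINCT CL.id)        AS likes,            -- distinct like rows
           COUNT(DISTINCT CS.id)        AS shares,           -- distinct share rows
           COUNT(DISTINCT CL.commentId) AS distinct_comments -- never more than 1 here
    FROM comment C
    LEFT JOIN commentLikes CL ON C.commentId = CL.commentId
    LEFT JOIN commentShares CS ON C.commentId = CS.commentId
    GROUP BY C.commentId
    ORDER BY C.commentId
""").fetchall()

print(counts)
```

The database still builds the full likes-times-shares row set before de-duplicating, which is why pre-aggregating in derived tables tends to be the cheaper option.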

MySQL using Aliases

I have the following syntactically incorrect query with aliases in_Degree and out_degree:
insert into userData
select user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME)in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name)out_degree,
in_degree + out_degree(freq)
from users u
The problem in the query is the 4th item in the select list, aliased as freq. I want the 4th item to have the value in_degree + out_degree. The brute-force, extremely slow solution would be to copy and paste both subqueries and add them.
How can I make this fast and as simple as in_degree + out_degree?
You could use a subquery:
insert into userData
select user_name,
in_degree,
out_degree,
in_degree + out_degree
from
(
select user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME)in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name)out_degree
from users u
) src
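Or you might be able to use something like the following (MySQL does not accept COUNT(DISTINCT t.*), so in practice replace in_t.* and out_t.* with the primary key column of tweets):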
insert into userData
select user_name,
count(distinct in_t.*) in_degree,
count(distinct out_t.*) out_degree,
count(distinct in_t.*) + count(distinct out_t.*)
from users u
left join tweets in_t
on u.USER_NAME = in_t.rt_user_name
left join tweets out_t
on u.USER_NAME = out_t.source_user_name
group by u.user_name
As you have discovered, you can't reference a column alias elsewhere in the same select list (or in the WHERE clause); MySQL only lets you use it in a GROUP BY, HAVING, or ORDER BY clause.
One option is to use your query as an "inline view", and write a wrapper query around that.
remove the 4th (invalid) expression from the select list in your query,
wrap your query in a set of parens
follow the closing paren with an alias (e.g. s)
write a query around that, referencing the inline view as if it were a table
the select list on the outer query can reference the "aliases" defined in the inline view.
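The steps above can be sketched as follows, using Python with an in-memory SQLite database as a stand-in for MySQL (SQLite rejects same-level alias references in much the same way). The sample users and tweets are invented, and the INSERT into userData is left out to keep the focus on the SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_name TEXT PRIMARY KEY);
    CREATE TABLE tweets (rt_user_name TEXT, source_user_name TEXT);
    INSERT INTO users VALUES ('a'), ('b');
    INSERT INTO tweets VALUES ('a', 'b'), ('a', 'b'), ('b', 'a'), ('a', 'c');
""")

# Referencing an alias in the same select list is rejected.
try:
    conn.execute(
        "SELECT 1 AS in_degree, 2 AS out_degree, in_degree + out_degree AS freq"
    )
    alias_rejected = False
except sqlite3.OperationalError:
    alias_rejected = True

# Inline view: the inner query defines the aliases, the outer query uses
# them as if they were ordinary columns of a table named s.
rows = conn.execute("""
    SELECT s.user_name, s.in_degree, s.out_degree,
           s.in_degree + s.out_degree AS freq
    FROM (
        SELECT u.user_name,
               (SELECT COUNT(*) FROM tweets
                 WHERE rt_user_name = u.user_name) AS in_degree,
               (SELECT COUNT(*) FROM tweets
                 WHERE source_user_name = u.user_name) AS out_degree
        FROM users u
    ) s
    ORDER BY s.user_name
""").fetchall()

print(alias_rejected)
print(rows)
```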
However, if you want to make this "fast", you might consider (as an option) taking an entirely different tack. Rather than using correlated subqueries to get the count for each individual user, you could get the counts for all users at once, and then use LEFT JOIN operations, e.g.
SELECT u.user_name
, IFNULL(i.cnt,0) AS in_degree
, IFNULL(o.cnt,0) AS out_degree
, IFNULL(i.cnt,0)+IFNULL(o.cnt,0) AS freq
FROM users u
LEFT
JOIN (SELECT rt_user_name, COUNT(*) AS cnt FROM tweets
GROUP BY rt_user_name) i
ON i.rt_user_name = u.user_name
LEFT
JOIN (SELECT source_user_name, COUNT(*) AS cnt FROM tweets
GROUP BY source_user_name) o
ON o.source_user_name = u.user_name
This should work:
insert into userData
SELECT T.user_name,
T.in_degree,
T.out_degree,
(T.in_degree + T.out_degree) as freq
FROM (SELECT user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME) as in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name) as out_degree
FROM users u) T
As a quick way to do it, I would use something like:
insert into userData
select
TMP.user_name,
TMP.in_degree,
TMP.out_degree,
(TMP.in_degree + TMP.out_degree) degreeSum
from(
select user_name,
(select COUNT(*) from tweets where rt_user_name = u.USER_NAME)in_degree,
(select COUNT(*) from tweets where source_user_name = u.user_name)out_degree
from users u
) TMP