I am currently refreshing my SQL knowledge.
I have a table, Sessions, that stores information about user login activity, i.e. how long each user was logged in. See the table below.
I am trying to select all repeated rows from the table (not just check that repeated rows exist).
So far I have managed to output the entire table; however, I only need the userId and duration columns. How can I go about selecting only these two columns?
I thought it would be SELECT a.userId instead of a.*, but then I get the error "ambiguous column name: userId". I'm not sure what is going on. Sorry if it's a stupid question, but any help is appreciated. Thanks.
SELECT a.*
FROM sessions a
JOIN ( SELECT userId,duration
FROM sessions
GROUP BY userId
HAVING COUNT(userId) > 1 ) b
ON a.userId = b.userId
ORDER BY userId;
The problem is due to the ORDER BY clause, which does not scope the userId reference to one of the tables. Use this version:
ORDER BY a.userId;
Here is your updated query, with the select clause of the subquery also corrected by removing the incorrect (and unnecessary) reference to duration:
SELECT a.*
FROM sessions a
INNER JOIN
(
SELECT userId
FROM sessions
GROUP BY userId
HAVING COUNT(userId) > 1
) b
ON a.userId = b.userId
ORDER BY
a.userId;
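To get only the two columns the question asks for, the select list can name them with the table alias. Below is a minimal sketch that runs the query against an in-memory SQLite database; the sample rows and the use of Python's sqlite3 module are assumptions made purely for illustration, not part of the original question.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sessions (userId INTEGER, duration INTEGER);
    INSERT INTO sessions VALUES (1, 10), (1, 25), (2, 40), (3, 5), (3, 15);
""")

# Every column reference is qualified with an alias, so nothing is ambiguous.
duplicate_rows = conn.execute("""
    SELECT a.userId, a.duration
    FROM sessions a
    INNER JOIN (
        SELECT userId
        FROM sessions
        GROUP BY userId
        HAVING COUNT(*) > 1
    ) b ON a.userId = b.userId
    ORDER BY a.userId, a.duration
""").fetchall()

print(duplicate_rows)
```

User 2 has a single session, so it is filtered out; users 1 and 3 each contribute both of their rows.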
I'm fairly new to SQL joins and am probably going for full overkill here.
What I want to do is join my four tables together.
I want all the information from category, matched to the reply with the newest timestamp, and then I want to join in t.title where t.id matches r.thread_id.
SELECT c.*, t.id, t.title, r.timestamp, u.id, u.username
FROM forum_category AS c
LEFT JOIN forum_threads AS t ON (c.id = t.category_id)
LEFT JOIN forum_replies AS r ON (t.id = r.thread_id
AND r.timestamp =
(
SELECT timestamp
FROM forum_replies
ORDER BY timestamp DESC LIMIT 1
))
LEFT JOIN users AS u ON (r.user_id = u.id)
GROUP BY c.id
As it is now, this code seems to work, though I haven't tested it a lot.
However, I need to expand it to check whether t.timestamp is newer than the latest r.timestamp and, if so, join that thread instead, with its t.title, t.timestamp and t.user_id.
That is, the case where a thread is newer than the latest reply.
I know I could make the first post a reply and solve it that way, but I'd rather not do that if it's possible to solve this in the SQL statement.
SQL layout imgur here:
https://imgur.com/a/nCn2a
forum_category:
forum_threads:
forum_replies:
One helpful technique is to use Subqueries to break up the mental logic of what your query is trying to do. Basically, a subquery takes the place of a regular table in any query.
So, first up, we need to get the most recent time stamp in the replies for each thread:
select thread_id, max(timestamp) as LatestReply
from forum_replies
group by thread_id
Let's call this our MostRecentThreadSubquery. So, it would let us do something like:
select * from
forum_threads t
LEFT JOIN
(
select thread_id, max(timestamp) as LatestReply
from forum_replies
group by thread_id
) as MostRecentThreadSubquery
on t.id = MostRecentThreadSubquery.thread_id
Make sense? We're no longer joining the forum_threads table against the forum_replies table - we've made a subquery to help us list the most recent reply for each thread id.
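Here is a small runnable sketch of that first step, using Python's sqlite3 module with an in-memory database. The sample rows and the integer timestamps are assumptions for illustration; the table and column names follow the question (forum_threads.id, forum_replies.thread_id).

```python
import sqlite3

# Hypothetical sample data: thread 1 has two replies, thread 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_threads (id INTEGER PRIMARY KEY, category_id INTEGER,
                                title TEXT, timestamp INTEGER);
    CREATE TABLE forum_replies (id INTEGER PRIMARY KEY, thread_id INTEGER,
                                timestamp INTEGER);
    INSERT INTO forum_threads VALUES (1, 1, 'Welcome', 100), (2, 1, 'Rules', 300);
    INSERT INTO forum_replies VALUES (1, 1, 150), (2, 1, 200);
""")

# The subquery collapses forum_replies to one row per thread before the join.
latest_replies = conn.execute("""
    SELECT t.id, t.title, MostRecentThreadSubquery.LatestReply
    FROM forum_threads t
    LEFT JOIN (
        SELECT thread_id, MAX(timestamp) AS LatestReply
        FROM forum_replies
        GROUP BY thread_id
    ) AS MostRecentThreadSubquery
        ON t.id = MostRecentThreadSubquery.thread_id
    ORDER BY t.id
""").fetchall()

print(latest_replies)
```

Because of the LEFT JOIN, a thread without replies still appears, with a NULL (None in Python) LatestReply.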
Now, we add the SQL CASE statement, to get something like:
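select
t.id as thread_id,
-- a thread with no replies has a NULL LatestReply, so fall back to t.timestamp
CASE WHEN MostRecentThreadSubquery.LatestReply IS NULL
OR t.timestamp > MostRecentThreadSubquery.LatestReply
THEN t.timestamp
ELSE MostRecentThreadSubquery.LatestReply
END as MostRecentTimestamp
from -- ... the rest of that earlier SQL statement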
Okay, so now we've got a query that, for every thread_id, has the most recent timestamp - whether that's from the forum_replies or from the forum_threads table.
... and you guessed it. We're going to make it another subquery. Let's call it our MostRecentPerThread
select *
from forum_category AS c
LEFT JOIN
(
-- ... that previous query, with t.category_id added to its select list ...
) as MostRecentPerThread
on c.id = MostRecentPerThread.category_id
Make sense? You're using subqueries as a way of logically breaking down your query into smaller components. You no longer have one gigantic query. You've got a small subquery that simply gets the timestamp of the most recent reply. You've got a small subquery that compares that first subquery to the threads table to get the most recent timestamp. And you've got a main query that uses the second subquery to merge it with the categories table.
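Putting the pieces together might look like the sketch below, run through Python's sqlite3 module against an in-memory database. The sample data, the name column on forum_category, and the final GROUP BY (which assumes one row per category is wanted) are assumptions for illustration, not something the question confirms.

```python
import sqlite3

# Hypothetical sample data: category 1 has two threads, category 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE forum_threads (id INTEGER PRIMARY KEY, category_id INTEGER,
                                title TEXT, timestamp INTEGER);
    CREATE TABLE forum_replies (id INTEGER PRIMARY KEY, thread_id INTEGER,
                                timestamp INTEGER);
    INSERT INTO forum_category VALUES (1, 'General'), (2, 'Empty');
    INSERT INTO forum_threads VALUES (1, 1, 'Welcome', 100), (2, 1, 'Rules', 300);
    INSERT INTO forum_replies VALUES (1, 1, 150), (2, 1, 200);
""")

last_activity = conn.execute("""
    SELECT c.id, c.name, MAX(MostRecentPerThread.MostRecentTimestamp) AS LastActivity
    FROM forum_category AS c
    LEFT JOIN (
        -- Second subquery: newest of thread timestamp and latest reply, per thread.
        SELECT t.id AS thread_id, t.category_id,
               CASE WHEN r.LatestReply IS NULL OR t.timestamp > r.LatestReply
                    THEN t.timestamp
                    ELSE r.LatestReply
               END AS MostRecentTimestamp
        FROM forum_threads t
        LEFT JOIN (
            -- First subquery: latest reply per thread.
            SELECT thread_id, MAX(timestamp) AS LatestReply
            FROM forum_replies
            GROUP BY thread_id
        ) AS r ON t.id = r.thread_id
    ) AS MostRecentPerThread ON c.id = MostRecentPerThread.category_id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

print(last_activity)
```

In this data, thread 2 has no replies but is newer than the last reply in thread 1, so its own timestamp wins for the General category.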
SQL Fiddle here.
SELECT UserId,totalLikes FROM Users
LEFT JOIN(select ownerId, PostId from Posts) a ON ownerId = UserId
LEFT JOIN(select idOfPost, count(idOfPost) AS totalLikes from Likes) b ON idOfPost = PostId
WHERE UserId = 120 GROUP BY UserId
This is a simplified part of the query that I am using. On the fiddle it works exactly how I need it to: it counts every idOfPost as a like for every post that belongs to the specified user (in this case UserId = 120),
and it groups the result into a single row.
But when I run this in WAMP I get error #1140, "this is incompatible with sql_mode=only_full_group_by", which I think is because I need to group by PostId as well. If I do that, though, I get multiple rows, naturally, because the post IDs are different, and I want the result in a single row.
So my questions are: should I disable sql_mode=only_full_group_by (I'm not really sure what impact that would have), is my table structure at fault and in need of changing (maybe by including the UserId in the Likes table), or is my query at fault?
mysql version 5.7.14 on WAMP
Use GROUP BY in the subquery and sum() aggregate in the main query:
SELECT UserId, sum(totalLikes) AS totalLikes
FROM Users
LEFT JOIN Posts a ON ownerId = UserId
LEFT JOIN (
select idOfPost, count(idOfPost) AS totalLikes
from Likes
group by idOfPost) b ON idOfPost = PostId
WHERE UserId = 120
GROUP BY UserId
SqlFiddle.
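A quick way to check this shape of query is to run it on a tiny data set. The sketch below uses Python's sqlite3 module with an in-memory database; the rows are made up for illustration and SQLite stands in for MySQL here, so it demonstrates the grouping logic rather than the only_full_group_by check itself.

```python
import sqlite3

# Hypothetical sample data: user 120 owns posts 1 and 2, with 2 likes and 1 like.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserId INTEGER PRIMARY KEY);
    CREATE TABLE Posts (PostId INTEGER PRIMARY KEY, ownerId INTEGER);
    CREATE TABLE Likes (idOfPost INTEGER);
    INSERT INTO Users VALUES (120), (121);
    INSERT INTO Posts VALUES (1, 120), (2, 120), (3, 121);
    INSERT INTO Likes VALUES (1), (1), (2), (3), (3), (3);
""")

# Likes are counted per post in the subquery, then summed per user outside it.
total_likes = conn.execute("""
    SELECT UserId, SUM(totalLikes) AS totalLikes
    FROM Users
    LEFT JOIN Posts a ON ownerId = UserId
    LEFT JOIN (
        SELECT idOfPost, COUNT(idOfPost) AS totalLikes
        FROM Likes
        GROUP BY idOfPost
    ) b ON idOfPost = PostId
    WHERE UserId = 120
    GROUP BY UserId
""").fetchall()

print(total_likes)
```

Every selected column is now either in the GROUP BY or inside an aggregate, which is what only_full_group_by requires.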
Been looking into this for awhile. Hoping someone might be able to provide some insight. I have 3 tables. All of which I'm grabbing multiple columns, but the 3rd I need to limit the output to just the most recent timestamp entry, BUT still display multiple columns.
If I have the following data [ Please see SQL Fiddle ]:
http://sqlfiddle.com/#!2/84b91/6
The fiddle is a list of (names) in Table1(users), (job_name,years) in Table2(job), and then (score, timestamp) in Table3(job_details). All linked together by the users id.
I am definitely not great at MySQL. I know I'm missing something, possibly a series of JOINs. I have been able to get Table 1, Table 2 and one column of Table 3 by doing this:
select a.id, a.name, b.job_name, b.years,
(select c.timestamp
from job_details as c
where c.user_id = a.id
order by c.timestamp desc limit 1) score
from users a, job as b where a.id = b.user_id;
At this point, I can get multiple column data on the first two columns, limit the 3rd to one value and sort that value on the last timestamp...
My question is: How does one go about adding a second column to the limit? In the example in the fiddle, I'd like to add the score as well as the timestamp to the output.
I'd like the output to be:
NAME, JOB, YEARS, SCORE, TIMESTAMP. The last two columns would only be the last entry in job_details sorted by the most recent TIMESTAMP.
Please let me know if more information is required! Thank you for your time!
Try this:
select a.id, a.name, b.job_name, b.years, c.timestamp, c.score
from users a
INNER JOIN job as b ON a.id = b.user_id
INNER JOIN (SELECT jd.user_id, jd.timestamp, jd.score
FROM job_details as jd
INNER JOIN (select user_id, MAX(timestamp) as tstamp
from job_details
GROUP BY user_id) as max_ts ON jd.user_id = max_ts.user_id
AND jd.timestamp = max_ts.tstamp
) as c ON a.id = c.user_id
;
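To see the pattern at work, here is a minimal sketch using Python's sqlite3 module and an in-memory database. The sample rows and the text-formatted timestamps are assumptions for illustration; the table and column names follow the question, and the select list is reordered to the NAME, JOB, YEARS, SCORE, TIMESTAMP layout the question asks for.

```python
import sqlite3

# Hypothetical sample data: user 1 has two job_details rows, user 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE job (user_id INTEGER, job_name TEXT, years INTEGER);
    CREATE TABLE job_details (user_id INTEGER, score INTEGER, timestamp TEXT);
    INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO job VALUES (1, 'Welder', 3), (2, 'Painter', 5);
    INSERT INTO job_details VALUES
        (1, 70, '2014-01-01'), (1, 85, '2014-02-01'), (2, 60, '2014-01-15');
""")

# max_ts finds each user's newest timestamp; joining back on (user_id, timestamp)
# recovers the whole matching row, so any number of its columns can be selected.
latest_details = conn.execute("""
    SELECT a.name, b.job_name, b.years, c.score, c.timestamp
    FROM users a
    INNER JOIN job AS b ON a.id = b.user_id
    INNER JOIN (
        SELECT jd.user_id, jd.timestamp, jd.score
        FROM job_details AS jd
        INNER JOIN (
            SELECT user_id, MAX(timestamp) AS tstamp
            FROM job_details
            GROUP BY user_id
        ) AS max_ts ON jd.user_id = max_ts.user_id
                   AND jd.timestamp = max_ts.tstamp
    ) AS c ON a.id = c.user_id
    ORDER BY a.id
""").fetchall()

print(latest_details)
```

Note that if two job_details rows for the same user share the newest timestamp, both are returned.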
I have three tables: users, groups and relation.
Table users with fields: usrID, usrName, usrPass, usrPts
Table groups with fields: grpID, grpName, grpMinPts
Table relation with fields: uID, gID
A user can be placed in a group in two ways:
if he has collected the group's minimum number of points (users.usrPts > groups.grpMinPts ORDER BY groups.grpMinPts DESC LIMIT 1)
if his relation to the group has been added manually in the relation table (user ID stored as uID and group ID stored as gID)
Can I create one single query that determines, for every user (or for one specific user), which group he belongs to, where a manual relation (from the relation table) has higher priority than comparing usrPts with grpMinPts? Also, I do not want a user to be shown twice (once with his group by points and once with his related group).
Thanks in advance! :) I tried:
SELECT * FROM users LEFT JOIN (relation LEFT JOIN groups ON relation.gID = groups.grpID) ON users.usrID = relation.uID
Using this I managed to extract the specified relations (from the relation table), but I have no idea how to include user points while respecting the priority mentioned above. I know how to do this with a few separate queries in PHP, which is simple, but I am curious whether it can be done in one single query.
EDIT TO ADD:
Thanks to the really educational COALESCE technique that #GordonLinoff provided, I managed to get this query working as I expected. So, here it goes:
SELECT o.usrID, o.usrName, o.usrPass, o.usrPts, t.grpID, t.grpName
FROM (
SELECT u.*, COALESCE(relationgroupid,groupid) AS thegroupid
FROM (
SELECT u.*, (
SELECT grpID
FROM groups g
WHERE u.usrPts > g.grpMinPts
ORDER BY g.grpMinPts DESC
LIMIT 1
) AS groupid, (
SELECT gID
FROM relation r
WHERE r.uID = u.usrID
) AS relationgroupid
FROM users u
)u
)o
JOIN groups t ON t.grpID = o.thegroupid
Also, if you are wondering, as I did, whether this approach is faster or slower than running three queries and processing the results in PHP: it is slightly faster. Executing this query and showing the results on a web page takes 14 ms on average. Three simple queries, processed in PHP and shown on a web page, took 21 ms. The averages are based on 10 runs each, and the execution time was practically constant across runs.
Here is an approach that uses correlated subqueries to get each of the values. It then chooses the appropriate one using the precedence rule: if a relation exists, use that one; otherwise use the one from the groups table:
select u.*,
coalesce(relationgroupid, groupid) as thegroupid
from (select u.*,
(select grpid from groups g where u.usrPts > g.grpMinPts order by g.grpMinPts desc limit 1
) as groupid,
(select gID from relation r where r.uID = u.usrID
) as relationgroupid
from users u
) u
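The precedence rule can be tried out with a small sketch like the one below, which uses Python's sqlite3 module and an in-memory database. The sample users and groups are invented for illustration, the table name is quoted as "groups" only as a precaution against keyword clashes, and the LIMIT 1 on the relation lookup assumes a user has at most one manual relation that matters.

```python
import sqlite3

# Hypothetical sample data: bob has a manual relation to the Staff group.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (usrID INTEGER PRIMARY KEY, usrName TEXT, usrPts INTEGER);
    CREATE TABLE "groups" (grpID INTEGER PRIMARY KEY, grpName TEXT, grpMinPts INTEGER);
    CREATE TABLE relation (uID INTEGER, gID INTEGER);
    INSERT INTO users VALUES (1, 'ann', 150), (2, 'bob', 50), (3, 'cid', 500);
    INSERT INTO "groups" VALUES
        (1, 'Bronze', 0), (2, 'Silver', 100), (3, 'Gold', 400), (4, 'Staff', 100000);
    INSERT INTO relation VALUES (2, 4);
""")

# COALESCE returns its first non-NULL argument, so the manual relation wins
# whenever it exists and the points-based group is only the fallback.
user_groups = conn.execute("""
    SELECT picked.usrName, g.grpName
    FROM (
        SELECT u.usrID, u.usrName,
               COALESCE(
                   (SELECT r.gID FROM relation r WHERE r.uID = u.usrID LIMIT 1),
                   (SELECT g.grpID FROM "groups" g
                    WHERE u.usrPts > g.grpMinPts
                    ORDER BY g.grpMinPts DESC LIMIT 1)
               ) AS thegroupid
        FROM users u
    ) AS picked
    JOIN "groups" g ON g.grpID = picked.thegroupid
    ORDER BY picked.usrID
""").fetchall()

print(user_groups)
```

Each user appears exactly once: ann and cid get their group from points, while bob gets Staff from the relation table even though his points alone would only qualify him for Bronze.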
Try something like this
select u.usrName, g.grpName
from groups g
join relation r on r.gID = g.grpID
join users u on u.usrID = r.uID
union
select u.usrName, g1.grpName
from groups g1
join groups g2 on g2.grpMinPts > g1.grpMinPts
join users u on u.usrPts between g1.grpMinPts and g2.grpMinPts
I want to select the currently active record based on a date. I need to create a view, so a 'select everything' subquery is not going to work.
Right now, though, I can only select one column, because it says
Operand should contain 1 column(s)
I could duplicate the select for every field I want to get, but that is going to be performance-intensive, and I have a lot of columns in the history table.
For this simple example (link below), let's say that I need to get the phone number as well. Any idea how I should go about doing it? Thanks.
SQL Fiddler:
http://www.sqlfiddle.com/#!2/1ff7e/1
I think I got it! Seems to work but need more records to try.
SQL Fiddler:
http://www.sqlfiddle.com/#!2/1ff7e/3
Still not working... I added some more test data and it turned out it didn't work!
SQL Fiddler:
http://www.sqlfiddle.com/#!2/360c1/1
My bad. IT IS WORKING! I inserted two duplicate primary keys for the history. Thanks all!!!!
SQL Fiddler:
http://www.sqlfiddle.com/#!2/274c5/1
You can add another subquery to the select clause:
SELECT user.username, user.password,
(SELECT uh.name FROM user_history uh WHERE uh.user_id = user.user_id AND uh.effective_date <= '2013-04-18' ORDER BY uh.effective_date DESC LIMIT 1) AS name,
(SELECT uh.phone_number FROM user_history uh WHERE uh.user_id = user.user_id AND uh.effective_date <= '2013-04-18' ORDER BY uh.effective_date DESC LIMIT 1) AS phone_number
FROM user
ORDER BY username;
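If repeating one subquery per column becomes unwieldy, one possible alternative (not the approach above, and not tested against the linked fiddle) is to join user_history once and use a single-column subquery in the WHERE clause to pick the effective row. The sketch below tries that with Python's sqlite3 module and an in-memory database; the sample rows are invented, the password column is left out for brevity, and the table name is quoted as "user" only as a precaution.

```python
import sqlite3

# Hypothetical sample data: alice has three history rows, bob has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE user_history (user_id INTEGER, effective_date TEXT,
                               name TEXT, phone_number TEXT);
    INSERT INTO "user" VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO user_history VALUES
        (1, '2013-01-01', 'Alice A', '555-0001'),
        (1, '2013-04-01', 'Alice B', '555-0002'),
        (1, '2013-06-01', 'Alice C', '555-0003'),
        (2, '2013-02-01', 'Bob A',   '555-0100');
""")

as_of = "2013-04-18"  # ISO-formatted dates compare correctly as text

# The correlated subquery returns one column (the newest effective_date on or
# before the cutoff); the join then exposes every column of that history row.
active_records = conn.execute("""
    SELECT u.username, uh.name, uh.phone_number
    FROM "user" u
    JOIN user_history uh ON uh.user_id = u.user_id
    WHERE uh.effective_date = (
        SELECT MAX(h.effective_date)
        FROM user_history h
        WHERE h.user_id = u.user_id AND h.effective_date <= ?
    )
    ORDER BY u.username
""", (as_of,)).fetchall()

print(active_records)
```

Unlike the scalar subqueries in the select clause, this inner join drops users who have no history row on or before the cutoff date, so a LEFT JOIN with the condition moved into the ON clause would be needed to keep them.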