Join 2 MySQL tables and get the first and last date

I have 2 MySQL tables, one with the users' details and the second with all the pages that the users viewed (1:N).
TABLE "users"
id int(10) auto_increment primary
ip varchar(15)
lang char(2)
...
TABLE "pages"
id int(10) auto_increment primary
uid int(10) index
datetime datetime
url varchar(255)
I know it is possible to join the 2 tables, but I'm a little confused about how to get the first and last datetime, and the first url, from the "pages" table...
SELECT * FROM users, pages WHERE users.id = pages.uid
I think it involves GROUP BY with MIN(pages.datetime) and MAX(pages.datetime), but I have no idea where to use them, or how to get the first pages.url.

As you mentioned, you need to use GROUP BY with the MIN and MAX aggregate functions to find the first and last datetime per user.
Also, don't use the comma-separated join syntax, which is quite old and hard to read; use the explicit INNER JOIN syntax instead.
SELECT U.id,
MIN(P.datetime) AS First_date,
MAX(P.datetime) AS Last_date
FROM users U
INNER JOIN pages P
ON U.id = P.uid
GROUP BY U.id
If you want to see other information, such as the first visited url, you can join the result above back to the pages table on the user id and the matching datetime.
SELECT A.uid, A.url AS First_URL, C.url AS Last_URL, B.First_date, B.Last_date
FROM pages A
INNER JOIN
(
SELECT U.id,
MIN(P.datetime) AS First_date,
MAX(P.datetime) AS Last_date
FROM users U
INNER JOIN pages P
ON U.id = P.uid
GROUP BY U.id
) B
ON A.uid = B.id
AND A.datetime = B.First_date
INNER JOIN pages C
ON C.uid = B.id
AND C.datetime = B.Last_date
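To check the join-back approach end to end, here is a minimal sketch that runs it against a few made-up rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is runnable without a server), and the sample data is invented. Since users.id equals pages.uid, the inner query groups pages by uid directly.

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names follow the question,
# the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, ip TEXT, lang TEXT);
CREATE TABLE pages (id INTEGER PRIMARY KEY, uid INTEGER, datetime TEXT, url TEXT);
INSERT INTO users VALUES (1, '10.0.0.1', 'en'), (2, '10.0.0.2', 'it');
INSERT INTO pages (uid, datetime, url) VALUES
  (1, '2015-01-01 10:00:00', '/home'),
  (1, '2015-01-01 10:05:00', '/about'),
  (1, '2015-01-02 09:00:00', '/contact'),
  (2, '2015-01-03 12:00:00', '/pricing');
""")

query = """
SELECT b.uid, a.url AS first_url, c.url AS last_url, b.first_date, b.last_date
FROM (
    -- first and last visit per user
    SELECT uid, MIN(datetime) AS first_date, MAX(datetime) AS last_date
    FROM pages
    GROUP BY uid
) AS b
-- join back to pages to pick up the url of the first and of the last visit
INNER JOIN pages AS a ON a.uid = b.uid AND a.datetime = b.first_date
INNER JOIN pages AS c ON c.uid = b.uid AND c.datetime = b.last_date
ORDER BY b.uid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that if a user has two page views with exactly the same first (or last) datetime, the join-back returns one row per tie.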

Related

Using an SQL LEFT JOIN with the MAX() and MIN() functions

Let's assume I have the following two tables:
CREATE TABLE users (
id MEDIUMINT NOT NULL AUTO_INCREMENT,
name CHAR(30) NOT NULL,
PRIMARY KEY (id)
) ENGINE=MyISAM;
CREATE TABLE logins (
user_id MEDIUMINT NOT NULL, -- type missing in the original; assumed to match users.id
day DATE NOT NULL,
PRIMARY KEY (`user_id`, `day`)
) ENGINE=MyISAM;
What I'm trying to do here is get a query for all users with the first day they logged in and the last day they logged in. The query I was executing to achieve this looks like the following:
SELECT u.id AS id, u.name AS name, MIN(l.day) AS first_login,
MAX(l.day) AS last_login
FROM users u
LEFT JOIN logins l ON u.id = l.user_id
The problem is that because of the use of MIN() and MAX(), I'm only receiving one row back in the entire result. I'm sure it's my use of those functions that's causing this. I should have one row per user, even if they do not have any login entries. This is the reason for me using a LEFT JOIN vs an INNER JOIN.
To get one aggregate value (MIN, MAX, ...) per user instead of one for the whole result set, you need grouping. Try something like this:
SELECT u.id AS id, u.name AS name, MIN(l.day) AS first_login, MAX(l.day) AS last_login
FROM users u
LEFT JOIN logins l ON u.id = l.user_id
GROUP BY u.id
Most databases other than MySQL would have raised an error on mixing non-aggregated columns with aggregate functions, which would have made the mistake obvious. MySQL, unfortunately, allows this (unless the ONLY_FULL_GROUP_BY SQL mode is enabled), making it harder to notice that you forgot the GROUP BY clause needed to generate a row per user:
SELECT u.id AS id,
u.name AS name,
MIN(l.day) AS first_login,
MAX(l.day) AS last_login
FROM users u
LEFT JOIN logins l ON u.id = l.user_id
GROUP BY u.id, u.name -- missing in the original query
Grouping can be avoided here.
Use correlated subqueries in the SELECT list instead, e.g.:
SELECT
u.id AS id,
u.name AS name,
(
SELECT MIN(logins.day) FROM logins WHERE logins.user_id=u.id
) AS first_login,
(
SELECT MAX(logins.day) FROM logins WHERE logins.user_id=u.id
) AS last_login
FROM users u;
MIN and MAX are aggregate functions.
You should use GROUP BY with some field from u, like id.
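The difference the GROUP BY makes can be seen with a small runnable sketch. It uses Python's built-in sqlite3 module in place of MySQL (an assumption for the sake of a self-contained example; SQLite, like MySQL without ONLY_FULL_GROUP_BY, accepts the ungrouped query), and the users and logins rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE logins (user_id INTEGER NOT NULL, day TEXT NOT NULL,
                     PRIMARY KEY (user_id, day));
INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
INSERT INTO logins VALUES (1, '2011-03-01'), (1, '2011-03-09'), (2, '2011-03-05');
""")

base = """
SELECT u.id, u.name, MIN(l.day) AS first_login, MAX(l.day) AS last_login
FROM users u
LEFT JOIN logins l ON u.id = l.user_id
"""

# Without GROUP BY the aggregates collapse the whole result into a single row.
ungrouped = conn.execute(base).fetchall()

# With GROUP BY there is one row per user; 'cat' has no logins and gets NULLs.
grouped = conn.execute(base + "GROUP BY u.id, u.name ORDER BY u.id").fetchall()

print(len(ungrouped))
for row in grouped:
    print(row)
```

The LEFT JOIN is what keeps the user without logins in the grouped result; with an INNER JOIN that row would disappear.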

Find unique values that do not exist in multiple columns and tables

A misconfigured manual import imported our entire AD into our help desk user database, creating a bunch of extraneous/duplicate accounts. Of course, no backup to restore from.
To facilitate the cleanup, I want to run a query that will find users not currently linked to any current or archived tickets. I have three tables: USER, HD_TICKET, and HD_ARCHIVE_TICKET. I want to compare the ID field in USER to the OWNER_ID and SUBMITTER_ID fields in the other two tables, returning only the values in USER.ID that do not exist in any of the other four columns.
How can this be accomplished?
Do a left join for each relationship where the right table id is null:
select user.*
from user
left join hd_ticket on user.id = hd_ticket.owner_id
left join hd_ticket as hd_ticket2 on user.id = hd_ticket2.submitter_id
left join hd_archive_ticket on user.id = hd_archive_ticket.owner_id
left join hd_archive_ticket as hd_archive_ticket2 on user.id = hd_archive_ticket2.submitter_id
where hd_ticket.owner_id is null
and hd_ticket2.submitter_id is null
and hd_archive_ticket.owner_id is null
and hd_archive_ticket2.submitter_id is null
How about something like:
SELECT id
FROM user
WHERE id NOT IN
(
SELECT owner_id
FROM hd_ticket
UNION ALL
SELECT submitter_id
FROM hd_ticket
UNION ALL
SELECT owner_id
FROM hd_archive_ticket
UNION ALL
SELECT submitter_id
FROM hd_archive_ticket
)
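One caveat with NOT IN: if any of the four columns can contain NULL (an assumption here, for example a ticket with no owner yet), the NOT IN test never evaluates to true and the query returns no rows at all. A NOT EXISTS version does not have this problem. The sketch below shows both on invented data, using Python's built-in sqlite3 module as a stand-in for MySQL; the NULL handling is the same standard SQL behaviour.

```python
import sqlite3

# SQLite stands in for MySQL; rows are invented. Note the NULL owner_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY);
CREATE TABLE hd_ticket (id INTEGER PRIMARY KEY, owner_id INTEGER, submitter_id INTEGER);
CREATE TABLE hd_archive_ticket (id INTEGER PRIMARY KEY, owner_id INTEGER, submitter_id INTEGER);
INSERT INTO user VALUES (1), (2), (3), (4);
INSERT INTO hd_ticket (owner_id, submitter_id) VALUES (1, 2), (NULL, 2);
INSERT INTO hd_archive_ticket (owner_id, submitter_id) VALUES (3, 1);
""")

not_in_rows = conn.execute("""
SELECT id FROM user
WHERE id NOT IN (
    SELECT owner_id FROM hd_ticket
    UNION ALL SELECT submitter_id FROM hd_ticket
    UNION ALL SELECT owner_id FROM hd_archive_ticket
    UNION ALL SELECT submitter_id FROM hd_archive_ticket
)
""").fetchall()

not_exists_rows = conn.execute("""
SELECT u.id FROM user u
WHERE NOT EXISTS (SELECT 1 FROM hd_ticket t
                  WHERE t.owner_id = u.id OR t.submitter_id = u.id)
  AND NOT EXISTS (SELECT 1 FROM hd_archive_ticket a
                  WHERE a.owner_id = u.id OR a.submitter_id = u.id)
""").fetchall()

print(not_in_rows)      # user 4 is lost because of the NULL owner_id
print(not_exists_rows)  # user 4 is the only one not linked to any ticket
```

If the columns are declared NOT NULL, or each subquery adds a WHERE ... IS NOT NULL filter, the NOT IN version gives the same result as NOT EXISTS.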
If I understood your situation, I would do this:
SELECT a.id FROM user a, hd_ticket b, hd_archive_ticket c WHERE a.id != b.id AND a.id != c.id
You could try something like the query below. The inner query inner-joins USER with the other two tables, so it returns only those user IDs that exist in all three tables. The outer query then filters out the IDs returned by the inner query, since your goal is to get only the USER IDs that are not present in the other tables.
select ID
FROM USER
WHERE ID NOT IN
(
select u.ID
from user u
inner join HD_TICKET h on u.ID = h.OWNER_ID
inner join HD_ARCHIVE_TICKET ha on u.ID = ha.SUBMITTER_ID
)

mysql complicated join

I have run into some troubles while writing a query for MySQL. I don't know how to describe my problem well enough to search the web for it, so sorry if my question is stupid.
I have 4 tables:
CREATE TABLE posts( id INT, author INT );
CREATE TABLE users( id INT, nick varchar(64) );
CREATE TABLE groups( id INT, name varchar(64) );
CREATE TABLE membership (user INT, group INT, date INT ) ;
Membership contains info about users that have joined some groups. "Date" in the membership table is the time when a user joined that group.
I need a query which will return a post, its author's nick and the name of the group with the least joining date.
All I have currently is:
SELECT p.id, u.nick, g.name
FROM posts AS p
LEFT JOIN users AS u ON u.id = p.author
LEFT JOIN membership AS m ON m.user = p.author
LEFT JOIN groups AS g ON g.id = m.group
WHERE 1;
but of course it returns a random group's name, not the one with earliest joining date.
I also tried the following variant:
SELECT p.id, u.nick, g.name
FROM posts AS p
LEFT JOIN users AS u ON u.id = p.author
LEFT JOIN
(SELECT * FROM membership WHERE 1 ORDER BY date ASC)
AS m ON m.user = p.author
LEFT JOIN groups AS g ON g.id = m.group
WHERE 1;
but it gave me same result.
I would appreciate even pointers to where I could start, because at the moment I have no idea what to do with it.
I'm not sure why you want this, but if you want the information for the earliest membership date (there's no date on the posts themselves), that's no problem. A single overall earliest membership would always point to the same one person, since you're not asking about a specific group, so I've written the query to find the earliest member PER group. That user can then be linked to the posts table via the author column. Note that if someone has 20 posts under their name, all 20 are returned; you'd need to decide whether you only want the FIRST post ID for that author.
Just copying from your supplied tables as a reference...
posts: id (int), author(int)
users: id (int), nick (varchar)
groups: id (int), name (varchar)
membership: user (int), group (int), date (int)
select
u1.nick,
m2.date,
g1.name,
p1.id as PostID
from
( select m.group,
min( m.date ) as EarliestMembershipSignup
from
Membership m
group by
m.group ) EarliestPerGroup
join Membership m2
on EarliestPerGroup.Group = m2.Group
AND EarliestPerGroup.EarliestMembershipSignup = m2.Date
join groups g1
on m2.group = g1.id
join users u1
on m2.user = u1.ID
join posts p1
on u1.id = p1.author
Something like this
SELECT p.id, u.nick, g.name
FROM posts p,
users u,
membership m,
groups g
WHERE p.author = u.id
AND m.user = u.id
AND m.group = g.id
ORDER BY m.date ASC
LIMIT 1;
Take care to have good indexes when joining these 4 tables.
I'd recommend moving your date column from the membership table into your groups table since that seems to be where you're tracking that information. The membership table is just an intersection table for the many-to-many users<->groups tables. It should only contain user ID and the group ID columns.
What about this?
SELECT p.id, u.nick, g.name
FROM
users u,
posts p,
groups g
INNER JOIN membership m
ON u.id = m.user
INNER JOIN groups
ON m.group = groups.id
ORDER BY g.timestamp DESC
LIMIT 1;
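For the question as originally asked (every post, its author's nick, and the group that author joined earliest), one possible sketch is to compute the earliest membership date per user in a derived table and join back to membership on that date. The example below runs on invented rows using Python's built-in sqlite3 module as a stand-in for MySQL. The identifiers user, group, date and groups are quoted with backticks because some of them are reserved words; SQLite accepts backtick quoting too.

```python
import sqlite3

# SQLite stands in for MySQL; schema follows the question, rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INT, author INT);
CREATE TABLE users (id INT, nick VARCHAR(64));
CREATE TABLE `groups` (id INT, name VARCHAR(64));
CREATE TABLE membership (`user` INT, `group` INT, `date` INT);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
INSERT INTO `groups` VALUES (10, 'admins'), (20, 'editors'), (30, 'guests');
INSERT INTO membership VALUES (1, 20, 100), (1, 10, 50), (2, 30, 70);
INSERT INTO posts VALUES (1, 1), (2, 2), (3, 1), (4, 3);
""")

query = """
SELECT p.id, u.nick, g.name
FROM posts AS p
LEFT JOIN users AS u ON u.id = p.author
LEFT JOIN (
    -- earliest joining date per user
    SELECT `user`, MIN(`date`) AS first_date
    FROM membership
    GROUP BY `user`
) AS e ON e.`user` = p.author
-- the membership row that carries that earliest date
LEFT JOIN membership AS m ON m.`user` = e.`user` AND m.`date` = e.first_date
LEFT JOIN `groups` AS g ON g.id = m.`group`
ORDER BY p.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

An ORDER BY inside a joined subquery, as tried in the question, does not control which row the join picks; the explicit MIN per user does. Authors with no membership (cat in the sample data) still appear, with a NULL group name.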

Somewhat Complex MySQL Statement

I am creating a forum, and have gotten stuck creating the page that will display all the topics for a given forum. The three relevant tables & fields are structured as follows:
Table: forums_topics Table: forums_posts Table: users
-------------------- ------------------- ------------
int id int id int id
int forum_id int topic_id varchar name
int creator int poster
tinyint sticky varchar subject
timestamp posted_on
I've started with the following SQL:
SELECT t.id,
t.sticky,
u.name AS creator,
p.subject,
COUNT(p.id) AS posts,
MAX(p.posted_on) AS last_post
FROM forums_topics AS t
JOIN users AS u
LEFT JOIN forums_posts AS p ON p.topic_id = t.id
WHERE t.forum_id = 1
AND u.id = t.creator
GROUP BY t.id
ORDER BY t.sticky
This appears to be getting me what I want (the topic's id number, whether it's a sticky, who made the topic, the subject of the topic, the number of posts for each topic, and the timestamp of the latest post). If there is a mistake, though, please let me know.
What I am having trouble with now is how to extend this to get the name of the latest poster. Can someone explain how I would edit my SQL to do that? I can provide more details if needed, or restructure my tables if that will make it simpler.
Here is a simple way to do this:
SELECT t.id,
t.sticky,
u.name AS creator,
p.subject,
COUNT(p.id) AS posts,
MAX(p.posted_on) AS last_post,
(SELECT name FROM users
JOIN forums_posts ON forums_posts.poster = users.id
WHERE forums_posts.id = MAX(p.id)) AS LastPoster
FROM forums_topics AS t
JOIN users AS u
LEFT JOIN forums_posts AS p ON p.topic_id = t.id
WHERE t.forum_id = 1
AND u.id = t.creator
GROUP BY t.id
ORDER BY t.sticky
Basically, you do a sub-query to find the user based upon the max id. If your IDs are GUIDs or are not in order for some other reason, you could do the lookup based upon the posted_on timestamp instead.
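The timestamp-based lookup mentioned above could look roughly like the following sketch: aggregate the posts per topic in a derived table, then join back to forums_posts on the latest posted_on to find the poster. It runs on invented rows with Python's built-in sqlite3 module standing in for MySQL; the derived-table layout and the DESC ordering on sticky are choices made for this example, not part of the original query.

```python
import sqlite3

# SQLite stands in for MySQL; schema follows the question, rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE forums_topics (id INTEGER PRIMARY KEY, forum_id INT, creator INT, sticky INT);
CREATE TABLE forums_posts (id INTEGER PRIMARY KEY, topic_id INT, poster INT,
                           subject TEXT, posted_on TEXT);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
INSERT INTO forums_topics VALUES (1, 1, 1, 0), (2, 1, 2, 1), (3, 2, 3, 0);
INSERT INTO forums_posts VALUES
  (1, 1, 1, 'Hello',     '2010-05-01 10:00:00'),
  (2, 1, 2, 'Re: Hello', '2010-05-02 11:00:00'),
  (3, 2, 2, 'Rules',     '2010-04-01 09:00:00'),
  (4, 1, 3, 'Re: Hello', '2010-05-03 08:00:00'),
  (5, 3, 3, 'Other',     '2010-05-04 08:00:00');
""")

query = """
SELECT t.id, t.sticky, c.name AS creator, s.posts, s.last_post,
       lp.name AS last_poster
FROM forums_topics AS t
JOIN users AS c ON c.id = t.creator
LEFT JOIN (
    -- post count and latest timestamp per topic
    SELECT topic_id, COUNT(*) AS posts, MAX(posted_on) AS last_post
    FROM forums_posts
    GROUP BY topic_id
) AS s ON s.topic_id = t.id
-- the post that carries the latest timestamp, and its poster
LEFT JOIN forums_posts AS p ON p.topic_id = s.topic_id AND p.posted_on = s.last_post
LEFT JOIN users AS lp ON lp.id = p.poster
WHERE t.forum_id = 1
ORDER BY t.sticky DESC, t.id  -- stickies first (ordering chosen for this example)
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two posts in the same topic share the exact same posted_on value, this returns one row per tie, so the id-based lookup is the safer choice when ids are sequential.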

Finding data that's missing from the database

I need to figure out some clever MySQL snippet that will let me easily see two tables side by side, showing the ids from each table where they exist and NULL or empty where they don't.
I have a users table and a legacy table and outside of manual comparison I can't figure out how to make them appear in a table together so I can compare. What I would love to see is something like this:
+----------------------------+
| user_id | email | uid |
| 14 | me#me.com | 26 |
| 16 | ug#ug.com | NULL |
+----------------------------+
I know there's a way to include NULL or empty values but I'm not sure what it is. Here's my deranged SQL query so far, yes, I know it's horrible to do subselects inside of subselects:
select uid from users where mail IN (
select email from legacy_users where id NOT IN (
select sourceid from migrate_map_users
)
);
There are three tables involved here, legacy_users => migrate_map_users => users. The middle is just an m2m which joins the two. legacy_users and users both have an email column. and their own version of an id.
Thank you all!
You need to learn about join types, in particular left and outer joins:
SELECT u.uid, u.mail, lu.id
FROM users u
LEFT OUTER JOIN legacy_users lu
ON u.mail = lu.email
WHERE lu.id NOT IN
(
SELECT sourceid
FROM migrate_map_users
);
The LEFT OUTER JOIN will make sure all records in the LEFT table will be returned, whether there is a corresponding one in the right one or not.
??
select u.uid, u.mail, l.email, l.id
from users u
left outer join legacy_users l
on u.mail = l.email;
-- two queries to get you going
select u.uid, u.mail, l.email, l.id
from users u
left outer join legacy_users l
on u.mail = l.email
where l.id is null;
select l.email, l.id, u.uid, u.mail
from legacy_users l
left outer join users u
on l.email = u.mail
where u.uid is null;
Thanks to Oded's answer this is what I ended up with:
SELECT *
FROM (
SELECT id, mail, uid
FROM users
LEFT OUTER JOIN
legacy_users lu ON users.mail = lu.email
UNION DISTINCT
SELECT id, email, uid
FROM users
RIGHT OUTER JOIN
legacy_users lu ON users.mail = lu.email
) j
WHERE uid IS NULL
OR id IS NULL;
This also allowed me to do a where on the results. Bonus.
Note that it's using mail in the left join and email in the right join. Since mail wouldn't exist in the right outer join we have to use the email column from legacy_users and vice versa.
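The same idea (a full outer join emulated with a UNION of two outer joins) can be tried out with the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL and invented sample rows. The second half is written as a LEFT JOIN with the tables swapped instead of a RIGHT JOIN, which is equivalent and also runs on older SQLite versions that lack RIGHT JOIN.

```python
import sqlite3

# SQLite stands in for MySQL; column names follow the question, rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (uid INTEGER PRIMARY KEY, mail TEXT);
CREATE TABLE legacy_users (id INTEGER PRIMARY KEY, email TEXT);
INSERT INTO users VALUES (26, 'me@example.com'), (27, 'new@example.com');
INSERT INTO legacy_users VALUES (14, 'me@example.com'), (16, 'ug@example.com');
""")

query = """
-- every legacy user, with the matching new user if there is one
SELECT lu.id AS user_id, lu.email AS email, u.uid AS uid
FROM legacy_users lu
LEFT JOIN users u ON u.mail = lu.email
UNION
-- every new user, with the matching legacy user if there is one
SELECT lu.id, u.mail, u.uid
FROM users u
LEFT JOIN legacy_users lu ON lu.email = u.mail
ORDER BY email
"""
rows = conn.execute(query).fetchall()

# Rows where either side is missing are the ones that need attention.
unmatched = [r for r in rows if r[0] is None or r[2] is None]

for row in rows:
    print(row)
```

UNION (as opposed to UNION ALL) removes the duplicate copies of the matched rows that both halves produce.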