SQL JOIN with conditions - mysql

I have a conversation table which contains two user ids as foreign keys, and a user table which contains the users' details. I want to write a query which returns the conversation table joined to the user table, but displaying the name and surname of the user whose id wasn't sent as the parameter.
CREATE TABLE `conversation` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_one_id` int(11) NOT NULL,
`user_two_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
)
CREATE TABLE `user` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`surname` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
)
For example I have
Conversation:
id user_one_id user_two_id
1 1 2
User:
id name surname
1 userone_name userone_surname
2 usertwo_name usertwo_surname
I want a query that will return user_two's name and surname in the join, not user one.
My current query:
SELECT c.id, c.user_one_id, c.user_two_id, u.name, u.surname * FROM conversation c
JOIN user u
WHERE c.user_one_id = 1
OR c.user_two_id = 1
AND IF (c.user_one_id = u.id, c.user_two_id = u.id, c.user_one_id = u.id)
GROUP BY c.id
ORDER BY c.date DESC;

[INNER] JOIN should have an ON clause. (I consider it a flaw that MySQL allows you to omit it.)
The join criteria would have to be: Give me the user of the conversation that is not user 1.
SELECT c.id, c.user_one_id, c.user_two_id, u.name, u.surname
FROM conversation c
JOIN user u ON u.id IN (c.user_one_id, c.user_two_id) AND u.id <> 1
WHERE c.user_one_id = 1 OR c.user_two_id = 1
ORDER BY c.date DESC;
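As a rough check of the join condition above, here is a minimal sketch that runs the same idea against an in-memory SQLite database through Python's sqlite3 module. SQLite only stands in for MySQL here, the helper name other_participants is made up for illustration, and the date column is left out because it is not part of the posted schema.

```python
import sqlite3

def other_participants(conn, me):
    """Return (conversation id, name, surname) of the user who is NOT `me`."""
    sql = """
        SELECT c.id, u.name, u.surname
        FROM conversation c
        JOIN user u
          ON u.id IN (c.user_one_id, c.user_two_id) AND u.id <> :me
        WHERE c.user_one_id = :me OR c.user_two_id = :me
        ORDER BY c.id
    """
    return conn.execute(sql, {"me": me}).fetchall()

# Sample data copied from the question (SQLite types, not the MySQL DDL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, surname TEXT);
    CREATE TABLE conversation (
        id INTEGER PRIMARY KEY,
        user_one_id INTEGER NOT NULL,
        user_two_id INTEGER NOT NULL
    );
    INSERT INTO user VALUES (1, 'userone_name', 'userone_surname'),
                            (2, 'usertwo_name', 'usertwo_surname');
    INSERT INTO conversation VALUES (1, 1, 2);
""")

rows = other_participants(conn, 1)
print(rows)
```

Passing the id as a bound parameter instead of hard-coding 1 keeps the same query usable for any user.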

Related

Optimizing a query for loading message history in a chat app

I have 2 tables, which are a users table, and a messages table
`users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(35) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `username` (`username`)
) ENGINE=MyISAM AUTO_INCREMENT=859312 DEFAULT CHARSET=utf8
`messages` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`sender_id` int(11) NOT NULL,
`receiver_id` int(11) NOT NULL,
`message` varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `by_sender_id_and_receiver_id` (`sender_id`,`receiver_id`),
KEY `by_sender_id` (`sender_id`),
KEY `by_receiver_id` (`receiver_id`)
) ENGINE=MyISAM AUTO_INCREMENT=56762871 DEFAULT CHARSET=latin1
When a user (whose user id is 108) loads their chat history, I am currently using the following query to list all the people that user has messaged, ordered by most recent.
SELECT u.username, m.sender_id, m.receiver_id, m.date
FROM messages m
JOIN users u ON ( u.id = m.sender_id
AND m.receiver_id = 108
OR u.id = m.receiver_id
AND m.sender_id = 108 )
GROUP BY u.id
ORDER BY m.date DESC
When I use EXPLAIN, I get the following results
I am wondering if there are any obvious ways to optimize this query, whether it is by altering indexes or rewriting the query itself. My messages table has over 50 million rows.
(from Comment) The GROUP BY is to only select the last message from each user.
The OR criterion in the join is a real performance killer. One way to work around this would be to phrase the query using a union:
SELECT u.username, m.sender_id, m.receiver_id, m.date
FROM messages m
INNER JOIN users u ON u.id = m.receiver_id
WHERE m.sender_id = 108
UNION ALL
SELECT u.username, m.receiver_id, m.sender_id, m.date
FROM messages m
INNER JOIN users u ON u.id = m.sender_id
WHERE m.receiver_id = 108;
The above query can be optimized by adding the following indices to the messages table:
CREATE INDEX msg_idx_1 ON messages (sender_id, receiver_id, date);
CREATE INDEX msg_idx_2 ON messages (receiver_id, sender_id, date);
These indices should speed up the lookups in the two halves of the union query above.
Note that I dropped the GROUP BY clause, which did not appear to be doing anything useful.
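If the goal really is one row per contact with the date of the most recent message (as the comment in the question suggests), one possible variation is to wrap the union in a derived table and aggregate once on the outside. The sketch below is an assumption about the intent, not the answerer's query; it uses Python's sqlite3 with an in-memory SQLite database as a stand-in for MySQL, and the sample rows are invented.

```python
import sqlite3

# Each half of the union filters on one indexed column; the outer query
# collapses both halves to a single row per contact.
LATEST_PER_CONTACT = """
    SELECT u.username, MAX(t.date) AS last_date
    FROM (
        SELECT receiver_id AS contact_id, date FROM messages WHERE sender_id = :me
        UNION ALL
        SELECT sender_id AS contact_id, date FROM messages WHERE receiver_id = :me
    ) AS t
    JOIN users u ON u.id = t.contact_id
    GROUP BY u.id, u.username
    ORDER BY last_date DESC
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE);
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        sender_id INTEGER NOT NULL,
        receiver_id INTEGER NOT NULL,
        date TEXT NOT NULL
    );
    INSERT INTO users VALUES (108, 'me'), (1, 'alice'), (2, 'bob');
    INSERT INTO messages (sender_id, receiver_id, date) VALUES
        (108, 1, '2024-01-01 10:00:00'),
        (1, 108, '2024-01-03 09:00:00'),
        (2, 108, '2024-01-02 08:00:00'),
        (1, 2,   '2024-01-04 12:00:00');  -- does not involve user 108
""")

contacts = conn.execute(LATEST_PER_CONTACT, {"me": 108}).fetchall()
print(contacts)
```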

Tweets implementation MySQL and queries

I am implementing a simple follow/followers system in MySQL. So far I have three tables that look like:
CREATE TABLE IF NOT EXISTS `User` (
`user_id` INT AUTO_INCREMENT PRIMARY KEY,
`username` varchar(40) NOT NULL ,
`pswd` varchar(255) NOT NULL,
`email` varchar(255) NOT NULL ,
`first_name` varchar(40) NOT NULL ,
`last_name` varchar(40) NOT NULL,
CONSTRAINT uc_username_email UNIQUE (username , email)
);
-- Using a middle table for users to follow others on a many-to-many base
CREATE TABLE Following (
follower_id INT(6) NOT NULL,
following_id INT(6) NOT NULL,
KEY (`follower_id`),
KEY (`following_id`)
);
CREATE TABLE IF NOT EXISTS `Tweet` (
`tweet_id` INT AUTO_INCREMENT PRIMARY KEY,
`text` varchar(280) NOT NULL ,
-- I chose varchar vs TEXT as the latter is not stored in the database server’s memory.
-- By querying text data MySQL has to read from it from the disk, much slower in comparison with VARCHAR.
`publication_date` DATETIME NOT NULL,
`username` varchar(40),
FOREIGN KEY (`username`) REFERENCES `user`(`username`)
ON DELETE CASCADE
);
Let's say I want to write a query that returns the 10 latest tweets by users followed by the user with username "Tom". What is the best way to write that query and return results with username, first name, last name, text and publication date?
Also, if one minute later I want to query the 10 latest tweets again, and someone Tom follows has tweeted during that minute, how do I query the database so that it does not select tweets that were already shown by the first query?
To answer your first question:
SELECT u1.username, u1.first_name, u1.last_name, t.text, t.publication_date
FROM Tweet t
JOIN User u1 ON t.username = u1.username
JOIN Following f ON f.following_id = u1.user_id
JOIN User u2 ON u2.user_id = f.follower_id
WHERE u2.username = 'Tom'
ORDER BY t.publication_date DESC
LIMIT 10
For the second part, add t.tweet_id to the select list, take the tweet_id from the first row of the first query (the latest tweet_id value, assuming ids increase with publication date), and use it in the WHERE clause of the next query, i.e.
WHERE u2.username = 'Tom'
AND t.tweet_id > <value from previous query>
To get latest 10 tweets for Tom:
select flg.username, flg.first_name, flg.last_name, t.tweet_id, t.text, t.publication_date
from user flr
inner join following f on f.follower_id = flr.user_id
inner join user flg on flg.user_id = f.following_id
inner join tweet t on t.username = flg.username
where flr.username = 'Tom'
order by tweet_id desc
limit 10
To get the next 10 tweets, pass in the max tweet_id, and apply an additional condition in the where clause:
where flr.username = 'Tom'
and t.tweet_id > <previous_max_tweet_id>
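The "remember the highest tweet_id and ask only for newer ones" idea can be sketched end to end as follows. This is a minimal illustration using Python's sqlite3 and an in-memory SQLite database rather than MySQL; the helper newer_tweets and the sample users are assumptions made for the example.

```python
import sqlite3

def newer_tweets(conn, follower, after_id=0, limit=10):
    """Latest tweets from accounts `follower` follows, with tweet_id > after_id."""
    sql = """
        SELECT t.tweet_id, flg.username, t.text
        FROM user flr
        JOIN following f ON f.follower_id = flr.user_id
        JOIN user flg ON flg.user_id = f.following_id
        JOIN tweet t ON t.username = flg.username
        WHERE flr.username = ? AND t.tweet_id > ?
        ORDER BY t.tweet_id DESC
        LIMIT ?
    """
    return conn.execute(sql, (follower, after_id, limit)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, username TEXT UNIQUE);
    CREATE TABLE following (follower_id INTEGER, following_id INTEGER);
    CREATE TABLE tweet (tweet_id INTEGER PRIMARY KEY, text TEXT, username TEXT);
    INSERT INTO user VALUES (1, 'Tom'), (2, 'Ann'), (3, 'Bob');
    INSERT INTO following VALUES (1, 2);          -- Tom follows Ann only
    INSERT INTO tweet VALUES (1, 'first', 'Ann'),
                             (2, 'ignored', 'Bob'),
                             (3, 'second', 'Ann');
""")

first_page = newer_tweets(conn, "Tom")
last_seen = first_page[0][0]                      # highest tweet_id shown so far

# A minute later Ann tweets again; only the new tweet should come back.
conn.execute("INSERT INTO tweet VALUES (4, 'third', 'Ann')")
second_page = newer_tweets(conn, "Tom", after_id=last_seen)
print(first_page, second_page)
```

This relies on tweet_id growing with publication time, which holds for an AUTO_INCREMENT key as long as rows are inserted when they are published.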

Using rows from the first query in the sub-query?

I have a bit complicated query:
SELECT SQL_CALC_FOUND_ROWS DISTINCT l1.item_id, l1.uid, l2.id, l2.uid, u.prename, l1.item_id, l2.item_id,
(SELECT SUM(cnt) FROM
(
SELECT DISTINCT
p1.item_id,
COUNT(*) AS cnt
FROM pages_likes AS p1
JOIN pages_likes AS p2 ON p1.item_id = p2.item_id AND p1.status = p2.status
WHERE p1.uid = 391 AND p2.uid = 1091
GROUP BY p1.id
ORDER BY p1.date DESC
) AS t) AS total
FROM pages_likes l1
JOIN users u on u.id = l1.uid
JOIN pages_likes l2 on l1.item_id = l2.item_id
JOIN users_likes ul on l1.uid = ul.uid
WHERE ul.date >= DATE_SUB(NOW(),INTERVAL 1 WEEK)
AND l1.uid != 1091 AND l2.uid = 1091
AND (l1.status = 1 AND l2.status = 1)
AND u.gender = 2
GROUP BY l1.uid
ORDER BY
total DESC,
l1.uid DESC,
l1.date DESC
What I expect: It should display all users, sorted by total page likes we have in common that also are the most liked users this week.
The thing is that I inserted values (391 and 1091) as user id to test the query. But since it should be dynamic I'll need to use the row of the first query l1.uid in the subquery, so it should be WHERE p1.uid = l1.uid AND p2.uid = 1091 but mysql can't find the row.
status = 1 means user liked this page, status = 0 means user disliked this page.
Table structure here:
CREATE TABLE pages_likes
(
id BIGINT PRIMARY KEY NOT NULL AUTO_INCREMENT,
uid INT NOT NULL,
date DATETIME NOT NULL,
item_id INT,
status TINYINT
);
CREATE INDEX item_index ON pages_likes (item_id);
CREATE INDEX uid_index ON pages_likes (uid);
CREATE TABLE users
(
id BIGINT PRIMARY KEY NOT NULL AUTO_INCREMENT,
fb_uid VARCHAR(255),
email VARCHAR(100) NOT NULL,
pass VARCHAR(50) NOT NULL,
gender TINYINT NOT NULL,
birthdate DATE,
signup DATETIME NOT NULL,
lang VARCHAR(10) NOT NULL,
username VARCHAR(255),
prename VARCHAR(255) NOT NULL,
surname VARCHAR(255) NOT NULL,
projects VARCHAR(255) NOT NULL,
views INT DEFAULT 0,
verified DATETIME
);
CREATE UNIQUE INDEX id_index ON users (id);
CREATE INDEX uid_index ON users (id);
CREATE TABLE users_likes
(
id INT PRIMARY KEY NOT NULL AUTO_INCREMENT,
uid INT NOT NULL,
date DATETIME NOT NULL,
item_id INT,
status TINYINT
);
CREATE INDEX item_index ON users_likes (item_id);
CREATE INDEX uid_index ON users_likes (uid);
Have you tried different alias names for your subquery and then use alias from outer query? It works for me in this simple example: http://rextester.com/MJOL87502
Sadly I cannot test in your sqlfiddle, since that site often doesn't respond or throws errors (like it does now).
You could also use window functions and replace your subselect with something as simple as COUNT(*) OVER (PARTITION BY p1.id, p1.item_id), but MySQL only supports window functions from version 8.0 onward.
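Another option, offered here only as a sketch, is to drop the inner derived table so that the scalar subquery references the outer alias directly; MySQL generally does not let a derived table in FROM see columns of the outer query, which would explain the "can't find" error. The example below runs the flattened form with Python's sqlite3 on an in-memory SQLite database (a stand-in for MySQL), uses invented sample rows, and leaves out the users and users_likes joins for brevity.

```python
import sqlite3

# The subquery counts matching (item_id, status) pairs between the outer
# row's user (l1.uid) and the fixed user :me, with no derived table between.
COMMON_LIKES = """
    SELECT l1.uid,
           (SELECT COUNT(*)
            FROM pages_likes p1
            JOIN pages_likes p2
              ON p1.item_id = p2.item_id AND p1.status = p2.status
            WHERE p1.uid = l1.uid AND p2.uid = :me) AS total
    FROM pages_likes l1
    WHERE l1.uid <> :me
    GROUP BY l1.uid
    ORDER BY total DESC, l1.uid DESC
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages_likes (
        id INTEGER PRIMARY KEY,
        uid INTEGER NOT NULL,
        item_id INTEGER,
        status INTEGER
    );
    INSERT INTO pages_likes (uid, item_id, status) VALUES
        (1091, 10, 1), (1091, 20, 1), (1091, 30, 1),
        (391, 10, 1),  (391, 20, 1),  (391, 40, 1),
        (500, 10, 1),  (500, 20, 0),
        (600, 99, 1);
""")

totals = conn.execute(COMMON_LIKES, {"me": 1091}).fetchall()
print(totals)
```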

Can't get my query to run any faster on MySQL database with 2M entries

I have this payments table, with about 2M entries
CREATE TABLE IF NOT EXISTS `payments` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) unsigned NOT NULL,
`date` datetime NOT NULL,
`valid_until` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `date_id` (`date`,`id`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=2113820 ;
and this users table from ion_auth plugin/library for CodeIgniter, with about 320k entries
CREATE TABLE IF NOT EXISTS `users` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`ip_address` varbinary(16) NOT NULL,
`username` varchar(100) NOT NULL,
`password` varchar(80) NOT NULL,
`salt` varchar(40) DEFAULT NULL,
`email` varchar(100) NOT NULL,
`activation_code` varchar(40) DEFAULT NULL,
`forgotten_password_code` varchar(40) DEFAULT NULL,
`forgotten_password_time` int(11) unsigned DEFAULT NULL,
`remember_code` varchar(40) DEFAULT NULL,
`created_on` int(11) unsigned NOT NULL,
`last_login` int(11) unsigned DEFAULT NULL,
`active` tinyint(1) unsigned DEFAULT NULL,
`first_name` varchar(50) DEFAULT NULL,
`last_name` varchar(50) DEFAULT NULL,
`company` varchar(100) DEFAULT NULL,
`phone` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `name` (`first_name`,`last_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=322435 ;
I'm trying to get both the user information and the user's last payment, ordered (ASC or DESC) by ID, first and last name, the date of the payment, or the payment expiration date. The goal is to build a table showing users with expired payments and users with valid ones.
I've managed to get the data correctly, but most of the time my queries take 1+ second for a single user and 40+ seconds for 30 users. To be honest, I have no idea if it's possible to get the information in under 1 second. Also, my application is probably never going to reach this number of entries; more likely a maximum of 10k payments and 300 users.
My query, works pretty well with few entries and it's easy to change the ordering:
SELECT users.id, users.first_name, users.last_name, users.email, final.id AS payment_id, payment_date, final.valid_until AS payment_valid_until
FROM users
LEFT JOIN (
SELECT * FROM (
SELECT payments.id, payments.user_id, payments.date AS payment_date, payments.valid_until
FROM payments
ORDER BY payments.valid_until DESC
) AS p GROUP BY p.user_id
) AS final ON final.user_id = users.id
ORDER BY id ASC
LIMIT 0, 30
Explain:
id  select_type         table              type             possible_keys  key      key_len  ref   rows     Extra
1   PRIMARY             users              ALL              NULL           NULL     NULL     NULL  322269   Using where; Using temporary; Using filesort
1   PRIMARY             <derived2>         ALL              NULL           NULL     NULL     NULL  50
4   DEPENDENT SUBQUERY  users_deactivated  unique_subquery  user_id        user_id  4        func  1        Using index
2   DERIVED             <derived3>         ALL              NULL           NULL     NULL     NULL  2072327  Using temporary; Using filesort
3   DERIVED             payments           ALL              NULL           NULL     NULL     NULL  2072566  Using filesort
I'm open to any suggestions and tips, since I'm new to PHP and MySQL and don't really know if I'm doing this the correct way.
I would first suggest removing the ORDER BY clause from your subquery -- I don't see how it's helping as you're reordering by id in your outer query.
You should also be able to move your GROUP BY statement into your subquery:
SELECT users.id, users.first_name, users.last_name, users.email, final.id AS payment_id, payment_date, final.valid_until AS payment_valid_until
FROM users
LEFT JOIN (
SELECT payments.id, payments.user_id, payments.date AS payment_date, payments.valid_until
FROM payments
GROUP BY payments.user_id
) AS final ON final.user_id = users.id
ORDER BY users.id ASC
LIMIT 0, 30
Given your comments, how about this -- not sure it would be better than your current query, but ORDER BY can be expensive:
SELECT users.id, users.first_name, users.last_name, users.email, p.id AS payment_id, p.date AS payment_date, p.valid_until AS payment_valid_until
FROM users
LEFT JOIN (
SELECT user_id, MAX(valid_until) AS max_valid_until
FROM payments
GROUP BY user_id
) AS maxp ON maxp.user_id = users.id
LEFT JOIN payments p ON p.user_id = maxp.user_id AND p.valid_until = maxp.max_valid_until
ORDER BY users.id ASC
LIMIT 0, 30
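To see what the "join back on the per-user maximum" pattern returns, here is a small sketch with three users, one of whom has no payments. It uses Python's sqlite3 and an in-memory SQLite database in place of MySQL, trims the tables to the columns that matter, and the sample rows are invented.

```python
import sqlite3

# maxp holds one row per user with the latest valid_until; joining payments
# back on (user_id, valid_until) recovers the full row for that payment.
LAST_PAYMENT = """
    SELECT u.id, p.id AS payment_id, p.valid_until
    FROM users u
    LEFT JOIN (
        SELECT user_id, MAX(valid_until) AS max_valid_until
        FROM payments
        GROUP BY user_id
    ) AS maxp ON maxp.user_id = u.id
    LEFT JOIN payments p
           ON p.user_id = maxp.user_id AND p.valid_until = maxp.max_valid_until
    ORDER BY u.id ASC
    LIMIT 30
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE payments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        date TEXT NOT NULL,
        valid_until TEXT NOT NULL
    );
    CREATE INDEX user_valid ON payments (user_id, valid_until);
    INSERT INTO users VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO payments VALUES
        (1, 1, '2023-01-01', '2023-02-01'),
        (2, 1, '2023-03-01', '2023-04-01'),
        (3, 2, '2023-02-10', '2023-03-10');
""")

last_payments = conn.execute(LAST_PAYMENT).fetchall()
print(last_payments)
```

User 3 still appears, with NULLs for the payment columns, because both joins are LEFT joins. If two payments of one user share the same valid_until, that user would appear twice.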
Use an index on payments.user_id, and do the GROUP BY on the payments table:
alter table payments add index (user_id);
your query
ORDER BY users.id ASC
alter table payments drop index user_id;
And why don't you use the payments "id" instead of "valid_until"? Is there a reason not to trust that the ids are sequential? If you don't trust the id, add an index on the valid_until field:
alter table payments add index valid_until (valid_until desc);
and don't forget to drop it later
alter table payments drop index valid_until;
If the query is still slow you will need to cache the results. This means changing your schema; here is a suggestion:
create table last_payment
(user_id int unsigned not null,
payment_id int unsigned not null,
constraint pk_last_payment primary key (user_id),
constraint fk_last_payment_user foreign key (user_id) references users(id),
constraint fk_last_payment foreign key (payment_id) references payments(id)
);
alter table payments add index (user_id);
insert into last_payment (user_id, payment_id)
(select user_id, max(id) from payments group by user_id);
#here you probably use your own query if the max (id) does not refer to the last payment...
alter table payments drop index user_id;
and now comes the magic:
delimiter |
CREATE TRIGGER payments_trigger AFTER INSERT ON payments
FOR EACH ROW BEGIN
DELETE FROM last_payment WHERE user_id = NEW.user_id;
INSERT INTO last_payment (user_id, payment_id) values (NEW.user_id, NEW.id);
END;
|
delimiter ;
and now every time you want to know the last payment made, you query the last_payment table:
select u.*, p.*
from users u inner join last_payment lp on (u.id = lp.user_id)
inner join payments p on (lp.payment_id = p.id)
order by u.id asc;
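The trigger-maintained lookup table can be tried out in miniature. The sketch below uses Python's sqlite3 with an in-memory SQLite database, so the trigger syntax is SQLite's (no DELIMITER, no FOR EACH ROW needed) rather than MySQL's; the table layout is trimmed and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        valid_until TEXT NOT NULL
    );
    CREATE TABLE last_payment (
        user_id INTEGER PRIMARY KEY,
        payment_id INTEGER NOT NULL REFERENCES payments(id)
    );
    -- After every insert, replace the cached "last payment" of that user.
    CREATE TRIGGER payments_trigger AFTER INSERT ON payments
    BEGIN
        DELETE FROM last_payment WHERE user_id = NEW.user_id;
        INSERT INTO last_payment (user_id, payment_id)
        VALUES (NEW.user_id, NEW.id);
    END;
""")

# Three payments: two for user 7, one for user 8.
conn.executemany(
    "INSERT INTO payments (id, user_id, valid_until) VALUES (?, ?, ?)",
    [(1, 7, "2023-02-01"), (2, 8, "2023-03-01"), (3, 7, "2023-04-01")],
)

cached = conn.execute(
    "SELECT user_id, payment_id FROM last_payment ORDER BY user_id"
).fetchall()
print(cached)
```

The cache only tracks inserts; updates or deletes on payments would need their own triggers if they can change which payment is the latest.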
Maybe something like this...
SELECT u.id
, u.first_name
, u.last_name
, u.email
, p.id payment_id
, p.date payment_date
, p.valid_until payment_valid_until
FROM users u
JOIN payments p
ON p.user_id = u.id
JOIN
( SELECT user_id, MAX(valid_until) max_valid_until FROM payments GROUP BY user_id ) x
ON x.user_id = p.user_id
AND x.max_valid_until = p.valid_until;
The problem with joining to a subquery is that MySQL internally materializes the result of the subquery before performing the join. This is expensive in resources and is probably where the time goes. The best solution is to rewrite the query to avoid subqueries.
SELECT users.id, users.first_name, users.last_name, users.email, max(payments.id) AS payment_id, max(payments.date) as payment_date, max(payments.valid_until) AS payment_valid_until
FROM users
LEFT JOIN payments use index (user_id) on payments.user_id=users.id
group by users.id
ORDER BY id ASC
LIMIT 0, 30
This query is only correct, however, if the largest values for id, date and valid_until are always in the same record.
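The caveat is easy to reproduce. In the sketch below (Python's sqlite3 on an in-memory SQLite database, with invented rows), a user's earlier payment happens to have the later valid_until, so the three MAX() values are stitched together from two different records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        date TEXT NOT NULL,
        valid_until TEXT NOT NULL
    );
    -- Payment 1 was made first but is valid for longer than payment 2.
    INSERT INTO payments VALUES
        (1, 1, '2023-01-01', '2024-01-01'),
        (2, 1, '2023-06-01', '2023-07-01');
""")

# Independent aggregates: each MAX() picks its value from whichever row wins.
aggregated = conn.execute("""
    SELECT MAX(id), MAX(date), MAX(valid_until)
    FROM payments
    WHERE user_id = 1
    GROUP BY user_id
""").fetchone()

# The real record with the highest id, for comparison.
newest_row = conn.execute(
    "SELECT id, date, valid_until FROM payments WHERE id = 2"
).fetchone()

print(aggregated, newest_row)
```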
SELECT final.user_id, users.first_name, users.last_name,
users.email, (final.id), MAX(final.date), MAX(final.valid_until)
FROM payments final
JOIN users ON final.user_id = users.id
GROUP BY final.user_id
ORDER BY final.user_id ASC
LIMIT 0, 30
The idea is to flatten the payments first.
The MAX fields of course are of different payment records.
Speed up
Above I did a MySQL specific thing: final.id without MAX. Better not use the field at all.
If you could leave out the payments.id, it would be faster (with the appropriate index).
KEY `user_date` (`user_id`, `date` DESC ),
KEY `user_valid` (`user_id`, `valid_until` DESC ),

Select 3 tables with count and join

I've 3 tables tb1, users, users_credits.
My goal is to combine two selects (sel1, sel2) into a single view and
display 0 for sel2 where there are no rows (left join?)
sel1
SELECT
users.userid,
users.datareg,
users_credits.credits
FROM
users,
users_credits
WHERE
users.userid = users_credits.userid
Sel2
SELECT COUNT(*) FROM tb1 where tb1.id_user = users.userid
table structure
tb1
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_user` decimal(11,0) NOT NULL,
`datains` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
users
`userid` int(4) unsigned NOT NULL AUTO_INCREMENT,
`datareg` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`userid`)
users_credits
`id` int(11) NOT NULL AUTO_INCREMENT,
`userid` int(11) NOT NULL,
`credits` decimal(5,0) NOT NULL,
`data` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
What is the best way to do this?
Thanks.
SELECT users.userid,
users.datareg,
users_credits.credits,
COALESCE(c.totalCount,0) totalCount
FROM users
LEFT JOIN users_credits
ON users.userid = users_credits.userid
LEFT JOIN
(
SELECT id_user, COUNT(*) totalCount
FROM tb1
GROUP BY id_user
) c ON c.id_user = users.userid
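A quick way to confirm that users without rows in tb1 get 0 rather than NULL is to run the same shape of query on a couple of sample rows. The sketch below uses Python's sqlite3 with an in-memory SQLite database standing in for MySQL; the data is invented and the tables are trimmed to the columns the query touches.

```python
import sqlite3

# The derived table c has no row for users absent from tb1, so the LEFT JOIN
# yields NULL there and COALESCE turns it into 0.
COUNT_QUERY = """
    SELECT users.userid,
           users_credits.credits,
           COALESCE(c.totalCount, 0) AS totalCount
    FROM users
    LEFT JOIN users_credits ON users.userid = users_credits.userid
    LEFT JOIN (
        SELECT id_user, COUNT(*) AS totalCount
        FROM tb1
        GROUP BY id_user
    ) c ON c.id_user = users.userid
    ORDER BY users.userid
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY);
    CREATE TABLE users_credits (
        id INTEGER PRIMARY KEY,
        userid INTEGER NOT NULL,
        credits INTEGER NOT NULL
    );
    CREATE TABLE tb1 (id INTEGER PRIMARY KEY, id_user INTEGER NOT NULL);
    INSERT INTO users VALUES (1), (2);
    INSERT INTO users_credits (userid, credits) VALUES (1, 50), (2, 70);
    INSERT INTO tb1 (id_user) VALUES (1), (1);   -- user 2 has no rows in tb1
""")

counts = conn.execute(COUNT_QUERY).fetchall()
print(counts)
```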
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins
UPDATE 1
SELECT users.userid,
users.datareg,
users_credits.credits,
COALESCE(c.totalCount,0) totalCount,
c.max_datains
FROM users
LEFT JOIN users_credits
ON users.userid = users_credits.userid
LEFT JOIN
(
SELECT id_user, MAX(datains) max_datains, COUNT(*) totalCount
FROM tb1
GROUP BY id_user
) c ON c.id_user = users.userid
UPDATE 2
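You need to create two views for this, because older MySQL versions (before 5.7.7) do not allow a subquery in the FROM clause of a view definition: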
1st View:
CREATE VIEW tbl1View
AS
SELECT id_user, MAX(datains) max_datains, COUNT(*) totalCount
FROM tb1
GROUP BY id_user
2nd View
CREATE VIEW FullView
AS
SELECT users.userid,
users.datareg,
users_credits.credits,
COALESCE(c.totalCount,0) totalCount,
c.max_datains
FROM users
LEFT JOIN users_credits
ON users.userid = users_credits.userid
LEFT JOIN tbl1View c ON c.id_user = users.userid