Improve MySQL nested select performance with join - mysql

I have seen many samples to improve MySQL nested selects with joins, but I can't figure this out for this query:
SELECT * FROM messages WHERE answer = 'SuccessSubscribed' AND phone NOT IN
(SELECT phone FROM messages WHERE answer = 'SuccessUnSubscribed');
The query finds people who have subscribed but never unsubscribed.
Table structure:
CREATE TABLE `messages` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`phone` varchar(12) COLLATE utf8_persian_ci NOT NULL,
`content` varchar(300) COLLATE utf8_persian_ci NOT NULL,
`flags` int(10) unsigned NOT NULL DEFAULT '0',
`answer` varchar(50) COLLATE utf8_persian_ci DEFAULT NULL,
....,
PRIMARY KEY (`id`),
....
) ENGINE=InnoDB CHARSET=utf8 COLLATE=utf8_persian_ci

Instead of NOT IN, you can use a LEFT JOIN with a NULL check.
SELECT M1.*
FROM messages M1
LEFT JOIN messages M2 ON M2.phone = M1.phone AND M2.answer = 'SuccessUnSubscribed'
WHERE M1.answer = 'SuccessSubscribed' AND M2.phone IS NULL
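To check that the two forms return the same rows, here is a minimal sketch that runs both against a tiny in-memory table. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample phone numbers are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, phone TEXT NOT NULL, answer TEXT)"
)
conn.executemany(
    "INSERT INTO messages (phone, answer) VALUES (?, ?)",
    [
        ("111", "SuccessSubscribed"),
        ("111", "SuccessUnSubscribed"),  # 111 later unsubscribed
        ("222", "SuccessSubscribed"),
        ("333", "SuccessSubscribed"),
        ("333", None),                   # unrelated message with no answer
    ],
)

# The original nested select from the question.
not_in_sql = """
    SELECT phone FROM messages
    WHERE answer = 'SuccessSubscribed'
      AND phone NOT IN (SELECT phone FROM messages
                        WHERE answer = 'SuccessUnSubscribed')
    ORDER BY phone
"""

# The anti-join: keep M1 rows for which no matching M2 row exists.
anti_join_sql = """
    SELECT M1.phone
    FROM messages M1
    LEFT JOIN messages M2
           ON M2.phone = M1.phone AND M2.answer = 'SuccessUnSubscribed'
    WHERE M1.answer = 'SuccessSubscribed' AND M2.phone IS NULL
    ORDER BY M1.phone
"""

not_in_phones = [row[0] for row in conn.execute(not_in_sql)]
anti_join_phones = [row[0] for row in conn.execute(anti_join_sql)]
print(not_in_phones, anti_join_phones)
```

The two forms agree here because phone is declared NOT NULL. With a nullable column, NOT IN can return no rows at all once the subquery yields a NULL, so the equivalence should be rechecked in that case.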

Related

Mysql - PHP - Checking a users posts and comments likes

Would anyone be able to recommend the best way to check if a user has liked a post or comment?
I am currently building a website that has similar features to Facebook's wall.
My website will show a 'wall' of posts from people you follow that you can like or comment on.
For example, comments I have:
Comments table containing: id, user_id, text (plus other columns)
Comments Likes table: comment_id, user_id, created
This is the current query I use to get the comments; it checks whether the user has liked each one using a left join on the likes table. It uses an IF() to return liked as either 1 or empty, which works fine:
SELECT comments.id, comments.post_id, comments.user_id, comments.reply_id, comments.created, comments.text, comments.likes, comments.replies, comments.flags, user.name, user.tagline, user.photo_id, user.photo_file, user.public_key,
IF(likes.created IS NULL, '', '1') as 'liked'
FROM events_feed_comments AS comments
INNER JOIN user AS user ON comments.user_id = user.id
LEFT JOIN events_feed_comments_likes AS likes ON comments.id = likes.comment_id AND likes.user_id = :user
WHERE comments.post_id = :post_id AND comments.reply_id IS NULL
ORDER BY comments.created DESC
LIMIT :limit OFFSET :offset
However, I realise that this will not be cacheable for anyone else, as it contains the logged-in user's likes. There may end up being a lot of posts, so I will need to introduce caching.
I am wondering what the best way to check the likes will be?
At the moment these are the solutions I can think of:
I could either select all the comments limited to say 30 at a time (cacheable)
Then loop over each result doing a fetch/count query in the likes table to see if a user has liked it.
I could do a fetch from the likes table doing a where in clause using the returned 30 id results.
Then do some sort of looping to see if the likes value matches the returned results.
Fetch all of the comments (cacheable), fetch all of a users likes (could be cacheable?), then do some looping / comparing to see if the values match.
I am just not sure what would be the best solution, or if there is some other recommended way to achieve this?
I am thinking the second approach may be best, but I'm interested to see what you think.
Updates to show the table Create statements
CREATE TABLE `events_feed_comments` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`post_id` int(11) NOT NULL,
`reply_id` int(11) DEFAULT NULL,
`text` longtext COLLATE utf8mb4_unicode_ci NOT NULL,
`likes` int(11) NOT NULL,
`replies` int(11) NOT NULL,
`flags` smallint(6) NOT NULL,
`created` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
CREATE TABLE `events_feed_comments_likes` (
`comment_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`created` datetime NOT NULL,
PRIMARY KEY (`comment_id`,`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`photo_id` int(11) DEFAULT NULL,
`email` varchar(180) COLLATE utf8mb4_unicode_ci NOT NULL,
`roles` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL CHECK (json_valid(`roles`)),
`password` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`name` varchar(80) COLLATE utf8mb4_unicode_ci NOT NULL,
`tagline` varchar(120) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`biography` varchar(2000) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`social` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`social`)),
`specialties` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`available` smallint(6) NOT NULL DEFAULT 0,
`theme` varchar(7) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`photo_file` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`following` int(11) NOT NULL,
`followers` int(11) NOT NULL,
`is_private` smallint(6) NOT NULL,
`public_key` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL,
`show_groups` smallint(6) NOT NULL,
`show_feed` smallint(6) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_8D93D649E7927C74` (`email`),
UNIQUE KEY `UNIQ_8D93D64966F9D463` (`public_key`),
UNIQUE KEY `UNIQ_8D93D6497E9E4C8C` (`photo_id`),
CONSTRAINT `FK_8D93D6497E9E4C8C` FOREIGN KEY (`photo_id`) REFERENCES `photos` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Instead of using IF() in your SQL query, consider the standard SQL CASE expression:
CASE
WHEN likes.created IS NULL THEN ''
ELSE '1'
END AS liked
For performance:
comments: INDEX(post_id, reply_id, created)
likes: INDEX(comment_id, user_id, created)
Those improvements may eliminate the need for "caching".
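As a general rule, put "filtering" in WHERE and "relations" in ON. It can matter when using LEFT, and it helps a human reading the query. Here, though, AND likes.user_id = :user has to stay in ON: moving it to WHERE would drop every comment the user has not liked, because those rows have likes.user_id NULL.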
If events_feed_comments_likes is a many-to-many mapping table, you may want INDEX(user_id) also.
I assume this is the query in question:
SELECT comments.id, comments.post_id, comments.user_id, comments.reply_id,
comments.created, comments.text, comments.likes, comments.replies,
comments.flags, user.name, user.tagline,
user.photo_id, user.photo_file, user.public_key,
IF(likes.created IS NULL, '', '1') as 'liked'
FROM events_feed_comments AS comments
INNER JOIN user AS user ON comments.user_id = user.id
LEFT JOIN events_feed_comments_likes AS likes ON comments.id = likes.comment_id
AND likes.user_id = :user
WHERE comments.post_id = :post_id
AND comments.reply_id IS NULL
ORDER BY comments.created DESC
LIMIT :limit OFFSET :offset
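The second approach from the question (fetch the comments with a user-independent query, then look up the user's likes for just those ids in one WHERE ... IN query) could look roughly like this. It is a minimal sketch, not the asker's code: it uses Python's sqlite3 as a stand-in for MySQL, trims the tables to a few columns, and the function names fetch_comments, liked_ids and comments_for_user are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; tables are trimmed and the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events_feed_comments (id INTEGER PRIMARY KEY, post_id INTEGER, text TEXT);
    CREATE TABLE events_feed_comments_likes (
        comment_id INTEGER, user_id INTEGER, created TEXT,
        PRIMARY KEY (comment_id, user_id)
    );
    INSERT INTO events_feed_comments VALUES (1, 10, 'first'), (2, 10, 'second'), (3, 10, 'third');
    INSERT INTO events_feed_comments_likes VALUES
        (1, 7, '2020-01-01'), (3, 7, '2020-01-02'), (2, 8, '2020-01-03');
""")

def fetch_comments(post_id):
    """User-independent query: this result is the part a cache could hold."""
    rows = conn.execute(
        "SELECT id, text FROM events_feed_comments WHERE post_id = ? ORDER BY id",
        (post_id,),
    ).fetchall()
    return [{"id": r[0], "text": r[1]} for r in rows]

def liked_ids(user_id, comment_ids):
    """One query for all of this user's likes among the given comment ids."""
    if not comment_ids:
        return set()
    placeholders = ",".join("?" * len(comment_ids))
    rows = conn.execute(
        "SELECT comment_id FROM events_feed_comments_likes "
        f"WHERE user_id = ? AND comment_id IN ({placeholders})",
        (user_id, *comment_ids),
    )
    return {r[0] for r in rows}

def comments_for_user(post_id, user_id):
    """Merge the shared comment list with the per-user like flags."""
    comments = fetch_comments(post_id)
    liked = liked_ids(user_id, [c["id"] for c in comments])
    return [dict(c, liked=c["id"] in liked) for c in comments]

result = comments_for_user(10, 7)
print(result)
```

The set lookup replaces the per-comment loop of queries from the first approach, so each page costs two queries regardless of how many comments it shows.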

MySql query is slow with join - how to speed it up

I have to export 554k records from our mysql db. At the current rate it will take 5 days to export the data and the slowness is mainly caused by the query below. The data structure consists of
Companies
--Contacts
----(Contact)Activities
For the contacts, we have an index on company_id. On the activities table, we have an index for contact_id and company_id which map back to the respective contacts and companies tables.
I need to grab each contact and the latest activity date that they have. This is the query that I'm running and it takes about .5 second to execute.
Select *
from contacts
left outer join (select occurred_at
,contact_id
from activities
where occurred_at is not null
group by contact_id
order by occurred_at desc) activities
on contacts.id = activities.contact_id
where company_id = 20
If I remove the join and just select * from contacts where company_id=20 the query executes in .016 sec.
If I use Explain for info on the join query I get this
Any ideas on how I can speed this up?
Edit:
Here are the table definitions.
CREATE TABLE `companies` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`street_address` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`city` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`state` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`county` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`website` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`external_id` int(11) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`user_id` int(11) DEFAULT NULL,
`falloff_date` date DEFAULT NULL,
`zipcode` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`phone` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`company_id` int(11) DEFAULT NULL,
`order_count` int(11) NOT NULL DEFAULT '0',
`active_job_count` int(11) NOT NULL DEFAULT '0',
`duplicate_of` int(11) DEFAULT NULL,
`warm_date` datetime DEFAULT NULL,
`employee_size` int(11) DEFAULT NULL,
`dup_checked` tinyint(1) DEFAULT '0',
`rating` int(11) DEFAULT NULL,
`delinquent` tinyint(1) DEFAULT '0',
`cconly` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `index_companies_on_name` (`name`),
KEY `index_companies_on_user_id` (`user_id`),
KEY `index_companies_on_company_id` (`company_id`),
KEY `index_companies_on_external_id` (`external_id`),
KEY `index_companies_on_state_and_dup_checked` (`id`,`state`,`dup_checked`,`duplicate_of`),
KEY `index_companies_on_dup_checked` (`id`,`dup_checked`),
KEY `index_companies_on_dup_checked_name` (`dup_checked`,`name`),
KEY `index_companies_on_county` (`county`,`state`)
) ENGINE=InnoDB AUTO_INCREMENT=15190300 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `contacts` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`first_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`last_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`title` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`phone` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`extension` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`fax` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`email` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`active` tinyint(1) DEFAULT NULL,
`main` tinyint(1) DEFAULT NULL,
`company_id` int(11) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`external_id` int(11) DEFAULT NULL,
`second_phone` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_contacts_on_company_id` (`company_id`),
KEY `index_contacts_on_first_name` (`first_name`),
KEY `index_contacts_on_last_name` (`last_name`),
KEY `index_contacts_on_phone` (`phone`),
KEY `index_contacts_on_email` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=11241088 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `activities` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`kind` int(11) DEFAULT NULL,
`contact_id` int(11) DEFAULT NULL,
`call_status` int(11) DEFAULT NULL,
`occurred_at` datetime DEFAULT NULL,
`notes` text COLLATE utf8_unicode_ci,
`user_id` int(11) DEFAULT NULL,
`scheduled_for` datetime DEFAULT NULL,
`priority` tinyint(1) DEFAULT NULL,
`company_id` int(11) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`from_user_id` int(11) DEFAULT NULL,
`to_user_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_activities_on_contact_id` (`contact_id`),
KEY `index_activities_on_user_id` (`user_id`),
KEY `index_activities_on_company_id` (`company_id`)
) ENGINE=InnoDB AUTO_INCREMENT=515340 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
This is a greatest-n-per-group query, which comes up frequently on Stack Overflow.
Here's a solution that uses a MySQL 8.0 window function:
WITH latest_activities AS (
SELECT contact_id, occurred_at,
ROW_NUMBER() OVER (PARTITION BY contact_id ORDER BY occurred_at DESC) AS rn
FROM activities
)
SELECT *
FROM contacts AS c
LEFT OUTER JOIN latest_activities
ON c.id = latest_activities.contact_id AND latest_activities.rn = 1
WHERE c.company_id = 20
Here's a solution that should work on pre-8.0 versions:
SELECT c.*, a.*
FROM contacts AS c
LEFT OUTER JOIN activities AS a ON a.contact_id = c.id
LEFT OUTER JOIN activities AS a2 ON a2.contact_id = c.id
AND a2.occurred_at > a.occurred_at
WHERE c.company_id = 20
AND a2.contact_id IS NULL;
Another solution:
SELECT c.*, a.*
FROM contacts AS c
LEFT OUTER JOIN activities AS a ON a.contact_id = c.id
LEFT OUTER JOIN (
SELECT a2.contact_id, MAX(a2.occurred_at) AS occurred_at
FROM activities AS a2
INNER JOIN contacts AS c2 ON a2.contact_id = c2.id
WHERE c2.company_id = 20
GROUP BY a2.contact_id ORDER BY NULL
) AS latest_activities
ON latest_activities.contact_id = c.id
AND latest_activities.occurred_at = a.occurred_at
WHERE c.company_id = 20
It would be helpful to create a new index on activities (contact_id, occurred_at).
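As a quick check of the pre-8.0 exclusion join, the sketch below runs it on a handful of made-up rows. It uses Python's sqlite3 as a stand-in for MySQL, the tables are trimmed to the columns involved, and the index name is hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; tables are trimmed and the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, company_id INTEGER, first_name TEXT);
    CREATE TABLE activities (id INTEGER PRIMARY KEY, contact_id INTEGER, occurred_at TEXT);
    -- The suggested composite index; the name is made up for this example.
    CREATE INDEX index_activities_on_contact_id_and_occurred_at
        ON activities (contact_id, occurred_at);
    INSERT INTO contacts VALUES (1, 20, 'Ann'), (2, 20, 'Bob'), (3, 99, 'Cy');
    INSERT INTO activities (contact_id, occurred_at) VALUES
        (1, '2020-01-01 09:00:00'),
        (1, '2020-03-01 09:00:00'),
        (3, '2020-02-01 09:00:00');
""")

# Keep an activity row only if no later activity exists for the same contact.
latest_sql = """
    SELECT c.id, c.first_name, a.occurred_at
    FROM contacts AS c
    LEFT OUTER JOIN activities AS a ON a.contact_id = c.id
    LEFT OUTER JOIN activities AS a2 ON a2.contact_id = c.id
                                    AND a2.occurred_at > a.occurred_at
    WHERE c.company_id = 20
      AND a2.contact_id IS NULL
    ORDER BY c.id
"""
latest = conn.execute(latest_sql).fetchall()
print(latest)
```

Ann comes back once with her most recent activity, and Bob, who has no activities, comes back with a NULL date, which is the behaviour the original LEFT OUTER JOIN was after.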
Don't use subqueries in the FROM clause if you can help it. They impede the MySQL optimizer. So, if you want one row:
Select c.*, a.occurred_at
from contacts c left outer join
activities a
on c.id = a.contact_id and
a.occurred_at is not null
where c.company_id = 20
order by a.occurred_at desc
limit 1;
If you want one row per contact_id:
Select c.*, a.occurred_at
from contacts c left outer join
activities a
on c.id = a.contact_id and
a.occurred_at is not null and
a.occurred_at = (select max(a2.occurred_at)
from activities a2
where a2.contact_id = a.contact_id
)
where c.company_id = 20
order by a.occurred_at desc;
This can make use of an index on activities(contact_id, occurred_at) and on contacts(company_id, id).
Your query is doing one thing that is a clear no-no, and one that is no longer supported by the default settings in the most recent versions of MySQL. You have unaggregated columns in a select that are not in the group by. The occurred_at column should be generating an error.
I feel like I am overlooking something with how complicated the other answers are, but I would think this would be all you need.
SELECT c.*
, MAX(a.occurred_at) AS occurred_at
FROM contacts AS c
LEFT JOIN activities AS a
ON c.id = a.contact_id AND a.occurred_at IS NOT NULL
WHERE c.company_id = 20
GROUP BY c.id;
Notes: (1) this assumes you didn't actually want the duplicate contact_id from your original subquery to be in the final results. (2) This also assumes your server is not configured to require a full group by; if it is, you will need to manually expand c.* into the full column list, and copy that list to the GROUP BY clause as well.
Expanding on dnoeth's comments to your question; if you are not querying each company separately for a particular reason (chunking for load, code structure handling this also handles other stuff company by company, whatever), you could tweak the above query like so to get all your results in one query.
SELECT con.*
, MAX(a.occurred_at) AS occurred_at
FROM companies AS com
INNER JOIN contacts AS con ON com.id = con.company_id
LEFT JOIN activities AS a
ON con.id = a.contact_id AND a.occurred_at IS NOT NULL
WHERE [criteria for companies chosen to be queried]
GROUP BY con.id
ORDER BY con.company_id, con.id
;

MySQL performance with nested sub query

I have two tables messages and members. I tried joining tables without having a nested query but it does not reflect the join on members. So, I initially thought that I could do the following
SELECT M1.*, COUNT(M2.emid) AS replies FROM messages M1
LEFT JOIN messages M2
ON M2.thread = M1.emid
INNER JOIN members M
ON M.meid = M1.emitter
WHERE
M1.thread is NULL AND
M1.receiver = 2
GROUP BY
M1.emid
but it does not seem to join the corresponding member. Then I tried this and it gives me the result that I need but I would like to know if there is a way to accomplish the same result using joins without the nested query
SELECT * FROM (
SELECT M1.*, COUNT(M2.emid) AS replies FROM messages M1
LEFT JOIN messages M2
ON M2.thread = M1.emid
WHERE
M1.thread is NULL AND
M1.receiver = 2
GROUP BY
M1.emid
) O INNER JOIN members M ON O.receiver = M.meid
-- Table structure for table members
CREATE TABLE `members` (
`meid` bigint(64) NOT NULL,
`name` varchar(32) DEFAULT NULL,
`lastname` varchar(32) DEFAULT NULL,
`email` varchar(128) NOT NULL,
`mobile` char(10) DEFAULT NULL,
`college` bigint(64) NOT NULL,
`major` bigint(64) NOT NULL,
`password` varchar(256) NOT NULL,
`oauth` varchar(128) DEFAULT NULL,
`confirmed` tinyint(4) DEFAULT NULL,
`active` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`joined` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-- Table structure for table messages
CREATE TABLE `messages` (
`emid` bigint(20) NOT NULL,
`emitter` bigint(20) NOT NULL,
`receiver` bigint(20) NOT NULL,
`thread` bigint(20) DEFAULT NULL,
`opened` tinyint(4) DEFAULT '0',
`message` blob NOT NULL,
`timecard` datetime DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

multiple indexes on table not used in query

I have multiple indexes on my table. Two of the indexes relate to the latitude and longitude of the members and are used to calculate the distance between the members.
The problem is that after I calculate the distance between the members, I then need to filter the result set by other conditions, i.e.:
I first want to select all members who live within a 5 mile radius.
Based on that result, I then want to filter by members who meet other criteria.
I am not sure if this is possible and would really welcome advice on the best way to proceed with this query.
I got the formula for calculating the distance between members from here:
Below is my query.
SELECT *
FROM (
SELECT
u.qualityOfPictures,u.id,
st.statement,
ph.thumbsize
,
3956 * ACOS(COS(RADIANS('51.5247725')) * COS(RADIANS(`latitude`)) * COS(RADIANS('-0.13342680000005203') - RADIANS(`longitude`)) + SIN(RADIANS('51.5247725')) * SIN(RADIANS(`latitude`))) AS `distance`
FROM user u
LEFT OUTER JOIN
members_statement st
ON
u.id = st.id
LEFT OUTER JOIN
list_photos_uploaded_by_members ph
ON
u.id = ph.id
WHERE
`latitude`
BETWEEN '51.5247725' - ('5' / 69)
AND '51.5247725' + ('5' / 69)
AND `longitude`
BETWEEN '-0.13342680000005203' - ('5' / (69 * COS(RADIANS('51.5247725'))))
AND '-0.13342680000005203' + ('5' / (69* COS(RADIANS('51.5247725'))))
AND
u.live = 1
AND
u.membershipType =1
AND jobId IN (2,4,1)
) r
WHERE `distance` < '5'
ORDER BY `distance` ASC
this is the explain statement when i
AND this is the query in SQLfiddle
This is the EXPLAIN statement when I take out the other conditions and just leave in the query for the distance.
This is the query in SQLFiddle
These are the tables:
CREATE TABLE IF NOT EXISTS `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`membershipType` smallint(6) DEFAULT NULL,
`qualityOfPictures` smallint(6) DEFAULT NULL,
`city` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`latitude` varchar(150) COLLATE utf8_unicode_ci DEFAULT NULL,
`longitude` varchar(150) COLLATE utf8_unicode_ci DEFAULT NULL,
`live` tinyint(1) NOT NULL,
`jobId` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `quick_search` (`live`,`membershipType`,`jobId`,`city`(5)),
KEY `latitude` (`latitude`),
KEY `longitude` (`longitude`)
);
member statement table
CREATE TABLE IF NOT EXISTS `members_statement` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`statement` varchar(2000) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `list_photos_uploaded_by_members` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`thumbsize` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
) ;
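For readers who want to see what the query's arithmetic does, here is a minimal sketch of the same two steps in Python: a cheap bounding-box test first, then the great-circle distance from the query. The function names and the sample members are hypothetical; the constants 3956 and 69 are taken from the query above.

```python
import math

EARTH_RADIUS_MILES = 3956  # same constant as in the query
MILES_PER_DEGREE = 69      # same approximation as in the query

def distance_miles(lat0, lon0, lat, lon):
    """Great-circle distance using the same formula as the SQL expression."""
    cos_angle = (
        math.cos(math.radians(lat0)) * math.cos(math.radians(lat))
        * math.cos(math.radians(lon0) - math.radians(lon))
        + math.sin(math.radians(lat0)) * math.sin(math.radians(lat))
    )
    # Clamp: rounding can push the value a hair outside [-1, 1] for identical points.
    return EARTH_RADIUS_MILES * math.acos(max(-1.0, min(1.0, cos_angle)))

def bounding_box(lat0, lon0, radius_miles):
    """The BETWEEN ranges from the WHERE clause."""
    lat_delta = radius_miles / MILES_PER_DEGREE
    lon_delta = radius_miles / (MILES_PER_DEGREE * math.cos(math.radians(lat0)))
    return lat0 - lat_delta, lat0 + lat_delta, lon0 - lon_delta, lon0 + lon_delta

def within_radius(lat0, lon0, radius_miles, members):
    """Return member ids inside the radius, nearest first."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat0, lon0, radius_miles)
    hits = []
    for member in members:
        # The table stores coordinates as varchar, so convert before comparing.
        lat, lon = float(member["latitude"]), float(member["longitude"])
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            continue  # rejected by the cheap box test
        d = distance_miles(lat0, lon0, lat, lon)
        if d < radius_miles:
            hits.append((d, member["id"]))
    return [member_id for _, member_id in sorted(hits)]

# Invented sample members: same point, slightly north, one degree north.
members = [
    {"id": 1, "latitude": "51.5247725", "longitude": "-0.1334268"},
    {"id": 2, "latitude": "51.5500000", "longitude": "-0.1334268"},
    {"id": 3, "latitude": "52.5247725", "longitude": "-0.1334268"},
]
nearby = within_radius(51.5247725, -0.1334268, 5, members)
print(nearby)
```

The box test is the part an index on latitude or longitude could serve; the exact distance is only computed for rows that survive it.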

Nested selects in MySQL

Setting:
Each page on my site has four widgets that are arranged in different orders (1-4).
I have a table 'content' and a table 'widgets'. I have a bridging table, content_widgets, that maps content.id to widgets.id.
Problem:
What I want to do is run a query that selects * from content along with additional columns widget_1, widget_2, widget_3, widget_4, each containing the id of the widget linked to that page.
I've been trying some nested selects all morning and can't seem to crack it. I've copied the MySQL dumps of the involved tables below :-).
CREATE TABLE `content` (
`id` int(11) NOT NULL auto_increment,
`permalink` varchar(64) character set latin1 NOT NULL,
`parent` int(11) NOT NULL default '1',
`title` varchar(128) character set latin1 NOT NULL,
`content` text character set latin1,
`content_type` varchar(16) NOT NULL default 'page',
PRIMARY KEY (`id`),
FULLTEXT KEY `title` (`title`,`content`,`meta_description`,`meta_keywords`)
)
CREATE TABLE `widgets` (
`id` int(11) unsigned NOT NULL auto_increment,
`title` varchar(64) default NULL,
`text` varchar(256) default NULL,
`image` varchar(128) default NULL,
`target` varchar(128) default NULL,
`code` varchar(32) default NULL,
PRIMARY KEY (`id`)
)
CREATE TABLE `content_widgets` (
`content_id` int(11) NOT NULL,
`widget_id` int(11) NOT NULL,
`order` tinyint(4) NOT NULL
)
thanks a lot!
You don't need a nested query - just a join. Assuming that you want to start with a content record and return the matching widgets....
SELECT c.*, w.*
FROM content c
LEFT JOIN (
content_widgets cw INNER JOIN widgets w
ON cw.widget_id=w.id
) ON c.id=cw.content_id
WHERE c.id=....
Although a simple inner join is a better idea if you know you've got the widgets:
SELECT c.*, w.*
FROM content c, content_widgets cw, widgets w
WHERE cw.widget_id=w.id
AND c.id=cw.content_id
AND c.id=....
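The joins above return one row per widget. To get one row per page with widget_1 to widget_4 columns, as the question asks, one common option is conditional aggregation over the bridging table. The sketch below is an assumption about what is wanted, not part of the answer above; it uses Python's sqlite3 as a stand-in for MySQL, with trimmed tables and made-up rows.

```python
import sqlite3

# SQLite stands in for MySQL; tables are trimmed and the rows are sample data.
# `order` is a reserved word, so it is quoted with backticks as in the dump.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE content_widgets (
        content_id INTEGER NOT NULL,
        widget_id INTEGER NOT NULL,
        `order` INTEGER NOT NULL
    );
    INSERT INTO content VALUES (1, 'Home'), (2, 'About');
    INSERT INTO content_widgets VALUES
        (1, 40, 1), (1, 30, 2), (1, 20, 3), (1, 10, 4),
        (2, 10, 1);
""")

# Each CASE picks the widget id for one position; MAX collapses the group
# to a single value (or NULL when the page has no widget in that position).
pivot_sql = """
    SELECT c.id, c.title,
           MAX(CASE WHEN cw.`order` = 1 THEN cw.widget_id END) AS widget_1,
           MAX(CASE WHEN cw.`order` = 2 THEN cw.widget_id END) AS widget_2,
           MAX(CASE WHEN cw.`order` = 3 THEN cw.widget_id END) AS widget_3,
           MAX(CASE WHEN cw.`order` = 4 THEN cw.widget_id END) AS widget_4
    FROM content c
    LEFT JOIN content_widgets cw ON cw.content_id = c.id
    GROUP BY c.id, c.title
    ORDER BY c.id
"""
rows = conn.execute(pivot_sql).fetchall()
print(rows)
```

Because only widget ids are requested, the widgets table does not need to be joined at all; it would only be needed to pull in titles or other widget columns.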