I have a simple messaging system that keeps all the messages in a single table. Each message can (and should) be associated with one of 3 other tables that represent sections of the website. Here is the create table statement
CREATE TABLE `messages` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`from_user_id` int(11) DEFAULT NULL,
`to_user_id` int(11) DEFAULT NULL,
`message` text COLLATE utf8_bin,
`table1_id` int(11) DEFAULT NULL,
`table2_id` int(11) DEFAULT NULL,
`table3_id` int(11) DEFAULT NULL,
`is_unread` tinyint(1) DEFAULT NULL,
`date` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=14 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
For each message, only one of table1_id, table2_id and table3_id has a value; the other two are NULL. Here is the sqlfiddle structure and example data: http://sqlfiddle.com/#!9/b98a2/1/0.
So table1_id, table2_id and table3_id act as a sort of thread key that I use for grouping, to show the list of messages. Here is my query
SELECT
id,
table1_id,
table2_id,
table3_id,
message,
from_user_id,
to_user_id,
COUNT(table1_id) AS t1_count,
COUNT(table2_id) AS t2_count,
COUNT(table3_id) AS t3_count,
MAX(CASE WHEN to_user_id = 10 AND is_unread = 1 THEN 1 END) AS is_unread,
COUNT(CASE WHEN to_user_id = 10 THEN 1 END) AS inbox_count
FROM
messages
WHERE to_user_id = 10 OR from_user_id = 10
GROUP BY table1_id,
table2_id,
table3_id
ORDER BY id DESC
and this is in sqlfiddle http://sqlfiddle.com/#!9/b98a2/2/0
This query works fine when I have to show all the messages, but if, for example, I want to show only the inbox of the user with id = 10, I have to check that each thread has at least one received message. For that I tried to apply the condition AND inbox_count > 0, which resulted in the error Unknown column 'inbox_count' in 'where clause'.
I am trying to list messages similar to Gmail - showing the total number of messages (t1_count, t2_count or t3_count) per thread, which is why I cannot remove the OR from_user_id = 10 part.
Why can it not find that column, and how can I apply that condition to show the list of received messages only?
Unless I completely misunderstand your intent... if you want to filter with inbox_count > 0, then I think you want to add a
HAVING COUNT(CASE WHEN to_user_id = 10 THEN 1 END) > 0
after the GROUP BY clause. This would remove the "threads" that don't have any message with to_user_id = 10.
See this fiddle for an example.
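Put together with your original query, that would look roughly like this (just a sketch, keeping your column list as-is; note that MySQL also lets you reference the alias directly, i.e. HAVING inbox_count > 0):
SELECT
    id,
    table1_id,
    table2_id,
    table3_id,
    message,
    from_user_id,
    to_user_id,
    COUNT(table1_id) AS t1_count,
    COUNT(table2_id) AS t2_count,
    COUNT(table3_id) AS t3_count,
    MAX(CASE WHEN to_user_id = 10 AND is_unread = 1 THEN 1 END) AS is_unread,
    COUNT(CASE WHEN to_user_id = 10 THEN 1 END) AS inbox_count
FROM messages
WHERE to_user_id = 10 OR from_user_id = 10
GROUP BY table1_id, table2_id, table3_id
HAVING COUNT(CASE WHEN to_user_id = 10 THEN 1 END) > 0
ORDER BY id DESC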
For the following query I get the error This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
SELECT * FROM `wp_dash_competition_versions` _versions
WHERE DATEDIFF(CURRENT_TIMESTAMP, _versions.created_at) > 20 AND
id NOT IN (
SELECT id FROM `wp_dash_competition_versions` WHERE competition_id = _versions.competition_id
ORDER BY created_at LIMIT 5
)
Other SO answers recommend using a join; however, the join query doesn't apply the limit per parent_id.
Is it possible to achieve a single query that selects rows from a child table skipping the first 5 rows for a given parent_id?
I'm developing the query primarily for a delete operation to prune the table. I need the select first to ensure the statement is correct.
CREATE TABLE `wp_dash_competition_versions` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`competition_id` bigint(20) unsigned DEFAULT NULL,
`competition_serialised` mediumtext COLLATE utf8mb4_unicode_520_ci,
`created_user_id` bigint(20) unsigned NOT NULL,
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
)
SELECT _version.*
FROM `wp_dash_competition_versions` _version
INNER JOIN (
SELECT competition_id, GROUP_CONCAT(id ORDER BY created_at DESC) _versions
FROM `wp_dash_competition_versions` GROUP BY competition_id
) group_versions
ON _version.competition_id = group_versions.competition_id
AND FIND_IN_SET(id, _versions) > 5
AND DATEDIFF(CURRENT_TIMESTAMP, created_at) > 20
Found in answer https://stackoverflow.com/a/15585351/972457
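Since the end goal is pruning, the same join can drive a multi-table DELETE. This is only a sketch: the derived table's GROUP BY forces it to be materialized, which is what lets MySQL delete from the same table it selects from here, but verify the row set with the SELECT form first. Also note that GROUP_CONCAT is limited by group_concat_max_len (1024 by default), which could silently truncate the id list for competitions with many versions.
DELETE _version
FROM `wp_dash_competition_versions` _version
INNER JOIN (
    SELECT competition_id, GROUP_CONCAT(id ORDER BY created_at DESC) _versions
    FROM `wp_dash_competition_versions`
    GROUP BY competition_id
) group_versions
    ON _version.competition_id = group_versions.competition_id
WHERE FIND_IN_SET(_version.id, group_versions._versions) > 5
    AND DATEDIFF(CURRENT_TIMESTAMP, _version.created_at) > 20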
My app needs to run this query pretty often to get a list of user data for the app to display. The problem is that the subquery on user_quiz is resource-heavy, and calculating the rankings is also very CPU-intensive.
Benchmark: ~0.5 seconds each run
When it will be run:
When the user wants to see their ranking
When the user wants to see other people's rankings
When getting a list of a user's friends
0.5 seconds is a really long time considering this query will be run pretty often. Is there anything I could do to optimize it?
Table for user:
CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`firstname` varchar(100) DEFAULT NULL,
`lastname` varchar(100) DEFAULT NULL,
`password` varchar(20) NOT NULL,
`email` varchar(300) NOT NULL,
`verified` tinyint(10) DEFAULT NULL,
`avatar` varchar(300) DEFAULT NULL,
`points_total` int(11) unsigned NOT NULL DEFAULT '0',
`points_today` int(11) unsigned NOT NULL DEFAULT '0',
`number_correctanswer` int(11) unsigned NOT NULL DEFAULT '0',
`number_watchedvideo` int(11) unsigned NOT NULL DEFAULT '0',
`create_time` datetime NOT NULL,
`type` tinyint(1) unsigned NOT NULL DEFAULT '1',
`number_win` int(11) unsigned NOT NULL DEFAULT '0',
`number_lost` int(11) unsigned NOT NULL DEFAULT '0',
`number_tie` int(11) unsigned NOT NULL DEFAULT '0',
`level` int(1) unsigned NOT NULL DEFAULT '0',
`islogined` tinyint(1) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=230 DEFAULT CHARSET=utf8;
Table for user_quiz:
CREATE TABLE `user_quiz` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`question_id` int(11) NOT NULL,
`is_answercorrect` int(11) unsigned NOT NULL DEFAULT '0',
`question_answer_datetime` datetime NOT NULL,
`score` int(1) DEFAULT NULL,
`quarter` int(1) DEFAULT NULL,
`game_type` int(1) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=9816 DEFAULT CHARSET=utf8;
Table for user_starter:
CREATE TABLE `user_starter` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`result` int(1) DEFAULT NULL,
`created_date` date DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=456 DEFAULT CHARSET=utf8mb4;
My indexes:
Table: user
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
user 0 PRIMARY 1 id A 32 BTREE
Table: user_quiz
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
user_quiz 0 PRIMARY 1 id A 9462 BTREE
user_quiz 1 user_id 1 user_id A 270 BTREE
Table: user_starter
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
user_starter 0 PRIMARY 1 id A 454 BTREE
user_starter 1 user_id 1 user_id A 227 YES BTREE
Query:
SET @curRank = 0;
SET @lastPlayerPoints = 0;
SELECT
sub.*,
@curRank := IF(@lastPlayerPoints != points_week, @curRank + 1, @curRank) AS rank,
@lastPlayerPoints := points_week AS db_PPW
FROM (
SELECT u.id,u.firstname,u.lastname,u.email,u.avatar,u.type,u.points_total,u.number_win,u.number_lost,u.number_tie,u.verified,
COALESCE(SUM(uq.score),0) as points_week,
COALESCE(us.number_lost,0) as number_week_lost,
COALESCE(us.number_win,0) as number_week_win,
(select MAX(question_answer_datetime) from user_quiz WHERE user_id = u.id and game_type = 1) as lastFrdFight,
(select MAX(question_answer_datetime) from user_quiz WHERE user_id = u.id and game_type = 2) as lastBotFight
FROM `user` u
LEFT JOIN (SELECT user_id,
count(case when result=1 then 1 else null end) as number_win,
count(case when result=-1 then 1 else null end) as number_lost
from user_starter where created_date BETWEEN '2016-01-11 00:00:00' AND '2016-05-12 05:10:27'
GROUP BY user_id) us ON u.id = us.user_id
LEFT JOIN (SELECT * FROM user_quiz WHERE question_answer_datetime BETWEEN '2016-01-11 00:00:00' AND '2016-05-12 00:00:00') uq on u.id = uq.user_id
GROUP BY u.id ORDER BY points_week DESC, u.lastname ASC, u.firstname ASC
) as sub
EXPLAIN:
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY <derived2> ALL 3027 100
2 DERIVED u ALL PRIMARY 32 100 Using temporary; Using filesort
2 DERIVED <derived5> ALL 1 100 Using where; Using join buffer (Block Nested Loop)
2 DERIVED <derived6> ref <auto_key0> <auto_key0> 4 fancard.u.id 94 100
6 DERIVED user_quiz ALL 9461 100 Using where
5 DERIVED user_starter ALL 454 100 Using where
4 DEPENDENT SUBQUERY user_quiz ref user_id user_id 4 func 35 100 Using where
3 DEPENDENT SUBQUERY user_quiz ref user_id user_id 4 func 35 100 Using where
Example output and expected output:
Benchmark: around 0.5 seconds
The following index should make the subqueries against user_quiz very fast:
ALTER TABLE user_quiz
ADD INDEX (`user_id`,`game_type`,`question_answer_datetime`)
Please provide SHOW CREATE TABLE tablename statements for all tables, as that will help with additional optimizations.
Update #1
Alright, I've had some time to look things over, and fortunately there appears to be a lot of relatively low-hanging fruit in terms of optimization.
Here are all the indexes to add:
ALTER TABLE user_quiz
ADD INDEX `userGametypeAnswerDatetimes` (`user_id`,`game_type`,`question_answer_datetime`);
ALTER TABLE user_quiz
ADD INDEX `userAnswerScores` (`user_id`,`question_answer_datetime`,`score`);
ALTER TABLE user_starter
ADD INDEX `userResultDates` (`user_id`,`result`,`created_date`);
Note that the names (such as userGametypeAnswerDatetimes) are optional, and you can name them to whatever makes the most sense to you. But, in general, it's good to put specific names on your custom indexes (simply for organization purposes.)
Now, here is your query, which should work well with those new indexes:
SET @curRank = 0;
SET @lastPlayerPoints = 0;
SELECT
sub.*,
@curRank := IF(@lastPlayerPoints != points_week, @curRank + 1, @curRank) AS rank,
@lastPlayerPoints := points_week AS db_PPW
FROM (
SELECT u.id,
u.firstname,
u.lastname,
u.email,
u.avatar,
u.type,
u.points_total,
u.number_win,
u.number_lost,
u.number_tie,
u.verified,
COALESCE(user_scores.score,0) as points_week,
COALESCE(user_losses.number_lost,0) as number_week_lost,
COALESCE(user_wins.number_win,0) as number_week_win,
(
select MAX(question_answer_datetime)
from user_quiz
WHERE user_id = u.id and game_type = 1
) as lastFrdFight,
(
select MAX(question_answer_datetime)
from user_quiz
WHERE user_id = u.id
and game_type = 2
) as lastBotFight
FROM `user` u
LEFT OUTER JOIN (
SELECT user_id,
COUNT(*) AS number_win
from user_starter
WHERE created_date BETWEEN '2016-01-11 00:00:00' AND '2016-05-12 05:10:27'
AND result = 1
GROUP BY user_id
) user_wins
ON user_wins.user_id = u.id
LEFT OUTER JOIN (
SELECT user_id,
COUNT(*) AS number_lost
from user_starter
WHERE created_date BETWEEN '2016-01-11 00:00:00' AND '2016-05-12 05:10:27'
AND result = -1
GROUP BY user_id
) user_losses
ON user_losses.user_id = u.id
LEFT OUTER JOIN (
SELECT user_id,
       SUM(score) AS score
FROM user_quiz
WHERE question_answer_datetime
BETWEEN '2016-01-11 00:00:00' AND '2016-05-12 00:00:00'
GROUP BY user_id
) user_scores
ON u.id = user_scores.user_id
ORDER BY points_week DESC, u.lastname ASC, u.firstname ASC
) as sub
Note: This is not necessarily the best result. Whether it is depends a lot on your data set, and sometimes you need to do a bit of trial and error.
A hint as to where you can use trial and error is the structure of how we query lastFrdFight and lastBotFight versus how we query points_week, number_week_lost, and number_week_win. All of these could either be done in the SELECT clause (as the first two are in my query) or by joining to a subquery result (as the last three are).
Mix and match to see what works best. In general, I've found the joining to a subquery to be fastest when you have a large number of rows in the outer query (in this case, querying the user table.) This is because it only needs to get the results once, and then can just match them up on a user by user basis. Other times, it can be better to have the query just in the SELECT clause - this will run MUCH faster, since there are more constants (the user_id is already known), but has to run for each row. So it's a trade off, and why you sometimes need to use trial and error.
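For example, stripped down to the relevant columns, the lastFrdFight lookup rewritten as a joined subquery would look something like this (a sketch only; the last_frd alias is mine, and whether it beats the correlated form depends on your data):
SELECT u.id,
       last_frd.lastFrdFight
FROM `user` u
LEFT OUTER JOIN (
    SELECT user_id, MAX(question_answer_datetime) AS lastFrdFight
    FROM user_quiz
    WHERE game_type = 1
    GROUP BY user_id
) last_frd
    ON last_frd.user_id = u.id
In the full query you would keep that join and select last_frd.lastFrdFight in place of the correlated subquery.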
Why do the indexes work?
So, you may be wondering why I made the indexes as I did. If you are familiar with phone books (in this age of smartphones, that's no longer a valid assumption I can make) then we can use that as an analogy:
If you had a composite index phonebookIndex (lastname, firstname, email) on your user table (an example only - you don't actually need to add that index!) you would have a result similar to what a phone book provides (using email instead of phone number).
Each index is an internal copy of the data in the overall table. With this phonebookIndex, the database would internally store a list of all users ordered by last name, then first name, then email, just like a phone book.
Why is that useful? Consider when you know someone's first and last name. You can quickly flip to their last name, then go through that short list of everyone with that last name to find the first name you want, and so obtain the email.
Indexes work in exactly the same way, in terms of how the database looks at them.
Consider the userGametypeAnswerDatetimes index I defined above, and how we query that index in the lastFrdFight SELECT subquery.
(
select MAX(question_answer_datetime)
from user_quiz
WHERE user_id = u.id and game_type = 1
) as lastFrdFight
Notice how we have both the user_id (from the outer query) and the game_type as constants. That is exactly like our earlier example of having the first and last name and wanting to look up an email. In this case, we are looking for the MAX of the third value in the index. Still easy: all the values are ordered, so if this index were sitting in front of us, we could flip to the specific user_id, look at the section with game_type = 1, and just pick the last value to find the maximum. Very, very fast. The same goes for the database: it can find this value extremely fast, which is why you saw an 80%+ reduction in your overall query time.
So, that's how indexes work, and why I chose these indexes as I did.
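If you want to confirm that the optimizer actually picks the new index, you can EXPLAIN the correlated subquery in isolation (the user id 10 below is just a placeholder):
EXPLAIN
SELECT MAX(question_answer_datetime)
FROM user_quiz
WHERE user_id = 10 AND game_type = 1
You should see userGametypeAnswerDatetimes in the key column, or possibly 'Select tables optimized away' once the MAX can be read straight off the index.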
Be aware, that the more indexes you have, the more you'll see slowdowns when doing inserts and updates. But, if you are reading a lot more from your tables than you are writing, this is usually a more than acceptable trade off.
So, give these changes a shot and let me know how it performs. Please provide the new EXPLAIN plan if you want further optimization help. These changes should also give you quite a few tools for trial and error to see what works and what doesn't. All my changes are fairly independent of each other, so you can swap them in and out against your original query pieces to see how each one performs.
I've got a submissions table, and each submission in it has a type of either tip or request.
I'm trying to grab all the submissions of a particular user (to display as an aggregation of all their activity on their dashboard).
E.g.
You have submitted: 5 requests and 1 tip.
My submissions create table looks like this:
Table: submissions
Create Table: CREATE TABLE `submissions` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(255) NOT NULL,
`slug` varchar(255) NOT NULL,
`description` mediumtext NOT NULL,
`user_id` int(11) NOT NULL,
`created` datetime NOT NULL,
`type` enum('tip','request') NOT NULL,
`thumbnail` varchar(64) CHARACTER SET latin1 DEFAULT NULL,
`removed` tinyint(1) unsigned NOT NULL DEFAULT '0',
`keywords` varchar(255) NOT NULL,
`ip` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
FULLTEXT KEY `search` (`title`,`description`,`keywords`)
) ENGINE=InnoDB AUTO_INCREMENT=22 DEFAULT CHARSET=utf8
I came up with a query that works and gives me the count of the user's submissions, but each submission (row that comes back) stores the type as either tip or request, so now I'm trying to figure out how to aggregate that info.
Here is my query, which returns the user with the count of all tips. I'm trying to do one for requests as well.
SELECT users.*, count(submissions.id)
AS "tipsCount"
FROM users
LEFT JOIN submissions on users.id = submissions.user_id
WHERE username = 'blahbster'
AND submissions.type = 'tip'
ORDER BY submissions.created DESC
LIMIT 1;
Perhaps I could use a sum here? My attempt:
SELECT users.*,
SUM(case when type = 'tip' then 1 else 0 end) as "tipsCount",
SUM(case when type = 'request' then 1 else 0 end) as "requestsCount"
FROM users
LEFT JOIN submissions on users.id = submissions.user_id
WHERE username = 'blahbster'
ORDER BY submissions.created DESC
LIMIT 1;
SELECT a.username,
SUM(case when b.type = 'tip' then 1 else 0 end) as "tipsCount",
SUM(case when b.type = 'request' then 1 else 0 end) as "requestsCount"
FROM users as a
LEFT JOIN submissions as b
ON a.id = b.user_id
GROUP BY a.username;
The second query you had was close, but it wasn't aggregating a particular user's total tips and total requests - the SUM didn't compute across anything, i.e. there was no GROUP BY. The query above should help. You can obviously add the WHERE filter back in if you need it.
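For a single user's dashboard, that would look roughly like this (a sketch; 'blahbster' is just the username from your example):
SELECT a.username,
       SUM(case when b.type = 'tip' then 1 else 0 end) as "tipsCount",
       SUM(case when b.type = 'request' then 1 else 0 end) as "requestsCount"
FROM users as a
LEFT JOIN submissions as b
    ON a.id = b.user_id
WHERE a.username = 'blahbster'
GROUP BY a.username;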
Table:
CREATE TABLE `messages` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`from_user_id` int(11) unsigned NOT NULL,
`to_user_id` int(11) unsigned NOT NULL,
`seen` tinyint(1) NOT NULL DEFAULT '0',
`sent` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`message` text NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `messages` (`id`, `from_user_id`, `to_user_id`, `seen`, `sent`, `message`)
VALUES
(1,2,1,0,'2013-11-06 11:05:42','Hello!'),
(2,1,2,0,'2013-11-06 11:05:52','Hello you too!'),
(3,3,1,0,'2013-11-06 11:06:08','Whats up?'),
(4,1,3,0,'2013-11-06 11:06:27','Not much');
I would put this in sqlfiddle, but it is down for me - I have not been able to access it for the last 32 hours.
I've searched SO and found several related topics, but with different grouping requirements, so I could not apply them to my project.
For the query, I know the current user id and the target user id, and based on this I want the query to return the whole conversation between these two users, ordered by date.
I was thinking something like:
SELECT message FROM messages WHERE from_user_id = 1 OR to_user_id = 1 [but where do I limit this query to target user 3?]
In other words I want to select:
(3,3,1,0,'2013-11-06 11:06:08','Whats up?')
(4,1,3,0,'2013-11-06 11:06:27','Not much')
For user 1, conversation with user 3. All of it.
SELECT message FROM messages
WHERE 1 in (from_user_id,to_user_id) and 3 in (from_user_id,to_user_id)
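If you also want the other columns and the chronological ordering you mentioned, the same predicate extends naturally (a sketch; adjust the column list as needed):
SELECT id, from_user_id, to_user_id, sent, message
FROM messages
WHERE 1 in (from_user_id, to_user_id)
  AND 3 in (from_user_id, to_user_id)
ORDER BY sent ASC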
Older questions seen
Counting one table of records for matching records of another table
MySQL Count matching records from multiple tables
Count records from two tables grouped by one field
Table(s) Schema
Table entries has data going back to 2005-01-25
CREATE TABLE `entries` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`ctg` VARCHAR(15) NOT NULL,
`msg` VARCHAR(200) NOT NULL,
`nick` VARCHAR(30) NOT NULL,
`date` DATETIME NOT NULL,
PRIMARY KEY (`id`),
INDEX `msg` (`msg`),
INDEX `date` (`date`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM;
Child table magnets with regular data from 2011-11-08 (there might be a few entries from before that)
CREATE TABLE `magnets` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`eid` INT(10) UNSIGNED NOT NULL,
`tth` CHAR(39) NOT NULL,
`size` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0',
`nick` VARCHAR(30) NOT NULL DEFAULT 'hjpotter92',
`date` DATETIME NOT NULL,
PRIMARY KEY (`id`),
UNIQUE INDEX `eid_tth` (`eid`, `tth`),
INDEX `entriedID` (`eid`),
INDEX `tth_size` (`tth`, `size`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM;
Question
I want to get the total count of entries by any particular nick (or user) across either of the tables.
One of the entries in entries is populated at the same time as a magnets row, and the subsequent magnets entries can be from the same nick or a different one.
My Code
Try 1
SELECT `e`.id, COUNT(1), `e`.nick, `m`.nick
FROM `entries` `e`
INNER JOIN `magnets` `m`
ON `m`.`eid` = `e`.id
GROUP BY `e`.nick
Try 2
SELECT `e`.id, COUNT(1), `e`.nick
FROM `entries` `e`
GROUP BY `e`.nick
UNION ALL
SELECT `m`.eid, COUNT(1), `m`.nick
FROM `magnets` `m`
GROUP BY `m`.nick
The second try generates some relevant output, but it contains duplicate rows for every nick that appears in both tables.
Also, I don't want to count twice those entries/magnets that were inserted together in the first place, which is what the second part of the UNION is doing: it takes in all the values from both tables.
SQL Fiddle link
Here is the link to a SQL Fiddle along with randomly populated entries.
I really hope someone can guide me through this. If it helps, I will be using PHP for the final display of the data, so my last resort would be to nest loops in PHP for the counting (which is what I am currently doing).
Desired output
The output that should be generated on the fiddle should be:
************************************************
** Nick ||| Count **
************************************************
** Nick1 ||| 10 **
** Nick2 ||| 9 **
** Nick3 ||| 6 **
** Nick4 ||| 10 **
************************************************
There might be a more efficient way but this works if I understand correctly:
SELECT SUM(cnt), nick FROM
  (SELECT COUNT(*) cnt, e.nick
   FROM entries e
   LEFT JOIN magnets m ON (e.id = m.eid AND e.nick = m.nick)
   WHERE m.eid IS NULL
   GROUP BY e.nick
   UNION ALL
   SELECT COUNT(*) cnt, nick
   FROM magnets m
   GROUP BY nick) u
GROUP BY nick
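The LEFT JOIN ... IS NULL part is an anti-join: it counts only those entries rows that have no matching magnet from the same nick, so they are not counted again by the magnets branch. If you find it clearer, the same idea can be written with NOT EXISTS (a sketch; the results should be equivalent):
SELECT nick, SUM(cnt) AS total
FROM (
  SELECT e.nick, COUNT(*) AS cnt
  FROM entries e
  WHERE NOT EXISTS (SELECT 1 FROM magnets m
                    WHERE m.eid = e.id AND m.nick = e.nick)
  GROUP BY e.nick
  UNION ALL
  SELECT nick, COUNT(*) AS cnt
  FROM magnets
  GROUP BY nick
) u
GROUP BY nick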