Order by two fields - Indexing - mysql

So I've got a table with all users, and their values. And I want to order them after how much "money" they got. The problem is that they have money in two seperate fields: users.money and users.bank.
So this is my table structure:
CREATE TABLE IF NOT EXISTS `users` (
`id` int(4) unsigned NOT NULL AUTO_INCREMENT,
`username` varchar(54) COLLATE utf8_swedish_ci NOT NULL,
`money` bigint(54) NOT NULL DEFAULT '10000',
`bank` bigint(54) NOT NULL DEFAULT '10000',
PRIMARY KEY (`id`),
KEY `users_all_money` (`money`,`bank`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_swedish_ci AUTO_INCREMENT=100 ;
And this is the query:
SELECT id, (money+bank) AS total FROM users FORCE INDEX (users_all_money) ORDER BY total DESC
Which works fine, but when I run EXPLAIN it shows "Using filesort", and I'm wondering if there is any way to optimize it?

Because you want to sort by a derived value (one that must be calculated for each row) MySQL can't use the index to help with the ordering.
The only solution that I can see would be to create an additional total_money or similar column and as you update money or bank update that value too. You could do this in your application code or it would be possible to do this in MySQL with triggers too if you wanted.

Related

MySQL composite index effect on joins

I have the following SQL query (DB is MySQL 5):
select
event.full_session_id,
DATE(min(event.date)),
event_exe.user_id,
COUNT(DISTINCT event_pat.user_id)
FROM
event AS event
JOIN event_participant AS event_pat ON
event.pat_id = event_pat.id
JOIN event_participant AS event_exe on
event.exe_id = event_exe.id
WHERE
event_pat.user_id <> event_exe.user_id
GROUP BY
event.full_session_id;
"SHOW CREATE TABLE event":
CREATE TABLE `event` (
`id` int(12) NOT NULL AUTO_INCREMENT,
`date` datetime NOT NULL,
`session_id` varchar(64) DEFAULT NULL,
`full_session_id` varchar(72) DEFAULT NULL,
`pat_id` int(12) DEFAULT NULL,
`exe_id` int(12) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `SESSION_IDX` (`full_session_id`),
KEY `PAT_ID_IDX` (`pat_id`),
KEY `DATE_IDX` (`date`),
KEY `SESSLOGPATEXEC_IDX` (`full_session_id`,`date`,`pat_id`,`exe_id`)
) ENGINE=MyISAM AUTO_INCREMENT=371955 DEFAULT CHARSET=utf8
"SHOW CREATE TABLE event_participant":
CREATE TABLE `event_participant` (
`id` int(12) NOT NULL AUTO_INCREMENT,
`user_id` varchar(64) NOT NULL,
`alt_user_id` varchar(64) NOT NULL,
`username` varchar(128) NOT NULL,
`usertype` varchar(32) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `ALL_UNQ` (`user_id`,`alt_user_id`,`username`,`usertype`),
KEY `USER_ID_IDX` (`user_id`)
) ENGINE=MyISAM AUTO_INCREMENT=5397 DEFAULT CHARSET=utf8
Also, the query itself seems ugly, but this is legacy code on a production system, so we are not expected to change it (at least for now).
The problem is that, there is around 36 million record on the event table (in the production system), so there have been frequent crashes of the DB machine due to using temporary;using filesort processing (they provided these EXPLAIN outputs, unfortunately, I don't have them right now. I'll try to update them to this post later.)
The customer asks for a "quick fix" by adding indices. Currently we have indices on full_session_id, pat_id, date (separately) on event and user_id on event_participant.
Thus I'm thinking of creating a composite index (pat_id, exe_id, full_session_id, date) on event- this index comprises of the fields in the join (equivalent to where ?), then group by, then aggregate (min) parts.
This is just an idea because we currently don't have that kind of data volume to test, so we try the best we could first.
My question is:
Could the index above help in the performance ? (It's quite confusing on the effect because I have found two really contrasting results: https://dba.stackexchange.com/questions/158385/compound-index-on-inner-join-table
versus Separate Join clause in a Composite Index, where the latter suggests that composite index on joins won't work and the former that it'll work.
Does this path (adding indices) have hopes ? Or should we forget it and just try to optimize the query instead ?
Thanks in advance for your help :)
Update:
I have updated the full table description for the two related tables.
MySQL version is 5.1.69. But I think we don't need to worry about the ambiguous data issue mentioned in the comments, because it seems there won't be ambiguity for our data. Specifically, for each full_session_id, there is only one "event_exe.user_id" returned (it's just a business logic in the application)
So, what do you think about my 2 questions ?

MySQL sorts tables with one unique column automatically. Where is this documented?

I'm running MySQL 5.5 and found behaviour I didn't know of before.
Given this create:
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(128) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `name_UQ` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
With these inserts:
insert into test (name) values ('b');
insert into test (name) values ('a');
And this select:
select * from test;
MySQL does something I wasn't aware of:
2 a
1 b
It sorts automatically.
Given a table with one extra, non-unique column:
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(128) DEFAULT NULL,
`other_column` varchar(128) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `name_UQ` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And the same inserts (see above), the select (see above) gives this result:
1 b NULL
2 a NULL
Which is kind of expected.
Where is the behaviour of the first query (SQL Fiddle) documented? I'd like to see more of these peculiar things.
MySQL does not sort result sets automatically. The ordering of a result set is indeterminate unless the query specifies an order by clause.
You should never rely on any sort of "implicit" ordering. Just because you see it in 1 (or 100 queries). In fact, without an order by, the same query can return results in different orders on subsequent runs (although I'll admit that this regularly occurs in other database, it is unlikely in MySQL).
Instead, add the ORDER BY. Ordering by a primary key is remarkably efficient, so you don't have to worry about performance.

Optimize MySQL count query with JOIN

I have a query that takes about 20 seconds, I would like to understand if there is a way to optimize it.
Table 1:
CREATE TABLE IF NOT EXISTS `sessions` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(10) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=9845765 ;
And table 2:
CREATE TABLE IF NOT EXISTS `access` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`session_id` int(10) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `session_id ` (`session_id `)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=9467799 ;
Now, what I am trying to do is to count all the access connected to all sessions about one user, so my query is:
SELECT COUNT(*)
FROM access
INNER JOIN sessions ON access.session_id=session.id
WHERE session.user_id='6';
It takes almost 20 seconds...and for user_id 6 there are about 3 millions sessions stored.
There is anything I can do to optimize that query?
Change this line from the session table:
KEY `user_id` (`user_id`)
To this:
KEY `user_id` (`user_id`, `id`)
What this will do for you is allow you to complete the query from the index, without going back to the raw table. As it is, you need to do an index scan on the session table for your user_id, and for each item go back to the table to find the id for the join to the access table. By including the id in the index, you can skip going back to the table.
Sadly, this will make your inserts slower into that table, and it seems like this may be a bid deal, given just one user has 3 millions sessions. Sql Server and Oracle would address this by allowing you to include the id column in your index, without actually indexing on it, saving a little work at insert time, and also by allowing you specify a lower fill factor for the index, reducing the need to re-build or re-order the indexes at insert, but MySql doesn't support these.

Should we use the "LIMIT clause" in following example?

There is a structure:
CREATE TABLE IF NOT EXISTS `categories` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`parent_id` int(11) unsigned NOT NULL DEFAULT '0',
`title` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Query_1:
SELECT * FROM `categories` WHERE `id` = 1234
Query_2:
SELECT * FROM `categories` WHERE `id` = 1234 LIMIT 1
I need to get just one row. Since we apply WHERE id=1234 (finding by PRIMARY KEY) obviously that row with id=1234 is only one in whole table.
After MySQL has found the row, whether engine to continue the search when using Query_1?
Thanks in advance.
Look at this SQLFiddle: http://sqlfiddle.com/#!2/a8713/4 and especially View Execution Plan.
You see, that MySQL recognizes the predicate on a PRIMARY column and therefore it does not matter if you add LIMIT 1 or not.
PS: A little more explanation: Look at the column rows of the Execution Plan. The number there is the amount of columns, the query engine thinks, it has to examine. Since the columns content is unique (as it's a primary key), this is 1. Compare it to this: http://sqlfiddle.com/#!2/9868b/2 same schema but without primary key. Here rows says 8. (The execution plan is explained in the German MySQL reference, http://dev.mysql.com/doc/refman/5.1/en/explain.html the English one is for some reason not so detailed.)

Will this MySQL index increase performance?

I have the following MySQL table:
CREATE TABLE `my_data` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`tstamp` int(10) unsigned NOT NULL DEFAULT '0',
`name` varchar(255) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
I want to optimise for the following SELECT query only:
SELECT `id`, `name` FROM `my_data` ORDER BY `name` ASC
Will adding the following index increase performance, regardless of the size of the table?
CREATE INDEX `idx_name_id` ON `my_data` (`name`,`id`);
An EXPLAIN query suggests it would, but I have no quick way of testing with a large data set.
Yes it would. Even though you are still doing a full table scan, having the index will make the sort operation (due to the order by) unnecessary.
But it will also add overhead to inserts and delete statements!
Index are usefull when you use the column in the where clause, or when the column is a part of the condition for a link between two tables. See MySQL Doc