Index part of the mysql/innodb table? - mysql

I am sorry if this is a dumb question (because it sounds unlikely).
I have a table with 20 million rows. However, only about 300K of these rows are accessed regularly, and they can be identified by the column condition "app_user=1".
Is there any way I can index just those rows, provided that I always pass that condition in when I run a SELECT?

I would recommend splitting the table into two separate tables. But if you don't want to do that, and you're always going to include "where app_user=1" in your queries, the highest-performance option is to create a primary key on the table that has the app_user column as its first part. InnoDB will use this as the clustered index, which saves you a few extra disk accesses. You can create the table like this:
create table testTable (
app_user tinyint UNSIGNED NOT NULL default 0,
id int UNSIGNED NOT NULL,
name varchar(255) default '',
PRIMARY KEY (app_user, id)
) ENGINE=InnoDB;
A friend wrote this article on clustered indexes in InnoDB a while back:
http://www.joehruska.com/?p=6

Add a column called app_user and put an index on it, then include "WHERE app_user = 1" in your query.
You could go further and partition your table on that column.
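The index-based suggestion above might look roughly like the following. This is only a sketch: the table name big_table and the index name idx_app_user are placeholders that do not appear in the question, and whether the optimizer actually picks the index depends on the data.

```sql
-- Hypothetical table name; substitute the real 20-million-row table.
ALTER TABLE big_table ADD INDEX idx_app_user (app_user);

-- Always pass the condition so the optimizer can consider the index.
EXPLAIN SELECT * FROM big_table WHERE app_user = 1;
```

If the key column of the EXPLAIN output shows idx_app_user, the query reads only the matching rows instead of the whole table. Note that the index still holds an entry for every row in the table, not just the 300K rows with app_user = 1.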

Related

is there any difference with joint primary key order?

I am curious: does the column order in a composite primary key make any difference?
For example, is there any difference between the primary keys of the two tables below? Or does the key order make no difference to the table?
CREATE TABLE `Q3` (
`user_id` VARCHAR(20) NOT NULL,
`retweet_id` VARCHAR(20) NOT NULL,
PRIMARY KEY (`user_id`,`retweet_id`)
)
vs
CREATE TABLE `Q3` (
`user_id` VARCHAR(20) NOT NULL,
`retweet_id` VARCHAR(20) NOT NULL,
PRIMARY KEY (`retweet_id`,`user_id`)
)
It does make a difference to the index structure.
In a composite index, each index entry consists of several values stored one after another, and their order determines which queries can be optimized using that particular index.
For example:
For the index created as
PRIMARY KEY (`user_id`,`retweet_id`)
A query like WHERE user_id = 42 can be optimized (not guaranteed, but technically possible), whereas a query like WHERE retweet_id = 4242 cannot.
PS: it's a good idea to always have an artificial primary key, such as a sequence (or an auto-increment column in the case of MySQL), instead of using natural primary keys. The primary key is the clustered key, which means it defines how rows are physically stored in pages on disk, so it's a good idea for a PK to be monotonically increasing (or decreasing, it doesn't matter).
The order does affect how the index is used in queries. When you use multiple columns, each column is a sub-tree of the preceding column.
In your first case (user_id, retweet_id), if you search the index for user_id 1, you then have all the retweet_ids under that.
Consequently, if you want to search only for retweet_id=7 (across all users), the index cannot be used, because you would first need to step through each user's entries in the index.
So if you want to query by user_id or retweet_id individually (without the other), put that column first. If you need both, you could consider adding a secondary index.
There are also limitations for range scans: you can only effectively use the last column queried for the range scan. You can read more about all of this here:
http://dev.mysql.com/doc/refman/5.6/en/multiple-column-indexes.html
Additionally if using InnoDB, the tables are stored in order of the PRIMARY KEY. This might matter for performance depending on how you query your data.
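As a rough way to check the column-order rule described in these answers, you can compare the plans for the two lookups against the first table definition. The index name idx_retweet is a placeholder, and the exact EXPLAIN output depends on the MySQL version and the data.

```sql
-- With PRIMARY KEY (`user_id`, `retweet_id`):

-- Filters on the leading column, so the key can be used for a lookup.
EXPLAIN SELECT * FROM Q3 WHERE user_id = '42';

-- Not a leftmost prefix: the key cannot be used for a direct lookup,
-- so expect a scan over all entries instead.
EXPLAIN SELECT * FROM Q3 WHERE retweet_id = '4242';

-- Hypothetical secondary index for lookups by retweet_id alone.
ALTER TABLE Q3 ADD INDEX idx_retweet (retweet_id);
```

After adding the secondary index, running the second EXPLAIN again should show whether the optimizer now picks idx_retweet.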

MySQL searching query?

I'm new to MySQL and want to know: if I have a table with 25 columns, the first of which is "id", would the server scan through the whole table every time it searches for a particular "id"?
If you construct the query like SELECT * FROM $table_name WHERE table_id=$id; and that column is indexed (for example, as the primary key), then it will not scan the whole table.
And as @dku.rajkumar says in the comment, it depends on what you want to fetch and on your query structure.
It may also depend on the query and on the storage engine you choose to use,
such as MyISAM or InnoDB.
For example:
CREATE TABLE tablename (
id INT UNSIGNED PRIMARY KEY
)ENGINE=MyIsam;
CREATE TABLE tablename (
id INT UNSIGNED PRIMARY KEY
)ENGINE=InnoDB;
The way tables are stored differs depending on the storage engine, and that in turn affects how the MySQL server (mysqld) performs the search.
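To see for yourself whether a lookup by id scans the whole table, you can ask MySQL for the query plan. This is a minimal sketch against the example table above; the value 42 is arbitrary.

```sql
-- id is the PRIMARY KEY, so an equality lookup can go through the index.
EXPLAIN SELECT * FROM tablename WHERE id = 42;
```

On a populated table, the type column of the output is what to look at: a value of ALL means a full table scan, while a primary key lookup like this one is normally reported as const.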

Indexes in MySQL (Unique, Write Statement)

I have two questions.
If my table contains a unique column like this:
DROP TABLE IF EXISTS TestTable;
CREATE TABLE TestTable(
ID INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
GUID VARCHAR(32) UNIQUE NULL);
Do I need to create an index for this GUID column?
Note: I use the GUID column in WHERE clauses and when joining tables.
My second question: will an UPDATE statement affect the index if the updated column(s) are not indexed?
No. UNIQUE is itself a kind of index, so you don't need another index on the same column.
An update won't touch the index if the changed column is not indexed.
Indexes that are not changed do not get updated.
It depends on which database you are using. Different databases have different ways of indexing.
If you are using InnoDB, the primary key and unique keys are already indexes, so you don't need to. If you manually create yet another index on the GUID column, you will end up with a redundant index on that column, which wastes space.
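One way to confirm that the UNIQUE constraint already provides an index is to list the table's indexes. The sketch below uses the TestTable definition from the question; the GUID value in the lookup is made up.

```sql
-- Lists PRIMARY plus the index MySQL created for the UNIQUE column.
SHOW INDEX FROM TestTable;

-- Hypothetical lookup value; the plan should be able to use the unique index.
EXPLAIN SELECT ID FROM TestTable
WHERE GUID = '0123456789abcdef0123456789abcdef';
```

When no name is given, MySQL names the unique index after the column, so it should appear as GUID in the Key_name column.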

Why MYSQL doesn't use the index for the same query if I query more columns?

I have the following table:
create table stuff (
id mediumint unsigned not null auto_increment primary key,
title varchar(150) not null,
link varchar(250) not null,
time timestamp default current_timestamp not null,
content varchar(1500)
);
If I EXPLAIN the query
select id from stuff order by id;
then it says it uses the primary key as an index for ordering the results. But with this query:
select id,title from stuff order by id;
EXPLAIN says there are no possible keys, and it resorts to filesort.
Why is that? Isn't the data of a given row stored together in the database? If it can order the results using the index when I query only the id, why does adding another column to the query make a difference? The primary key already identifies the row, so I think it should use the primary key for ordering in the second case too.
Can you explain why this is not the case?
Sure: MySQL skips the index because it is the less efficient plan for this query. To use it, MySQL would have to read the full index and then fetch the rows from the data file one by one, which is extremely inefficient. Instead, MySQL prefers to read the data straight from the data file and sort it.
Also, which storage engine do you use? It looks like MyISAM.
In this case InnoDB would be more efficient, since it uses a clustered index on the primary key (which is monotonically increasing in your case).
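If you want to test the storage engine suggestion, the steps could look like this. It is a sketch only: converting the engine rebuilds the table, and the resulting plan still depends on the MySQL version and the data.

```sql
-- Check which storage engine the table currently uses (see the Engine column).
SHOW TABLE STATUS LIKE 'stuff';

-- Convert to InnoDB, where rows are stored in primary key order.
ALTER TABLE stuff ENGINE = InnoDB;

-- Re-check the plan for the query that resorted to filesort.
EXPLAIN SELECT id, title FROM stuff ORDER BY id;
```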

Query optimization

SELECT nar.name, nar.reg, stat.lvl
FROM members AS nar
JOIN stats AS stat
ON stat.id = nar.id
WHERE nar.ref = 9
I have indexes on id in both tables, and I have an index on referavo as well. But it still checks all rows in the stats table (I used EXPLAIN to get this information), while in the members table it checks only one row, as it is supposed to. What's wrong with the stats table? Thank you very much.
CREATE TABLE `members` (
`id` int(11) NOT NULL,
`ref` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT
CREATE TABLE `stats` (
`id` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=37 DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC
id  select_type  table  type    possible_keys  key      key_len  ref                rows  Extra
1   SIMPLE       stat   ALL     PRIMARY        NULL     NULL     NULL               22
1   SIMPLE       nar    eq_ref  PRIMARY        PRIMARY  4        table_nme.stat.id  1     Using where
Your tables are ridiculously small - just 23 rows is tiny.
MySQL chooses different query plans depending on how many rows there are in the table and on how many it estimates will be selected (from the statistics). You should performance test your queries with realistic data: both the amount of data and the distribution of values in the data should be as realistic as possible. Otherwise the query plan MySQL chooses in testing might not be the same as the actual query plan on your live system.
Your tables are so small that using an index could be slower than just checking the table directly. Remember that checking data that is already in memory is fast, but disk reads are slow. Accessing an index can require an extra read: first the index has to be fetched and read to find which rows to select, then, if your index isn't a covering index, the relevant rows in the table have to be fetched and read to get the values that aren't in the index. MySQL is perfectly entitled not to use an index, even if one is available, if it believes that doing so would result in a slower plan.
Put some more rows in your table (thousands) and try running EXPLAIN again. You will probably find that when you have more rows, the PRIMARY KEY index will be used for the join.
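Once the tables hold a realistic amount of data, re-checking the plan might look like the sketch below. It reuses the query from the question; the ANALYZE TABLE step is an addition here, to make sure the optimizer's statistics reflect the new rows.

```sql
-- Refresh index statistics after loading realistic data.
ANALYZE TABLE members, stats;

-- Re-run EXPLAIN on the original query and compare the plan.
EXPLAIN
SELECT nar.name, nar.reg, stat.lvl
FROM members AS nar
JOIN stats AS stat
ON stat.id = nar.id
WHERE nar.ref = 9;
```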
MySQL typically uses only one index per table in a query, so it finds the members row using the index and then performs a sequential search for the ID.
You have to create a multi-column index on the members table:
CREATE INDEX idref ON members(id,ref);
Please try the reversed column order as well if that doesn't help (first: drop index idref on members):
CREATE INDEX idref ON members(ref,id);
(I cannot try it myself now)