MySQL - Improve Performance for Multi Country

I have a table company with a column country_id that references the primary key Id of the country table.
CREATE TABLE `company` (
`Id` INT(11) NOT NULL,
`Name` VARCHAR(100) NOT NULL,
`Symbol` VARCHAR(50) NOT NULL,
`Industry` VARCHAR(100) NOT NULL,
`Type` VARCHAR(20) NOT NULL,
`country_id` INT(11) NOT NULL,
PRIMARY KEY (`Id`),
CONSTRAINT `company_ibfk_1` FOREIGN KEY (`country_id`) REFERENCES `country` (`Id`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB;
This table holds many companies across many countries.
My company table holds a huge amount of data. I want to improve the performance of SELECT queries on the company table that filter by country. The query will look like this:
select * from company where country_id = 2
What is the best design approach? Do I need to add an index on the country_id column, or should I partition the table by country_id? Please suggest.

If you have a large number of countries (more than 100), you should use an index.
If there are only a few countries in your DB, partitioning would be the better choice.
But this option has one big disadvantage: whenever you add a new country, you also have to add a new partition for it.
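The drawback described above can be sketched as follows. This is only an illustration under stated assumptions: the table name `company_by_country`, the partition names, and the one-partition-per-country LIST layout are hypothetical and not taken from the question. Note that MySQL requires the partitioning column to be part of every unique key, which is why `country_id` appears in the primary key here.

```sql
-- Hypothetical table: one LIST partition per country.
CREATE TABLE `company_by_country` (
  `Id` INT NOT NULL,
  `Name` VARCHAR(100) NOT NULL,
  `country_id` INT NOT NULL,
  -- The partitioning column must be part of every unique key.
  PRIMARY KEY (`Id`, `country_id`)
)
ENGINE=InnoDB
PARTITION BY LIST (`country_id`) (
  PARTITION p_country_1 VALUES IN (1),
  PARTITION p_country_2 VALUES IN (2)
);

-- Rows with country_id = 3 are rejected until a partition exists for that value,
-- so every new country needs a statement like this one.
ALTER TABLE `company_by_country`
  ADD PARTITION (PARTITION p_country_3 VALUES IN (3));
```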

InnoDB automatically creates an index on your foreign key column country_id, so you do not need to add one yourself.
As for partitioning: InnoDB does not support foreign keys on partitioned tables. If you partition, you can only create a plain index on country_id (not a foreign key) and join the two tables when needed.
If you have more than one million records in company and your query is not fast enough, you can use partitioning, as below:
CREATE TABLE `company` (
`Id` INT(11) NOT NULL,
`Name` VARCHAR(100) NOT NULL,
`Symbol` VARCHAR(50) NOT NULL,
`Industry` VARCHAR(100) NOT NULL,
`Type` VARCHAR(20) NOT NULL,
`country_id` INT(11) NOT NULL,
PRIMARY KEY (`Id`, `country_id`), -- every unique key must include the partitioning column
INDEX (`country_id`)
)
-- table options must come before the PARTITION BY clause
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
PARTITION BY RANGE (`country_id`) (
PARTITION P1 VALUES LESS THAN (20),
PARTITION P2 VALUES LESS THAN (50),
PARTITION P3 VALUES LESS THAN (100),
PARTITION P4 VALUES LESS THAN MAXVALUE);
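Whichever design is chosen, it is worth checking what the optimizer actually does with the query from the question. A minimal sketch, assuming the `company` table exists as defined in this thread:

```sql
-- List the indexes on the table; with the foreign key in place,
-- an index on country_id should already appear here.
SHOW INDEX FROM company;

-- Show the execution plan. The "key" column names the index used;
-- on a partitioned table the "partitions" column lists the partitions scanned
-- (older MySQL versions need EXPLAIN PARTITIONS for that column).
EXPLAIN SELECT * FROM company WHERE country_id = 2;
```

If the plan shows a full table scan, or lists every partition, the index or the partitioning scheme is not helping this query.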

Related

Is there a way to use MySQL fulltext to search related tables?

I have a table called persons which contains data about, well, people. It also contains foreign keys to another table. I'd like to make a fulltext index that is able to search the related tables for full text.
Here is some sample data: (see http://sqlfiddle.com/#!9/036fc5/2)
CREATE TABLE IF NOT EXISTS `states` (
`id` char(2) NOT NULL,
`name` varchar(45) NOT NULL,
PRIMARY KEY (`id`)
);
INSERT INTO `states` (`id`, `name`) VALUES
('NY', 'New York'),
('NJ', 'New Jersey'),
('CT', 'Connecticut'),
('PA', 'Pennsylvania');
CREATE TABLE IF NOT EXISTS `persons` (
`id` int auto_increment NOT NULL,
`first_name` varchar(45) NOT NULL,
`last_name` varchar(45) NOT NULL,
`state_id` char(2) not null,
PRIMARY KEY (`id`),
FULLTEXT (first_name, last_name, state_id)
);
INSERT INTO `persons` (`first_name`, `last_name`, `state_id`) VALUES
('Arnold', 'Asher', 'NY'),
('Bert', 'Bertold', 'NJ'),
('Charlie', 'Chan', 'NJ'),
('Darrin', 'Darcy', 'CT');
So, I'd like to be able to search for persons from "Jersey", such as:
SELECT * FROM persons WHERE MATCH(first_name, last_name, state_id) AGAINST('Jersey');
But, of course, the text "Jersey" exists only in the states table and not in the persons table. Does it make sense to make a materialized/generated index? Is there a simpler way?
You need to put a separate full-text index on the states table, and join with that.
CREATE TABLE IF NOT EXISTS `states` (
`id` char(2) NOT NULL,
`name` varchar(45) NOT NULL,
PRIMARY KEY (`id`),
FULLTEXT (name)
);
CREATE TABLE IF NOT EXISTS `persons` (
`id` int auto_increment NOT NULL,
`first_name` varchar(45) NOT NULL,
`last_name` varchar(45) NOT NULL,
`state_id` char(2) not null,
PRIMARY KEY (`id`),
FULLTEXT (first_name, last_name)
);
SELECT p.*
FROM persons p
JOIN states s ON s.id = p.state_id
WHERE MATCH(s.name) AGAINST ('Jersey')
UNION
SELECT *
FROM persons
WHERE MATCH(first_name, last_name) AGAINST ('Jersey');
In MySQL, no type of index spans multiple tables. Not fulltext indexes, not spatial indexes, not btree indexes, not hash indexes.
Every type of index you can define belongs to exactly one table, and can index only the values in that table.
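Because an index cannot span tables, one alternative to the join is to copy the state name into persons and index it there. This is not the approach given in the answer above, only a hedged sketch of the "materialized" idea raised in the question; the column name `state_name` and the index name `ft_person_state` are hypothetical.

```sql
-- Hypothetical denormalized copy of states.name.
ALTER TABLE persons ADD COLUMN state_name VARCHAR(45) NOT NULL DEFAULT '';

-- Fill the copy from the states table.
UPDATE persons p
JOIN states s ON s.id = p.state_id
SET p.state_name = s.name;

-- MATCH() must list exactly the columns of one FULLTEXT index.
ALTER TABLE persons ADD FULLTEXT ft_person_state (first_name, last_name, state_name);

SELECT *
FROM persons
WHERE MATCH(first_name, last_name, state_name) AGAINST ('Jersey');
```

The trade-off is that the copied column has to be kept in sync whenever a person's state_id or a state's name changes, for example in application code or with triggers.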

How can I optimize the MYSQL query related to regions

Country table has around 2 records.
State table has around 50 records.
City table has around 6000 records.
Zipcode table has around 500000 records
The approximate execution time to fetch the data is around 12-15 minutes.
How can I optimize below query:
SELECT Country.id AS Country_id,
Country.country AS Country_country,
Country.countrycode AS Country_countrycode,
State.id AS State_id,
State.statecode AS State_statecode,
City.id AS City_id,
City.city AS City_city,
Zipcode.id AS Zipcode_id,
Zipcode.zipcode AS Zipcode_zipcode
FROM c_country Country
LEFT JOIN c_state State
ON Country.id = ( State.country_id )
LEFT JOIN c_city City
ON State.id = ( City.state_id )
LEFT JOIN c_zipcode Zipcode
ON City.id = ( Zipcode.city_id )
Below are the table structure:
CREATE TABLE `c_city` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`state_id` int(11) NOT NULL,
`city` varchar(256) NOT NULL,
`zone_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=7027 DEFAULT CHARSET=latin1;
CREATE TABLE `c_state` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`country_id` int(11) NOT NULL,
`state` varchar(256) NOT NULL,
`statecode` varchar(256) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=14 DEFAULT CHARSET=latin1;
CREATE TABLE `c_country` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`country` varchar(256) NOT NULL,
`countrycode` varchar(256) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1;
CREATE TABLE `c_zipcode` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`city_id` int(11) NOT NULL,
`zipcode` varchar(256) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=539420 DEFAULT CHARSET=latin1;
Below is the screenshot of EXPLAIN SELECT:
To try to improve performance, you could add an index on:
table c_zipcode, column city_id
table c_city, column state_id
table c_state, column country_id
Or, so that all the selected column values can also be resolved from the index alone:
table c_zipcode, columns city_id, zipcode
table c_city, columns state_id, city
table c_state, columns country_id, statecode
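The second list above could be written as the following statements. This is a sketch; the index names are hypothetical, and the single-column variant simply drops the second column from each index.

```sql
-- Composite indexes: join column first, then the selected column,
-- so the join can be answered from the index without reading the row.
ALTER TABLE c_zipcode ADD INDEX idx_zipcode_city (city_id, zipcode);
ALTER TABLE c_city    ADD INDEX idx_city_state (state_id, city);
ALTER TABLE c_state   ADD INDEX idx_state_country (country_id, statecode);

-- Re-check the plan afterwards; the "key" column should name the new indexes.
EXPLAIN
SELECT Country.id, State.id, City.id, Zipcode.id
FROM c_country Country
LEFT JOIN c_state State ON Country.id = State.country_id
LEFT JOIN c_city City ON State.id = City.state_id
LEFT JOIN c_zipcode Zipcode ON City.id = Zipcode.city_id;
```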
I think your query is very slow because some indexes are missing. You didn't declare any foreign keys. In MySQL InnoDB, foreign keys are important because they guarantee data integrity, and MySQL also automatically creates an index on each foreign key column.
A link that explains this: https://dba.stackexchange.com/questions/105469/performance-impact-of-adding-a-foreign-key-to-a-1m-rows-table
You have to alter the tables to add the foreign keys:
ALTER TABLE c_city ADD CONSTRAINT fk_state_id FOREIGN KEY (state_id) REFERENCES c_state(id);
ALTER TABLE c_state ADD CONSTRAINT fk_country_id FOREIGN KEY (country_id) REFERENCES c_country(id);
ALTER TABLE c_zipcode ADD CONSTRAINT fk_city_id FOREIGN KEY (city_id) REFERENCES c_city (id);

partitioning by data mysql

I have a table 'items' with 18 million records:
CREATE TABLE IF NOT EXISTS `items` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`log_id` int(11) NOT NULL,
`res_id` int(11) NOT NULL,
`link` varchar(255) NOT NULL,
`title` text NOT NULL,
`content` text NOT NULL,
`n_date` varchar(255) NOT NULL,
`nd_date` int(11) NOT NULL,
`s_date` int(11) NOT NULL,
`not_date` date NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `link_2` (`link`),
KEY `log_id` (`log_id`),
KEY `res_id` (`res_id`),
KEY `now_date` (`not_date`),
KEY `sql_index` (`res_id`,`id`,`not_date`)
) ENGINE=Aria DEFAULT CHARSET=utf8 PAGE_CHECKSUM=0 AUTO_INCREMENT=18382133 ;
To try partitioning this table, I created a small copy of it and included the column 'not_date' in the primary and unique keys:
CREATE TABLE IF NOT EXISTS `part_items` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`log_id` int(11) NOT NULL,
`res_id` int(11) NOT NULL,
`link` varchar(255) NOT NULL,
`title` text NOT NULL,
`content` text NOT NULL,
`n_date` varchar(255) NOT NULL,
`nd_date` int(11) NOT NULL,
`s_date` int(11) NOT NULL,
`not_date` date NOT NULL,
PRIMARY KEY (`not_date`,`id`),
UNIQUE KEY `link_2` (`not_date`,`link`),
KEY `log_id` (`log_id`),
KEY `res_id` (`res_id`),
KEY `now_date` (`not_date`),
KEY `sql_index` (`res_id`,`id`,`not_date`)
) ENGINE=Aria DEFAULT CHARSET=utf8 PAGE_CHECKSUM=0
/*!50100 PARTITION BY RANGE ( TO_DAYS(not_date))
(PARTITION p_1 VALUES LESS THAN (735963) ENGINE = Aria,
PARTITION p_2 VALUES LESS THAN (736694) ENGINE = Aria) */ AUTO_INCREMENT=18414661 ;
Then I ran this SQL query:
alter table `part_items` PARTITION BY RANGE( TO_DAYS(not_date) ) (
PARTITION p_1 VALUES LESS THAN( TO_DAYS('2014-12-31') ),
PARTITION p_2 VALUES LESS THAN( TO_DAYS('2016-12-31') )
);
Then I tried to select records that must be in p_1, and EXPLAIN PARTITIONS showed that only p_1 was searched. But when I select records that must be in p_2, EXPLAIN PARTITIONS shows a scan of both partitions (p_1,p_2).
What is wrong in my code?
Queries:
explain partitions SELECT * FROM `part_items` where content like '%k%' and not_date < '2014-05-12'
explain partitions SELECT * FROM `part_items` where content like '%k%' and not_date > '2015-01-01'
And one more question: is it possible to partition views?
When PARTITIONing by some date function, there is a chance of an invalid date being provided. That would lead to NULL; such values are stored in the first partition.
This is an issue that has tripped up many a developer. The typical 'workaround' is to have the first partition empty so that the effort of looking in it is (usually) minimal. In your case:
PARTITION p_0 VALUES LESS THAN(0)
Partitioning with fewer than about 6 partitions is usually not worth the effort; will you be adding more partitions?
(Caveat: My advice comes from years of MyISAM/InnoDB partitioning; I don't know if Aria works differently. I suspect that partitioning is mostly handled at the engine-independent layer.)
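Applied to the table from the question, the workaround might look like the sketch below. It reuses the asker's partition names and boundaries and only adds the empty first partition; it has not been checked against Aria specifically.

```sql
ALTER TABLE `part_items` PARTITION BY RANGE ( TO_DAYS(not_date) ) (
  -- Intentionally empty: the optimizer may always include the first partition
  -- (where NULL results land), so keep it cheap to scan.
  PARTITION p_0 VALUES LESS THAN (0),
  PARTITION p_1 VALUES LESS THAN ( TO_DAYS('2014-12-31') ),
  PARTITION p_2 VALUES LESS THAN ( TO_DAYS('2016-12-31') )
);

-- The plan for the p_2 query should now list p_0 and p_2 rather than p_1 and p_2.
EXPLAIN PARTITIONS
SELECT * FROM `part_items`
WHERE content LIKE '%k%' AND not_date > '2015-01-01';
```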

Which columns should I add to a PRIMARY KEY

I want to create a table and avoid duplicated entries, by creating a PRIMARY KEY. The problem is I don't know which columns I should add to this KEY. Consider the next table:
CREATE TABLE `customers` (
`id_c` int(11) unsigned NOT NULL,
`lang` tinyint(2) unsigned NOT NULL,
`name` varchar(80) collate utf8_unicode_ci NOT NULL,
`franchise` int(11) unsigned NOT NULL,
KEY `id_c` (`id_c`),
KEY `lang` (`lang`),
KEY `franchise` (`franchise`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
id_c: Id of customer. It can be an enterprise. Suppose McDonald's
lang: Contact language.
name: Boss' name
franchise: If not zero, it is a franchise. McDonald's in Rome, Paris, London...
As you can see, each ENTERPRISE can have different central "shops" in each country (contact language), but also different franchises in each city (where boss' name would be different).
I want to be able to INSERT new rows where (id_c, lang) need not be distinct (many franchises in the same country). But name has to be distinct only when (id_c, lang) is the same (for another (id_c, lang) combination, name could be the same). And franchise can also repeat, as long as it has not already been assigned within the same (id_c, lang) pair.
I was thinking about a PRIMARY KEY (lang,name), but it might not be the best way. Is this table structure just too complex?
You need to create a multi-column UNIQUE constraint,
CONSTRAINT tb_uq UNIQUE (id_c,lang, name)
or set them as the primary key,
CREATE TABLE `customers`
(
`id_c` int(11) unsigned NOT NULL,
`lang` tinyint(2) unsigned NOT NULL,
`name` varchar(80) collate utf8_unicode_ci NOT NULL,
`franchise` int(11) unsigned NOT NULL,
KEY `id_c` (`id_c`),
KEY `lang` (`lang`),
KEY `franchise` (`franchise`),
CONSTRAINT tb_PK PRIMARY KEY (id_c,lang, name) -- <<== compound PK
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
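To illustrate what the compound key allows and rejects, here is a short sketch against the table above. The row values are made up for the example.

```sql
-- First row for enterprise 1, language 1.
INSERT INTO customers (id_c, lang, name, franchise) VALUES (1, 1, 'Anna', 0);

-- Same (id_c, lang) with a different name: accepted.
INSERT INTO customers (id_c, lang, name, franchise) VALUES (1, 1, 'Bruno', 5);

-- Same name under a different id_c: accepted.
INSERT INTO customers (id_c, lang, name, franchise) VALUES (2, 1, 'Anna', 0);

-- Same (id_c, lang, name) as the first row: rejected with a duplicate-key error.
INSERT INTO customers (id_c, lang, name, franchise) VALUES (1, 1, 'Anna', 7);
```

The key as written does not constrain franchise; if that rule is also needed, a second UNIQUE constraint on (id_c, lang, franchise) would be one way to express it, assuming the zero value used for non-franchises may appear only once per pair.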
If I understand your question correctly, you are asking which columns to choose, not how to do it. Correct?
So I'd guess that the franchise number is not a boolean (yes/no) flag but holds a number unique to each shop? Or each country? If that's the case, then go with id_c and franchise.
If not, you can make all 4 of them the key, but I think that's not good practice. In that case I'd say you should add one more column (trueID, for example, an auto-increment integer) and use that one as your primary key.
Just make Id the primary key, because using id_c you can get the other column values. The best advice is to make your primary key the first column.

SQL Join Table - Does it require a primary key at all or just unique keys?

I have created an application for the CakePHP framework which uses a join table.
I am unsure whether I need a primary key to uniquely identify each row in the join table, as shown in the first block of SQL.
Do the two fields need to be set as unique keys, or can they both be set as primary keys so that I can remove id as the primary key?
I was also asked why I declared atomic primary keys using a table constraint rather than a column constraint. Does this mean I shouldn't set unique keys for a join table?
CREATE TABLE IF NOT EXISTS `categories_invoices` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`category_id` int(11) NOT NULL,
`invoice_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `category_id` (`category_id`,`invoice_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=163 ;
I was thinking the solution is possibly to set both keys as unique and remove the primary key as shown here:
CREATE TABLE IF NOT EXISTS `categories_invoices` (
`category_id` int(11) NOT NULL,
`invoice_id` int(11) NOT NULL,
UNIQUE KEY `category_id` (`category_id`,`invoice_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
I did in fact test deleting the primary key 'id' for the join table leaving only 'category_id' and 'invoice_id' and the application still worked. This has left both fields as unique fields within the join table. Is this in fact the correct practice?
You don't need both. The compound unique key can replace the primary key (unless the Cake framework cannot deal with compound primary keys):
CREATE TABLE IF NOT EXISTS categories_invoices (
category_id int(11) NOT NULL,
invoice_id int(11) NOT NULL,
PRIMARY KEY (category_id, invoice_id)
)
ENGINE = MyISAM
DEFAULT CHARSET = latin1 ;
It's also good to have another index, with the reverse order, besides the index created for the Primary Key:
INDEX (invoice_id, category_id)
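The reason for the second index is that a join table is usually queried in both directions, and each index only helps lookups that start from its leftmost column. A small sketch, with made-up id values:

```sql
-- All invoices in one category: served by PRIMARY KEY (category_id, invoice_id).
SELECT invoice_id
FROM categories_invoices
WHERE category_id = 7;

-- All categories of one invoice: served by INDEX (invoice_id, category_id).
-- Without that index, this lookup cannot use the primary key efficiently.
SELECT category_id
FROM categories_invoices
WHERE invoice_id = 42;
```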
If you want to define Foreign Key constraints, you should use the InnoDB engine. With MyISAM you can't have Foreign Keys. So, it would be:
CREATE TABLE IF NOT EXISTS categories_invoices (
category_id int(11) NOT NULL,
invoice_id int(11) NOT NULL,
PRIMARY KEY (category_id, invoice_id),
INDEX invoice_category_index (invoice_id, category_id)
)
ENGINE = InnoDB
DEFAULT CHARSET=latin1 ;
If Cake cannot cope with composite Primary Keys:
CREATE TABLE IF NOT EXISTS categories_invoices (
id int(11) NOT NULL AUTO_INCREMENT,
category_id int(11) NOT NULL,
invoice_id int(11) NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY category_invoice_unique (category_id, invoice_id),
INDEX invoice_category_index (invoice_id, category_id)
)
ENGINE = InnoDB
DEFAULT CHARSET=latin1 ;
There is nothing wrong with the second method. It is referred to as a composite key and is very common in database design, especially in your circumstance.
http://en.wikipedia.org/wiki/Relational_database#Primary_keys