SELECT ... INTO OUTFILE performance - mysql

I am trying to optimize the export process of a query.
I have the following tables (I omit some irrelevant fields):
CREATE TABLE _termsofuse (
ID int(11) NOT NULL AUTO_INCREMENT,
TTC_ART_ID int(11) DEFAULT NULL,
TTC_TYP_ID int(11) DEFAULT NULL,
TERM_OF_USE_NAME varchar(200) DEFAULT NULL,
TERM_OF_USE_VALUE varchar(200) DEFAULT NULL,
PRIMARY KEY (ID)
) ENGINE=InnoDB AUTO_INCREMENT=185905671 DEFAULT CHARSET=utf8;
CREATE TABLE vehicle (
ID mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
TTC_TYP_ID int(11) unsigned NOT NULL,
PRIMARY KEY (ID),
UNIQUE KEY TTC_TYP_ID_UNIQUE (TTC_TYP_ID)
) ENGINE=InnoDB AUTO_INCREMENT=44793 DEFAULT CHARSET=utf8;
CREATE TABLE part (
ID int(11) unsigned NOT NULL AUTO_INCREMENT,
TTC_ART_ID int(11) unsigned NOT NULL,
PRIMARY KEY (ID),
UNIQUE KEY TTC_ART_ID_UNIQUE (TTC_ART_ID)
) ENGINE=InnoDB AUTO_INCREMENT=3732260 DEFAULT CHARSET=utf8;
CREATE TABLE term_of_use_name (
ID smallint(5) unsigned NOT NULL AUTO_INCREMENT,
ID_Lang tinyint(3) unsigned NOT NULL,
Name varchar(200) NOT NULL,
PRIMARY KEY (ID, ID_Lang),
UNIQUE KEY Name_Lang_UNIQUE (Name, ID_Lang),
KEY fk_term_of_use_name_lang_id_lang_idx (ID_Lang),
CONSTRAINT fk_term_of_use_name_lang_id_lang FOREIGN KEY (ID_Lang)
REFERENCES lang (ID) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=732 DEFAULT CHARSET=utf8;
CREATE TABLE term_of_use_value (
ID mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
ID_Lang tinyint(3) unsigned NOT NULL,
Value varchar(200) NOT NULL,
PRIMARY KEY (ID,ID_Lang),
UNIQUE KEY Value_Lang_UNIQUE (Value,ID_Lang),
KEY fk_term_of_use_value_lang_id_lang_idx (ID_Lang),
CONSTRAINT fk_term_of_use_value_lang_id_lang FOREIGN KEY (ID_Lang)
REFERENCES lang (ID) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=887502 DEFAULT CHARSET=utf8;
Now I am trying to export some columns to a CSV file. Afterwards I will import the file into a database table, but I suspect that step will not take too much time.
My SELECT statement is the following:
SELECT DISTINCT vehicle.ID, part.ID, term_of_use_name.ID, term_of_use_value.ID FROM _termsofuse
INNER JOIN vehicle ON vehicle.TTC_TYP_ID = _termsofuse.TTC_TYP_ID
INNER JOIN part ON part.TTC_ART_ID = _termsofuse.TTC_ART_ID
INNER JOIN term_of_use_name ON term_of_use_name.Name = _termsofuse.TERM_OF_USE_NAME AND term_of_use_name.ID_Lang = 2
INNER JOIN term_of_use_value ON term_of_use_value.Value = _termsofuse.TERM_OF_USE_VALUE AND term_of_use_value.ID_Lang = 2
INTO OUTFILE 'termsofuse.csv'
CHARACTER SET utf8
FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n';
This query takes longer than 8 hours on my laptop (I have 4 GB of RAM).
I looked at the EXPLAIN of the SELECT part, and it shows the following:
I do not understand where exactly the bottleneck is. I have exported a similar query (about 95 million records) in less than an hour. Also, breaking the results into multiple tables using LIMIT does not seem to help much...
Please have a look, and just tell me if you need any additional info.
Thank you in advance.
EDIT 15/01/2016
Results of Explain Select

Why have an ID when you have a perfectly good UNIQUE INT that could be the PK?
Seriously -- Having to reach through a secondary key slows things down. If each lookup slows it down by a factor of 2, that could add up.
How much RAM do you have? What is the value of innodb_buffer_pool_size? It should be about 70% of available RAM.
Let's see the EXPLAIN SELECT ...; there may be more clues there.
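To make those two suggestions concrete, here is a rough sketch (the buffer pool value is only an example for a 4 GB laptop, and the ALTER is untested; note that the export query above still selects the surrogate vehicle.ID, so promoting the natural key would ripple into that query):
-- Check the current buffer pool size (in bytes)
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
-- Example my.cnf setting; pick a value that leaves room for the OS and other programs:
-- [mysqld]
-- innodb_buffer_pool_size = 2G
-- Sketch of promoting the natural key: dropping the surrogate column also drops its PK,
-- so the UNIQUE column can become the clustered primary key in the same statement.
ALTER TABLE vehicle
    DROP COLUMN ID,
    DROP KEY TTC_TYP_ID_UNIQUE,
    ADD PRIMARY KEY (TTC_TYP_ID);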

Related

MariaDB INNER JOIN with foreign keys is MUCH slower than without them

Please help me, I'm stuck on some strange behaviour of the MariaDB server.
I have 3 tables.
CREATE TABLE `default_work` (
`add_date` datetime(6) NOT NULL,
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) NOT NULL,
`keywords` varchar(255) DEFAULT NULL,
`short_text` longtext DEFAULT NULL,
`downloads` int(10) unsigned NOT NULL,
`published` tinyint(1) NOT NULL,
`subject_id` int(11) NOT NULL,
`work_type_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `default_work_subject_id_IDX` (`subject_id`) USING BTREE,
KEY `default_work_work_type_id_IDX` (`work_type_id`) USING BTREE,
CONSTRAINT `default_work_FK` FOREIGN KEY (`subject_id`) REFERENCES `default_subject` (`id`),
CONSTRAINT `default_work_FK_1` FOREIGN KEY (`work_type_id`) REFERENCES `default_worktype` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=210673 DEFAULT CHARSET=utf8
CREATE TABLE `default_subject` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`subject` varchar(255) NOT NULL,
`old_id` int(10) unsigned NOT NULL,
`subject_literal` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=43 DEFAULT CHARSET=utf8
CREATE TABLE `default_worktype` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`work_type` varchar(250) NOT NULL,
`description` longtext DEFAULT NULL,
`old_id` int(10) unsigned NOT NULL,
`work_type_literal` varchar(250) NOT NULL,
`title` varchar(255) NOT NULL,
`multiple` varchar(255) NOT NULL,
`keywords` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `default_worktype_old_id_a8b508fe_uniq` (`old_id`),
UNIQUE KEY `default_worktype_work_type_literal_1e609434_uniq` (`work_type_literal`)
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=utf8
These tables were created by the Django ORM, but they seem to be fine.
The default_work table has about 200,000 records, default_subject - 42, and default_worktype - 12.
After making a request in the Django admin with simple joins between those tables, I got about 9 seconds of query time.
Looking in the SQL log, I found the raw query:
SELECT `default_work`.`id`, `default_work`.`title`, `default_worktype`.`work_type`,`default_subject`.`subject`
FROM `default_work`
INNER JOIN `default_subject` ON (`default_work`.`subject_id` = `default_subject`.`id`)
INNER JOIN `default_worktype` ON (`default_work`.`work_type_id` = `default_worktype`.`id`)
ORDER BY `default_work`.`id` DESC LIMIT 100
The EXPLAIN shows:
Explain result of the query with indexes
This is a bit confusing, because when I deleted all indexes on the default_work table except the primary key, the results were completely different. The request time was about 3.4 msec, and EXPLAIN shows that all the primary keys are used correctly.
Explain result of the query without indexes
PS: I tried to reproduce this situation on PostgreSQL and got about 1.3 msec for the request with indexes and foreign keys in place.
Looking at your EXPLAIN results, you can see that when the foreign keys are present the optimizer uses the foreign-key index in the join instead of the primary key of the target table (row 2).
As there will be many records with the same value, this massively increases the number of records being evaluated.
I don't know why it chooses to do that. You may find that rewriting the SELECT statement in a different order changes which indexes it picks. The choice may also be different if, in the ON clause, you specify the target table first and then the source table (default_subject.id = default_work.subject_id).
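As a sketch, the reordered ON clauses would look like this; the commented FORCE INDEX variant at the end is my own extra suggestion, not something from the answer above:
SELECT `default_work`.`id`, `default_work`.`title`,
       `default_worktype`.`work_type`, `default_subject`.`subject`
FROM `default_work`
INNER JOIN `default_subject`
        ON (`default_subject`.`id` = `default_work`.`subject_id`)
INNER JOIN `default_worktype`
        ON (`default_worktype`.`id` = `default_work`.`work_type_id`)
ORDER BY `default_work`.`id` DESC
LIMIT 100;
-- If reordering alone does not change the plan, index hints can pin the lookups to the primary keys:
-- INNER JOIN `default_subject` FORCE INDEX (PRIMARY)
--         ON (`default_subject`.`id` = `default_work`.`subject_id`)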

Table "Products" with predefined products, user can customize the price. How to avoid data redundancy?

I've been thinking on this problem for fews days and I still can't find a way to do what I want.
Below is how my database is currently designed (it's where I'm stuck) :
This is what I want :
A User can create multiple PriceSheets. A User can give a PriceSheet any name he wants. There are two PriceSheet types: "Lab Fulfillment" or "Self Fulfillment".
If the User chooses "Lab Fulfillment", he can import all or part of the Products of one of the predefined Labs. (To rephrase: there are a few Labs that come with a predefined list of Products.) The User will only be able to customize the price. He can't add custom products to this PriceSheet.
If the User chooses "Self Fulfillment", he can add his own products, and can personalize each field (name, cost, price, dimension_h, dimension_l).
I don't know how to link the tables together. If I put the predefined Products in the Products table and set a Many-to-Many relationship between PriceSheets and Product, the default price of a predefined Product will be overwritten when a User customizes it, which is not what I want.
Also, I want the default values of my predefined Products to appear only once in my database. If 100 users use the predefined Products, I don't want the default cost to be in my database 100 times.
Don't hesitate to ask for precisions; I had trouble making this question clear and I think it's still not totally clear.
Thanks in advance for your help
OK, database normalization 101. There are lots of ways to do this, and it would take me a day to really optimize it all, but this should help:
User
Lab
Product
id name cost dimension .....
1 a
2 b
3 c
4 d
So those three tables are fine. All your products will go in the Product table. No foreign keys in any of those tables.
PriceSheet
user_id custom_price product_id type
1 1.99 1 lab-fulfillment
0 NULL 2 self-fulfillment
1 5.99 3 lab-fulfillment
So a user can have as many price sheets as they want, and they can only adjust the price of a product. This can actually be normalized further if you so wish:
PriceSheet (composite key on id, user_id, FK user_id)
id user_id
0 0
1 1
2 1
LabPriceSheet (you could add an id, might be better, or you could use a composite key, stricter)
PriceSheet_id custom_price lab_product_id
0 1.99 0
2 5.99 1
CustomPriceSheet
PriceSheet_id custom_product_id
1 0
With foreign keys as appropriate. This now lets MySQL enforce the custom_price relationship, rather than doing it in PHP (although you would still have to deal with ensuring correct INSERTs!).
Now, to deal with who adds the products:
CustomProduct
id user_id product_id timestamp
0 3 2 ...
LabProduct
id lab_id product_id timestamp
0 0 1 ...
1 0 3 ...
So let's double check:
This is what I want :
a User can create multiple PriceSheets. check
A User can give a PriceSheet any name he wants. check
There are two PriceSheet types: "Lab Fulfillment" or "Self Fulfillment". check
if the User chooses "Lab Fulfillment", he can import all or part of the Products of one of the predefined Labs. (I rephrase : there are few Labs that come with a predefined list of Products). The User will only be able to customize the price. He can't add custom products to this PriceSheet.
Yup, because he would create a LabPriceSheet that can only reference lab_product_id entries. The custom price is there too, and it overrides the default price in the Product table.
if the User chooses "Self Fulfillment", he can add his own products, and can personalize each field (name, cost, price, dimension_h, dimension_l).
Yup, he would add a product (you would need to check whether a similar one already exists and, if so, return the id of the existing row in the Product table), and that would also create an entry in CustomProduct.
I don't know how to link the tables between them. If I put the predefined Products in the Products table and set a Many-to-Many relationship between PriceSheets and Product, the default price of a predefined Product will be overwritten when a User customizes it, which is not what I want.
Yeah that won't happen :) Never (very very rarely) implement many-many rels.
Also, I want the default values of my predefined Products to appear only once in my database. If 100 users use the predefined Products, I don't want the default cost to be in my database 100 times.
Of course.
Let me know if you want the MySQL code; I assume you're good! Remember to use InnoDB and to configure MySQL properly!
EDIT
I felt like helping you out with a copy and paste thing. I like copy and paste things. Also, there's a redundant user_id column in the blurb above which I fixed in an earlier edit.
SET GLOBAL innodb_file_per_table = 1;
SET GLOBAL general_log = 'OFF';
SET FOREIGN_KEY_CHECKS=1;
SET GLOBAL character_set_server = utf8mb4;
SET NAMES utf8mb4;
CREATE DATABASE SO; USE SO;
ALTER DATABASE SO CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
CREATE TABLE `User` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`email` VARCHAR(555) NOT NULL,
`password` VARBINARY(200) NOT NULL,
`username` VARCHAR(100) NOT NULL,
`role` INT(2) NOT NULL,
`active` TINYINT(1) NOT NULL,
`created` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
`modified` DATETIME ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `Lab` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(1000) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `Product` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(1000) NOT NULL,
`cost` DECIMAL(10, 2) NOT NULL,
`price` DECIMAL(10, 2) NOT NULL,
`height` DECIMAL(15, 5) NOT NULL,
`length` DECIMAL(15, 5) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `CustomProduct` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`user` BIGINT(20) UNSIGNED NOT NULL,
`product` BIGINT(20) UNSIGNED NOT NULL,
`created` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
FOREIGN KEY (`user`) REFERENCES `User`(`id`),
FOREIGN KEY (`product`) REFERENCES `Product`(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `LabProduct` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`lab` BIGINT(20) UNSIGNED NOT NULL,
`product` BIGINT(20) UNSIGNED NOT NULL,
`created` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
FOREIGN KEY (`lab`) REFERENCES `Lab`(`id`),
FOREIGN KEY (`product`) REFERENCES `Product`(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `PriceSheet` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(1000) NOT NULL,
`user` BIGINT(20) UNSIGNED NOT NULL,
PRIMARY KEY (`id`,`user`),
FOREIGN KEY (`user`) REFERENCES `User`(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `LabPriceSheet` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`price_sheet` BIGINT(20) UNSIGNED NOT NULL,
`lab_product` BIGINT(20) UNSIGNED NOT NULL,
`custom_price` DECIMAL(10, 2) NOT NULL,
PRIMARY KEY (`id`),
FOREIGN KEY (`price_sheet`) REFERENCES `PriceSheet`(`id`),
FOREIGN KEY (`lab_product`) REFERENCES `LabProduct`(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `CustomPriceSheet` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`price_sheet` BIGINT(20) UNSIGNED NOT NULL,
`custom_product` BIGINT(20) UNSIGNED NOT NULL,
PRIMARY KEY (`id`),
FOREIGN KEY (`price_sheet`) REFERENCES `PriceSheet`(`id`),
FOREIGN KEY (`custom_product`) REFERENCES `CustomProduct`(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
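As a usage sketch of my own (not part of the schema above), reading back one user's lab price sheet with the user's override next to the catalogue default could look like this; the user id is hypothetical:
-- All priced items on a user's lab price sheets, default price alongside the override
SELECT ps.id            AS price_sheet_id,
       ps.name          AS price_sheet_name,
       p.name           AS product_name,
       p.price          AS default_price,
       lps.custom_price
FROM PriceSheet    ps
JOIN LabPriceSheet lps ON lps.price_sheet = ps.id
JOIN LabProduct    lp  ON lp.id = lps.lab_product
JOIN Product       p   ON p.id = lp.product
WHERE ps.user = 1;   -- hypothetical user id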

mysql InnoDB: FOREIGN KEY constraint performance

I have the following InnoDB tables:
CREATE TABLE `vehicle` (
`ID` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`Name` varchar(50) DEFAULT NULL,
`Model` varchar(100) DEFAULT NULL,
`Engine_Type` varchar(70) DEFAULT NULL,
`Construction_From` date DEFAULT NULL,
`Construction_To` date DEFAULT NULL,
`Engine_Power_KW` mediumint(8) unsigned DEFAULT NULL,
`Engine_Power_HP` mediumint(8) unsigned DEFAULT NULL,
`CC` mediumint(8) unsigned DEFAULT NULL,
`TTC_TYP_ID` int(11) unsigned DEFAULT NULL,
`Vehicle_Type` tinyint(1) DEFAULT NULL,
`ID_Body_Type` tinyint(3) unsigned DEFAULT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=49407 DEFAULT CHARSET=utf8;
CREATE TABLE `part` (
`ID` int(11) unsigned NOT NULL AUTO_INCREMENT,
`ID_Brand` smallint(5) unsigned DEFAULT NULL,
`Code_Full` varchar(50) DEFAULT NULL,
`Code_Condensed` varchar(50) DEFAULT NULL,
`Ean` varchar(50) DEFAULT NULL COMMENT 'The part barcode.',
`TTC_ART_ID` int(11) unsigned DEFAULT NULL COMMENT 'TecDoc ID.',
`ID_Product_Status` tinyint(3) unsigned DEFAULT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `TTC_ART_ID_UNIQUE` (`TTC_ART_ID`),
UNIQUE KEY `ID_Brand_Code_Full_UNIQUE` (`ID_Brand`,`Code_Full`)
) ENGINE=InnoDB AUTO_INCREMENT=3732260 DEFAULT CHARSET=utf8;
CREATE TABLE `vehicle_part` (
`ID_Vehicle` mediumint(8) unsigned NOT NULL,
`ID_Part` int(11) unsigned NOT NULL,
PRIMARY KEY (`ID_Vehicle`,`ID_Part`),
KEY `fk_vehicle_part_vehicle_id_vehicle_idx` (`ID_Vehicle`),
KEY `fk_vehicle_part_part_id_part_idx` (`ID_Part`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Table vehicle has about 45,000 records, table part has about 3,500,000 records, and table vehicle_part has approximately 100,000,000 records.
Creating the secondary indexes for vehicle_part did not take too long, about 30 min for both.
What I cannot do, though, is create the foreign key constraints; for example,
ALTER TABLE `vehicle_part`
ADD CONSTRAINT `fk_vehicle_part_vehicle_id_vehicle`
FOREIGN KEY (`ID_Vehicle`)
REFERENCES `vehicle` (`ID`)
ON DELETE NO ACTION
ON UPDATE NO ACTION;
takes ages to complete. I understand the table is rebuilt since it consumes a lot of disk space. What can I do to improve the performance?
If I create the table with the FK constraints first and then add the records, the insert process into vehicle_part also takes ages (about 3 days).
I am using a laptop with 4GB RAM.
EDIT 12/01/2016
The answer given by Drew helped a lot and improved the performance dramatically. I changed every script to use SELECT ... INTO OUTFILE and then LOAD DATA INFILE from the exported CSV file. Also, sometimes dropping the indexes before LOAD DATA INFILE and recreating them after the load process saves even more time. There is no need to drop the FK constraints, just the secondary indexes.
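For reference, the load side of that workflow might look roughly like the following for vehicle_part; the file path is illustrative, and disabling FK checks is what allows the FK-backed secondary index to be dropped (as shown in the answer below):
SET FOREIGN_KEY_CHECKS = 0;   -- allows dropping the index the FK relies on; also skips per-row checks
ALTER TABLE vehicle_part
    DROP KEY fk_vehicle_part_vehicle_id_vehicle_idx,
    DROP KEY fk_vehicle_part_part_id_part_idx;
LOAD DATA INFILE '/tmp/vehicle_part.csv'   -- illustrative path
INTO TABLE vehicle_part
CHARACTER SET utf8
FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
(ID_Vehicle, ID_Part);
ALTER TABLE vehicle_part
    ADD KEY fk_vehicle_part_vehicle_id_vehicle_idx (ID_Vehicle),
    ADD KEY fk_vehicle_part_part_id_part_idx (ID_Part);
SET FOREIGN_KEY_CHECKS = 1;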
If you know your data is pristine from an FK perspective, then establish your structure without secondary indexes, as suggested in the comments, but with the FKs in the schema and FK checks temporarily disabled.
Load your data. If it is external data, certainly do it with LOAD DATA INFILE.
After your data is loaded, turn FK checks back on, and establish the secondary indexes with ALTER TABLE.
Again, this goes with the assumption that your data is clean. There are other ways of proving that after the fact for the risk-averse.
create table student
( id int auto_increment primary key,
sName varchar(100) not null
-- secondary indexes to be added later
);
create table booksAssigned
( id int auto_increment primary key,
studentId int not null,
isbn varchar(20) not null,
constraint foreign key `fk_b_s` (studentId) references student(id)
-- secondary indexes to be added later
);
insert booksAssigned(studentId,isbn) values (1,'asdf'); -- Error 1452 as expected
set FOREIGN_KEY_CHECKS=0; -- turn FK checks off temporarily
insert booksAssigned(studentId,isbn) values (1,'asdf'); -- succeeds despite the faulty data (checks are off)
set FOREIGN_KEY_CHECKS=1; -- turn FK checks back on
insert booksAssigned(studentId,isbn) values (2,'38383-asdf'); -- Error 1452 as expected
As per the OP's comments, here is how to drop the auto-generated index on the referencing table after the initial schema creation:
mysql> show create table booksAssigned;
| booksAssigned | CREATE TABLE `booksassigned` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`studentId` int(11) NOT NULL,
`isbn` varchar(20) NOT NULL,
PRIMARY KEY (`id`),
KEY `fk_b_s` (`studentId`),
CONSTRAINT `booksassigned_ibfk_1` FOREIGN KEY (`studentId`) REFERENCES `student` (`id`)
) ENGINE=InnoDB |
mysql> set FOREIGN_KEY_CHECKS=0;
Query OK, 0 rows affected (0.00 sec)
mysql> drop index `fk_b_s` on booksAssigned;
Query OK, 0 rows affected (0.49 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show create table booksAssigned;
| booksAssigned | CREATE TABLE `booksassigned` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`studentId` int(11) NOT NULL,
`isbn` varchar(20) NOT NULL,
PRIMARY KEY (`id`),
CONSTRAINT `booksassigned_ibfk_1` FOREIGN KEY (`studentId`) REFERENCES `student` (`id`)
) ENGINE=InnoDB |
Further links
Temporarily disable foreign keys
A Rolando Answer

Make LEFT JOIN query more efficient

The following query with LEFT JOIN is drawing too much memory (~4GB), but the host only allows about 120MB for this process.
SELECT grades.grade, grades.evaluation_id, evaluations.evaluation_name, evaluations.value, evaluations.maximum FROM grades LEFT JOIN evaluations ON grades.evaluation_id = evaluations.evaluation_id WHERE grades.registrar_id = ?
Create table syntax for grades:
CREATE TABLE `grades` (
`grade_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`evaluation_id` int(10) unsigned DEFAULT NULL,
`registrar_id` int(10) unsigned DEFAULT NULL,
`grade` float unsigned DEFAULT NULL,
`entry_date` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`grade_id`),
KEY `registrarGrade_key` (`registrar_id`),
KEY `evaluationKey` (`evaluation_id`),
KEY `grades_id_index` (`grade_id`),
KEY `eval_id_index` (`evaluation_id`),
KEY `grade_index` (`grade`),
CONSTRAINT `evaluationKey` FOREIGN KEY (`evaluation_id`) REFERENCES `evaluations` (`evaluation_id`),
CONSTRAINT `registrarGrade_key` FOREIGN KEY (`registrar_id`) REFERENCES `registrar` (`reg_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1627 DEFAULT CHARSET=utf8;
evaluations table:
CREATE TABLE `evaluations` (
`evaluation_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`instance_id` int(11) unsigned DEFAULT NULL,
`evaluation_col` varchar(255) DEFAULT NULL,
`evaluation_name` longtext,
`evaluation_method` enum('class','email','online','lab') DEFAULT NULL,
`evaluation_deadline` date DEFAULT NULL,
`maximum` int(11) unsigned DEFAULT NULL,
`value` float DEFAULT NULL,
PRIMARY KEY (`evaluation_id`),
KEY `instanceID_key` (`instance_id`),
KEY `eval_name_index` (`evaluation_name`(3)),
KEY `eval_method_index` (`evaluation_method`),
KEY `eval_deadline_index` (`evaluation_deadline`),
KEY `maximum` (`maximum`),
KEY `value_index` (`value`),
KEY `eval_id_index` (`evaluation_id`),
CONSTRAINT `instanceID_key` FOREIGN KEY (`instance_id`) REFERENCES `course_instance` (`instance_id`)
) ENGINE=InnoDB AUTO_INCREMENT=72 DEFAULT CHARSET=utf8;
The PHP code to pull the data:
$sql = "SELECT grades.grade, grades.evaluation_id, evaluations.evaluation_name, evaluations.value, evaluations.maximum FROM grades LEFT JOIN evaluations ON grades.evaluation_id = evaluations.evaluation_id WHERE grades.registrar_id = ? AND YEAR(entry_date) = YEAR(CURDATE())";
$result = $mysqli->prepare($sql);
if($result === FALSE)
die($mysqli->error);
$result->bind_param('i',$reg_ids[$i]);
$result->execute();
$result->bind_result($grade, $eval_id, $evalname, $evalval, $max);
while($result->fetch()){
And the fatal error message
Is there a way to drastically reduce the memory load on this query?
Thanks!
Curiously, changing the MySQL query did not change the amount of memory it attempted to allocate.
Please provide SHOW CREATE TABLE for both tables; I want to see if you have anything like these:
grades: INDEX(registration_id)
evaluations: PRIMARY KEY(evaluation_id)
Edit
You now have redundant indexes in both tables -- probably because of my suggestion. That is, you already have both of the indexes that would help with the query.
Since you have a LONGTEXT column and it is trying to allocate exactly 4GB, the max size of a LONGTEXT, I guess that is the problem. I suggest you ALTER that column to TEXT (64KB max) or MEDIUMTEXT (16MB max). I have not seen this behavior before in PHP, but then I rarely use anything bigger than TEXT.
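A sketch of both points (check the real data length first; if anything exceeds 64KB, MEDIUMTEXT is the safer target, and the redundant-index drops are optional cleanup):
-- How long do the stored names actually get?
SELECT MAX(CHAR_LENGTH(evaluation_name)) FROM evaluations;
-- Shrink the column so the client no longer sizes its buffer for a 4GB LONGTEXT
ALTER TABLE evaluations MODIFY evaluation_name TEXT;
-- Optional cleanup: drop the indexes that duplicate the PK / FK indexes
ALTER TABLE grades      DROP KEY grades_id_index, DROP KEY eval_id_index;
ALTER TABLE evaluations DROP KEY eval_id_index;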

MySQL Query Optimization for Analytics

I have about 30 tables in a MySQL data warehouse database for analytical data. For now it holds around 2 million rows, but I'm sure it will reach billions very soon. The challenge is that the queries should return the data fast. I have the following simple query, which takes over 60 seconds to process those 2 million rows:
SELECT count(distinct fact.dim_pageview_id)
FROM `datawarehouse_schema_alpha`.`fact_master` fact
left join dim_visit visit on visit.dim_visit_id = fact.dim_visit_id
left join dim_datetime datetim on datetim.dim_datetime_id = fact.dim_datetime_id
where fact.dim_site_id = 552
EXPLAIN result (columns: id, select_type, table, type, possible_keys, key, key_len, ref, rows, Extra):
1 SIMPLE fact ref fk_fact_bb_pageview_dim_bb_site,fk_fact_master_dim_site fk_fact_bb_pageview_dim_bb_site 5 const 17490 Using where
1 SIMPLE visit eq_ref PRIMARY PRIMARY 4 datawarehouse_schema_alpha.fact.dim_visit_id 1 Using index
1 SIMPLE datetim eq_ref PRIMARY PRIMARY 4 datawarehouse_schema_alpha.fact.dim_datetime_id 1 Using index
I have the following sample database structure:
--
-- Table structure for table `dim_datetime`
--
CREATE TABLE IF NOT EXISTS `dim_datetime` (
`dim_datetime_id` int(11) NOT NULL AUTO_INCREMENT,
`datetime_date` varchar(45) CHARACTER SET latin1 DEFAULT NULL,
`datetime_year` varchar(45) CHARACTER SET latin1 DEFAULT NULL,
`datetime_full` varchar(45) CHARACTER SET latin1 DEFAULT NULL,
PRIMARY KEY (`dim_datetime_id`)
) ENGINE=InnoDB DEFAULT CHARSET=big5 AUTO_INCREMENT=4568326 ;
-- --------------------------------------------------------
--
-- Table structure for table `dim_visit`
--
CREATE TABLE IF NOT EXISTS `dim_visit` (
`dim_visit_id` int(11) NOT NULL AUTO_INCREMENT,
`visit_start_time` datetime DEFAULT NULL,
`visit_end_time` datetime DEFAULT NULL,
`visit_duration` varchar(45) DEFAULT NULL,
PRIMARY KEY (`dim_visit_id`)
) ENGINE=InnoDB DEFAULT CHARSET=big5 AUTO_INCREMENT=1295102 ;
-- --------------------------------------------------------
--
-- Table structure for table `dim_site`
--
CREATE TABLE IF NOT EXISTS `dim_site` (
`dim_site_id` int(11) NOT NULL AUTO_INCREMENT,
`site_name` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
`site_url` text CHARACTER SET latin1,
`site_key` text CHARACTER SET latin1,
PRIMARY KEY (`dim_site_id`)
) ENGINE=InnoDB DEFAULT CHARSET=big5 AUTO_INCREMENT=870 ;
--
-- Table structure for table `fact_master`
--
CREATE TABLE IF NOT EXISTS `fact_master` (
`fact_master_id` int(11) NOT NULL AUTO_INCREMENT,
`dim_pageview_id` int(11) DEFAULT NULL,
`dim_visit_id` int(11) DEFAULT NULL,
`dim_site_id` int(11) DEFAULT NULL,
`dim_datetime_id` int(11) DEFAULT NULL,
`master_ip` varchar(255) DEFAULT NULL,
`master_spent_time` varchar(255) DEFAULT NULL,
`master_datetime` datetime DEFAULT NULL,
PRIMARY KEY (`fact_master_id`),
KEY `fk_fact_bb_pageview_dim_bb_visit` (`dim_visit_id`),
KEY `fk_fact_bb_pageview_dim_bb_datetime` (`dim_datetime_id`),
KEY `fk_fact_bb_pageview_dim_bb_pageview` (`dim_pageview_id`),
KEY `fk_fact_master_dim_pageview` (`dim_pageview_id`),
KEY `fk_fact_master_dim_visit` (`dim_visit_id`),
KEY `fk_fact_master_dim_datetime` (`dim_datetime_id`),
KEY `fk_fact_master_dim_site` (`dim_site_id`)
) ENGINE=InnoDB DEFAULT CHARSET=big5 AUTO_INCREMENT=1 ;
--
-- Constraints for dumped tables
--
--
-- Constraints for table `fact_master`
--
ALTER TABLE `fact_master`
ADD CONSTRAINT `fk_fact_master_dim_datetime` FOREIGN KEY (`dim_datetime_id`) REFERENCES `dim_datetime` (`dim_datetime_id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
ADD CONSTRAINT `fk_fact_master_dim_pageview` FOREIGN KEY (`dim_pageview_id`) REFERENCES `dim_pageview` (`dim_pageview_id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
ADD CONSTRAINT `fk_fact_master_dim_site` FOREIGN KEY (`dim_site_id`) REFERENCES `dim_site` (`dim_site_id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
ADD CONSTRAINT `fk_fact_master_dim_visit` FOREIGN KEY (`dim_visit_id`) REFERENCES `dim_visit` (`dim_visit_id`) ON DELETE NO ACTION ON UPDATE NO ACTION;
Please let me know what the problem could be. How can I use indexes and views to get the data back really quickly? Any suggestions other than indexes or views to improve the speed are also welcome. Thanks!