Optimize tables MySQL - mysql

I have a query that takes 35 seconds to run, which is far too long.
Here are the 3 tables involved in the query (each table has approx. 13,000 rows and will grow much larger in the future):
Table 1 : Domains
CREATE TABLE IF NOT EXISTS `domain` (
`id_domain` int(11) NOT NULL AUTO_INCREMENT,
`domain_domain` varchar(255) NOT NULL,
`projet_domain` int(11) NOT NULL,
`date_crea_domain` int(11) NOT NULL,
`date_expi_domain` int(11) NOT NULL,
`active_domain` tinyint(1) NOT NULL,
`remarques_domain` text NOT NULL,
PRIMARY KEY (`id_domain`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Table 2 : Keywords
CREATE TABLE IF NOT EXISTS `kw` (
`id_kw` int(11) NOT NULL AUTO_INCREMENT,
`kw_kw` varchar(255) NOT NULL,
`clics_kw` int(11) NOT NULL,
`cpc_kw` float(11,3) NOT NULL,
`date_kw` int(11) NOT NULL,
PRIMARY KEY (`id_kw`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Table 3 : Linking between domain and keyword
CREATE TABLE IF NOT EXISTS `kw_domain` (
`id_kd` int(11) NOT NULL AUTO_INCREMENT,
`kw_kd` int(11) NOT NULL,
`domain_kd` int(11) NOT NULL,
`selected_kd` tinyint(1) NOT NULL,
PRIMARY KEY (`id_kd`),
KEY `kw_to_domain` (`kw_kd`,`domain_kd`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The query is as follows :
SELECT ng.*, kd.*, kg.*
FROM domain ng
LEFT JOIN kw_domain kd ON kd.domain_kd = ng.id_domain
LEFT JOIN kw kg ON kg.id_kw = kd.kw_kd
GROUP BY ng.id_domain
ORDER BY kd.selected_kd DESC, kd.id_kd DESC
Basically, it selects all domains, with, for each one of these domains, the last associated keyword.
Does anyone have an idea on how to optimize the tables or the query ?

The following will get the last keyword, according to your logic:
select ng.*,
(select kw_kd
from kw_domain kd
where kd.domain_kd = ng.id_domain and kd.selected_kd = 1
order by kd.id_kd desc
limit 1
) as kw_kd
from domain ng;
For performance, you want an index on kw_domain(domain_kd, selected_kd, kw_kd). In this case, the order of the fields matters.
You can use this as a subquery to get more information about the keyword:
select ng.*, kg.*
from (select ng.*,
(select kw_kd
from kw_domain kd
where kd.domain_kd = ng.id_domain and kd.selected_kd = 1
order by kd.id_kd desc
limit 1
) as kw_kd
from domain ng
) ng left join
kw kg
on kg.id_kw = ng.kw_kd;
In MySQL, group by can have poor performance, so this might work better, particularly with the right indexes.
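To see the correlated-subquery approach end to end, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it is self-contained; the sample rows and the index name kd_domain_selected_kw are made up for illustration, and timings on SQLite say nothing about MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE domain (id_domain INTEGER PRIMARY KEY, domain_domain TEXT NOT NULL);
CREATE TABLE kw (id_kw INTEGER PRIMARY KEY, kw_kw TEXT NOT NULL);
CREATE TABLE kw_domain (
    id_kd INTEGER PRIMARY KEY,
    kw_kd INTEGER NOT NULL,
    domain_kd INTEGER NOT NULL,
    selected_kd INTEGER NOT NULL
);
-- Index suggested in the answer (hypothetical name); column order matters.
CREATE INDEX kd_domain_selected_kw ON kw_domain (domain_kd, selected_kd, kw_kd);

-- Invented sample data.
INSERT INTO domain VALUES (1, 'a.example'), (2, 'b.example'), (3, 'c.example');
INSERT INTO kw VALUES (10, 'red'), (11, 'green'), (12, 'blue');
INSERT INTO kw_domain VALUES
    (1, 10, 1, 1),
    (2, 11, 1, 1),
    (3, 12, 1, 0),
    (4, 12, 2, 0);
""")

# For each domain, pick the selected keyword link with the highest id_kd,
# then join the keyword details onto that single id.
rows = conn.execute("""
SELECT ng.id_domain, ng.domain_domain, kg.kw_kw
FROM (SELECT d.*,
             (SELECT kd.kw_kd
              FROM kw_domain kd
              WHERE kd.domain_kd = d.id_domain AND kd.selected_kd = 1
              ORDER BY kd.id_kd DESC
              LIMIT 1) AS kw_kd
      FROM domain d) ng
LEFT JOIN kw kg ON kg.id_kw = ng.kw_kd
ORDER BY ng.id_domain
""").fetchall()

for row in rows:
    print(row)
```

Domains without a selected keyword come back with NULL (None in Python) in the keyword column, just as they would with the original LEFT JOIN.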

Related

MySQL Query to get all neighborhoods which a user did not join

I want to get all the neighborhoods (based on different zips) which the user is not a member of already.
I have a users table and several other tables like this:
table name: neighborhood
CREATE TABLE neighborhood(
`neighborhood_id` INT(11) NOT NULL AUTO_INCREMENT,
`name` VARCHAR(255) NOT NULL,
`description` TEXT DEFAULT NULL,
`neighborhood_postal_code` VARCHAR(255) NOT NULL,
`region_neighborhood` VARCHAR(255) NOT NULL,
`created_at` DATETIME DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`neighborhood_id`),
INDEX `neighborhood_region_neighborhood_FI_1` (`region_neighborhood`)
) ENGINE = InnoDB;
table name: user_neighborhood
CREATE TABLE user_neighborhood(
`user_id` INT(11) NOT NULL,
`neighborhood_id` INT(11) NOT NULL,
`activity_circle` INT(1) DEFAULT 0,
`duo_circle` INT(1) DEFAULT 0,
FOREIGN KEY (`user_id`) REFERENCES `users` (`user_id`),
FOREIGN KEY (`neighborhood_id`) REFERENCES `neighborhood` (`neighborhood_id`)
) ENGINE = InnoDB;
I have tried the following query, but the result is not correct:
SELECT n.*
FROM `neighborhood` as n
left join user_neighborhood as un on n.neighborhood_id = un.neighborhood_id
where un.user_id != 1 and n.neighborhood_postal_code IN ('2000', '2100')
UPDATE: I managed to get a query that seemed correct at first glance, using a subquery like this:
select *
from neighborhood
where neighborhood_id NOT IN (select neighborhood_id from user_neighborhood where user_id != 1)
AND neighborhood_postal_code IN ('2000', '2100')
However, it also returns some of the neighborhoods I am already in. It does not make much sense to me why it returns only some of them.
Why exactly are you using user_id != 1 in your subquery? That subquery lists the neighborhoods joined by everyone except user 1, so NOT IN removes those and keeps the ones that only user 1 has joined. If you know the id of the user you want to check, say user_id is 10, then use where user_id = 10 in the subquery:
select *
from neighborhood
where neighborhood_id NOT IN (select distinct neighborhood_id from user_neighborhood where user_id = 10)
AND neighborhood_postal_code IN ('2000', '2100')
But if you want to fetch all the neighborhoods that have no users at all, you can use this query:
select *
from neighborhood
where neighborhood_id NOT IN (select distinct neighborhood_id from user_neighborhood)
AND neighborhood_postal_code IN ('2000', '2100')
Hope this helps!
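The difference between the two subquery filters is easy to check on a tiny data set. The sketch below is only an illustration: it runs on Python's built-in sqlite3 module rather than MySQL, trims the tables to the columns the query needs, and uses invented rows. The helper names() is hypothetical and only takes fixed literals, so the string formatting is not meant as a pattern for user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE neighborhood (
    neighborhood_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    neighborhood_postal_code TEXT NOT NULL
);
CREATE TABLE user_neighborhood (
    user_id INTEGER NOT NULL,
    neighborhood_id INTEGER NOT NULL
);

INSERT INTO neighborhood VALUES
    (1, 'North', '2000'), (2, 'South', '2000'),
    (3, 'East', '2100'), (4, 'West', '2100');
-- User 1 is in North and South; user 2 is in South only.
INSERT INTO user_neighborhood VALUES (1, 1), (1, 2), (2, 2);
""")

def names(subquery_filter):
    """Run the NOT IN query with the given filter inside the subquery."""
    sql = f"""
        SELECT name FROM neighborhood
        WHERE neighborhood_id NOT IN (
            SELECT neighborhood_id FROM user_neighborhood
            WHERE {subquery_filter})
          AND neighborhood_postal_code IN ('2000', '2100')
        ORDER BY neighborhood_id"""
    return [row[0] for row in conn.execute(sql)]

# Excludes what *other* users joined, so North (joined only by user 1) leaks in.
wrong = names("user_id != 1")
# Excludes exactly what user 1 joined.
right = names("user_id = 1")

print(wrong)
print(right)
```

With `!=`, a neighborhood that user 1 shares with someone else is filtered out, while one that only user 1 belongs to is returned, which matches the "only some" symptom described in the question.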

Can I improve my movie selecting SQL query

I've created a database to store movies data. My tables are the following:
movies:
CREATE TABLE IF NOT EXISTS `movies` (
`movieId` int(11) NOT NULL AUTO_INCREMENT,
`imdbId` varchar(255) DEFAULT NULL,
`imdbRating` float DEFAULT NULL,
`movieTitle` varchar(255) NOT NULL,
`movieLength` varchar(255) NOT NULL,
`imdbRatingCount` varchar(255) NOT NULL,
`poster` varchar(255) NOT NULL,
`year` varchar(255) NOT NULL,
PRIMARY KEY (`movieId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
I have a table in which I store movie actors:
CREATE TABLE IF NOT EXISTS `actors` (
`actorId` int(10) NOT NULL AUTO_INCREMENT,
`actorName` varchar(255) NOT NULL,
PRIMARY KEY (`actorId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
And another one in which I store the relation between movies and actors (movieActor):
CREATE TABLE IF NOT EXISTS `movieActor` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`movieId` int(10) NOT NULL,
`actorId` int(10) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Now, when I want to select the list of movies that feature all of the selected actors, my query is:
SELECT *
FROM movies m inner join
(SELECT movieId FROM movieActor WHERE actorId IN(1,2,3) GROUP BY movieId having count(*) = 3) ma ON m.movieId = ma.movieId
WHERE imdbRating IS NOT NULL ORDER BY imdbRating DESC
This works perfectly, but I don't know whether this is the optimal table structure and query to accomplish this. Is there a better table structure to store the data, or a better query to get the list?
First of all, use indexes on your tables. In my opinion it would be useful to have 3 indexes on movieActor: one on movieId, one on actorId, and a composite one on (movieId, actorId).
Second, try to use foreign keys. These help the database identify the best execution plan.
Third, try to avoid temporary tables in the execution plan of your query. Subselects often create temporary tables, which the database uses when it has to hold an intermediate result in RAM. To check this, write EXPLAIN in front of your query.
I would write it like this:
SELECT m.*
FROM movies m inner join
movieActor ma ON m.movieId = ma.movieId
WHERE m.imdbRating IS NOT NULL
and ma.actorId IN(1,2,3)
GROUP BY m.movieId
having count(*) = 3
ORDER BY m.imdbRating DESC
(Not tested)
Just try to optimize it with the EXPLAIN keyword. It also can help you to create the right indexes.
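Here is a small, self-contained sketch of the "movies that feature all selected actors" query in its single-join form. It runs on Python's built-in sqlite3 module instead of MySQL, with invented rows and trimmed tables. The UNIQUE (actorId, movieId) constraint is an assumption that is not in the original schema: it doubles as the composite index and guarantees that COUNT(*) cannot be inflated by duplicate links.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (
    movieId INTEGER PRIMARY KEY,
    movieTitle TEXT NOT NULL,
    imdbRating REAL
);
CREATE TABLE movieActor (
    id INTEGER PRIMARY KEY,
    movieId INTEGER NOT NULL,
    actorId INTEGER NOT NULL,
    UNIQUE (actorId, movieId)  -- assumption: one link per actor and movie
);

-- Invented sample data.
INSERT INTO movies VALUES
    (1, 'Alpha', 7.5), (2, 'Beta', 8.1), (3, 'Gamma', NULL), (4, 'Delta', 9.0);
INSERT INTO movieActor (movieId, actorId) VALUES
    (1, 1), (1, 2), (1, 3),
    (2, 1), (2, 2), (2, 3), (2, 4),
    (3, 1), (3, 2), (3, 3),
    (4, 1), (4, 2);
""")

actor_ids = [1, 2, 3]
placeholders = ", ".join("?" for _ in actor_ids)

# Keep only the links to the wanted actors, then require one link per actor.
titles = [row[0] for row in conn.execute(f"""
    SELECT m.movieTitle
    FROM movies m
    JOIN movieActor ma ON ma.movieId = m.movieId
    WHERE m.imdbRating IS NOT NULL AND ma.actorId IN ({placeholders})
    GROUP BY m.movieId
    HAVING COUNT(*) = ?
    ORDER BY m.imdbRating DESC
""", (*actor_ids, len(actor_ids)))]

print(titles)
```

Passing the number of actors as a parameter keeps the HAVING clause in step with the IN list, so the two cannot drift apart when the selection changes.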

Sorting result of mysql join by avg of third table?

I have three tables.
One table contains submissions which has about 75,000 rows
One table contains submission ratings and only has < 10 rows
One table contains submission => competition mappings and for my test data also has about 75,000 rows.
What I want to do is
Get the top 50 submissions in a round of a competition.
Top is classified as highest average rating, followed by highest amount of votes
Here is the query I am using which works, but the problem is that it takes over 45 seconds to complete! I profiled the query (results at bottom) and the bottlenecks are copying the data to a tmp table and then sorting it so how can I speed this up?
SELECT `submission_submissions`.*
FROM `submission_submissions`
JOIN `competition_submissions`
ON `competition_submissions`.`submission_id` = `submission_submissions`.`id`
LEFT JOIN `submission_ratings`
ON `submission_submissions`.`id` = `submission_ratings`.`submission_id`
WHERE `top_round` = 1
AND `competition_id` = '2'
AND `submission_submissions`.`date_deleted` IS NULL
GROUP BY submission_submissions.id
ORDER BY AVG(submission_ratings.`stars`) DESC,
COUNT(submission_ratings.`id`) DESC
LIMIT 50
submission_submissions
CREATE TABLE `submission_submissions` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`account_id` int(11) NOT NULL,
`title` varchar(255) NOT NULL,
`description` varchar(255) DEFAULT NULL,
`genre` int(11) NOT NULL,
`goals` text,
`submission` text NOT NULL,
`date_created` datetime DEFAULT NULL,
`date_modified` datetime DEFAULT NULL,
`date_deleted` datetime DEFAULT NULL,
`cover_image` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `genre` (`genre`),
KEY `account_id` (`account_id`),
KEY `date_created` (`date_created`)
) ENGINE=InnoDB AUTO_INCREMENT=115037 DEFAULT CHARSET=latin1;
submission_ratings
CREATE TABLE `submission_ratings` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`account_id` int(11) NOT NULL,
`submission_id` int(11) NOT NULL,
`stars` tinyint(1) NOT NULL,
`date_created` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `submission_id` (`submission_id`),
KEY `account_id` (`account_id`),
KEY `stars` (`stars`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;
competition_submissions
CREATE TABLE `competition_submissions` (
`competition_id` int(11) NOT NULL,
`submission_id` int(11) NOT NULL,
`top_round` int(11) DEFAULT '1',
PRIMARY KEY (`submission_id`),
KEY `competition_id` (`competition_id`),
KEY `top_round` (`top_round`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
SHOW PROFILE Result (ordered by duration)
state                  duration (summed) in sec   percentage
Copying to tmp table   33.15621                   68.46924
Sorting result         11.83148                   24.43260
removing tmp table      3.06054                    6.32017
Sending data            0.37560                    0.77563
... insignificant amounts removed ...
Total                  48.42497                  100.00000
EXPLAIN
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | competition_submissions | index_merge | PRIMARY,competition_id,top_round | competition_id,top_round | 4,5 | | 18596 | Using intersect(competition_id,top_round); Using where; Using index; Using temporary; Using filesort
1 | SIMPLE | submission_submissions | eq_ref | PRIMARY | PRIMARY | 4 | inkstakes.competition_submissions.submission_id | 1 | Using where
1 | SIMPLE | submission_ratings | ALL | submission_id | | | | 5 | Using where; Using join buffer (flat, BNL join)
Assuming that in reality you won't be interested in unrated submissions, and that a given submission only has a single competition_submissions entry for a given match and top_round, I suggest:
SELECT s.*
FROM (SELECT `submission_id`,
AVG(`stars`) AvgStars,
COUNT(`id`) CountId
FROM `submission_ratings`
GROUP BY `submission_id`
ORDER BY AVG(`stars`) DESC, COUNT(`id`) DESC
LIMIT 50) r
JOIN `submission_submissions` s
ON r.`submission_id` = s.`id` AND
s.`date_deleted` IS NULL
JOIN `competition_submissions` c
ON c.`submission_id` = s.`id` AND
c.`top_round` = 1 AND
c.`competition_id` = '2'
ORDER BY r.AvgStars DESC,
r.CountId DESC
(If there is more than one competition_submissions entry per submission for a given match and top_round, then you can add the GROUP BY clause back in to the main query.)
If you do want to see unrated submissions, you can union the results of this query with a LEFT JOIN ... WHERE ... IS NULL query.
There is a simple trick that works on MySQL and helps to avoid copying and sorting huge temporary tables in queries like this (with LIMIT X).
Just avoid SELECT *: it copies all columns to the temporary table, then this huge table is sorted, and in the end the query keeps only 50 records from it (50 / 75,000 ≈ 0.07%).
Select only the columns that are really necessary to perform the sort and the limit, and then join the missing columns only for the selected 50 records by id.
select ss.*
from submission_submissions ss
join (
SELECT `submission_submissions`.id,
AVG(submission_ratings.`stars`) stars,
COUNT(submission_ratings.`id`) cnt
FROM `submission_submissions`
JOIN `competition_submissions`
ON `competition_submissions`.`submission_id` = `submission_submissions`.`id`
LEFT JOIN `submission_ratings`
ON `submission_submissions`.`id` = `submission_ratings`.`submission_id`
WHERE `top_round` = 1
AND `competition_id` = '2'
AND `submission_submissions`.`date_deleted` IS NULL
GROUP BY submission_submissions.id
ORDER BY AVG(submission_ratings.`stars`) DESC,
COUNT(submission_ratings.`id`) DESC
LIMIT 50
) xx
ON ss.id = xx.id
ORDER BY xx.stars DESC,
xx.cnt DESC;
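The same "sort narrow rows first, fetch wide rows last" pattern, reduced to a runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, trimmed tables, invented rows, and LIMIT 2 instead of LIMIT 50. The aliases avg_stars and rating_count are renamed from the answer's stars and cnt only to avoid clashing with the stars column. It shows the shape of the query, not the speed-up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submission_submissions (
    id INTEGER PRIMARY KEY, title TEXT NOT NULL, date_deleted TEXT);
CREATE TABLE competition_submissions (
    submission_id INTEGER PRIMARY KEY,
    competition_id INTEGER NOT NULL,
    top_round INTEGER);
CREATE TABLE submission_ratings (
    id INTEGER PRIMARY KEY,
    submission_id INTEGER NOT NULL,
    stars INTEGER NOT NULL);

-- Invented sample data; submission 4 is soft-deleted.
INSERT INTO submission_submissions VALUES
    (1, 'one', NULL), (2, 'two', NULL), (3, 'three', NULL),
    (4, 'four', '2014-01-01 00:00:00');
INSERT INTO competition_submissions VALUES
    (1, 2, 1), (2, 2, 1), (3, 2, 1), (4, 2, 1);
INSERT INTO submission_ratings (submission_id, stars) VALUES
    (1, 4), (2, 5), (2, 3), (3, 5), (4, 5);
""")

# The inner query sorts and limits rows holding only id + two aggregates;
# the outer join then pulls the wide columns for the surviving ids.
top = conn.execute("""
SELECT ss.id, ss.title, xx.avg_stars, xx.rating_count
FROM submission_submissions ss
JOIN (SELECT s.id,
             AVG(r.stars) AS avg_stars,
             COUNT(r.id) AS rating_count
      FROM submission_submissions s
      JOIN competition_submissions c ON c.submission_id = s.id
      LEFT JOIN submission_ratings r ON r.submission_id = s.id
      WHERE c.top_round = 1
        AND c.competition_id = 2
        AND s.date_deleted IS NULL
      GROUP BY s.id
      ORDER BY avg_stars DESC, rating_count DESC
      LIMIT 2) xx ON ss.id = xx.id
ORDER BY xx.avg_stars DESC, xx.rating_count DESC
""").fetchall()

for row in top:
    print(row)
```

The outer ORDER BY is repeated on purpose: the join does not promise to preserve the order produced inside the derived table.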

MySQL: subtract rows of different tables

I want to subtract values in rows from two different tables.
I have created a view called leave_taken and a table called leave_balance.
I want this result from the two of them:
leave_taken.COUNT(*) - leave_balance.balance
and group by leave_type_id_leave_type
Code for the view and the tables:
-----------------View Leave_Taken-----------
CREATE ALGORITHM = UNDEFINED DEFINER=`1`@`localhost` SQL SECURITY DEFINER
VIEW `leave_taken`
AS
select
`leave`.`staff_leave_application_staff_id_staff` AS `staff_leave_application_staff_id_staff`,
`leave`.`leave_type_id_leave_type` AS `leave_type_id_leave_type`,
count(0) AS `COUNT(*)`
from
(
`leave`
join `staff` on((`staff`.`id_staff` = `leave`.`staff_leave_application_staff_id_staff`))
)
where (`leave`.`active` = 1)
group by `leave`.`leave_type_id_leave_type`;
----------------Table leave_balance----------
CREATE TABLE IF NOT EXISTS `leave_balance` (
`id_leave_balance` int(11) NOT NULL AUTO_INCREMENT,
`staff_id_staff` int(11) NOT NULL,
`leave_type_id_leave_type` int(11) NOT NULL,
`balance` int(3) NOT NULL,
`date_added` date NOT NULL,
PRIMARY KEY (`id_leave_balance`),
UNIQUE KEY `id_leave_balance_UNIQUE` (`id_leave_balance`),
KEY `fk_leave_balance_staff1` (`staff_id_staff`),
KEY `fk_leave_balance_leave_type1` (`leave_type_id_leave_type`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=3 ;
------- Table leave ----------
CREATE TABLE IF NOT EXISTS `leave` (
`id_leave` int(11) NOT NULL AUTO_INCREMENT,
`staff_leave_application_id_staff_leave_application` int(11) NOT NULL,
`staff_leave_application_staff_id_staff` int(11) NOT NULL,
`leave_type_id_leave_type` int(11) NOT NULL,
`date` date NOT NULL,
`active` int(11) NOT NULL DEFAULT '1',
`date_updated` date NOT NULL,
PRIMARY KEY (`id_leave`,`staff_leave_application_id_staff_leave_application`,`staff_leave_application_staff_id_staff`),
KEY `fk_table1_leave_type1` (`leave_type_id_leave_type`),
KEY `fk_table1_staff_leave_application1` (`staff_leave_application_id_staff_leave_application`,`staff_leave_application_staff_id_staff`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=32 ;
Well, I still don't think you've provided enough information. It would be very helpful to have some sample data and your expected output (in tabular format). That said, I may have something you can start working with. This query finds all staff members, calculates their current leave (grouped by type), and determines the difference between that and their balance by leave type. Take a look at it, and more importantly (perhaps) the sqlfiddle here that I used which has the sample data in it (very important to determining if this is the correct path for your data).
SELECT
staff.id_staff,
staff.name,
COUNT(`leave`.id_leave) AS leave_count,
leave_balance.balance,
(COUNT(`leave`.id_leave) - leave_balance.balance) AS leave_difference,
`leave`.leave_type_id_leave_type AS leave_type
FROM
staff
JOIN `leave` ON staff.id_staff = `leave`.staff_leave_application_staff_id_staff
JOIN leave_balance ON
(
staff.id_staff = leave_balance.staff_id_staff
AND `leave`.leave_type_id_leave_type = leave_balance.leave_type_id_leave_type
)
WHERE
`leave`.active = 1
GROUP BY
staff.id_staff, leave_type;
Good luck!
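Since the answer asks for sample data, here is a minimal sketch with some invented rows, run through Python's built-in sqlite3 module rather than MySQL. The staff table and all values are assumptions for illustration, the tables are trimmed to the columns the query touches, and the query assumes at most one leave_balance row per staff member and leave type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (id_staff INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE `leave` (
    id_leave INTEGER PRIMARY KEY,
    staff_leave_application_staff_id_staff INTEGER NOT NULL,
    leave_type_id_leave_type INTEGER NOT NULL,
    active INTEGER NOT NULL DEFAULT 1);
CREATE TABLE leave_balance (
    id_leave_balance INTEGER PRIMARY KEY,
    staff_id_staff INTEGER NOT NULL,
    leave_type_id_leave_type INTEGER NOT NULL,
    balance INTEGER NOT NULL);

-- Invented sample data; one of Ann's type-1 leaves is inactive.
INSERT INTO staff VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO `leave`
    (staff_leave_application_staff_id_staff, leave_type_id_leave_type, active)
VALUES (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 2, 1), (2, 1, 1);
INSERT INTO leave_balance (staff_id_staff, leave_type_id_leave_type, balance)
VALUES (1, 1, 10), (1, 2, 5), (2, 1, 8);
""")

# Count active leave rows per staff member and leave type,
# then subtract the matching balance.
rows = conn.execute("""
SELECT s.id_staff,
       l.leave_type_id_leave_type AS leave_type,
       COUNT(l.id_leave) AS leave_count,
       b.balance,
       COUNT(l.id_leave) - b.balance AS leave_difference
FROM staff s
JOIN `leave` l ON s.id_staff = l.staff_leave_application_staff_id_staff
JOIN leave_balance b
  ON b.staff_id_staff = s.id_staff
 AND b.leave_type_id_leave_type = l.leave_type_id_leave_type
WHERE l.active = 1
GROUP BY s.id_staff, l.leave_type_id_leave_type, b.balance
ORDER BY s.id_staff, leave_type
""").fetchall()

for row in rows:
    print(row)
```

Each output row is (staff id, leave type, leaves taken, balance, difference). Adding b.balance to the GROUP BY keeps the query valid on servers that reject non-aggregated columns outside the grouping.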

MySQL query killing my server

Looking at this query there's got to be something bogging it down that I'm not noticing. I ran it for 7 minutes and it only updated 2 rows.
//set product count for makes
$tru->query->run(array(
'name' => 'get-make-list',
'sql' => 'SELECT id, name FROM vehicle_make',
'connection' => 'core'
));
while($tempMake = $tru->query->getArray('get-make-list')) {
$tru->query->run(array(
'name' => 'update-product-count',
'sql' => 'UPDATE vehicle_make SET product_count = (
SELECT COUNT(product_id) FROM taxonomy_master WHERE v_id IN (
SELECT id FROM vehicle_catalog WHERE make_id = '.$tempMake['id'].'
)
) WHERE id = '.$tempMake['id'],
'connection' => 'core'
));
}
I'm sure this query can be optimized to perform better, but I can't think of how to do it.
vehicle_make = 45 rows
taxonomy_master = 11,223 rows
vehicle_catalog = 5,108 rows
All tables have appropriate indexes
UPDATE: I should note that this is a 1-time script so overhead isn't a big deal as long as it runs.
CREATE TABLE IF NOT EXISTS `vehicle_make` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(32) NOT NULL,
`product_count` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=46 ;
CREATE TABLE IF NOT EXISTS `taxonomy_master` (
`product_id` int(10) NOT NULL,
`v_id` int(10) NOT NULL,
`vehicle_requirement` varchar(255) DEFAULT NULL,
`is_sellable` enum('True','False') DEFAULT 'True',
`programming_override` varchar(25) DEFAULT NULL,
PRIMARY KEY (`product_id`,`v_id`),
KEY `idx2` (`product_id`),
KEY `idx3` (`v_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `vehicle_catalog` (
`v_id` int(10) NOT NULL,
`id` int(11) NOT NULL,
`v_make` varchar(255) NOT NULL,
`make_id` int(11) NOT NULL,
`v_model` varchar(255) NOT NULL,
`model_id` int(11) NOT NULL,
`v_year` varchar(255) NOT NULL,
PRIMARY KEY (`v_id`,`v_make`,`v_model`,`v_year`),
UNIQUE KEY `idx` (`v_make`,`v_model`,`v_year`),
UNIQUE KEY `idx2` (`v_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Update: The successful query to get what I needed is here....
SELECT
m.id,COUNT(t.product_id) AS CountOf
FROM taxonomy_master t
INNER JOIN vehicle_catalog v ON t.v_id=v.id
INNER JOIN vehicle_make m ON v.make_id=m.id
GROUP BY m.id;
Without the tables/columns, this is my best guess from reverse engineering the given queries (written as a MySQL multi-table UPDATE, since MySQL does not accept UPDATE ... FROM):
UPDATE vehicle_make m
INNER JOIN (
SELECT v.make_id, COUNT(t.product_id) AS CountOf
FROM taxonomy_master t
INNER JOIN vehicle_catalog v ON t.v_id=v.id
GROUP BY v.make_id
) c ON c.make_id=m.id
SET m.product_count=c.CountOf;
The given code loops over each make and then runs a query that counts the products for each one. My answer does them all in one query and should be a lot faster.
Have an index for each of these:
vehicle_make.id cover on name
vehicle_catalog.id cover make_id
taxonomy_master.v_id
EDIT
give this a try:
CREATE TEMPORARY TABLE CountsOf (
id int(11) NOT NULL
, CountOf int(11) NOT NULL DEFAULT 0
);
INSERT INTO CountsOf
(id, CountOf )
SELECT
m.id,COUNT(t.product_id) AS CountOf
FROM taxonomy_master t
INNER JOIN vehicle_catalog v ON t.v_id=v.id
INNER JOIN vehicle_make m ON v.make_id=m.id
GROUP BY m.id;
UPDATE vehicle_make,CountsOf
SET vehicle_make.product_count=CountsOf.CountOf
WHERE vehicle_make.id=CountsOf.id;
Instead of using a nested query,
you can separate this query into 2 or 3 queries,
and in PHP insert the result of the inner query into the outer query.
It's faster!
@haim-evgi Separating the queries will not increase the speed significantly; it will just shift the load from the DB server to the web server and create the overhead of moving data between the two servers.
I doubt that such a query would run for 7 minutes with the appropriate indexes in place. Could you please show the structure of the tables involved in these queries?
Seems like you need the following indices:
INDEX BTREE('make_id') on vehicle_catalog
INDEX BTREE('v_id') on taxonomy_master
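To tie the suggestions together, here is a minimal sketch that creates the two indexes and replaces the per-make PHP loop with one set-based UPDATE. It runs on Python's built-in sqlite3 module as a stand-in for MySQL; the index names and all rows are invented, and the tables are trimmed to the columns involved. The correlated-subquery form is used because it is plain SQL that should also be accepted by MySQL, but that is an assumption to verify on your server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicle_make (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    product_count INTEGER NOT NULL DEFAULT 0);
CREATE TABLE vehicle_catalog (
    v_id INTEGER NOT NULL,
    id INTEGER NOT NULL,
    make_id INTEGER NOT NULL);
CREATE TABLE taxonomy_master (
    product_id INTEGER NOT NULL,
    v_id INTEGER NOT NULL,
    PRIMARY KEY (product_id, v_id));

-- The two indexes suggested above (hypothetical names).
CREATE INDEX idx_catalog_make ON vehicle_catalog (make_id);
CREATE INDEX idx_taxonomy_v ON taxonomy_master (v_id);

-- Invented sample data; taxonomy_master.v_id matches vehicle_catalog.id,
-- as in the question's original query.
INSERT INTO vehicle_make (id, name) VALUES (1, 'Ford'), (2, 'Audi'), (3, 'Saab');
INSERT INTO vehicle_catalog VALUES (100, 1, 1), (101, 2, 1), (102, 3, 2);
INSERT INTO taxonomy_master VALUES (500, 1), (501, 1), (502, 2), (503, 3);
""")

# One statement updates every make; no client-side loop is needed.
conn.execute("""
UPDATE vehicle_make
SET product_count = (
    SELECT COUNT(t.product_id)
    FROM taxonomy_master t
    JOIN vehicle_catalog v ON t.v_id = v.id
    WHERE v.make_id = vehicle_make.id)
""")

counts = conn.execute(
    "SELECT id, product_count FROM vehicle_make ORDER BY id").fetchall()
print(counts)
```

Unlike a join-based update, this form also writes 0 for makes that have no products, because COUNT over an empty set returns 0.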