How to refact UPDATE via SELECT - mysql

I have this structure of my db:
CREATE TABLE IF NOT EXISTS `peoples` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1 ;
For customers.
CREATE TABLE IF NOT EXISTS `peoplesaddresses` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`people_id` int(10) unsigned NOT NULL,
`phone` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
`address` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1 ;
For their addresses.
CREATE TABLE IF NOT EXISTS `peoplesphones` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`people_id` int(10) unsigned NOT NULL,
`phone` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
`address` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1 ;
For their phones.
UPD4
ALTER TABLE peoplesaddresses DISABLE KEYS;
ALTER TABLE peoplesphones DISABLE KEYS;
ALTER TABLE peoplesaddresses ADD INDEX i_phone (phone);
ALTER TABLE peoplesphones ADD INDEX i_phone (phone);
ALTER TABLE peoplesaddresses ADD INDEX i_address (address);
ALTER TABLE peoplesphones ADD INDEX i_address (address);
ALTER TABLE peoplesaddresses ENABLE KEYS;
ALTER TABLE peoplesphones ENABLE KEYS;
END UPD4
CREATE TABLE IF NOT EXISTS `order` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`people_id` int(10) unsigned NOT NULL,
`name` varchar(255) CHARACTER SET utf8 NOT NULL,
`phone` varchar(255) CHARACTER SET utf8 NOT NULL,
`adress` varchar(255) CHARACTER SET utf8 NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=12 ;
INSERT INTO `order` (`id`, `people_id`, `name`, `phone`, `adress`) VALUES
(1, 0, 'name1', 'phone1', 'address1'),
(2, 0, 'name1_1', 'phone1', 'address1_1'),
(3, 0, 'name1_1', 'phone1', 'address1_2'),
(4, 0, 'name2', 'phone2', 'address2'),
(5, 0, 'name2_1', 'phone2', 'address2_1'),
(6, 0, 'name3', 'phone3', 'address3'),
(7, 0, 'name4', 'phone4', 'address4'),
(8, 0, 'name1_1', 'phone5', 'address1_1'),
(9, 0, 'name1_1', 'phone5', 'address1_2'),
(11, 0, 'name1', 'phone1', 'address1'),
(10, 0, 'name1', 'phone1', 'address1');
Production base have over 9000 records. Is there way to execute this 3 update query's little more faster, than now (~50 min on dev machine).
INSERT INTO peoplesphones( phone, address )
SELECT DISTINCT `order`.phone, `order`.adress
FROM `order`
GROUP BY `order`.phone;
Fill peoplesphones table with unique phones
INSERT INTO peoplesaddresses( phone, address )
SELECT DISTINCT `order`.phone, `order`.adress
FROM `order`
GROUP BY `order`.adress;
Fill peoplesaddresses table with unique adress.
The next three querys are very slow:
UPDATE peoplesaddresses, peoplesphones SET peoplesaddresses.people_id = peoplesphones.id WHERE peoplesaddresses.phone = peoplesphones.phone;
UPDATE peoplesaddresses, peoplesphones SET peoplesphones.people_id = peoplesaddresses.people_id WHERE peoplesaddresses.address = peoplesphones.address;
UPDATE `order`, `peoplesphones` SET `order`.people_id = `peoplesphones`.people_id where `order`.phone = `peoplesphones`.phone;
Finally fill people table, and clear uneccessary fields.
INSERT INTO peoples( id, name )
SELECT DISTINCT `order`.people_id, `order`.name
FROM `order`
GROUP BY `order`.people_id;
ALTER TABLE `peoplesphones`
DROP `address`;
ALTER TABLE `peoplesaddresses`
DROP `phone`;
So, again: How can I make those UPDATE query's a little more faster? THX.
UPD: I forgott to say: I need to do it at once, just for migrate phones and adresses into other tables since one people can have more than one phone, and can order pizza not only at home.
UPD2:
UPD3:
Replace slow update querys on this (without with) get nothing.
UPDATE peoplesaddresses
LEFT JOIN
peoplesphones
ON peoplesaddresses.phone = peoplesphones.phone
SET peoplesaddresses.people_id = peoplesphones.id;
UPDATE peoplesphones
LEFT JOIN
`peoplesaddresses`
ON `peoplesaddresses`.address = `peoplesphones`.address
SET `peoplesphones`.people_id = `peoplesaddresses`.people_id;
UPDATE `order`
LEFT JOIN
`peoplesphones`
ON `order`.phone = `peoplesphones`.phone
SET `order`.people_id = `peoplesphones`.people_id;
UPD4 After adding code at the top (upd4), script takes a few seconds for execute. But on ~6.5k query it terminate with text: "The system cannot find the Drive specified".
Thanks to All. Especially to xQbert and Brent Baisley.

50 minutes for 9000 records is a bit ridiculous, event without indexes. You might as well put the 9000 records in Excel and do what you need to do. I think there is something else going on with your dev machine. Perhaps you have mysql configured to use very little memory? Maybe you can post the results of this "query":
show variables like "%size%";
Just this morning I did an insert(ignore)/select on 2 tables (one into another), both with over 400,000 records. 126,000 records were inserted into the second table, it took a total of 2 minutes 13 seconds.
I would say put indexes on any of the fields you are joining or grouping on, but this seems like a one time job. I don't think the lack of indexes is your problem.

All write operations are slow in relational databases. Especially indexes make them slow, since they have to be recalculated.
If you're using a WHERE in your statements, you should place an index on the fields referenced.
GROUP BY is always very slow, and so is DISTINCT, since they have to do a lot of checks that don't scale linearly. Always avoid them.
You may like to choose a different database engine for what you're doing. 9000 records in 50 minutes is very slow. Experiment with a few different engines, such as MyISAM and InnoDB. If you're using temporary tables a lot, MEMORY is really fast for those.
Update: Also, updating multiple tables in one statement probably shouldn't be done.

Related

How to copy a very large table into another table in MYSQL?

I have a large table with 110M rows. I would like to copy some of the fields into a new table and here is a rough idea of how I am trying to do:
DECLARE l_seenChangesTo DATETIME DEFAULT '1970-01-01 01:01:01';
DECLARE l_migrationStartTime DATETIME;
SELECT NOW() into l_migrationStartTime;
-- See if we've run this migration before and if so, pick up from where we left off...
IF EXISTS(SELECT seenChangesTo FROM migration_status WHERE client_user = CONCAT('this-migration-script-', user())) THEN
SELECT seenChangesTo FROM migration_status WHERE client_user = CONCAT('this-migration-script-', user()) INTO l_seenChangesTo;
SELECT NOW() as LogTime, CONCAT('Picking up from where we left off: ', l_seenChangesTo) as MigrationStatus;
END IF;
INSERT IGNORE INTO newTable
(field1, field2, lastModified)
SELECT o.column1 AS field1,
o.column2 AS field2,
o.lastModified
FROM oldTable o
WHERE
o.lastModified >= l_seenChangesTo AND
o.lastModified <= l_migrationStartTime;
INSERT INTO migration_status (client_user,seenChangesTo)
VALUES (CONCAT('this-migration-script-', user()), l_migrationStartTime)
ON DUPLICATE KEY UPDATE seenChangesTo=l_migrationStartTime;
Context:
CREATE TABLE IF NOT EXISTS `newTable` (
`field1` varchar(255) NOT NULL,
`field2` tinyint unsigned NOT NULL,
`lastModified` datetime NOT NULL,
PRIMARY KEY (`field1`, `field2`),
KEY `ix_field1` (`field1`),
KEY `ix_lastModified` (`lastModified`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `oldTable` (
`column1` varchar(255) NOT NULL,
`column2` tinyint unsigned NOT NULL,
`lastModified` datetime NOT NULL,
PRIMARY KEY (`column1`, `column2`),
KEY `ix_column1` (`column1`),
KEY `ix_lastModified` (`lastModified`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `migration_status` (
`client_user` char(64) NOT NULL,
`seenChangesTo` char(128) NOT NULL,
PRIMARY KEY (`client_user`)
);
Note: I have a few more columns in oldTable. Both oldTable and newTable are in same DB schema using mysql.
What's the general strategy when copying a very table? Should I perform this migration in an iterative manner by copy say 50,000 rows at time.
The insert speed doing a migration like this iteratively is going to be dreadfully slow. Why not SELECT oldTable INTO OUTFILE, then LOAD DATA INFILE ?

How to fetch a record containing something if Count(*)=0

I have a set of tables containing staff memebers' secondary specialities in different formats. Actually, only few primary specialities allow a person to have a secondary speciality.
I make a UNION view, combining the secondary specialities in this way and it is working fine:
SELECT tpl.id, tpl.speciality, 20 as gr FROM tpath_list tpl
UNION ALL
SELECT tlt.id, tlt.lab_type as speciality, 11 as gr FROM tlab_types tlt
UNION ALL
SELECT til.id, til.speciality, 10 as gr FROM tinstrumental_list til
Field gr contains the index field of the primary specialities (10, 11 and 20) allowing to have a secondary speciality. All other specialities (with id 1, 2, 3 etc.) do not have secondary ones.
I receive something like this...
I can now fetch data from the created view using WHERE gr=:n.
How can I modify the view so that fetching data from it using WHERE gr=:n (1, 2, 3 etc.) clause will give me a single record id=1 speciality="Not specified" gr=n
Appended on comments of the community:
I need to fetch all the records if :n IN(10,11,20) that are present in the tables and presented by the view. For example, 3 records listed in the picture for :n=20. Only if :n NOT IN(10,11,20) I need an additional (single) record absenting from the tables (and the view). Please, note that records with gr IN(10,11,20) have got the id=1 and speciality='Not specified'.
Appended on request of Jon Tofte-Hansen
Here is an unfiltered view output (without WHERE clause).
If I fetch the view with WHERE gr=20 I get the following (that is what I want):
If with WHERE gr=10 - this set (that is what I want):
If I fetch it with WHERE gr=1 then I woud like to receive a sigle record that is absent in any of the tables used for view creation. I wish to get this output but do not know how to:
Here is the structure of the tables if it might help:
DROP TABLE IF EXISTS `tspecialities_list`;
CREATE TABLE `tspecialities_list` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`speciality` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`id`,`speciality`),
KEY `i_by_speciality` (`speciality`)
) ENGINE=InnoDB AUTO_INCREMENT=21 DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `tlab_types`;
CREATE TABLE `tlab_types` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`lab_type` varchar(255) DEFAULT NULL,
`deleted` smallint(5) unsigned DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `tpath_list`;
CREATE TABLE `tpath_list` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`speciality` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`id`,`speciality`),
KEY `i_by_speciality` (`speciality`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `tinstrumental_list`;
CREATE TABLE `tinstrumental_list` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`speciality` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`id`,`speciality`),
KEY `i_by_speciality` (`speciality`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=utf8;
It is not entirely clear what you want, but this could be it. Add an extra union that produce all the id's not present in the three tables with secondary specialities:
SELECT tpl.id, tpl.speciality, 20 as gr FROM tpath_list tpl
UNION ALL
SELECT tlt.id, tlt.lab_type as speciality, 11 as gr FROM tlab_types tlt
UNION ALL
SELECT til.id, til.speciality, 10 as gr FROM tinstrumental_list til
UNION ALL
select tsl.id, 'Not specified', tsl.id as gr from tspecialities_list tsl
where tsl.id not in
(SELECT tpl.id FROM tpath_list tpl
UNION
SELECT tlt.id FROM tlab_types tlt
UNION
SELECT til.id FROM tinstrumental_list til)
It seems like you are just asking for the following:
SELECT *
FROM viewname
WHERE speciality='Not specified' and gr=:n
Is there a reason this isn't what you want?

JOIN databases from SELECT

I need a joined (Union like) result from multiple databases on the same databaseserver in one query. Every customer database contains a location table and all customers are listed in the core database.
I don't need a simple join between to different databases. I need the actual joined database name to come from the same query.
I figure something like this.
SELECT customerlist.dbname,customerlist.realname,location.address
FROM core.customerlist
INNER JOIN `customer.dbname`.location
ORDER BY customerlist.realname
I know this won't work, I'm just trying to pseudo code what I'm searching for. I hope someone can help.
Database structure:
-- Database: `core`
CREATE TABLE IF NOT EXISTS `customerlist` (
`realname` varchar(20) NOT NULL,
`dbname` varchar(16) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `customerlist` (`realname`, `dbname`) VALUES
('Johnny', 'johnny'),
('Alfred', 'alfred');
-- --------------------------------------------------------
-- Database: `alfred`
CREATE TABLE IF NOT EXISTS `location` (
`address` varchar(20) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `location` (`address`) VALUES
('House Three'),
('Car 1');
-- --------------------------------------------------------
-- Database: `johnny`
CREATE TABLE IF NOT EXISTS `location` (
`address` varchar(20) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `location` (`address`) VALUES
('House One'),
('House Two');
Desired result;
johnny,Johnny,House One
johnny,Johnny,House Two
alfred,Alfred,House Three
alfred,Alfred,Car 1
You'll have to be careful of the two table identical tables...joining them in a union if fine since they are identical in structure:
SELECT customerlist.dbname,customerlist.realname,location.address
FROM core.customerlist
LEFT JOIN (
SELECT * FROM (johnny.location UNION alfred.location
)) AS T2 ON customerlist.dbname = T2.dbname
ORDER BY customerlist.realname

Get all items from table while joining with a second table in a single query

I got two tables:
CREATE TABLE IF NOT EXISTS `groups2rights` (
`groups2rights_group_id` int(11) NOT NULL default '0',
`groups2rights_right` int(11) NOT NULL default '0',
PRIMARY KEY (`groups2rights_group_id`,`groups2rights_right`),
KEY `groups2rights_right` (`groups2rights_right`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `groups2rights` (`groups2rights_group_id`, `groups2rights_right`) VALUES (1, 35);
CREATE TABLE IF NOT EXISTS `rights` (
`right` int(11) NOT NULL auto_increment,
`right_name` varchar(255) default NULL,
`description` text NOT NULL,
`category` int(11) NOT NULL default '0',
PRIMARY KEY (`right`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=36 ;
INSERT INTO `rights` (`right`, `right_name`, `description`, `category`) VALUES
(33, 'admin_right_group_add', '', 100),
(34, 'admin_right_group_edit', '', 0),
(35, 'admin_right_group_delete', '', 0);
ALTER TABLE `groups2rights` ADD CONSTRAINT `groups2rights_ibfk_4` FOREIGN KEY (`groups2rights_right`) REFERENCES `rights` (`right`) ON DELETE CASCADE;
Now I tried to select all available Rights and also get if the group has it assigned, but somehow I'm missing some of the rights. Query:
SELECT r.*,g2r.groups2rights_group_id
FROM rights AS r
LEFT JOIN groups2rights AS g2r ON (g2r.groups2rights_right=r.right)
WHERE g2r.groups2rights_group_id=<<ID>> OR g2r.groups2rights_group_id IS NULL
ORDER BY r.category,r.right_name ASC
Any ideas?
Edit:
Updated the Code.
Expected Result be 3 Rows with 2 of them Havin a Null field and one having a value set.
If you do
SELECT r.*,g2r.group_id
FROM rights AS r
LEFT JOIN groups2rights AS g2r ON (g2r.right=r.right)
WHERE g2r.group_id=<<#id>> OR g2r.group_id IS NULL
ORDER BY r.category,r.right_name ASC
You will not gets rows where g2r.group_id <> null and also g2r.group_id <> <<#id>>
If you want to get all rows in rights and some of the rows in groups2rights you should do:
SELECT r.*,g2r.group_id
FROM rights AS r
LEFT JOIN (SELECT * FROM groups2rights WHERE group_id=<<#id>>) AS g2r
ON (g2r.right=r.right)
ORDER BY r.category,r.right_name ASC
This should work.
So you want to return all results found in the right table? In this case you should be using a RIGHT JOIN. This will return all results from the right table regardless of it matching the left table.
http://www.w3schools.com/sql/sql_join_right.asp

Anyone care to help optimize a MySQL query?

Here's the query:
SELECT COUNT(*) AS c, MAX(`followers_count`) AS max_fc,
MIN(`followers_count`) AS min_fc, MAX(`following_count`) AS max_fgc,
MIN(`following_count`) AS min_fgc, SUM(`followers_count`) AS fc,
SUM(`following_count`) AS fgc, MAX(`updates_count`) AS max_uc,
MIN(`updates_count`) AS min_uc, SUM(`updates_count`) AS uc
FROM `profiles`
WHERE `twitter_id` IN (SELECT `followed_by`
FROM `relations`
WHERE `twitter_id` = 123);
The two tables are profiles and relations. Both have over 1,000,000 rows, InnoDB engine. Both have indexes on twitter_id, relations has an extra index on (twitter_id, followed_by). The query is taking over 6 seconds to execute, this really frustrates me. I know that I can JOIN this somehow, but my MySQL knowledge is not so cool, that's why I'm asking for your help.
Thanks in advance everyone =)
Cheers,
K ~
Updated
Okay I managed to get down to 2,5 seconds. I used INNER JOIN and added the three index pairs. Here's the EXPLAIN results:
id, select_type, table, type, possible_keys,
key, key_len, ref, rows, Extra
1, 'SIMPLE', 'r', 'ref', 'relation',
'relation', '4', 'const', 252310, 'Using index'
1, 'SIMPLE', 'p', 'ref', 'PRIMARY,twiter_id,id_fc,id_fgc,id_uc',
'id_uc', '4', 'follerme.r.followed_by', 1, ''
Hope this helps.
Another update
Here are the SHOW CREATE TABLE statements for both tables:
CREATE TABLE `profiles` (
`twitter_id` int(10) unsigned NOT NULL,
`screen_name` varchar(45) NOT NULL default '',
`followers_count` int(10) unsigned default NULL,
`following_count` int(10) unsigned default NULL,
`updates_count` int(10) unsigned default NULL,
`location` varchar(45) default NULL,
`bio` varchar(160) default NULL,
`url` varchar(255) default NULL,
`image` varchar(255) default NULL,
`registered` int(10) unsigned default NULL,
`timestamp` int(10) unsigned default NULL,
`relations_timestamp` int(10) unsigned default NULL,
PRIMARY KEY USING BTREE (`twitter_id`,`screen_name`),
KEY `twiter_id` (`twitter_id`),
KEY `screen_name` USING BTREE (`screen_name`,`twitter_id`),
KEY `id_fc` (`twitter_id`,`followers_count`),
KEY `id_fgc` (`twitter_id`,`following_count`),
KEY `id_uc` (`twitter_id`,`updates_count`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `relations` (
`id` int(10) unsigned NOT NULL auto_increment,
`twitter_id` int(10) unsigned NOT NULL default '0',
`followed_by` int(10) unsigned default NULL,
`timestamp` int(10) unsigned default NULL,
PRIMARY KEY USING BTREE (`id`,`twitter_id`),
UNIQUE KEY `relation` (`twitter_id`,`followed_by`)
) ENGINE=InnoDB AUTO_INCREMENT=1209557 DEFAULT CHARSET=utf8
Wow, what a mess =) Sorry!
A join would look something like this:
SELECT COUNT(*) AS c,
MAX(p.`followers_count`) AS max_fc,
MIN(p.`followers_count`) AS min_fc,
MAX(p.`following_count`) AS max_fgc,
MIN(p.`following_count`) AS min_fgc,
SUM(p.`followers_count`) AS fc,
SUM(p.`following_count`) AS fgc,
MAX(p.`updates_count`) AS max_uc,
MIN(p.`updates_count`) AS min_uc,
SUM(p.`updates_count`) AS uc
FROM `profiles` AS p
INNER JOIN `relations` AS r ON p.`twitter_id` = r.`followed_by`
WHERE r.`twitter_id` = 123;
To help optimize it you should run EXPLAIN SELECT ... on both queries.
Create the following composite indexes:
profiles (twitter_id, followers_count)
profiles (twitter_id, following_count)
profiles (twitter_id, updates_count)
and post the query plan, for God's sake.
By the way, how many rows does this COUNT(*) return?
Update:
Your table rows are quite long. Create a composite index on all the fields you select:
profiles (twitter_id, followers_count, following_count, updates_count)
so that the JOIN query can retrieve all the values it need from that index.
SELECT COUNT(*) AS c,
MAX(`followers_count`) AS max_fc, MIN(`followers_count`) AS min_fc,
MAX(`following_count`) AS max_fgc, MIN(`following_count`) AS min_fgc,
SUM(`followers_count`) AS fc, SUM(`following_count`) AS fgc,
MAX(`updates_count`) AS max_uc, MIN(`updates_count`) AS min_uc, SUM(`updates_count`) AS uc
FROM `profiles`
JOIN `relations`
ON (profiles.twitter_id = relations.followed_by)
WHERE relations.twitted_id = 123;
might be a bit faster, but you'll need to measure and check if that is indeed so.
count(*) is a very expensive operation under the InnoDB Engine, have you tried this query without that piece? If it's causing the most processing time then maybe you could keep a running value instead of querying for it each time.
I'd approach this problem from a programmers angle; I'd have a separate table (or storage area somewhere) that stored the max,min and sum values associated with each field in your original query and update those values every time I updated and added a table record. (although deleting may be problematic if not handled correctly).
After the original query to populate these values is complete (which is the almost the same as the query you posted), you're essentially reducing your final query to getting one row from a data table, rather than computing everything all at once.