This is slow, taking 1.003 seconds:
set @POSTCODE = 'SL2 1AT';
select latitude, longitude from postcode.postcodelatlng where postcode = @POSTCODE;
This is fast, taking 0.127 seconds:
select latitude, longitude from postcode.postcodelatlng where postcode = 'SL2 1AT';
Table create statement
CREATE TABLE `postcodelatlng` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`postcode` varchar(8) NOT NULL,
`latitude` decimal(18,15) NOT NULL,
`longitude` decimal(18,15) NOT NULL,
PRIMARY KEY (`id`),
KEY `id_postcode` (`postcode`),
KEY `latLong` (`latitude`,`longitude`) USING BTREE
) ENGINE=MyISAM AUTO_INCREMENT=1738245 DEFAULT CHARSET=latin1;
Could you please help explain why this is happening, and what we can do to resolve it?
First of all, I suggest you use InnoDB over MyISAM.
The comparison against a latin1 column was incurring a fairly heavy performance overhead because your client connection is utf8 while the table is latin1. Instead, you can convert the literal to the table's character set before assigning it:
SET @POSTCODE := CONVERT('SL2 1AT' USING latin1);
Now the query runs as fast as I would expect.
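As a quick sanity check (a sketch using standard MySQL commands, nothing specific to this schema), you can compare the connection character set against the table's character set to confirm the mismatch:
-- Character sets in use by the client/connection (often utf8/utf8mb4)
SHOW VARIABLES LIKE 'character_set%';
-- Character set declared for the table (latin1 here)
SHOW CREATE TABLE postcode.postcodelatlng;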
:= is the assignment operator in MySQL:
set @POSTCODE := 'SL2 1AT';
select latitude, longitude from postcode.postcodelatlng where postcode = @POSTCODE;
There is a colon missing in the first line.
I'm trying to denormalize a few MySQL tables I have into a new table that I can use to speed up some complex queries with lots of business logic. The problem I'm having is that there are 2.3 million records I need to add to the new table, and to do that I need to pull data from several tables and do a few conversions too. Here's my query (with names changed):
INSERT INTO database_name.log_set_logs
(offload_date, vehicle, jurisdiction, baselog_path, path,
baselog_index_guid, new_location, log_set_name, index_guid)
(
select STR_TO_DATE(logset_logs.offload_date, '%Y.%m.%d') as offload_date,
logset_logs.vehicle, jurisdiction, baselog_path, path,
baselog_trees.baselog_index_guid, new_location, logset_logs.log_set_name,
logset_logs.index_guid
from
(
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, '/', 7), '/', -1) as offload_date,
SUBSTRING_INDEX(SUBSTRING_INDEX(path, '/', 8), '/', -1) as vehicle,
SUBSTRING_INDEX(path, '/', 9) as baselog_path, index_guid,
path, log_set_name
FROM database_name.baselog_and_amendment_guid_to_path_mappings
) logset_logs
left join database_name.log_trees baselog_trees
ON baselog_trees.original_location = logset_logs.baselog_path
left join database_name.baselog_offload_location location
ON location.baselog_index_guid = baselog_trees.baselog_index_guid);
The query itself works: I was able to run it using a filter on log_set_name. However, that filter's condition covers less than 1% of the total records, because a single log_set_name value accounts for 2.2 million records, the vast majority. So there is nothing else I can see to break this query up into smaller chunks. The problem is that the query takes too long to run on the remaining 2.2 million records; it times out after a few hours, the transaction is rolled back, and nothing is added to the new table for those records. Only the 0.1 million records could be processed, and that was only possible because I could add a filter saying where log_set_name != 'value with the 2.2 million records'.
Is there a way to make this query more performant? Am I trying to do too many joins at once, and should I instead populate the row's columns in their own individual queries? Or is there some way I can page this type of query so that MySQL executes it in batches? I already got rid of all my indexes on the log_set_logs table because I read that they slow down inserts. I also jacked my RDS instance up to a db.r4.4xlarge write node, and since I am using MySQL Workbench I increased all of its timeout values to their maximums (all nines). All three of these steps helped and were necessary to get the 1% of records into the new table, but it still wasn't enough to load the 2.2 million records without timing out. I'd appreciate any insights, as I'm not adept at this type of bulk insert from a select.
CREATE TABLE `log_set_logs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`purged` tinyint(1) NOT NULL DEFAUL,
`baselog_path` text,
`baselog_index_guid` varchar(36) DEFAULT NULL,
`new_location` text,
`offload_date` date NOT NULL,
`jurisdiction` varchar(20) DEFAULT NULL,
`vehicle` varchar(20) DEFAULT NULL,
`index_guid` varchar(36) NOT NULL,
`path` text NOT NULL,
`log_set_name` varchar(60) NOT NULL,
`protected_by_retention_condition_1` tinyint(1) NOT NULL DEFAULT '1',
`protected_by_retention_condition_2` tinyint(1) NOT NULL DEFAULT '1',
`protected_by_retention_condition_3` tinyint(1) NOT NULL DEFAULT '1',
`protected_by_retention_condition_4` tinyint(1) NOT NULL DEFAULT '1',
`general_comments_about_this_log` text,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1736707 DEFAULT CHARSET=latin1
CREATE TABLE `baselog_and_amendment_guid_to_path_mappings` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`path` text NOT NULL,
`index_guid` varchar(36) NOT NULL,
`log_set_name` varchar(60) NOT NULL,
PRIMARY KEY (`id`),
KEY `log_set_name_index` (`log_set_name`),
KEY `path_index` (`path`(42))
) ENGINE=InnoDB AUTO_INCREMENT=2387821 DEFAULT CHARSET=latin1
...
CREATE TABLE `baselog_offload_location` (
`baselog_index_guid` varchar(36) NOT NULL,
`jurisdiction` varchar(20) NOT NULL,
KEY `baselog_index` (`baselog_index_guid`),
KEY `jurisdiction` (`jurisdiction`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `log_trees` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`baselog_index_guid` varchar(36) DEFAULT NULL,
`original_location` text NOT NULL, -- This is what I have to join everything on, and since it's text I cannot index it; the largest value is above 255 characters, so I cannot change it to a varchar and index it either.
`new_location` text,
`distcp_returncode` int(11) DEFAULT NULL,
`distcp_job_id` text,
`distcp_stdout` text,
`distcp_stderr` text,
`validation_attempt` int(11) NOT NULL DEFAULT '0',
`validation_result` tinyint(1) NOT NULL DEFAULT '0',
`archived` tinyint(1) NOT NULL DEFAULT '0',
`archived_at` timestamp NULL DEFAULT NULL,
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`dir_exists` tinyint(1) NOT NULL DEFAULT '0',
`random_guid` tinyint(1) NOT NULL DEFAULT '0',
`offload_date` date NOT NULL,
`vehicle` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `baselog_index_guid` (`baselog_index_guid`)
) ENGINE=InnoDB AUTO_INCREMENT=1028617 DEFAULT CHARSET=latin1
baselog_offload_location has no PRIMARY KEY; what's up?
GUIDs/UUIDs can be terribly inefficient. A partial solution is to convert them to BINARY(16) to shrink them. More details here: http://mysql.rjweb.org/doc.php/uuid (MySQL 8.0 has similar built-in functions.)
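For illustration only (the GUID literal below is made up), the conversion can be done with plain string functions, or with the UUID helpers added in MySQL 8.0:
-- Pre-8.0: strip the dashes and pack the 32 hex digits into 16 bytes
SELECT UNHEX(REPLACE('3f06af63-a93c-11e4-9797-00505690773f', '-', '')) AS guid_bin;
-- MySQL 8.0: built-in helpers for the same round trip
SELECT UUID_TO_BIN('3f06af63-a93c-11e4-9797-00505690773f') AS guid_bin;
SELECT BIN_TO_UUID(UUID_TO_BIN('3f06af63-a93c-11e4-9797-00505690773f')) AS guid_text;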
It would probably be more efficient to have a separate (optionally redundant) column for vehicle rather than needing to do
SUBSTRING_INDEX(SUBSTRING_INDEX(path, '/', 8), '/', -1) as vehicle
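One hedged way to get that redundant column (assuming MySQL 5.7+ so generated columns are available; the index name is made up, and VARCHAR(20) assumes no vehicle segment is longer than that):
-- Stored generated column: the SUBSTRING_INDEX work is done once, at write time
ALTER TABLE database_name.baselog_and_amendment_guid_to_path_mappings
  ADD COLUMN vehicle VARCHAR(20)
    GENERATED ALWAYS AS (SUBSTRING_INDEX(SUBSTRING_INDEX(path, '/', 8), '/', -1)) STORED,
  ADD INDEX idx_vehicle (vehicle);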
Why JOIN baselog_offload_location? There seems to be no reference to columns in that table. If there are, be sure to qualify them so we know what is where. Preferably use short aliases.
The lack of an index on baselog_index_guid may be critical to performance.
Please provide EXPLAIN SELECT ... for the SELECT in your INSERT and for the original (slow) query.
SELECT MAX(LENGTH(original_location)) FROM .. -- to see if it really is too big to index. What version of MySQL are you using? The limit increased recently.
For the above item, we can talk about having a 'hash'.
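A sketch of that 'hash' idea (assuming MySQL 5.7+; the column and index names are hypothetical): index a fixed-length hash of original_location, since the TEXT column itself cannot be indexed in full.
-- 16-byte hash of the long text column, maintained automatically
ALTER TABLE database_name.log_trees
  ADD COLUMN original_location_hash BINARY(16)
    GENERATED ALWAYS AS (UNHEX(MD5(original_location))) STORED,
  ADD INDEX idx_original_location_hash (original_location_hash);
-- The join condition then becomes:
--   ON baselog_trees.original_location_hash = UNHEX(MD5(logset_logs.baselog_path))
-- optionally also comparing the full text to rule out hash collisions.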
"paging the query". I call it "chunking". See http://mysql.rjweb.org/doc.php/deletebig#deleting_in_chunks . That talks about deleting, but it can be adapted to INSERT .. SELECT since you want to "chunk" the select. If you go with chunking, Javier's comment becomes moot. Your code would be chunking the selects, hence batching the inserts:
Loop:
INSERT .. SELECT .. -- of up to 1000 rows (see link)
End loop
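A sketch of one pass of that loop, adapted from the original INSERT .. SELECT; the chunk boundaries (0 and 1000 here) and the outer loop that advances them are assumptions to be supplied by application code or a stored procedure:
-- One chunk: only source rows whose id falls inside the current window.
-- Repeat, advancing the window each pass, until the window passes MAX(id);
-- each pass is its own transaction, so a failure only loses the current chunk.
INSERT INTO database_name.log_set_logs
  (offload_date, vehicle, jurisdiction, baselog_path, path,
   baselog_index_guid, new_location, log_set_name, index_guid)
SELECT STR_TO_DATE(ll.offload_date, '%Y.%m.%d') AS offload_date,
       ll.vehicle, loc.jurisdiction, ll.baselog_path, ll.path,
       bt.baselog_index_guid, bt.new_location, ll.log_set_name, ll.index_guid
FROM (
       SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, '/', 7), '/', -1) AS offload_date,
              SUBSTRING_INDEX(SUBSTRING_INDEX(path, '/', 8), '/', -1) AS vehicle,
              SUBSTRING_INDEX(path, '/', 9) AS baselog_path,
              index_guid, path, log_set_name
       FROM database_name.baselog_and_amendment_guid_to_path_mappings
       WHERE id > 0 AND id <= 1000      -- advance this window on each pass
     ) ll
LEFT JOIN database_name.log_trees bt
       ON bt.original_location = ll.baselog_path
LEFT JOIN database_name.baselog_offload_location loc
       ON loc.baselog_index_guid = bt.baselog_index_guid;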
I have a table defined as follows:
CREATE TABLE `book` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`provider_id` int(10) unsigned DEFAULT '0',
`source_id` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
`title` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`description` longtext COLLATE utf8_unicode_ci,
PRIMARY KEY (`id`),
UNIQUE KEY `provider` (`provider_id`,`source_id`),
KEY `idx_source_id` (`source_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1605425 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
When there are about 10 concurrent reads with the following SQL:
SELECT * FROM `book` WHERE (provider_id = '1' AND source_id = '1037122800') ORDER BY `book`.`id` ASC LIMIT 1
it becomes slow, taking about 100 ms.
However, if I change it to
SELECT * FROM `book` WHERE (provider_id = '1' AND source_id = '221630001') LIMIT 1
then it runs normally, taking several ms.
I don't understand why adding ORDER BY id makes the query so much slower. Could anyone explain?
Try selecting only the columns you need (SELECT column_name, ...) instead of *, or refer to this:
Why is my SQL Server ORDER BY slow despite the ordered column being indexed?
I'm not a mysql expert, and not able to perform a detailed analysis, but my guess would be that because you are providing values for the UNIQUE KEY in the WHERE clause, the engine can go and fetch that row directly using an index.
However, when you ask it to ORDER BY the id column, which is a PRIMARY KEY, that changes the access path. The engine now guesses that since it has an index on id, and you want to order by id, it is better to fetch that data in PK order, which will avoid a sort. In this case though, it leads to a slower result, as it has to compare every row to the criteria (a table scan).
Note that this is just conjecture. You would need to EXPLAIN both statements to see what is going on.
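A sketch of how to check that conjecture, using standard EXPLAIN and index-hint syntax (whether the hint actually helps depends on the data):
-- Compare the chosen access paths for the slow and fast variants
EXPLAIN SELECT * FROM `book`
WHERE provider_id = '1' AND source_id = '1037122800'
ORDER BY `book`.`id` ASC LIMIT 1;

EXPLAIN SELECT * FROM `book`
WHERE provider_id = '1' AND source_id = '221630001' LIMIT 1;

-- If the first plan shows key = PRIMARY, forcing the composite unique index usually restores speed
SELECT * FROM `book` FORCE INDEX (`provider`)
WHERE provider_id = '1' AND source_id = '1037122800'
ORDER BY `book`.`id` ASC LIMIT 1;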
I have a large live database where around 1,000 users are making 2 or more updates every minute. At the same time, 4 users are pulling reports and adding new items. The 2 main tables currently contain around 2 million and 4 million rows.
Queries using these tables are taking too much time, even simple queries like:
"SELECT COUNT(*) FROM MyItemsTable" and "SELECT COUNT(*) FROM MyTransactionsTable"
are taking 10 seconds and 26 seconds respectively,
and large reports now take 15 minutes, which is far too long.
All the tables I'm using are InnoDB.
Is there any way to solve this problem before I have to read up on replication?
Thank you in advance for any help
Edit
Here is the structure and indexes of MyItemsTable:
CREATE TABLE `pos_MyItemsTable` (
`itemid` bigint(15) NOT NULL,
`uploadid` bigint(15) NOT NULL,
`itemtypeid` bigint(15) NOT NULL,
`statusid` int(1) NOT NULL,
`uniqueid` varchar(10) DEFAULT NULL,
`referencenb` varchar(30) DEFAULT NULL,
`serialnb` varchar(25) DEFAULT NULL,
`code` varchar(50) DEFAULT NULL,
`user` varchar(16) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
`pass` varchar(100) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
`expirydate` date DEFAULT NULL,
`userid` bigint(15) DEFAULT NULL,
`insertdate` datetime DEFAULT NULL,
`updateuser` bigint(15) DEFAULT NULL,
`updatedate` datetime DEFAULT NULL,
`counternb` int(1) DEFAULT '0',
PRIMARY KEY (`itemid`),
UNIQUE KEY `referencenb_unique` (`referencenb`),
KEY `MyItemsTable_r04` (`itemtypeid`),
KEY `MyItemsTable_r05` (`uploadid`),
KEY `FK_MyItemsTable` (`statusid`),
KEY `ind_MyItemsTable_serialnb` (`serialnb`),
KEY `uniqueid_key` (`uniqueid`),
KEY `ind_MyItemsTable_insertdate` (`insertdate`),
KEY `ind_MyItemsTable_counternb` (`counternb`),
CONSTRAINT `FK_MyItemsTable` FOREIGN KEY (`statusid`) REFERENCES `MyItemsTable_statuses` (`statusid`),
CONSTRAINT `MyItemsTable_r04` FOREIGN KEY (`itemtypeid`) REFERENCES `itemstypes` (`itemtypeid`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `MyItemsTable_r05` FOREIGN KEY (`uploadid`) REFERENCES `uploads` (`uploadid`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Just having a few indexes does not mean your tables and queries are optimized.
Try to identify the queries that run the slowest and add specific indexes there.
Selecting * from a huge table where you have columns that contain text / images / files
will always be slow. Try to limit the selection of such fat columns when you don't need them.
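For example, against the MyItemsTable definition above (the filter values here are made up), selecting only the narrow columns you need keeps the fat columns out of the read path:
-- Narrow projection instead of SELECT *
SELECT itemid, statusid, insertdate
FROM pos_MyItemsTable
WHERE statusid = 1
  AND insertdate >= '2013-01-01';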
Further reading:
http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html
http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/
and some more advanced configurations:
http://www.mysqlperformanceblog.com/2006/09/29/what-to-tune-in-mysql-server-after-installation/
http://www.mysqlperformanceblog.com/2007/11/03/choosing-innodb_buffer_pool_size/
UPDATE:
Try to use composite keys for some of the heaviest queries,
by placing the main fields that are compared in ONE index:
`MyItemsTable_r88` (`itemtypeid`,`statusid`, `serialnb`), ...
This will give you faster results for queries that compare only columns from the index:
SELECT * FROM my_table WHERE `itemtypeid` = 5 AND `statusid` = 0 AND `serialnb` > 500
and extremely fast results if you both search on and select values from the index:
SELECT `serialnb` FROM my_table WHERE `statusid` = 0 AND `itemtypeid` IN (1,2,3);
These are really basic examples; you will have to read a bit more and analyze your data for the best results.
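As a concrete sketch of the composite key above (the index name and column order are taken from the example; reorder the columns to match your own heaviest queries, equality columns first):
ALTER TABLE pos_MyItemsTable
  ADD INDEX MyItemsTable_r88 (itemtypeid, statusid, serialnb);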
Can you please advise why such a query would take so long (literally 20-30 minutes)?
I seem to have proper indexes set up, don't I?
UPDATE `temp_val_import_435` t1, `attr_upc` t2
SET t1.`attr_id` = t2.`id`
WHERE t1.`value` LIKE t2.`upc`
CREATE TABLE `attr_upc` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`upc` varchar(255) NOT NULL,
`last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `upc` (`upc`),
KEY `last_update` (`last_update`)
) ENGINE=InnoDB AUTO_INCREMENT=102739 DEFAULT CHARSET=utf8
CREATE TABLE `temp_val_import_435` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`attr_id` int(11) DEFAULT NULL,
`translation_id` int(11) DEFAULT NULL,
`source_value` varchar(255) NOT NULL,
`value` varchar(255) DEFAULT NULL,
`count` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `core_value_id` (`core_value_id`),
KEY `translation_id` (`translation_id`),
KEY `source_value` (`source_value`),
KEY `value` (`value`),
KEY `count` (`count`)
) ENGINE=InnoDB AUTO_INCREMENT=32768 DEFAULT CHARSET=utf8
Ed Cottrell's solution worked for me. Using = instead of LIKE sped up a smaller test query on 1000 rows by a lot.
I measured two ways: one in phpMyAdmin, the other looking at the time for DOM load (which of course involves other processes).
DOM load went from 44 seconds to 1 second, a decrease of about 98%.
But the difference in query execution time was much more dramatic, going from 43.4 seconds to 0.0052 seconds, a decrease of 99.988%. Pretty good. I will report back on results from huge datasets.
Use = instead of LIKE. = should be much faster than LIKE -- LIKE is only for matching patterns, as in '%something%', which matches anything with "something" anywhere in the text.
If you have this query:
SELECT * FROM myTable where myColumn LIKE 'blah'
MySQL can optimize this by pretending you typed myColumn = 'blah', because it sees that the pattern is fixed and has no wildcards. But what if you have this data in your upc column:
blah
foo
bar
%foo%
%bar
etc.
MySQL can't optimize your query in advance, because it's possible that the text it is trying to match is a pattern, like %foo%. So it has to evaluate the LIKE pattern match for every single value of temp_val_import_435.value against every single value of attr_upc.upc. With a simple = and the indexes you have defined, this is unnecessary, and the query should be dramatically faster.
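A minimal sketch of the rewrite with = (assuming the upc values really are literal strings rather than patterns); with the existing indexes on `value` and `upc`, each row can then be matched via an index lookup instead of a scan:
-- Equality join instead of LIKE
UPDATE `temp_val_import_435` t1
INNER JOIN `attr_upc` t2
        ON t1.`value` = t2.`upc`
SET t1.`attr_id` = t2.`id`;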
In essence you are joining on a LIKE, which is going to be problematic (you would need EXPLAIN to see whether MySQL is utilizing indexes at all). Try this:
UPDATE `temp_val_import_435` t1
INNER JOIN `attr_upc` t2
ON t1.`value` LIKE t2.`upc`
SET t1.`attr_id` = t2.`id` WHERE t1.`value` LIKE t2.`upc`
In MySQL I have a varchar column containing latitude and longitude values provided by Google Maps.
I need to be able to query based on a bounding box value, but have no need for the geo features now available. I'm trying to populate 2 new decimal fields with the decimal values found in the varchar. Here is the query that I'm trying to use, but the results are all rounded values in the new fields.
Sample data:
'45.390746926938185, -122.75535710155964',
'45.416444621636415, -122.63058006763458'
Create Table:
CREATE TABLE IF NOT EXISTS `cameras` (
`id` int(11) NOT NULL auto_increment,
`user_id` int(75) NOT NULL,
`position` varchar(75) NOT NULL,
`latitude` decimal(17,15) default NULL,
`longitude` decimal(18,15) default NULL,
`address` varchar(75) NOT NULL,
`date` varchar(11) NOT NULL,
`status` int(1) NOT NULL default '1',
`duplicate_report` int(11) NOT NULL default '0',
`missing_report` int(11) NOT NULL default '0',
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `status` (`status`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1050 ;
SQL:
UPDATE cameras
SET latitude = CAST( (substring(position,1,locate(',',position))) AS DECIMAL(17,15) ),
longitude = CAST( (substring(position,locate(',',position)+1)) AS DECIMAL(18,15) )
SQL Alternate attempt:
UPDATE cameras
SET latitude = CONVERT( (substring(position,1,locate(',',position))), DECIMAL(17,15) ),
longitude = CONVERT( (substring(position,locate(', ',position)+1)), DECIMAL(18,15) )
The resulting field values for both scenarios are:
45.000000000000000 and -122.000000000000000
AND
45.000000000000000 and -122.000000000000000
Can anyone see what I'm doing wrong?
Thanks.
Both the CAST and CONVERT forms seem to be correct.
SELECT CAST((SUBSTRING(t.position,1,LOCATE(',',t.position))) AS DECIMAL(17,15)) AS lat_
, CONVERT(SUBSTRING(t.position,LOCATE(', ',t.position)+1),DECIMAL(18,15)) AS long_
FROM (SELECT '45.390746926938185, -122.75535710155964' AS `position`) t
lat_ long_
------------------ ----------------------
45.390746926938185 -122.755357101559640
I think position is a reserved word, but I don't think that matters in this case. Still, it wouldn't hurt to assign a table alias and qualify all column references:
UPDATE cameras c
SET c.latitude = CAST((SUBSTRING(c.position,1,LOCATE(',',c.position))) AS DECIMAL(17,15))
, c.longitude = CAST((SUBSTRING(c.position,LOCATE(',',c.position)+1)) AS DECIMAL(18,15))
But I suspect that won't resolve the problem.
One thing to check for is a BEFORE or AFTER UPDATE trigger defined on the table that is rounding/modifying the values assigned to the latitude and longitude columns.
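A quick way to check for such a trigger (standard MySQL syntax; the LIKE pattern matches the table name):
-- List any triggers defined on the cameras table
SHOW TRIGGERS LIKE 'cameras';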
I suggest you try running just a query:
SELECT CAST((SUBSTRING(c.position,1,LOCATE(',',c.position))) AS DECIMAL(17,15)) AS lat_
, CAST((SUBSTRING(c.position,LOCATE(',',c.position)+1)) AS DECIMAL(18,15)) AS lon_
FROM cameras c
and verify that produces the decimal values you expect.
A dot character should be recognized as a decimal point. Does the position column contain some other special characters, like a space or something?
From what you posted, it looks like the CAST and CONVERT are working only on the integer portion up to the decimal point. (There shouldn't be an implicit conversion to signed integer in there, so it's not clear why the characters following the decimal point aren't being included.)
If you can figure out what character(s) are being used to represent the decimal point, then you could use a MySQL REPLACE() function to replace those with a simple dot character.
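If it comes to that, here is a sketch for tracking the character down; which character to strip is an assumption you would confirm from the HEX() output:
-- Inspect the raw bytes of a few position values to spot unexpected characters
SELECT c.position, HEX(c.position) FROM cameras c LIMIT 5;
-- Once identified, strip it and re-run the CAST update; an ordinary space is used
-- here only as a placeholder for whatever character the HEX() output reveals
UPDATE cameras c SET c.position = REPLACE(c.position, ' ', '');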