I have a table, Postcode, which holds all UK postcodes (approximately 1.8 million, I think):
CREATE TABLE `Postcode` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`Postcode` varchar(8) DEFAULT NULL,
`Postcode_Simple` varchar(8) DEFAULT NULL,
`Positional_Quality_Indicator` int(11) DEFAULT NULL,
`Eastings` int(11) DEFAULT NULL,
`Northings` int(11) DEFAULT NULL,
`Latitude` double DEFAULT NULL,
`Longitude` double DEFAULT NULL,
`LatLong` point DEFAULT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `Postcode` (`Postcode`),
KEY `Postcode_Simple` (`Postcode_Simple`),
KEY `LatLong` (`LatLong`(25))
) ENGINE=InnoDB AUTO_INCREMENT=1755933 DEFAULT CHARSET=latin1;
What I want to achieve is this: given a coordinate, locate the postcode nearest to that coordinate. The problem is that I'm having an issue with the query (actually inside a stored procedure) I've written to do this. The query is:
SELECT
Postcode
FROM
(SELECT
Postcode,
GLENGTH(
LINESTRINGFROMWKB(
LINESTRING(
LatLong,
GEOMFROMTEXT(CONCAT('POINT(', varLatitude, ' ', varLongitude, ')'))
)
)
) AS distance
FROM
Postcode
WHERE
NOT LatLong IS NULL) P
ORDER BY
Distance
LIMIT
1;
The problem I'm having is that the query takes some 12 seconds to run, and I cannot have it take that long to return a result. Can anyone think of any ways I can reliably speed this query up?
(Here's the EXPLAIN output for the query.)
id  select_type  table       type  possible_keys  key     key_len  ref     rows     Extra
1   PRIMARY      <derived2>  ALL   (NULL)         (NULL)  (NULL)   (NULL)  1688034  Using filesort
2   DERIVED      Postcode    ALL   LatLong        (NULL)  (NULL)   (NULL)  1717998  Using where
I've been trying to think of a way to narrow down the initial amount of data that I must perform the distance calculation on, but I haven't been able to come up with anything that doesn't restrict to finding postcodes within a given distance.
Maybe try something along the lines of:
SELECT Postcode, lat, lon
FROM
(
SELECT Postcode, MAX(latitude) AS lat, MAX(longitude) AS lon
FROM Postcode
GROUP BY Postcode
HAVING MAX(latitude)<varLatitude AND MAX(longitude)<varLongitude
LIMIT 1
) AS temp
which will basically return a postcode whose lat and lon are both less than the ones you specify, the intent being to get the lat/lon combination closest to your vars from below, and hence the closest postcode. Note that LIMIT 1 without an ORDER BY returns an arbitrary matching row, so you would need to add an ORDER BY to control which one you get. You can try the same using MIN and greater than instead to go the other way round.
The above will only get you a single result/postcode. If you're looking for something niftier, like finding a group of postcodes within a specific radius of a lat/long, then you should have a look at the formula explained at https://developers.google.com/maps/articles/phpsqlsearch_v3#findnearsql
I've written a tutorial on pretty much exactly what you're after.
Basically, you're on the right lines. In order to improve the efficiency of the search, you'll need to reduce the number of GLength() calculations by making use of a spatial index on your LatLong field. If you restrict the search to a smaller area, such as a polygon extending 10 miles around the point you're comparing the postcodes to, you'll find the query is much quicker.
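To illustrate the "restrict the area first" idea outside the database, here is a minimal Python sketch. It is not the tutorial's code: the function names, the starting window of 1 km and the doubling strategy are all assumptions made for illustration. The window filter stands in for what a spatial index would do, and the exact distance is only computed for rows that fall inside the window.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_postcode(rows, lat, lon, start_km=1.0, max_km=2000.0):
    """rows: list of (postcode, latitude, longitude) tuples.

    Filter to a square window first (the part a spatial index would do),
    compute exact distances only for rows inside it, and double the
    window until the best hit lies within the window's half-width.
    """
    half_km = start_km
    while half_km <= max_km:
        # Convert the half-width from kilometres to degrees.
        dlat = math.degrees(half_km / EARTH_RADIUS_KM)
        dlon = dlat / max(math.cos(math.radians(lat)), 1e-6)
        candidates = [
            (haversine_km(lat, lon, r_lat, r_lon), code)
            for code, r_lat, r_lon in rows
            if abs(r_lat - lat) <= dlat and abs(r_lon - lon) <= dlon
        ]
        if candidates:
            dist, code = min(candidates)
            # Only accept a hit that no row outside the window could beat.
            if dist <= half_km:
                return code, dist
        half_km *= 2
    return None
```

In SQL the same shape would be a WHERE clause that limits rows to the window, followed by ORDER BY distance LIMIT 1, repeated with a larger window if nothing is found. The sketch ignores longitude wrap-around, which should not matter for UK data.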
Related
First of all, I am using MySQL 8.
I have a table that stores the states of my country.
My table DDL is:
CREATE TABLE state_v2(
`id` int(11) NOT NULL,
uf_code INT NULL,
uf VARCHAR(2) NOT NULL,
`name` VARCHAR(100) NOT NULL,
latitude FLOAT(8) NOT NULL,
longitude FLOAT(8) NOT NULL,
`country` varchar(75) NOT NULL,
coord POINT SRID 4326 NOT NULL,
PRIMARY KEY (id),
unique (country, uf_code),
unique (latitude, longitude),
index (uf_code)
);
ALTER TABLE state_v2 ADD SPATIAL INDEX(coord);
The user chooses a place, and I use that place as the reference point to find the nearest states within 100 km (the parameter is a valid point).
I am using this query:
SELECT * FROM (
SELECT
sv.*,
ST_distance_sphere(
$param,
sv.coord) as distance
FROM state_v2 sv
WHERE ST_distance_sphere(
$param,
sv.coord) < 100000
) as temp
order by temp.distance;
1 - Is this query correct? I am worried about its performance, since it does a full table scan.
2 - Are the indexes I created correct?
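One common way to avoid a full scan in this kind of query is to pre-filter with a bounding box, which a spatial index can typically serve, and only then apply the exact distance test. The following Python sketch shows how such a box could be computed for a given radius. The function name and the Earth radius constant are assumptions for illustration, and the sketch assumes the circle does not reach a pole or cross the antimeridian.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres (assumed)


def bounding_box(lat, lon, radius_m):
    """Return (min_lat, min_lon, max_lat, max_lon) enclosing a circle
    of radius_m metres around (lat, lon).

    Assumes the circle stays away from the poles and the antimeridian.
    """
    ang = radius_m / EARTH_RADIUS_M  # angular radius in radians
    dlat = math.degrees(ang)
    # Longitude lines converge towards the poles, so the box widens with latitude.
    dlon = math.degrees(math.asin(math.sin(ang) / math.cos(math.radians(lat))))
    return (lat - dlat, lon - dlon, lat + dlat, lon + dlon)
```

For the 100 km case, bounding_box(lat, lon, 100000) gives the corners of a rectangle. In MySQL that rectangle could be turned into a polygon and compared against coord with a function such as MBRContains() before the ST_Distance_Sphere condition is applied. Whether the optimizer actually uses the spatial index should be checked with EXPLAIN.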
I have a database which holds a large number of car locations (coordinates: lat, lng). I want to send coordinates to the server and get the nearby car locations back very quickly. The problem is that there are a great many cars, and with millions of records in the table, a request like
select *
from locations
where (lat >= '$lat-10' and lat <= '$lat+10') and
(lng >= '$lng-10' and lng <= '$lng+10')
might be normal, but it compares millions of coordinates against four conditions and uses a lot of resources. Is there an algorithm to find the locations quickly? I wondered whether it would be a good idea to divide the map of my country into squares and put each section in a separate table. Then, if a user wants to find a car by giving his or her current coordinates, SQL would search only the section (table) the user is currently in. The problem is that the number of tables would grow very large, maybe 100,000 tables!
EDIT
CREATE TABLE `locations` (
`car_id` int(10) UNSIGNED NOT NULL,
`car_code` varchar(8) COLLATE utf8mb4_unicode_ci NOT NULL,
`lat` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
`lng` tinytext COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
ALTER TABLE `locations`
ADD PRIMARY KEY (`car_id`);
Or maybe I should use an index in SQL? If so, how can I do that? Is an index the answer?
How can I find the locations of the cars quickly?
Thanks
Make sure you avoid unnecessary conversions between data types: use the same data type for lat and lng in your table schema and in your query.
(If lat and lng are stored as decimals in the database, don't pass the values as strings; ideally use proper parameter binding.)
Make sure you have a composite index on the locations table covering both columns.
Select only the columns you really need, and add those columns to your composite index, e.g.:
select col1,col2,col3
from locations
where (lat >= :lat-10 and lat <= :lat+10) and (lng >= :lng-10 and lng <= :lng+10);
CREATE INDEX idx_lat_lng ON locations (lat,lng, col1, col2, col3);
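The idea in the question of dividing the map into squares does not need one table per square: the square can be a value computed from the coordinates and stored in an indexed column of a single table. A minimal Python sketch of that computation follows. The cell size of 0.1 degrees and all function names are hypothetical choices for illustration.

```python
import math
from collections import defaultdict

CELL_DEG = 0.1  # cell size in degrees; a hypothetical choice


def cell_id(lat, lng, cell_deg=CELL_DEG):
    """Map a coordinate to an integer (row, col) grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lng / cell_deg))


def neighbour_cells(lat, lng, cell_deg=CELL_DEG):
    """The cell containing the point plus its 8 surrounding cells."""
    row, col = cell_id(lat, lng, cell_deg)
    return [(row + dr, col + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def build_grid(cars, cell_deg=CELL_DEG):
    """cars: iterable of (car_id, lat, lng). Groups car ids by cell,
    playing the role of an index on a cell column."""
    grid = defaultdict(list)
    for car_id, lat, lng in cars:
        grid[cell_id(lat, lng, cell_deg)].append(car_id)
    return grid


def cars_near(grid, lat, lng, cell_deg=CELL_DEG):
    """Car ids in the user's cell and the 8 cells around it."""
    return [c for cell in neighbour_cells(lat, lng, cell_deg) for c in grid.get(cell, [])]
```

In the database, the equivalent would be two extra integer columns (or one combined value) with an index on them, and a query that looks up the nine cells around the user instead of scanning every row. The neighbouring cells are included so that cars just across a cell border are not missed.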
I have a complicated issue, but rather than go into the specifics I have simplified it to the following.
Let's say we are trying to build a system where users can apply for priority levels on various services on a per-zip-code basis. This system would have four tables like so:
CREATE TABLE `zip_code` (
`zip` varchar(7) NOT NULL DEFAULT '',
`lat` float NOT NULL DEFAULT '0',
`long` float NOT NULL DEFAULT '0',
PRIMARY KEY (`zip`,`lat`,`long`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `user` (
`user_id` int(10) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`user_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `service` (
`service_id` int(10) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`service_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `service_priority` (
`user_id` int(10) NOT NULL,
`service_id` int(10) NOT NULL,
`zip` varchar(7) NOT NULL,
`priority` tinyint(1) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Now let's also say that we have 45,000 zip codes, a few hundred services and a few thousand users, and that no user can have the same priority level as another user for the same service in the same zip code.
I need a query that, given a particular zip code, radius, service and user_id, will return the highest available priority level for all other zip codes within that radius for that service.
I would also like to hear any suggestions for restructuring this data.
The problem I see happening here is that as the user base grows, the service_priority table is going to get huge: in theory 45,000 rows bigger for every user, although in practice probably only 10,000 rows bigger.
What can I do to mitigate these problems?
Switch to InnoDB.
zip_code table should probably have PRIMARY KEY(zip) unless you really want multiple rows for a given zip.
"no user can have the same priority level as another user for the same service in the same zip code" -- can be enforced by
service_priority : UNIQUE(service_id, user_id, zip)
Then your query may look something like
SELECT sp.*
FROM ( SELECT b.zip
FROM ( SELECT lat, `long` FROM zip_code WHERE zip = '$zip' ) AS a
JOIN zip_code AS b
WHERE ... < $radius
) AS z
JOIN service_priority AS sp
WHERE sp.zip = z.zip
AND sp.user_id = $user_id
AND sp.service_id = $service_id
ORDER BY sp.priority DESC
LIMIT 1
Notes:
The index, above, is also tailored for this query.
The innermost query gets the one lat/lng for the center point.
The middle query focuses on finding the nearby zips. See the tag I added to find many questions discussing how to do that.
The outer query then filters results based on user and service.
Finally, the highest priority row is picked.
In MySQL I have a varchar column containing a latitude and longitude provided by Google Maps.
I need to be able to query based on a bounding box, but have no need for the geo features now available. I'm trying to populate two new DECIMAL fields with the decimal values found in the varchar. Here is the query I'm trying to use, but the results in the new fields are all rounded values.
Sample data:
'45.390746926938185, -122.75535710155964',
'45.416444621636415, -122.63058006763458'
Create Table:
CREATE TABLE IF NOT EXISTS `cameras` (
`id` int(11) NOT NULL auto_increment,
`user_id` int(75) NOT NULL,
`position` varchar(75) NOT NULL,
`latitude` decimal(17,15) default NULL,
`longitude` decimal(18,15) default NULL,
`address` varchar(75) NOT NULL,
`date` varchar(11) NOT NULL,
`status` int(1) NOT NULL default '1',
`duplicate_report` int(11) NOT NULL default '0',
`missing_report` int(11) NOT NULL default '0',
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `status` (`status`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1050 ;
SQL:
UPDATE cameras
SET latitude = CAST( (substring(position,1,locate(',',position))) AS DECIMAL(17,15) ),
longitude = CAST( (substring(position,locate(',',position)+1)) AS DECIMAL(18,15) )
SQL Alternate attempt:
UPDATE cameras
SET latitude = CONVERT( (substring(position,1,locate(',',position))), DECIMAL(17,15) ),
longitude = CONVERT( (substring(position,locate(', ',position)+1)), DECIMAL(18,15) )
The resulting field values are for both scenarios:
45.000000000000000 and -122.000000000000000
AND
45.000000000000000 and -122.000000000000000
Can anyone see what I'm doing wrong?
Thanks.
Both the CAST and CONVERT forms seem to be correct.
SELECT CAST((SUBSTRING(t.position,1,LOCATE(',',t.position))) AS DECIMAL(17,15)) AS lat_
, CONVERT(SUBSTRING(t.position,LOCATE(', ',t.position)+1),DECIMAL(18,15)) AS long_
FROM (SELECT '45.390746926938185, -122.75535710155964' AS `position`) t
lat_ long_
------------------ ----------------------
45.390746926938185 -122.755357101559640
I think position is a reserved word, but I don't think that matters in this case. Still, it wouldn't hurt to assign a table alias and qualify all column references:
UPDATE cameras c
SET c.latitude = CAST((SUBSTRING(c.position,1,LOCATE(',',c.position))) AS DECIMAL(17,15))
, c.longitude = CAST((SUBSTRING(c.position,LOCATE(',',c.position)+1)) AS DECIMAL(18,15))
But I suspect that won't resolve the problem.
One thing to check is whether a BEFORE or AFTER UPDATE trigger is defined on the table that rounds or modifies the values assigned to the latitude and longitude columns.
I suggest you try running just a SELECT query:
SELECT CAST((SUBSTRING(c.position,1,LOCATE(',',c.position))) AS DECIMAL(17,15)) AS lat_
, CAST((SUBSTRING(c.position,LOCATE(',',c.position)+1)) AS DECIMAL(18,15)) AS lon_
FROM cameras c
and verify that produces the decimal values you expect.
A dot character should be recognized as a decimal point. Does the position column contain some other special characters, like a space or something?
From what you posted, it looks like the CAST and CONVERT are working only on the integer portion, up to the decimal point. (There shouldn't be an implicit conversion to signed integer in there, so it's not clear why the characters following the decimal point aren't being included.)
If you can figure out what character(s) are being used to represent the decimal point, then you could use a MySQL REPLACE() function to replace those with a simple dot character.
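If it is easier to inspect a few exported position values outside the database, a small Python sketch like the one below can show whether a value parses cleanly and which unexpected characters it contains. The function names are made up for illustration; this only checks the data and is not a replacement for the UPDATE.

```python
from decimal import Decimal


def split_position(position):
    """Split a 'lat, lng' string into two Decimal values.

    Raises decimal.InvalidOperation if either part is not a plain
    decimal number (for example, if it uses a comma as decimal point).
    """
    lat_text, lng_text = position.split(",", 1)
    return Decimal(lat_text.strip()), Decimal(lng_text.strip())


def odd_characters(position):
    """Characters other than digits, '.', ',', '-' and a plain space."""
    return [ch for ch in position if ch not in "0123456789.,- "]
```

Running odd_characters() over the stored values should return an empty list for clean data. Anything it does return, such as a non-breaking space, is a candidate for REPLACE() before the CAST.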
I've taken a zip code database and pre-calculated the distances to other zip codes within a certain radius. The database itself is about 2.5GB so not anything extraordinary.
The goal of doing this is to be able to:
select * from zipcode_distances where zipcode_from=92101 and distance < 10;
So far the only index I've defined is:
(zipcode_from, distance)
However, running the query takes about 20 seconds to get the results.
When I remove the "and distance < 10" clause, the results are instantaneous.
Any advice would be appreciated.
Edit:
Here is the create statement:
delimiter $$
CREATE TABLE `zipcode_distances` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`zipcode_from` char(5) COLLATE utf8_bin NOT NULL,
`zipcode_to` char(5) COLLATE utf8_bin NOT NULL,
`distance` double unsigned NOT NULL COMMENT 'stored in miles',
PRIMARY KEY (`id`),
KEY `idx_zip_from_distance` (`zipcode_from`,`distance`)
) ENGINE=MyISAM AUTO_INCREMENT=62548721 DEFAULT CHARSET=utf8 COLLATE=utf8_bin$$
Here is the explain:
explain extended select * from zipcode_distances where zipcode_from=90210 and distance < 10;
Results:
id, select_type, table, type, possible_keys, key, key_len, ref, rows, filtered, Extra
1, SIMPLE, zipcode_distances, ALL, idx_zip_from_distance, null, null, null, 62548720, 100.00, Using where
Thank you!
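In principle I see no problem with MySQL using the index for this query. I do wonder if the type conversion from 92101 could be confusing it: zipcode_from is a char(5) column, and comparing it with a numeric literal forces a conversion that can stop MySQL from using the index, which would match the full scan shown in your EXPLAIN.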
Do you get the same poor performance with this?
select * from zipcode_distances where zipcode_from='92101' and distance < 10;
The other issue is how you are doing the timings. You have to run the query several times to avoid the effects of filling the cache.