Slow location based search result query - mysql

I have a query that I use to find results that are ordered by location. Results also have to account for VAT so this is also in the query. The query can unfortunately take 4+ seconds to run when not cached. Can anyone spot any glaringly obvious issues or suggest anything I can do to improve it?
Just to clarify what is happening in the query:
The distance calculation is a Euclidean distance using lat/long
The incvat fields are used to show the price when VAT is included
The WHEN / THEN statement is used to put prices of 0 at the very bottom
The query:
SELECT * , ROUND( SQRT( POW( ( 69.1 * ( company_branch_lat - 52.4862 ) ) , 2 ) + POW( ( 53 * ( company_branch_lng - - 1.8905 ) ) , 2 ) ) , 1 ) AS distance,
hire_car_day + ( hire_car_day * 0.2 * ! hire_car_incvat ) AS hire_car_day_incvat,
hire_car_addday + ( hire_car_addday * 0.2 * ! hire_car_incvat ) AS hire_car_addday_incvat,
hire_car_week + ( hire_car_week * 0.2 * ! hire_car_incvat ) AS hire_car_week_incvat,
hire_car_weekend + ( hire_car_weekend * 0.2 * ! hire_car_incvat ) AS hire_car_weekend_incvat
FROM hire_car
LEFT JOIN company_branch ON company_branch_id = hire_car_branchid
LEFT JOIN hire_cartypelink ON hire_cartypelink_carhireid = hire_car_id
LEFT JOIN users ON company_branch_userid = user_id
WHERE 1
GROUP BY hire_car_id
HAVING distance <=30
ORDER BY CASE hire_car_day_incvat
WHEN 0
THEN 40000
ELSE hire_car_day_incvat
END , distance ASC
LIMIT 0 , 30
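To make the three pieces described above easier to follow, here is a minimal Python sketch of the same arithmetic. The constants 69.1, 53, 0.2 and 40000 come from the query; the function names and the sample rows are made up for illustration.

```python
import math

def approx_distance(lat, lng, origin_lat=52.4862, origin_lng=-1.8905):
    """Flat-plane approximation from the query: 69.1 units per degree of
    latitude and 53 units per degree of longitude, rounded to 1 decimal."""
    dy = 69.1 * (lat - origin_lat)
    dx = 53 * (lng - origin_lng)
    return round(math.sqrt(dx ** 2 + dy ** 2), 1)

def price_inc_vat(price, already_includes_vat, vat_rate=0.2):
    """Mirror of `price + (price * 0.2 * !incvat)`: add VAT only if missing."""
    return price + price * vat_rate * (0 if already_includes_vat else 1)

def sort_key(row):
    """Rows priced 0 are treated as 40000 so they sort last; ties go to the
    nearest row, as in the ORDER BY clause."""
    price = row["day_incvat"]
    return (40000 if price == 0 else price, row["distance"])

# Hypothetical rows, already reduced to the two values the ORDER BY uses.
rows = [
    {"id": 1, "day_incvat": 60.0, "distance": 12.0},
    {"id": 2, "day_incvat": 0, "distance": 1.0},
    {"id": 3, "day_incvat": 60.0, "distance": 3.0},
]
ordered_ids = [r["id"] for r in sorted(rows, key=sort_key)]
```

Note that every row has to be evaluated before the sort can happen, which is the same work the database does when no index can narrow the candidates first.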

You can use the MySQL spatial extension, save the latitude and longitude as a point datatype, and put a spatial index on it. In general, such an index lets you reorder the coordinates along a curve, which reduces the dimension while preserving spatial information. You can use the spatial index with a bounding box to filter the query and then use the haversine formula to pick the optimal result. Your bounding box should be bigger than the radius of the great circle. MySQL uses an R-tree for its spatial index; my description above was about a Z curve or a Hilbert curve: https://softwareengineering.stackexchange.com/questions/113256/what-is-the-difference-between-btree-and-rtree-indexing.
Then you can insert a geocoordinate directly into a point column: http://dev.mysql.com/doc/refman/5.0/en/creating-spatial-values.html. Or you can use a geometry datatype: http://markmaunder.com/2009/10/10/mysql-gis-extensions-quick-start/. Then you can use the MBRContains function like so: http://dev.mysql.com/doc/refman/4.1/en/relations-on-geometry-mbr.html or any of the other functions: http://dev.mysql.com/doc/refman/5.5/en/functions-for-testing-spatial-relations-between-geometric-objects.html. Hence you need a bounding box.
Here are some examples:
Storing Lat Lng values in MySQL using Spatial Point Type
https://gis.stackexchange.com/questions/28333/how-to-speed-up-this-simple-mysql-points-in-the-box-query
Here is a simple example with point datatype:
CREATE SPATIAL INDEX sx_place_location ON place (location);
SELECT * FROM place
WHERE MBRContains
(
LineString
(
Point($x - $radius, $y - $radius),
Point($x + $radius, $y + $radius)
),
location
)
AND Distance(Point($x, $y), location) <= $radius;
MySQL latitude and Longitude table setup.
I'm not sure whether it works, because it uses a radius variable with a bounding-box function. It seems to me that MBRWithin is a bit simpler, because it doesn't need any argument: Mysql: Optimizing finding super node in nested set tree.
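The two-step idea (cheap bounding-box filter first, exact great-circle check second) can be sketched outside the database. This is only an illustration under stated assumptions: the function names are made up, the Earth is treated as a sphere of radius 6371 km, and the box formula assumes the search circle does not reach a pole or cross the 180° meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two lat/lng points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def bounding_box(lat, lng, radius_km):
    """Return (min_lat, min_lng, max_lat, max_lng) enclosing the search circle."""
    angular = radius_km / EARTH_RADIUS_KM
    dlat = math.degrees(angular)
    dlng = math.degrees(math.asin(math.sin(angular) / math.cos(math.radians(lat))))
    return (lat - dlat, lng - dlng, lat + dlat, lng + dlng)

def within_radius(points, lat, lng, radius_km):
    """Filter (lat, lng) tuples: box test first, exact distance second."""
    min_lat, min_lng, max_lat, max_lng = bounding_box(lat, lng, radius_km)
    candidates = [p for p in points
                  if min_lat <= p[0] <= max_lat and min_lng <= p[1] <= max_lng]
    return [p for p in candidates
            if haversine_km(lat, lng, p[0], p[1]) <= radius_km]

# Sample coordinates: Birmingham, Coventry, London (approximate values).
points = [(52.4862, -1.8905), (52.4068, -1.5197), (51.5074, -0.1278)]
nearby = within_radius(points, 52.4862, -1.8905, 50)
```

In the database, the box test is the part a spatial index can answer quickly; the haversine check then only runs on the few rows that survive it.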

You are using a GROUP BY clause together with HAVING, although I don't see any aggregate functions anywhere in the query. I recommend rewriting the query like this to see whether it makes any difference:
SELECT * , ROUND( SQRT( POW( ( 69.1 * ( company_branch_lat - 52.4862 ) ) , 2 ) + POW( ( 53 * ( company_branch_lng - - 1.8905 ) ) , 2 ) ) , 1 ) AS distance,
hire_car_day + ( hire_car_day * 0.2 * ! hire_car_incvat ) AS hire_car_day_incvat,
hire_car_addday + ( hire_car_addday * 0.2 * ! hire_car_incvat ) AS hire_car_addday_incvat,
hire_car_week + ( hire_car_week * 0.2 * ! hire_car_incvat ) AS hire_car_week_incvat,
hire_car_weekend + ( hire_car_weekend * 0.2 * ! hire_car_incvat ) AS hire_car_weekend_incvat
FROM hire_car
LEFT JOIN company_branch ON company_branch_id = hire_car_branchid
LEFT JOIN hire_cartypelink ON hire_cartypelink_carhireid = hire_car_id
LEFT JOIN users ON company_branch_userid = user_id
WHERE ROUND( SQRT( POW( ( 69.1 * ( company_branch_lat - 52.4862 ) ) , 2 ) + POW( ( 53 * ( company_branch_lng - - 1.8905 ) ) , 2 ) ) , 1 ) <= 30
ORDER BY CASE hire_car_day_incvat
WHEN 0
THEN 40000
ELSE hire_car_day_incvat
END , distance ASC
LIMIT 0 , 30

Related

Combining SQL Google Search with own SQL query

I am pretty new to SQL queries.
I have a Google SQL search example:
SELECT cID,
(6371 * acos
(
cos(radians(51.455643))
* cos(radians(latCord))
* cos(radians(longCord) - radians(7.011555))
+ sin(radians(51.455643))
* sin(radians(latCord))
)
) AS distance
FROM breitengrade
HAVING distance < 50
ORDER BY distance
LIMIT 0, 20
and my own SQL query:
SELECT breitengrade.cID
,breitengrade.latCord
,breitengrade.longCord
,Pages.cIsActive
FROM breitengrade
INNER JOIN Pages ON breitengrade.cID = Pages.cID
WHERE cIsActive = '1'
How can I combine these 2 queries into one so that I can get one single result set?
SELECT breitengrade.cID,
breitengrade.latCord,
breitengrade.longCord,
Pages.cIsActive,
(6371 * acos
(
cos(radians(51.455643))
* cos(radians(latCord))
* cos(radians(longCord) - radians(7.011555))
+ sin(radians(51.455643))
* sin(radians(latCord))
)
) AS distance
FROM breitengrade
INNER JOIN Pages ON breitengrade.cID = Pages.cID
WHERE cIsActive = '1'
HAVING distance < 50
ORDER BY distance
LIMIT 0, 20
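As a cross-check of what the combined query computes, here is a small Python sketch of the same logic: keep only active pages, compute the distance with the spherical law of cosines (the formula inside the `acos`), keep rows under 50 km, sort, and take the first 20. The helper names and sample rows are hypothetical; only the column names and constants come from the queries.

```python
import math

def distance_km(lat1, lng1, lat2, lng2, radius_km=6371):
    """Spherical law of cosines, matching the 6371 * acos(...) expression."""
    cos_angle = (math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
                 * math.cos(math.radians(lng2) - math.radians(lng1))
                 + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2)))
    # Rounding can push the value slightly outside [-1, 1], where Python's
    # math.acos raises ValueError, so clamp it first.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius_km * math.acos(cos_angle)

def nearby_active(rows, lat, lng, max_km=50, limit=20):
    """WHERE cIsActive = '1' ... HAVING distance < 50 ORDER BY distance LIMIT 20."""
    hits = []
    for row in rows:
        if row["cIsActive"] != "1":
            continue
        d = distance_km(lat, lng, row["latCord"], row["longCord"])
        if d < max_km:
            hits.append((d, row["cID"]))
    hits.sort()
    return [cid for _, cid in hits[:limit]]

# Hypothetical joined rows (breitengrade INNER JOIN Pages).
rows = [
    {"cID": 1, "latCord": 51.455643, "longCord": 7.011555, "cIsActive": "1"},
    {"cID": 2, "latCord": 51.5, "longCord": 7.1, "cIsActive": "0"},
    {"cID": 3, "latCord": 52.52, "longCord": 13.405, "cIsActive": "1"},
    {"cID": 4, "latCord": 51.2277, "longCord": 6.7735, "cIsActive": "1"},
]
result = nearby_active(rows, 51.455643, 7.011555)
```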

Using query result data for another query

The following query works for me. However, is there a way to speed it up (table1 and table2 each contain more than 300000 entries)? I would also like to query more data in the subquery and use the distance just to filter results. The two tables do not have much in common except the lat/lon.
SELECT `lat1`,
`lon1`,
(SELECT Sqrt(Pow( 69.1 * ( `lat1` - `lat2` ), 2 )
+ Pow( 69.1 * ( `lon2` - `lon1` ) * Cos( `lat` / 57.3 ), 2 )
) AS
distance
FROM table1
ORDER BY distance
LIMIT 0, 1) AS `test`
FROM `table2`
Thanks in advance

Relevancy MySQL query to replace stored procedure

I've taken over a project built by my predecessor. That project contains a stored procedure originally taken from here:
https://stackoverflow.com/a/9100182/439925
Currently this is preventing us from updating MySQL, so I have been attempting to remove it and replace the calls to subStringCount() with an adjustment to the top answer on that question.
(( Round (( Char_length(`title`) - Char_length(REPLACE(`title`, 'info', "")) ) / Char_length('info')) * 30 )) AS `title_score`,
The queries are used to count the number of times a search string occurs in a number of fields and then order by the total. Unfortunately I can't get the new query results to match the old ones.
The 2 full queries are as follows:
Old WITH the stored Proc:
SELECT SQL_CALC_FOUND_ROWS `temp`.*,
( `title_score` + `source_score`
+ `abstract_score` + `authors_score`
+ `drugs_score` + `uploader_score`
+ `area_score`
+ Ifnull(`document_content_score`, 0) ) AS
`relevance`
FROM (SELECT `kb_uploads`.*,
(( Substringcount(`title`, 'info') * 30 )) AS `title_score`,
(( Substringcount(`source`, 'info') * 15 )) AS `source_score`,
(( Substringcount(`abstract`, 'info') * 20 )) AS `abstract_score`,
(( Substringcount(`authors`, 'info') * 30 )) AS `authors_score`,
(( Substringcount(`drugs`, 'info') * 20 )) AS `drugs_score`,
(( Substringcount(`kb_users`.`name`, 'info') * 20 )) AS `uploader_score`,
(( Substringcount(`kb_upload_areas`.`name`, 'info') * 20 )) AS `area_score`,
( `content_tbl`.`index_score` * 1 ) AS `document_content_score`
FROM `kb_uploads`
LEFT JOIN `kb_users`
ON `kb_users`.`id` = `kb_uploads`.`uploader`
LEFT JOIN `kb_upload_areas`
ON `kb_upload_areas`.`id` = `kb_uploads`.`area`
LEFT JOIN (SELECT `upload`,
Sum(`weighting`) AS `index_score`
FROM `kb_search_index`
WHERE `word` = 'info'
GROUP BY `upload`) AS `content_tbl`
ON `content_tbl`.`upload` = `kb_uploads`.`id`) AS `temp`
WHERE `is_deleted` = 0
HAVING `relevance` > 0
ORDER BY `relevance` DESC
LIMIT 10 OFFSET 0
New WITHOUT the stored Proc:
SELECT SQL_CALC_FOUND_ROWS `temp`.*,
( `title_score` + `source_score`
+ `abstract_score` + `authors_score`
+ `drugs_score` + `uploader_score`
+ `area_score`
+ Ifnull(`document_content_score`, 0) ) AS
`relevance`
FROM (SELECT `kb_uploads`.*,
(( Round (( Char_length(`title`) - Char_length(REPLACE(`title`, 'info', "")) ) / Char_length('info')) * 30 )) AS `title_score`,
(( Round (( Char_length(`source`) - Char_length(REPLACE(`source`,'info' , "")) ) / Char_length('info')) * 15 )) AS `source_score`,
(( Round (( Char_length(`abstract`) - Char_length(REPLACE(`abstract`, 'info', "" ) ) ) / Char_length('info')) * 20 )) AS `abstract_score`,
(( Round (( Char_length(`authors`) - Char_length(REPLACE(`authors`,'info', "")))/Char_length('info')) * 30 )) AS `authors_score`,
(( Round (( Char_length(`drugs`) - Char_length(REPLACE(`drugs`,'info',"")) ) / Char_length('info')) * 20 )) AS `drugs_score`,
(( Round (( Char_length(`kb_users`.`name`) - Char_length(REPLACE(`kb_users`.`name`,'info',""))) / Char_length('info')) * 20 )) AS `uploader_score`,
(( Round (( Char_length(`kb_upload_areas`.`name`) - Char_length(REPLACE(`kb_upload_areas`.`name`,'info', "")) ) / Char_length('info')) * 20 )) AS `area_score`,
( `content_tbl`.`index_score` * 1 ) AS `document_content_score`
FROM `kb_uploads`
LEFT JOIN `kb_users`
ON `kb_users`.`id` = `kb_uploads`.`uploader`
LEFT JOIN `kb_upload_areas`
ON `kb_upload_areas`.`id` = `kb_uploads`.`area`
LEFT JOIN (SELECT `upload`,
Sum(`weighting`) AS `index_score`
FROM `kb_search_index`
WHERE `word` = 'info'
GROUP BY `upload`) AS `content_tbl`
ON `content_tbl`.`upload` = `kb_uploads`.`id`) AS `temp`
WHERE `is_deleted` = 0
HAVING `relevance` > 0
ORDER BY `relevance` DESC
LIMIT 10 OFFSET 0
There are 2 main issues.
1) Ifnull is not working in the NEW query. The table column contains mostly NULL instead of 0.
2) The relevancy numbers in the new query don't match the numbers in the old one, possibly something to do with IFNULL not working.
The full queries are constructed in PHP. I have left the code out as the logic hasn't changed, only the string concatenation that replaces the stored procedure.
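For reference, the length-difference trick used in the new query can be sketched in Python. This is only an illustration with made-up helper names; it does not reproduce the stored function, whose body is not shown here. It does highlight two things worth checking: the trick is case-sensitive, and a missing (NULL) input yields NULL rather than 0, which then makes a plain sum of scores NULL unless each term is wrapped.

```python
def substring_count(text, needle):
    """Count non-overlapping, case-sensitive occurrences of needle in text
    via (len(text) - len(text without needle)) / len(needle).
    Returns None for a None input, mirroring SQL NULL propagation."""
    if text is None:
        return None
    return (len(text) - len(text.replace(needle, ""))) // len(needle)

def coalesce(value, default=0):
    """Python stand-in for IFNULL(value, default)."""
    return default if value is None else value

def relevance(fields, needle, weights):
    """Weighted sum of per-field counts, treating missing fields as 0."""
    total = 0
    for name, weight in weights.items():
        total += coalesce(substring_count(fields.get(name), needle)) * weight
    return total

# Hypothetical row: the abstract is missing, as after an unmatched LEFT JOIN.
row = {"title": "info info", "abstract": None}
score = relevance(row, "info", {"title": 30, "abstract": 20})
```

Without the `coalesce` around each term, one missing field would be enough to lose the whole score, so it may be worth checking which columns in the real data can be NULL.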

Different calculation figure in arithmetic TSQL

I am getting different figures from the equations below.
SELECT 9.36 + 9.36 / ( 284.36 ) * 15.64 = 9.8748065528 (CORRECT)
SELECT 18.72 / ( 284.36 ) * 15.64 = 1.0296131056 (INCORRECT)
I have the total (9.36 * 2), which I am putting in the second SELECT statement, and it gives an incorrect amount.
What am I doing wrong?
The equations are not the same.
The second statement is the equivalent of:
SELECT (9.36 + 9.36) / ( 284.36 ) * 15.64
not
SELECT 9.36 + 9.36 / ( 284.36 ) * 15.64
Remember your Order of Operations.
THIS is what I want.
SELECT 9.36 + (9.36 / ( 284.36 ) * 15.64)

How to make this query faster

SELECT D.A1, D.A2, D.A3 from D, l1
,l2 where
l1.Z = "90001"and substring(D.A1,1,5) = substring(l2.A1,1,5) and substring(D.A2,1,5) = substring(l2.A2,1,5)
AND 3959 * (PI()/180) * SQRT(
POW( (l2.A2-l1.A2)*COS((PI()/180)*(l2.A1+l2.A1)/2), 2 ) +
POW( l2.A1-l1.A1, 2 ) ) <= 10;
This is taking forever to run. I am not sure how to make this faster.
What indexes do you have? Have you got an index on l1.Z? If not, this would be an obvious way to start.
You could also store the results of expressions like substring(l2.A1,1,5) in the table itself. This would allow you to index the results of the expressions, meaning that your join will be faster.
It also looks like you are trying to find things that are "nearby" in space. If so, have a look at MySQL's spatial extensions.
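The suggestion about storing the substring results can be sketched as follows. This uses Python's built-in sqlite3 module as a stand-in for MySQL so the example is runnable anywhere; the `A1_prefix`/`A2_prefix` columns, the index names, and the sample data are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Extra columns hold the 5-character prefixes so they can be indexed.
cur.execute("CREATE TABLE D (A1 TEXT, A2 TEXT, A3 TEXT, A1_prefix TEXT, A2_prefix TEXT)")
cur.execute("CREATE TABLE l2 (A1 TEXT, A2 TEXT, A1_prefix TEXT, A2_prefix TEXT)")

cur.executemany("INSERT INTO D (A1, A2, A3) VALUES (?, ?, ?)", [
    ("34.052235", "-118.24368", "first"),
    ("40.712776", "-74.005974", "second"),
])
cur.executemany("INSERT INTO l2 (A1, A2) VALUES (?, ?)", [
    ("34.052999", "-118.24999"),
])

# Compute the prefixes once, at write time, instead of on every join.
cur.execute("UPDATE D SET A1_prefix = substr(A1, 1, 5), A2_prefix = substr(A2, 1, 5)")
cur.execute("UPDATE l2 SET A1_prefix = substr(A1, 1, 5), A2_prefix = substr(A2, 1, 5)")

cur.execute("CREATE INDEX idx_D_prefix ON D (A1_prefix, A2_prefix)")
cur.execute("CREATE INDEX idx_l2_prefix ON l2 (A1_prefix, A2_prefix)")

# The join now compares plain columns, which an index can serve.
matches = cur.execute(
    "SELECT D.A3 FROM D "
    "JOIN l2 ON D.A1_prefix = l2.A1_prefix AND D.A2_prefix = l2.A2_prefix "
    "ORDER BY D.A3"
).fetchall()
```

The trade-off is that the prefix columns have to be kept in sync whenever A1 or A2 changes.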
The design of your database looks weird to me, and I'm pretty sure MySQL can't use an index with these conditions:
substring(D.A1,1,5) = substring(l2.A1,1,5) and substring(D.A2,1,5) = substring(l2.A2,1,5)
Hence the bad performance :/
You should start by rewriting the query with joins, and see if that helps anything:
select D.A1, D.A2, D.A3
from D
inner join l2 on substring(D.A1,1,5) = substring(l2.A1,1,5) and substring(D.A2,1,5) = substring(l2.A2,1,5)
inner join l1 on 3959 * (PI()/180) * SQRT( POW( (l2.A2-l1.A2)*COS((PI()/180)*(l2.A1+l2.A1)/2), 2 ) + POW( l2.A1-l1.A1, 2 ) ) <= 10
where l1.Z = "90001"
coll_cs, you can try this:
select D1.A1, D1.A2, D1.A3
from
(
SELECT substring(D.A1,1,5) AS A1,
substring(D.A2,1,5) AS A2,
D.A3 AS A3
from D
) as D1
,
(
select
substring(l2.A1,1,5) AS A1,
substring(l2.A2,1,5) AS A2
from l1,l2
where l1.Z = "90001"
and
3959 *
(PI()/180) *
SQRT(
POW( (l2.A2-l1.A2)*COS((PI()/180)*(l2.A1+l2.A1)/2), 2 ) +
POW( l2.A1-l1.A1, 2 )
) <= 10
) as L
where D1.A1 = L.A1
and D1.A2 = L.A2
I don't have enough information about the tables, such as field types, number of rows, keys and indexes.