SELECT D.A1, D.A2, D.A3
FROM D, l1, l2
WHERE l1.Z = "90001"
AND substring(D.A1,1,5) = substring(l2.A1,1,5)
AND substring(D.A2,1,5) = substring(l2.A2,1,5)
AND 3959 * (PI()/180) * SQRT(
POW( (l2.A2-l1.A2)*COS((PI()/180)*(l2.A1+l2.A1)/2), 2 ) +
POW( l2.A1-l1.A1, 2 ) ) <= 10
This is taking forever to run. I am not sure how to make this faster.
What indexes do you have? Have you got an index on l1.Z? If not, this would be an obvious way to start.
You could also store the results of expressions like substring(l2.A1,1,5) in the table itself. This would allow you to index the results of the expressions, meaning that your join will be faster.
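As a rough illustration of storing the substring results and indexing them, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows, the A1_prefix/A2_prefix column names and the index names are all made up for the example.

```python
import sqlite3

# Hypothetical miniature versions of tables D and l2; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE D  (A1 TEXT, A2 TEXT, A3 TEXT);
    CREATE TABLE l2 (A1 TEXT, A2 TEXT);
    INSERT INTO D  VALUES ('34.052235', '-118.2437', 'row-a'),
                          ('40.712776', '-74.00597', 'row-b');
    INSERT INTO l2 VALUES ('34.0529', '-118.249'),
                          ('41.8781', '-87.6298');

    -- Store the 5-character prefixes once (assumed column names), then index them.
    ALTER TABLE D  ADD COLUMN A1_prefix TEXT;
    ALTER TABLE D  ADD COLUMN A2_prefix TEXT;
    ALTER TABLE l2 ADD COLUMN A1_prefix TEXT;
    ALTER TABLE l2 ADD COLUMN A2_prefix TEXT;
    UPDATE D  SET A1_prefix = substr(A1, 1, 5), A2_prefix = substr(A2, 1, 5);
    UPDATE l2 SET A1_prefix = substr(A1, 1, 5), A2_prefix = substr(A2, 1, 5);
    CREATE INDEX idx_d_prefix  ON D  (A1_prefix, A2_prefix);
    CREATE INDEX idx_l2_prefix ON l2 (A1_prefix, A2_prefix);
""")

# Join on expressions: the engine has to evaluate substr() for every row pair.
slow_sql = """
    SELECT D.A3 FROM D
    JOIN l2 ON substr(D.A1, 1, 5) = substr(l2.A1, 1, 5)
           AND substr(D.A2, 1, 5) = substr(l2.A2, 1, 5)
"""
# Join on the stored prefixes: plain column comparisons that an index can serve.
fast_sql = """
    SELECT D.A3 FROM D
    JOIN l2 ON D.A1_prefix = l2.A1_prefix
           AND D.A2_prefix = l2.A2_prefix
"""
slow_rows = conn.execute(slow_sql).fetchall()
fast_rows = conn.execute(fast_sql).fetchall()
print(slow_rows, fast_rows)
```

Both queries return the same rows; the difference is only in how the join can be executed. The prefix columns have to be kept up to date whenever A1 or A2 changes.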
It also looks like you are trying to find things that are "nearby" in space. If so, have a look at MySQL's spatial extensions.
The design of your database looks weird to me, and I'm pretty sure MySQL can't use an index with these conditions:
substring(D.A1,1,5) = substring(l2.A1,1,5) and substring(D.A2,1,5) = substring(l2.A2,1,5)
Hence the bad performance.
You should start by rewriting the query with joins, and see if that helps anything:
select D.A1, D.A2, D.A3
from D
inner join l2 on substring(D.A1,1,5) = substring(l2.A1,1,5) and substring(D.A2,1,5) = substring(l2.A2,1,5)
inner join l1 on 3959 * (PI()/180) * SQRT( POW( (l2.A2-l1.A2)*COS((PI()/180)*(l2.A1+l2.A1)/2), 2 ) + POW( l2.A1-l1.A1, 2 ) ) <= 10
where l1.Z = "90001"
coll_cs, you can try this:
select D1.A1, D1.A2, D1.A3
from
(
SELECT substring(D.A1,1,5) AS A1,
substring(D.A2,1,5) AS A2,
D.A3 AS A3
from D
) as D1
,
(
select
substring(l2.A1,1,5) AS A1,
substring(l2.A2,1,5) AS A2
from l1,l2
where l1.Z = "90001"
and
3959 *
(PI()/180) *
SQRT(
POW( (l2.A2-l1.A2)*COS((PI()/180)*(l2.A1+l2.A1)/2), 2 ) +
POW( l2.A1-l1.A1, 2 )
) <= 10
) as L
where D1.A1 = L.A1
and D1.A2 = L.A2
I don't have enough information about the tables, such as field types, number of rows, keys and indexes.
When I add ORDER BY date_last_access DESC, the whole query slows down to about 3 seconds; without it, it takes about 0.2 seconds. Why is it running so slowly, and how can I change the query to run faster?
There are also indexes on all the tables and fields used.
Users: 1+ million records
Likes: 5+ million records (over 1 billion in production)
Tables will be growing really fast once in production.
QUERY
SELECT
id,
sid,
first_name,
date_birth,
location,
date_created,
date_last_access,
(3956 * 2 * ASIN(
SQRT(
POWER(
SIN(
({LAT} - latitude) * pi() / 180 / 2
),
2
) + COS({LAT} * pi() / 180) * COS(latitude * pi() / 180) * POWER(
SIN(
({LON} - longitude) * pi() / 180 / 2
),
2
)
)
)) AS distance
FROM
users
WHERE
`id` != {UID} AND
`gender` = {GEND} AND
`date_birth` BETWEEN {DOB_MIN} AND {DOB_MAX} AND
`status` = 'active' AND
(SELECT COUNT(*) FROM likes WHERE likes.judged_user = users.id AND likes.user_id = {UID}) = 0
HAVING distance <= {DIST}
ORDER BY date_last_access DESC
LIMIT {ROWS}
EXPLAIN
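id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | PRIMARY | users | ref | PRIMARY,Index_2,discovery,index_1 | index_1 | 2 | const | 226184 | Using index condition; Using where; Using filesort
2 | DEPENDENT SUBQUERY | likes | eq_ref | PRIMARY,index_1,index_2 | PRIMARY | 16 | const,hello.users.id | 1 | Using index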
INDEXES
LIKES - user_id, judged_user - NORMAL - BTREE
USERS - id, gender, date_birth, status, date_last_access - NORMAL - BTREE
When I order by id instead of date_last_access, it seems to run much faster. Could it be because date_last_access is a DATETIME column?
First, try running EXPLAIN on your query. This will show you which fields and operations are slowing it down. Then try to join on indexed columns and filter your result set with more specific values.
Simplifying the subquery could be a way to avoid the extra processing time of COUNT:
(SELECT COUNT(*) FROM likes WHERE likes.judged_user = users.id AND likes.user_id = {UID}) = 0
could change to
(SELECT 1 FROM likes WHERE likes.judged_user = users.id AND likes.user_id = {UID} limit 1) IS NULL
Avoiding the subquery altogether may be the best way to improve the performance of the query. Check which option works better in your case (this one requires an index on likes.user_id):
FROM
users
LEFT JOIN (
SELECT distinct judged_user FROM likes WHERE likes.user_id = {UID}
) l ON l.judged_user=users.id
WHERE
`id` != {UID} AND
`gender` = {GEND} AND
`date_birth` BETWEEN {DOB_MIN} AND {DOB_MAX} AND
`status` = 'active' AND
l.judged_user is NULL
You should phrase the WHERE clause as:
WHERE `id` <> {UID} AND
`gender` = {GEND} AND
`date_birth` BETWEEN {DOB_MIN} AND {DOB_MAX} AND
`status` = 'active' AND
NOT EXISTS (SELECT 1 FROM likes l WHERE l.judged_user = users.id AND l.user_id = {UID})
HAVING distance <= {DIST}
For this query, you can try two indexes:
LIKES(judged_user, user_id)
USERS(Gender, status, date_birth, id)
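To check that the NOT EXISTS form and the LEFT JOIN ... IS NULL form from the answers above select the same users, here is a small sketch. It runs on Python's built-in sqlite3 rather than MySQL, and the four users and three likes are invented test data.

```python
import sqlite3

# Tiny hypothetical dataset; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE likes (user_id INTEGER, judged_user INTEGER,
                        PRIMARY KEY (user_id, judged_user));
    INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid'), (4, 'Dee');
    INSERT INTO likes VALUES (1, 2), (1, 4), (3, 1);
""")
uid = 1  # the user doing the search

# Anti-join written with NOT EXISTS.
not_exists_sql = """
    SELECT u.id FROM users u
    WHERE u.id <> ?
      AND NOT EXISTS (SELECT 1 FROM likes l
                      WHERE l.judged_user = u.id AND l.user_id = ?)
    ORDER BY u.id
"""
# The same anti-join written as LEFT JOIN plus an IS NULL filter.
left_join_sql = """
    SELECT u.id FROM users u
    LEFT JOIN likes l ON l.judged_user = u.id AND l.user_id = ?
    WHERE u.id <> ? AND l.judged_user IS NULL
    ORDER BY u.id
"""
not_exists_ids = [row[0] for row in conn.execute(not_exists_sql, (uid, uid))]
left_join_ids = [row[0] for row in conn.execute(left_join_sql, (uid, uid))]
print(not_exists_ids, left_join_ids)
```

User 1 has already judged users 2 and 4, so only user 3 remains in both results. Which form is faster on the real tables depends on the optimizer and the indexes, so it is worth comparing the EXPLAIN output for each.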
I am pretty new to SQL queries.
I have an SQL search example from Google:
SELECT cID,
(6371 * acos
(
cos(radians(51.455643))
* cos(radians(latCord))
* cos(radians(longCord) - radians(7.011555))
+ sin(radians(51.455643))
* sin(radians(latCord))
)
) AS distance
FROM breitengrade
HAVING distance < 50
ORDER BY distance
LIMIT 0, 20
and my own SQL query:
SELECT breitengrade.cID
,breitengrade.latCord
,breitengrade.longCord
,Pages.cIsActive
FROM breitengrade
INNER JOIN Pages ON breitengrade.cID = Pages.cID
WHERE cIsActive = '1'
How can I combine these 2 queries into one so that I can get one single result set?
SELECT breitengrade.cID,
breitengrade.latCord,
breitengrade.longCord,
Pages.cIsActive,
(6371 * acos
(
cos(radians(51.455643))
* cos(radians(latCord))
* cos(radians(longCord) - radians(7.011555))
+ sin(radians(51.455643))
* sin(radians(latCord))
)
) AS distance
FROM breitengrade
INNER JOIN Pages ON breitengrade.cID = Pages.cID
WHERE cIsActive = '1'
HAVING distance < 50
ORDER BY distance
LIMIT 0, 20
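For reference, the distance expression in this query is the spherical law of cosines with an Earth radius of 6371 km. The sketch below mirrors it in Python; the function name and the clamping step are additions for the example, not part of the query.

```python
import math

EARTH_RADIUS_KM = 6371.0  # same constant as in the SQL expression

def law_of_cosines_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, mirroring the SQL acos(...) expression."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2) - math.radians(lon1)
    cos_angle = (math.cos(phi1) * math.cos(phi2) * math.cos(dlon)
                 + math.sin(phi1) * math.sin(phi2))
    # Rounding can push the value slightly outside [-1, 1]; clamp it so acos
    # stays defined (this guard is an addition, the SQL has no equivalent).
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)

# One degree of longitude along the equator.
one_degree_km = law_of_cosines_km(0.0, 0.0, 0.0, 1.0)
print(round(one_degree_km, 2))
```

The search centre from the query would be passed as lat1=51.455643, lon1=7.011555, with latCord and longCord as the second point.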
The following query works for me. However, is there a way to speed it up? (table1 and table2 each contain more than 300000 entries.) I would also like to query more data in the subquery and use the distance just to filter results. The two tables do not have much in common except the lat/lon.
SELECT `lat1`,
`lon1`,
(SELECT Sqrt(Pow( 69.1 * ( `lat1` - `lat2` ), 2 )
+ Pow( 69.1 * ( `lon2` - `lon1` ) * Cos( `lat` / 57.3 ), 2 )
) AS
distance
FROM table1
ORDER BY distance
LIMIT 0, 1) AS `test`
FROM `table2`
Thanks in advance
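To make explicit what the correlated subquery computes for each row of table2, here is a sketch of the same flat-earth distance and nearest-row search in Python. The function names and sample points are made up, and the cosine term is assumed to use the first point's latitude, since the query's `lat` column is ambiguous.

```python
import math

def flat_distance_miles(lat1, lon1, lat2, lon2):
    """Flat-earth approximation from the query: 69.1 miles per degree,
    with longitude scaled by cos(latitude); 57.3 is roughly degrees per radian."""
    dy = 69.1 * (lat1 - lat2)
    dx = 69.1 * (lon2 - lon1) * math.cos(lat1 / 57.3)  # assumes lat1 for `lat`
    return math.sqrt(dx * dx + dy * dy)

def nearest(point, candidates):
    """Brute-force search: one distance computation per candidate."""
    return min(candidates,
               key=lambda c: flat_distance_miles(point[0], point[1], c[0], c[1]))

# Hypothetical rows: one point from one table, three from the other.
origin = (50.0, 8.0)
others = [(51.0, 8.0), (50.1, 8.0), (40.0, 8.0)]
closest = nearest(origin, others)
print(closest)
```

Done this way, every row of one table is compared with every row of the other, which is why the query scales poorly once both tables hold hundreds of thousands of rows.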
SELECT *,
SUM(
b1.bit_sum_excl * b1.bit_quantity -
if(b1.bit_deduction_percentage = 1, (b1.bit_deduction / 100)*(b1.bit_sum_excl * b1.bit_quantity), b1.bit_deduction)
) as _total_unconverted
/*************** , _total_unconverted * 0.15 as some_val ****************/
FROM `bill_items` b1
LEFT JOIN clients_packing_list ON (b1.bit_cli_pack_list_id = clients_packing_list.cli_pack_list_id)
WHERE b1.bit_cli_bill_id = 1 OR b1.bit_cli_bill_id = 0
I'd like to be able to add an expression
_total_unconverted * 0.15 as _converted
into the select.
Currently it complains that it's not a column.
Is there a way to do this, or must I copy paste the SUM code and add the multiplier?
SELECT *,SOURCEQUERY._total_unconverted * 0.15 as some_val FROM
(
SELECT
SUM(
b1.bit_sum_excl * b1.bit_quantity -
if(b1.bit_deduction_percentage = 1, (b1.bit_deduction / 100)*(b1.bit_sum_excl * b1.bit_quantity), b1.bit_deduction)
) as _total_unconverted
FROM bill_items AS b1
LEFT JOIN clients_packing_list ON (b1.bit_cli_pack_list_id = clients_packing_list.cli_pack_list_id)
WHERE b1.bit_cli_bill_id = 1 OR b1.bit_cli_bill_id = 0
) AS SOURCEQUERY
There's no way to reference the column alias _total_unconverted in the SELECT list of the query that returns that column.
You could wrap the query in a set of parens and reference it as an inline view, and the column alias would be available in the outer query.
In this example, the inner query returns a column foo, and that column name can be referenced in the outer query:
SELECT v.foo
, v.foo * 0.15 AS bar
FROM ( SELECT SUM(col) AS foo FROM mytable ) v
But it's not possible to reference foo in the SELECT list of the inner query. (MySQL does allow the column alias to be referenced in an ORDER BY clause and a HAVING clause.)
To avoid the overhead of an inline view, you'd need to repeat the expression.
SELECT *
, SUM(b1.bit_sum_excl * b1.bit_quantity -
IF(b1.bit_deduction_percentage = 1
,(b1.bit_deduction / 100)*(b1.bit_sum_excl * b1.bit_quantity)
, b1.bit_deduction
)
) AS _total_unconverted
, SUM(b1.bit_sum_excl * b1.bit_quantity -
IF(b1.bit_deduction_percentage = 1
,(b1.bit_deduction / 100)*(b1.bit_sum_excl * b1.bit_quantity)
, b1.bit_deduction
)
) * 0.15 AS some_val
FROM `bill_items` b1
LEFT
JOIN clients_packing_list
ON b1.bit_cli_pack_list_id = clients_packing_list.cli_pack_list_id
WHERE b1.bit_cli_bill_id IN (0,1)
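The inline-view pattern from this answer can be tried out quickly. The sketch below runs it on Python's built-in sqlite3 rather than MySQL, with three made-up rows in a hypothetical mytable.

```python
import sqlite3

# Hypothetical table and values; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (col REAL);
    INSERT INTO mytable VALUES (100), (40), (60);
""")

# The alias foo is defined in the inner query and reused in the outer one.
foo, bar = conn.execute("""
    SELECT v.foo
         , v.foo * 0.15 AS bar
    FROM ( SELECT SUM(col) AS foo FROM mytable ) v
""").fetchone()
print(foo, bar)
```

The aggregate is written once, and the outer query derives the second value from its alias.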
You can use user-defined variables in a MySQL query.
SELECT *,
(@total:=SUM(
b1.bit_sum_excl * b1.bit_quantity -
if(b1.bit_deduction_percentage = 1, (b1.bit_deduction / 100)*(b1.bit_sum_excl * b1.bit_quantity), b1.bit_deduction)
)) as _total_unconverted,
(@total * 0.15) as _converted
FROM `bill_items` b1
LEFT JOIN clients_packing_list ON (b1.bit_cli_pack_list_id = clients_packing_list.cli_pack_list_id)
WHERE b1.bit_cli_bill_id = 1 OR b1.bit_cli_bill_id = 0
Edit: However, this does not guarantee that the value will be assigned in the correct order. To implement it properly, see this small example:
SELECT @min_price:=MIN(price),@max_price:=MAX(price) FROM shop;
SELECT * FROM shop WHERE price=@min_price OR price=@max_price;
This is pasted from the documentation.
Basically, you assign the values first, then re-select according to the values to make sure they were assigned properly.
I have a query that I use to find results that are ordered by location. Results also have to account for VAT so this is also in the query. The query can unfortunately take 4+ seconds to run when not cached. Can anyone spot any glaringly obvious issues or suggest anything I can do to improve it?
Just to clarify what is happening in the query:
The distance calculation is Euclidean distance using lat/long
The incvat fields are used to show the price when VAT is included
The CASE ... WHEN / THEN expression is used to put prices of 0 at the very bottom
The query:
SELECT * , ROUND( SQRT( POW( ( 69.1 * ( company_branch_lat - 52.4862 ) ) , 2 ) + POW( ( 53 * ( company_branch_lng - - 1.8905 ) ) , 2 ) ) , 1 ) AS distance,
hire_car_day + ( hire_car_day * 0.2 * ! hire_car_incvat ) AS hire_car_day_incvat,
hire_car_addday + ( hire_car_addday * 0.2 * ! hire_car_incvat ) AS hire_car_addday_incvat,
hire_car_week + ( hire_car_week * 0.2 * ! hire_car_incvat ) AS hire_car_week_incvat,
hire_car_weekend + ( hire_car_weekend * 0.2 * ! hire_car_incvat ) AS hire_car_weekend_incvat
FROM hire_car
LEFT JOIN company_branch ON company_branch_id = hire_car_branchid
LEFT JOIN hire_cartypelink ON hire_cartypelink_carhireid = hire_car_id
LEFT JOIN users ON company_branch_userid = user_id
WHERE 1
GROUP BY hire_car_id
HAVING distance <=30
ORDER BY CASE hire_car_day_incvat
WHEN 0
THEN 40000
ELSE hire_car_day_incvat
END , distance ASC
LIMIT 0 , 30
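Read as plain code, the VAT and ordering logic of the query above looks roughly like this. It is a sketch with invented rows and helper names; the 0.2 rate and the 40000 sentinel are taken from the query.

```python
VAT_RATE = 0.2          # same rate as in the query
ZERO_PRICE_RANK = 40000  # sentinel the query uses to push zero prices last

def price_inc_vat(price, already_includes_vat):
    """Mirror of: price + (price * 0.2 * !incvat)."""
    return price + price * VAT_RATE * (0 if already_includes_vat else 1)

def sort_key(row):
    """Mirror of the ORDER BY: price (zero treated as 40000), then distance."""
    price = price_inc_vat(row["hire_car_day"], row["hire_car_incvat"])
    return (ZERO_PRICE_RANK if price == 0 else price, row["distance"])

# Hypothetical result rows.
rows = [
    {"id": "a", "hire_car_day": 0,  "hire_car_incvat": True,  "distance": 1.0},
    {"id": "b", "hire_car_day": 50, "hire_car_incvat": True,  "distance": 5.0},
    {"id": "c", "hire_car_day": 30, "hire_car_incvat": False, "distance": 9.0},
]
ordered_ids = [row["id"] for row in sorted(rows, key=sort_key)]
print(ordered_ids)
```

Row c costs 36 once VAT is added, so it sorts ahead of row b at 50, and the zero-priced row a goes last.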
You can use the MySQL spatial extension: save the latitude and longitude as a POINT datatype and put a spatial index on it. That way you can reorder the coordinates along a curve, reducing the dimension while preserving spatial information. You can use the spatial index with a bounding box to filter the query and then use the haversine formula to pick the optimal result. Your bounding box should be bigger than the radius of the great circle. MySQL uses an R-tree for its spatial index, whereas my example was about a Z-curve or a Hilbert curve: https://softwareengineering.stackexchange.com/questions/113256/what-is-the-difference-between-btree-and-rtree-indexing.
Then you can insert a geocoordinate directly into a point column: http://dev.mysql.com/doc/refman/5.0/en/creating-spatial-values.html. Or you can use a geometry datatype: http://markmaunder.com/2009/10/10/mysql-gis-extensions-quick-start/. Then you can use MBRcontains function like so: http://dev.mysql.com/doc/refman/4.1/en/relations-on-geometry-mbr.html or any other functions: http://dev.mysql.com/doc/refman/5.5/en/functions-for-testing-spatial-relations-between-geometric-objects.html. Hence you need a bounding box.
Here are some examples:
Storing Lat Lng values in MySQL using Spatial Point Type
https://gis.stackexchange.com/questions/28333/how-to-speed-up-this-simple-mysql-points-in-the-box-query
Here is a simple example with point datatype:
CREATE SPATIAL INDEX sx_place_location ON place (location);
SELECT * FROM mytable
WHERE MBRContains
(
LineString
(
Point($x - $radius, $y - $radius),
Point($x + $radius, $y + $radius)
),
location
)
AND ST_Distance(Point($x, $y), location) <= $radius
MySQL latitude and Longitude table setup.
I'm not sure if it works, because it uses a radius variable with a bounding-box function. It seems to me that MBRWithin is a bit simpler, because it doesn't need any argument: Mysql: Optimizing finding super node in nested set tree.
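The two-step idea in this answer (a cheap bounding-box filter first, an exact distance check second) can be sketched outside the database as follows. The function names and sample branches are invented, the 69.1 miles-per-degree constant is borrowed from the queries in this thread, and the box calculation assumes locations well away from the poles.

```python
import math

MILES_PER_DEGREE_LAT = 69.1   # rough constant, as used in the queries above
EARTH_RADIUS_MILES = 3959.0

def bounding_box(lat, lon, radius_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the search circle."""
    dlat = radius_miles / MILES_PER_DEGREE_LAT
    dlon = radius_miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return (lat - dlat, lat + dlat, lon - dlon, lon + dlon)

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def within_radius(center, points, radius_miles):
    """Step 1: cheap box test. Step 2: exact distance on the survivors."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(center[0], center[1], radius_miles)
    in_box = [p for p in points
              if min_lat <= p[1] <= max_lat and min_lon <= p[2] <= max_lon]
    return [p[0] for p in in_box
            if haversine_miles(center[0], center[1], p[1], p[2]) <= radius_miles]

# Hypothetical branches: (name, lat, lng).
branches = [("coventry", 52.4068, -1.5197), ("london", 51.5074, -0.1278)]
nearby = within_radius((52.4862, -1.8905), branches, 30)
print(nearby)
```

In the database, the box test corresponds to range conditions on indexed latitude and longitude columns (or to MBRContains on a spatial index), so the expensive distance formula only runs on the few rows that survive it.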
You are using a GROUP BY clause together with HAVING, although I don't see any aggregate functions anywhere in the query. I recommend rewriting the query like this to see if it makes any difference:
SELECT * , ROUND( SQRT( POW( ( 69.1 * ( company_branch_lat - 52.4862 ) ) , 2 ) + POW( ( 53 * ( company_branch_lng - - 1.8905 ) ) , 2 ) ) , 1 ) AS distance,
hire_car_day + ( hire_car_day * 0.2 * ! hire_car_incvat ) AS hire_car_day_incvat,
hire_car_addday + ( hire_car_addday * 0.2 * ! hire_car_incvat ) AS hire_car_addday_incvat,
hire_car_week + ( hire_car_week * 0.2 * ! hire_car_incvat ) AS hire_car_week_incvat,
hire_car_weekend + ( hire_car_weekend * 0.2 * ! hire_car_incvat ) AS hire_car_weekend_incvat
FROM hire_car
LEFT JOIN company_branch ON company_branch_id = hire_car_branchid
LEFT JOIN hire_cartypelink ON hire_cartypelink_carhireid = hire_car_id
LEFT JOIN users ON company_branch_userid = user_id
WHERE ROUND( SQRT( POW( ( 69.1 * ( company_branch_lat - 52.4862 ) ) , 2 ) + POW( ( 53 * ( company_branch_lng - - 1.8905 ) ) , 2 ) ) , 1 ) <= 30
ORDER BY CASE hire_car_day_incvat
WHEN 0
THEN 40000
ELSE hire_car_day_incvat
END , distance ASC
LIMIT 0 , 30