I am using a query similar to:
$distance = "st_distance_sphere(Point({$lat}, {$lon}), Point({$lat}, {$lon})) * 0.000621371192";
$query = "SELECT table.*, {$distance} as distance
WHERE {$distance} < 10
AND postcode IN ( 'XX12 3XX', 'XX12 3XX', 'XX12 3XX' )
ORDER BY distance asc
LIMIT 100";
The problem kicks in when the postcodes match 100k+ records; at that point the query takes considerable time to return results.
Is there a way of speeding this query up?
Some of the things that I was considering are:
turning lat and lon into a POINT data type with a spatial index (sketched below): https://dev.mysql.com/doc/refman/5.7/en/spatial-type-overview.html
the same idea as the POINT option, but using Elasticsearch instead: https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-queries.html
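For reference, here is a rough sketch of the POINT-column approach, assuming MySQL 5.7+ on InnoDB and a hypothetical table named places with lat and lon columns (adjust names to your schema). The spatial index lets MBRContains prune candidates cheaply before the exact distance is computed:
ALTER TABLE places ADD COLUMN geo_point POINT;
-- Backfill; MySQL treats POINT as (x, y) = (longitude, latitude)
UPDATE places SET geo_point = Point(lon, lat);
-- Spatial indexes require the column to be NOT NULL
ALTER TABLE places MODIFY geo_point POINT NOT NULL,
                   ADD SPATIAL INDEX idx_geo_point (geo_point);

-- Prune with the index via a bounding box, then compute exact distances
SELECT id,
       ST_Distance_Sphere(Point(-7.2264, 52.7606), geo_point) * 0.000621371192 AS distance
FROM places
WHERE MBRContains(
        ST_GeomFromText('POLYGON((-7.3 52.7, -7.1 52.7, -7.1 52.8, -7.3 52.8, -7.3 52.7))'),
        geo_point)
ORDER BY distance
LIMIT 100;
The polygon coordinates are illustrative; you would compute the box from your search point and radius.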
Related
I use this to find the nearest point
SELECT
    id,
    ST_Distance(
        POINT(52.760667210533, -7.22646337599035),
        geo_point
    ) AS distance
FROM Points
ORDER BY distance
LIMIT 1
I have a temp table TempPoints with all my candidate points and I want to normalise them onto OSM nodes, but there are lots of them, so I need a single query to resolve them all in one call. UNION won't let me use ORDER BY, and my DB raw-query interface won't let me fire a series of queries separated by ';'.
The temp table has lat and lon but could just as easily have a POINT. How can I write something like
select id, NearestTo(TempPoints.geo_point, Points) from TempPoints;
EDIT: I can parenthesise each SELECT in my large UNION query, which solves my issue.
I would still like to be able to join on nearest row.
This might work for you:
SELECT t.id as tid, p.id as pid, p.geo_point
FROM TempPoints t
JOIN Points p ON p.id = (
SELECT p1.id
FROM Points p1
ORDER BY ST_Distance(p1.geo_point, t.geo_point)
LIMIT 1
)
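If you are on MySQL 8.0.14 or later, the same "join on the nearest row" can also be written with a LATERAL derived table, which makes the per-row subquery explicit. A sketch, assuming the same Points and TempPoints schema:
SELECT t.id AS tid, p.id AS pid, p.geo_point
FROM TempPoints t
JOIN LATERAL (
    SELECT p1.id, p1.geo_point
    FROM Points p1
    ORDER BY ST_Distance(p1.geo_point, t.geo_point)
    LIMIT 1
) AS p ON TRUE
Either form still scans Points once per temp row unless you restrict the candidates with a bounding box, as the next answer does.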
My solution is to issue a series of queries, one per row, and bind them together with UNION. The MySQL stack will blow eventually, so you need to do them in blocks; 1000 is OK on a default install.
You have to parenthesize the queries, as they include an ORDER BY. Some points may fail, so I label them all with a literal line_no sequence so you can edit and filter the originals. You also need to restrict the query with a
WHERE Contains(<polygon>,point)
clause, else it will try to sort the whole table, where polygon is a bounding box you have to cook up with GEOMFROMTEXT() and POLYGON(). And of course you need a spatial index on the column! Here's some code:
var SMALL = 0.001;  // half-width of the bounding box, in degrees
var line_no = 0;    // labels each subquery so failures can be traced back
var query = points
    .map(function (point) {
        // Build a small bounding box around the point so the spatial
        // index can prune the table before sorting by distance.
        var bottom = point.lat - SMALL;
        var top    = point.lat + SMALL;
        var left   = point.lon - SMALL;
        var right  = point.lon + SMALL;
        var polygon = [
            [bottom, left],
            [top, left],
            [top, right],
            [bottom, right],
            [bottom, left]  // close the ring
        ];
        polygon = "POLYGON((" + polygon.map(function (corner) {
            return corner.join(' ');
        })
        .join(",") + "))";
        point.line_no = line_no++;
        return "(SELECT " + point.line_no + " as line_no, id, ST_Distance(POINT(" +
            point.lat + "," + point.lon +
            "), geo_point) as distance" +
            " from Points" +
            " WHERE Contains(GeomFromText('" + polygon + "'), geo_point)" +
            " order by distance limit 1)";
    })
    .join(" UNION ") + " order by line_no";
return sequelize.query(query);
I want to select a random row with a specific WHERE clause, but the query is taking too long (around 2.7 seconds):
SELECT * FROM PIN WHERE available = '1' ORDER BY RAND() LIMIT 1
The database contains around 900k rows
Thanks
SELECT * FROM PIN WHERE available = '1' ORDER BY RAND() LIMIT 1
means that you are going to generate a random number for EVERY row, then sort the whole result set, and finally retrieve one row.
That's a lot of work for querying a single row.
Assuming you have IDs without gaps, or only a few gaps, you are better off using your programming language to generate ONE random number and fetching that ID:
Pseudo-Example:
result = null;
min_id = queryMinId();
max_id = queryMaxId();
while (result == null) {
    random_number = random_between(min_id, max_id);
    result = queryById(random_number);
}
If you have a lot of gaps, you could retrieve the whole ID set first, and then pick ONE random index into that result:
id_set = queryAllIds();
random_number = random_between(0, size(id_set) - 1);
result = queryById(id_set[random_number]);
The first example only works when there are no additional constraints. In your case, you should use option 2. It ensures that all IDs with available = 1 are pre-selected into a 0 to count()-1 array, hence ignoring all invalid IDs.
Then you generate a random number between 0 and count()-1 to get an index within that result set, which you translate to an actual ID and finally fetch:
id_set = queryAllIdsWithAvailableEqualsOne(); // the "condition"
random_number = random_between(0, size(id_set) - 1);
result = queryById(id_set[random_number]);
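In SQL terms, option 2 boils down to two cheap queries. A sketch, where 123456 stands in for whatever ID your code picked at random:
-- Step 1: pull only the candidate IDs; with an index on (available, id)
-- this can be satisfied from the index alone.
SELECT id FROM PIN WHERE available = '1';
-- Step 2: after picking one ID at random in application code:
SELECT * FROM PIN WHERE id = 123456;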
I have this query that I would like to add a time search to.
Here is my working query:
$query = "SELECT *,(((acos(sin((".$lat."*pi()/180)) * sin((Lat*pi()/180)) +
cos((".$lat."*pi()/180)) * cos((Lat*pi()/180)) * cos(((".$lon."- Lon) *
pi()/180))))*180/pi())*60*1.1515) as distance
FROM items
HAVING distance < ".$distance."
ORDER BY distance
LIMIT ".$min." , ".$max."";
I would like to add something like this
WHERE timestamp > ".$somePastDate."
For hours now I have tried every combination I can think of, with no luck. I bet it's simple too and I'll be shaking my head. Thanks in advance.
I suggest you use a nested query for this, as follows:
SELECT *, big_cosine_law_distance_formula AS distance
FROM (
SELECT *
FROM items
WHERE items.timestamp > ".$somePastDate."
) AS i
HAVING distance < ".$distance."
The inner query will narrow down your items by time, so you don't have to grind out the big distance formula on them all.
You might also consider using a faster bounding-box-based search to narrow down your items spatially, as described here: http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/
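As a rough sketch of combining both filters, the time predicate and a crude bounding box can share the inner query, so the big formula only runs on recent, nearby rows. The BETWEEN bounds below are illustrative placeholders (one degree of latitude is about 69 miles):
SELECT *, big_cosine_law_distance_formula AS distance
FROM (
    SELECT *
    FROM items
    WHERE items.timestamp > ".$somePastDate."
      AND Lat BETWEEN 52.6 AND 52.9  -- lat ± radius/69
      AND Lon BETWEEN -7.4 AND -7.1  -- lon ± radius/(69 * cos(lat))
) AS i
HAVING distance < ".$distance."
ORDER BY distance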
You can troubleshoot this kind of thing by starting with the inner query.
SELECT *
FROM items
WHERE items.timestamp > ".$somePastDate."
When you're getting reliable results from that query, add the outer one.
I have a query that I wrote a couple of years ago for a membership website. I'm not extremely well versed in using $wpdb (or MySQL in general) to write custom queries, and the site has grown quite a bit. There are about 150k rows in the wp_usermeta table now, and the page where the query runs hangs for a couple of seconds before loading. I expect this to get worse as the site gains more users.
Any help in figuring out how to speed this query up would be greatly appreciated.
$paged = (get_query_var('paged')) ? get_query_var('paged') : 1;
$limit = 15;
$offset = ($paged - 1) * $limit;
$key = 'first_name';
$sql = "SELECT SQL_CALC_FOUND_ROWS {$wpdb->users}.* FROM {$wpdb->users}
INNER JOIN {$wpdb->usermeta} wp_usermeta ON ({$wpdb->users}.ID = wp_usermeta.user_id)
WHERE 1=1
AND wp_usermeta.meta_key = '$key'
AND wp_usermeta.meta_value <> ''
AND wp_users.user_registered < '2014-01-30'
ORDER BY wp_usermeta.meta_value ASC
LIMIT $offset, $limit";
$members = $wpdb->get_results($sql);
$found_rows = $wpdb->get_var("SELECT FOUND_ROWS();");
foreach ($members as $member) { // display member info here }
*Note: I am paginating...not displaying all results on one page.
A couple of things here.
First, you seem to be ordering by user first name, but not retrieving or displaying it. That's a bit odd. You could order by wp_users.user_nicename instead and omit the join to wp_usermeta entirely, which saves the cost of the join. That column's value is set by the user in her profile and reflects the identity by which she wants to be known, so it's most likely an appropriate ordering column.
Second, you're retrieving all the columns (*) in wp_users. Can you avoid retrieving them all, and instead enumerate the ones you actually need? That will save time in sorting.
Your query would become something like this:
$sql = "SELECT SQL_CALC_FOUND_ROWS
id, user_login, user_nicename, etc etc
FROM {$wpdb->users}
WHERE user_registered < '2014-01-30'
ORDER BY user_nicename ASC
LIMIT $offset, $limit";
You are stuck, as you are paginating a large number of users, with some inescapable inefficiency. But this should be better than what you have.
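Two further suggestions, assuming you control the schema (the index name here is hypothetical, and wp_users assumes the default table prefix): an index on user_registered helps the date filter, and SQL_CALC_FOUND_ROWS can be replaced by a plain COUNT(*) of the same filter, which is often cheaper because it can be answered from that index.
-- Hypothetical index; WordPress does not ship one on user_registered
ALTER TABLE wp_users ADD INDEX idx_user_registered (user_registered);

-- Replaces SELECT FOUND_ROWS() for the pagination total
SELECT COUNT(*) FROM wp_users WHERE user_registered < '2014-01-30';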
Hi, I need to get the results and apply the ORDER BY only to the limited section. You know, when you apply ORDER BY you are ordering all the rows; what I want is to sort only the limited section. Here is an example:
// all rows
SELECT * FROM users ORDER BY name
// partial 40 rows ordered "globally"
SELECT * FROM users ORDER BY name LIMIT 200,40
The solution is:
// partial 40 rows ordered "locally"
SELECT * FROM (SELECT * FROM users LIMIT 200,40) AS T ORDER BY name
This solution works well, but there is a problem: I'm working with a ListView component that needs the TOTAL row count in the table (using SQL_CALC_FOUND_ROWS). If I use this solution I cannot get that total count; I get the limited section's count (40).
I hope you can give me a solution based on the query, for example something like an "ORDER BY LOCALLY".
Since you're using PHP, might as well make things simple, right? It is possible to do this in MySQL only, but why complicate things? (Also, placing less load on the MySQL server is always a good idea)
$result = db_query_function("SELECT SQL_CALC_FOUND_ROWS * FROM `users` LIMIT 200,40");
$users = array();
while($row = db_fetch_function($result)) $users[] = $row;
usort($users,function($a,$b) {return strnatcasecmp($a['name'],$b['name']);});
$totalcount = db_fetch_function(db_query_function("SELECT FOUND_ROWS() AS `count`"));
$totalcount = $totalcount['count'];
Note that I used made-up function names, to show that this is library-agnostic ;) Sub in your chosen functions.
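For completeness, a MySQL-only sketch: since the query has no WHERE clause, the ListView's total is simply the table count, so SQL_CALC_FOUND_ROWS isn't needed at all:
-- The limited window, sorted "locally"
SELECT * FROM (SELECT * FROM `users` LIMIT 200,40) AS T ORDER BY T.name;
-- Total row count for the ListView
SELECT COUNT(*) FROM `users`;
Note that the inner SELECT has no ORDER BY, so which 40 rows land in the window is up to the storage engine, exactly as in the original query.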