I have lat/lons stored as POINT coordinates and want to cluster them by location with Presto. For now, I round the lat/lons to 2 decimals, convert them to strings, concatenate them, and finally group by the result. But this way I lose the information about the individual points. Is there a good, clean way to do this in Presto (something like the ST_Cluster* functions in PostGIS)?
Trino (formerly known as Presto SQL) has geospatial functions (https://trino.io/docs/current/functions/geospatial.html), but nothing equivalent to ST_Cluster* as you asked for. You could probably use a function like ST_Distance to do the grouping instead of rounding to decimals and converting to strings.
It is not as clean as using ST_Cluster* directly, but it is a workaround that creates clustering-like behavior from the existing geospatial functions.
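To make the distance-based idea concrete, here is a minimal Python sketch of the logic. It is not Trino code. It assumes the points fit in memory, uses a haversine distance in place of ST_Distance, and treats two points as belonging to the same cluster when a chain of hops, each no longer than a threshold, connects them. The function names and the threshold are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cluster_by_distance(points, max_dist_m):
    """Group (lat, lon) points into connected components.

    Two points share a cluster if a chain of hops, each no longer than
    max_dist_m, links them. Compares every pair, so it is O(n^2).
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if haversine_m(*points[i], *points[j]) <= max_dist_m:
                parent[find(i)] = find(j)

    clusters = {}
    for i, point in enumerate(points):
        clusters.setdefault(find(i), []).append(point)
    return list(clusters.values())
```

Unlike the rounding approach, every original point survives inside its cluster. In SQL, the pairwise comparison would correspond to a self-join on a distance condition, which can get expensive for large tables.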
Related
I have a case where I need to calculate the Levenshtein distance between two columns for a row in MySQL. There are UDFs available for this, but I need to do it without a UDF. The reason is that I am using MemSQL, which is an extremely fast in-memory database that does not support UDFs but does support nearly any query you can run in MySQL. Is anyone aware of a non-UDF implementation of the Levenshtein distance algorithm as a query? Something like the following UDF:
http://www.artfulsoftware.com/infotree/qrytip.php?id=552
I'm working on converting this myself, and I'm open to other solutions (that is, other ways to make this happen in MemSQL).
Note: I cannot use Hamming distance. That would be simpler, but the use case calls for Levenshtein distance.
I am trying to figure out whether it is possible to create a geometry collection from an existing geometry column with 1000 entries or more, using an SQL SELECT statement in MySQL. Any ideas?
Thanks for answering
(please no PHP, Perl, etc. solutions)
You can do this by converting your geometry objects to text in the WKT format, manipulating them, and converting them back to the internal format.
Judicious use of MySQL functions like GeomFromText(), AsText(), CONCAT(), GROUP_CONCAT(), and string processing functions should make it possible to gather up multiple items and create new ones.
If you want 1000 objects in one single collection, you will run into the string-length limit of GROUP_CONCAT. You will need to raise the group_concat_max_len system variable.
This SQL isn't going to be very beautiful, eh? It certainly isn't going to be easy to debug.
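As an illustration of the string manipulation involved, here is a small Python sketch of what the CONCAT() and GROUP_CONCAT() step has to produce. It assumes each row has already been converted to WKT (the AsText() step); the function name is hypothetical.

```python
def to_geometry_collection(wkts):
    """Join individual WKT geometries into one GEOMETRYCOLLECTION WKT string."""
    members = [w.strip() for w in wkts if w and w.strip()]
    if not members:
        raise ValueError("need at least one geometry")
    # Same shape as CONCAT('GEOMETRYCOLLECTION(', GROUP_CONCAT(...), ')') in SQL.
    return "GEOMETRYCOLLECTION(" + ",".join(members) + ")"


if __name__ == "__main__":
    rows = ["POINT(1 2)", "LINESTRING(0 0,1 1)"]
    wkt = to_geometry_collection(rows)
    # The length of this string is what has to fit within group_concat_max_len.
    print(wkt, len(wkt))
```

The resulting string would then go back through GeomFromText() to become a geometry value again.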
The site currently does mainly range searches (latitude and longitude) with some filtering, such as WHERE color = "red" clauses. However, MySQL with a geospatial index is still quite slow, and I need to speed it up.
Problem: Will using Solr to do the search be a good idea?
If so, should I only duplicate the range columns from MySQL into Solr, and do the WHERE clauses in MySQL, or do both type of queries in Solr?
I've read that Solr is not meant for storing data the way a database (i.e. MySQL) is. Does this mean that if my search can take place over 10 different columns (or fields, in Solr terms), and the MySQL table that the Solr index is replicated from only has 11 columns, I would still keep the MySQL table, even though that uses almost twice as much storage space, half of which is redundant?
It appears that I'm using structured data (because each row has many defined columns?), and storing the entire table in Solr instead of keeping redundant data in both MySQL and Solr would save storage space and reduce the number of database operations when writing. Is Solr a good choice here?
In terms of speed, would it be better to use PostGIS or Solr?
Solr has very fast numerical/date range queries. Solr 3 geospatial takes advantage of that, and I wrote a plugin that does even better. I doubt MySQL is faster.
That said, if the sole problem you are trying to solve is slow geospatial queries then bringing in Solr may solve it but will add a lot of overall complexity to your system since it isn't designed to replace relational databases--it works alongside them. Don't get me wrong; Solr is awesome, particularly for faceted navigation and text search. But you didn't state you wanted to take advantage of Solr's primary features.
PostGIS is by far the most mature open-source GIS storage system. I suggest you try it as an experiment to see if it's better. I would try a lat + lon pair of columns approach like what you are doing now with MySQL, and I would also try using the PostGIS native geospatial way to do it, whatever that is exactly.
One thing you could try in either MySQL or PostGIS is to round your latitude and longitude values to the number of decimals that gives the level of precision you actually need, which is surely far less than the full precision of a double. And if you store them in floats rather than doubles, right there the precision is capped to 2.37 meters. The system you use will probably have a much easier time doing range queries if there are fewer distinct values to scan over.
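To get a feel for how many decimals are enough, here is a rough Python sketch that converts a rounding step into an approximate size on the ground. It assumes about 111.32 km per degree of latitude (an average, not an exact figure) and a spherical scaling of longitude by the cosine of the latitude; the function name is hypothetical.

```python
import math

# Assumption: average length of one degree of latitude, in meters.
METERS_PER_DEGREE_LAT = 111_320.0


def grid_cell_size_m(decimals, latitude_deg=0.0):
    """Approximate (north-south, east-west) size in meters of one rounding step."""
    step_deg = 10.0 ** -decimals
    north_south = step_deg * METERS_PER_DEGREE_LAT
    # Lines of longitude converge toward the poles, so east-west steps shrink.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    for decimals in range(1, 6):
        ns, ew = grid_cell_size_m(decimals, latitude_deg=45.0)
        print(f"{decimals} decimals: about {ns:.1f} m north-south, {ew:.1f} m east-west")
```

Under these assumptions, rounding to 2 decimals corresponds to cells roughly 1.1 km tall, and each extra decimal shrinks the cell by a factor of ten.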
I couldn't find anything on this, hence the new thread.
We have an application where the data is stored in SQL Server. Some tables have columns of the type "Geography". We use the SQL Server function STDistance to filter for data within a specified distance. Now we are looking into converting the application to PHP for various reasons, one of the biggest being the cost of ASP.NET and SQL Server. I can't seem to find anything on how MySQL handles the Geography data type. Am I right that it doesn't exist?
Isn't it possible to create your own functions in MySQL? I thought I could create a simple function that calculates whether a location is within the desired radius. What would be the most efficient way of doing this? Of course I could calculate for each row whether the coordinates are within the radius, but that feels inefficient and not very scalable. I was thinking that I would first select all the rows where lat is between x1 and x2 and lon is between y1 and y2, and then do the "heavy calculation" on those.
What would be the best way of doing this?
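The two-step idea from the question (a cheap bounding-box test first, the exact distance only for the survivors) can be sketched in Python as follows. This is an illustration of the logic, not MySQL code. It assumes a spherical Earth, a radius small enough that the search circle does not touch a pole or cross the 180° meridian, and hypothetical function names.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bounding_box(lat, lon, radius_m):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the search circle."""
    angular = radius_m / EARTH_RADIUS_M  # radius as an angle in radians
    dlat = math.degrees(angular)
    # Widest longitude offset reached by the circle at this latitude.
    dlon = math.degrees(math.asin(math.sin(angular) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def within_radius(center, candidates, radius_m):
    """Cheap box filter first, exact distance check only for points inside the box."""
    lat0, lon0 = center
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat0, lon0, radius_m)
    hits = []
    for lat, lon in candidates:
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            continue  # rejected by the box; no trigonometry needed
        if haversine_m(lat0, lon0, lat, lon) <= radius_m:
            hits.append((lat, lon))
    return hits
```

In a database, the box test maps to plain range conditions on the lat and lon columns, which ordinary indexes can serve, so the expensive distance expression only runs on the rows that pass.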
Is it possible to use MySQL's spatial search to find points inside of a 3D polygon?
Or better still, is it possible to use MySQL to find the values on the surface of an HSV cylinder?
Just so you know, MySQL only uses bounding boxes for its analysis and is not very accurate. I would not recommend using it for any spatial analysis. Look at the documentation and you will see that most features are not implemented.
You could certainly use PostGIS or SpatiaLite (for PostgreSQL and SQLite respectively) to do what you require. For the projection, just pick -1, which means no projection.