I couldn't find anything on this, hence the new thread.
We have an application whose data is stored in SQL Server. Some tables have columns of the type "Geography", and we use the SQL Server function STDistance to filter for rows within a specified distance. We are now looking into converting the application to PHP for several reasons, one of the weightiest being the cost of ASP.NET and SQL Server. I can't seem to find anything on how MySQL handles a Geography data type. Am I right that it doesn't exist?
Isn't it possible to create your own functions in MySQL? I thought I could create a simple function that calculates whether a location is within the desired radius. What would be the most efficient way of doing this? Of course I could calculate for each row whether the coordinates are within the radius, but that feels inefficient and not very scalable. I was thinking that I would first select all the rows where x1 > lat > x2 and y1 > lon > y2, and only then do the "heavy calculation" on those rows.
What would be the best way of doing this?
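The two-step idea from the question (a cheap rectangle test first, then the expensive distance calculation only for the survivors) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names, the `(name, lat, lon)` tuple layout and the spherical Earth radius are made up for the example, and the bounding box is not valid if the search circle contains a pole or crosses the 180° meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth with mean radius


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the search circle.

    Assumes the circle does not contain a pole or cross the antimeridian.
    """
    d = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    dlat = math.degrees(d)
    # Longitude degrees shrink towards the poles, so the box gets wider in degrees.
    dlon = math.degrees(math.asin(math.sin(d) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (the "heavy calculation")."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(points, lat, lon, radius_km):
    """Filter (name, lat, lon) tuples: rectangle test first, exact distance second."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_km)
    hits = []
    for name, plat, plon in points:
        if not (min_lat <= plat <= max_lat and min_lon <= plon <= max_lon):
            continue  # rejected cheaply, no trigonometry needed
        if haversine_km(lat, lon, plat, plon) <= radius_km:
            hits.append(name)
    return hits
```

In a database, the rectangle test would be the part that can use ordinary indexes on the lat and lon columns; the distance function then only runs on the rows that survive it.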
Related
I am working on a geo-enabled application with an obvious use case: searching for users within some distance of a given user's location. I currently use a MySQL database. Since the user table is expected to grow very large, the time to get results will keep increasing (far too long if the query has to traverse the entire table).
I am using InnoDB because my tables need many things that MyISAM can't do. I have tried Mongo and took it for a test drive by adding 5 million users and running some tests over them. Now I am curious what MySQL can offer in the same situation, since I would prefer MySQL if its results come reasonably close to Mongo's.
My user table has other fields plus a lat field and a lng field (both indexed), yet queries still take a long time. Can anyone suggest a better design approach for faster results?
Mongo has a bunch of very useful built-in geospatial commands and aggregations that are ideal for your case of finding users near a given point. Others include within, which finds points inside a bounding box or polygon. In your case the geoNear aggregation is perfect, and it can also return the calculated distance from the given point.
You will have to code a lot of that functionality yourself with MySQL. There is also PostGIS, an add-on for Postgres. Postgres is the classic open-source MySQL competitor, and PostGIS has been around longer than Mongo; it is presumably the database behind OpenStreetMap, government GIS and similar.
But to the problem: you need to use the GeoJSON format and a 2dsphere index, which you might not be using. Post a single record of your data.
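For illustration, here is roughly what a GeoJSON-style user record and a geoNear aggregation stage look like, written as plain Python dictionaries so the sketch runs without a database. The field names `location` and `distance_m` and both helper functions are assumptions made for this example, not something from the question. Note that GeoJSON stores coordinates as [longitude, latitude], and that distances for GeoJSON points are expressed in meters.

```python
def user_document(user_id, lat, lng):
    """Build a user record with a GeoJSON Point (hypothetical field names)."""
    return {
        "_id": user_id,
        # GeoJSON order is [longitude, latitude], not [latitude, longitude].
        "location": {"type": "Point", "coordinates": [lng, lat]},
    }


def near_pipeline(lat, lng, max_km):
    """Build an aggregation pipeline with a single $geoNear stage."""
    return [
        {
            "$geoNear": {
                "near": {"type": "Point", "coordinates": [lng, lat]},
                "distanceField": "distance_m",  # output field, name is an assumption
                "maxDistance": max_km * 1000,   # meters for GeoJSON points
                "spherical": True,
            }
        }
    ]


# The 2dsphere index would be declared on the GeoJSON field, e.g. in the
# mongo shell: db.users.createIndex({location: "2dsphere"})
```

With a driver, the list returned by `near_pipeline` would be passed to the collection's aggregate call; `$geoNear` has to be the first stage of the pipeline.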
I recently asked a question about many-to-many relationships and how they can be used to calculate intersections, and it was answered very well. Now there is another nice-to-have requirement for our cube: extending that to more data. The general question remains: How many orders contain both product x and y?
However, the measure groups are now much larger, currently about 1.4 billion rows. I tried to implement this using the method described in the other post, with several hidden cross-referenced measure groups. However, this is simply too much for our hardware: the cube reaches sizes close to 0.5 TB, and queries take several minutes to complete.
Now I would like to try another option: Can I access our relational database in a calculated measure? It seems I can, using UDFs as described in this article. I could write a function in C# that queries our relational database and returns all the orders that contain the products chosen by the user. But in order to do that, I need to pass all the dimensional data the user has selected to the UDF. I also need the UDF to return the calculated value so it can be output as the result of the calculated member. Is that possible? If yes, how? The example Microsoft provides only includes a small deterministic string function as the UDF.
Here are my own results:
It seems to be possible, though with limitations. The class Microsoft.AnalysisServices.AdomdServer.Context can give you the current member of each hierarchy; however, this does not work with Excel-style subselects. It contains either a single member or the All member.
Another option is to get the MDX query using the DMV SELECT * FROM $System.DISCOVER_SESSIONS. One column of that view contains the last MDX query for a given session. However, so as not to overwrite your own last query, you must open a new connection rather than use the current one. The session ID can be obtained through Microsoft.AnalysisServices.AdomdServer.Context.CurrentConnection.SessionID.
The second approach is OK for our use case. It does not let you handle axes, though: the UDF has cell scope, but you don't know which cell you are in. If any of you knows anything about that last bit, please tell me. Thanks!
Well I want your opinions about this case:
I need a database that will have... two or three tables at most, one of them will have points (latitude, longitude) and some other info.
It's really simple what I need: Get the points within a given radius.
I'm not asking how to do it (but any advice is more than welcome, especially if it's about good practices); I want to know whether using MySQL's spatial support would help. Since what I need is fairly easy to get with just one query, what I expect from spatial support is better performance.
So, are the spatial indexes going to help noticeably? I don't think the table will store that many points. I'd say no more than 200.
If it's really only 200 points, I recommend you do without: This makes it much easier to write portable SQL (which I consider an important thing).
Write your SQL so that longitude and latitude are first checked against the precalculated minimums and maximums (giving you a rectangle), and only then check the radius. This way, the radius calculation is wasted only on the points that lie inside the rectangle but outside the circle. For evenly distributed points and a roughly square rectangle that is about 1 - pi/4, or roughly 21%, of the rows that pass the rectangle test.
I personally consider this an acceptable trade-off for having SQL that could, if need be, be executed against SQLite or whatever.
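A minimal sketch of that rectangle-then-radius pattern, run against an in-memory SQLite database to underline the portability point: the SQL uses nothing but BETWEEN, and the radius check happens in application code. The table name `points`, its columns, the helper names and the sample rows are assumptions made for the example; the rectangle math assumes a spherical Earth and a search area away from the poles and the 180° meridian.

```python
import math
import sqlite3

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def points_within(conn, lat, lon, radius_km):
    """Rectangle filter in portable SQL, exact radius check in Python."""
    d = radius_km / EARTH_RADIUS_KM
    dlat = math.degrees(d)
    dlon = math.degrees(math.asin(math.sin(d) / math.cos(math.radians(lat))))
    rows = conn.execute(
        "SELECT name, lat, lon FROM points "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? "
        "ORDER BY name",
        (lat - dlat, lat + dlat, lon - dlon, lon + dlon),
    ).fetchall()
    return [name for name, plat, plon in rows
            if haversine_km(lat, lon, plat, plon) <= radius_km]


def make_demo_db():
    """Create a tiny in-memory table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE points (name TEXT, lat REAL, lon REAL)")
    conn.executemany(
        "INSERT INTO points VALUES (?, ?, ?)",
        [("Berlin", 52.52, 13.405),
         ("Potsdam", 52.3906, 13.0645),
         ("Hamburg", 53.5511, 9.9937)],
    )
    return conn
```

With only a couple of hundred rows, even the rectangle step is optional; it mainly matters once the table grows and the lat and lon columns are indexed.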
I'm starting a new project and although I'm used to MySQL, I'm worried about efficiency. I'm open to other options, and graph databases sound intriguing.
I will need to find similar users based on location and rating-like values. In MySQL I would probably have to join across two many-to-many relationships and order by the distance of both location and those values (probably Euclidean distance). MySQL seems slow with things like that.
I will also need to do things like find the 10 nodes whose text starts with a given substring and that have the largest number of connections (which is an autocomplete, I guess).
Would Neo4j or another graph database do this easily and efficiently?
Yes, Neo4j is certainly more appropriate than MySQL. I've used it myself for similarity searches and continue to do so. Check out Cypher or Gremlin, depending on how complex your criteria are; together with the built-in Lucene index, it's terrific.
Examples of what you may be trying to achieve: http://docs.neo4j.org/chunked/stable/data-modeling-examples.html
Is there any way to have MySQL order results by how close they 'sound' to a search term?
I'm trying to order fields that contain user input of city names. Variations and misspellings exist, and I'd like to show the 'closest' matches at the top.
I know Soundex may not be the best algorithm for this, but if it (or another method) could be reasonably successful, it may be worth having the sorting done by the database.
Soundex is no good for this sort of thing because different words can give you the same Soundex result and will therefore sort arbitrarily. A better solution is the Levenshtein edit distance algorithm, and you may be able to implement it as a function in your database: link to a Levenshtein implementation as a MySQL stored function.
You can also check out this SO link. It contains a SQL Server (T-SQL-specific) implementation of the algorithm, but it should be possible to port. The mechanics of the algorithm are fairly simple, needing only a 2D array and a loop over the two strings.
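To show how little machinery the algorithm needs, here is a short Python sketch of the classic 2D-array formulation, plus a helper that orders candidate city names by their distance to a search term. It is not the stored function from the links above, only an illustration; the function names and the case-insensitive comparison are choices made for this example.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b using a full 2D table."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all i characters of a
    for j in range(cols):
        dist[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
    return dist[-1][-1]


def closest_first(term, candidates):
    """Sort candidates so the smallest edit distance to term comes first."""
    return sorted(candidates, key=lambda c: levenshtein(term.lower(), c.lower()))
```

A port to a stored function follows the same loops. The cost is proportional to the product of the two string lengths for every row compared, so on a large table it may be worth narrowing the candidates first.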