How to find routes near a point - mysql

I have the lat and lon coordinates of a spot and a radius within which I want to search for stops. I then call a function from google-maps that queries my GTFS database with those variables, but I don't know what the query should look like. Can I select the matching routes using only an SQL query? If so, how?
If it can't be done using only SQL, what are my options?
Sorry for the broad question and the lack of code samples; I'm new to this and sometimes need basic conceptual guidance.
Thanks for the help.

(Caveat: I'm not that familiar with MySQL and these queries are untested.)
First define a function in MySQL to calculate the distance between pairs of lat-long points. See e.g. this answer. Then, to select stops near a given point:
SELECT stop_id
FROM stops
WHERE getDistanceBetweenPointsNew(stop_lat, stop_lon, my_lat, my_lon) < my_dist;
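The distance function itself is not shown above. As a rough sketch of what such a function usually computes, here is the haversine great-circle formula in Python. The names haversine_km and stops_within, and the choice of kilometres, are assumptions made for illustration; they are not part of GTFS or of the linked answer.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed; pick the unit you need)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def stops_within(stops, my_lat, my_lon, my_dist_km):
    """Filter (stop_id, stop_lat, stop_lon) tuples to those closer than my_dist_km."""
    return [stop_id for stop_id, lat, lon in stops
            if haversine_km(lat, lon, my_lat, my_lon) < my_dist_km]

# Hypothetical stops: one at the origin, one a degree of longitude away.
nearby = stops_within([("a", 0.0, 0.0), ("b", 0.0, 1.0)], 0.0, 0.0, 50.0)
```

The same arithmetic can be written as a stored function in MySQL so that the filter runs inside the query rather than in application code.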
The GTFS spec offers no direct way to find the routes associated with a stop. To do so, you'll need to join trips against stop_times, which will be slow if your stop_times table is large and/or unindexed. I suggest pre-calculating a table that associates stops with routes:
CREATE TABLE route_stop AS
SELECT DISTINCT route_id, stop_id
FROM trips
JOIN stop_times
ON trips.trip_id = stop_times.trip_id;
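To see what the pre-calculated table ends up containing, here is a small sketch that runs the same statement against an in-memory SQLite database from Python. The sample rows and the index name idx_route_stop_stop are made up for illustration, and SQLite stands in for MySQL only so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trips (route_id TEXT, trip_id TEXT);
    CREATE TABLE stop_times (trip_id TEXT, stop_id TEXT, stop_sequence INTEGER);
    INSERT INTO trips VALUES ('R1', 'T1'), ('R1', 'T2'), ('R2', 'T3');
    INSERT INTO stop_times VALUES
        ('T1', 'S1', 1), ('T1', 'S2', 2),
        ('T2', 'S1', 1), ('T2', 'S2', 2),
        ('T3', 'S2', 1), ('T3', 'S3', 2);

    -- Same statement as above: one row per distinct (route, stop) pair.
    CREATE TABLE route_stop AS
        SELECT DISTINCT route_id, stop_id
        FROM trips
        JOIN stop_times ON trips.trip_id = stop_times.trip_id;

    -- An index on stop_id (hypothetical name) keeps later lookups by stop fast.
    CREATE INDEX idx_route_stop_stop ON route_stop (stop_id);
""")

route_stop = sorted(conn.execute("SELECT route_id, stop_id FROM route_stop").fetchall())
```

Although trips T1 and T2 visit the same stops, DISTINCT collapses them, so route R1 contributes only two rows.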
Assuming this table has been created, you can find the list of routes that stop near a given point like so:
SELECT DISTINCT route_id
FROM stops
JOIN route_stop
ON stops.stop_id = route_stop.stop_id
WHERE getDistanceBetweenPointsNew(stop_lat, stop_lon, my_lat, my_lon) < my_dist;

Sequel query causes script to hang and causes computer to slow down

I have the following code:
team_articles = user.npt_teams.to_a.inject({}) {|arts,team|
arts.merge({ team.name =>
NptArticle.join(:npt_authors).join(:users).join(:npt_teams).where(:npt_teams__id => team.id).to_a.uniq})
}
It causes my terminal to stop responding and my MacBook to slow down.
In MySQL Workbench it gets a response instantly.
One suggestion was to create a lighter version of the NptArticle object, but I'm not sure how to create a version that pulls fewer columns, so any suggestion for fixing this issue would be great.
This is the table.
The generated SQL is:
SELECT * FROM `npt_articles` INNER JOIN `npt_authors` INNER JOIN `users` INNER JOIN `npt_teams` WHERE (`npt_teams`.`id` = 1)
I'd love to upgrade the Ruby version but I can't. I'm working off an old code-base and this is the version of Ruby it uses. There are plans to re-build in the future with more modern tools but at the moment this is what I have to work with.
Results from:
EXPLAIN SELECT * FROM npt_articles INNER JOIN npt_authors INNER JOIN users INNER JOIN npt_teams WHERE (npt_teams.id = 1);
So for npt_teams.id = 1 you are performing a cross join across all of:
npt_articles
npt_authors
users
If the number of articles, authors and users is even moderate, you will get a huge number of results because the joins aren't restricted. Normally you would add a join condition, something like:
INNER JOIN `npt_authors` ON (npt_articles.ID=npt_authors.articleID)
(it depends on how your database relates).
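To illustrate the difference a join condition makes, here is a small sketch using an in-memory SQLite database from Python. The column names article_id and user_id are assumptions, since the real schema is not shown, and SQLite is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, minimal versions of two of the tables.
    CREATE TABLE npt_articles (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE npt_authors (id INTEGER PRIMARY KEY, article_id INTEGER, user_id INTEGER);
    INSERT INTO npt_articles VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO npt_authors VALUES (1, 1, 10), (2, 2, 10), (3, 3, 11);
""")

# No ON clause: every article is paired with every author row.
unrestricted = conn.execute(
    "SELECT COUNT(*) FROM npt_articles INNER JOIN npt_authors"
).fetchone()[0]

# With an ON clause: each article is paired only with its own author rows.
restricted = conn.execute(
    "SELECT COUNT(*) FROM npt_articles INNER JOIN npt_authors "
    "ON npt_articles.id = npt_authors.article_id"
).fetchone()[0]
```

With three rows in each table the unrestricted join already yields 3 x 3 = 9 rows, against 3 for the restricted one; the gap grows multiplicatively with every extra table.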
In addition, you would need indexes on the fields that relate the tables to each other, which will speed things up as well.
Look at the rows column of the EXPLAIN SELECT output. That is how many rows are processed for each part of the join. To estimate the total number of rows processed, multiply these numbers together: 1 x 657 x 269723 x 956188 is roughly 1.7 x 10^14, which is rather a lot.
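The multiplication is easy to check with a couple of lines of Python; the list simply holds the per-table row counts quoted above.

```python
import math

# Row estimates from the "rows" column of the EXPLAIN output, one per joined table.
rows_per_table = [1, 657, 269723, 956188]

# An unrestricted join processes roughly the product of these counts.
estimated_rows = math.prod(rows_per_table)
print(f"{estimated_rows:.2e} row combinations")
```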
I'm not a Ruby wiz, so perhaps somebody else can post how you do this in Sequel.

Draw routes with leaflet from GTFS data

How can I extract the stops and their stop_sequence for a given route_id from GTFS data using MySQL?
I want this because I'm trying to draw the routes with Leaflet, which requires the stop coordinates in the right order.
I've only found the stop_sequence information in the stop_times.txt file, but there it is only valid for a single trip on the route.
This answer only tells me which stops are associated with a certain route, not their correct order.
I think you've arrived at your own answer here: Stops are ordered in sequence only along a specific trip, of which a route normally has many. This is meant to accommodate routes that have multiple branches or that change their path at certain times, such as a route that makes a diversion through an industrial park during rush hour.
What you'll need to do is first identify a trip that is typical of the route you intend to plot, and note its trip ID. To get a list of all the trips along a specific route, run a query like
SELECT id, headsign, short_name, direction_id
FROM trips
WHERE route_id = <route_id>;
Once you've selected a trip, getting the list of the stops it visits, in order, is straightforward:
SELECT code, name, lat, lon, arrival_time, departure_time
FROM stops
INNER JOIN stop_times ON stop_times.stop_id = stops.id
WHERE trip_id = <trip_id>
ORDER BY stop_sequence ASC;
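(I've added a few extra fields here for clarity; it sounds like all you really need are the lat and lon fields included in the results. Note that the column names in these queries are shortened: in the GTFS files themselves the corresponding fields are trip_id, trip_headsign and trip_short_name in trips, and stop_id, stop_code, stop_name, stop_lat and stop_lon in stops.)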
So how do you identify a "typical" trip for the route you want to plot? Often the headsign information for a trip indicates its branch or destination. If you need to be more specific—identifying trips that run between certain hours on certain days, for instance—the information in the calendars and calendar_dates tables can help you narrow these down.
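Putting the two queries together, here is a rough Python sketch against an in-memory SQLite database. It uses the shortened column names from the queries above. Picking the trip with the most stops as the "typical" one is only a heuristic I am assuming for the example, and the function names longest_trip and trip_latlngs are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trips (id TEXT, route_id TEXT, headsign TEXT);
    CREATE TABLE stops (id TEXT, name TEXT, lat REAL, lon REAL);
    CREATE TABLE stop_times (trip_id TEXT, stop_id TEXT, stop_sequence INTEGER);
    INSERT INTO trips VALUES ('T1', 'R1', 'Downtown'), ('T2', 'R1', 'Downtown via Park');
    INSERT INTO stops VALUES ('A', 'First', 45.0, -73.0), ('B', 'Second', 45.1, -73.1),
                             ('C', 'Third', 45.2, -73.2);
    INSERT INTO stop_times VALUES ('T1', 'A', 1), ('T1', 'C', 2),
                                  ('T2', 'C', 3), ('T2', 'A', 1), ('T2', 'B', 2);
""")

def longest_trip(conn, route_id):
    """Heuristic (an assumption): treat the trip with the most stops as typical."""
    row = conn.execute("""
        SELECT trips.id
        FROM trips
        JOIN stop_times ON stop_times.trip_id = trips.id
        WHERE trips.route_id = ?
        GROUP BY trips.id
        ORDER BY COUNT(*) DESC, trips.id
        LIMIT 1""", (route_id,)).fetchone()
    return row[0] if row else None

def trip_latlngs(conn, trip_id):
    """Return [lat, lon] pairs for one trip, ordered by stop_sequence."""
    rows = conn.execute("""
        SELECT lat, lon
        FROM stops
        INNER JOIN stop_times ON stop_times.stop_id = stops.id
        WHERE trip_id = ?
        ORDER BY stop_sequence ASC""", (trip_id,)).fetchall()
    return [[lat, lon] for lat, lon in rows]

trip_id = longest_trip(conn, "R1")
latlngs = trip_latlngs(conn, trip_id)
```

The resulting list of [lat, lon] pairs can be serialized to JSON and handed to Leaflet's L.polyline, which accepts an array of such pairs.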

MySQL How to efficiently compare multiple fields between tables?

MySQL is not my area of expertise. I wrote the query below, and it is getting increasingly slow: it now takes 5 minutes or so with 100k rows in EquipmentData and about 30k in EquipmentDataStaging (which to me is very little data):
CREATE TEMPORARY TABLE dataCompareTemp
SELECT eds.eds_id FROM equipmentdatastaging eds
INNER JOIN equipment e ON e.e_id_string = eds.eds_e_id_string
INNER JOIN equipmentdata ed ON e.e_id = ed.ed_e_id
AND eds.eds_ed_log_time=ed.ed_log_time
AND eds.eds_ed_unit_type=ed.ed_unit_type
AND eds.eds_ed_value = ed.ed_value
I am using this query to compare data rows pulled from a client's device with the data already sitting in their database. From there I take the temp table and use its IDs to make conditional decisions. I have indexed e_id_string and e_id; nothing else is indexed. I know it looks stupid to compare all this information, but the client's system is spitting out redundant data and I am using this query to find it. Any help would be greatly appreciated, whether it is a different approach in SQL or in MySQL management. I feel like MSSQL handles requests like this much better, but that is probably because I have something set up incorrectly.
Tips:
Index every column that is used in an ON or WHERE condition. Here that means eds_ed_log_time, eds_e_id_string, eds_ed_unit_type, eds_ed_value, ed_e_id, ed_log_time, ed_unit_type and ed_value.
Try changing the syntax to SELECT STRAIGHT_JOIN ..., which forces MySQL to join the tables in the order they are listed.
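As a sketch of the indexing tip, the example below builds the three tables in an in-memory SQLite database from Python and runs the comparison query. It uses one composite index per table instead of one index per column; that is a variation I am assuming is suitable here, not something the tips prescribe. The index names and sample rows are made up, and the column types are guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE equipment (e_id INTEGER PRIMARY KEY, e_id_string TEXT);
    CREATE TABLE equipmentdata (
        ed_e_id INTEGER, ed_log_time TEXT, ed_unit_type TEXT, ed_value REAL);
    CREATE TABLE equipmentdatastaging (
        eds_id INTEGER PRIMARY KEY, eds_e_id_string TEXT,
        eds_ed_log_time TEXT, eds_ed_unit_type TEXT, eds_ed_value REAL);

    -- Composite index covering every column compared in the join (hypothetical name).
    CREATE INDEX idx_ed_compare
        ON equipmentdata (ed_e_id, ed_log_time, ed_unit_type, ed_value);
    CREATE INDEX idx_e_id_string ON equipment (e_id_string);

    INSERT INTO equipment VALUES (1, 'pump-1'), (2, 'pump-2');
    INSERT INTO equipmentdata VALUES
        (1, '2020-01-01 00:00:00', 'psi', 10.5),
        (2, '2020-01-01 00:00:00', 'psi', 7.0);
    INSERT INTO equipmentdatastaging VALUES
        (100, 'pump-1', '2020-01-01 00:00:00', 'psi', 10.5),  -- already stored
        (101, 'pump-1', '2020-01-01 00:05:00', 'psi', 11.0),  -- new reading
        (102, 'pump-2', '2020-01-01 00:00:00', 'psi', 7.0);   -- already stored
""")

# Same comparison as the original query: staging rows that already exist.
redundant_ids = [row[0] for row in conn.execute("""
    SELECT eds.eds_id
    FROM equipmentdatastaging eds
    INNER JOIN equipment e ON e.e_id_string = eds.eds_e_id_string
    INNER JOIN equipmentdata ed ON e.e_id = ed.ed_e_id
        AND eds.eds_ed_log_time = ed.ed_log_time
        AND eds.eds_ed_unit_type = ed.ed_unit_type
        AND eds.eds_ed_value = ed.ed_value
    ORDER BY eds.eds_id""")]
```

In MySQL the equivalent would be a CREATE INDEX on the same column list; running EXPLAIN on the query afterwards shows whether the optimizer actually uses it.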

remove duplicates in mysql database

I have a table with latitude and longitude columns. In most cases the value extends well past the decimal point, e.g. -81.7770051972473, but for some records it has only two decimal places, e.g. -81.77.
How do I find duplicates and remove one of the duplicates, but only for the records that extend beyond two decimal places?
Using some creative substring, float, and charindex logic, I came up with this:
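-- Delete rows whose latitude equals another row's latitude truncated to two decimals.
DELETE l1
FROM latlong l1
INNER JOIN (
    -- Truncate each latitude to two decimal places, e.g. -81.7770051972473 -> '-81.77'.
    -- MySQL casts to CHAR (not varchar) and SUBSTRING positions start at 1.
    SELECT
        id,
        SUBSTRING(CAST(latitude AS CHAR), 1, INSTR(CAST(latitude AS CHAR), '.') + 2) AS truncatedLat
    FROM latlong
) l2
    ON l1.id <> l2.id
    -- DECIMAL is used because CAST(... AS FLOAT) is not available in older MySQL versions.
    AND l1.latitude = CAST(l2.truncatedLat AS DECIMAL(10, 2));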
Before running, try select * in lieu of delete l1 first to make sure you're deleting the right rows.
I should note that this worked on SQL Server using functions I know exist in MySQL, but I wasn't able to test it against a MySQL instance, so there may be some little tweaking that needs to be done. For example, in SQL Server, I used charindex instead of instr, but both should work similarly.
Not sure how to do that purely in SQL.
I have used scripting languages like PHP or CFML to solve similar needs by building a query to pull the records, then looping over the record set and performing some comparison. If the comparison is true, then VERY CAREFULLY call another function, passing in the record ID, and delete the record. I would probably even leave the record in the table and instead mark another column, such as isDeleted.
If you are more ambitious than I, it looks like this thread is close to what you want
Deleting Duplicates in MySQL
finding multi column duplicates mysql
Using an external programming language (Perl, PHP, Java, Assembly...):
1. Select * from database.
2. For each row, select * from database where newLat >= round(oldLat,2) and newLat < round(oldLat,2) + .01, with the same criteria for longitude.
3. Keep one of the matches based on whatever criteria you choose. To keep the lowest primary key, sort by it and skip the first result.
4. Delete everything else.
5. Repeat from step 2 for the next row, skipping any records you have already deleted.
If for some reason you want to identify everything with greater than 2-digit precision:
select * from database where lat != round(lat,2) or long != round(long,2)
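The steps above can be sketched in Python roughly as follows. Instead of re-querying for every row, this version groups rows by their coordinates rounded to two decimals, which is a simplification of the range test in step 2 rather than an exact equivalent. The function name find_duplicate_ids and the (id, lat, lon) row shape are assumptions for the example.

```python
from collections import defaultdict

def find_duplicate_ids(rows, places=2):
    """Given (id, lat, lon) rows, return the ids to delete.

    Rows whose coordinates agree after rounding to `places` decimals are
    treated as duplicates; the lowest id in each group is kept.
    """
    groups = defaultdict(list)
    for row_id, lat, lon in rows:
        groups[(round(lat, places), round(lon, places))].append(row_id)

    to_delete = []
    for ids in groups.values():
        ids.sort()
        to_delete.extend(ids[1:])  # skip the first (lowest) id, delete the rest
    return sorted(to_delete)

# Hypothetical sample data: rows 1 and 2 collapse to the same rounded point.
rows = [
    (1, 28.54, -81.77),
    (2, 28.5400012, -81.7700051972473),
    (3, 30.0, -80.0),
]
ids_to_delete = find_duplicate_ids(rows)
```

The returned ids can then be passed to a DELETE ... WHERE id IN (...) statement, or used to set an isDeleted flag if you would rather not remove rows outright.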

SQL Table design for the following scenario

I am using MySQL and I am having the following scenario:
Table nodes: node_id, lat, lng, name
Table links: node1, node2, name
So there are two tables. The nodes table stores every point with its latitude and longitude. The links table stores node1 and node2, each of which references nodes.
As far as I know, in MySQL and Rails we can't really have two foreign keys pointing to the same table (correct me if I am wrong). If, for example, I want to find the starting node name and the ending node name, how would I construct my SQL statement? I tried
SELECT nodes.name from Nodes, links WHERE nodes.node_id = node1, which kind of works but is very slow (I have fewer than 10k records in each table). If I want to find the names of both the starting node and the ending node, how do I do it? And what if I want to restrict the starting node to lat > x and the ending node to lat < y to find all matching links?
Thank you.
Regards,
Andy.
Yes, you can do this. Your table structure looks good.
Your query is also fine. Make sure you have proper indexes on node_id to help performance.
You can run two queries, one for each name, or you can use a union or two subqueries if you want all the results from a single query.
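Another option, not mentioned above, is to join the nodes table twice under different aliases. MySQL does allow two foreign keys that point at the same table. Here is a rough sketch using an in-memory SQLite database from Python; the sample rows, index names and the function name links_between are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (node_id INTEGER PRIMARY KEY, lat REAL, lng REAL, name TEXT);
    -- Both columns reference the same table; this is allowed.
    CREATE TABLE links (
        node1 INTEGER REFERENCES nodes(node_id),
        node2 INTEGER REFERENCES nodes(node_id),
        name TEXT);
    CREATE INDEX idx_links_node1 ON links (node1);
    CREATE INDEX idx_links_node2 ON links (node2);
    INSERT INTO nodes VALUES (1, 10.0, 20.0, 'Alpha'), (2, 30.0, 40.0, 'Beta'),
                             (3, 50.0, 60.0, 'Gamma');
    INSERT INTO links VALUES (1, 2, 'A-B'), (2, 3, 'B-C'), (3, 1, 'C-A');
""")

def links_between(conn, min_start_lat, max_end_lat):
    """Links whose start node has lat > min_start_lat and end node has lat < max_end_lat."""
    return conn.execute("""
        SELECT links.name, n1.name, n2.name
        FROM links
        JOIN nodes AS n1 ON n1.node_id = links.node1  -- starting node
        JOIN nodes AS n2 ON n2.node_id = links.node2  -- ending node
        WHERE n1.lat > ? AND n2.lat < ?
        ORDER BY links.name""", (min_start_lat, max_end_lat)).fetchall()

filtered = links_between(conn, 20.0, 45.0)
everything = links_between(conn, 0.0, 100.0)
```

Each result row carries the link name plus the start and end node names, so both names come back from a single query.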