I'm trying to define a GTFS feed for a ferry crossing between 2 ports (A <-> B). There may be 2 ferries running between these ports.
routes.txt
route_id,route_short_name,route_long_name,route_desc,route_type
AB,A-B,A << >> B,Ferry travelling between A and B,4
calendar.txt
service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
FULLWEEK,1,1,1,1,1,1,1,20180103,20180430
trips.txt
route_id,service_id,trip_id,trip_headsign,direction_id,shape_id,wheelchair_accessible,bikes_allowed
AB,FULLWEEK,a_b,B Dest,0,ab_shape,1,1
AB,FULLWEEK,b_a,B Dest,1,ab_shape,1,1
stops.txt
stop_id,stop_name,stop_desc,stop_lat,stop_lon,location_type
A,B-A,Travelling from B to A,xxxx,xxxx,1
B,A-B,Travelling from A to B,xxxx,xxxx,1
stop_times.txt
trip_id,arrival_time,departure_time,stop_id,stop_sequence
a_b,02:45:00,03:00:00,A,1
a_b,04:45:00,05:00:00,A,1
b_a,00:45:00,01:00:00,B,2
b_a,03:45:00,04:00:00,B,2
^^ this is where the errors appear in the feed validator
Duplicate stop_sequence in trip_id a_b
I can't work out whether I should be using 2 routes instead of 1 (and stop using the direction_id value in trips.txt), or what the sequence of the timetables should be, since the timetables at the two ports may not line up as a single sequence when multiple ferries are running between them.
Thank you.
Figured it out: basically, trips.txt must contain an entry for every scheduled departure. I was treating trips like routes, when in fact every departure is its own "trip".
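A minimal sketch of what that looks like, using the departure times from the question. The 60-minute crossing time and the trip_id naming scheme (a_b_0300 and so on) are assumptions made up for illustration. Each departure becomes its own row in trips.txt and gets two stop_times.txt rows, one per port, so stop_sequence increases within a trip and is never duplicated.

```python
import csv
import io
from datetime import timedelta

CROSSING = timedelta(minutes=60)  # assumed crossing time, not from the original feed

# (origin stop, destination stop, direction_id, departure times from the question)
DEPARTURES = [
    ("A", "B", 0, ["03:00:00", "05:00:00"]),
    ("B", "A", 1, ["01:00:00", "04:00:00"]),
]

def parse(hms):
    """Turn HH:MM:SS into a timedelta since midnight."""
    h, m, s = (int(p) for p in hms.split(":"))
    return timedelta(hours=h, minutes=m, seconds=s)

def fmt(td):
    """Format a timedelta as HH:MM:SS (hours may exceed 24, which GTFS allows)."""
    total = int(td.total_seconds())
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"

def build_rows(departures=DEPARTURES):
    trips, stop_times = [], []
    for origin, dest, direction, times in departures:
        for dep in times:
            # One trip per scheduled departure (hypothetical id scheme).
            trip_id = f"{origin.lower()}_{dest.lower()}_{dep[:2]}{dep[3:5]}"
            trips.append(["AB", "FULLWEEK", trip_id, f"{dest} Dest", direction])
            arrive = fmt(parse(dep) + CROSSING)
            stop_times.append([trip_id, dep, dep, origin, 1])
            stop_times.append([trip_id, arrive, arrive, dest, 2])
    return trips, stop_times

def to_csv(header, rows):
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

trips, stop_times = build_rows()
print(to_csv(["route_id", "service_id", "trip_id", "trip_headsign", "direction_id"], trips))
print(to_csv(["trip_id", "arrival_time", "departure_time", "stop_id", "stop_sequence"], stop_times))
```

With two ferries running, the trips simply interleave; nothing in the feed needs to say which vessel operates which trip.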
Slot(sid, wall, x, y)
Hold(hid, color, desc)
Route(rid, name, circuit)
Placement(rid, hid, sid)
Slot represents the possible locations for a hold. sid is a surrogate key, the wall is the name of the wall (e.g., "north," "front"), (x,y) is the location on the wall, measured in meters.
Hold manages the inventory of shaped resin pieces that simulate outcroppings on which to step or grab.
Route is a set of holds attached to particular slots. name is a descriptive text string. circuit is a label indicating that this route is part of a set of related routes.
sid, hid, rid are integers.
Question: A conflict is when two holds occupy the same slot. A set of routes are compatible if they have no conflicts. Write a query to check that all the routes in the circuit called Beginner are compatible. Your query should return the sid that is causing the conflict.
As I understand it, a conflict is a single slot that is claimed by more than one hold, so the query should group by sid and count the distinct holds placed there. This returns every conflicting sid in the Beginner circuit:
SELECT Placement.sid
FROM Placement JOIN Route ON Route.rid = Placement.rid
WHERE Route.circuit = 'Beginner'
GROUP BY Placement.sid
HAVING COUNT(DISTINCT Placement.hid) > 1
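One way to sanity-check a conflict query is to run it against a handful of made-up rows in an in-memory SQLite database. The sketch below reads a conflict as one sid used by more than one distinct hid, so it groups by sid. The sample rows and the conflicting_slots helper are illustrative assumptions, and only the columns the query needs are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Route(rid INTEGER PRIMARY KEY, name TEXT, circuit TEXT);
CREATE TABLE Placement(rid INTEGER, hid INTEGER, sid INTEGER);
-- Made-up sample data: routes 1 and 2 are Beginner, route 3 is not.
INSERT INTO Route VALUES
  (1, 'Warm-up', 'Beginner'), (2, 'Slab', 'Beginner'), (3, 'Roof', 'Expert');
INSERT INTO Placement VALUES
  (1, 10, 100), (1, 11, 101),
  (2, 12, 101), (2, 13, 102),
  (3, 14, 100);
""")

CONFLICTS = """
SELECT Placement.sid
FROM Placement JOIN Route ON Route.rid = Placement.rid
WHERE Route.circuit = ?
GROUP BY Placement.sid
HAVING COUNT(DISTINCT Placement.hid) > 1
ORDER BY Placement.sid
"""

def conflicting_slots(circuit):
    """Return the sids that carry more than one distinct hold within a circuit."""
    return [row[0] for row in conn.execute(CONFLICTS, (circuit,))]

print(conflicting_slots("Beginner"))  # slot 101 holds both hid 11 and hid 12
```

Slot 100 is also used twice in the sample data, but one of those placements belongs to a route outside the circuit, so it is not reported.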
I've got this kind of problem with Proton CEP: I currently have a "Sequence" EPA whose inputs are 2 events. But these events have different granularity: let's say I have A and B events; I receive N "A" events and M "B" events, where M << N.
So I'd like to have a rule like "if an event of type A is not consumed within X seconds, remove it"; otherwise I end up with a long queue of A events. I only need the rule to be evaluated for the (temporally) closest events.
In practice, I've got a fake room temperature sensor that sends its temperature updates every 5 seconds, and I've got another program that checks the external weather and sends it every minute.
Any idea how to solve this situation?
Thank you very much!
I guess that by "consume" you mean arrival, so do you want to evaluate the time the A event took to get to the Proton processor, or the time between A events? Do you want to ensure that the A events are indeed arriving continuously at a fixed rate? "Removing" an event means ignoring it, since events are not kept anywhere, just processed. In the end, what is it that you want to detect here? For example, the trend of the room temperature compared to the outside temperature? And then emit output events accordingly?
Thanks.
All the relevant event instances are kept within the local state of the corresponding EPA.
For each EPA operand you have policies which dictate how the state is gathered and how the matching set for event derivation is built.
For example, the instance selection policy, which is defined per operand and has the values "Each", "First" and "Last", tells you whether all A instances are examined for a match with a B instance, or only the first (in order of arrival), or only the last.
The consumption policy says what to do with the operand state once a sequence is detected: should the instances of, say, A which participated in the sequence be removed from the EPA's state (the "consume" value of the policy), or should they remain?
Playing with combinations of those policies should give you the behaviour you require.
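To make the interaction of the two policies concrete, here is a toy model in plain Python. It is not Proton's API and does not claim to reproduce Proton's exact semantics; the SequenceAB class and its parameter names are invented for illustration. It only shows how an instance selection policy ("first", "last", "each") and a consumption flag decide which buffered A instances are matched with an incoming B and which ones stay in the state afterwards.

```python
from collections import deque

class SequenceAB:
    """Toy two-operand sequence pattern: buffered A instances followed by a B."""

    def __init__(self, selection="last", consume=True):
        if selection not in ("first", "last", "each"):
            raise ValueError("selection must be 'first', 'last' or 'each'")
        self.selection = selection  # which buffered A instances take part in a match
        self.consume = consume      # whether matched A instances leave the state
        self.a_state = deque()      # buffered A instances, oldest first

    def on_a(self, value):
        self.a_state.append(value)

    def on_b(self, value):
        """Return the derived (a, b) pairs for this B instance."""
        if not self.a_state:
            return []
        if self.selection == "first":
            chosen = [self.a_state.popleft() if self.consume else self.a_state[0]]
        elif self.selection == "last":
            chosen = [self.a_state.pop() if self.consume else self.a_state[-1]]
        else:  # "each"
            chosen = list(self.a_state)
            if self.consume:
                self.a_state.clear()
        return [(a, value) for a in chosen]

# Room temperature arrives often (A), outside weather rarely (B).
epa = SequenceAB(selection="last", consume=True)
for reading in (20.0, 20.5, 21.0):
    epa.on_a(reading)
print(epa.on_b(5.0))      # only the most recent A is paired with B
print(list(epa.a_state))  # older A instances are still buffered in this toy model
```

In this toy model "each" combined with consumption empties the buffer on every B, while "last" pairs B with the temporally closest A but leaves older instances behind.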
I have four channels in my application: A, B, C, D. Some application users are only interested in documents contained in both channels A and B. This can also be expressed as A ∩ B. Others may be interested in a different combination, like A ∩ B ∩ D.
UPDATE
I don't think the following will work anyway
What has been suggested so far is that I can create a new channel (like A_B and A_B_D) for each combination and then tag the documents that meet the intersection criteria accordingly. But you can see how this could easily get out of hand since with just 4 channels, you end up with 15 combinations (11 extra channels).
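The count can be checked by enumerating the non-empty subsets of the four channels; the underscore naming (A_B, A_B_D) follows the convention mentioned above.

```python
from itertools import combinations

channels = ["A", "B", "C", "D"]

# Every non-empty subset of the channels is one possible intersection.
subsets = [
    combo
    for size in range(1, len(channels) + 1)
    for combo in combinations(channels, size)
]

# Subsets with two or more members would each need a new derived channel.
extra_channels = ["_".join(combo) for combo in subsets if len(combo) > 1]

print(len(subsets), len(extra_channels))  # 15 11
print(extra_channels)
```

In general n channels give 2**n - 1 combinations, so the number of derived channels roughly doubles with each channel added.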
Is there a way to do this with channels or perhaps some other feature I have missed in Couchbase?
The assignment of channels to a document is done via the sync function. So a document is not "contained" in a channel, but it may have attributes from which the channels to which it is routed can be derived. Only in the simplest default case, the document's channel attribute will route it to the channel having that value of that attribute.
So what you intend can be achieved by putting statements like
if (doc.areas.includes("A") && doc.areas.includes("B")) {
    channel("AB");
}
into the sync function. (I renamed the channels attribute to areas to make clear to the reader of the program that these are not the actual channels, but that channels are only derived from combinations of them.)
I have a Traveling Salesman problem with an additional constraint. (Note: this is not a homework problem, I've phrased it like one to abstract the problem.)
Given a list of events for a particular day with specific start times and end times, what is the optimal route to maximize the number of events attended? Assume the salesman/socialite must stay until the end of each event. While normal people might eyeball the list and solve this, the Socialite receives invites for up to 20 events each night.
How can one solve for something like this? So far I have investigated the directions API from Google Maps and ArcGIS route planner but the problem exceeds their capabilities.
I don't see how this is related to GIS. But it can be modelled as a linear integer programming problem and solved if the only objective is to maximise the number of events attended (and not minimise the distance travelled).
objective function: max z = x(1) + x(2) + .... + x(N)
constraints:
for all x(i): x(i) = 0 or 1
for all x(i), x(j) where i != j: M.x(i).E(i) + T(i,j) <= M.x(j).S(j)
x(i) equals 1 if the ith event is attended, 0 otherwise. S(i) and E(i) represent the start and end times of the events, in terms of hours into the day. For example, if the 5th event on the list starts at 10am and ends at 11.30am, then S(5) is 10 and E(5) is 11.5. T(i, j) is the travel time between the locations that host event i and event j. M is a large arbitrary constant for scaling.
The objective function maximises the number of events attended. The first constraint specifies that each event is either attended or not attended. The second constraint ensures that if two events are attended, then the socialite has enough time to travel, between the ending time of the first attended event and the starting time of the second attended event.
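For around 20 events, the same objective can also be computed without an integer programming solver. The sketch below is a different technique from the formulation above: it treats "event j can follow event i" (E(i) + T(i,j) <= S(j)) as an edge in a directed acyclic graph and finds the longest path by dynamic programming. It assumes every event has a positive duration, that travel times are non-negative, and that the socialite can start the evening at any event; the function name and the sample data are made up.

```python
def max_events(start, end, travel):
    """Return indices of a largest attendable set of events, in visiting order.

    start[i], end[i]: times of event i in hours into the day.
    travel[i][j]: travel time from the venue of event i to that of event j.
    """
    n = len(start)
    if n == 0:
        return []
    # Any predecessor of an event starts strictly earlier, so process by start time.
    order = sorted(range(n), key=lambda i: start[i])
    best = [1] * n      # best[i]: most events on a feasible schedule ending at i
    prev = [None] * n   # prev[i]: event attended just before i on that schedule
    for pos, j in enumerate(order):
        for i in order[:pos]:
            reachable = end[i] + travel[i][j] <= start[j]
            if reachable and best[i] + 1 > best[j]:
                best[j] = best[i] + 1
                prev[j] = i
    # Walk back from the best final event to recover the schedule.
    last = max(range(n), key=lambda i: best[i])
    schedule = []
    while last is not None:
        schedule.append(last)
        last = prev[last]
    return schedule[::-1]

# Four hypothetical events with a uniform 15-minute travel time between venues.
S = [18.0, 18.5, 19.5, 21.25]
E = [19.0, 20.0, 21.0, 22.0]
T = [[0.0 if i == j else 0.25 for j in range(4)] for i in range(4)]
print(max_events(S, E, T))  # [0, 2, 3]
```

Only consecutive attended events need to be checked against each other, which is why a path in the graph corresponds exactly to a feasible evening.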
I'm not a Natural Language Processing student, yet I know it's not a trivial strcmp(n1, n2).
Here's what I've learned so far:
comparing Personal Names can't be solved 100%
there are ways to achieve certain degree of accuracy.
the answer will be locale-specific, that's OK.
I'm not looking for spelling alternatives! The assumption is that the input's spelling is correct.
For example, all the names below can refer to the same person:
Berry Tsakala
Bernard Tsakala
Berry J. Tsakala
Tsakala, Berry
I'm trying to:
build (or copy) an algorithm which grades the relationship between 2 input names
find an indexing method (for names in my database, for hash tables, etc.)
note:
My task isn't about finding names in text, but to compare 2 names. e.g.
name_compare( "James Brown", "Brown, James", "en-US" ) ---> 99.0%
I used Tanimoto Coefficient for a quick (but not super) solution, in Python:
"""
Formula:
Na = number of set A elements
Nb = number of set B elements
Nc = number of common items
T = Nc / (Na + Nb - Nc)
"""
def tanimoto(a, b):
    c = [v for v in a if v in b]
    return float(len(c)) / (len(a)+len(b)-len(c))

def name_compare(name1, name2):
    return tanimoto(name1, name2)

>>> name_compare("James Brown", "Brown, James")
0.91666666666666663
>>> name_compare("Berry Tsakala", "Bernard Tsakala")
0.75
>>>
Edit: A link to a good and useful book.
Soundex is sometimes used to compare similar names. It doesn't deal with first name/last name ordering, but you could probably just have your code look for the comma to solve that problem.
We've just been doing this sort of work non-stop lately and the approach we've taken is to have a look-up table or alias list. If you can discount misspellings/misheard/non-english names then the difficult part is taken away. In your examples we would assume that the first word and the last word are the forename and the surname. Anything in between would be discarded (middle names, initials). Berry and Bernard would be in the alias list - and when Tsakala did not match to Berry we would flip the word order around and then get the match.
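A rough sketch of that approach in Python, under the same assumptions (correct spelling, first word is the forename, last word is the surname). The ALIASES table, the function names, and the choice to treat a comma as "surname first" are illustrative assumptions, not our production code; a real alias list would be far larger.

```python
# Hypothetical alias table: nickname -> canonical forename.
ALIASES = {"berry": "bernard", "dick": "richard", "steve": "stephen"}

def split_name(name):
    """Return (forename, surname) from the first and last words of a non-empty name.

    Middle names and initials are discarded. "Surname, Forename" is reordered first.
    """
    if "," in name:
        surname, _, rest = name.partition(",")
        name = f"{rest} {surname}"
    words = [w.strip(".").lower() for w in name.split()]
    return words[0], words[-1]

def canonical(forename):
    """Map a nickname to its canonical form, or return the name unchanged."""
    return ALIASES.get(forename, forename)

def same_person(name1, name2):
    f1, s1 = split_name(name1)
    f2, s2 = split_name(name2)
    direct = canonical(f1) == canonical(f2) and s1 == s2
    # If the direct comparison fails, try the second name with its word order flipped.
    flipped = canonical(f1) == canonical(s2) and s1 == f2
    return direct or flipped

print(same_person("Berry Tsakala", "Bernard Tsakala"))   # True
print(same_person("Berry J. Tsakala", "Tsakala, Berry"))  # True
print(same_person("James Brown", "Berry Tsakala"))        # False
```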
One thing you need to understand is the database/people lists you are dealing with. In the English speaking world middle names are inconsistently recorded. So you can't make or deny a match based on the middle name or middle initial. Soundex will not help you with common name aliases such as "Dick" and "Richard", "Berry" and "Bernard" and possibly "Steve" and "Stephen". In some communities it is quite common for people to live at the same address and have 2 or 3 generations living at that address with the same name. The only way you can separate them is by date of birth. Date of birth may or may not be recorded. If you have the clout then you should probably make the recording of date of birth mandatory. A lot of "people databases" either don't record date of birth or won't give them away due to privacy reasons.
Effectively people name matching is not that complicated. It's entirely based on the quality of the data supplied. What happens in practice is that a lot of records remain unmatched - and even a human looking at them can't resolve the mismatch. A human may notice name aliases not recorded in the aliases list or may be able to look up details of the person on the internet - but you can't really expect your programme to do that.
Banks, credit rating organisations and the government have a lot of detailed information about us. Previous addresses, date of birth etc. And that helps them join up names. But for us normal programmers there is no magic bullet.
Analyzing name order and the existence of middle names/initials is trivial, of course, so it looks like the real challenge is knowing common name alternatives. I doubt this can be done without using some sort of nickname lookup table. This list is a good starting point. It doesn't map Bernard to Berry, but it would probably catch the most common cases. Perhaps an even more exhaustive list can be found elsewhere, but I definitely think that a locale-specific lookup table is the way to go.
I had real problems with the Tanimoto using utf-8.
What works for languages that use diacritical signs is difflib.SequenceMatcher()
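A minimal sketch of how that can look. The name_similarity wrapper and the Unicode normalization step are my own additions for illustration: normalizing to NFC makes a precomposed "é" and an "e" followed by a combining accent compare as the same character before SequenceMatcher sees them.

```python
import unicodedata
from difflib import SequenceMatcher

def name_similarity(name1, name2):
    """Return a similarity ratio in [0, 1] between two names."""
    # Normalize so that composed and decomposed accented characters are equal.
    a = unicodedata.normalize("NFC", name1).casefold()
    b = unicodedata.normalize("NFC", name2).casefold()
    return SequenceMatcher(None, a, b).ratio()

print(name_similarity("José Álvarez", "Jose Alvarez"))
print(name_similarity("Berry Tsakala", "Bernard Tsakala"))
```

Like the Tanimoto version, this only measures string similarity; it does not handle word order or nicknames on its own.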