The Traveling Salesman as Socialite - google-maps

I have a Traveling Salesman problem with an additional constraint. (Note: this is not a homework problem, I've phrased it like one to abstract the problem.)
Given a list of events for a particular day, each with a specific start time and end time, what is the optimal route to maximize the number of events attended? Assume the salesman/socialite must stay until the end of each event. While normal people might eyeball the list and solve this by hand, the socialite receives invites for up to 20 events each night.
How can one solve something like this? So far I have investigated the Directions API from Google Maps and the ArcGIS route planner, but the problem exceeds their capabilities.

I don't see how this is related to GIS. But it can be modelled as a linear integer programming problem and solved if the only objective is to maximise the number of events attended (and not minimise the distance travelled).
objective function: max z = x(1) + x(2) + .... + x(N)
constraints:
for all x(i): x(i) = 0 or 1
for all x(i), x(j) where i != j and S(i) <= S(j): E(i) + T(i,j) <= S(j) + M.(2 - x(i) - x(j))
x(i) equals 1 if the ith event is attended, 0 otherwise. S(i) and E(i) represent the start and end times of the events, in terms of hours into the day. For example, if the 5th event on the list starts at 10am and ends at 11.30am, then S(5) is 10 and E(5) is 11.5. T(i,j) is the travel time between the locations that host event i and event j. M is a "big M" constant chosen larger than any possible value of E(i) + T(i,j).
The objective function maximises the number of events attended. The first constraint specifies that each event is either attended or not attended. The second constraint ensures that if two events are both attended (x(i) = x(j) = 1, so the M term vanishes), the socialite has enough time to travel between the ending time of the earlier event and the starting time of the later one; if either event is skipped, the M term makes the constraint trivially satisfied.
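For concreteness, here is a minimal sketch of that model in Python using the PuLP library (my choice of modelling tool, not something the answer mentions); the start/end times and the travel matrix are made-up illustration data.

```python
# Hedged sketch: PuLP and the sample data are my additions. Times are hours
# into the day, travel times are in hours.
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpStatus

S = [10.0, 10.5, 13.0]                  # start times (hypothetical)
E = [11.5, 12.0, 14.0]                  # end times
T = [[0.0, 0.5, 1.0],                   # travel times between venues
     [0.5, 0.0, 0.75],
     [1.0, 0.75, 0.0]]
N = len(S)
M = 24                                  # big M: larger than any E[i] + T[i][j]

prob = LpProblem("socialite", LpMaximize)
x = [LpVariable(f"x{i}", cat="Binary") for i in range(N)]
prob += lpSum(x)                        # objective: number of events attended

for i in range(N):
    for j in range(N):
        if i != j and S[i] <= S[j]:
            # if both events are attended the M term vanishes, so the end of
            # the earlier event plus travel must not pass the later start
            prob += E[i] + T[i][j] <= S[j] + M * (2 - x[i] - x[j])

prob.solve()
print(LpStatus[prob.status], [int(v.value()) for v in x])
```

With only 20 invitations per night the model stays tiny, so the CBC solver that PuLP ships with by default returns the answer essentially instantly.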


Confused about Rewards in David Silver Lecture 2

While watching the Reinforcement Learning course by David Silver on YouTube (Lecture 2: MDP slides), I found the "Reward" and "Value Function" really confusing.
I tried to understand the "given rewards" marked on the slide (P11), but I cannot figure out why they are what they are. For example, "Class 1: R = -2" but "Pub: R = +1".
Why the negative reward for Class and the positive reward for Pub? Why the different values?
How do I calculate the reward with the discount factor? (P17 and P18)
I think a lack of intuition for reinforcement learning is the main reason I run into this kind of problem...
So, I'd really appreciate it if someone can give me a little hint.
You usually set the rewards and the discount such that RL drives the agent to solve a task.
In the student example the goal is to pass the exam. The student can spend his time attending a class, sleeping, on Facebook, or at the pub. Attending a class is "boring", so the student doesn't see the immediate benefit of doing it, hence the negative reward. On the contrary, going to the pub is fun and gives a positive reward. However, only by attending all three classes can the student pass the exam and get the big final reward.
Now the question is: how much does the student value immediate vs. future rewards? The discount factor tells you that: a small discount gives more importance to immediate rewards, because future rewards just "fade" in the long run. If we use a small discount, the student may prefer to always go to the pub or to sleep. With a discount close to 0, all rewards are already close to 0 after one step, so in each state the student will try to maximize the immediate reward, because after that "nothing else matters".
On the contrary, high discounts (up to 1) value long-term rewards more: in this case the optimal student will attend all classes and pass the exam.
Choosing the discount can be tricky, especially if there is no terminal state (in this case "sleep" is terminal), because with a discount of 1 the agent may ignore the number of steps used to reach the highest reward. For instance, if classes gave a reward of -1 instead of -2, it would be all the same to the agent to alternate between "class" and "pub" forever and pass the exam at some point, because with discount 1 the rewards never fade, so even after 10 years the student would still get the full +10 for passing the exam.
Think also of a virtual agent having to reach a goal position. With discount 1, the agent would not learn to reach it in the smallest number of steps: as long as it gets there, it's all the same to it.
Besides that, there is also a numerical problem with discount 1: since the goal is to maximize the cumulative sum of discounted rewards, if rewards are not discounted (and the horizon is infinite) the sum will not converge.
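To make the "fading" concrete, here is a tiny toy calculation (my own numbers, loosely modelled on the student example: three classes at -2, then +10 for the exam) comparing a small and a large discount:

```python
# Toy illustration, not taken from the lecture: the same reward sequence
# valued with a small and with a large discount factor.
rewards = [-2, -2, -2, +10]             # three classes, then pass the exam

def discounted_return(rewards, gamma):
    return sum((gamma ** k) * r for k, r in enumerate(rewards))

print(discounted_return(rewards, 0.1))  # about -2.21: the +10 barely registers
print(discounted_return(rewards, 1.0))  # 4.0: the exam reward dominates
```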
Q1) First of all, don't forget that these rewards are given by the environment. The actions taken by the agent do not change the environment's reward function, but of course they affect the reward collected along the followed trajectory.
In the example, these +1 and -2 are just playful choices :) As a student you get bored during class, so its reward is -2, while you have fun in the pub, so its reward is +1. Don't get confused about the reasons behind these numbers; they are given by the environment.
Q2) Let's do the calculation for the state with the value 4.1 in "Example: State-Value Function for Student MRP (2)":
v(s) = (-2) + 0.9 * [(0.4 * 1.9) + (0.6 * 10)] = (-2) + 6.084 =~ 4.1
Here David is using the Bellman Equation for MRPs. You can find it on the same slide.
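In code form, the same calculation looks like this (the 0.9 discount, the -2 reward, and the successor values 1.9 and 10 with probabilities 0.4 and 0.6 are the numbers quoted above from the slide):

```python
# Bellman equation for an MRP: v(s) = R(s) + gamma * sum over s' of P(s,s') * v(s')
gamma = 0.9
reward = -2.0                              # immediate reward R(s)
successors = [(0.4, 1.9), (0.6, 10.0)]     # (transition probability, v(s'))

v = reward + gamma * sum(p * v_next for p, v_next in successors)
print(round(v, 1))                         # 4.1
```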

Data model for timeline event synchronisation

I am looking for ideas on the data model for the following problem (and the proper CS terminology):
A (horizontal) "timeline" with several rows (A,B,C) contains "events" (1,2,3) with different durations (width) at different times (absolute x position, or by a delay "." after the previous event):
A 1111....222222
B 33333
------------------
T 0123456789ABCDEF
(The rows are only interesting for graphical representation of overlapping/parallel "events", so they probably are not essential to the data model.)
Event duration may vary, affecting the whole timing:
A 11....222222
B 33333+3
------------------
T 0123456789ABCDEF
But let event 2 require events 1 and 3 to be finished, so the timing should look like this:
A 11.... 222222
B 33333+3
------------------
T 0123456789ABCDEF
(let's ignore that the original delay at T=7 is now missing.)
Originally I thought I'd have to have some "elastic" synchronization elements, one for each row:
A 11....####222222
B 33333+3#
------------------
T 0123456789ABCDEF
Hence the original problem of how to model and synchronize the sync elements in the two different "rows". But, as established above, this is only a matter of graphical/parallel representation.
Rather, the sync is a condition that could be "attached" to event 2, modifying or determining its beginning.
If an event "has" a condition, it will not have an absolute or relative start time. Its start can only be determined from the ends of the "linked" events (1 and 3).
So, given (a list of) some events with variable duration and either an absolute start time or a delay relative to another event's end, how could the condition "events 1 and 3 ended" be modelled to determine the start of "event 2"?
(I will prototype this in JavaScript and eventually implement in C/C++, so any sample code provided should not use high-level data types or libraries.)
What you need is an object that I would call a TimeFrame. The object would have the attributes duration, link and type, where link can be a precise time or a link to another TimeFrame and type accounts for the kind of link. For instance, a given TimeFrame that starts at a known time would have that time as its link attribute and the type would be TIME. A TimeFrame that is linked to the end of another would have that other TimeFrame as its link attribute and START-END as its type and so on.
Using the combination between link and type you could also support other types of links such as START-START, END-START or END-END.
UPDATE
Also, in order to allow some time interval between say, the end of a TimeFrame and the start of the next, one can add the attribute lag, which represents any delay between events. So, for instance if tf1 and tf2 are TimeFrames such that tf2 must start 5 time units after the end of tf1 the attributes of tf2 would be link = tf1, type = START-END, duration = <something> and lag = 5. Note also that the lag could be negative, which would extend the expressiveness of the model to a broad range of relationships.
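A rough sketch of that TimeFrame idea, in Python for brevity (it only uses plain values and object references, so it should port directly to the JavaScript/C++ the asker has in mind; the names and the error handling are mine):

```python
TIME, START_END = "TIME", "START-END"      # link types from the answer

class TimeFrame:
    def __init__(self, duration, link, kind, lag=0):
        self.duration = duration
        self.link = link          # an absolute time, or another TimeFrame
        self.kind = kind          # the "type" attribute: TIME, START-END, ...
        self.lag = lag            # delay relative to the linked point

    def start(self):
        if self.kind == TIME:
            return self.link + self.lag            # link is a precise time
        if self.kind == START_END:
            return self.link.end() + self.lag      # start after the linked frame ends
        raise ValueError("link type not covered in this sketch")

    def end(self):
        return self.start() + self.duration

# The example from the update: tf2 must start 5 time units after tf1 ends.
tf1 = TimeFrame(duration=5, link=0, kind=TIME)
tf2 = TimeFrame(duration=3, link=tf1, kind=START_END, lag=5)
print(tf2.start(), tf2.end())   # 10 13
```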
While @Leandro Caniglia nicely rephrased my question in terms of an object and its attributes, essentially I see two options:
the whole list of "events" is evaluated at each "condition" (start/end) to check for dependent "events", or
adding a "link" to a "parent" also creates a link back to the "child" (no need to evaluate all pending events' links).
Also:
The "link" property needs to be a list or array to be able to hold several references (e.g. 2:[1,3]).
Analogous to the link property (a start_me_on_condition), a stop_me_on_condition association may be desirable (see Leandro's suggestion of type; it would need to be extended to support multiple link+type pairs).
An independent delay "event" might be more practical than a lag property.
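A sketch of that multi-link variant (again Python, names mine): links holds the list of predecessor events, and a linked event starts at the latest linked end plus any lag, which covers the "2:[1,3]" condition from the diagrams above.

```python
class Event:
    def __init__(self, duration, start=None, links=None, lag=0):
        self.duration = duration
        self.fixed_start = start      # absolute start time, or None if linked
        self.links = links or []      # events that must end before this one starts
        self.lag = lag

    def start(self):
        if self.links:                # condition: all linked events have ended
            return max(e.end() for e in self.links) + self.lag
        return self.fixed_start + self.lag

    def end(self):
        return self.start() + self.duration

# Mirrors the third diagram: event 1 (duration 2) and event 3 (duration 7)
# both start at 0; event 2 waits for both, so it starts at T=7.
e1 = Event(duration=2, start=0)
e3 = Event(duration=7, start=0)
e2 = Event(duration=6, links=[e1, e3])
print(e2.start())   # 7
```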

Proton CEP Fiware: delete old received events

I've got this kind of problem with Proton CEP: I currently have a "Sequence" EPA whose inputs are 2 events. But these events have different granularity: let's say I have A and B events; I receive N "A" events and M "B" events, where M << N.
So I'd like to have a rule like "if an event of type A is not consumed within X seconds, remove it"; otherwise I end up with a long queue of A events, and I only need the rule to be evaluated for the temporally closest events.
Practically, I've got a fake room temperature sensor that sends its temperature updates every 5 seconds, and another program that checks the external weather and sends it every minute.
Any idea how to solve this situation?
Thank you very much!
I guess that by "consume" you mean arrival. Do you want to evaluate the time the A event took to get to the Proton processor, or the time between A events? Do you want to ensure that the A events are indeed continuous at a fixed rate? "Removing" an event means ignoring it, since events are not kept anywhere, just processed. In the end, what is it that you want to detect here? For example, the trend of the room temperature compared to the outside temperature, and then emit output events accordingly?
Thanks.
All the relevant event instances are kept within the local state of the corresponding EPA.
For each EPA operand you have policies which dictate how the state is gathered and how the matching set for event derivation is built.
For example, the instance selection policy, which is defined per operand and has the values "Each", "First" and "Last", determines whether all A instances are examined for a match with a B instance, or only the first (in order of arrival), or only the last.
The consumption policy says what to do with the operand state once a sequence is detected: should the instances of, say, A which participated in the sequence be removed from the EPA's state (the "consume" value of the policy) or should they remain?
Playing with combinations of those policies should give you the behaviour you require.

Generate random variable in real-time without state

I want a function which takes, as input, the number of seconds elapsed since the last time it was called, and returns true or false for whether an event should have happened in that time period. I want it to fire, on average, once per X seconds of elapsed time, say every 5 seconds. I am also interested in whether it's possible to do this without any state, which the answer from this question used.
I guess to be fully accurate it would have to return an integer for the number of events that should have happened, in case it is called only once every 10*X seconds or something like that, so bonus points for that!
It sounds like you're describing a Poisson process: the number of events in a time interval of length t follows a Poisson distribution whose mean is lambda*t, with rate lambda = 1/X.
The way to use the expression on the latter page is as follows, for a given value of lambda and the elapsed time t:
Calculate a random number between zero and one; call this p
Calculate Pr(k=0) (ie, exp(-lambda*t) * (lambda*t)**0 / factorial(0))
If this number is bigger than p, then the number of simulated events is 0. END
Otherwise, calculate Pr(k=1) and add it to Pr(k=0).
If this number is bigger than p, then the answer is 1. END
...and so on.
Note that, yes, this can end up with more than one event in a time period, if t is large compared with 1/lambda (ie X). If t is always going to be small compared to 1/lambda, then you are very unlikely to get more than one event in the period, and so the algorithm is simplified considerably (if p < exp(-lambda*t), then 0, else 1).
Note 2: there is no guarantee that you will get at least one event per interval X. It's just that it'll average out to that.
(the above is rather off the top of my head; test your implementation carefully)
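A hedged Python sketch of those steps (the function name and the default period are mine): it draws the number of events in the elapsed interval from a Poisson distribution by accumulating the probabilities exactly as listed above, and needs no state beyond the elapsed time.

```python
import math
import random

def events_in_interval(elapsed_seconds, mean_period=5.0):
    """Sample how many events happened during elapsed_seconds, given an
    average of one event per mean_period seconds (a Poisson process)."""
    lam_t = elapsed_seconds / mean_period   # expected number of events
    p = random.random()                     # uniform draw in [0, 1)
    k = 0
    prob_k = math.exp(-lam_t)               # Pr(k = 0)
    cumulative = prob_k
    while p >= cumulative:                  # walk the CDF until it passes p
        k += 1
        prob_k *= lam_t / k                 # Pr(k) = Pr(k-1) * lam_t / k
        cumulative += prob_k
    return k

print(events_in_interval(2.0))              # usually 0, sometimes 1, rarely more
```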
Assume some event type happens on average once per 10 seconds, and you want to print a simulated list of timestamps at which the events happened.
A good method would be to generate a random integer in the range [0,9] every second. If it is 0, fire the event for that second. Of course you can control the resolution: you can generate a random integer in the range [0,99] every 0.1 second, and if it comes up 0, fire the event for that decisecond.
Assuming there is no dependency between events, there is no need to keep state.
To find out how many times the event happens in a given timeslice, just generate enough random integers, according to the required resolution.
Edit
You should use high resolution (at least 20 randoms per period of one event) for the simulation to be valid.
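A quick sketch of that discretized approach (the function name and numbers are mine): one random draw per resolution step, each step firing with probability step/period.

```python
import random

def simulate_timestamps(total_seconds, mean_period=10.0, step=0.1):
    """One event per mean_period seconds on average, simulated in steps of
    `step` seconds: each step fires with probability step / mean_period."""
    slots = round(mean_period / step)        # e.g. 100 -> draw from [0, 99]
    timestamps = []
    t = 0.0
    while t < total_seconds:
        if random.randrange(slots) == 0:     # comes up 0 -> fire on this step
            timestamps.append(round(t, 1))
        t += step
    return timestamps

print(simulate_timestamps(60))               # on average about 6 timestamps
```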

Reconstructing state from time series data events

For a particular project, we acquire data for a number of events and collect variables about those events at the same time. After the data has been collected, we perform a user-customizable analysis on said data to determine whatever it is that the user is interested in.
The data is collected in a form similar to this:
Timestamp Event
0 x = 0
0 y = 1
3 Event A occurred
3 x = 1
4 Event A occurred
4 x = 2
9 Event B occurred
9 y = 2
9 x = 0
To understand the entire state at any time, the most straightforward approach is to walk over the entire set of data. For example, if I start at time 0, and "analyze" until timestamp 5, I know that at that point x = 2, y = 1, and Event A has occurred twice. That's a really simple example. The user might be (and often is) interested in the time between events, say from A to B, and they might specify the first occurrence of A, then B, or the last occurrence of A, then B (respectively, 9-3 = 6 or 9-4 = 5). Like I said, this is easy to analyze when you're walking over the entire set.
Now, we need to adapt the model to analyze an arbitrary window of time. If we look at 0-N, that's the easy case. But if I look at 1-5, for instance, I have no notion of y unless I begin at 0 and know that y was initially 1 and did not change in the window 1-5.
Our approach is essentially to create a dictionary of variables and run callbacks on events. If one analysis were "What is x when Event A occurs and time is > 3", then we would run that callback on the first Event A, and it would immediately return because time is not greater than 3. It would run again at 4, and it would report that x was 1 at t=4.
To adapt to the "time-windowing", I think I am going to (in the background) tack on additional conditions to the analysis. If their analysis is just "What is x when Event A occurs", and the current window is 1-5, then I will change it to "What is x when Event A occurs and time >= 1 and time <= 5". Then if the next window is 6-10, I can readjust the condition as necessary.
My main question is: what pattern does this fit? We are obviously not the first people to approach a problem like this, but I have not been able to find how others have approached it. I probably just don't know exactly what to search for on Google. Is there any other approach besides keeping a dictionary of the entire global state for looking up a single state at a given time? Note also that the data could run to several thousand, maybe tens of thousands, of records, so the fewer iterations over the data set, the better.
I think your best approach here would be to take periodic "snapshots" of the full state data, say every 1000 samples (for example), along with recording the deltas. When you're storing your data as offsets from some original value (aka deltas), you don't have any choice but to reconstruct the full data starting with the original values. Storing periodic snapshots will lessen the amount of reconstruction you have to do - the design tradeoff is between low storage requirements but long reconstruction time on the one hand, and higher storage requirements but shorter reconstruction time on the other.
MPEGs, for example, store each frame as the differences between the current frame and the previous frame. Ordinarily, this would force an MPEG to be viewed from the beginning, but the format also periodically stores full frames so that the decoder doesn't have to backtrack all the way to the beginning of the file.
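A minimal sketch of the snapshot side (the record format, (timestamp, name, value) tuples in time order, and the interval of 1000 are assumptions for illustration):

```python
def build_snapshots(records, every=1000):
    """records: list of (timestamp, name, value) updates sorted by time.
    Returns [(timestamp, index, full_state_dict), ...], one entry per
    `every` records -- the "keyframes" of the event log."""
    state, snapshots = {}, []
    for i, (ts, name, value) in enumerate(records):
        state[name] = value
        if (i + 1) % every == 0:
            snapshots.append((ts, i, dict(state)))   # store a full copy
    return snapshots
```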
You can search by time in Log(N), and you have a feeling for how many updates are acceptable... hence here's my solution:
Pick a number, N, of updates that are acceptable in order to return a result. 256 might be good, given the scales you've mentioned so far.
Every N records, commit an entry of all state to a dictionary, with a timestamp.
Now you have a tradeoff: dictionary size against speed. N -> infinity is regular searching; N -> 1 is your current solution (full state committed at every record); N anywhere in between requires less memory, but is slower.
Your implementation is now (for time X):
Log(n) search of the subsampled global dictionary for the last entry at or before X (timestamped as Y).
Log(n) search of the event list to timestamp Y, then perform fewer than N updates.
Picking N as a power of two even allows some nice shift tricks for a fast rounded-down integer divide.
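Continuing the snapshot sketch from the earlier answer above, the query side could look like this (using Python's bisect for the logarithmic searches; snapshots are the (timestamp, index, state) tuples built there):

```python
import bisect

def state_at(x, snapshots, records):
    """Reconstruct the full state at time x: binary-search for the last
    snapshot at or before x, then replay at most `every` updates."""
    snap_times = [ts for ts, _, _ in snapshots]   # precompute once in real code
    k = bisect.bisect_right(snap_times, x) - 1
    if k >= 0:
        _, idx, snap = snapshots[k]
        state, start = dict(snap), idx + 1
    else:
        state, start = {}, 0                      # before the first snapshot
    for ts, name, value in records[start:]:
        if ts > x:
            break
        state[name] = value                       # replay the deltas up to x
    return state
```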