Forecasting out of sample with Fourier regressors - fft

I'm trying to create a multivariate multi-step-ahead forecast using machine learning (weekly and yearly seasonality).
I use some exogenous variables, including Fourier terms. I'm happy with the results of testing the model on in-sample data, but now I want to move to production and make real forecasts on completely unseen data. While I can update the other regressors (variables), since they are dummy variables related to time, I don't know how to generate new Fourier terms for the N steps ahead.
I have a conceptual point I'd like to check with you: when you generate the Fourier terms based on the periodicity and the number of sin/cos pairs used to decompose the time series you want to forecast, this process should be independent of the values of the time series. Is that right?
If so, how do you extend the terms for the N steps?
Just for the sake of completeness, I use R.
Thank you

From what I am reading and understanding, you want to get the Fourier terms for the next N steps. To do this, you need to shift your calculated time frame to be some point in the past (say N-1). This is just simple causality: you cannot model the future with Fourier (for example, you can't have x(N-1) = a·x(N+1) + b·x(N-2) + c·x(N)).
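As the question suspects, the Fourier regressors themselves are deterministic functions of the time index, the chosen period and the number of harmonics K, never of the observed series values, so they can be computed for any future index as well. Below is a minimal NumPy sketch of that idea (illustrative only, not the asker's R code; in R, the forecast package's fourier() function exposes an h argument for generating future terms, if that package is an option):

import numpy as np

def fourier_terms(time_index, period, K):
    """Fourier regressors for arbitrary (including future) time indices.
    They depend only on the index, the period and K, not on the series values."""
    t = np.asarray(time_index, dtype=float)
    cols = []
    for k in range(1, K + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

# Terms used for fitting cover indices 0..T-1; terms for the next N steps cover T..T+N-1.
T, N = 520, 52  # e.g. 10 years of weekly data, forecasting one year ahead (made-up sizes)
X_train  = fourier_terms(np.arange(T), period=52, K=3)
X_future = fourier_terms(np.arange(T, T + N), period=52, K=3)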

Evaluating the performance of variational autoencoder on unlabeled data

I've designed a variational autoencoder (VAE) that clusters sequential time series data.
To evaluate the performance of the VAE on labeled data, I first run KMeans on the raw data and compare the generated labels with the true labels using the Adjusted Mutual Info Score (AMI). Then, after the model is trained, I pass validation data to it, run KMeans on the latent vectors, and compare the generated labels with the true labels of the validation data using AMI. Finally, I compare the two AMI scores with each other to see whether KMeans performs better on the latent vectors than on the raw data.
My question is this: How can we evaluate the performance of VAE when the data is unlabeled?
I know we can run KMeans on the raw data and generate labels for it, but in this case, since we consider the generated labels as true labels, how can we compare the performance of KMeans on the raw data with KMeans on the latent vectors?
Note: The model is totally unsupervised. Labels (if they exist) are not used in the training process; they're used only for evaluation.
In unsupervised learning you evaluate the performance of a model either by using labelled data or by visual analysis. In your case you do not have labelled data, so you would need to do some analysis. One way to do this is by looking at the predictions: if you know how the raw data should be labelled, you can qualitatively evaluate the accuracy. Another method, since you are using KMeans, is to visualize the clusters. If the clusters are spread apart in well-separated groups, that is usually a good sign; if they are close together and overlapping, the labelling of vectors in the respective areas may be less accurate. Alternatively, you can use an internal cluster-quality metric (for example, the silhouette score), or come up with your own.
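If no labels exist at all, one concrete option is to compute such an internal metric once for KMeans on the raw data and once for KMeans on the latent vectors. A rough scikit-learn sketch (illustrative only; X_raw and Z_latent are assumed to be the raw validation data and the corresponding latent vectors):

from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def compare_representations(X_raw, Z_latent, n_clusters=10, seed=0):
    # Cluster each representation separately.
    labels_raw = KMeans(n_clusters=n_clusters, random_state=seed).fit_predict(X_raw)
    labels_lat = KMeans(n_clusters=n_clusters, random_state=seed).fit_predict(Z_latent)
    # Silhouette needs no true labels: it scores cohesion vs. separation
    # of the clusters found, in each representation's own space.
    return (silhouette_score(X_raw, labels_raw),
            silhouette_score(Z_latent, labels_lat))

Keep in mind that each score is computed in its own feature space, so comparing them across representations is a heuristic signal rather than a definitive verdict.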

AnyLogic: How to create an objective function using values of two datasets (for an optimization experiment)?

In my AnyLogic model I have a population of agents (4 terminals) where trucks arrive, are served, and depart. The terminals have two parameters (numberOfGates and servicetime) which influence the departures per hour of trucks leaving the terminals. Now I want to tune these two parameters so that the number of departures per hour is closest to reality (I know the actual departures per hour). I already have two datasets within each terminal agent: one with the number of departures per hour that I simulate (departures), and one with the observedDepartures from the data.
I already compare these two datasets in plots for every terminal.
Now I want to create an optimization experiment to tune the numberOfGates and servicetime of the terminals so that the departures dataset is as close as possible to the observedDepartures dataset. Does anyone know the easiest way to create an objective function for this optimization experiment?
When I add a variable diff that is updated every hour by abs(departures - observedDepartures) and put root.diff in the optimization experiment, it gives me the error "eq(null) is not allowed. Use isNull() instead" in a line that reads the database for the observedDepartures (see last picture). It works when I run the simulation normally; it only gives this error when running the optimization experiment (I don't know why).
You can use the sum of the absolute differences for each replication. That is, create a variable that logs the |difference| for each hour; call it diff. Then, in the optimization experiment, minimize the sum of that variable. In fact, this is close to a typical regression model's objective; regression uses a slightly more complex objective function, minimizing the sum of the squared differences.
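To make the two objectives concrete (plain Python arithmetic, not AnyLogic code; the hourly values below are made up):

import numpy as np

departures = np.array([12.0, 15.0, 9.0, 20.0])   # simulated departures per hour (made up)
observed   = np.array([10.0, 14.0, 11.0, 18.0])  # observed departures per hour (made up)

sum_abs = np.sum(np.abs(departures - observed))  # what the hourly 'diff' variable accumulates
sum_sq  = np.sum((departures - observed) ** 2)   # regression-style sum of squared differences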
A Calibration experiment already does (in a more mathematically correct way) what you are trying to do, using the in-built difference function to calculate the 'area between two curves' (which is what the optimisation is trying to minimise). You don't need to calculate differences or anything yourself. (There are two variants of the function to compare either two Data Sets (your case) or a Data Set and a Table Function (useful if your empirical data is not at the same time points as your synthetic simulated data).)
In your case it (the objective function) will need to be a sum of the differences between the empirical and simulated datasets for the 4 terminals (or possibly a weighted sum if the fit for some terminals is considered more important than for others).
So your objective is something like
difference(root.terminals(0).departures, root.terminals(0).observedDepartures)
+ difference(root.terminals(1).departures, root.terminals(1).observedDepartures)
+ difference(root.terminals(2).departures, root.terminals(2).observedDepartures)
+ difference(root.terminals(3).departures, root.terminals(3).observedDepartures)
(It would be better to calculate this for an arbitrary population of terminals in a function but this is the 'raw shape' of the code.)
A Calibration experiment is actually just a wizard which creates an Optimization experiment set up in a particular way (with a UI and all settings/code already created for you), so you can just use that objective in your existing Optimization experiment (but it won't have a built-in useful UI like a Calibration experiment). This also means you can still set this up in the Personal Learning Edition too (which doesn't have the Calibration experiment).

Computational considerations with different Caffe network topologies (difference in number of outputs)

I would like to use one of Caffe's reference models, i.e. bvlc_reference_caffenet. I found that my target class, i.e. person, is one of the classes included in the ILSVRC dataset that the model has been trained on. As my goal is to classify whether a test image contains a person or not, I may achieve this in one of the following ways:
Use inference directly with the original 1000 outputs. This doesn't require any training/learning.
Change the network topology slightly, setting the final FC layer's number of outputs (num_output) to 2 (instead of 1000), and retrain it as a binary classification problem.
My concern is about the computational effort at the deployment/prediction (testing) phase. Running inference with the full 1000 outputs looks more computationally expensive than the 2-output version, because during the prediction phase it needs to compute 1000 output scores to find the one with the highest score. What I'm not sure about is whether there is some heuristic (which I'm not aware of) that simplifies this computation.
Can somebody please help cross-check my understanding of this?
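As a rough sanity check on the scale of that extra computation (back-of-the-envelope only, assuming the standard CaffeNet/AlexNet fully connected layer sizes, not a profiled measurement):

# Multiply-accumulate counts for the fully connected layers of an
# AlexNet-style network (fc6: 9216->4096, fc7: 4096->4096, fc8: 4096->1000).
fc6 = 9216 * 4096        # ~37.7M MACs
fc7 = 4096 * 4096        # ~16.8M MACs
fc8_1000 = 4096 * 1000   # ~4.1M MACs with the original 1000-way classifier
fc8_2 = 4096 * 2         # ~0.008M MACs with a 2-way classifier

saved = fc8_1000 - fc8_2
print(f"MACs saved by shrinking fc8: {saved / 1e6:.2f}M")
print(f"Share of the FC layers alone: {saved / (fc6 + fc7 + fc8_1000):.1%}")
# The convolutional layers add hundreds of millions more MACs on top of this,
# so the 1000-way output layer is only a small fraction of the total inference cost.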

Implementing Dijkstra's algorithm using CUDA in C

I am trying to implement Dijkstra's algorithm using CUDA. I found code that does the same thing using MapReduce at this link: http://famousphil.com/blog/2011/06/a-hadoop-mapreduce-solution-to-dijkstra%E2%80%99s-algorithm/ but I want to implement something similar to what is given in the link using CUDA, with shared and global memory. Please tell me how to proceed, as I am new to CUDA. I don't know whether it is necessary to provide the input on both host and device in the form of a matrix, and also what operation I should perform in the kernel function.
What about something like this? (Disclaimer: this is not a map-reduce solution.)
Let's say you have a graph G with N nodes and an adjacency matrix A, with entries A[i,j] giving the cost of going from node i to node j in the graph.
This version of Dijkstra's algorithm keeps a vector V denoting a frontier, where V[i] is the current minimum distance from the origin to node i. In the classic Dijkstra's algorithm this information would be stored in a heap and popped off the top of the heap on every loop iteration.
Running the algorithm now starts to look a lot like matrix algebra, in that one simply takes the vector and applies the adjacency matrix to it using the following update:
V[i] <- min{V[j] + A[j,i] | j in Nodes}
for all values of i in V. This is run as long as there are updates to V (which can be checked on the device, no need to copy V back and forth to check!). Also store the transposed version of the adjacency matrix to allow sequential reads.
At most this will have a running time corresponding to the longest non-looping path through the graph.
The interesting question now becomes how to distribute this across compute blocks, but it seems obvious to shard based on row indexes.
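A minimal CPU sketch of that update in NumPy (just to make the linear-algebra view concrete; in the CUDA version each row of the transposed matrix would be handled by a thread or block, as described above):

import numpy as np

def sssp_relax(A, source=0):
    """A is a dense cost matrix with np.inf where there is no edge."""
    n = A.shape[0]
    V = np.full(n, np.inf)
    V[source] = 0.0
    while True:
        # V[i] <- min over j of (V[j] + A[j, i]), computed for all i at once.
        V_new = np.minimum(V, np.min(V[:, None] + A, axis=0))
        if np.array_equal(V_new, V):   # no updates to V -> converged
            return V
        V = V_new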
I suggest you study these two prominent papers on efficient graph processing on GPUs. The first can be found here. It's rather straightforward and basically assigns one warp to process a vertex and its neighbors. You can find the second one here; it is more complicated, but it efficiently produces the queue of next-level vertices in parallel, hence diminishing load imbalance.
After studying the above articles, you'll have a better understanding of why graph processing is challenging and where the pitfalls are. Then you can write your own CUDA program.

Getting Data back from Filtered FFT

I have applied an FFT to some data that I'd like to process using Matlab. The resulting frequency spectrum is quite noisy, so I have applied a moving average filter to the frequency/amplitude vectors. Now I am interested in getting the time-domain data back from this filtered frequency-domain data, to be used in a spectrogram later.
To get the frequency/amplitude components I used this code from a Mathworks example:
NFFT=2^nextpow2(L);            % next power of 2 >= signal length L
A=fft(a,NFFT)/L;               % a is the data
f=Fs/2*linspace(0,1,NFFT/2+1); % one-sided frequency axis
and plotted using:
plot(f,2*abs(A(1:NFFT/2+1)))
Can you recommend a way of getting the time domain data from the filtered FFT results? Is there an inverse FFT involved?
Thank you very much!
An IFFT is the inverse of an FFT. If you don't change the frequency data, you should get the same data back from ifft(fft(x)) using the same library.
If you change the data, and want to get real data back, you have to filter all the imaginary components as well as the real components of the complex FFT results, and make sure that the frequency domain data is still complex conjugate symmetric before doing the IFFT. If you use the magnitudes only, that will throw away the phase information which can greatly distort the result.
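A minimal NumPy sketch of that procedure (illustrative only, not the asker's Matlab code): smooth the magnitude spectrum, keep the original phase, and let the real-input inverse transform enforce the conjugate symmetry.

import numpy as np

def smooth_spectrum_and_invert(x, win=5):
    """Moving-average the magnitude spectrum of a real signal, keep the phase,
    and return to the time domain."""
    X = np.fft.rfft(x)                    # one-sided spectrum of a real signal
    mag, phase = np.abs(X), np.angle(X)
    mag_smooth = np.convolve(mag, np.ones(win) / win, mode='same')  # moving average
    X_filtered = mag_smooth * np.exp(1j * phase)                    # re-attach the phase
    # irfft treats the missing negative frequencies as the complex conjugates
    # of the positive ones, so the output is real by construction.
    return np.fft.irfft(X_filtered, n=len(x))

The same idea carries over to Matlab's ifft, as long as the two-sided spectrum you invert stays complex conjugate symmetric.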