I'm a dummy when it comes to Kalman filters. My question is about dt in the prediction step of the Kalman filter. After initializing a Kalman filter with the covariance matrix and the various uncertainty parameters, the predict function generally takes dt as an input, but the filterpy implementation doesn't take dt as an input. It confused me whether or not the Kalman filter has to take dt in the prediction step. Here is the filterpy implementation:
https://github.com/rlabbe/filterpy/blob/master/filterpy/kalman/kalman_filter.py . Also, I'm not sure whether the user is expected to update the transition matrix before calling the prediction function.
F is the notation used for the discrete transition matrix, so it already involves dt; you can see this in the example code given in the link you provided. Your assumption is correct: if dt is not constant, F should be recalculated before calling predict (in this particular implementation). Sometimes dt is constant, and if the model is linear, F can be calculated only once at the filter initialization stage.
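For example (a minimal sketch, assuming a 1D constant-velocity model with state [position, velocity] and a position-only sensor; the measurement list is made up for illustration), dt lives inside F, so you simply rebuild F before calling predict whenever dt changes:

import numpy as np
from filterpy.kalman import KalmanFilter

kf = KalmanFilter(dim_x=2, dim_z=1)
kf.x = np.array([0., 0.])          # initial state: position, velocity
kf.H = np.array([[1., 0.]])        # we only observe position
kf.P *= 500.                       # initial state uncertainty
kf.R = np.array([[5.]])            # measurement noise

# dummy (position reading, dt) pairs just to make the sketch runnable
measurements_with_timestamps = [(1.2, 0.10), (2.4, 0.12), (3.1, 0.09)]

for z, dt in measurements_with_timestamps:
    # dt only appears inside F, so F is rebuilt here because dt varies
    kf.F = np.array([[1., dt],
                     [0., 1.]])
    kf.predict()
    kf.update(z)

If dt were constant, the kf.F assignment could be moved above the loop and done once at initialization, exactly as described above.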
I am a beginner in deep learning.
I am using this dataset and I want my network to detect keypoints of a hand.
How can I make my output layer's nodes be in the range [-1, 1] (the range of normalized 2D points)?
Another problem is that when I train for more than 1 epoch, the loss takes negative values.
criterion: torch.nn.MultiLabelSoftMarginLoss() and optimizer: torch.optim.SGD()
Here you can find my repo.
net = nnModel.Net()
net = net.to(device)
criterion = nn.MultiLabelSoftMarginLoss()
optimizer = optim.SGD(net.parameters(), lr=learning_rate)
lr_scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer=optimizer, gamma=decay_rate)
You can use the Tanh activation function, since the image of the function lies in [-1, 1].
The problem of predicting key-points in an image is more of a regression problem than a classification problem (especially if you're making your model outputs + targets fall within a continuous interval). Therefore, I suggest you use the L2 Loss.
In fact, it could be a good exercise for you to use cross-validation to determine which of the loss functions suited to regression problems gives the lowest expected generalization error. There are several such loss functions available in PyTorch.
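As a minimal sketch of the two suggestions combined (a Tanh output layer plus an MSE/L2 loss), assuming a hand with 21 keypoints and a made-up feature size; the class name and dummy tensors are only for illustration:

import torch
import torch.nn as nn

class KeypointHead(nn.Module):
    # Toy regression head: maps a feature vector to 21 (x, y) keypoints in (-1, 1)
    def __init__(self, in_features=512, num_keypoints=21):
        super().__init__()
        self.fc = nn.Linear(in_features, num_keypoints * 2)
        self.act = nn.Tanh()               # squashes outputs into (-1, 1)

    def forward(self, x):
        return self.act(self.fc(x))

head = KeypointHead()
criterion = nn.MSELoss()                   # L2 loss, suited to regression targets

features = torch.randn(4, 512)             # dummy batch of image features
targets = torch.rand(4, 42) * 2 - 1        # dummy normalized keypoints in [-1, 1]
loss = criterion(head(features), targets)  # stays non-negative, unlike the current setup
loss.backward()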
One way I can think of is to use torch.nn.Sigmoid, which produces outputs in the [0, 1] range, and then scale the outputs to [-1, 1] using the transformation 2*x - 1.
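For example (a small sketch; the layer sizes are made up):

import torch

raw = torch.nn.Linear(512, 42)(torch.randn(4, 512))  # hypothetical raw output layer
scaled = 2 * torch.sigmoid(raw) - 1                   # maps (0, 1) to (-1, 1)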
I am trying to write a Kalman filter and I'm stuck on the H matrix. Right now I'm trying to estimate position and velocity, while my measurements provide position, velocity and acceleration data. How do you set up an H matrix for this, or just in general?
There are two good articles about the Kalman filter that I used to understand how it works and how the matrices should be set up:
http://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures/
https://www.wikiwand.com/en/Kalman_filter#/Example_application
The H matrix is the observation matrix.
It means that if we have a simple model with state variables position (x) and velocity (x'), and our sensor provides us with observations of the position only (z), then we will have z = H·s with H = [1 0].
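As a numpy sketch, assuming the state is s = [position, velocity]': each measured quantity gets its own row in H. Note that an acceleration measurement cannot be written as a linear function of this two-element state, so to use it you would normally add acceleration to the state first:

import numpy as np

# State s = [position, velocity]'
H_pos_only = np.array([[1., 0.]])      # sensor reports position only: z = H @ s

H_pos_vel = np.array([[1., 0.],        # row 1: position measurement
                      [0., 1.]])       # row 2: velocity measurement

# With the state extended to s = [position, velocity, acceleration]',
# position, velocity and acceleration measurements become
H_pos_vel_acc = np.array([[1., 0., 0.],
                          [0., 1., 0.],
                          [0., 0., 1.]])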
I have been following the CS231n lectures from Stanford and trying to complete the assignments on my own, sharing my solutions both on GitHub and on my blog. But I'm having a hard time understanding how to model backpropagation. I mean, I can code modular forward and backward passes, but what bothers me is the model below: a two-layered neural network.
Let's assume that our loss function here is a softmax loss function. In my modular softmax_loss() function I am calculating the loss and the gradient with respect to the scores (dSoft = dL/dY). After that, when I'm following the graph backwards, let's say for b2, db2 would be equal to dSoft*1, or dW2 would be equal to dSoft*dX2 (the outputs of the ReLU gate). What's the chain rule here? Why isn't dSoft equal to 1? Because dL/dL would be 1?
The softmax function outputs a number given an input x.
What dSoft means is that you're computing the derivative of the function softmax(x) with respect to the input x. Then, to calculate the derivative with respect to W of the last layer, you use the chain rule, i.e. dL/dW = dsoftmax/dx * dx/dW. Note that x = W*x_prev + b, where x_prev is the input to the last node. Therefore dx/dW is just x_prev and dx/db is just 1, which means that dL/dW (or simply dW) is dsoftmax/dx * x_prev and dL/db (or simply db) is dsoftmax/dx * 1. Note that dsoftmax/dx here is the dSoft we defined earlier.
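A small numpy sketch of that chain rule for the last affine layer (the shapes and the dscores values are made up; dscores stands for dL/dY, i.e. the dSoft returned by softmax_loss()):

import numpy as np

N, H, C = 4, 10, 3               # batch size, hidden size, number of classes
x_prev = np.random.randn(N, H)   # outputs of the ReLU gate (inputs to the last layer)
W2 = np.random.randn(H, C)
b2 = np.zeros(C)

scores = x_prev @ W2 + b2        # forward pass of the last affine layer

dscores = np.random.randn(N, C)  # stand-in for dL/dscores from softmax_loss()

# Chain rule through scores = x_prev @ W2 + b2:
dW2 = x_prev.T @ dscores         # dL/dW2 = x_prev^T * dL/dscores
db2 = dscores.sum(axis=0)        # dL/db2 = column sums of dL/dscores
dx_prev = dscores @ W2.T         # gradient passed further back to the ReLU gate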
I am trying to implement a custom RNN layer in Keras, and I tried to follow what is explained in this link, which basically instructs how to inherit from the existing RNN classes. However, the update equation of the hidden layer in my formulation is a bit different: h(t) = tanh(W.x + U.h(t-1) + V.r(t) + b), and I am a bit confused. In this equation, r(t) = f(x, p(t)) is a function of x, the fixed input distributed over time, and also of p(t) = O(t-1).alpha + p(t-1), where O(t) is the softmax output of each RNN cell.
I think after calling super(customRNN, self).step in the inherited step function, the standard h(t) should be overridden by my definition of h(t). However I am not sure how to modify the states and also get_constants function, and whether or not I need to modify any other parts of the recurrent and simpleRNN classes in Keras.
My intuition is that the get_constants function only returns the dropout matrices as extra states to the step function, so I am guessing at least one state should be added for the dropout matrix of V in my equations.
I have just recently started using Keras and I could not find many references on defining custom Keras layers. Sorry if my question is a bit overloaded with parameters; I just wanted to make sure that I am not missing any point. Thanks!
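This is not an answer for the old step/get_constants route, but as a possible sketch: with the newer tf.keras cell API you can implement the update equation by writing a custom cell and wrapping it in keras.layers.RNN. In the sketch below, using softmax(h(t-1)) for O(t-1) and the particular choice of f for r(t) = f(x, p(t)) are placeholders/assumptions that would have to be replaced by the real definitions:

import tensorflow as tf

class CustomCell(tf.keras.layers.Layer):
    # Sketch of a cell computing h(t) = tanh(W.x + U.h(t-1) + V.r(t) + b)
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.state_size = [units, units]   # carry h(t-1) and p(t-1) between steps
        self.output_size = units

    def build(self, input_shape):
        in_dim = input_shape[-1]
        self.W = self.add_weight(shape=(in_dim, self.units), name="W")
        self.U = self.add_weight(shape=(self.units, self.units), name="U")
        self.V = self.add_weight(shape=(self.units, self.units), name="V")
        self.b = self.add_weight(shape=(self.units,), initializer="zeros", name="b")
        self.alpha = self.add_weight(shape=(self.units, self.units), name="alpha")
        self.W_r = self.add_weight(shape=(in_dim, self.units), name="W_r")  # used by the placeholder f

    def call(self, inputs, states):
        h_prev, p_prev = states
        # p(t) = O(t-1) . alpha + p(t-1); O(t-1) taken as softmax(h(t-1)) here (assumption)
        p = tf.matmul(tf.nn.softmax(h_prev), self.alpha) + p_prev
        # r(t) = f(x, p(t)); this f is a placeholder, replace it with your actual definition
        r = tf.tanh(tf.matmul(inputs, self.W_r) + p)
        h = tf.tanh(tf.matmul(inputs, self.W)
                    + tf.matmul(h_prev, self.U)
                    + tf.matmul(r, self.V)
                    + self.b)
        return h, [h, p]

layer = tf.keras.layers.RNN(CustomCell(32), return_sequences=True)
out = layer(tf.random.normal((2, 5, 8)))   # (batch, time, features) -> (batch, time, units)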
I was asked to create a leg-follower robot (I already did it), and in the second part of this assignment I have to develop a Kalman filter in order to improve the robot's following behaviour. The robot gets from the person the distance to the robot and also the angle (it is a relative angle, because the reference is the robot itself, not absolute x-y coordinates).
About this assignment I have a serious doubt. Everything I have read, every sample I have seen about the Kalman filter has been in one dimension (a car covering a distance or a rock falling from a building), and according to the task I would have to apply it in two dimensions. Is it possible to apply a Kalman filter like this?
If it is possible to calculate a Kalman filter in two dimensions, then I understand that what I'm asked to do is to follow the legs in a linearized way, even though a person walks erratically (with random movements). About this I am unsure how to establish the state transition function/matrix; could anyone please tell me how to do it, or where I can find more information about this?
Thanks.
Well, you should read up on the Kalman filter. Basically, what it does is estimate a state through its mean and variance separately. The state can be whatever you want. You can have local coordinates in your state, but also global coordinates.
Note that the latter will certainly result in nonlinear system dynamics, in which case you could use the Extended Kalman Filter, or, to be more correct, the continuous-discrete Kalman filter, where you treat the system dynamics in a continuous manner and the measurements in discrete time.
Example with global coordinates:
Assume you have a small cubic mass which can drive forward with velocity v. You could simply model the dynamics in local coordinates only, where your state s would be s = [v], which is a linear model.
But, you could also incorporate the global coordinates x and y, assuming we are moving on a plane only. Then you would have s = [x, y, phi, v]'. We need phi to keep track of the current orientation since the cube can only move forward in respect to its orientation of course. Let's define phi as the angle between the cube's forward direction and the x-axis. Or in other words: With phi=0 the cube would move along the x-axis, with phi=90° it would move along the y-axis.
The nonlinear system dynamics with global coordinates can then be written as
s_dot = [x_dot, y_dot, phi_dot, v_dot]'
with
x_dot = cos(phi) * v
y_dot = sin(phi) * v
phi_dot = ...
v_dot = ... (Newton's Law)
In the EKF (Extended Kalman Filter) prediction step, you would use the (discretized) equations above to predict the mean of the state, and the linearized (and discretized) equations to predict its variance.
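A sketch of that prediction step in numpy (Euler discretization; the inputs u standing in for phi_dot and v_dot are placeholders):

import numpy as np

def ekf_predict(s, P, u, Q, dt):
    # One EKF prediction step for the state s = [x, y, phi, v]'
    x, y, phi, v = s
    phi_dot, v_dot = u                       # placeholder inputs; use your actual dynamics here
    # Euler-discretized nonlinear dynamics: s[k+1] = s[k] + dt * s_dot
    s_pred = s + dt * np.array([np.cos(phi) * v,
                                np.sin(phi) * v,
                                phi_dot,
                                v_dot])
    # Jacobian of the discretized dynamics w.r.t. the state (the linearization used for the variance)
    F = np.array([[1, 0, -dt * np.sin(phi) * v, dt * np.cos(phi)],
                  [0, 1,  dt * np.cos(phi) * v, dt * np.sin(phi)],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]])
    P_pred = F @ P @ F.T + Q
    return s_pred, P_pred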
There are two things to keep in mind when you decide what your state vector s should look like:
You might be tempted to use my linear example s = [v] and then integrate the velocity outside of the Kalman Filter in order to obtain the global coordinate estimates. This would work, but you would lose the awesomeness of the Kalman Filter since you would only integrate the mean of the state, not its variance. In other words, you would have no idea what the current uncertainties for your global coordinates are.
The second step of the Kalman Filter, the measurement or correction update, requires that you can describe your sensor output as a function of your states. So you may have to add states to your representation just so that you can express your measurements correctly as z[k] = h(s[k], w[k]) where z are measurements and w is a noise vector with Gaussian distribution.
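For the leg-follower case, a sketch of such a measurement function h and its Jacobian, under the assumption (not part of the answer above) that the state is the person's position and velocity in the robot frame and the sensor returns range and bearing:

import numpy as np

def range_bearing_measurement(s):
    # State s = [px, py, vx, vy]': the person's position/velocity relative to the robot.
    px, py = s[0], s[1]
    rng = np.hypot(px, py)                  # measured distance to the person
    bearing = np.arctan2(py, px)            # measured relative angle
    z_pred = np.array([rng, bearing])
    # Jacobian H = dh/ds, used in the EKF correction step
    H = np.array([[ px / rng,      py / rng,      0., 0.],
                  [-py / rng**2,   px / rng**2,   0., 0.]])
    return z_pred, H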