
Comments (8)

jnez71 commented on June 2, 2024

The default value of -1 for dtol effectively turns off that stopping criterion, because abs(RMS[-1] - RMS[-1-dslew])/dslew < dtol will always evaluate to False. I.e., by default dtol does nothing unless you explicitly pass in a positive value for it. If you don't, then the only remaining stopping criterion is nepochs, the number of times the trainer iterates over the whole training set (essentially, the maximum number of iterations). That criterion is always met eventually.

The computation abs(RMS[-1] - RMS[-1-dslew])/dslew is the absolute rate of change in RMS error on the training set between the latest epoch and the one dslew epochs ago. If the RMS error isn't changing much between epochs, then the training is likely to be complete.
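The stopping logic described above can be sketched as follows. This is a hypothetical illustration, not kalmann's exact code; the names (RMS, dslew, dtol, nepochs) follow the discussion:

```python
def should_stop(RMS, epoch, nepochs, dslew=10, dtol=-1):
    # Hypothetical sketch of the stopping criteria discussed above.
    if epoch >= nepochs:  # hard cap on passes over the training set
        return True
    if len(RMS) > dslew:
        # absolute rate of change in training RMS over the last dslew epochs
        rate = abs(RMS[-1] - RMS[-1-dslew]) / dslew
        if rate < dtol:  # with the default dtol=-1 this can never be True
            return True
    return False
```

With the default dtol=-1 the rate check never fires, so training always runs until nepochs is reached.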

from kalmann.

YannikYang1 commented on June 2, 2024

Hello, thank you very much for your reply. Your code has helped me learn a lot, but after spending some time trying to understand it I still have a few questions.
In 3d_dynamic.py, my understanding is that X is used as the observation, Xdot as the prediction, and Xh as the final fused result, but I don't understand why skip = int(per/dt) is defined separately. Could U=X[::skip], Y=Xdot[::skip] be replaced directly by U=X, Y=Xdot later in the program?
Also, how is the neuron type determined and why was tanh chosen? And how should the values of nl, P, Q, and R be determined? Different choices seem to have a considerable impact on the fusion results.
And in the array Xh of fused results, why is xh = xh + knn.feedforward(xh)*dt computed inside the for loop? How should I choose the value of dt?
Finally, I replaced all the values in X and Xdot with positional data (xyz coordinates) from two sensors used for vehicle positioning, but the resulting Xh from my own run is very different. How should I adjust those parameters?
Thank you so much; I look forward to your reply.
Yours sincerely, Yannik Yang.


jnez71 commented on June 2, 2024

I'm glad you found this project insightful. To answer your questions:

  • U=X[::skip], Y=Xdot[::skip] was just to avoid using the dataset so densely. I didn't want to wait forever for results, so instead of using every timestep of the simulation as a datapoint, I only kept one every skip timesteps.
  • The choice of tanh for the neuron was arbitrary. I was more interested in comparing the EKF to SGD, not in comparing various network architectures.
  • Q, R and the initial P act more as "tuning-knobs" (hyperparameters) in this setting. However if you know the covariance matrix of the data generative process, you can use that for R. In principle it could be estimated from the dataset as well, but I don't do that here. The impact they have on the results is akin to the choice of stepsize for SGD; some values break everything while others are all indistinguishably adequate. You just have to tune them to get into the stable zone.
  • Xh is the estimated state trajectory when using the learned dynamic autonomously. I.e. I am running a simulation where the trained neural network is now standing-in for the true dynamic that generated the dataset. dt should be chosen just like you'd choose any other ODE integrator timestep.
  • The concept of Xh only makes sense if you are fitting a dynamics model. I'm not sure what your application is but try looking at the 1d_fit.py demo instead for a basic regression task you can build up from.
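The Xh rollout described above amounts to Euler-integrating the trained network in place of the true dynamic. A minimal sketch, where a dummy lambda stands in for kalmann's trained KNN (the names feedforward, dt, tf follow the discussion; the dynamic itself is illustrative):

```python
import numpy as np

# Hypothetical stand-in for the trained network: approximates xdot = -x
feedforward = lambda x: -x

dt, tf = 0.01, 10.0            # ODE integrator timestep and final time
T = np.arange(0.0, tf, dt)
Xh = np.zeros((len(T), 1))
Xh[0] = 1.0                    # initial condition
for i in range(len(T) - 1):
    # Euler step: the network stands in for the true dynamic
    Xh[i+1] = Xh[i] + feedforward(Xh[i]) * dt
```

With this toy dynamic, Xh decays toward zero just as the true solution exp(-t) would, which is the sense in which the trained network "replays" the system autonomously.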


YannikYang1 commented on June 2, 2024

Dear Mr. jnez71,
Thank you very much for your answers to the five questions I asked before. I am a graduate student, currently in the first year of my master's degree. Your answers helped me understand this material, and parts of your code are relevant to the research direction of my paper. However, after these days of study I am still confused by some of the code:

  1. In xh = xh + knn.feedforward(xh)*dt in your 3d_dynamic.py code, is dt the difference between the timestamps of two adjacent rows of x, y, and z coordinate data during collection? I still don't understand the ODE integrator timestep you mentioned, nor why dt=0.01 and tf=100. Couldn't dt=1 with tf=10000, or some other value of dt, also satisfy the requirement of 10000 lines of initial data?
  2. I learned that the final fusion formula of a Kalman filter is Xh[i] = Xdot[i] + K*(X[i] - H*Xdot[i]). I obtained values of K and H through neural network training and substituted them into that formula, but it failed to run because of a row/column mismatch. I don't understand why xh = xh + knn.feedforward(xh)*dt is used here to determine the fused result xh.
  3. sig(self._affine_dot(self.w[0], U)) and h = self._affine_dot(self.w[1], l) are used to define the output of the feedforward neural network. How are the internals of the feedforward network set up?
  4. The number of neurons nl seems to have a big impact on the network. What happens if we set other values of nl?

I sincerely thank you for answering my questions twice already. Could you take time out of your busy schedule to help me resolve my confusion once more? Your code is of great help to my research. I wish you all the best.
Yours sincerely, Yannik Yang.


jnez71 commented on June 2, 2024
  1. No. The training of the NN (via EKF) is already complete at that point. The computation of Xh is just using the trained network for the dynamics application that is specific to that particular demo script.
  2. Again, Xh is not a "final fusion result," nor would the equation you suggested be such a thing. The weights of the NN itself are the closest thing to what I'd call a "final fusion result." This repo is using an EKF to train an NN, i.e. to estimate the weights of the NN as explained in this book. Then in the 3d_dynamic demo, I happen to apply this methodology to the problem of approximating an ODE. I highly recommend you focus on the other two demos, which are more straightforward because the only dynamic is the training process.
  3. That's what each layer's computation looks like. The full NN is just a composition of those layers.
  4. Give it a try! It depends on the problem you're trying to solve.
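Answer 3 describes each layer as one computation, with the full network a composition of such layers. A minimal numpy sketch under that description (the single-hidden-layer shape, tanh activation, and all names here are illustrative paraphrases, not kalmann's exact code):

```python
import numpy as np

def affine_dot(W, x):
    # One layer's affine map: append a 1 so the bias column of W
    # is folded into a single matrix product.
    return W @ np.append(x, 1.0)

rng = np.random.default_rng(0)
nu, nl, ny = 3, 5, 3                # input, hidden, output sizes
W0 = rng.normal(size=(nl, nu + 1))  # hidden-layer weights + bias column
W1 = rng.normal(size=(ny, nl + 1))  # output-layer weights + bias column

def feedforward(u):
    l = np.tanh(affine_dot(W0, u))  # hidden layer with sigmoidal activation
    return affine_dot(W1, l)        # linear output layer

y = feedforward(np.ones(nu))
```

The full network is literally the composition of the two layer calls; deeper networks would just chain more of them.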


jnez71 commented on June 2, 2024

(I deleted a duplicate comment).


YannikYang1 commented on June 2, 2024

Dear Mr. jnez71,
Thank you very much for your answers. After your three replies I now understand most of the principles of your code, and I am full of gratitude. But a small part still confuses me, so I would like to ask you again.
1. Before, I was confused about the setting of dt, but seeing x = x + Xdot[i]*dt # step simulation on line 25 of 3d_dynamic.py and xh = xh + knn.feedforward(xh)*dt on line 40, I think I understand now. My understanding is that by training with the EKF we aim to make the value of knn.feedforward(xh) approach the value of Xdot[i], so that knn.feedforward(xh) replaces Xdot[i] and the output Xh[i] replaces X[i]. Is my understanding correct?
2. I have re-studied 1d_fit.py and 3d_dynamic.py, which are very similar. For 3d_dynamic.py, is the purpose of the program also to fit a curve to a finite number of points (data), and then to derive arbitrarily many more accurate points between two adjacent X[i]? Is that the purpose of the program?
3. I have re-examined knn.py carefully, and have another question about, for example, sig(self._affine_dot(self.w[0], U)) and h = self._affine_dot(self.w[1], l). Where do these formulas come from, and what is their principle and provenance?
Thanks again for your answers. I look forward to hearing from you, and I wish you all the best.
Yours sincerely, Yannik Yang.


jnez71 commented on June 2, 2024
  1. Yes that sounds correct. Note that any method could have been used to train the neural network to approximate Xdot. I happen to use EKF training here because that is what I was experimenting with to compare against SGD.
  2. In a real application, there would be much better ways to approximate a dynamical system from data than least-squares fitting a shallow fully-connected neural network to the time derivative as was done here. There was no purpose to this particular experiment other than to play around with the idea of EKF training a neural network. I just thought approximating a dynamic would be a fun test.
  3. Those are the internal operations of a basic fully-connected neural network. The input is multiplied by a weight matrix, added to a bias, and then passed through a sigmoidal "activation" function. The affine_dot method just combines the multiplication by weight matrix and addition of bias into a single matrix operation. It's less efficient that way but at the time it helped me reason through the Jacobians necessary for the EKF implementation.
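The "single matrix operation" trick described in answer 3 can be demonstrated in a few lines: stack the bias as an extra column of the weight matrix and append a 1 to the input. The numbers and names below are illustrative, not taken from kalmann:

```python
import numpy as np

W = np.array([[2.0, 0.0],
              [0.0, 3.0]])
b = np.array([1.0, -1.0])
x = np.array([4.0, 5.0])

# separate multiply-then-add
y1 = W @ x + b

# combined affine dot: stack b as an extra column, append 1 to x
Wb = np.hstack([W, b[:, None]])
y2 = Wb @ np.append(x, 1.0)

assert np.allclose(y1, y2)  # both give [9.0, 14.0]
```

Treating weights and bias as one matrix makes the parameter vector (and hence the EKF's Jacobians with respect to it) easier to reason about, at some cost in efficiency.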

