
Comments (6)

sdobber commented on June 21, 2024

Thanks for reporting, I'll have a look.

TPA-LSTM works fine for me for bigger datasets, but I've had similar issues with DSANet occasionally. The normalization layers sometimes lead to NaNs in the training loop, which makes the whole model output useless. Unfortunately, I've never gotten to the bottom of when and why exactly this happens.
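For illustration only: one common way a normalization step can produce NaNs is a zero standard deviation, which turns the division into 0/0. Whether this is what happens inside DSANet is not established in this thread. The sketch below uses plain Julia with the `Statistics` standard library; `normalize_features` and its `epsilon` keyword are hypothetical names, not part of Flux or FluxArchitectures.

```julia
using Statistics

# Hypothetical helper: normalize each column to zero mean and unit variance.
# `epsilon` keeps the denominator away from zero for constant columns.
function normalize_features(x::AbstractMatrix{<:AbstractFloat}; epsilon=1f-5)
    mu = mean(x, dims=1)
    sigma = std(x, dims=1, corrected=false)
    return (x .- mu) ./ (sigma .+ epsilon)
end

# The first column is constant, so its standard deviation is exactly zero.
x = Float32[1 2; 1 3; 1 4]

# Unguarded normalization divides 0 by 0 in the first column.
unguarded = (x .- mean(x, dims=1)) ./ std(x, dims=1, corrected=false)
guarded = normalize_features(x)

println("unguarded has NaN: ", any(isnan, unguarded))
println("guarded has NaN:   ", any(isnan, guarded))
```

A check like `any(isnan, output)` on a layer's output can help narrow down where the first NaN appears.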

from fluxarchitectures.jl.

lorrp1 commented on June 21, 2024

The other models give me problems as well. Could it be because of a different Flux version? (I'm using Flux 0.11.1.)
I can't properly install the version used in your manifest (errors during precompilation).

I'm not getting an error, though, just straight lines or NaN32.

sdobber commented on June 21, 2024

As far as I know, Flux 0.10 only works on Julia up to 1.4.2, which is why you can't use the version from the manifest files. When I update to Flux v0.11 and Julia 1.5.1, I can run all the files, but DSANet gives NaNs as you describe, and the training for the other models does not produce any usable results.

As for the latter, I think this might be related to some changes in recent Flux versions. There was an issue where training of recurrent neural networks was not handled properly, and there still seem to be some remaining bugs to be ironed out (see e.g. FluxML/Flux.jl#1209 or FluxML/Flux.jl#1324). I would guess that the metaparameters in the example files (number of hidden layers etc.) are way off currently.

[Edit:] OK, now I'm having weird problems as well, with LSTNet throwing an error and DARNN running in some infinite loop. I seriously don't know yet what is causing this. I am using the same code in a bigger project where all models train fine...

lorrp1 commented on June 21, 2024

I'm trying with Julia 1.4.2. I had some issues compiling Flux, but it should be fine now.
Using @show with LSTNet gives me this:
Flux.mse(pred, target) = 1.6141138f30
Flux.mse(pred, target) = 1.6141138f30
Flux.mse(pred, target) = 1.6141138f30
Flux.mse(pred, target) = 1.6141138f30
Flux.mse(pred, target) = 1.589704f30
...
Flux.mse(pred, target) = 1.589704f30 (after some minutes)
and the chart is very off.

DSANet returns NaN32.
DARNN may be working instead (I'm still running it, but it is very slow). Edit: it is turning into a line. TPA-LSTM works (it turns into a line, as written in the repo).
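As a rough sketch, the two symptoms reported here (a loss that barely moves and NaN outputs) can be told apart automatically from a recorded loss history. The helper below is plain Julia and purely illustrative: `diagnose_losses` and its `rtol` threshold are hypothetical and not part of Flux or FluxArchitectures. The example values are the ones from the log above.

```julia
# Hypothetical helper: classify a recorded loss history.
# Returns :empty, :nonfinite, :stalled, or :improving.
function diagnose_losses(losses::AbstractVector{<:Real}; rtol=0.05)
    isempty(losses) && return :empty
    # Any NaN or Inf means the run has diverged.
    any(!isfinite, losses) && return :nonfinite
    first_loss, last_loss = first(losses), last(losses)
    # A loss that starts at zero has nothing left to improve.
    first_loss == 0 && return :stalled
    # Relative improvement from the first to the last recorded loss.
    improvement = (first_loss - last_loss) / abs(first_loss)
    return improvement < rtol ? :stalled : :improving
end

# Loss values taken from the LSTNet log in this comment.
losses = Float32[1.6141138f30, 1.6141138f30, 1.6141138f30, 1.6141138f30, 1.589704f30]

println(diagnose_losses(losses))
println(diagnose_losses(Float32[1.0, NaN]))
```

With the default `rtol`, a drop of roughly 1.5% over the whole run, as in the log, counts as stalled. The threshold is an arbitrary choice for this sketch.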

sdobber commented on June 21, 2024

@lorrp1 This is how far I got with fixing things. DSAnet is still broken, and I fear that the issue there is either rather complex or well hidden.

lorrp1 commented on June 21, 2024

Hello @sdobber, I have tested the latest update:
DSANet works sometimes.
LSTNet, TPA-LSTM, and DARNN work now.
