Comments (6)
Thanks for reporting, I'll have a look.
TPA-LSTM works fine for me for bigger datasets, but I've had similar issues with DSANet occasionally. The normalization layers sometimes lead to NaNs in the training loop, which makes the whole model output useless. Unfortunately, I've never gotten to the bottom of when and why exactly this happens.
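To illustrate how a normalization layer can silently produce NaNs: if an input window is constant, its standard deviation is zero, and z-score normalization divides 0 by 0. A plain-Julia sketch (illustrative only, not DSANet's actual code):

```julia
# A constant input window has zero standard deviation, so z-score
# normalization computes 0f0 / 0f0 == NaN32, which then poisons
# every downstream value, including the loss.
x = Float32[5, 5, 5, 5]
μ = sum(x) / length(x)
σ = sqrt(sum(abs2, x .- μ) / length(x))   # σ == 0f0 here
x_norm = (x .- μ) ./ σ                     # all NaN32

loss = sum(abs2, x_norm)                   # NaN propagates into the loss
isfinite(loss) || @warn "non-finite loss detected; model output is unusable"
```

A cheap `isfinite` guard like the last line, placed inside the training loop, at least catches the corruption early instead of letting training continue on garbage.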
from fluxarchitectures.jl.

The other models give me problems as well. Could it be because of a different Flux version? (I'm using Flux v0.11.1.)
I can't properly install the version used in your Manifest (errors during precompilation).
I'm not getting errors though, just straight-line predictions or NaN32 values.
As far as I know, Flux 0.10 only works on Julia up to 1.4.2, which is why you can't use the version from the Manifest files. When I update to Flux v0.11 and Julia 1.5.1, I can run all the files, but DSAnet gives NaNs as you describe, and the training for the other models does not produce any usable results.
As for the latter, I think this might be related to some changes in recent Flux versions. There was an issue where training of recurrent neural networks was not handled properly, and there still seem to be some remaining bugs to be ironed out (see e.g. FluxML/Flux.jl#1209 or FluxML/Flux.jl#1324). I would guess that the hyperparameters in the example files (number of hidden layers etc.) are currently way off.
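One classic pitfall with recurrent models, related to the state-handling issues linked above, is that hidden state persists across sequences unless it is explicitly reset. A plain-Julia sketch of the idea with a toy RNN cell (illustrative only; Flux's own layers use `Flux.reset!` for this):

```julia
# Toy recurrent cell: the hidden state h carries over between calls,
# so two independent sequences only see identical conditions if the
# state is reset in between.
mutable struct TinyRNN
    Wx::Float32
    Wh::Float32
    h::Float32
end
step!(m::TinyRNN, x::Float32) = (m.h = tanh(m.Wx * x + m.Wh * m.h))
reset!(m::TinyRNN) = (m.h = 0f0)

m = TinyRNN(0.5f0, 0.9f0, 0f0)
out1 = [step!(m, x) for x in Float32[1, 1, 1]]
reset!(m)   # without this, the second sequence starts from stale state
out2 = [step!(m, x) for x in Float32[1, 1, 1]]
# out1 == out2 only because the state was cleared in between
```

If the framework mishandles this state (or its gradient), training can quietly converge to useless solutions, which matches the "no usable results" symptom.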
[Edit:] OK, now I'm having weird problems as well, with LSTnet throwing an error and DARNN running in some infinite loop. I seriously don't know yet what is causing this - I am using the same code in a bigger project where all models train fine...
I'm trying with Julia 1.4.2. I had some issues compiling Flux, but it should be fine now.
Using `@show` with LSTnet returns this:
Flux.mse(pred, target) = 1.6141138f30
Flux.mse(pred, target) = 1.6141138f30
Flux.mse(pred, target) = 1.6141138f30
Flux.mse(pred, target) = 1.6141138f30
Flux.mse(pred, target) = 1.589704f30
...
Flux.mse(pred, target) = 1.589704f30 (after some minutes)
and the resulting chart is way off.
DSAnet returns NaN32.
DARNN may be working (I'm still running it, but it is very slow). Edit: its prediction is turning into a straight line. TPA-LSTM works, though its prediction also turns into a line, as noted in the repo.
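An MSE stuck around 1.6f30 usually means the optimization is diverging rather than the data genuinely being that large; clipping the gradient norm is one common remedy. A minimal sketch in plain Julia (the gradient vector here is a stand-in, not taken from LSTnet; recent Flux versions provide this built in as the `ClipNorm` optimiser):

```julia
# Rescale a gradient vector to a maximum L2 norm, a standard hedge
# against exploding losses like the 1.6f30 values in the log above.
function clip_norm!(g::AbstractVector{Float32}, maxnorm::Float32)
    n = sqrt(sum(abs2, g))
    if n > maxnorm
        g .*= maxnorm / n
    end
    return g
end

g = Float32[3f15, 4f15]   # huge gradient, L2 norm 5f15
clip_norm!(g, 1.0f0)      # rescaled in place to unit norm
```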
@lorrp1 This is how far I got with fixing things. DSAnet is still broken, and I fear that the issue there is either rather complex or well hidden.
Hello @sdobber, I have tested the latest update:
DSAnet works sometimes.
LSTnet/TPA/DARNN work now.