Comments (8)
Hi @lorrp1, sorry for the late reply.
The model gets as input a number of features from a time series, say from time `t - poollength` to time `t`. The task is to predict a future value at `t + horizon` from these features. Following the Flux performance guide, all features are assembled in a matrix whose last dimension corresponds to the different timesteps.
To give the model the past time series up to the point that is currently observable, the `poollength` parameter provides a window of `poollength` steps back in time for all the features. All this happens in the second dimension of `input`: `input[:,1,1,60]` is a vector of all the feature values at the 60th timestep, and `input[:,2,1,60]` is a vector of the feature values one step back from the 60th timestep. (So, for example, `input[:,2,1,61] == input[:,1,1,60]` holds true.) Whether you include the time series from which the target is derived as a feature is up to you, but I see no reason not to supply the model with everything that is known up to the current point in time.
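As a sketch of this layout (the `data` matrix and the sizes below are made up for illustration, not part of FluxArchitectures), the 4-d `input` tensor can be built so that the second dimension looks back in time, and the overlap relation then holds by construction:

```julia
# Hypothetical raw data: an nfeatures × ntimesteps matrix of feature values.
nfeatures, ntimesteps, poollength = 3, 100, 10
data = rand(Float32, nfeatures, ntimesteps)

# input[:, k, 1, t] holds the features at time t - k + 1, i.e. the second
# dimension provides a window of poollength steps back in time.
input = zeros(Float32, nfeatures, poollength, 1, ntimesteps)
for t in poollength:ntimesteps, k in 1:poollength
    input[:, k, 1, t] = data[:, t - k + 1]
end

# The relation from the text: one step back at t == current step at t - 1.
@assert input[:, 2, 1, 61] == input[:, 1, 1, 60]
```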
`m.ConvLayer` operates only on the first two dimensions, so for a fixed timepoint it should only be able to access the features for the current timestep and `poollength` steps back in time. Its output is then a time series with `convlayersize` new "features". This way the model should not have access to future points in time, though I admit that I never checked that thoroughly.
The training then tries to optimize the model parameters so that the model output (seen as a time series) matches the target, which means that at time `t` it should predict the target variable at time `t + horizon`.
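To make the shape flow concrete, here is a rough stand-in for such a convolution (the layer sizes are made up; LSTnet's actual `ConvLayer` is defined in FluxArchitectures): a kernel spanning the full feature × `poollength` window collapses the first two dimensions and yields `convlayersize` new channels per timestep.

```julia
using Flux

nfeatures, poollength, convlayersize, ntimesteps = 3, 10, 6, 50
input = rand(Float32, nfeatures, poollength, 1, ntimesteps)

# The (nfeatures × poollength) kernel sees only the current timestep's
# window of past values; the last dimension indexes the timesteps.
conv = Conv((nfeatures, poollength), 1 => convlayersize, relu)
out = conv(input)
size(out)   # → (1, 1, 6, 50): convlayersize features per timestep
```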
The point you mention, that the model could get an almost perfect fit by basically just outputting something close to the current value, is a general problem in time-series forecasting and applies to all methods; it is part of what makes forecasting difficult.
Concerning your question number 3, I am not an expert on jingw2's forecasting setup, but it looks very different from what I am doing in my code. I am only interested in forecasting a short time ahead, and for each new forecast I have new data available that I can feed to the model. So I basically use everything available up to a certain time to come up with a prediction, until the data is exhausted. On https://github.com/jingw2/demand_forecast, it looks like they train the model on a certain dataset and then let it output forecasts for a longer period of time. I would inspect their code to see if it gives some hints about how they train and how they forecast.
from fluxarchitectures.jl.
Thank you for the explanation, I think I now understand how the model works.
But is there no easy way to make a trained model accept a shorter input in `pred = model(input)`? When I try `pred = model(input)` with a smaller data length than the one used to initially train the model, I get:
`DimensionMismatch("arrays could not be broadcast to a common size")`
The `Conv((in, poolsize))` dimensions should match, assuming there is enough input data for the pool size, since the data length is not fixed in the model's initialization. The error occurs in `m.RecurLayer(a)` (I used `a = m.ConvLayer(x)` to check whether the error was in the convolutional or the recurrent layer). The `convlayersize` is also the same, so I don't really understand the error in the recurrent layer.
Edit: another issue would be how to know whether the model isn't just overfitting, without a test or validation sample.
With the way Flux treats recurrent layers, their hidden state gets initialized to the correct size for the input data the first time you call a layer. When the size changes (e.g. by switching from training to test data), you get the `DimensionMismatch("arrays could not be broadcast to a common size")` error. The solution is to call `Flux.reset!(model)` before changing the size of the input. I normally include that in the loss function (this was mentioned in the Flux documentation at one point, but it seems to have been removed since):

```julia
loss(x, y) = begin
    l = Flux.mse(model(x), y)
    Flux.reset!(model)
    return l
end
```
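The same reset is needed before predicting with a trained model on input of a different length, which should answer the error above. A minimal sketch, assuming a Flux version with stateful recurrent layers (as FluxArchitectures uses); the layer sizes and data here are placeholders:

```julia
using Flux

# Stand-in for a stateful recurrent model: the hidden state is sized for
# the batch on the first call.
model = Chain(LSTM(3 => 4), Dense(4 => 1))

big   = rand(Float32, 3, 60)   # batch of 60
small = rand(Float32, 3, 20)   # batch of 20

model(big)             # hidden state is now sized for a batch of 60
Flux.reset!(model)     # without this, model(small) can throw DimensionMismatch
pred = model(small)
```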
And of course it is a good idea to have a training, a validation and a test data set. I just wanted to keep my code simple to focus on the networks, and not build a whole data-handling structure around everything.
Thank you again. I'm going to add a test/validation split, and maybe even adjust the Adam optimizer during training.
It seems the models return NaN every time the pool length is higher than 2/3.
I tried LSTNet with `poollength = 1, 2, 5, 10, 15, 20, 50, 100`, and all of those worked fine. The variable defines a number of timesteps, so a non-integer value doesn't make sense.
By 2/3 I meant 2 or 3, but it's working now; I had changed how the dataloader reads the CSV and made a mistake there.
Do you think it would be enough to change the size of the output of the last dense layer (and the loss) to turn LSTnet into a classifier?
Might be worth a try. For my use case classification never really worked out, so my experience with this is limited.
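For the classification idea, a hedged sketch of what such a change could look like, following the usual Flux pattern of swapping the output layer and the loss (all sizes below are made up, and `backbone` merely stands in for everything in LSTnet up to the final dense layer):

```julia
using Flux

nclasses = 3                                   # made-up number of classes
backbone = Chain(Dense(8 => 16, relu))         # placeholder feature extractor
clf = Chain(backbone, Dense(16 => nclasses))   # class scores instead of one value

# Swap mse for a classification loss; logitcrossentropy takes raw scores
# and one-hot targets.
loss(x, y) = Flux.logitcrossentropy(clf(x), y)

x = rand(Float32, 8, 5)                               # 5 samples
y = Flux.onehotbatch(rand(1:nclasses, 5), 1:nclasses)
loss(x, y)
```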