ermongroup / csdi

Codes for "CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation"

License: MIT License
Thanks for sharing. I tried exe_physio.py in "pretrained" mode, and it produced the following error:
File "D:\Code\CSDI\CSDI-main\main_model.py", line 14, in __init__
self.emb_feature_dim = config["model"]["featureemb"]
KeyError: 'featureemb'
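One hedged possibility (an assumption, not a confirmed diagnosis) is that the saved pretrained config predates the `"featureemb"` key. A defensive lookup like the sketch below would avoid the crash; the default of 16 is purely illustrative, not the repository's actual value.

```python
# Hypothetical defensive lookup: fall back to a default embedding size when a
# pretrained config is missing the "featureemb" key. The default of 16 is an
# assumption for illustration, not the repository's actual value.
def get_feature_emb_dim(config, default=16):
    return config["model"].get("featureemb", default)

old_config = {"model": {"timeemb": 128}}  # config missing "featureemb"
print(get_feature_emb_dim(old_config))    # falls back to the default
```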
Hi, thank you for your nice work.
I have a question about the CRPS metric in eq. (17) of your paper: why does "i" range from 1 to 19?
Looking forward to your reply. Thanks!
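For context, a common way to approximate CRPS is to average quantile losses over the 19 levels q = i/20 for i = 1, ..., 19 (0.05 through 0.95), which would explain that range. The sketch below shows that discretization; it is an illustration of the common recipe, not necessarily the paper's exact implementation.

```python
import numpy as np

# Sketch: CRPS approximated by averaging quantile losses over 19 quantile
# levels q = i/20, i = 1..19, normalized by the total absolute target value.
def quantile_loss(target, forecast_q, q):
    return 2 * np.sum(np.abs((forecast_q - target) * ((target <= forecast_q) - q)))

def crps_approx(target, samples):
    # samples: (n_samples, ...) posterior samples; target: matching trailing shape
    quantiles = [i / 20 for i in range(1, 20)]
    denom = np.sum(np.abs(target))
    total = sum(quantile_loss(target, np.quantile(samples, q, axis=0), q)
                for q in quantiles)
    return total / (len(quantiles) * denom)
```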
Could you provide the code for your paper? Thank you.
Thank you for your great work. The link to the electricity dataset is broken; could you please re-upload it?
I am very interested in this line of research and want to explore it further. However, I cannot reproduce the time series forecasting results described in the paper: our CRPS on the electricity dataset is 0.1594, while yours is 0.017. Could you please help me figure out the differences between our experimental settings?
Our implementation and experimental settings are as follows. First, we use the code from diff_models.py and main_model.py as the forecasting model, the same as in your paper. For the dataset, we use all 370 dimensions (clients) of the electricity dataset and set the prediction length to 24, following the settings in the related work [1]. We also apply the test pattern strategy for target choice, implemented by ourselves. For each time series instance, we treat the first 24 steps as the observed part and the last step as the target, forecasting the target value autoregressively.
I cannot find the cause of this performance gap. Could you please share the time series forecasting code if it is available?
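For concreteness, the window construction described above (24 observed steps followed by one forecast step, for all 370 clients) might be sketched as a conditional mask like this; the shapes and the helper itself are my assumptions, not the paper's code.

```python
import numpy as np

# Sketch of the forecasting mask described above: for a window of length
# context_len + pred_len, the first context_len steps of every feature are
# observed (mask 1) and the remaining steps are forecasting targets (mask 0).
# Shape convention (features, time) is an assumption for illustration.
def forecast_mask(num_features, context_len=24, pred_len=1):
    cond_mask = np.zeros((num_features, context_len + pred_len))
    cond_mask[:, :context_len] = 1  # observed context
    return cond_mask
```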
Congratulations on your paper being accepted to NeurIPS, and thank you for sharing your code! I thought the task as described might be a good fit for a DEformer-like model (hereafter "DEformer-CSDI"), so I decided to run an experiment on the 10% missing healthcare dataset and I thought you might be interested in the results (code here). While my test set is identical to yours, I changed the training/validation split to 95%/5% and I used an online strategy to generate missing values for each training sample. Specifically, every time a training sample was encountered, I randomly selected 10% of the observed values to serve as the missing values.
Like the DEformer, the input for DEformer-CSDI consists of a mix of identity feature vectors and identity/value feature vectors. The difference in this case is that DEformer-CSDI is not learning the joint distribution, so only the identity feature vectors are included for the missing values and the attention mask is now full instead of lower triangular (i.e., every input can attend to every other input). Identity was encoded as f(t, k) = [t, embed(k)] where t and k are the time and feature indices, respectively, for a data point. One interesting difference between DEformer-CSDI and CSDI is that DEformer-CSDI simply ignores missing values that are not being predicted.
With no hyperparameter tuning, DEformer-CSDI achieves a mean absolute error of 0.219 on the 10% missing healthcare dataset. I thought it was notable that DEformer-CSDI outperformed the flattened Transformer baseline from Table 7 by a wide margin. With that being said, DEformer-CSDI is much larger than CSDI (19,250,493 parameters), so it would be interesting to see if CSDI's performance could be improved further using this online sampling strategy.
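The online strategy described above (re-drawing the artificial missing set every time a training sample is visited) might be sketched as follows; the 10% ratio and the observed/missing distinction come from the description, while the function name and shape convention are assumptions.

```python
import numpy as np

# Sketch of the online masking strategy: each time a sample is drawn, randomly
# hide 10% of its *observed* entries to serve as training targets.
# observed_mask is 1 where a value was actually measured.
def sample_training_mask(observed_mask, missing_ratio=0.1, rng=None):
    rng = rng or np.random.default_rng()
    obs_indices = np.flatnonzero(observed_mask)
    n_hide = int(round(len(obs_indices) * missing_ratio))
    hidden = rng.choice(obs_indices, size=n_hide, replace=False)
    cond_mask = observed_mask.flatten().copy()
    cond_mask[hidden] = 0  # hidden entries become prediction targets
    return cond_mask.reshape(observed_mask.shape)
```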
Hello, thank you for your excellent work.
I would like to try time series forecasting using the test pattern strategy described in the paper.
Is there any example code on GitHub?
Dear author,
I have been reproducing the forecasting part of your paper recently, and I noticed that the paper says relatively little about using CSDI for forecasting. Could you explain specifically how the forecasting is done? Also, you only provided the electricity dataset; would it be convenient to provide the other datasets from gluonts?
I would greatly appreciate it.
I ran the code you posted on GitHub and found that training takes about half a minute per epoch, but testing takes about half an hour (the command is "python exe_physio.py --testmissingratio 0.1 --nsample 100"). Is this the expected behavior?
Hi,
When I run exe_forecasting.py, it gives me a type error:
TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not NoneType
It is from this part of the code:
main_model.py", line 356, in get_side_info
feature_embed = self.embed_layer(feature_id).unsqueeze(1).expand(-1,L,-1,-1)
It would be great if you could fix this.
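The traceback suggests `feature_id` arrives as `None` at the embedding call. A hedged sketch of one possible guard is below; the fallback to a 0..K-1 index tensor is my assumption about the intended behavior, not a confirmed fix for the repository.

```python
import torch

# Hypothetical guard: if feature_id was never constructed upstream, fall back
# to indexing every feature 0..num_features-1 before calling the embedding.
def embed_features(embed_layer, feature_id, num_features):
    if feature_id is None:
        feature_id = torch.arange(num_features)  # assumed default indexing
    return embed_layer(feature_id)
```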
Hi, thanks for your great work!
I ran the experiment on the physio dataset with the pretrained model file you provided, but I still can't reproduce the results in your paper.
For example, at a 10% missing ratio, the paper reports 0.498/0.217/0.238 (RMSE/MAE/CRPS), but I get 0.552/0.25/0.28.
Isn't this gap too big?
I'm very interested in your work, but I have a question about the difference between deterministic imputation and interpolation.
Hi,
In this line, when generating noisy data, why does the second coefficient have a square root? The paper does not have a square root for the second coefficient.
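For reference (this is the standard DDPM algebra, not a statement about the authors' intent): the forward process is a Gaussian whose variance is 1 − ᾱ_t, so sampling from it requires the square root of that variance as the noise coefficient:

```latex
q(x_t \mid x_0) = \mathcal{N}\!\left(\sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I\right)
\;\Longrightarrow\;
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\,\epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I).
```

If a paper equation shows 1 − ᾱ_t without the square root on the noise term, that is usually a typo, since the resulting variance must come out to 1 − ᾱ_t.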