jaydu1 / vitae
Joint Trajectory Inference for Single-cell Genomics Using Deep Learning with a Mixture Prior
Home Page: https://jaydu1.github.io/VITAE/
License: MIT License
Hello,
The autoencoder pre-training step fails, but only when I pass a preprocessed AnnData object (it works if the adata object has not been preprocessed beforehand):
# fit in data
model.get_data(adata=data, # count or expression matrix, (dense or sparse) numpy array
labels = data.obs['cluster_label'], # (optional) labels, which will be converted to string
gene_names = data.var['features'], # (optional) gene names, which will be converted to string
cell_names = data.obs['sample_name'] # (optional) cell names, which will be converted to string
)
# preprocess data
model.preprocess_data(gene_num = 2000, # (optional) maximum number of influential genes to keep (the default is 2000)
data_type = 'Gaussian', # (optional) data_type can be 'UMI', 'non-UMI' or 'Gaussian' (the default is 'UMI')
npc = 64, # (optional) number of PCs to keep if data_type='Gaussian' (the default is 64)
processed=True)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-9-2da55840b803> in <module>
3 batch_size=256, # (Optional) the batch size for pre-training (the default is 32).
4 alpha=0.10, # (Optional) the value of alpha in [0,1] to encourage covariate adjustment. Not used if there is no covariates.
----> 5 num_epoch = 300, # (Optional) the maximum number of epoches (the default is 300).
6 )
~/anaconda3/lib/python3.7/site-packages/VITAE/VITAE.py in pre_train(self, stratify, test_size, random_state, learning_rate, batch_size, L, alpha, num_epoch, num_step_per_epoch, early_stopping_patience, early_stopping_tolerance, path_to_weights)
274 batch_size,
275 self.X[id_train].astype(tf.keras.backend.floatx()),
--> 276 self.scale_factor[id_train].astype(tf.keras.backend.floatx()))
277 self.test_dataset = train.warp_dataset(self.X_normalized[id_test],
278 None if self.c_score is None else self.c_score[id_test].astype(tf.keras.backend.floatx()),
TypeError: 'NoneType' object is not subscriptable
Thank you in advance.
Best regards.
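The traceback above shows `self.scale_factor` being indexed while it is still `None`. A minimal sketch of that failure mode and one possible guard follows. The helper name `subset_or_default` and the fallback value of 1.0 are hypothetical and are not part of VITAE; whether a constant scale factor is the right choice for already-processed data is an assumption.

```python
def subset_or_default(values, indices, default=1.0):
    """Return values[i] for each index, or a constant when values is None.

    Hypothetical helper: it only illustrates guarding against indexing None.
    """
    if values is None:
        # Nothing was computed upstream, so fall back to a neutral constant.
        return [default for _ in indices]
    return [values[i] for i in indices]


id_train = [0, 2, 3]

# Reproduce the error from the report: None cannot be indexed.
try:
    None[id_train]
    error_message = ""
except TypeError as err:
    error_message = str(err)

# With the guard, a missing scale factor no longer raises.
train_scale = subset_or_default(None, id_train)
known_scale = subset_or_default([0.5, 1.5, 2.5, 3.5], id_train)
```

In this sketch the guard hides the missing value rather than explaining why it was never set, so it is a workaround for experimentation, not a fix.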
Hello,
Thanks for developing VITAE.
I tried to use VITAE, but model.preprocess_data fails when I pass an AnnData object to the model.get_data function.
The preprocessing should be done by scanpy, which is installed, but I get this error:
# fit in data
model.get_data(adata=data, # count or expression matrix, (dense or sparse) numpy array
labels = data.obs['cluster_label'], # (optional) labels, which will be converted to string
gene_names = data.var['features'], # (optional) gene names, which will be converted to string
cell_names = data.obs['sample_name'] # (optional) cell names, which will be converted to string
)
# preprocess data
model.preprocess_data(gene_num = 2000, # (optional) maximum number of influential genes to keep (the default is 2000)
data_type = 'UMI', # (optional) data_type can be 'UMI', 'non-UMI' or 'Gaussian' (the default is 'UMI')
npc = 64 # (optional) number of PCs to keep if data_type='Gaussian' (the default is 64)#)
)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-7-aa66b286b1ac> in <module>()
35 model.preprocess_data(gene_num = 2000, # (optional) maximum number of influential genes to keep (the default is 2000)
36 data_type = 'UMI', # (optional) data_type can be 'UMI', 'non-UMI' or 'Gaussian' (the default is 'UMI')
---> 37 npc = 64 # (optional) number of PCs to keep if data_type='Gaussian' (the default is 64)#)
38 )
2 frames
/usr/local/lib/python3.7/dist-packages/VITAE/preprocess.py in _recipe_seurat(adata, gene_num)
238 This uses a particular preprocessing
239 """
--> 240 cell_mask = sc.pp.filter_cells(adata, min_genes=200, inplace=False)[0]
241 adata = adata[cell_mask,:]
242 gene_mask = sc.pp.filter_genes(adata, min_cells=3, inplace=False)[0]
NameError: name 'sc' is not defined
I do not understand what the issue is, because scanpy appears to be imported as sc in the function you defined.
Thank you in advance.
Best regards.
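One way a `NameError` like the one above can arise even though an import statement exists is scoping: an `import ... as ...` written inside one function binds the alias only in that function. The sketch below uses the standard-library `json` module as a stand-in for scanpy, and the function names are hypothetical. Whether this is the actual cause inside `VITAE/preprocess.py` is an assumption.

```python
def load_helper():
    # The alias 'js' exists only inside this function's local scope.
    import json as js
    return js.dumps({"ok": True})


def use_helper():
    # 'js' was never bound at module level, so this lookup fails.
    return js.dumps({"ok": True})


local_result = load_helper()

try:
    use_helper()
    outcome = "no error"
except NameError as err:
    outcome = str(err)


def use_helper_fixed():
    # Importing in the function that needs the module avoids the problem.
    import json as js
    return js.dumps({"ok": True})


fixed_result = use_helper_fixed()
```

A module-level import would work equally well. The point is only that the alias has to be bound in a scope the calling function can see.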
Hello,
I found your manuscript interesting, and I'm wondering whether you would be interested in implementing a version that takes a pre-trained scvi-tools model (e.g., scVI) as input. I think this would get a lot of usage in our package!
Hello,
model.init_inference runs on the CPU version, although very slowly, but I cannot get it to run on the GPU version.
I get the following error:
# initialize inference
model.init_inference(batch_size=128,
L=150, # L is the number of MC samples
dimred='umap', # dimension reduction methods
#**kwargs # extra key-value arguments for dimension reduction algorithms.
random_state=seed
)
# after initialization, we can access some variables by model.pc_x, model.w, model.w_tilde, etc..
Computing posterior estimations over mini-batches.
---------------------------------------------------------------------------
ResourceExhaustedError Traceback (most recent call last)
<ipython-input-27-91c48b13b6e4> in <module>()
4 dimred='umap', # dimension reduction methods
5 #**kwargs # extra key-value arguments for dimension reduction algorithms.
----> 6 random_state=seed
7 )
8 # after initialization, we can access some variables by model.pc_x, model.w, model.w_tilde, etc..
10 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
58 ctx.ensure_initialized()
59 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60 inputs, attrs, num_outputs)
61 except core._NotOkStatusException as e:
62 if name is not None:
ResourceExhaustedError: OOM when allocating tensor with shape[128,150,1653,57] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node Tile_1 (defined at /usr/local/lib/python3.7/dist-packages/VITAE/model.py:367) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[Op:__inference__get_inference_4681280]
Function call stack:
_get_inference
I tried reducing the batch size to 64, 32, 16, and 8, but all of them failed. I am not running out of memory.
This is due to the size of the input data: when I reduce the number of cells in my data, it works.
Thank you in advance.
Best regards
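For scale, the size of the single tensor named in the error message can be estimated from its shape. The sketch below is plain arithmetic, assuming 8 bytes per element for the `double` type reported in the traceback. The helper names are hypothetical, and it does not account for any other allocations made by TensorFlow or VITAE.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=8):
    """Bytes needed to hold one dense tensor of the given shape."""
    return prod(shape) * bytes_per_element


def to_gib(num_bytes):
    """Convert a byte count to gibibytes."""
    return num_bytes / 2**30


# Shape copied from the ResourceExhaustedError message (type double).
reported_shape = (128, 150, 1653, 57)
reported_bytes = tensor_bytes(reported_shape)
reported_gib = to_gib(reported_bytes)

# Same shape with a first dimension of 8, for comparison.
small_batch_gib = to_gib(tensor_bytes((8, 150, 1653, 57)))

# Same shape stored as 4-byte floats instead of doubles.
float32_gib = to_gib(tensor_bytes(reported_shape, bytes_per_element=4))

print(f"reported tensor: {reported_gib:.2f} GiB")
print(f"first dim = 8:   {small_batch_gib:.2f} GiB")
print(f"as float32:      {float32_gib:.2f} GiB")
```

The first two dimensions match the `batch_size=128` and `L=150` arguments in the call, so this one tensor alone comes to roughly 13.5 GiB. The arithmetic does not explain why the smaller batch sizes also failed in the report.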