jaydu1 / vitae


Joint Trajectory Inference for Single-cell Genomics Using Deep Learning with a Mixture Prior

Home Page: https://jaydu1.github.io/VITAE/

License: MIT License

Python 0.63% Jupyter Notebook 99.28% R 0.09% Dockerfile 0.01%

python single-cell-sequencing tensorflow trajectory-inference


vitae's Issues

Issue in model.pre_train when setting processed=True in model.preprocess_data

Hello,

The autoencoder pre-training step fails, but only when I pass an already preprocessed AnnData object (it works if the adata object has not been preprocessed beforehand):

Preprocess data step:

# fit in data
model.get_data(adata=data,                   # count or expression matrix, (dense or sparse) numpy array 
               labels = data.obs['cluster_label'],       # (optional) labels, which will be converted to string
               gene_names = data.var['features'], # (optional) gene names, which will be converted to string
               cell_names = data.obs['sample_name']    # (optional) cell names, which will be converted to string
              )


# preprocess data
model.preprocess_data(gene_num = 2000,        # (optional) maximum number of influential genes to keep (the default is 2000)
                      data_type = 'Gaussian', # (optional) data_type can be 'UMI', 'non-UMI' or 'Gaussian' (the default is 'UMI')
                      npc = 64,               # (optional) number of PCs to keep if data_type='Gaussian' (the default is 64)
                      processed=True)

Pretrain step:


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-2da55840b803> in <module>
      3                 batch_size=256,              # (Optional) the batch size for pre-training (the default is 32).
      4                 alpha=0.10,                  # (Optional) the value of alpha in [0,1] to encourage covariate adjustment. Not used if there is no covariates.
----> 5                 num_epoch = 300,             # (Optional) the maximum number of epoches (the default is 300).
      6                 ) 

~/anaconda3/lib/python3.7/site-packages/VITAE/VITAE.py in pre_train(self, stratify, test_size, random_state, learning_rate, batch_size, L, alpha, num_epoch, num_step_per_epoch, early_stopping_patience, early_stopping_tolerance, path_to_weights)
    274                                                 batch_size,
    275                                                 self.X[id_train].astype(tf.keras.backend.floatx()),
--> 276                                                 self.scale_factor[id_train].astype(tf.keras.backend.floatx()))
    277         self.test_dataset = train.warp_dataset(self.X_normalized[id_test], 
    278                                                 None if self.c_score is None else self.c_score[id_test].astype(tf.keras.backend.floatx()),

TypeError: 'NoneType' object is not subscriptable
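
The traceback above ends inside a statement that indexes `self.X[id_train]` and `self.scale_factor[id_train]` without a check, whereas `self.c_score` on line 278 is wrapped in a `None if ... is None else ...` guard. The sketch below shows that guard pattern in isolation. It assumes, without confirmation from the VITAE source, that one of the indexed attributes is left as `None` when `processed=True`; `take_rows` is a hypothetical helper and not part of VITAE.

```python
def take_rows(values, row_ids):
    """Hypothetical helper (not part of VITAE): select rows only if values exist."""
    if values is None:
        # Mirrors the `None if x is None else x[ids]` guard seen for c_score.
        return None
    return [values[i] for i in row_ids]


id_train = [0, 2]
scale_factor = None  # assumption: the attribute was never filled in

# Indexing None directly reproduces the error class from the report.
try:
    scale_factor[id_train]
except TypeError as err:
    message = str(err)

# The guarded version passes None through and indexes real data normally.
guarded = take_rows(scale_factor, id_train)
present = take_rows([1.0, 2.0, 3.0], id_train)
```

Whether passing `None` through is the right behaviour for pre-training depends on how the downstream dataset code treats a missing scale factor, which the report does not show.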

Thank you in advance.

Best regards.

Error in model.preprocess_data when an AnnData object is given as input to model.get_data

Hello,

Thanks for developing VITAE.

I tried to use VITAE, but model.preprocess_data fails when I pass an AnnData object as input to the model.get_data function.

The preprocessing should be done by scanpy, which is installed, but I get the following error:

# fit in data
model.get_data(adata=data,                   # count or expression matrix, (dense or sparse) numpy array 
               labels = data.obs['cluster_label'],       # (optional) labels, which will be converted to string
               gene_names = data.var['features'], # (optional) gene names, which will be converted to string
               cell_names = data.obs['sample_name']    # (optional) cell names, which will be converted to string
              )

# preprocess data
model.preprocess_data(gene_num = 2000,        # (optional) maximum number of influential genes to keep (the default is 2000)
                      data_type = 'UMI', # (optional) data_type can be 'UMI', 'non-UMI' or 'Gaussian' (the default is 'UMI')
                      npc = 64              # (optional) number of PCs to keep if data_type='Gaussian' (the default is 64)#)
                      )
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-7-aa66b286b1ac> in <module>()
     35 model.preprocess_data(gene_num = 2000,        # (optional) maximum number of influential genes to keep (the default is 2000)
     36                       data_type = 'UMI', # (optional) data_type can be 'UMI', 'non-UMI' or 'Gaussian' (the default is 'UMI')
---> 37                       npc = 64              # (optional) number of PCs to keep if data_type='Gaussian' (the default is 64)#)
     38                       )

2 frames
/usr/local/lib/python3.7/dist-packages/VITAE/preprocess.py in _recipe_seurat(adata, gene_num)
    238     This uses a particular preprocessing
    239     """
--> 240     cell_mask = sc.pp.filter_cells(adata, min_genes=200, inplace=False)[0]
    241     adata = adata[cell_mask,:]
    242     gene_mask = sc.pp.filter_genes(adata, min_cells=3, inplace=False)[0]

NameError: name 'sc' is not defined

I do not understand what the issue is, since scanpy is imported as sc in your function.
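
One way a `NameError` like this can arise even though the package is installed is that the alias is bound only inside a different function's scope. This is an assumption offered for illustration, since the import statements in preprocess.py are not shown in the report. In the sketch below `json` stands in for scanpy so that it runs without extra dependencies, and both function names are hypothetical.

```python
def load_backend():
    # The alias `js` is bound only in this function's local scope.
    import json as js
    return js.dumps({"ok": True})


def use_backend():
    # `js` was never bound at module level, so this lookup fails,
    # even after load_backend() has run.
    return js.dumps({"ok": True})


load_backend()
try:
    use_backend()
    outcome = "no error"
except NameError as err:
    outcome = str(err)
```

Moving the import to module level, or repeating it inside every function that uses the alias, avoids the problem in this sketch.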

Thank you in advance.

Best regards.

Implement in scvi-tools

Hello,

I found your manuscript interesting, and I'm wondering whether you have any interest in implementing a version that takes a pre-trained scvi-tools model (e.g., scVI) as input. I think this would get a lot of usage in our package!

Running model.init_inference in GPU version failed

Hello,
model.init_inference runs, although very slowly, with the CPU version, but I cannot get it to run with the GPU version.

I get the following error:

# initialize inference
model.init_inference(batch_size=128, 
                     L=150,            # L is the number of MC samples
                     dimred='umap',    # dimension reduction methods
                     #**kwargs         # extra key-value arguments for dimension reduction algorithms.    
                     random_state=seed
                    ) 
# after initialization, we can access some variables by model.pc_x, model.w, model.w_tilde, etc..

Computing posterior estimations over mini-batches.

---------------------------------------------------------------------------

ResourceExhaustedError                    Traceback (most recent call last)

<ipython-input-27-91c48b13b6e4> in <module>()
      4                      dimred='umap',    # dimension reduction methods
      5                      #**kwargs         # extra key-value arguments for dimension reduction algorithms.
----> 6                      random_state=seed
      7                     ) 
      8 # after initialization, we can access some variables by model.pc_x, model.w, model.w_tilde, etc..

10 frames

/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

ResourceExhaustedError:  OOM when allocating tensor with shape[128,150,1653,57] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node Tile_1 (defined at /usr/local/lib/python3.7/dist-packages/VITAE/model.py:367) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
 [Op:__inference__get_inference_4681280]

Function call stack:
_get_inference

I tried reducing the batch size to 64, 32, 16, and 8, but all attempts failed. I am not running out of memory.

This seems to be due to the size of the input data: when I reduce the number of cells in my data, it works.
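
For a sense of scale, the size of the single tensor named in the error message can be computed from its shape and element type. The sketch below does only that arithmetic. The first two axes match `batch_size=128` and `L=150` from the call above; the meaning of the last two axes is not stated in the report, and `tensor_bytes` is a hypothetical helper.

```python
from math import prod


def tensor_bytes(shape, itemsize):
    """Bytes needed for a dense tensor with the given shape and element size."""
    return prod(shape) * itemsize


shape = (128, 150, 1653, 57)            # shape reported in the OOM message
double_bytes = tensor_bytes(shape, 8)   # "type double" is 8 bytes per element
float_bytes = tensor_bytes(shape, 4)    # same shape in single precision
gib = double_bytes / 2**30              # size of the double tensor in GiB

print(f"{double_bytes} bytes, about {gib:.1f} GiB")
```

If the first two axes do scale with `batch_size` and `L`, lowering either one shrinks this tensor proportionally, though other allocations in the same function may still exceed the GPU's memory.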

Thank you in advance.

Best regards
