ddsm's People

Contributors

guillaumehu, jzthree, pavelavdeyev


ddsm's Issues

Specific PyTorch Version

Which specific version of PyTorch do you recommend? Selene's official documentation only supports up to PyTorch 1.9, and I have been running into issues installing Selene myself, so I would like to know which PyTorch version you are using.

memmap conversion issue

Hi Pavel,

I installed selene-sdk==0.5.3 and downloaded the data as instructed. However, when I run make_genome_memmap.py, the following error appears:

ModuleNotFoundError: No module named 'selene_utils'

I realize it comes from the line from selene_utils import MemmapGenome, so I tried to install a selene_utils package independently, but I could not find one. Could you provide any guidance on what is going on?
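For what it's worth, a likely cause (my assumption, not confirmed by the maintainers): selene_utils is a local module that ships inside the ddsm repository itself, not a package on PyPI. A minimal sketch of a workaround, with "/path/to/ddsm" standing in for a hypothetical checkout location:

```python
import sys
from pathlib import Path

# selene_utils is (presumably) a module in the ddsm repo, e.g. selene_utils.py
# sitting next to make_genome_memmap.py. If the script is run from another
# directory, put the repo root on sys.path before the import runs.
repo_root = Path("/path/to/ddsm")  # hypothetical checkout location
sys.path.insert(0, str(repo_root))

# After this, the import inside make_genome_memmap.py should resolve:
# from selene_utils import MemmapGenome
```

The simpler alternative is just to run the script from the repository root, since Python adds the script's own directory to the module search path.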

JAX version

Hello,

In the README you state

The Jax version of the code will be published soon.

Is there any update or timeline regarding this Jax release? Thanks!

How to control GPU allocation in high-dimensional pre-sampling?

Hi,
How can I control how much GPU RAM is allocated during pre-sampling?
I have noticed that pre-sampling categoricals with more than 4-5 dimensions needs a lot of memory.
For instance, although the 2- and 4-dimensional examples (promoter and Bernoulli) run fine,
I get the following error when running the sudoku (9-dim) pre-sampling on a 24 GB GPU:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.96 GiB. GPU 0 has a total capacty of 23.65 GiB of which 5.27 GiB is free. Process 643229 has 18.34 GiB memory in use. Of the allocated memory 17.89 GiB is allocated by PyTorch, and 9.78 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
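Two levers that may help here (a sketch of general PyTorch practice, not the repo's own API; presample_in_chunks is a hypothetical helper): set the allocator option the error message itself suggests, and bound peak memory by pre-sampling in chunks and moving each chunk off the GPU as soon as it is produced.

```python
import os

# Must be set before PyTorch initializes CUDA. max_split_size_mb limits
# allocator fragmentation, but cannot help if a single allocation is
# simply larger than the free memory on the device.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# A more reliable lever: pre-sample in chunks so that peak GPU memory
# scales with the chunk size, not the total pre-sample count.
def presample_in_chunks(sample_fn, total, chunk=1024):
    """Call sample_fn(n) repeatedly; sample_fn should return data already
    moved off the GPU (e.g. via .cpu()) so only one chunk is resident."""
    out = []
    start = 0
    while start < total:
        n = min(chunk, total - start)
        out.append(sample_fn(n))
        start += n
    return out
```

With a 9-dimensional categorical the pre-sampled tensors grow quickly, so shrinking the chunk size is usually the first thing to try.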

Confused about discretization

Thanks for the great paper and code :)

I am confused about how the continuous $x$ values get mapped to discrete values during the reverse diffusion process for Univariate Jacobi Diffusion (referring to the Univariate Jacobi Diffusion figure in your paper).

Or is it that the diffusion technically occurs over discrete probability distributions (which are obtained as samples from the reverse diffusion process)? And the final categorical values are obtained by again sampling from the sampled discrete probability distributions?
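To make my question concrete, here is my current understanding as a sketch (an assumption on my part, not the authors' confirmed answer; discretize is a hypothetical helper): the reverse diffusion runs over points on the probability simplex, so the final state is a probability vector per sequence position, and a discrete sequence is obtained by sampling from (or taking the argmax of) each position's categorical distribution.

```python
import numpy as np

def discretize(x0, rng=None, greedy=False):
    """Map final reverse-diffusion states to discrete tokens.

    x0: array of shape (seq_len, num_classes), each row a point on the
        probability simplex produced by the reverse process.
    greedy: if True take the most probable class; otherwise sample
        each position from its categorical distribution.
    """
    if greedy:
        return x0.argmax(axis=-1)
    if rng is None:
        rng = np.random.default_rng()
    # Renormalize each row to guard against small numerical drift.
    return np.array([rng.choice(x0.shape[-1], p=p / p.sum()) for p in x0])
```

Is this roughly what happens, or does the discretization enter somewhere else in the reverse process?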

Also, any update on the JAX code would be very much appreciated :)

Promoter dataset source

Thank you for the interesting paper!

I am having trouble understanding which exact files you used from the FANTOM5 database and how you converted them into the files provided on the Zenodo platform. This is not made clear in the paper, and as far as I can see it is not stated in the code either.
Could you please add some information to the README about which FANTOM5 files you used and the code you used to preprocess them?

Thank you in advance!

The performance of DDSM for unconditional DNA generation

Dear Team,

I have been working on generative models for DNA sequences. For a fair comparison, I compare different algorithms on unconditional generation. It seems that DDSM fails to capture the motif distribution in unconditional DNA sequence generation; by unconditional generation, I mean that the transcription profile is not supplied as a condition.

I wonder whether you have tried DDSM for unconditional DNA sequence generation and what the expected result is.

PS: I tried both with and without time dilation, and the generated samples do not seem to capture the motif distribution of the input sequences. The training script is available.

Best,
Zehui
