delucs's People

Contributors

millanp95, pmillana

delucs's Issues

EvaluateDeLUCS.py

Hi,
Thanks for developing this tool!
While running EvaluateDeLUCS.py on the test files (test 1), I got the following error message:

Traceback (most recent call last):
  File "EvaluateDeLUCS.py", line 3, in <module>
    import torch
  File "/src/python3/envs/delucs/lib/python3.7/site-packages/torch/__init__.py", line 84, in <module>
    from torch._C import *
ImportError: /src/python3/envs/delucs/lib/python3.7/site-packages/torch/lib/libmkldnn.so.0: undefined symbol: cblas_sgemm_alloc

I tried installing Intel MKL (through conda install), but I am still getting the same error message.

Thanks,
Best regards,
Gautam

Alignment of non-DNA sequences

In the paper you explicitly describe the clustering of DNA sequences. Can DeLUCS also be used to cluster non-DNA sequences, for example sequences of RNA viruses, or only for specific genes?

TrainDeLUCS.py

TrainDeLUCS.py (line 116) and SingleRun.py (line 114) both contain:
parser.add_argument('--n_custers', action='store', type=int, default=0)
Typo: n_custers should be n_clusters.
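
One way to fix the spelling without breaking command lines that already use the old flag is to register both spellings on the same argument. This is only a sketch: `build_parser` is a hypothetical helper, not a function from the DeLUCS sources, and any code that currently reads `args.n_custers` would also need to switch to `args.n_clusters`.

```python
import argparse


def build_parser():
    """Hypothetical parser showing the corrected flag with a legacy alias."""
    parser = argparse.ArgumentParser()
    # The corrected spelling comes first; the misspelled flag is kept as an
    # alias so existing command lines keep working. Both store their value
    # in args.n_clusters because of the explicit dest.
    parser.add_argument('--n_clusters', '--n_custers', dest='n_clusters',
                        action='store', type=int, default=0)
    return parser


# Old and new spellings end up in the same attribute.
old_style = build_parser().parse_args(['--n_custers', '5'])
new_style = build_parser().parse_args(['--n_clusters', '5'])
print(old_style.n_clusters, new_style.n_clusters)
```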

Ns

Hi.

I'm excited to try this tool out. Just wondering how it handles N's?

Liam
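
Until that is answered, it may help to check how many N's (or other non-ACGT characters) a dataset actually contains. The sketch below only measures that fraction for a single sequence string; it says nothing about how DeLUCS itself treats N's, and `non_acgt_fraction` is a hypothetical helper name.

```python
from collections import Counter


def non_acgt_fraction(seq):
    """Return the fraction of characters in seq that are not A, C, G or T."""
    seq = seq.upper()
    if not seq:
        return 0.0
    counts = Counter(seq)
    known = sum(counts[base] for base in "ACGT")
    return (len(seq) - known) / len(seq)


# Two of the eight characters in this toy sequence are N.
print(non_acgt_fraction("ACGTNNAC"))
```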

build_DP bug

Hi,

I'm trying to test your tool on some COVID sequences downloaded from GISAID. I put 500 sequences in FASTA format in a folder called 'fas' and got the error 'File name too long', so I renamed the files 1-500, but the error persists.

Just to be clear: I have a folder called fas. In it are 500 FASTA files, called 1.fa, 2.fa, etc.

They are formatted as such:

head fas/1.fa

>1.B_1
ACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCAC
TCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAACTAATTACTGTCGTTGACA
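
For reference, the build_dp.py docstring quoted in the output further down says the label must follow the accession number in the file ID, separated by a dot. Under that reading, a header such as `>1.B_1` would carry the label `B_1`. The sketch below illustrates that convention only; `label_from_header` is a hypothetical helper, not the parsing code in build_dp.py, and it assumes the split happens at the first dot.

```python
def label_from_header(header):
    """Split a FASTA header like '>1.B_1' into (accession, label).

    Assumption: the label is everything after the FIRST dot in the ID.
    """
    record_id = header.lstrip('>').strip()
    accession, sep, label = record_id.partition('.')
    if not sep or not label:
        raise ValueError(f"no '.label' suffix in header: {header!r}")
    return accession, label


print(label_from_header('>1.B_1'))
```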

I ran the script as shown below (on Ubuntu with Python 3.8.5) and got the following error:


build_dp.py --data_path = fas 
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 12: 
This script builds a dataset in pickle
format from a folder with FASTA files. The
desired label of the file must be in the file ID
after the accession number separated by a dot.

:param dataset: Name of the Dataset.
:param data_path: Path of the folder with the sequences.
:returns: None

Example: python build_dp.py --data_path = '../data/Influenza'
: File name too long
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 14: import: command not found
from: can't read /var/mail/Bio
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 16: import: command not found
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 17: import: command not found
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 20: syntax error near unexpected token `('
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 20: `def replace(seq):'

It seems as though there are multiple import errors.

Any help would be appreciated,
Liam
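
A note on the output above: `import: command not found` and `from: can't read /var/mail/Bio` are shell messages rather than Python errors, which suggests the file was executed by the shell instead of the Python interpreter. Running it with `python` in front, as in the docstring example, should avoid that. Separately, if build_dp.py parses its arguments with argparse the way TrainDeLUCS.py does (an assumption, since its parser is not shown in this thread), the spaces around `=` change what the script receives. The sketch below uses a stand-in parser to show the difference.

```python
import argparse

# Stand-in parser; assumed to resemble the one in build_dp.py.
parser = argparse.ArgumentParser()
parser.add_argument('--data_path', action='store', type=str)

# `--data_path=fas` reaches the script as a single token.
good, extra_good = parser.parse_known_args(['--data_path=fas'])

# `--data_path = fas` is split by the shell into three tokens, so the
# option takes '=' as its value and 'fas' is left over as an extra.
bad, extra_bad = parser.parse_known_args(['--data_path', '=', 'fas'])

print(good.data_path, extra_good)
print(bad.data_path, extra_bad)
```

With the stricter `parse_args`, the leftover token makes argparse exit with an "unrecognized arguments" error instead of returning it.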

bug in EvaluateDeLUCS/TrainDeLUCS

Hi,

I am getting this error:

python TrainDeLUCS.py --data_dir=/home/binfie1/liam_dev/delucs/pairs --out_dir=/home/binfie1/liam_dev/delucs/train1
Traceback (most recent call last):
  File "TrainDeLUCS.py", line 193, in <module>
    main()
  File "TrainDeLUCS.py", line 127, in main
    x_train, x_test, y_test = pickle.load(open(filename, 'rb'))
NotADirectoryError: [Errno 20] Not a directory: '/home/binfie1/liam_dev/delucs/pairs/testing_data.p'

It seems to come from trying to open the output of get_pairs.py, which I ran as follows:

python get_pairs.py --data_path=/home/binfie1/liam_dev/delucs/train --k=6 --modify='mutation' --output=/home/binfie1/liam_dev/delucs/pairs
............computing learning pairs................
......saving mutated pairs.....
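
One plausible reading, not confirmed in this thread, is that get_pairs.py wrote a single file at the path given to --output, so `pairs` is a file, while TrainDeLUCS.py treats --data_dir as a folder and looks for `testing_data.p` inside it. The sketch below reproduces that situation and shows a check that gives a clearer message. `find_pickle` is a hypothetical helper, not part of DeLUCS.

```python
import os
import tempfile


def find_pickle(data_dir, name='testing_data.p'):
    """Return the pickle path inside data_dir, or explain why it is missing."""
    if os.path.isfile(data_dir):
        raise NotADirectoryError(
            f"{data_dir!r} is a file, not a folder; pass the folder that "
            f"contains {name!r}")
    path = os.path.join(data_dir, name)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{path!r} does not exist")
    return path


# Reproduce the situation with a throwaway file standing in for `pairs`.
with tempfile.TemporaryDirectory() as tmp:
    pairs = os.path.join(tmp, 'pairs')
    with open(pairs, 'wb') as handle:
        handle.write(b'')
    try:
        find_pickle(pairs)
    except NotADirectoryError as err:
        print(err)
```

If this is the cause, writing the get_pairs.py output to a file named `testing_data.p` inside a folder, and passing that folder as --data_dir, would be the thing to try.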
