dreg's People

Contributors

corcra, dankoc, ndukler, wzhy2000

dreg's Issues

How to generate the Ground Truth?

Dear authors,

Thanks for your excellent work. I have read your code but could not find how to generate the ground truth files (e.g. ***.negative.bed.rdata, ***.positive.bed.rdata, ***.grocaptss.bed and so on). Can you give more information about the label generation process?

I also found that there are two similar functions, get_test_set and get_test_set0. It seems that get_test_set loads directly from the positive and negative label files mentioned above, so I cannot get more details from it. Is the process of generating negative samples the same as in get_test_set0?

Further, I found that most of the main programs call get_test_set, but the parameters they pass follow the definition of get_test_set0. Is the code update incomplete?

Looking forward to your reply!

Best,
Junbin

How to interpret the probability and max score in the output files?

Hi, sorry if I am not supposed to ask this kind of question here. Please let me know if there is a designated email or group.

Can you please elaborate on how I should interpret the probability and score that dREG assigns to each peak? For example, I ran it on GRO-seq data from a mouse cell line, and many peaks have a probability of 0.0 or a very low value (~10E-15) but a dREG score such as 0.3 or 0.4. Does that mean something?

Also, is there a way to identify enhancers among all the peaks? Or should I just consider every detected peak outside a gene to be an enhancer?

Thanks

Error in if (class(x) == "BigMatrix.refer") bigm.internal.nrow(x) else NROW(x) : the condition has length > 1

Running the test program with R 4.2.2 gives the error in the title. The full output is attached, and the session info is below.
error.txt

`R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
 [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
 [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
[10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rgtsvm_0.55          dREG_1.4.0           randomForest_4.7-1.1
 [4] mvtnorm_1.1-3        rmutil_1.1.10        data.table_1.14.8   
 [7] snowfall_1.84-6.2    snow_0.4-4           rphast_1.6.9        
[10] e1071_1.7-13         bigWig_0.2-9        

loaded via a namespace (and not attached):
[1] class_7.3-21   tools_4.2.2    bit64_4.0.5    bit_4.0.5      proxy_0.4-27  
[6] parallel_4.2.2 compiler_4.2.2`
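
For context, this message is what R 4.2.0 and later raise when an `if` condition has more than one element. `class(x)` can return several strings (a plain matrix reports both "matrix" and "array" since R 4.0.0), so `class(x) == "BigMatrix.refer"` is no longer a safe test, whereas `inherits(x, "BigMatrix.refer")` always returns a single TRUE or FALSE. A rough way to locate such comparisons in a source checkout is sketched below. The helper name and the directory are illustrative assumptions, not part of dREG or Rgtsvm, and the pattern is only a heuristic (it misses nested calls such as `class(f(x)) ==`).

```shell
#!/bin/sh
# find_class_comparisons DIR: list lines in *.R files under DIR that compare
# class(...) with == or !=. Each hit is a candidate for inherits(x, "ClassName").
# Hypothetical helper for illustration; requires a grep that supports --include
# (GNU and BSD grep do).
find_class_comparisons() {
  grep -rnE --include='*.R' 'class\([^)]*\)[[:space:]]*(==|!=)' "$1"
}

# Usage (path is an assumption about where the package sources live):
#   find_class_comparisons path/to/package/R
```

Each reported line would still need to be reviewed by hand before replacing the comparison with `inherits()`.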

dREG Error

Running dREG through the web interface produces the following error:
Validation failed refer the validationResults list for detail error. Validation errors : Validation Errors :

Inspecting the output files suggests that the bigWig files were made incorrectly, although both files are usable in other programs (genome viewers, for example). The output file states:
Every read might be mapped to a region, not a locus.

Further inspection of the files suggests that a solution could be found in the GitHub repository Danko-Lab/dREGError, but I was unable to locate this repository or any other information about why this error might occur.

Conversion to bigWig used RSeQC bam2wig.py and kentUtils wigToBigWig, as per the UCSC website. Additionally, I also attempted to create the bigWig files using bedtools to convert the sorted BAM files to BedGraph files, using bedtools slop and bedClip to ensure the regions did not extend past the ends of the chromosomes, and then used kentUtils bedGraphToBigWig, with the same result. See attached for scripts used and all output files.

wigToBigWig.txt
bedGraphToBigWig.txt
bam2wiggle.txt
SRR1105736.out.dREG.log
bg2bw.txt
slurm-17717042.out.txt
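
Given the message "Every read might be mapped to a region, not a locus", one thing that may be worth checking is whether the bedGraph intermediates hold whole-read coverage rather than one position per read. The sketch below is only a heuristic and is not dREG's own validation: it prints the fraction of bedGraph intervals wider than 1 bp. Whole-read coverage tends to give mostly wide intervals, while single-position counts give mostly 1 bp intervals (adjacent positions with equal counts can still be merged into a wider one). The function name and the file name in the usage comment are made up for illustration.

```shell
#!/bin/sh
# wide_interval_fraction FILE: print, with two decimals, the fraction of
# bedGraph data lines whose interval (end - start) is wider than 1 bp.
# Hypothetical helper; a rough indicator only.
wide_interval_fraction() {
  awk '
    /^(track|browser|#)/ { next }   # skip header and comment lines
    NF >= 4 { total++; if ($3 - $2 > 1) wide++ }
    END { if (total == 0) print "0.00"; else printf "%.2f\n", wide / total }
  ' "$1"
}

# Usage (file name is an assumption):
#   wide_interval_fraction sample_plus.bedGraph
```

A value close to 1 would suggest the track stores read-length coverage, in which case the BAM to bedGraph step is the place to look.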

Peak calling device initialization issue

I'm getting an "Error in device initialization" error when running peak calling with dREG, but the prediction function runs fine.

I'm running on a node with:

OS: redhat 7.4 kernel 3.10.0-693
GPU: GTX1080
Cuda: cuda9.0
dreg version: latest version checked out from github
rgtsvm version: latest version
bedops: 2.4.14

All the data are test data downloaded from the FTP server provided. Below is the test run output:


Using: R --vanilla --slave --args test/K562_long_plus.bw test/K562_long_minus.bw test/k562/K562.TEST test/asvm.gdm.6.6M.20170828.rdata 4 0 < /home/installs/packs/rgtsvm/dREG/run_dREG.R
WARNING: ignoring environment value of R_HOME
Loading required package: dREG
Loading required package: bigWig
Loading required package: e1071
Loading required package: rphast
Loading required package: snowfall
Loading required package: snow
Loading required package: data.table
Loading required package: rmutil

Attaching package: ‘rmutil’

The following object is masked from ‘package:stats’:

    nobs

The following objects are masked from ‘package:base’:

    as.data.frame, units

Loading required package: mvtnorm
Loading required package: randomForest
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
Warning message:
replacing previous import ‘stats::nobs’ by ‘rmutil::nobs’ when loading ‘dREG’ 
------------ Parameters ------------- 
Bigwig(plus): test/K562_long_plus.bw 
Bigwig(minus): test/K562_long_minus.bw 
Output: test/k562/K562.TEST 
dREG model: test/asvm.gdm.6.6M.20170828.rdata 
CPU cores: 4 
GPU ID: 0 
Using Rgtsvm: TRUE 
-------------------------------------
 [ 2019-01-23 11:21:43 ] 1) Checking bigWig files.
[ 2019-01-23 11:21:49 ] 2) Starting peak calling.
Loading required package: Rgtsvm

Attaching package: ‘Rgtsvm’

The following objects are masked from ‘package:e1071’:

    svm, tune.control, tune.svm

Error in checkForRemoteErrors(val) : 
  one node produced an error: Error in device initialization.
Calls: system.time ... clusterApply -> staticClusterApply -> checkForRemoteErrors
Timing stopped at: 117.8 3.9 144.7
Execution halted

What I have tried:

  1. Running with 1 CPU and 1 GPU: same error.
  2. Running with no GPU, either by not specifying a number after the CPU count or by passing "FALSE". This gave the following error:
    Error in if (gpu_id > 0) { : missing value where TRUE/FALSE needed
    Execution halted

Please let me know if you have any insight into this problem.
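
For what it's worth, "missing value where TRUE/FALSE needed" is what R reports when an `if` condition evaluates to NA. That is consistent with a GPU argument that is absent or non-numeric (such as "FALSE") becoming NA before the `gpu_id > 0` test. A small guard in a wrapper script could reject such values before R starts. This is a sketch under assumptions: `check_gpu_id` is a hypothetical helper, and which value, if any, the script accepts to disable the GPU is not confirmed here.

```shell
#!/bin/sh
# check_gpu_id VALUE: succeed only if VALUE is a non-negative integer.
# Hypothetical helper for illustration; not part of dREG's scripts.
check_gpu_id() {
  case "$1" in
    ''|*[!0-9]*)
      echo "GPU id must be a non-negative integer, got: '$1'" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}

# Usage inside a wrapper, before invoking R (argument position is an assumption):
#   check_gpu_id "$6" || exit 1
```

Failing early in the shell gives a clearer message than the NA error raised inside R.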

No tagged releases

It'd be really cool if you all had some tagged releases as known reference points so that people trying to ensure consistency in environments could pull them instead of just cloning master.

asvm.RData

Hi, I am trying to run dREG on PRO-seq data. I understand I need the parameter file from the trained SVM. In the bash script, this is

asvm.RData

but I can't seem to find it. Did I miss it somewhere in the GitHub directories?

Thank you very much! Looking forward to getting this up and running!
cheers!

asvm.RData

I have many questions. Is run_predict.bsh a more advanced version of run_dREG.bsh? I can't find the dREG_model/asvm.RData (asvm.mammal.RData seems to be a sample dataset). Does this mean that run_predict.bsh also needs to use asvm.gdm.6.6M.20170828.rdata?
