danko-lab / dreg

Detecting Regulatory Elements using GRO-seq and PRO-seq

Makefile 0.08% Perl 0.46% R 82.94% C 3.40% Shell 12.69% Python 0.43%

gene-regulation gro-seq pro-seq chro-seq transcription-regulatory-elements

dreg's Issues

How to generate the Ground Truth?

Dear authors,

Thanks for your excellent work. I have read your code but could not find how the ground-truth files are generated (e.g. ***.negative.bed.rdata, ***.positive.bed.rdata, ***.grocaptss.bed, and so on). Can you give more information about the label generation process?

I also found that there are two similar functions (get_test_set and get_test_set0). It seems that get_test_set loads directly from the positive and negative label files mentioned above, so I cannot get more details from it. Is the process of generating negative samples the same as in get_test_set0?

Further, I found that most of the main programs call get_test_set, but the parameters they pass follow the definition of get_test_set0. Was the code update left incomplete?

Looking forward to your reply!

Best,
Junbin

Cannot install dREG with R 4.0 because rphast is no longer in CRAN

According to the rphast package page on CRAN:

Package ‘rphast’ was removed from the CRAN repository.

Formerly available versions can be obtained from the archive.

Archived on 2020-03-03 as check problems were not corrected despite reminders.

A summary of the most recent check results can be obtained from the check results archive.

Please use the canonical form https://CRAN.R-project.org/package=rphast to link to this page.
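The CRAN notice above says that formerly available versions can still be obtained from the archive. As a minimal sketch, assuming CRAN's standard source-archive path layout, the snippet below builds the archive URL and the matching R install call. The helper names are hypothetical, and the version 1.6.9 is only an assumption taken from the session info quoted in another issue on this page; check the archive listing for the versions that actually exist.

```python
def cran_archive_url(package: str, version: str) -> str:
    """Return the CRAN source-archive URL for an archived package version."""
    return (
        "https://cran.r-project.org/src/contrib/Archive/"
        f"{package}/{package}_{version}.tar.gz"
    )


def r_install_call(url: str) -> str:
    """Return an R expression that installs a source tarball from a URL."""
    # repos = NULL makes install.packages() treat its argument as a file or URL
    # rather than a package name to look up in a repository.
    return f'install.packages("{url}", repos = NULL, type = "source")'


if __name__ == "__main__":
    # The version is an assumption, not a recommendation from the dREG authors.
    url = cran_archive_url("rphast", "1.6.9")
    print(r_install_call(url))
```

The printed expression can be pasted into an R session. Installing from source requires a working compiler toolchain on the machine.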

How to interpret the probability and max score in the output files?

Hi, sorry if I am not supposed to ask this kind of question here; please let me know if there is a designated email or group.

Can you please elaborate on how I should interpret the probability and score that dREG assigns to each peak? For example, I ran it on GRO-seq data from a mouse cell line, and many peaks have a probability of 0.0 or a very low value (~10E-15) but a dREG score of 0.3 or 0.4. Does that mean something?

Also, is there a way to identify enhancers among all the peaks? Or should I just consider every detected peak outside a gene to be an enhancer?

Thanks
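The "peak outside a gene" heuristic raised in this question can be sketched as a simple interval-overlap test. This is only an illustration of that heuristic, not a procedure recommended by the dREG authors: it would label any enhancer located inside a gene body as genic. The tuple layout `(chrom, start, end)` with 0-based, half-open coordinates and all function names are assumptions for the example, not dREG's output format.

```python
from bisect import bisect_right
from collections import defaultdict


def build_gene_index(genes):
    """Merge overlapping gene intervals per chromosome, sorted by start."""
    by_chrom = defaultdict(list)
    for chrom, start, end in genes:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:
                # Overlapping or touching: extend the current merged interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([m[0] for m in merged], [m[1] for m in merged])
    return index


def overlaps_gene(index, chrom, start, end):
    """True if the half-open peak [start, end) overlaps any gene interval."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last merged interval that begins before the peak ends.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and ends[i] > start


def split_peaks(peaks, genes):
    """Return (genic, distal) lists of peaks under the overlap heuristic."""
    index = build_gene_index(genes)
    genic, distal = [], []
    for peak in peaks:
        chrom, start, end = peak
        (genic if overlaps_gene(index, chrom, start, end) else distal).append(peak)
    return genic, distal
```

Merging the gene intervals first keeps each lookup to a single binary search, which matters when there are many peaks.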

Docker file

I've created a docker container for dreg. Perhaps you want to reference it here, or make your own from my Dockerfile. https://hub.docker.com/r/samesense/dreg-docker/

Error in if (class(x) == "BigMatrix.refer") bigm.internal.nrow(x) else NROW(x) : the condition has length > 1

Running the test program with R 4.2.2 gives the error in the title and attached below.
error.txt

`R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
 [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
 [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
[10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rgtsvm_0.55          dREG_1.4.0           randomForest_4.7-1.1
 [4] mvtnorm_1.1-3        rmutil_1.1.10        data.table_1.14.8   
 [7] snowfall_1.84-6.2    snow_0.4-4           rphast_1.6.9        
[10] e1071_1.7-13         bigWig_0.2-9        

loaded via a namespace (and not attached):
[1] class_7.3-21   tools_4.2.2    bit64_4.0.5    bit_4.0.5      proxy_0.4-27  
[6] parallel_4.2.2 compiler_4.2.2`

dREG Error

Running dREG through the web interface produces the following error:
Validation failed refer the validationResults list for detail error. Validation errors : Validation Errors :

Inspecting the output files suggests that the bigWig files were made incorrectly, although both files are usable in other programs (genome viewers, for example). Output file states:
Every read might be mapped to a region, not a locus.

Further inspection of the files suggests that a solution could be found in the github repository Danko-Lab/dREGError, but I was unable to locate this repository, or any other info about why this error might come about.

Conversion to bigWig used RSeQC bam2wig.py and kentUtils wigToBigWig, as per the UCSC website. Additionally, I also attempted to create the bigWig files using bedtools to convert the sorted BAM files to BedGraph files, using bedtools slop and bedClip to ensure the regions did not extend past the ends of the chromosomes, and then used kentUtils bedGraphToBigWig, with the same result. See attached for scripts used and all output files.

wigToBigWig.txt
bedGraphToBigWig.txt
bam2wiggle.txt
SRR1105736.out.dREG.log
bg2bw.txt
slurm-17717042.out.txt
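The message "Every read might be mapped to a region, not a locus" suggests that the validator expects each read to contribute signal at a single base rather than across its whole aligned span. That reading is an interpretation of the message, not something confirmed in this thread. Under that assumption, a minimal sketch of collapsing reads to single-base counts before writing bedGraph lines looks like this. Which end of the read to keep depends on the library protocol, so it is left as a parameter; the function names and the `(chrom, start, end, strand)` tuple layout (0-based, half-open, as in BED) are hypothetical, and sign conventions for the minus-strand track are not addressed here.

```python
from collections import Counter


def read_locus(start, end, strand, which_end="3prime"):
    """Return the 0-based position of one end of a BED read [start, end)."""
    if which_end not in ("5prime", "3prime"):
        raise ValueError(f"unknown end: {which_end!r}")
    if strand not in ("+", "-"):
        raise ValueError(f"unknown strand: {strand!r}")
    if which_end == "5prime":
        return start if strand == "+" else end - 1
    return end - 1 if strand == "+" else start


def point_counts(reads, which_end="3prime"):
    """Count reads per (strand, chrom, position) after collapsing to one base."""
    counts = Counter()
    for chrom, start, end, strand in reads:
        counts[(strand, chrom, read_locus(start, end, strand, which_end))] += 1
    return counts


def bedgraph_lines(counts, strand):
    """Format the counts for one strand as sorted single-base bedGraph lines."""
    rows = sorted((c, p, n) for (s, c, p), n in counts.items() if s == strand)
    return [f"{c}\t{p}\t{p + 1}\t{n}" for c, p, n in rows]
```

Each output line spans exactly one base and carries an integer read count, so the per-strand files can then go through bedGraphToBigWig as described above.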

peak calling device initialization issue.

I'm getting an "Error in device initialization" error when running peak calling with dREG, but the prediction function runs fine.

I'm running on a node with:

OS: redhat 7.4 kernel 3.10.0-693
GPU: GTX1080
Cuda: cuda9.0
dreg version: latest version checked out from github
rgtsvm version: latest version
bedops: 2.4.14

All the data are test data downloaded from the FTP server provided. Below is the test run output:


Using: R --vanilla --slave --args test/K562_long_plus.bw test/K562_long_minus.bw test/k562/K562.TEST test/asvm.gdm.6.6M.20170828.rdata 4 0 < /home/installs/packs/rgtsvm/dREG/run_dREG.R
WARNING: ignoring environment value of R_HOME
Loading required package: dREG
Loading required package: bigWig
Loading required package: e1071
Loading required package: rphast
Loading required package: snowfall
Loading required package: snow
Loading required package: data.table
Loading required package: rmutil

Attaching package: ‘rmutil’

The following object is masked from ‘package:stats’:

    nobs

The following objects are masked from ‘package:base’:

    as.data.frame, units

Loading required package: mvtnorm
Loading required package: randomForest
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
Warning message:
replacing previous import ‘stats::nobs’ by ‘rmutil::nobs’ when loading ‘dREG’ 
------------ Parameters ------------- 
Bigwig(plus): test/K562_long_plus.bw 
Bigwig(minus): test/K562_long_minus.bw 
Output: test/k562/K562.TEST 
dREG model: test/asvm.gdm.6.6M.20170828.rdata 
CPU cores: 4 
GPU ID: 0 
Using Rgtsvm: TRUE 
-------------------------------------
 [ 2019-01-23 11:21:43 ] 1) Checking bigWig files.
[ 2019-01-23 11:21:49 ] 2) Starting peak calling.
Loading required package: Rgtsvm

Attaching package: ‘Rgtsvm’

The following objects are masked from ‘package:e1071’:

    svm, tune.control, tune.svm

Error in checkForRemoteErrors(val) : 
  one node produced an error: Error in device initialization.
Calls: system.time ... clusterApply -> staticClusterApply -> checkForRemoteErrors
Timing stopped at: 117.8 3.9 144.7
Execution halted

What I have tried:

Running with 1 CPU and 1 GPU: same error.
Running with no GPU, either by omitting the number after the CPU count or by passing "FALSE", gave this error:
Error in if (gpu_id > 0) { : missing value where TRUE/FALSE needed
Execution halted

Please let me know if you have any insight into this problem.
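R raises "missing value where TRUE/FALSE needed" when the condition of an `if` is NA, which is what the comparison `gpu_id > 0` yields when `gpu_id` is missing or could not be parsed as a number. That fits both attempts described above, although the script's actual argument handling is not shown here. A minimal sketch of a hypothetical wrapper that checks the two numeric arguments before launching R follows; the argument order is copied from the "Using:" line in the log, and the function names are assumptions for illustration.

```python
import subprocess


def _as_int(value, name, minimum):
    """Parse value as an integer and enforce a lower bound."""
    try:
        number = int(str(value))
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {value!r}") from None
    if number < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {number}")
    return number


def build_dreg_command(plus_bw, minus_bw, out_prefix, model, cpu_cores, gpu_id):
    """Validate the numeric arguments and return the argv list for R."""
    cores = _as_int(cpu_cores, "cpu_cores", minimum=1)
    gpu = _as_int(gpu_id, "gpu_id", minimum=0)
    return [
        "R", "--vanilla", "--slave", "--args",
        plus_bw, minus_bw, out_prefix, model, str(cores), str(gpu),
    ]


def run_dreg(command, script_path):
    """Run R with the script on stdin, as in the logged command line."""
    with open(script_path) as script:
        return subprocess.run(command, stdin=script, check=True)
```

With this check, a value such as "FALSE" or an omitted GPU ID fails early with a readable message instead of reaching the `if` inside the R script. It does not address the "Error in device initialization" failure itself.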

No tagged releases

It'd be really cool if you all had some tagged releases as known reference points so that people trying to ensure consistency in environments could pull them instead of just cloning master.

asvm.RData

Hi, I am trying to run dREG on PRO-seq data. I understand I need the parameter file from the trained SVM. In the bash script, this is

asvm.RData

but I can't seem to find it anywhere in the GitHub directories. Did I miss it?

Thank you very much! Looking forward to getting this up and running!
cheers!

asvm.RData

I have many questions. Is run_predict.bsh a more advanced version of run_dREG.bsh? I can't find the dREG_model/asvm.RData (asvm.mammal.RData seems to be a sample dataset). Does this mean that run_predict.bsh also needs to use asvm.gdm.6.6M.20170828.rdata?
