danko-lab / dreg
Detecting Regulatory Elements using GRO-seq and PRO-seq
Dear authors,
Thanks for your excellent work. I have read your code but could not find how the ground-truth files are generated (e.g. ***.negative.bed.rdata, ***.positive.bed.rdata, ***.grocaptss.bed, and so on). Can you give more information about the label generation process?
I also found that there are two similar functions, get_test_set and get_test_set0. It seems that get_test_set loads its data directly from the positive and negative label files mentioned above, so I cannot get more details from it. Is the process of generating negative samples the same as in get_test_set0?
Further, I found that most of the main programs call get_test_set, but the parameters they pass follow the definition of get_test_set0. Is the code update incomplete?
Looking forward to your reply!
Best,
Junbin
According to the rphast package page on CRAN:
Package ‘rphast’ was removed from the CRAN repository.
Formerly available versions can be obtained from the archive.
Archived on 2020-03-03 as check problems were not corrected despite reminders.
A summary of the most recent check results can be obtained from the check results archive.
Please use the canonical form https://CRAN.R-project.org/package=rphast to link to this page.
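Since the package is archived rather than deleted, one possible workaround is to install it from the CRAN archive as a source tarball. The sketch below assumes R and a C toolchain are available and that the archive follows CRAN's usual `Archive/<pkg>/<pkg>_<version>.tar.gz` layout. The helper names `cran_archive_url` and `install_archived` are hypothetical, and the version 1.6.9 is simply the one that appears in the session info quoted later in this thread; whether that tarball is present in the archive should be checked first.

```shell
# Sketch only: install an archived CRAN package from its source tarball.
# Assumption: the archive uses CRAN's usual Archive/<pkg>/<pkg>_<ver>.tar.gz layout.

# Build the archive URL for a package/version pair (hypothetical helper).
cran_archive_url() {
  local pkg="$1" ver="$2"
  printf 'https://cran.r-project.org/src/contrib/Archive/%s/%s_%s.tar.gz\n' \
    "$pkg" "$pkg" "$ver"
}

# Install from that URL (hypothetical helper).
# repos = NULL makes install.packages treat its argument as a file or URL
# instead of looking the package up in a repository.
install_archived() {
  local url
  url="$(cran_archive_url "$1" "$2")"
  Rscript -e "install.packages('${url}', repos = NULL, type = 'source')"
}

# Usage (not run here):
#   install_archived rphast 1.6.9
```

Building from source compiles the package locally, so compiler errors at this step point to the toolchain rather than to dREG itself.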
Hi, sorry if I am not supposed to ask such a question here; please let me know if there is a designated email or group.
Can you please elaborate on how I should interpret the probability and score that dREG gives to each peak? For example, I used it on GRO-seq data from a mouse cell line, and many peaks have a probability of 0.0 or a very low value (~10E-15) but a dREG score such as 0.3 or 0.4. Does that mean something?
Also, is there a way to identify enhancers among all the peaks? Or should I just consider every detected peak outside a gene to be an enhancer?
Thanks
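Nothing in this thread says that dREG labels peaks as promoters or enhancers. One common heuristic, offered here as an assumption and not as the authors' recommendation, is to split peaks by their distance to annotated transcription start sites. The sketch below uses only awk; the function name `classify_peaks`, the two-column TSS file, and the 1000 bp default window are all hypothetical choices for illustration.

```shell
# Sketch only: tag each peak as "proximal" or "distal" to an annotated TSS.
# classify_peaks TSS_FILE PEAK_BED [DIST]
#   TSS_FILE: tab-separated chrom and TSS position (hypothetical input file)
#   PEAK_BED: tab-separated BED-like peaks; only columns 1-3 are used
#   DIST:     window in bp around each peak (default 1000, an arbitrary choice)
classify_peaks() {
  local tss="$1" peaks="$2" dist="${3:-1000}"
  awk -v d="$dist" 'BEGIN { FS = OFS = "\t" }
    # First file: remember every TSS position per chromosome.
    NR == FNR { n[$1]++; pos[$1, n[$1]] = $2 + 0; next }
    # Second file: a peak is proximal if any TSS lies within d bp of it.
    {
      label = "distal"
      for (i = 1; i <= n[$1]; i++) {
        p = pos[$1, i]
        if (p >= $2 - d && p <= $3 + d) { label = "proximal"; break }
      }
      print $0, label
    }' "$tss" "$peaks"
}

# Usage (file names are placeholders):
#   classify_peaks tss.txt peaks.bed 1000 > peaks.labelled.bed
```

The loop over all TSSs on a chromosome is quadratic and only meant for small inputs. Distal peaks found this way are candidate enhancers at best, since unannotated promoters would also land in that group.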
I've created a docker container for dreg. Perhaps you want to reference it here, or make your own from my Dockerfile. https://hub.docker.com/r/samesense/dreg-docker/
Running the test program with R 4.2.2 gives the error in the title and attached below.
error.txt
`R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C
[4] LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=C
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rgtsvm_0.55 dREG_1.4.0 randomForest_4.7-1.1
[4] mvtnorm_1.1-3 rmutil_1.1.10 data.table_1.14.8
[7] snowfall_1.84-6.2 snow_0.4-4 rphast_1.6.9
[10] e1071_1.7-13 bigWig_0.2-9
loaded via a namespace (and not attached):
[1] class_7.3-21 tools_4.2.2 bit64_4.0.5 bit_4.0.5 proxy_0.4-27
[6] parallel_4.2.2 compiler_4.2.2`
Running dREG through the web interface produces the following error:
Validation failed refer the validationResults list for detail error. Validation errors : Validation Errors :
Inspecting the output files suggests that the bigWig files were made incorrectly, although both files are usable in other programs (genome viewers, for example). Output file states:
Every read might be mapped to a region, not a locus.
Further inspection of the files suggests that a solution could be found in the github repository Danko-Lab/dREGError, but I was unable to locate this repository, or any other info about why this error might come about.
Conversion to bigWig used RSeQC bam2wig.py and kentUtils wigToBigWig, as per the UCSC website. Additionally, I also attempted to create the bigWig files using bedtools to convert the sorted BAM files to BedGraph files, using bedtools slop and bedClip to ensure the regions did not extend past the ends of the chromosomes, and then used kentUtils bedGraphToBigWig, with the same result. See attached for scripts used and all output files.
wigToBigWig.txt
bedGraphToBigWig.txt
bam2wiggle.txt
SRR1105736.out.dREG.log
bg2bw.txt
slurm-17717042.out.txt
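The message "Every read might be mapped to a region, not a locus" suggests that dREG expects each read to contribute a single position, not its full aligned span, which is what a whole-read coverage track would contain. A minimal sketch of a point-mode conversion follows, assuming bedtools and kentUtils are on the PATH. The function names are hypothetical, and whether the 5' or the 3' end is the right choice depends on the assay, so it is left as a parameter. The second helper rests on a further assumption, namely that dREG wants raw, unnormalized counts.

```shell
# Sketch only: build a strand-specific bigWig that counts one base per read.
# bam_to_point_bw BAM CHROM_SIZES STRAND END OUT_PREFIX
#   STRAND: + or -     END: 5 or 3 (which read end to count; assay-dependent)
bam_to_point_bw() {
  local bam="$1" sizes="$2" strand="$3" end="$4" out="$5"
  # -bg: bedGraph output; -5 / -3: count only that end of each read;
  # -strand: restrict to reads on one strand.
  bedtools genomecov -ibam "$bam" -bg "-$end" -strand "$strand" \
    | LC_ALL=C sort -k1,1 -k2,2n > "$out.bedGraph"
  bedGraphToBigWig "$out.bedGraph" "$sizes" "$out.bw"
}

# Count bedGraph lines whose value (column 4) is not a whole number.
# A non-zero result hints that the track was normalized (assumption: dREG
# expects raw integer counts).
count_non_integer() {
  awk '$4 != int($4) { n++ } END { print n + 0 }' "$1"
}

# Usage (file names are placeholders):
#   bam_to_point_bw sample.bam chrom.sizes + 5 sample_plus
#   count_non_integer sample_plus.bedGraph
```

Adjacent positions with equal counts are merged into one bedGraph interval, so intervals wider than 1 bp in the output do not by themselves mean that reads were mapped as regions.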
I'm getting an "Error in device initialization" error when running peak calling with dREG, but the prediction function runs fine.
I'm running on a node with:
OS: redhat 7.4 kernel 3.10.0-693
GPU: GTX1080
Cuda: cuda9.0
dreg version: latest version checked out from github
rgtsvm version: latest version
bedops: 2.4.14
All the data are test data downloaded from the FTP server provided. Below is the test run output:
Using: R --vanilla --slave --args test/K562_long_plus.bw test/K562_long_minus.bw test/k562/K562.TEST test/asvm.gdm.6.6M.20170828.rdata 4 0 < /home/installs/packs/rgtsvm/dREG/run_dREG.R
WARNING: ignoring environment value of R_HOME
Loading required package: dREG
Loading required package: bigWig
Loading required package: e1071
Loading required package: rphast
Loading required package: snowfall
Loading required package: snow
Loading required package: data.table
Loading required package: rmutil
Attaching package: ‘rmutil’
The following object is masked from ‘package:stats’:
nobs
The following objects are masked from ‘package:base’:
as.data.frame, units
Loading required package: mvtnorm
Loading required package: randomForest
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
Warning message:
replacing previous import ‘stats::nobs’ by ‘rmutil::nobs’ when loading ‘dREG’
------------ Parameters -------------
Bigwig(plus): test/K562_long_plus.bw
Bigwig(minus): test/K562_long_minus.bw
Output: test/k562/K562.TEST
dREG model: test/asvm.gdm.6.6M.20170828.rdata
CPU cores: 4
GPU ID: 0
Using Rgtsvm: TRUE
-------------------------------------
[ 2019-01-23 11:21:43 ] 1) Checking bigWig files.
[ 2019-01-23 11:21:49 ] 2) Starting peak calling.
Loading required package: Rgtsvm
Attaching package: ‘Rgtsvm’
The following objects are masked from ‘package:e1071’:
svm, tune.control, tune.svm
Error in checkForRemoteErrors(val) :
one node produced an error: Error in device initialization.
Calls: system.time ... clusterApply -> staticClusterApply -> checkForRemoteErrors
Timing stopped at: 117.8 3.9 144.7
Execution halted
What have I tried:
Please let me know of any insights to this problem.
It'd be really cool if you all had some tagged releases as known reference points so that people trying to ensure consistency in environments could pull them instead of just cloning master.
Hi, I am trying to run dREG on PRO-seq data. I understand I need the parameter file from the trained SVM. In the bash script, this is asvm.RData, but I can't seem to find it in the GitHub directories. Did I miss it somewhere?
Thank you very much! Looking forward to getting this up and running!
cheers!
I have many questions. Is run_predict.bsh a more advanced version of run_dREG.bsh? I can't find the dREG_model/asvm.RData (asvm.mammal.RData seems to be a sample dataset). Does this mean that run_predict.bsh also needs to use asvm.gdm.6.6M.20170828.rdata?