ay-lab / dchic
dcHiC: Differential compartment analysis for Hi-C datasets
License: MIT License
Hi,
Thank you for the development of the dcHiC.
While applying this tool to my Hi-C data, I got stuck on the following error when running "Rscript dchicf.r --file test_inputfiles.f.txt --pcatype cis --dirovwt T --cthread 2 --pthread 4". Could you please offer some help?
###########################################
...
...
...
Calculating expected counts from chromosome wise background
dist Weight
1 0 854838
2 100000 428198
3 200000 192287
4 300000 135688
5 400000 104093
6 500000 84077
A B Weight chr1 pos1 chr2 pos2 dist WeightOE
1: 23394 23394 640 chr9 0 chr9 0 0 0.9328551
2: 23394 23395 236 chr9 0 chr9 100000 100000 0.6861779
3: 23394 23396 117 chr9 0 chr9 200000 200000 0.7569310
4: 23394 23397 155 chr9 0 chr9 300000 300000 1.4199119
5: 23394 23398 76 chr9 0 chr9 400000 400000 0.9068045
6: 23394 23399 86 chr9 0 chr9 500000 500000 1.2693840
[1] 280317
[1] 17000000
[1] 280317
Writing chr9 .txt file
Calculating expected counts from chromosome wise background
Error in aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...) :
no rows to aggregate
Calls: lapply ... aggregate -> aggregate.formula -> aggregate.data.frame
Execution halted
rm: cannot remove '/home/xxx/xxx/tmp/RtmpVVQ6kO/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_3bf462780119cc': Directory not empty
Thanks,
Yuxiang
Hi, can you help me to better understand the method for multiple comparisons performed by dcHiC? My goal is to better understand how a single p-value is derived for each individual bin of the genome in examples in which >2 Hi-C datasets are analyzed.
From what I understand, first, the Hi-C maps are concatenated, then Multiple Factor Analysis is performed on the concatenated map. So this means, for example, an analysis of four biological samples results in four partial factor scores for each bin—is that correct?
These scores are used to derive a multivariate distance measure, the Mahalanobis distance. The distance measure detects outliers in scores among all samples. In the example analysis of four biological samples, there would be four score variables per each bin—is this correct? If one of those is detected as an outlier, then its significance is calculated using the weighted distance and the critical Chi-square distribution. Is that correct?
I appreciate any help you can provide. Thank you!
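To make the question above concrete, here is a minimal sketch of the general technique being asked about: a Mahalanobis distance for each bin's vector of per-sample scores, followed by a chi-square tail probability. It is not taken from the dcHiC source. The function names, the toy scores, and the use of a plain (unweighted, non-robust) covariance are assumptions for illustration only, and the closed-form chi-square tail used here is valid only for an even number of samples.

```python
import math

def mean_and_cov(rows):
    """Column means and sample covariance matrix of equal-length rows."""
    n, k = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(k)]
    cov = [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / (n - 1)
            for b in range(k)] for a in range(k)]
    return mu, cov

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance of x from mu under covariance cov."""
    d = [xi - mi for xi, mi in zip(x, mu)]
    return sum(di * si for di, si in zip(d, solve(cov, d)))

def chi2_sf_even(x, df):
    """Upper-tail chi-square probability (closed form, even df only)."""
    if df % 2:
        raise ValueError("closed form requires an even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Toy data (hypothetical): rows are bins, columns are four samples.
scores = [
    [1.0, 1.1, 0.9, 1.2], [-1.0, -0.8, -1.1, -0.9],
    [0.5, 0.4, 0.7, 0.3], [-0.5, -0.6, -0.3, -0.7],
    [0.9, 1.0, 1.1, -1.5], [0.2, 0.1, 0.0, 0.4],
    [-0.2, 0.0, -0.1, -0.3], [1.5, 1.3, 1.6, 1.4],
]
mu, cov = mean_and_cov(scores)
for i, row in enumerate(scores):
    d2 = mahalanobis_sq(row, mu, cov)
    print(i, round(d2, 3), round(chi2_sf_even(d2, len(row)), 4))
```

With only eight bins the chi-square approximation is crude; the sketch is meant to show where the one-value-per-bin comes from (one distance per score vector), not to reproduce dcHiC's numbers.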
Hi,
This seems like a good tool that I could use in my project! I work with single-cell data, and in one of my analyses I need to compare my data with previously published population data with replicates. On the GEO page, all they provide are .allValidPairs files.
I had started running the dchic pipeline on .cool files, which I generated using the hicpro2higlass script. I realized that this is probably not the right way to do it.
Do I need to start with .validPairs files instead?
I don't think there is a way to convert allValidPairs to validPairs, is there?
Any help would be great!
Hello,
I would like to use dcHiC on cow.
However, it is not clear how the TSS file should be made. In the "Technical Specifications" section, you give an example for hg38.tss.bed, but it is a GTF file.
Cow golden path: https://hgdownload.cse.ucsc.edu/goldenpath/bosTau9/bigZips/
So, how do I make this tss.bed file?
Is it just a two-column table with chromosome and TSS position?
Or three columns: chr, start and end?
Thanks
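For readers with the same question, here is a minimal sketch of deriving one TSS position per gene from GTF annotation lines. The column layout that dcHiC expects in its TSS file is not stated in this thread, so the five-column output (chrom, start, end, name, strand) is an assumption; the function name and the `feature` parameter are hypothetical. Some GTF files (for example UCSC-style ones) may contain no "gene" records, in which case "transcript" may be the feature to use.

```python
def gtf_to_tss_bed(gtf_lines, feature="gene"):
    """Yield (chrom, start, end, name, strand) with the TSS as a 1 bp BED interval."""
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        if len(f) < 9 or f[2] != feature:
            continue
        chrom, start, end, strand, attrs = f[0], int(f[3]), int(f[4]), f[6], f[8]
        # GTF is 1-based inclusive; the TSS is the 5' end of the feature.
        tss = end if strand == "-" else start
        name = "."
        for item in attrs.split(";"):
            key, _, value = item.strip().partition(" ")
            if key == "gene_id":
                name = value.strip('"')
        # BED is 0-based, half-open.
        yield chrom, tss - 1, tss, name, strand

example = [
    'chr1\tsrc\tgene\t1000\t2000\t.\t+\t.\tgene_id "G1"; gene_name "A";\n',
    'chr2\tsrc\tgene\t500\t900\t.\t-\t.\tgene_id "G2";\n',
]
for rec in gtf_to_tss_bed(example):
    print("\t".join(map(str, rec)))
```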
Hi @ay-lab. I'd like to ask how exactly we should process the input data/files for dcHiC. There are instructions on the Wiki tab about using cooler's dump and preprocessing.py. I'm wondering where this Python script can be obtained.
Thank you!
I'm trying to run dcHiC chromosome-by-chromosome (after removing chrY), and most of the chromosomes seem to be ok, all of the differential compartment, etc files are there, but it fails on chr1, with the following:
[1] "Hierarchy"
[[1]]
[1] 5602 5602 5602 5602 5602 5602
[[2]]
[1] 1 1 1 1 1 1
DEBUG:Blank space skipped. No worries.
DEBUG:Blank space skipped. No worries.
DEBUG:Blank space skipped. No worries.
DEBUG:Blank space skipped. No worries.
DEBUG:Blank space skipped. No worries.
DEBUG:Blank space skipped. No worries.
Positions Not Shared Across All Data Sets, Chromosome 1:
None
5602
Rscript --vanilla /home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/Hier.R BalancedChrMatrix_exp_N2P.D0.txt BalancedChrMatrix_exp_N2P.D14.txt BalancedChrMatrix_exp_N2P.D7.
R OUTPUT:
Traceback (most recent call last):
File "/home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/run.py", line 323, in <module>
dmfa_file = open(name, "r")
FileNotFoundError: [Errno 2] No such file or directory: 'hmfa_chrRAW_N2P.D0_exp_1.txt'
Traceback (most recent call last):
File "/home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py", line 40, in <module>
with open(results.eigfile, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'chr_1/hmfa_N2P.D0_exp_1.txt'
Traceback (most recent call last):
File "/home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py", line 40, in <module>
with open(results.eigfile, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'chr_1/hmfa_N2P.D14_exp_2.txt'
Traceback (most recent call last):
File "/home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py", line 40, in <module>
with open(results.eigfile, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'chr_1/hmfa_N2P.D7_exp_3.txt'
Traceback (most recent call last):
File "/home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py", line 40, in <module>
with open(results.eigfile, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'chr_1/hmfa_NGN2.D0_exp_4.txt'
Traceback (most recent call last):
File "/home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py", line 40, in <module>
with open(results.eigfile, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'chr_1/hmfa_NGN2.D14_exp_5.txt'
Traceback (most recent call last):
File "/home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py", line 40, in <module>
with open(results.eigfile, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'chr_1/hmfa_NGN2.D7_exp_6.txt'
['1']
['N2P.D0', 'N2P.D14', 'N2P.D7', 'NGN2.D0', 'NGN2.D14', 'NGN2.D7']
['N2P.D0', 'N2P.D14', 'N2P.D7', 'NGN2.D0', 'NGN2.D14', 'NGN2.D7']
###################
Chromosome 1 Output
###################
/home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic
6
6
python /home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/run.py -nExp 6 -chrNum 1 -res 40000 -numGroups 6 -grouping 1 -ncp 2 -group 1 -group 1 -group 1 -group 1 -group 1 -g
python /home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py -eigfile chr_1/hmfa_N2P.D0_exp_1.txt -chr 1 -exp N2P.D0
python /home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py -eigfile chr_1/hmfa_N2P.D14_exp_2.txt -chr 1 -exp N2P.D14
python /home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py -eigfile chr_1/hmfa_N2P.D7_exp_3.txt -chr 1 -exp N2P.D7
python /home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py -eigfile chr_1/hmfa_NGN2.D0_exp_4.txt -chr 1 -exp NGN2.D0
python /home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py -eigfile chr_1/hmfa_NGN2.D14_exp_5.txt -chr 1 -exp NGN2.D14
python /home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/makeBedGraph.py -eigfile chr_1/hmfa_NGN2.D7_exp_6.txt -chr 1 -exp NGN2.D7
Traceback (most recent call last):
File "/home/user2031/.conda/envs/hicpro/lib/python3.7/shutil.py", line 566, in move
os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: 'pc_N2P.D0_exp_1.txt' -> 'pcFiles/pc_N2P.D0_exp_1.txt'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/user2031/work/repos/bcro/bit-bio/dcHiC/dchic/dchic.py", line 392, in <module>
shutil.move(pcaFileLocation, "pcFiles")
File "/home/user2031/.conda/envs/hicpro/lib/python3.7/shutil.py", line 580, in move
copy_function(src, real_dst)
File "/home/user2031/.conda/envs/hicpro/lib/python3.7/shutil.py", line 266, in copy2
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/home/user2031/.conda/envs/hicpro/lib/python3.7/shutil.py", line 120, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'pc_N2P.D0_exp_1.txt'
This is a 40k matrix from HiC-Pro. Is it possible that the R script failed due to some error (memory limitation, etc)?
Hi @ay-lab
The following message always appears in my standard error file. Do I have to reinstall the application? If so, could you provide some guidance? Thank you in advance.
qt.qpa.xcb: could not connect to display login01:34.0
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Best regards,
Zheng zhuqing
Great tool!
I came across one issue. For a particular sample, PC1 is the compartment score for all chromosomes except one. For that chromosome, PC1 captures the chromosome arms instead. Is there an easy way to use PC2 for that chromosome only?
For all other chromosomes and samples, PC1 is accurate.
Hi! Congrats for the nice tool!
I am a bit puzzled at the very end of running dcHiC.py where it launches the differential calling:
python /path/dchic/differentialCalling.py -inputFile input.txt -chrFile chromosomes.txt -multiComp 1 -makePlots 1 -res 100000 -genome mm10 -blacklist /path/dcHiC/files/mm10blacklist_sorted.bed
With the following error:
Traceback (most recent call last):
File "/path/dchic/differentialCalling.py", line 469, in <module>
main()
File "/path/dchic/differentialCalling.py", line 363, in main
checkInputs(0)
File "/path/dchic/differentialCalling.py", line 205, in checkInputs
names.append(temp[0])
IndexError: list index out of range
It seems that some of the files might not have the correct format. But that is strange, because the previous part of the code runs fine and produces the expected outputs.
I have a couple of samples per condition. My input looks like:
ctl.1 CTL /path/ctl.Rep1
ctl.2 CTL /path/ctl.Rep2
exp.1 EXP /path/exp.Rep1
exp.2 EXP /path/exp.Rep2
If I include -repParams it still gives me the same error. I would really appreciate it if you could clarify. Thanks in advance.
Kind regards,
S.
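One thing that could be checked for the `names.append(temp[0])` IndexError above is whether the input file contains a blank or short line. This is only a guess: if the script splits each line on whitespace, an empty line yields an empty list and indexing it fails. The sketch below is a hypothetical pre-flight check, not part of dcHiC; the function name and the three-column expectation (taken from the input shown above) are assumptions.

```python
def check_input_table(lines, expected_cols=3):
    """Return (line_number, problem) pairs for a whitespace-delimited table."""
    problems = []
    for n, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            problems.append((n, "blank line"))
        elif len(fields) != expected_cols:
            problems.append((n, f"expected {expected_cols} fields, found {len(fields)}"))
    return problems

# Hypothetical usage on an in-memory copy of input.txt:
demo = ["ctl.1 CTL /path/ctl.Rep1\n", "ctl.2 CTL /path/ctl.Rep2\n", "\n"]
for line_no, problem in check_input_table(demo):
    print(f"line {line_no}: {problem}")
```

A trailing empty line at the end of the file is easy to miss in an editor, which is why the check reports line numbers.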
Hi, I was wondering if there are plans to add support for additional genomes (like Drosophila dm6 or Arabidopsis tair10).
Thanks!
Hi,
I would like to get compartment A/B information for each sample. I noticed that in xx_resolution/intra_pca/sample_res_mat folder, each chromosome has 10 files, e.g.,
[kun@G1400PNG-AP02LP NT1_100000_mat]$ wc -l chrX*
1554 chrX.bed
1491 chrX.cmat.txt
1522 chrX.distparam
1491 chrX.PC1.bedGraph
1491 chrX.PC2.bedGraph
1491 chrX.pc.bedGraph
1492 chrX.pc.txt
1491 chrX.precmat.txt
108 chrX.svd.rds
912204 chrX.txt
How can I extract compartment A/B information from these files? Thanks for your help!
Best,
Kun
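As a rough illustration of the usual convention (positive PC score = A, negative = B), the sketch below labels bins from bedGraph lines. It assumes the sign of the PC has already been oriented (for example by the select step) and that the relevant file is a four-column bedGraph; which of the listed files holds the final selected PC is not confirmed in this thread. The function name is hypothetical.

```python
def label_compartments(bedgraph_lines):
    """Yield (chrom, start, end, score, label) with label 'A', 'B' or 'NA'."""
    for line in bedgraph_lines:
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, score = line.split()[:4]
        value = float(score)
        label = "A" if value > 0 else "B" if value < 0 else "NA"
        yield chrom, int(start), int(end), value, label

demo = [
    "track type=bedGraph\n",
    "chrX\t0\t100000\t-1.25\n",
    "chrX\t100000\t200000\t0.8\n",
    "chrX\t200000\t300000\t0\n",
]
for rec in label_compartments(demo):
    print(*rec, sep="\t")
```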
Dear all,
I am trying to integrate your tool into a snakemake pipeline (git clone from 2nd Aug 2022).
It would be nice to have an option to define the folder in which the outputs are stored (similar to the diffdir one), without having to cd into the corresponding folder.
Secondly, I tried to run the tool with the following parameters:
$CONDA_PREFIX/bin/Rscript /home/user/dcHiC/dchicf.r \
--cthread 4 \
--pthread 2 \
--file path/to/dcHiC_input_file_individual_samples_100kb.txt \
--pcatype cis \
--genome hg19
The package starts to run, however I got this error at the first sample:
[... previous output ...]
Writing chr22 .txt file
Calculating expected counts from chromosome wise background
dist Weight
1 0 64258.297
2 100000 32765.346
3 200000 15516.227
4 300000 10023.904
5 400000 7032.697
6 500000 5212.232
A B Weight chr1 pos1 chr2 pos2 dist WeightOE
1: 30587 30587 291.567129 chr21 9800000 chr21 9800000 0 2.1870383
2: 30587 30598 2.745532 chr21 9800000 chr21 10900000 1100000 0.8365094
3: 30587 30599 3.772637 chr21 9800000 chr21 11000000 1200000 1.3208504
4: 30587 30644 2.080144 chr21 9800000 chr21 15500000 5700000 5.4256370
5: 30587 30658 1.589394 chr21 9800000 chr21 16900000 7100000 5.3436055
6: 30587 30664 1.550337 chr21 9800000 chr21 17500000 7700000 5.5577334
[1] 25335
[1] 2500000
[1] 25335
Writing chr21 .txt file
Calculating expected counts from chromosome wise background
Error in aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...) :
no rows to aggregate
Calls: lapply ... aggregate -> aggregate.formula -> aggregate.data.frame
Execution halted
I was wondering whether removing "non-canonical" chromosomes is sufficient to avoid this error?
Thanks in advance!
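A possible way to test the non-canonical-chromosome hypothesis before rerunning is to count intra-chromosomal entries per chromosome in the sparse input. The sketch below assumes HiC-Pro style files (a bed file with chrom, start, end, bin index, and a matrix file with bin_i, bin_j, count); whether an empty chromosome is really what triggers "no rows to aggregate" is a guess, and the function name is hypothetical.

```python
from collections import Counter

def cis_pairs_per_chrom(bed_lines, matrix_lines):
    """Count matrix entries whose two bins lie on the same chromosome."""
    bin_chrom = {}
    for line in bed_lines:
        chrom, _start, _end, idx = line.split()[:4]
        bin_chrom[int(idx)] = chrom
    # Start every chromosome at zero so empty ones remain visible.
    counts = Counter({chrom: 0 for chrom in bin_chrom.values()})
    for line in matrix_lines:
        a, b = line.split()[:2]
        ca, cb = bin_chrom.get(int(a)), bin_chrom.get(int(b))
        if ca is not None and ca == cb:
            counts[ca] += 1
    return counts

bed = ["chr1\t0\t100000\t1\n", "chr1\t100000\t200000\t2\n", "chrM\t0\t16571\t3\n"]
mat = ["1\t1\t5\n", "1\t2\t3\n", "1\t3\t1\n"]
counts = cis_pairs_per_chrom(bed, mat)
print([c for c, n in counts.items() if n == 0])  # chromosomes without cis contacts
```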
Hi,
It looks like some files (namely dchic.r and utility/reselectpc.r) use Windows-style '\r\n' line endings instead of '\n'.
This causes errors like /usr/bin/env: ‘Rscript\r’: No such file or directory when running e.g. dchic.r on Ubuntu.
Could you please run dos2unix or a similar tool to strip the '\r' characters?
If you can't install the tool on your machines, feel free to grab the files from my fork here.
This is the command I used to run dos2unix on my fork:
find /tmp/dcHiC -type f -exec dos2unix -k -s -o {} \;
I was about to submit a PR with the fix, but decided against that because this change would make it look like I authored every single line in the incriminated files.
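If dos2unix is not available, the same normalization can be sketched in a few lines of Python. This is an illustrative alternative, not the command used above; the function names are hypothetical, and the in-place helper overwrites the file, so it is best tried on a copy first.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Convert CRLF (and any stray lone CR) line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def fix_file_in_place(path: str) -> bool:
    """Rewrite path with LF endings; return True if the file was changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    converted = to_unix_newlines(original)
    if converted != original:
        with open(path, "wb") as handle:
            handle.write(converted)
    return converted != original

print(to_unix_newlines(b"#!/usr/bin/env Rscript\r\nx <- 1\r\n"))
```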
Hi Ay,
I'm trying to run dcHiC on a human genome dataset. Unfortunately, I got several errors and couldn't get the results I want.
I get an error message when running the first step:
"Error in checkForRemoteErrors(val) :
2 nodes produced errors; first error: Two levels of parallelism are used. See ?assert_cores.
Calls: lapply ... clusterApply -> staticClusterApply -> checkForRemoteErrors
Execution halted"
I guess this might be due to the presence of chromosomes Y, M and Z in my data (I do have these chromosomes), but I would still like to ask your opinion on it.
In the second step, which selects the best PC, I get this error:
"Error: Executable for bedtools not found! Please make sure that the software is correctly installed and, if necessary, path variables are set."
I'm not familiar with this tool and did not see it mentioned as a prerequisite for this analysis, so I searched for it and tried installing it on the HPC I am using. However, because I don't have administrator rights, I still cannot call it when running the program, so the error persists. Do you have any suggestions?
Although I have 12 .matrix and .bed files in the data folder, I only get output for the first one listed in the input file (named NT1_20kb_pca). Do you have any idea why?
I guess the other errors I encountered in the subsequent steps are consequences of step 2.
Moreover, the content of my input file is as follows,
NT1_20000.matrix NT1_20000_abs.bed NT1_20Kb NT
NT2_20000.matrix NT2_20000_abs.bed NT2_20Kb NT
PT1_20000.matrix PT1_20000_abs.bed PT1_20Kb PT
PT2_20000.matrix PT2_20000_abs.bed PT2_20Kb PT
PT3_20000.matrix PT3_20000_abs.bed PT3_20Kb PT
PT4_20000.matrix PT4_20000_abs.bed PT4_20Kb PT
PT5_20000.matrix PT5_20000_abs.bed PT5_20Kb PT
RT1_20000.matrix RT1_20000_abs.bed RT1_20Kb RT
RT2_20000.matrix RT2_20000_abs.bed RT2_20Kb RT
RT3_20000.matrix RT3_20000_abs.bed RT3_20Kb RT
RT4_20000.matrix RT4_20000_abs.bed RT4_20Kb RT
RT5_20000.matrix RT5_20000_abs.bed RT5_20Kb RT
Thanks!
Hi dcHiC developers, I'd like to ask for help on installing functionsdchic_1.0.tar.gz. I've followed the instructions via Conda, and the following compilation error is encountered. Is there a way to address this issue?
functionsdchic.cpp:12:10: fatal error: BMAcc.h: No such file or directory
12 | #include <BMAcc.h>
Thank you!
Hi,
Thanks for developing this useful tool.
When running the fithic step, I got the following error.
Started calculating Marginalized Contact Count
chr start end index extraField mappable mid correct_index marginalizedContactCount
1 chr1 0 100000 1 0 1 50000 1 0
2 chr1 100000 200000 2 0 1 150000 2 0
3 chr1 200000 300000 3 0 1 250000 3 0
4 chr1 300000 400000 4 0 1 350000 4 0
5 chr1 400000 500000 5 0 1 450000 5 0
6 chr1 500000 600000 6 0 1 550000 6 0
Fithic requires a bias file. Please check the link for more details
https://github.com/ay-lab/fithic
Please generate the bias files for each sample provided in the input.txt file
Create an additional folder 'biases' under current path and dump all the *.biases.gz files inside it
Rerun the step again
Error in FUN(X[[i]], ...) : Exit!
Calls: fithicformat -> lapply -> FUN
Execution halted
I wonder how to specify the bias files for fithic.
Best,
Yiwei
Hello,
Would you please tell me how to generate the -repParams file?
In this issue (#2) you mentioned:
In this case, you can simply use multiple allValidPairs data and specify a pre-trained file with "-repParams" in the dchic.py call.
I took a look into both pre-trained files, https://github.com/ay-lab/dcHiC/blob/master/files/humanparams.txt and https://github.com/ay-lab/dcHiC/blob/master/files/miceparams.txt, and I guess the data is about the compartment "fluctuation" in each chromosome, is that correct? However, I still don't understand what the "m" and "s" columns are about.
In my project, the compartment profiles are not generated from Hi-C maps in the traditional way, but inferred from other epigenetic marks (this is what my project is about). Therefore, I would prefer not to use the pre-trained files your project provides directly, but rather generate my own. Would you please let me know the meaning of the "m" and "s" columns, and how to generate them?
Thanks!
Dear all,
I found that dcHiC does not finish cleanly; it gives the following standard error output, and the ReplicateImages directory is empty.
qt.qpa.xcb: could not connect to display login01:42.0
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Available platform plugins are: eglfs, minimal, minimalegl, offscreen, vnc, webgl, xcb.
According to the standard output, I need to run the following command to summarize the results (note that I have turned off the -makePlots option):
python differentialCalling.py -inputFile input.txt -chrFile chr.txt -res 100000 -genome mm10 -blacklist mm10-blacklist.v2.bed
If possible, could these figures be saved to file rather than displayed?
Best wishes,
Zheng zhuqing
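A common workaround for this kind of "could not connect to display" message on a cluster node is to run the plotting step headless. The sketch below builds an environment that selects Qt's "offscreen" platform plugin (listed as available in the message above) and a non-interactive Matplotlib backend through the `MPLBACKEND` variable. It assumes that the plots are drawn through Qt and/or Matplotlib, which this thread does not confirm; the helper name is hypothetical.

```python
import os

def headless_env(base=None):
    """Return a copy of the environment configured for display-less plotting."""
    env = dict(os.environ if base is None else base)
    env["QT_QPA_PLATFORM"] = "offscreen"  # Qt platform plugin that needs no X server
    env["MPLBACKEND"] = "Agg"             # Matplotlib backend that renders to files only
    return env

# Hypothetical usage: pass the environment to the process that draws the plots, e.g.
#   subprocess.run(["python", "differentialCalling.py", ...], env=headless_env())
print(headless_env({"PATH": "/usr/bin"}))
```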
Dear all,
After finishing the chromosome-by-chromosome analysis, I got an error when generating the bedGraph results of all pairwise comparisons with the following command:
python /dchic/differentialCalling.py -inputFile input.txt -chrFile chr.txt -makePlots 1 -res 100000 -genome mm10 -multiComp 1 -blacklist /mm10/mm10-blacklist.v2.bed
DifferentialCompartment folder created
Learned parameter file found. Using the IHW to boost statistical power
Running 10MF /02.dcHiC/01.mm10/02.output/00.100K_resolution/chr_X/chrX.PC.coordinates.txt
Error in `[.data.frame`(df, , selected) : undefined columns selected
Calls: diffcmp -> apply -> [ -> [.data.frame
Execution halted
Sincerely,
Zheng zhuqing
Hi @ay-lab. I'd like to ask how the current version of dcHiC deals with data without replicates. The previous version of dcHiC (v1, in a separate branch) has the --repParams option that can be specified if replicates are not available. A trial run using the current master branch of dcHiC without replicates seems to work, but I'm curious how dcHiC handles this internally.
Thank you.
Dear,
I found that the chromosome text file does not include the "chr" prefix in the chromosome names. The default chromosomes in dchic.py for the human and mouse genomes also lack the "chr" prefix. However, the chromosomes in goldenpathData and the blacklist data do have the "chr" prefix. Will this cause problems when running dcHiC?
Best wishes,
zheng zhuqing
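Whether dcHiC reconciles the two naming styles internally is not answered here, but a mismatch is easy to detect up front. The following is a small hypothetical helper (not part of dcHiC) that lists chromosome names which match a second file only after a "chr" prefix is added.

```python
def add_chr_prefix(name):
    """Return name with a leading 'chr', leaving already-prefixed names alone."""
    return name if name.startswith("chr") else "chr" + name

def naming_mismatch(names_a, names_b):
    """Names in names_a that match names_b only once 'chr' is prepended."""
    known = set(names_b)
    return sorted(n for n in names_a if n not in known and add_chr_prefix(n) in known)

# Hypothetical example: chromosome list file vs. names found in a blacklist BED.
print(naming_mismatch(["1", "2", "X"], ["chr1", "chr2", "chrX", "chrM"]))
```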
Hi,
Thanks for developing such a powerful tool! I got some errors when I performed PCA on the cis interaction matrix.
Command: Rscript /share/home/jiqianzhao/04_Softwares/dcHiC-master/dchicf.r --file dchic.sum.info.txt --pcatype cis --dirovwt T --cthread 2 --pthread 4
Error: Error in checkForRemoteErrors(val) : 2 nodes produced errors; first error: Two levels of parallelism are used. See ?assert_cores. Calls: lapply ... clusterApply -> staticClusterApply -> checkForRemoteErrors
When I tried Rscript /share/home/jiqianzhao/04_Softwares/dcHiC-master/dchicf.r --file dchic.sum.info.txt --pcatype cis --dirovwt T with default threads, it worked but ran slowly.
Could you give me some suggestions?
Thanks in advance !
Best,
Qianzhao
Error: Two levels of parallelism are used. See ?assert_cores.
Please reply. Thanks.
:
Matrix dimension 1204 X 1204
Performing Z transformation : complete!
Performing block wise correlation calculation : complete!
Matrix dimension 6232 X 6232
Matrix dimension 880 X 880
Performing Z transformation : complete!
Performing block wise correlation calculation : complete!
Performing Z transformation : complete!
Performing block wise correlation calculation : complete!
Matrix dimension 5654 X 5654
Performing Z transformation : complete!
Performing block wise correlation calculation : complete!
Error in checkForRemoteErrors(val) :
2 nodes produced errors; first error: Two levels of parallelism are used. See ?assert_cores.
Calls: lapply ... clusterApply -> staticClusterApply -> checkForRemoteErrors
Execution halted
Really great tool.
Is it possible to add a covariate to the model? Specifically I would like to add copy number as a covariate rather than make a cutoff and blacklist regions. Is there a simple way to do this?
Thanks so much
I am having an issue with the subcompartment analysis step of the dchicf.r script. When I run:
Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype subcomp --dirovwt T --diffdir all_samples_40Kb
I get the error:
Error in `[<-.data.frame`(`*tmp*`, , "state", value = 1:6) :
replacement has 6 rows, data has 3
Calls: subcompartment -> hmmsegment -> [<- -> [<-.data.frame
Execution halted
The subcompartment analysis runs for many chromosomes and samples, but then fails when it starts chr4 for the first sample in the input.txt
Prior to this I successfully ran:
Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype cis --dirovwt T --cthread 2 --pthread 4 --genome hg38 --fdr 0.05
Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype select --dirovwt T --genome hg38
Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype analyze --dirovwt T --diffdir all_samples_40Kb
Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype dloop --dirovwt T --diffdir all_samples_40Kb
I have also successfully run the subcompartment analysis on subsets of these samples, including for chr4 on the first sample in the input.txt, but I get this error when running all samples together.
Dear @ay-lab
I finished running dcHiC and identified only one B-to-A compartment switch region (500 kb, at 100 kb resolution) between wild type and mutant (see the following screenshot). I also ran PCA using HiCExplorer and HOMER (100 kb resolution); please note that I used gene density to decide whether the PC1 values of the eigenvector needed a sign flip. However, the signal identified by dcHiC did not register in the HiCExplorer and HOMER PCA, as the PC1 values of wild type and mutant did not differ significantly. I would therefore like to know whether any visual validation can be used to further support this signal. Thank you very much.
Sincerely,
Zheng zhuqing
Line 15 in 6af6d0e
Here it should be:
cytoband <- paste0("curl -O http://hgdownload.cse.ucsc.edu/goldenPath/",genome,"/database/cytoBand.txt.gz")
I am able to run some samples without error, but I am currently getting an error for a particular group of samples when I run:
Rscript ~/dcHiC/dchicf.r --file BC005_input.txt --pcatype analyze --dirovwt T --diffdir BC005_100Kb
I get the error:
Error in solve.default(cov, ...) :
Lapack routine dgesv: system is exactly singular: U[3,3] = 0
Calls: pcanalyze ... do.call -> CovSde -> mahalanobis -> solve -> solve.default
Execution halted
Any idea what could be causing this and how to fix it?
Dear all,
Because of low sequencing depth for some samples in my study, I constructed another library for these samples and sequenced them to high coverage. When preparing the input files for dcHiC, I'm not sure I'm doing the right thing. For these samples, I simply merged the validPairs files (not the allValidPairs files) using cat, and then converted the merged file to sparse matrix/bed files using buildmatrix.
Sincerely.
Zheng zhuqing
Hi,
I usually use normalized counts (between 0 and 1) instead of raw (integer) counts for matrix processing. But for the loop analysis using FitHiC, we must use raw counts with a bias file containing the normalization vector.
What do you advise for the compartment analysis with your tool: normalized or raw counts? I think we should use normalized matrices to account for some biases.
Does the normalized bedGraph (using quantile normalization) in some way replace the normalization of matrices? I think it is only useful for comparing between samples, right?
Hi,
When I checked the differential_compartments.bedGraph file generated by dcHiC, the number of differential compartments was about 100. However, when I extracted significantly differential compartments at an FDR of 0.05 or 0.01 from full_compartment_details.bedGraph, the number was greater than 1000.
Could you explain how dcHiC produces the differential_compartments.bedGraph file?
Another question: how are High B and Low B defined?
For example, suppose the PC scores in a given region under two conditions (tumor and normal) are -4 and -1, respectively. Is the tumor in this region defined as High B or Low B?
Best,
Hi --
I've installed the software, extracted the data from .hic files (excluding all 'abnormal' chromosomes) according to the GitHub instructions (all files look as expected) and am now trying to run the overall script (adapted from the GitHub page):
Rscript dchicf.r --file pro_v_sen.txt --pcatype cis --dirovwt T --cthread 2 --pthread 4
Rscript dchicf.r --file pro_v_sen.txt --pcatype select --dirovwt T --genome hg38
Rscript dchicf.r --file pro_v_sen.txt --pcatype analyze --dirovwt T --diffdir pro_v_sen_100Kb
#Rscript dchicf.r --file pro_v_sen.txt --pcatype fithic --dirovwt T --diffdir pro_v_sen_100Kb --fithicpath "/anaconda/dchic/bin/fithic" --pythonpath "/anaconda/dchic/bin/python"
Rscript dchicf.r --file pro_v_sen.txt --pcatype dloop --dirovwt T --diffdir pro_v_sen_100Kb
Rscript dchicf.r --file pro_v_sen.txt --pcatype subcomp --dirovwt T --diffdir pro_v_sen_100Kb
Rscript dchicf.r --file pro_v_sen.txt --pcatype viz --diffdir pro_v_sen_100Kb --genome hg38
Rscript dchicf.r --file pro_v_sen.txt --pcatype enrich --genome hg38 --diffdir conditionA_vs_conditionB --exclA F --region both --pcgroup pcQnm --interaction intra --pcscore F --compare F
However, I am receiving the following error from the pcatype select (pcselect) step:
Running intra chr1 in pro_v_sen_100kb sample
Error in aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...) :
no rows to aggregate
Calls: pcselect ... aggregate -> aggregate.formula -> aggregate.data.frame
In addition: Warning messages:
1: In (function (..., deparse.level = 1) :
number of columns of result is not a multiple of vector length (arg 1)
2: In (function (..., deparse.level = 1) :
number of columns of result is not a multiple of vector length (arg 1)
Execution halted
Any ideas on how to fix this? Please let me know if you need any additional information. Thanks!
Dear @ay-lab
I think this is a general question. Changes in the 3D organization of chromatin can affect genome organization at different levels, including A/B compartments, TADs, and loops. If a region has been identified as a B->A switch, does that mean we will also find many changes at the TAD or loop level when we have sufficient sequencing data? Could you kindly let me know your thoughts on this issue? Any comments will be highly appreciated.
Best regards,
Zheng zhuqing
Hello,
Thanks for developing dcHiC, including support for analyzing Hi-C data without replicates.
Running dcHiC Without Replicates
Differential calling with dcHiC "learns" the amount that PC (compartment) values vary between biological replicate datasets and uses those parameters for significance thresholds. However, it is also possible to run dcHiC from start to finish without replicates (for users using HiC-Pro, this means using the allValidPairs file).
In the input.txt file, put the same name for the "replicate" and "cell line" columns.
HMEC HMEC /path/to/HMEC
MCF7 MCF7 /path/to/MCF7
MCF10 MCF10 /path/to/MCF10
After building the input.txt file, I tried to run dchic.py but got the following errors:
Traceback (most recent call last):
File "/data/software/dcHiC-dcHiCv2.0/dchic/run.py", line 421, in <module>
filenum = int(file.split("_")[3].split(".")[0])
ValueError: invalid literal for int() with base 10: 'exp'
python /data/software/dcHiC-dcHiCv2.0/dchic/makeBedGraph.py -eigfile chr_1/hmfa_HiC3-CD8T-Health2-Veh_combine_exp_1.txt -chr 1 -exp HiC3-CD8T-Health2-Veh_combine
Traceback (most recent call last):
File "/data/software/dcHiC-dcHiCv2.0/dchic/makeBedGraph.py", line 40, in <module>
with open(results.eigfile, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'chr_1/hmfa_HiC3-CD8T-Health2-Veh_combine_exp_1.txt'
python /data/software/dcHiC-dcHiCv2.0/dchic/makeBedGraph.py -eigfile chr_1/hmfa_HiC4-CD8T-Health2-GYY_combine_exp_2.txt -chr 1 -exp HiC4-CD8T-Health2-GYY_combine
Traceback (most recent call last):
File "/data/software/dcHiC-dcHiCv2.0/dchic/makeBedGraph.py", line 40, in <module>
with open(results.eigfile, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'chr_1/hmfa_HiC4-CD8T-Health2-GYY_combine_exp_2.txt'
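One plausible reading of the ValueError above: the expression from the traceback, `int(file.split("_")[3].split(".")[0])`, picks the fourth underscore-separated field, so an underscore inside the sample name (as in "..._combine") shifts the fields and "exp" lands where the number is expected. The sketch below reproduces that behaviour under the assumption that the file names follow the `<prefix>_<sample>_exp_<n>.txt` pattern seen in the logs; which file run.py actually parses at that line is not shown here. The helper names are hypothetical.

```python
def file_number(filename):
    """Mirror of the expression in the traceback (run.py, line 421)."""
    return int(filename.split("_")[3].split(".")[0])

def file_number_from_right(filename):
    """Alternative that reads the number after the last underscore instead."""
    return int(filename.rsplit("_", 1)[1].split(".")[0])

ok_name = "hmfa_N2P.D0_exp_1.txt"
bad_name = "hmfa_HiC3-CD8T-Health2-Veh_combine_exp_1.txt"

print(file_number(ok_name))
try:
    file_number(bad_name)
except ValueError as err:
    print("ValueError:", err)
print(file_number_from_right(bad_name))
```

If this is indeed the cause, renaming the samples so that they contain no underscores (for example "HiC3-CD8T-Health2-Veh-combine") would avoid the problem without touching the code.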
Dear all,
When running dcHiC, I got the following error:
Chromosome chr1 Output
###################
/path/dchic
4
2
python /path/run.py -nExp 4 -chrNum chr1 -res 100000 -numGroups 2 -grouping 1 -ncp 2 -group 2 -group 2 -expNames mm8WF-1 -expNames mm8WF-2 -expNames mm8WM-1 -expNames mm8WM-2 -groupNames mm8WF -groupNames mm8WM -blacklist /mm10/mm10-blacklist.v2.bed -genome mm10 -alignData /mm10/goldenpathData -prePath /path/CorrelationMatrices/mm8WF-1_mat/ -prePath /path/CorrelationMatrices/mm8WF-2_mat/ -prePath /path/CorrelationMatrices/mm8WM-1_mat/ -prePath /path/CorrelationMatrices/mm8WM-2_mat/
Traceback (most recent call last):
File "/path/run.py", line 128, in <module>
obj = open(results.tag_list[itempos], "r")
IsADirectoryError: [Errno 21] Is a directory: '/path/CorrelationMatrices/mm8WF-1_mat/'
python /path/makeBedGraph.py -eigfile chr_chr1/hmfa_mm8WF_exp_1.txt -chr chr1 -exp mm8WF
cat input.txt
mm8WF-1 mm8WF /path/CorrelationMatrices/mm8WF-1_mat
mm8WF-2 mm8WF /path/CorrelationMatrices/mm8WF-2_mat
mm8WM-1 mm8WM /path/CorrelationMatrices/mm8WM-1_mat
mm8WM-2 mm8WM /path/CorrelationMatrices/mm8WM-2_mat
My command is as follows:
python /path/dchic.py -res 100000 -inputFile input.txt -chrFile chr.txt -input 2 -genome mm10 -alignData /mm10/goldenpathData -keepIntermediates 1 -blacklist /mm10/mm10-blacklist.v2.bed
Best wishes,
Zheng zhuqing
Hi author,
I corrected the A/B compartment assignment using the command "Rscript /public/home/xhhuang/biosoft/dcHiC/dchicf.r --file input_f.txt --pcatype select --dirovwt T --gfolder Gbar_100000_goldenpathData --genome Gbar", but got an unexpected result that I think is wrong. Both ends of the chromosome are rich in active genes and are therefore very likely to be A compartment regions, yet the result assigns them B compartment status. I ran cworld and HiCExplorer to compare. The results of cworld and HiCExplorer are more consistent with each other and as expected. Although I am not sure about the exact correction algorithm, which may be related to GC content and gene location, I think there is a problem in correcting A/B compartments for cotton. Does this occur similarly in other non-model organisms or in other plants?
I have more questions and hope the author will understand. I would very much like this software to be applicable to a wider range of scenarios.
I sent you the reference genome of cotton by email; I hope it will help to improve the software.
Nuturetree
Hi,
The default value for numberclust is 1. If I increase it to 2, there are no more significant compartments, which is what I want.
My question: based on your description of Compartment clustering, it makes sense to remove lone differential compartments. But is it OK to change only numberclust and keep distclust at its default (-1)?
Thanks,
Yichao
Hi,
Thank you for supplying such an excellent tool!
When calling fithic with Rscript /srv/jh_users/cvahlensieck/dcHiC/dchicf.r --file input_zgf.txt --pcatype fithic --dirovwt T --diffdir diff_zgf_100k --fithicpath "/opt/jupyterhub/lib/python3.9/site-packages/fithic/fithic.py" --pythonpath "/opt/jupyterhub/bin/python", the script crashes at line 1611:
Error in `[.data.table`(mat_rep[[j]][.(ids_rep)], , 12) :
Item 1 of j is 12 which is outside the column number range [1,ncol=11]
Calls: fithicformat -> unlist -> [ -> [.data.table
Execution halted
I traced the error back to an output file of FitHiC, FitHiC.spline_pass1.res100000.significances.txt.gz. This file contains only 9 columns instead of 10; the ExpCC column is missing. This is why the resulting table in the function fithicformat in dchicf.r contains only 11 columns, which leads to the error described above. Additionally, bias1 and bias2 in the FitHiC output file are always 1. Therefore I am not sure whether this error comes from a programming bug or from an issue with my data.
Thanks!
Christian
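A quick way to confirm the column count described in the report above is to read the header of the gzipped FitHiC output directly. This is only a minimal sketch: it assumes the file has a single whitespace-delimited header line, and the helper names read_header and missing_columns are hypothetical, not part of dcHiC or FitHiC.

```python
import gzip


def read_header(path):
    """Return the column names from the first line of a gzipped text file.

    Assumption: the first line is a whitespace-delimited header.
    """
    with gzip.open(path, "rt") as handle:
        return handle.readline().split()


def missing_columns(header, required=("ExpCC",)):
    """List the required column names that are absent from the header."""
    return [name for name in required if name not in header]


if __name__ == "__main__":
    # File name taken from the report above; adjust the path as needed.
    header = read_header("FitHiC.spline_pass1.res100000.significances.txt.gz")
    print(len(header), "columns; missing:", missing_columns(header))
```

If ExpCC is reported as missing, the file was written without that column, which would match the 11-column table seen in fithicformat.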
Hi, is it possible to get the % variance explained value for PC1 or PC2 using dcHiC results? A bit lost between the different intermediate files that dcHiC produces. Trying to produce a plot similar to 4B here. If not, what would you recommend? Just run a regular R prcomp() + summary() on the normalized HiC counts and check the results? Thanks for any pointers!
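For the fallback mentioned in the question (a plain PCA on the normalized counts), the proportion of variance per component can be computed from the singular values of the column-centered matrix. The sketch below is a generic PCA calculation, equivalent in spirit to R's prcomp() with centering and no scaling; it is not dcHiC's own procedure, and the function name explained_variance_ratio and the toy matrix are illustrative assumptions.

```python
import numpy as np


def explained_variance_ratio(matrix):
    """Fraction of total variance captured by each principal component.

    Rows are observations (e.g. genomic bins), columns are variables.
    """
    centered = matrix - matrix.mean(axis=0)
    # Squared singular values of the centered matrix are proportional
    # to the variance along each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy = rng.normal(size=(200, 5))  # placeholder for a normalized matrix
    ratios = explained_variance_ratio(toy)
    print("PC1: %.1f%%, PC2: %.1f%%" % (100 * ratios[0], 100 * ratios[1]))
```

The values depend on which matrix is passed in (raw, observed/expected, or correlation), so they are only comparable to published figures if the same preprocessing is used.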
Dear all,
When running the following command, the program exits with the error: No such file or directory: 'hmfa_chrRAW_10MF-1_exp_1.txt'. However, I remember it ran without any errors about two months ago; after updating dcHiC, the program now fails.
python dchic/dchic.py -res 50000 -inputFile ${pre}.txt -parallel 6 -chrFile chr.txt -input 2 -genome mm10 -alignData goldenpathData -keepIntermediates 1 -blacklist mm10-blacklist.v2.bed
Error in `[<-.data.frame`(`*tmp*`, m$tss > tskeep, "tskeep", value = "yes") :
missing values are not allowed in subscripted assignments of data frames
Calls: pcselect -> [<- -> [<-.data.frame
Execution halted
Traceback (most recent call last):
File "dchic/run.py", line 323, in <module>
dmfa_file = open(name, "r")
FileNotFoundError: [Errno 2] No such file or directory: 'hmfa_chrRAW_10MF-1_exp_1.txt'
Error in `[<-.data.frame`(`*tmp*`, m$tss > tskeep, "tskeep", value = "yes") :
missing values are not allowed in subscripted assignments of data frames
Calls: pcselect -> [<- -> [<-.data.frame
Execution halted
Traceback (most recent call last):
File "dchic/run.py", line 323, in <module>
dmfa_file = open(name, "r")
FileNotFoundError: [Errno 2] No such file or directory: 'hmfa_chrRAW_10MF-1_exp_1.txt'
Error in `[<-.data.frame`(`*tmp*`, m$tss > tskeep, "tskeep", value = "yes") :
missing values are not allowed in subscripted assignments of data frames
Calls: pcselect -> [<- -> [<-.data.frame
Execution halted
Traceback (most recent call last):
File "dchic/run.py", line 323, in <module>
dmfa_file = open(name, "r")
FileNotFoundError: [Errno 2] No such file or directory: 'hmfa_chrRAW_10MF-1_exp_1.txt'
Error in `[<-.data.frame`(`*tmp*`, m$tss > tskeep, "tskeep", value = "yes") :
missing values are not allowed in subscripted assignments of data frames
Calls: pcselect -> [<- -> [<-.data.frame
Execution halted
...
Sincerely,
Zheng zhuqing
Hi,
dcHiC looks like a good tool for analyzing differential compartments.
Have you compared dcHiC with other comparable tools?
Best,
Dear all,
Very nice tool. I would like to try it in my own project. However, some points about the input used by dcHiC confused me.
1, Why are validPairs interactions used rather than allValidPairs to generate the matrix? The main difference between them should be that duplicates have been removed from allValidPairs, and as I understand it, duplicates should be removed.
2, If I want to start from a .hic file, I think the .hic file should be generated using hicpro2juicebox.sh (which also uses allValidPairs rather than validPairs). Is this right?
3, Can fanc (https://github.com/vaquerizaslab/fanc) be used to generate the inputs for dcHiC?
Best wishes,
Zheng zhuqing
Hi author
When I ran "Rscript /public/home/xhhuang/biosoft/dcHiC/dchicf.r --file input_f2.txt --pcatype cis --dirovwt T --cthread 1 --pthread 1", the result showed:
Error in `$<-.data.frame`(`*tmp*`, "expcc", value = 1) :
replacement has 1 row, data has 0
Calls: lapply -> FUN ->
Execution halted
I am sure the input file is correct.
The input file format is:
./data/DPA0_chr1.matrix ./data/chr1_abs.bed DPA0_100kb DPA0
./data/DPA5_chr1.matrix ./data/chr1_abs.bed DPA5_100kb DPA5
Can you tell me how I can resolve this problem?
Thanks
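Since the error says the data has 0 rows, one thing worth ruling out is that a matrix or bed path in the input file is missing or empty. The sketch below checks that, assuming the four whitespace-separated fields seen in the example rows above (matrix path, bed path, and two names); the function name check_input_rows is hypothetical, and this is only one possible cause, not a confirmed diagnosis.

```python
import os


def check_input_rows(lines, base_dir="."):
    """Return (line_number, message) pairs for rows that look problematic.

    Assumption: each row has 4 fields and the first two are file paths.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            problems.append((number, "expected 4 fields, found %d" % len(fields)))
            continue
        for path in fields[:2]:
            full = os.path.join(base_dir, path)
            if not os.path.isfile(full):
                problems.append((number, "not a file: %s" % path))
            elif os.path.getsize(full) == 0:
                problems.append((number, "empty file: %s" % path))
    return problems


if __name__ == "__main__":
    with open("input_f2.txt") as handle:  # file name from the command above
        for number, message in check_input_rows(handle):
            print("line %d: %s" % (number, message))
```

An empty result only means the paths exist and are non-empty; a mismatch between chromosome names in the matrix and bed files would not be caught by this check.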
Thanks for your great software—I've been eager to try it out.
I've encountered an error when running gofilter.py. I'd like to ask for your advice on troubleshooting. I wonder if the error arises from the .bed file containing gene positions?
Calling gofilter.py:
python /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/src/dcHiC/dchic/gofilter.py \
-dir /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/results/kga0/2021_0226_pipeline_dcHiC \
-diffcompt /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/results/kga0/2021_0226_pipeline_dcHiC/DifferentialCompartment/MultiComparison_differential_compartments.bedGraph \
-config config_cardioD0.txt \
-outprefix u1_a1.cardioD0 \
-genome hg38 \
-geneBed /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/data/generate_annotation_model/gencode.v35.annotation.bed \
-runOption 1 \
-orientation 1
Head of gencode.v35.annotation.bed:
chr1 11868 14409 ENSG00000223972
chr1 14403 29570 ENSG00000227232
chr1 17368 17436 ENSG00000278267
chr1 29553 31109 ENSG00000243485
chr1 30365 30503 ENSG00000284332
chr1 34553 36081 ENSG00000237613
chr1 52472 53312 ENSG00000268020
chr1 57597 64116 ENSG00000240361
chr1 65418 71585 ENSG00000186092
chr1 89294 133723 ENSG00000238009
Error:
Slack Given In GO Analysis: 0
Rscript /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/src/dcHiC/dchic/cluster.r direction Active2Inactive cardioD2,cardioD5,cardioD14,endoD0,endoD2,endoD6,endoD14 cardioD0 /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/results/kga0/2021_0226_pipeline_dcHiC 1
[1] "1"
Rscript /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/src/dcHiC/dchic/cluster.r 0 /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/gencode.v35.annotation.bed u1_a1.cardioD0.bedGraph 2
[1] "2"
curl -H 'Content-Type: text/json' -d '{"Symbols":["gene","ensg00000231510","ensg00000260972","ensg00000284616","ensg00000284666","ensg00000284692","ensg00000116641","ensg00000132849","ensg00000132854","ensg00000162607","ensg00000200174","ensg00000201153","ensg00000234088","ensg00000234204","ensg00000236646","ensg00000237227","ensg00000240563","ensg00000242860","ensg00000263908","ensg00000283690","ensg00000125703","ensg00000132855","ensg00000213703","ensg00000229537","ensg00000234318","ensg00000235545","ensg00000237163","ensg00000270549","ensg00000278967","ensg00000088035","ensg00000142856","ensg00000187140","ensg00000203605","ensg00000223683","ensg00000224209","ensg00000227485","ensg00000228734","ensg00000229225","ensg00000230798","ensg00000236674","ensg00000252259","ensg00000252784","ensg00000275836","ensg00000286429","ensg00000286455","ensg00000064886","ensg00000085465","ensg00000116455","ensg00000116459","ensg00000121933","ensg00000134216","ensg00000134255","ensg00000143110","ensg00000156171","ensg00000162777","ensg00000173947","ensg00000199890","ensg00000200360","ensg00000203878","ensg00000225672","ensg00000227179","ensg00000229283","ensg00000232240","ensg00000233337","ensg00000234020","ensg00000236012","ensg00000236040","ensg00000243960","ensg00000252760","ensg00000260948","ensg00000272982","ensg00000273221","ensg00000282608","ensg00000159455","ensg00000163202","ensg00000163206","ensg00000163207","ensg00000169474","ensg00000169509","ensg00000172155","ensg00000184148","ensg00000185962","ensg00000185966","ensg00000186207","ensg00000186226","ensg00000186844","ensg00000187170","ensg00000187173","ensg00000187180","ensg00000187223","ensg00000187238","ensg00000196734","ensg00000197084","ensg00000198854","ensg00000203786","ensg00000224308","ensg00000226947","ensg00000229713","ensg00000233819","ensg00000235942","ensg00000240386","ensg00000244057","ensg00000283227","ensg00000285753","ensg00000285946","ensg00000143546","ensg00000143556","ensg00000159516","ensg00000159527"
,"ensg00000163209","ensg00000163216","ensg00000163218","ensg00000163220","ensg00000163221","ensg00000169469","ensg00000184330","ensg00000196805","ensg00000197364","ensg00000203781","ensg00000203782","ensg00000203783","ensg00000203784","ensg00000203785","ensg00000207321","ensg00000224784","ensg00000229035","ensg00000229699","ensg00000230779","ensg00000234262","ensg00000237008","ensg00000241794","ensg00000244094","ensg00000252920","ensg00000117036","ensg00000132694","ensg00000224520","ensg00000228239","ensg00000229961","ensg00000235700","ensg00000237842","ensg00000253831","ensg00000271736","ensg00000284592","ensg00000286005","ensg00000286073","ensg00000286151","ensg00000165733","ensg00000196693","ensg00000229630","ensg00000230425","ensg00000231009","ensg00000232109","ensg00000233515","ensg00000233837","ensg00000234420","ensg00000234864","ensg00000234944","ensg00000251783","ensg00000252416","ensg00000259869","ensg00000263795","ensg00000270552","ensg00000270762","ensg00000272319","ensg00000272387","ensg00000277479","ensg00000285884","ensg00000070748","ensg00000178440","ensg00000178645","ensg00000187714","ensg00000197444","ensg00000204149","ensg00000204152","ensg00000222108","ensg00000225830","ensg00000226389","ensg00000227345","ensg00000229870","ensg00000230166","ensg00000271237","ensg00000285803","ensg00000288603","ensg00000099290","ensg00000188611","ensg00000198964","ensg00000225137","ensg00000225303","ensg00000226631","ensg00000233011","ensg00000235618","ensg00000279863","ensg00000286401","ensg00000149452","ensg00000149742","ensg00000168004","ensg00000184999","ensg00000196600","ensg00000197658","ensg00000239924","ensg00000253547","ensg00000256041","ensg00000256181","ensg00000256847","ensg00000256863","ensg00000275598","ensg00000287412","ensg00000111371","ensg00000134294","ensg00000239397","ensg00000257261","ensg00000257496","ensg00000258096","ensg00000274591","ensg00000275481","ensg00000278896","ensg00000139209","ensg00000271642","ensg00000272369","ensg00000272963","
ensg00000274723","ensg00000139211","ensg00000179715","ensg00000199566","ensg00000247774","ensg00000257807","ensg00000257906","ensg00000257924","ensg00000257925","ensg00000258116","ensg00000258181","ensg00000258352","ensg00000258369","ensg00000263838","ensg00000264906","ensg00000276454","ensg00000123338","ensg00000123360","ensg00000135413","ensg00000135426","ensg00000135447","ensg00000161634","ensg00000172551","ensg00000257634","ensg00000257780","ensg00000257824","ensg00000270858","ensg00000123307","ensg00000170605","ensg00000179695","ensg00000179899","ensg00000179919","ensg00000184954","ensg00000185821","ensg00000187857","ensg00000188324","ensg00000196534","ensg00000197706","ensg00000203408","ensg00000205327","ensg00000205328","ensg00000205329","ensg00000205330","ensg00000205331","ensg00000213451","ensg00000224622","ensg00000227423","ensg00000230307","ensg00000233606","ensg00000257350","ensg00000257414","ensg00000257757","ensg00000257870","ensg00000258763","ensg00000111596","ensg00000135643","ensg00000222405","ensg00000257139","ensg00000257815","ensg00000258168","ensg00000279530","ensg00000287132","ensg00000226118","ensg00000237175","ensg00000276476","ensg00000225777","ensg00000226507","ensg00000230535","ensg00000231650","ensg00000232187","ensg00000234685","ensg00000237952","ensg00000253094","ensg00000262198","ensg00000262619","ensg00000283075","ensg00000287357","ensg00000102683","ensg00000151835","ensg00000207157","ensg00000227893","ensg00000229483","ensg00000229558","ensg00000232163","ensg00000232977","ensg00000233440","ensg00000235205","ensg00000236803","ensg00000252952","ensg00000151332","ensg00000188831","ensg00000229415","ensg00000238540","ensg00000252312","ensg00000257520","ensg00000257585","ensg00000257720","ensg00000257826","ensg00000258342","ensg00000258844","ensg00000259104","ensg00000283098","ensg00000211935","ensg00000211937","ensg00000211938","ensg00000211941","ensg00000211942","ensg00000211943","ensg00000211944","ensg00000211945","ensg00000211946","en
sg00000211947","ensg00000211949","ensg00000211950","ensg00000211951","ensg00000211952","ensg00000211955","ensg00000211956","ensg00000211957","ensg00000211958","ensg00000211959","ensg00000228757","ensg00000228966","ensg00000231475","ensg00000232216","ensg00000238275","ensg00000253149","ensg00000253240","ensg00000253294","ensg00000253325","ensg00000253345","ensg00000253359","ensg00000253367","ensg00000253387","ensg00000253412","ensg00000253440","ensg00000253441","ensg00000253458","ensg00000253462","ensg00000253465","ensg00000253467","ensg00000253482","ensg00000253491","ensg00000253587","ensg00000253709","ensg00000253763","ensg00000253780","ensg00000253883","ensg00000253895","ensg00000253957","ensg00000253989","ensg00000254045","ensg00000254046","ensg00000254053","ensg00000254174","ensg00000254203","ensg00000254215","ensg00000254228","ensg00000254289","ensg00000254326","ensg00000270474","ensg00000270550","ensg00000270816","ensg00000271201","ensg00000273894","ensg00000276210","ensg00000276775","ensg00000278473","ensg00000282122","ensg00000282639","ensg00000282651","ensg00000283195","ensg00000283464","ensg00000283562","ensg00000283607","ensg00000283948","ensg00000182256","ensg00000200326","ensg00000214254","ensg00000228740","ensg00000258624","ensg00000258970","ensg00000259152","ensg00000259168","ensg00000261426","ensg00000104044","ensg00000228992","ensg00000232394","ensg00000258594","ensg00000258853","ensg00000287922","ensg00000122254","ensg00000243716","ensg00000257838","ensg00000260905","ensg00000260973","ensg00000277041","ensg00000283213","ensg00000052344","ensg00000089280","ensg00000099365","ensg00000103490","ensg00000103496","ensg00000103507","ensg00000103510","ensg00000140675","ensg00000140678","ensg00000140682","ensg00000140688","ensg00000140691","ensg00000151006","ensg00000156885","ensg00000156886","ensg00000167394","ensg00000167395","ensg00000167397","ensg00000169896","ensg00000169900","ensg00000176723","ensg00000177238","ensg00000178226","ensg00000232748","ensg
00000255439","ensg00000260060","ensg00000260267","ensg00000260304","ensg00000260740","ensg00000260757","ensg00000260911","ensg00000261124","ensg00000261245","ensg00000261359","ensg00000261385","ensg00000261474","ensg00000262366","ensg00000262766","ensg00000263343","ensg00000277543","ensg00000278133","ensg00000280132","ensg00000280160","ensg00000131797","ensg00000169877","ensg00000180663","ensg00000185947","ensg00000197302","ensg00000197476","ensg00000213547","ensg00000237185","ensg00000259810","ensg00000259874","ensg00000259950","ensg00000260010","ensg00000260218","ensg00000260472","ensg00000260568","ensg00000260625","ensg00000260628","ensg00000260631","ensg00000260722","ensg00000260883","ensg00000261284","ensg00000261289","ensg00000261457","ensg00000261475","ensg00000261614","ensg00000261648","ensg00000261731","ensg00000261741","ensg00000276867","ensg00000278885","ensg00000205456","ensg00000223931","ensg00000230267","ensg00000259822","ensg00000260048","ensg00000260307","ensg00000260327","ensg00000260344","ensg00000260402","ensg00000260516","ensg00000260540","ensg00000260575","ensg00000260584","ensg00000260649","ensg00000260662","ensg00000260847","ensg00000260866","ensg00000261127","ensg00000261233","ensg00000261541","ensg00000261704","ensg00000261727","ensg00000270472","ensg00000279997","ensg00000286473","ensg00000200434","ensg00000256642","ensg00000259987","ensg00000259990","ensg00000260087","ensg00000260207","ensg00000261197","ensg00000261440","ensg00000262561","ensg00000279800","ensg00000283065","ensg00000284209","ensg00000286968","ensg00000288300","ensg00000102910","ensg00000121270","ensg00000140798","ensg00000196470","ensg00000240793","ensg00000260347","ensg00000260688","ensg00000261017","ensg00000261538","ensg00000261802","ensg00000275909","ensg00000280067","ensg00000288026","ensg00000108381","ensg00000127780","ensg00000132359","ensg00000141255","ensg00000142163","ensg00000159961","ensg00000172146","ensg00000172150","ensg00000180016","ensg00000180042","ensg00
000180068","ensg00000180090","ensg00000183024","ensg00000184166","ensg00000221882","ensg00000255095","ensg00000261848","ensg00000262085","ensg00000262106","ensg00000262628","ensg00000267129","ensg00000280268","ensg00000285760","ensg00000108684","ensg00000263435","ensg00000264643","ensg00000265115","ensg00000265125","ensg00000265356","ensg00000265544","ensg00000265689","ensg00000265697","ensg00000279668","ensg00000283381","ensg00000283417","ensg00000006059","ensg00000094796","ensg00000108417","ensg00000108516","ensg00000108759","ensg00000126337","ensg00000131737","ensg00000131738","ensg00000171360","ensg00000171396","ensg00000180386","ensg00000186860","ensg00000187272","ensg00000188581","ensg00000196156","ensg00000197079","ensg00000198083","ensg00000198090","ensg00000198271","ensg00000198443","ensg00000204873","ensg00000204880","ensg00000204887","ensg00000212657","ensg00000212658","ensg00000212659","ensg00000212721","ensg00000212722","ensg00000212724","ensg00000212725","ensg00000212901","ensg00000213416","ensg00000213417","ensg00000214518","ensg00000221852","ensg00000221880","ensg00000223125","ensg00000225438","ensg00000226776","ensg00000229351","ensg00000233014","ensg00000234859","ensg00000236473","ensg00000237183","ensg00000237230","ensg00000239886","ensg00000240542","ensg00000240871","ensg00000241241","ensg00000241595","ensg00000244537","ensg00000248807","ensg00000251439","ensg00000287602","ensg00000067900","ensg00000141449","ensg00000221139","ensg00000244527","ensg00000251886","ensg00000263748","ensg00000265751","ensg00000265948","ensg00000265984","ensg00000134504","ensg00000154080","ensg00000171885","ensg00000260372","ensg00000263382","ensg00000263677","ensg00000263846","ensg00000265369","ensg00000266184","ensg00000266549","ensg00000275805","ensg00000275900","ensg00000276221","ensg00000277534","ensg00000105568","ensg00000142556","ensg00000167554","ensg00000167555","ensg00000196214","ensg00000196267","ensg00000197608","ensg00000197619","ensg00000198464","ensg0000
0198633","ensg00000204611","ensg00000207265","ensg00000208002","ensg00000221923","ensg00000243680","ensg00000256087","ensg00000258405","ensg00000260160","ensg00000267827","ensg00000267927","ensg00000268015","ensg00000268458","ensg00000269102","ensg00000269535","ensg00000269776","ensg00000269834","ensg00000270248","ensg00000274380","ensg00000275055","ensg00000277562","ensg00000277977","ensg00000278543","ensg00000288253","ensg00000114999","ensg00000115008","ensg00000125538","ensg00000125571","ensg00000125611","ensg00000125630","ensg00000136688","ensg00000144130","ensg00000144136","ensg00000169607","ensg00000180152","ensg00000207383","ensg00000227368","ensg00000228251","ensg00000231747","ensg00000232090","ensg00000236124","ensg00000237753","ensg00000243389","ensg00000280228","ensg00000287937","ensg00000125618","ensg00000125637","ensg00000136682","ensg00000136689","ensg00000136694","ensg00000136695","ensg00000136696","ensg00000136697","ensg00000184492","ensg00000189223","ensg00000201805","ensg00000231292","ensg00000234174","ensg00000234997","ensg00000272563","ensg00000080293","ensg00000115107","ensg00000144119","ensg00000155368","ensg00000171227","ensg00000186132","ensg00000229867","ensg00000231013","ensg00000264833","ensg00000100987","ensg00000100994","ensg00000100997","ensg00000101003","ensg00000101004","ensg00000154930","ensg00000197586","ensg00000202414","ensg00000225069","ensg00000225344","ensg00000227379","ensg00000230725","ensg00000274414","ensg00000274507","ensg00000275358","ensg00000276952","ensg00000277938","ensg00000279322","ensg00000286472","ensg00000101109","ensg00000101443","ensg00000124102","ensg00000124107","ensg00000124134","ensg00000124145","ensg00000124155","ensg00000124157","ensg00000124159","ensg00000124232","ensg00000124233","ensg00000124251","ensg00000168703","ensg00000175121","ensg00000204070","ensg00000232880","ensg00000233352","ensg00000237068","ensg00000237464","ensg00000243995","ensg00000244274","ensg00000252021","ensg00000254806","ensg000002
73555","ensg00000275894","ensg00000277022","ensg00000283142","ensg00000264063","ensg00000264462","ensg00000278931","ensg00000279167","ensg00000279213","ensg00000279501","ensg00000279579","ensg00000279615","ensg00000279990","ensg00000280243","ensg00000286033","ensg00000177822","ensg00000225356","ensg00000251336","ensg00000251433","ensg00000251742","ensg00000286860","ensg00000287948","ensg00000122012","ensg00000145703","ensg00000248127","ensg00000249014","ensg00000249777","ensg00000250348","ensg00000251107","ensg00000251235","ensg00000251342","ensg00000251668","ensg00000252833","ensg00000254893","ensg00000113391","ensg00000185261","ensg00000286577","ensg00000133302","ensg00000175471","ensg00000232578","ensg00000243806","ensg00000249175","ensg00000249545","ensg00000251340","ensg00000251544","ensg00000254132","ensg00000270133","ensg00000276514","ensg00000079819","ensg00000118507","ensg00000118520","ensg00000218857","ensg00000219776","ensg00000130363","ensg00000164691","ensg00000164694","ensg00000224478","ensg00000226032","ensg00000231178","ensg00000233682","ensg00000234777","ensg00000235086","ensg00000271913","ensg00000285492","ensg00000286533","ensg00000112096","ensg00000112110","ensg00000120437","ensg00000120438","ensg00000130368","ensg00000146453","ensg00000146457","ensg00000197081","ensg00000206910","ensg00000207392","ensg00000216480","ensg00000220305","ensg00000236823","ensg00000237927","ensg00000251988","ensg00000276413","ensg00000285427","ensg00000112499","ensg00000146477","ensg00000175003","ensg00000213071","ensg00000213073","ensg00000216516","ensg00000230234","ensg00000268257","ensg00000287656","ensg00000152926","ensg00000173041","ensg00000182722","ensg00000189316","ensg00000196247","ensg00000197008","ensg00000198039","ensg00000213462","ensg00000213640","ensg00000213642","ensg00000223476","ensg00000223974","ensg00000224172","ensg00000224669","ensg00000228653","ensg00000234338","ensg00000235349","ensg00000270948","ensg00000271550","ensg00000275667","ensg00000276
475","ensg00000277206","ensg00000286342","ensg00000286456","ensg00000287317","ensg00000287580","ensg00000287869","ensg00000002726","ensg00000002933","ensg00000055118","ensg00000106560","ensg00000106565","ensg00000133561","ensg00000133574","ensg00000164867","ensg00000177590","ensg00000179144","ensg00000196329","ensg00000213203","ensg00000213205","ensg00000232361","ensg00000241134","ensg00000243853","ensg00000270990","ensg00000271568","ensg00000281887","ensg00000105982","ensg00000105983","ensg00000130675","ensg00000146909","ensg00000182648","ensg00000206938","ensg00000224903","ensg00000230033","ensg00000234450","ensg00000279418","ensg00000164808","ensg00000188873","ensg00000200986","ensg00000215177","ensg00000222099","ensg00000248347","ensg00000248531","ensg00000253502","ensg00000253817","ensg00000254348","ensg00000255366","ensg00000285992","ensg00000046889","ensg00000165078","ensg00000254253","ensg00000255130","ensg00000255531","ensg00000165084","ensg00000248801","ensg00000252210","ensg00000253964","ensg00000254051","ensg00000255206","ensg00000008513","ensg00000212273","ensg00000253165","ensg00000253561","ensg00000253593","ensg00000253916","ensg00000253970","ensg00000254313","ensg00000261220","ensg00000271240","ensg00000276193","ensg00000277576","ensg00000283176","ensg00000283197","ensg00000276128","ensg00000277737","ensg00000286506","ensg00000286916","ensg00000287800","ensg00000154330","ensg00000187559","ensg00000196873","ensg00000224958","ensg00000225337","ensg00000226904","ensg00000229019","ensg00000233178","ensg00000234394","ensg00000279706","ensg00000107242","ensg00000181778","ensg00000187866","ensg00000207000","ensg00000224025","ensg00000226337","ensg00000234506","ensg00000236733","ensg00000236998","ensg00000095303","ensg00000106852","ensg00000119421","ensg00000119446","ensg00000136834","ensg00000148187","ensg00000175764","ensg00000185681","ensg00000233616","ensg00000234156","ensg00000266583","ensg00000269970","ensg00000011454","ensg00000056586","ensg0000013693
9","ensg00000136940","ensg00000148215","ensg00000165202","ensg00000165204","ensg00000171448","ensg00000171459","ensg00000171481","ensg00000171496","ensg00000171501","ensg00000171505","ensg00000173679","ensg00000186130","ensg00000197233","ensg00000212447","ensg00000222351","ensg00000226783","ensg00000228914","ensg00000232387","ensg00000233425","ensg00000239590","ensg00000261094","ensg00000280094","ensg00000286718"]}' https://toppgene.cchmc.org/API/lookup
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 64912 0 46532 100 18380 32562 12862 0:00:01 0:00:01 --:--:-- 45393
curl -H 'Content-Type: text/json' -d '{"Genes":[105376686,85440,10207,163782,7398,106479731,100271142,54596,106480970,100422902,100422946,84938,27329,105378769,29929,23421,27022,107075109,199899,100996301,441887,100873288,102464823,102724319,1117,5016,79084,515,57413,27159,10390,128346,128338,79961,128344,106481431,149620,441897,106480360,100420342,100873292,140,26239,84648,4184,3713,6698,54544,353134,163778,353142,353145,254910,353135,353131,199834,353139,353140,353141,353143,353133,100129271,448834,101927988,450210,450209,450208,448835,353137,353144,110806278,450211,6279,6278,6706,114771,6707,6703,57115,6280,6283,6699,338324,6701,645922,127481,4014,574414,149018,6704,106479637,112488748,101928009,391102,6700,6705,2117,9826,149501,360155,440695,101929959,7582,104266963,728064,100312807,101929397,143341,100129482,107984178,101929445,106480081,106481458,100847014,1103,113218477,282966,6572,55753,101060581,100652748,106479006,267004,728407,100271422,109729125,109617009,101930591,56624,259230,728532,728990,100421577,9376,114571,117245,387775,387601,283238,644436,100127954,100533643,81539,54407,100129799,55089,100421550,387853,347902,91523,100233209,100127978,111082992,100506099,100288129,100616486,100616478,3071,5153,90070,9840,5502,117159,118430,644076,58158,441639,341416,121364,390327,390326,390323,283365,254783,403284,403282,254786,390321,403288,403285,81140,390318,390320,390322,4848,27345,106481192,101928062,100873208,100874501,100506622,100873817,6445,26278,100873808,100506680,100874124,100506697,100419955,51562,400206,253970,106480813,106480839,101101773,28473,28457,28452,100293211,101930405,28448,28447,28468,28445,28444,28442,28467,28455,28400,28434,28395,28432,28429,28394,28398,28426,28372,28431,28382,28374,28430,28348,28354,28351,28433,28441,28376,28347,28373,28371,28438,28470,28453,28355,28446,28366,28443,28346,28469,28353,28471,28369,28427,28440,28383,28435,28439,338005,102724971,28428,57289,102723170,28367,107548099,28350,2567,100873644,100420466,101928869,
4948,100271207,9956,100132247,653786,5652,2521,112755,29108,6810,10295,84148,6524,3687,7041,64755,79798,339105,1339,3681,79759,9726,79001,3684,260434,283933,493829,146547,100652740,106479052,100132341,51327,10308,107983990,730196,100129315,100131641,100131118,100128384,100130603,102724018,102724127,100533705,28307,100873571,649159,106481738,100887074,731605,100507577,85320,94160,6477,443,8388,23108,84690,8392,8383,26189,8387,9596,390756,4994,8390,4991,4995,653166,100288728,8391,8386,100856809,40,100506677,3883,3881,8688,3882,8689,100653049,3884,8687,84616,100505724,83902,83901,81851,85290,3886,81870,81871,85289,85285,83900,728224,728255,100505753,100533177,100507608,653240,100132386,730755,81872,83896,83755,728279,83895,81850,106481644,85345,106480773,100505782,106480422,85343,728318,101930568,85280,85291,6093,80000,440487,106479619,100128324,284252,83539,361,147429,105372035,728606,102466874,5518,80110,162963,84436,90321,162962,102725206,284370,147657,147658,90317,693228,400713,9668,147660,102724105,102466984,100312842,150465,3552,3553,27178,84269,84172,56300,284958,6574,150468,100128413,7849,23550,150472,3557,27179,26525,27177,84639,200350,654433,106481552,6344,55240,165257,1622,140738,130355,100874111,107105282,106481050,30813,5834,26090,9837,22981,84532,955,284798,6789,10406,5266,6590,3787,6385,51604,6407,8785,11317,6406,27296,128488,149708,90196,107075105,140749,55861,767557,102465487,103504728,103504731,105377573,100128118,106479100,22987,10788,106480689,100873448,83989,285600,84250,79772,100533629,100270852,113523641,109729150,100873257,106478942,2037,9465,383,100421246,100271180,83861,117289,84624,102724053,105378083,100129518,29074,39,677812,4142,154197,9589,3482,677806,642738,100132803,106481866,6582,6581,6580,80350,729603,100271873,109504726,340252,51427,7697,102724456,2086,106481697,100127907,100418814,106480286,102465505,26,55365,3757,26157,28959,474344,55303,4846,474345,168537,100527949,170575,100288724,100874395,100874394,140545,64327,3110,64434,10050
6380,106480546,101927858,100129517,64433,23514,253986,106479928,100996586,106479109,100128541,80243,57094,100132812,106480785,116328,286189,6482,101927798,101927822,107075177,100873179,100131760,102724904,5239,286380,101060578,572558,8395,169693,116224,106479929,347097,105376072,101927015,5742,26468,4702,92400,347168,92399,158135,254956,100616312,23637,54542,254973,5082,392391,158131,392392,57684,392390,26735,138881,138882,138883,26737,10773,26740,692206,106481711,26742,100631239,26219,347169],"Categories":[{"Type": "GeneOntologyBiologicalProcess", "PValue": 0.05, "MinGenes": 1, "MaxGenes": 1500, "MaxResults": 30, "Correction": "FDR"}]}' https://toppgene.cchmc.org/API/enrich
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 57257 0 52651 100 4606 65081 5693 --:--:-- --:--:-- --:--:-- 70775
Traceback (most recent call last):
File "/Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/src/dcHiC/dchic/gofilter.py", line 154, in <module>
GeneList = results[a]['Genes'][0]['Symbol']
KeyError: 0
I encounter the same errors when using .bed files with gene names and Entrez IDs, e.g.,
chr1 11868 14409 DDX11L1
chr1 14403 29570 NA
chr1 17368 17436 MIR6859-1
chr1 29553 31109 NA
chr1 30365 30503 MIR1302-2
chr1 34553 36081 FAM138A
chr1 52472 53312 NA
chr1 57597 64116 NA
chr1 65418 71585 OR4F5
chr1 89294 133723 LOC100996442
chr1 11868 14409 100287102
chr1 14403 29570 NA
chr1 17368 17436 102466751
chr1 29553 31109 NA
chr1 30365 30503 100302278
chr1 34553 36081 645520
chr1 52472 53312 NA
chr1 57597 64116 NA
chr1 65418 71585 79501
chr1 89294 133723 100996442
When I use the .bed file in your hg38_goldenpathData directory from Dropbox, I encounter this error:
Slack Given In GO Analysis: 0
Rscript /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/src/dcHiC/dchic/cluster.r direction Active2Inactive cardioD2,cardioD5,cardioD14,endoD0,endoD2,endoD6,endoD14 cardioD0 /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/results/kga0/2021_0226_pipeline_dcHiC 1
[1] "1"
Rscript /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/src/dcHiC/dchic/cluster.r 0 /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/data/hg38_goldenpathData/hg38.refGene.bed u1_a1.cardioD0.bedGraph 2
[1] "2"
*****
***** ERROR: Requested column 7, but database file - only has fields 1 - 6.
Error in read.table(text = system(cmd, wait = T, intern = T), h = F) :
no lines available in input
Calls: mapGenes -> read.table
Execution halted
Head of the .bed in the hg38_goldenpathData directory, hg38.refGene.bed:
chr1 11874 11875
chr1 17435 17436
chr1 17435 17436
chr1 17435 17436
chr1 17435 17436
chr1 29369 29370
chr1 30366 30367
chr1 30366 30367
chr1 30366 30367
chr1 30366 30367
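The "Requested column 7" message above is about field counts, so a quick check of how many columns each .bed file carries may help when comparing this file with the four-column ones. The sketch below is only a diagnostic; the helper name bed_column_counts is hypothetical, and whether the column count is the actual cause of the failure is an assumption.

```python
from collections import Counter


def bed_column_counts(lines):
    """Count how many whitespace-separated fields each BED data line has."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional header lines of the BED format.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        counts[len(line.split())] += 1
    return counts


# Rows copied from the two kinds of .bed files shown in this issue.
three_col = ["chr1 11874 11875", "chr1 17435 17436"]
four_col = ["chr1 11868 14409 DDX11L1", "chr1 65418 71585 OR4F5"]

print(bed_column_counts(three_col))
print(bed_column_counts(four_col))
```

To run it on a real file, pass an open file handle instead of a list, for example bed_column_counts(open("hg38.refGene.bed")).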
I also tried a .bed with gene names and without NAs, similar to your mm10 file mm10_gene_pos.bed; head of the .bed without NAs:
chr1 11868 14409 DDX11L1
chr1 17368 17436 MIR6859-1
chr1 30365 30503 MIR1302-2
chr1 34553 36081 FAM138A
chr1 65418 71585 OR4F5
chr1 89294 133723 LOC100996442
chr1 187890 187958 MIR6859-2
chr1 450702 451697 OR4F29
chr1 586070 827796 LOC101928626
chr1 685678 686673 OR4F16
The call:
python /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/src/dcHiC/dchic/gofilter.py \
-dir /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/results/kga0/2021_0226_pipeline_dcHiC \
-diffcompt /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/results/kga0/2021_0226_pipeline_dcHiC/DifferentialCompartment/MultiComparison_differential_compartments.bedGraph \
-config config_cardioD0.txt \
-outprefix u1_a1.cardioD0 \
-genome hg38 \
-geneBed /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/gencode.v35.annotation.2.noNA.bed \
-runOption 1 \
-orientation 1
The error:
Slack Given In GO Analysis: 0
Rscript /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/src/dcHiC/dchic/cluster.r direction Active2Inactive cardioD2,cardioD5,cardioD14,endoD0,endoD2,endoD6,endoD14 cardioD0 /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/results/kga0/2021_0226_pipeline_dcHiC 1
[1] "1"
Rscript /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/src/dcHiC/dchic/cluster.r 0 /Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/gencode.v35.annotation.2.noNA.bed u1_a1.cardioD0.bedGraph 2
[1] "2"
curl -H 'Content-Type: text/json' -d '{"Symbols":["gene","loc105376686","dock7","kank4","l1td1","mir3116-1","mir3116-2","patj","usp1","angptl3","atg4c","linc01739","alg6","foxd3","foxd3-as1","itgb3bp","linc00466","loc102724319","mir6068","adora3","atp5pb","c1orf162","cept1","chi3l2","chia","chiap2","dennd2d","dram2","ovgp1","pgbp","pifo","tmigd3","wdr77","c1orf68","crct1","ivl","kprp","lce1a","lce1b","lce1c","lce1d","lce1e","lce1f","lce2a","lce2b","lce2c","lce2d","lce3a","lce3b","lce3c","lce3d","lce3e","lce4a","lce5a","lce6a","linc01527","smcp","sprr1a","sprr4","lelp1","loc101928009","lor","pglyrp3","pglyrp4","prr9","s100a12","s100a7","s100a7a","s100a8","s100a9","sprr1b","sprr2a","sprr2b","sprr2d","sprr2e","sprr2f","sprr2g","sprr3","arhgef11","cycsp52","etv3","etv3l","bms1","linc01264","linc01518","linc02623","mir5100","znf33b","znf37bp","agap6","c10orf53","chat","ercc6","linc00843","ogdhl","parg","slc18a3","timm23b","asah2","fam21ep","sgms1","washc2a","plaat5","slc22a10","slc22a24","slc22a25","slc22a8","slc22a9","slc38a1","slc38a2","slc38a4","amigo2","mir4494","mir4698","pced1b","pced1b-as1","dcd","glycam1","lacrt","mucl1","nckap1l","pde1b","ppp1r1a","tespa1","neurod4","or10a7","or6c1","or6c2","or6c3","or6c6","or6c65","or6c68","or6c70","or6c74","or6c75","or6c76","or9k2","cnot2","cnot2-dt","kcnmb4","linc00540","linc00327","sacs","sacs-as1","sgcg","linc00609","mbip","sfta3","linc00221","gabrg3","gabrg3-as1","oca2","hs3st2","npipb5","otoap1","armc5","bckdk","c16orf58","cox6a2","fus","itgad","itgam","itgax","kat8","prss36","prss53","prss8","pycard","pycard-as1","pydc1","slc5a2","stx1b","stx4","tgfb1i1","trim72","vkorc1","znf646","znf668","znf843","ahsp","cluhp3","frg2kp","vn1r3","znf267","znf720","tp53tg3d","abcc11","abcc12","lonp2","siah1","aspa","loc100288728","or1a1","or1a2","or1d2","or1d4","or1d5","or1e1","or1e2","or1e3","or1g1","or3a1","or3a2","or3a3","or3a4p","rap1gap2","spata22","aa06","asic2","krt31","krt32","krt33a","krt33b","krt34","krt35","krt36","krt37","kr
t38","krtap1-1","krtap1-3","krtap1-4","krtap1-5","krtap16-1","krtap17-1","krtap2-1","krtap2-2","krtap2-3","krtap2-4","krtap29-1","krtap3-1","krtap4-1","krtap4-11","krtap4-12","krtap4-2","krtap4-3","krtap4-4","krtap4-5","krtap4-6","krtap4-7","krtap4-8","krtap4-9","krtap9-1","krtap9-2","krtap9-3","krtap9-4","krtap9-6","krtap9-7","krtap9-8","krtap9-9","loc100505782","greb1l","rock1","aqp4","aqp4-as1","chst9","kctd1","loc105372035","mir8057","pcat18","mir643","mir6801","ppp2r1a","znf432","znf480","znf528","znf528-as1","znf534","znf578","znf610","znf614","znf615","znf616","znf766","znf836","znf841","znf880","chchd5","ckap2l","il1a","il1b","il36g","il37","nt5dc4","polr1b","slc20a1","ttl","cbwd2","foxd4l1","il1f10","il1rn","il36a","il36b","il36rn","pax8","pax8-as1","psd4","c1ql2","c2orf76","dbi","loc107105282","sctr","steap3","steap3-as1","tmem37","abhd12","acss1","entpd6","gins1","loc284798","ninl","pygb","vsx1","dbndd2","kcns1","matn4","mir6812","pi3","pigt","rbpjl","sdc4","semg1","semg2","slpi","stk4","sys1","sys1-dbndd2","tp53tg5","wfdc12","wfdc2","wfdc5","loc101930100","mir3648-2","temn3-as1","iqgap2","sv2c","fam172a","kiaa0825","mctp1","slf1","akap7","arg1","epb41l2","fndc1","linc02529","loc105378083","loc112267968","rsph3","tagap","acat2","igf2r","mas1","mrpl18","pnldc1","snora20","snora29","sod2","sod2-ot1","tcp1","wtap","airn","loc729603","lpal2","slc22a1","slc22a2","slc22a3","erv3-1","loc441239","mir6839","znf107","znf117","znf138","znf273","znf680","aoc1","gimap1","gimap1-gimap5","gimap2","gimap4","gimap5","gimap6","gimap7","kcnh2","nos3","tmem176a","tmem176b","linc00244","linc01006","lmbr1","loc101927858","mnx1","nom1","rnf32","spidr","cpa6","prex2","c8orf34","c8orf34-as1","loc101927798","loc101927822","st3gal1","loc102724904","cbwd3","foxd4l3","pgm5","pgm5-as1","fam122a","linc01506","pip5k1b","tmem252","tmem252-dt","lhx6","mir4478","morn5","mrrf","ndufa8","or1j1","ptgs1","rbm18","ttll11","gpr21","or1b1","or1j2","or1j4","or1k1","or1l1","or1l3","or1l4","or1l6","
or1l8","or1n1","or1n2","or1q1","or5c1","pdcl","rc3h2","snord90","zbtb26","zbtb6"]}' https://toppgene.cchmc.org/API/lookup
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 32235 0 28191 100 4044 43774 6279 --:--:-- --:--:-- --:--:-- 50132
curl -H 'Content-Type: text/json' -d '{"Genes":[105376686,85440,163782,54596,100422902,100422946,10207,7398,27329,84938,105378769,29929,27022,100996301,23421,199899,102724319,102464823,140,515,128346,10390,1117,27159,149620,79961,128338,5016,441897,128344,57413,79084,100129271,54544,3713,448834,353131,353132,353133,353134,353135,353137,353139,26239,353140,353141,353142,353143,353144,84648,353145,199834,254910,448835,101927988,4184,6698,163778,149018,101928009,4017,114771,57115,574414,6283,6278,338324,6279,6280,6699,6700,6701,6703,6704,6705,6706,6707,9826,360155,2117,440695,9790,104266963,101929397,101929445,100847014,7582,100129482,414189,282966,1103,2074,102902672,55753,8505,6572,100652748,56624,100421577,259230,387680,117245,387775,283238,387601,9376,114571,81539,54407,55089,347902,100616478,100616486,91523,100233209,117159,644076,90070,118430,3071,5153,5502,9840,58158,121364,390321,341416,254786,283365,403282,403284,390327,254783,390323,390326,441639,4848,101928062,27345,100506622,100506697,26278,100506680,6445,101101773,51562,253970,338005,2567,101928869,4948,9956,100132247,653786,79798,10295,64755,1339,2521,3681,3684,3687,84148,146547,339105,5652,29108,100652740,260434,6524,112755,6810,7041,493829,79001,9726,79759,283933,51327,100132341,102724018,317702,10308,124411,729264,85320,94160,83752,6477,443,100288728,8383,26189,4991,653166,8386,8387,8388,8389,8390,4994,4995,8392,390756,23108,84690,100506677,40,3881,3882,3883,3884,3885,3886,8689,8688,8687,81851,81850,728255,83895,100505753,83902,81872,728279,730755,85294,100533177,83896,85285,653240,83755,85291,85290,84616,85289,81871,100132476,728224,100132386,728318,83899,83900,85280,100507608,100505724,83901,81870,100505782,80000,6093,361,147429,83539,284252,105372035,102466874,728606,693228,102466984,5518,9668,147657,84436,102724105,147658,147660,162963,80110,284370,90317,90321,162962,284371,400713,84269,150468,3552,3553,56300,27178,284958,84172,6574,150465,150472,200350,84639,3557,27179,27177,26525,7849,654433,2355
0,165257,130355,1622,6344,55240,100874111,140738,26090,84532,955,9837,284798,22981,5834,30813,55861,3787,8785,102465487,5266,51604,11317,6385,6406,6407,6590,6789,90196,767557,27296,128488,10406,149708,101930100,103504731,105377573,10788,22987,83989,285600,79772,84250,9465,383,2037,84624,102724053,105378083,112267968,83861,117289,39,3482,4142,29074,154197,677806,677812,6648,100129518,6950,9589,100271873,80350,6580,6582,6581,2086,441239,102465505,51427,51351,7697,10793,340252,26,170575,100527949,26157,55303,55340,474344,168537,3757,4846,55365,28959,64433,100506380,64327,3110,64434,140545,23514,57094,80243,116328,286189,101927798,101927822,6482,102724904,445571,286380,5239,572558,116224,101927015,8395,169693,105376072,26468,100616312,254956,92399,4702,347168,5742,92400,158135,2844,347169,26740,26219,392392,26737,26735,254973,392390,138881,138883,138882,158131,392391,5082,54542,692206,57684,10773],"Categories":[{"Type": "GeneOntologyBiologicalProcess", "PValue": 0.05, "MinGenes": 1, "MaxGenes": 1500, "MaxResults": 30, "Correction": "FDR"}]}' https://toppgene.cchmc.org/API/enrich
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 54105 0 51091 100 3014 69322 4089 --:--:-- --:--:-- --:--:-- 73313
Traceback (most recent call last):
File "/Volumes/SSHD/Dropbox/UW/projects/2020_endothelial-diff/src/dcHiC/dchic/gofilter.py", line 154, in <module>
GeneList = results[a]['Genes'][0]['Symbol']
KeyError: 0
Hi!
I installed the latest version of dchic. Everything is fine, but I can't install hashmap.
I'm using R version 3.6.1, and I saw that hashmap was built under version 3.2.1.
When I tried to install it under version 3.2.1, I ran into a problem with the R library bigstatsr, which I can't install for that R version.
Any suggestions?
Hello,
The first step, Rscript dchicf.r --file input_files.txt --pcatype cis, works very well for the compartment calling.
But I get an error in the second step, Rscript dchicf.r --file input_files.txt --pcatype select:
Error in hclust(as.dist(round(1 - cor(pc.mat), 4))) :
NA/NaN/Inf in foreign function call (arg 10)
Calls: pcselect -> pcselectioncore -> hclust
Execution halted
It seems to work for the first two samples, as stdout returns:
Running intra 1 in poll_0197 sample
Running intra 1 in poll_3654 sample
Here are my input files:
Bovin-0197.ARS-UCD1.2.mapq_10.50000.txt Bovin-0197.ARS-UCD1.2.mapq_10.50000.bed poll_0197 poll
Bovin-3654.ARS-UCD1.2.mapq_10.50000.txt Bovin-3654.ARS-UCD1.2.mapq_10.50000.bed poll_3654 poll
Bovin-669.ARS-UCD1.2.mapq_10.50000.txt Bovin-669.ARS-UCD1.2.mapq_10.50000.bed unp_669 unp
Bovin-977.ARS-UCD1.2.mapq_10.50000.txt Bovin-977.ARS-UCD1.2.mapq_10.50000.bed unp_977 unp
Any idea?
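One known way for hclust(as.dist(1 - cor(...))) to fail like this is a column with zero variance: R's cor() returns NA for it, and the NA carries through to the distance matrix. Whether that is what happens here is an assumption. The Python sketch below lists sample pairs whose Pearson correlation is undefined; the helper names and all values are made up, and only the sample names come from the input file above.

```python
import math


def pearson(x, y):
    """Pearson correlation; NaN when either vector has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def undefined_pairs(columns):
    """Return the pairs of column names whose correlation is NaN."""
    names = list(columns)
    bad = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.isnan(pearson(columns[a], columns[b])):
                bad.append((a, b))
    return bad


# Invented PC values; the third sample is constant on purpose.
pcs = {
    "poll_0197": [0.1, -0.2, 0.3],
    "poll_3654": [0.2, -0.1, 0.4],
    "unp_669": [0.0, 0.0, 0.0],
}
print(undefined_pairs(pcs))
```

If a sample shows up in every reported pair, its PC values for that chromosome are the first place to look.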
I'm running an analysis where I have no replicates, and was following the wiki instructions. However, analysis of chr Y failed, and I found out that it fails at this line, as the humanparams.txt file does not contain info on chr Y. This info also seems to be missing from the mouse file. Is this intentional (I'm not an expert on HiC analysis)?
Thank you for developing this tool.
I have finished running the visualization step without error and got the html file. However, it shows an error when I open the html file.
My command is:
Rscript dchicf.r --file input_smed.txt --pcatype viz --diffdir smed_dchic_150Kb --gfolder smed_150000_goldenpathData --genome g4w.
Is it because the web IGV doesn't have my genome? How can I open it locally?
Thanks!
Hi,
Thanks for providing this useful tool! I ran into the following error:
Performing block wise correlation calculation
Error: upper value must be greater than lower value
Execution halted
when I tried to run
Rscript /data/kun/Softwares/dcHiC/demo/dcHiC_demo/scripts/dchicf.r --file input.NT_PT_RT.txt --pcatype cis --dirovwt T --cthread 1 --pthread 1
It looks like this error arises while generating chrY.pca.txt. I have attached the files used to generate chrY.pca.txt. Thanks for your help!
chrY.txt
chrY.precmat.txt
chrY.bed.txt
chrY.distparam.txt
Best,
Kun