xiaotaowang / hicpeaks
A Python implementation for BH-FDR and HiCCUPS
License: GNU General Public License v3.0
Hi,
Is there a way to call peaks genome-wide instead of chromosome by chromosome? I guess I could make multiple chrxx_chrxx.txt files and then concatenate all the calls, but I was hoping there was a more streamlined way of doing this.
Hi,
Thank you for your work.
When I use toCooler to convert the matrix obtained from HiC-Pro into cool format, an error occurred:
Traceback (most recent call last):
  File "/opt/conda/bin/toCooler", line 131, in <module>
    run()
  File "/opt/conda/bin/toCooler", line 112, in run
    from hicpeaks.utilities import Genome, balance
  File "/opt/conda/lib/python2.7/site-packages/hicpeaks/utilities.py", line 13, in <module>
    from cooler.io import create, parse_cooler_uri, CoolerMerger
ImportError: cannot import name parse_cooler_uri
So I edited the imports in utilities.py as follows:
from cooler.util import binnify, parse_cooler_uri
from cooler.io import create
from cooler.reduce import CoolerMerger
and it passed. Please check and confirm.
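For anyone hitting the same ImportError: the edit above suggests that these names live in different modules depending on the installed cooler version. Below is a minimal sketch of a version-tolerant import, assuming only the module paths quoted in this thread. `import_first` is a hypothetical helper, not part of hicpeaks or cooler, and the demo uses standard-library modules so that it runs anywhere.

```python
import importlib


def import_first(name, candidates):
    """Return attribute `name` from the first module in `candidates` that has it."""
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing in this installation, try the next one
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError("cannot import name %r from any of %r" % (name, candidates))


# Demo with standard-library modules: "string" has no sqrt, "math" does.
# For the case in this thread the call would look like
#   import_first("parse_cooler_uri", ["cooler.io", "cooler.util"])
sqrt = import_first("sqrt", ["string", "math"])
```

The helper only hides the module move; if a function's behaviour also changed between versions, pinning the cooler version is probably the safer route.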
In pyHICCUPS, I get the following error:
ValueError: Offset 1687 (index 1687) out of bounds
It occurs in sparse.diags in the following code chunk (~line 135). Do you know what causes it?
H = Lib.matrix(balance=False, sparse=True).fetch(key)
cHeatMap = Lib.matrix(balance=True, sparse=True).fetch(key)
# Customize Sparse Matrix ...
chromLen = H.shape[0]
num = args.maxapart // resolution + args.maxww + 1
Diags = [H.diagonal(i) for i in np.arange(num)]
M = sparse.diags(Diags, np.arange(num), format='csr')
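A possible cause, offered as a guess rather than a confirmed diagnosis: when `num` is larger than the number of bins in the chromosome, the requested offsets run past the edge of the matrix and `sparse.diags` rejects them. Below is a minimal sketch that clamps the number of diagonals to the matrix size. `banded_upper` is a hypothetical helper name and the 5x5 matrix is toy data.

```python
import numpy as np
from scipy import sparse


def banded_upper(H, num):
    """Keep only the first `num` upper diagonals of a square sparse matrix."""
    n = H.shape[0]
    num = min(num, n)  # offsets must stay below the matrix dimension
    offsets = np.arange(num)
    diags = [H.diagonal(int(k)) for k in offsets]
    return sparse.diags(diags, offsets, shape=(n, n), format='csr')


# Toy 5x5 "chromosome": asking for 10 diagonals would overshoot without the clamp.
H = sparse.csr_matrix(np.arange(25).reshape(5, 5))
M = banded_upper(H, num=10)
```

In the chunk above this would correspond to something like `num = min(num, chromLen)` before building `Diags`, assuming short chromosomes are what triggers the error.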
Hi XiaoTao,
It looks like your pyHICCUPS implementation does not support inter-chromosomal loops. Is there a quick fix to the code to output inter-chromosomal loops, or have you already implemented it?
Thanks!
Dear,
When I use the provided example to call loops with the command below:
python ../scripts/pyHICCUPS -O K562-MboI-HICCUPS-loops.txt -p K562-MboI-parts.cool::40000 --pw 1 --ww 3
an error occurred:
root INFO @ 09/03/18 13:59:59: Loading Hi-C data ...
root INFO @ 09/03/18 13:59:59: Calling Peaks ...
root INFO @ 09/03/18 13:59:59: Chromosome 21 ...
Traceback (most recent call last):
  File "../scripts/pyHICCUPS", line 522, in <module>
    run()
  File "../scripts/pyHICCUPS", line 181, in run
    results = map_(worker, Params)
  File "../scripts/pyHICCUPS", line 130, in worker
    Diags = [H.diagonal(i) for i in np.arange(num)]
TypeError: diagonal() takes exactly 1 argument (2 given)
Could you help me?
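This TypeError usually means the installed scipy is old enough that the sparse `diagonal()` method takes no offset argument. To my knowledge the `k` argument arrived in SciPy 1.0, but please treat that version number as an assumption. Below is a small sketch for checking whether a method accepts a given parameter before relying on it. `accepts_param` and the two toy classes are hypothetical, and `inspect.signature` needs Python 3.

```python
import inspect


def accepts_param(func, name):
    """Return True if `func` declares a parameter called `name`."""
    return name in inspect.signature(func).parameters


class OldMatrix:
    def diagonal(self):        # mimics an API without an offset argument
        return [1, 2, 3]


class NewMatrix:
    def diagonal(self, k=0):   # mimics an API that accepts an offset
        return [1, 2, 3]


old_ok = accepts_param(OldMatrix().diagonal, "k")
new_ok = accepts_param(NewMatrix().diagonal, "k")
```

If the check comes back negative for the real matrix class, upgrading scipy is likely simpler than patching the script.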
Hi, thank you very much for this tool. I have tried it and was pleasantly surprised: it is very easy to use and fast!
I do however have a small question about the output coordinates of pyHICCUPS. What do they correspond to? What is the difference between loc_1 and centroid_x? Sometimes they are the same, and sometimes they are not... And how is radius determined?
Thank you,
Ilya
Hello Xiaotao,
I am a postdoc at HZAU and am now learning Hi-C data analysis. Recently I have been using the HiCPeaks software to convert the raw matrix generated by HiC-Pro to a cool file, and I have run into some problems that I can't solve.
Following your guidelines, I tried to extract the interaction information for chr01 from the raw matrix HPC9_150000.matrix. According to the file HPC9_150000_abs.bed, chr01 is binned into 754 windows, so I generated a file with the command
awk '$1<=754&&$2<=754{print}' HPC9_150000.matrix >1_1.txt
head -5 HPC9_150000.matrix
1 1 1599
1 2 577
1 3 117
1 4 103
1 5 68
head -5 HPC9_150000_abs.bed
Chr01 0 150000 1
Chr01 150000 300000 2
Chr01 300000 450000 3
Chr01 450000 600000 4
Chr01 600000 750000 5
Then I ran toCooler with the command
toCooler -O HPC9_1.cool -d datasets --nproc 1 --chromsizes-file Ga_1.chromsizes &
It fails with the error "IndexError: index 754 is out of bounds for axis 0 with size 754":
  File "/public/home/software/opt/bio/software/HiCPeaks/0.3.4/lib/python3.6/site-packages/hicpeaks-0.3.4-py3.6.egg/EGG-INFO/scripts/toCooler", line 128, in run
    balance(cooler_uri, nproc=args.nproc)
  File "/public/home/software/opt/bio/software/HiCPeaks/0.3.4/lib/python3.6/site-packages/hicpeaks-0.3.4-py3.6.egg/hicpeaks/utilities.py", line 417, in balance
    map=map_)
  File "/public/home/software/opt/bio/software/HiCPeaks/0.3.4/lib/python3.6/site-packages/cooler/balance.py", line 332, in balance_cooler
    .reduce(add, np.zeros(n_bins))
  File "/public/home/software/opt/bio/software/HiCPeaks/0.3.4/lib/python3.6/site-packages/cooler/tools.py", line 244, in reduce
    return reduce(binop, iter(self.run()), init)
  File "/public/home/software/opt/bio/software/HiCPeaks/0.3.4/lib/python3.6/site-packages/cooler/tools.py", line 54, in apply_pipeline
    data = func(chunk, data)
  File "/public/home/software/opt/bio/software/HiCPeaks/0.3.4/lib/python3.6/site-packages/cooler/balance.py", line 46, in _zero_trans
    mask = chrom_ids[pixels['bin1_id']] != chrom_ids[pixels['bin2_id']]
  File "/public/home/software/opt/bio/software/HiCPeaks/0.3.4/lib/python3.6/site-packages/pandas/core/arrays/categorical.py", line 2149, in __getitem__
    values=self._codes[key], dtype=self.dtype, fastpath=True
IndexError: index 754 is out of bounds for axis 0 with size 754
I noticed that the values in the first two columns of the input 1_1.txt file apparently must be smaller than the number of binned windows for the chromosome (754), not equal to or larger than it.
I then tried to analyze chr02, using the command
awk '$1>=755&&$1<=1415&&$2>=755&&$2<=1415{print}' HPC9_150000.matrix >2_2.txt
I replaced 1_1.txt with 2_2.txt under the directory ./150K/, and it generated a similar error, "IndexError: index 755 is out of bounds for axis 0 with size 661", where 661 is the number of bins for chr02.
How should I prepare the input file correctly?
By the way, should I prepare the chr_chr.txt files for all the chromosomes one by one?
Should I put all these chr_chr.txt files under the same ./150K/ directory?
I hope you can reply. Thank you so much!
You can reply through email [email protected] if that is more convenient.
Best wishes.
Pengcheng
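Judging from the two IndexErrors, toCooler seems to expect bin indices that are local to each chromosome and start at 0, while the HiC-Pro matrix numbers bins from 1 across the whole genome. That reading is an assumption based on the error messages, not on the HiCPeaks documentation. Under that assumption, here is a sketch that uses the _abs.bed file to split the matrix by chromosome pair and shift the indices. `split_hicpro` is a hypothetical helper and the four-bin input is toy data.

```python
from collections import defaultdict


def split_hicpro(bed_lines, matrix_lines):
    """Group HiC-Pro contacts by chromosome pair, using 0-based local bin indices."""
    chrom_of = {}   # global 1-based bin id -> chromosome name
    first_bin = {}  # chromosome name -> its smallest global bin id
    for line in bed_lines:
        chrom, _start, _end, bin_id = line.split()
        bin_id = int(bin_id)
        chrom_of[bin_id] = chrom
        first_bin[chrom] = min(first_bin.get(chrom, bin_id), bin_id)

    out = defaultdict(list)
    for line in matrix_lines:
        b1, b2, count = line.split()
        b1, b2 = int(b1), int(b2)
        c1, c2 = chrom_of[b1], chrom_of[b2]
        # Shift each index so that every chromosome starts at bin 0.
        out[(c1, c2)].append((b1 - first_bin[c1], b2 - first_bin[c2], count))
    return dict(out)


bed = ["Chr01 0 150000 1", "Chr01 150000 300000 2",
       "Chr02 0 150000 3", "Chr02 150000 300000 4"]
matrix = ["1 1 1599", "1 2 577", "2 3 5", "3 4 12"]
contacts = split_hicpro(bed, matrix)
```

Each list could then be written out as a tab-separated file for the corresponding chromosome pair. With real data the two inputs would be the lines of HPC9_150000_abs.bed and HPC9_150000.matrix.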
pyBHFDR -O K562-MboI-BHFDR-loops.txt -p Ga.40000.a.cool::40000 -C 4 --pw 1 --ww 3
root INFO @ 10/08/23 15:58:09: Python Version: 3.6.15
root INFO @ 10/08/23 15:58:09:
root INFO @ 10/08/23 15:58:10: Loading Hi-C data ...
root INFO @ 10/08/23 15:58:10: Calling Peaks ...
Traceback (most recent call last):
  File "/data/Software/miniconda3/envs/TADlib/bin/pyBHFDR", line 185, in <module>
    run()
  File "/data/Software/miniconda3/envs/TADlib/bin/pyBHFDR", line 169, in run
    for key, pixel_table in results:
  File "/data/Software/miniconda3/envs/TADlib/bin/pyBHFDR", line 116, in worker
    cHeatMap = Lib.matrix(balance=args.clr_weight_name, sparse=True).fetch(key)
  File "/data/Software/miniconda3/envs/TADlib/lib/python3.6/site-packages/cooler/core/_selectors.py", line 150, in fetch
    return self._slice(self.field, i0, i1, j0, j1)
  File "/data/Software/miniconda3/envs/TADlib/lib/python3.6/site-packages/cooler/api.py", line 384, in _slice
    self._is_symm_upper,
  File "/data/Software/miniconda3/envs/TADlib/lib/python3.6/site-packages/cooler/api.py", line 710, in matrix
    + "calculate balancing weights or set balance=False."
ValueError: No column 'bins/weight'found. Use cooler.balance_cooler to calculate balancing weights or set balance=False.
toCooler -O Ga.40000.a.cool -d datasets --chromsizes-file aa.leng --no-balance --nproc 1
root INFO @ 10/08/23 16:06:11: Python Version: 3.6.15
root INFO @ 10/08/23 16:06:11:
hicpeaks.utilities INFO @ 10/08/23 16:06:12: Read chromosome sizes from /gpfs/Project/wangzw_Project/TDA_analysis/test/HiCPeaks/aa.leng
hicpeaks.utilities INFO @ 10/08/23 16:06:12: Done
hicpeaks.utilities INFO @ 10/08/23 16:06:12: Extract and save data into cooler format for each resolution ...
hicpeaks.utilities INFO @ 10/08/23 16:06:12: Current resolution: 40000bp
hicpeaks.utilities INFO @ 10/08/23 16:06:12: Generate bin table ...
First, I used toCooler to convert the txt files to cool format (second log above).
Then I called peaks with pyBHFDR and got the error message shown in the first log. I don't know how to solve it.
Thanks
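The two logs together suggest the cause: the cool file was written with `--no-balance`, so it has no weight column, and pyBHFDR then asks for balanced data. Rerunning toCooler without `--no-balance`, or balancing the file with cooler as the error message proposes, should presumably add the missing column. Below is a quick check, sketched with a stand-in object so that it runs without a cool file. `has_weights` and `FakeCooler` are hypothetical, and the real call would pass a `cooler.Cooler` opened on the same URI.

```python
from types import SimpleNamespace


def has_weights(clr, name="weight"):
    """Return True if the bin table of `clr` has a balancing-weight column."""
    return name in list(clr.bins().columns)


class FakeCooler:
    """Stand-in that exposes only bins().columns, as a cooler.Cooler does."""

    def __init__(self, columns):
        self._columns = columns

    def bins(self):
        return SimpleNamespace(columns=self._columns)


raw = FakeCooler(["chrom", "start", "end"])                 # written with --no-balance
balanced = FakeCooler(["chrom", "start", "end", "weight"])  # after balancing
```

Here `has_weights(raw)` is false and `has_weights(balanced)` is true, which mirrors the situation before and after balancing.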
Hi Xiaotao,
Another wonderful tool! Thank you so much.
Could HiCPeaks return a score value when doing APA analysis with apa-analysis?
Thank you,
Pinpin