
fithic's People

Contributors

ay-lab, gongyh, souryacs


fithic's Issues

Minor bug when testing installation

Dear Ay lab,

Thank you for making and maintaining this great software for the community.

I caught one small bug when I was testing my fithic installation and thought I would let you know. In the directory biasPerLocus, the file Dixon_IMR90_HindIII_hg19_w100000.gz is missing positional and normalization information for chrY loci. This results in an error (stdout):

Error. Bias file does not contain chromosome chr10 or chrY. Please ensure you're using correct file.

A little weird b/c chr10 is there in the file. Anyway, I added dummy information for chrY (set all normalization values to 1; see attached file) and that solved the problem.

Thanks again,
Kris

Dixon_IMR90_HindIII_hg19_w100000_chrYnow.tsv.gz
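The workaround described above (adding dummy bias rows with a value of 1) can be sketched in Python. This is a minimal sketch, not fithic code: `add_missing_biases` is a hypothetical helper, and it assumes the bias file has three whitespace-separated columns (chromosome, midpoint, bias) while the fragments file carries the chromosome in column 1 and the midpoint in column 3.

```python
def add_missing_biases(fragment_lines, bias_lines, fill_value=1.0):
    """Return bias rows plus a dummy row for every fragment locus lacking a bias."""
    seen = set()
    rows = []
    for line in bias_lines:
        words = line.split()
        if len(words) < 3:
            continue  # skip blank or malformed lines
        seen.add((words[0], words[1]))
        rows.append((words[0], words[1], words[2]))
    for line in fragment_lines:
        words = line.split()
        if len(words) < 3:
            continue
        key = (words[0], words[2])  # assumed layout: chr, extra, mid, ...
        if key not in seen:
            seen.add(key)
            rows.append((key[0], key[1], str(fill_value)))
    return ["\t".join(row) for row in rows]


# Toy data, invented for illustration only.
fragments = ["chr10 0 50000 10 1", "chrY 0 50000 3 1"]
biases = ["chr10 50000 0.97"]
patched = add_missing_biases(fragments, biases)
```

A bias of 1 leaves the affected loci effectively unnormalized, so this only silences the check; it does not supply real normalization values for chrY.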

how to prepare my `input file`?

This software is really great, but I don't know how to prepare my input file. How should I make the input file from a .hic, .cool, or .h5 format Hi-C file?

Running error

Hi,
I got the following error when running fithic with this command:
python /home/user/liu/Software/fithic/fithic/fithic.py -f fithic.fragmentMappability.gz -i fithic.interactionCounts.gz -o ./Inter -r 40000 -t fithic.biases.gz -x interOnly
Btw, the test run works successfully.

#######################################################
Reading fragments file from: fithic.fragmentMappability.gz
Reading interactions file from: fithic.interactionCounts.gz
Output path created ./Inter
Fixed size option detected... Fast version of FitHiC will be used
Resolution is 40.0 kb
Reading bias file from: fithic.biases.gz
The number of spline passes is 1
The number of bins is 100
The number of reads required to consider an interaction is 1
The name of the library for outputted files will be Emu
Upper Distance threshold is inf
Lower Distance threshold is 0
Only inter-chromosomal regions will be analyzed
Lower bound of bias values is 0.5
Upper bound of bias values is 2
All arguments processed. Running FitHiC now...

Reading the contact counts file to generate bins...
Interactions file read. Time took 154.33874917030334
Traceback (most recent call last):
File "/home/user/liu/Software/fithic/fithic/fithic.py", line 1317, in <module>
main()
File "/home/user/liu/Software/fithic/fithic/fithic.py", line 323, in main
(binStats,noOfFrags, maxPossibleGenomicDist, possibleIntraInRangeCount, possibleInterAllCount, interChrProb, baselineIntraChrProb) = generate_FragPairs(observedInterAllCount, observedInterAllSum, binStats, fragsFile, resolution)
File "/home/user/liu/Software/fithic/fithic/fithic.py", line 597, in generate_FragPairs
maxFrags[ch]=max([int(i)-resolution/2 for i in allFragsDic[ch]])
ValueError: max() arg is an empty sequence
#######################################################

Could you please help with it?

Best~
Jing
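The traceback above fails on `max()` over `allFragsDic[ch]`, which suggests that at least one chromosome ends up with an empty fragment list. The sketch below looks for such chromosomes before running fithic. It is a hypothetical helper, not fithic code, and it assumes that a fragment counts as valid when the column at index `flag_col` (0-based, here the fourth column) is positive; adjust that to whatever your fithic version actually checks.

```python
from collections import defaultdict


def chromosomes_without_valid_fragments(fragment_lines, interaction_lines, flag_col=3):
    """List chromosomes that have no valid fragments or no fragments at all."""
    valid = defaultdict(int)
    for line in fragment_lines:
        words = line.split()
        if len(words) <= flag_col:
            continue
        valid[words[0]] += 0  # register the chromosome even if nothing is valid
        if float(words[flag_col]) > 0:
            valid[words[0]] += 1
    used = set()
    for line in interaction_lines:
        words = line.split()
        if len(words) >= 5:
            used.update((words[0], words[2]))
    return sorted(ch for ch in used | set(valid) if valid.get(ch, 0) == 0)


# Toy data, invented for illustration only.
fragments = ["chr1 0 20000 12 1", "chrM 0 20000 0 0"]
interactions = ["chr1 20000 chr1 60000 3", "chr1 20000 chrUn 20000 1"]
suspects = chromosomes_without_valid_fragments(fragments, interactions)
```

Any chromosome reported here is a candidate for removal from the fragments, interactions, and bias files before rerunning.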

Scikit-learn broken command

It looks like there's something wrong with a call to the isotonic module in scikit-learn. When running the example out of the box I got the error:

TypeError: fit() got an unexpected keyword argument 'increasing'

After looking at your code, it looks like you are calling the fit_transform function from the Isotonic Regression module in scikit-learn which passes it to the fit function. It's hard to tell (https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/isotonic.py#L325) but I wasn't sure if it accepts additional parameters. It could via some mixin but removing "increasing=False" from line 276 seems to do the trick. (https://github.com/ay-lab/fithic/blob/master/fithic/fithic.py#L276)

Not sure if this is an update to scikit-learn or something that caused a change or if I managed to run it wrong.

HiCKRy.py KeyError: '0'

Hi,

I'm trying to generate a bias values file and it gives me this error:

Creating sparse matrix...
Traceback (most recent call last):
File "/gpfs/data/reinberglab/home/kl3488/fithic/fithic/utils/HiCKRy.py", line 283, in <module>
main()
File "/gpfs/data/reinberglab/home/kl3488/fithic/fithic/utils/HiCKRy.py", line 276, in main
matrix,revFrag = loadfastfithicInteractions(args.interactions, args.fragments)
File "/gpfs/data/reinberglab/home/kl3488/fithic/fithic/utils/HiCKRy.py", line 45, in loadfastfithicInteractions
x.append(fragDic[chrom1][mid1])
KeyError: '0'

My contact counts file and fragment mappability files seem to look ok:

Contact counts:

chr10 100467500 chr10 100592500 1
chr10 100467500 chr10 100597500 2
chr10 100467500 chr10 100602500 2

Fragment mappability:

chr10 895000 897500 1 1
chr10 900000 902500 1 1
chr10 905000 907500 1 1

Difference between ICE and KR biases. Setting up the bias limits.

Hi,
I am processing HiC-Pro results from a small genome (in scaffolds) at 5000 bp resolution. I used the HiCPro2FitHiC utility to convert the data and also generated the KR bias file using HiCKRy.py.

For example, I see two extremely high values in the ICE biases:
HiC_scaffold_83 6297500 88.3104735696205
HiC_scaffold_20 672500 43.444427428612
that are far from corresponding KR values:
HiC_scaffold_83 6297500 1.71088667499396
HiC_scaffold_20 672500 0.824151827905017
...otherwise, the distributions are similar, with a number of -1 values.

Which bias version is preferable?
My other question is when -bL and -bU need to be modified, and whether it is appropriate to adjust them according to the bias method or other genome- or data-specific factors.

Thank you!
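Before changing -bL and -bU, it can help to count how many bias values fall outside the current bounds. The sketch below does that for a three-column bias file (chromosome, midpoint, bias). It is a hypothetical helper: treating -1 as "no bias available" and treating the bounds as inclusive are both assumptions, and the defaults 0.5 and 2 are the bounds printed in the fithic logs in this thread.

```python
def summarize_biases(bias_lines, lower=0.5, upper=2.0):
    """Count bias values inside the bounds, outside them, and equal to -1."""
    summary = {"inside": 0, "outside": 0, "unset": 0}
    for line in bias_lines:
        words = line.split()
        if len(words) < 3:
            continue
        bias = float(words[2])
        if bias == -1:
            summary["unset"] += 1  # assumed marker for a missing bias
        elif lower <= bias <= upper:
            summary["inside"] += 1
        else:
            summary["outside"] += 1
    return summary


# The two loci quoted above, with their ICE and KR values.
ice = ["HiC_scaffold_83 6297500 88.3104735696205",
       "HiC_scaffold_20 672500 43.444427428612"]
kr = ["HiC_scaffold_83 6297500 1.71088667499396",
      "HiC_scaffold_20 672500 0.824151827905017"]
ice_summary = summarize_biases(ice)
kr_summary = summarize_biases(kr)
```

With the default bounds, both ICE values quoted above fall outside the range and both KR values fall inside it. The count alone does not say which normalization is preferable.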

FitHic crashes

Hi,

I am trying to detect loops with FitHiC, which I installed via bioconda.

First, I converted a cooler file to hicpro and the required bed file, next, I used your script HiCPro2FitHiC (which is actually not part of the bioconda installation... maybe that should be included?) to transform the data to FitHic compatible input.

However, when I now run FitHiC with:

fithic -i hmec_100kb/fithic.interactionCounts.gz -f hmec_100kb/fithic.fragmentMappability.gz -o hmec_100kb/ -r 100000

but get a crash:

GIVEN FIT-HI-C ARGUMENTS
=========================
Reading fragments file from: hmec_100kb/fithic.fragmentMappability.gz
Reading interactions file from: hmec_100kb/fithic.interactionCounts.gz
Output path being used from hmec_100kb/
Fixed size option detected... Fast version of FitHiC will be used
Resolution is 100.0 kb
No bias file
The number of spline passes is 1
The number of bins is 100
The number of reads required to consider an interaction is 1
The name of the library for outputted files will be FitHiC
Upper Distance threshold is inf
Lower Distance threshold is 0
Only intra-chromosomal regions will be analyzed
Lower bound of bias values is 0.5
Upper bound of bias values is 2
All arguments processed. Running FitHiC now...
=========================


Reading the contact counts file to generate bins...
Interactions file read. Time took 244.58135843276978
Traceback (most recent call last):
  File "/home/wolffj/miniconda3/envs/fithic2/bin/fithic", line 10, in <module>
    sys.exit(main())
  File "/home/wolffj/miniconda3/envs/fithic2/lib/python3.6/site-packages/fithic/fithic.py", line 310, in main
    (binStats,noOfFrags, maxPossibleGenomicDist, possibleIntraInRangeCount, possibleInterAllCount, interChrProb, baselineIntraChrProb)= generate_FragPairs(binStats, fragsFile, resolution)
  File "/home/wolffj/miniconda3/envs/fithic2/lib/python3.6/site-packages/fithic/fithic.py", line 542, in generate_FragPairs
    maxFrags[ch]=max([int(i)-resolution/2 for i in allFragsDic[ch]])
ValueError: max() arg is an empty sequence

Do you have any idea what I need to do to get it running?

Thanks a lot,

Joachim

Missing chr1 in output

Hi,
Thanks for the magic tool. However, I can't find any inter- or intra-chromosomal data related to chr1 in the output file. I wonder how to fix that.

Where to find fragments argument from HiC Pro Pipeline?

Hello,

I am trying to run Fit-HiC to analyze HiChIP data I have already aligned using the HiC-pro pipeline. I have inferred from your readme that the interactions argument would be my .allvalidpairs file generated from hic-pro pipeline. However, I cannot see anything in a format that could work for the fragments file.

Do you know if the HiC-pro pipeline generates a file that would work as a fragments file for Fit-HiC?

Any help would be greatly appreciated.

Thanks!
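Other reports in this thread convert HiC-Pro's binned output (a `.matrix` file of bin-index pairs plus an `_abs.bed` file of bin coordinates) with HiCPro2FitHiC.py. The sketch below only illustrates how such a conversion could produce the fragments and interactions rows. It is not the utility's actual code; `hicpro_to_fithic` is a hypothetical function, and the midpoint rule, the marginal count as a row sum, and the constant last column are assumptions.

```python
def hicpro_to_fithic(bed_lines, matrix_lines, resolution):
    """Build FitHiC-style interaction and fragment rows from HiC-Pro text rows."""
    bins = {}  # bin index -> [chrom, start, midpoint, marginal count]
    for line in bed_lines:
        words = line.split()
        if len(words) < 4:
            continue
        chrom, start, idx = words[0], int(words[1]), words[3]
        # Assumption: midpoint = bin start + half the resolution, even for a
        # shorter last bin, which matches the example files in this thread.
        bins[idx] = [chrom, start, start + resolution // 2, 0]

    interactions = []
    for line in matrix_lines:
        words = line.split()
        if len(words) < 3:
            continue
        i, j, count = words[0], words[1], int(float(words[2]))
        interactions.append("\t".join(
            [bins[i][0], str(bins[i][2]), bins[j][0], str(bins[j][2]), str(count)]))
        bins[i][3] += count
        if i != j:  # off-diagonal counts contribute to both bins
            bins[j][3] += count

    fragments = ["\t".join([c, str(s), str(m), str(n), "1"])
                 for c, s, m, n in bins.values()]
    return interactions, fragments


# First two rows of the HiC-Pro files quoted in another issue in this thread.
bed = ["Sc7u5tJ_517 0 150000 1", "Sc7u5tJ_517 150000 246623 2"]
matrix = ["1 1 1700", "1 2 5"]
interactions, fragments = hicpro_to_fithic(bed, matrix, 150000)
```

The marginal counts here come from only two matrix rows, so they will not match the values in a real fragments file built from the full matrix.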

Question about the output of merge-filter.sh

Hi:
I tried to run merge-filter.sh to merge spatially close, significant interactions from FitHiC2, but I am confused about its output files.
The output files have many columns, such as chr1, mid1, chr2, mid2 and bin1_low, bin1_high, bin2_low, bin2_high. Which ones give the location of the merged interactions? Also, do you have any recommended settings for the "fdr" parameter? I get too many merged interactions (more than 200000 rows) if I set "fdr" to 0.01.
Thanks in advance !
Best wishes
Qianzhao

error while running fithic with data generated by hic-pro

I used the script in HiC-Pro to transform the 40 kb ICE-normalized Hi-C contact matrix into a raw interaction count file and a bias file calculated by ICE, but got the error below. Could you please help me fix this?

fithic -f fithic.fragmentMappability.gz -i fithic.interactionCounts.gz -t fithic.biases.gz -o fithic_sample2 -l TU-2 -v -x intraOnly -r 40000

Reading the contact counts file to generate bins...
Interactions file read. Time took 47.469826459884644
Fragments file read. Time took 0.20023465156555176
Traceback (most recent call last):
File "/home/wg_xialin/.local/bin/fithic", line 8, in <module>
sys.exit(main())
File "/home/wg_xialin/.local/lib/python3.8/site-packages/fithic/fithic.py", line 314, in main
biasDic = read_biases(biasFile)
File "/home/wg_xialin/.local/lib/python3.8/site-packages/fithic/fithic.py", line 671, in read_biases
chrom=words[0]; midPoint=int(words[1]); bias=float(words[2])
IndexError: list index out of range
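The failing line splits each bias row and reads `words[0]`, `words[1]`, and `words[2]`, so an `IndexError` there points to a row with fewer than three whitespace-separated fields (for example a blank line or a header). The following sketch scans a bias file for such rows. `find_malformed_bias_lines` is a hypothetical helper, not part of fithic.

```python
def find_malformed_bias_lines(lines):
    """Return (line number, reason) for rows that the three-field parse would reject."""
    problems = []
    for number, line in enumerate(lines, start=1):
        words = line.split()
        if len(words) < 3:
            problems.append((number, "expected 3 fields, found %d" % len(words)))
            continue
        try:
            int(words[1])    # midpoint
            float(words[2])  # bias
        except ValueError:
            problems.append((number, "non-numeric midpoint or bias"))
    return problems


# Toy data, invented for illustration only.
sample = ["chr1 20000 1.02", "chr1 60000", "", "chr1 100000 abc"]
problems = find_malformed_bias_lines(sample)
```

For a gzipped file, the lines can be read with `gzip.open(path, "rt")` from the standard library and passed to the same function.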

sequencing depth for loops

Hello,
I want to call loops using FitHiC2 at 10 kb resolution. Do you know how many valid paired reads FitHiC2 requires to call loops at 10 kb resolution? 300 million valid paired reads?

error while using fithic for the interchromosomal interactions

Dear Dr. Ay,
I used fithic for the intrachromosomal contacts and it worked perfectly.
I am trying to use fithic 2.0.7 for the interchromosomal contacts and here is my command:

python fithic.py -f fragments_list.gz -i chr9_chrX_1mb_fithic.gz -o test/ -r 1000000 -t bias.gz -x All
python fithic.py -f fragments_list.gz -i chr9_chrX_1mb_fithic.gz -o test/ -r 1000000 -t bias.gz -x interOnly

but it gives me this error:

File "fithic.py", line 310, in main
(binStats,noOfFrags, maxPossibleGenomicDist, possibleIntraInRangeCount, possibleInterAllCount, interChrProb, baselineIntraChrProb)= generate_FragPairs(binStats, fragsFile, resolution)
File "fithic.py", line 560, in generate_FragPairs
currBin = binStats[binTracker]
KeyError: 0

Can you help me solve this issue???

Best regards,
Noha

Illegal fragment pair

Hello. When I use fithic, I get the error "Illegal fragment pair". The following are my input files:
$ head fithic.interactionCounts
chr1 25000 chr1 25000 624.0
chr1 25000 chr1 75000 760.0
chr1 25000 chr1 125000 315.0
chr1 25000 chr1 175000 146.0
chr1 25000 chr1 225000 113.0
chr1 25000 chr1 275000 98.0
chr1 25000 chr1 325000 80.0
chr1 25000 chr1 375000 56.0
chr1 25000 chr1 425000 48.0
chr1 25000 chr1 475000 38.0

$ head fithic.fragmentMappability
chr1 0 25000 19005.0 1
chr1 50000 75000 28207.0 1
chr1 100000 125000 26814.0 1
chr1 150000 175000 18005.0 1
chr1 200000 225000 24461.0 1
chr1 250000 275000 23232.0 1

can you help me to solve this problem?
Thanks!
Best.

Standard Error is always 0

Between lines 887 and 897 in fithic/fithic/fithic.py, part of the standard error calculation appears to be commented out. Is that why the standard error is always 0?

TypeError: can only concatenate str (not "int") to str

Hello,
My test run (fithic/tests/run_tests-git.sh) finished successfully, but when I ran version 2.0.7 on my own files with this command:

python3 fithic.py -f fithic.fragmentMappability.gz -i fithic.interactionCounts.gz -o FitHicAmphioxus -t fithic.biases.gz -r 150000

I received this error:

Reading the contact counts file to generate bins...
Interactions file read. Time took 26.23392629623413
Traceback (most recent call last):
File "/home/user/sarigoel/Programs/FITHIC/fithic/fithic/fithic.py", line 1324, in <module>
main()
File "/home/user/sarigoel/Programs/FITHIC/fithic/fithic/fithic.py", line 323, in main
(binStats,noOfFrags, maxPossibleGenomicDist, possibleIntraInRangeCount, possibleInterAllCount, interChrProb, baselineIntraChrProb) = generate_FragPairs(observedInterAllCount, observedInterAllSum, binStats, fragsFile, resolution)
File "/home/user/sarigoel/Programs/FITHIC/fithic/fithic/fithic.py", line 600, in generate_FragPairs
print("ERROR - the chromosome " + ch + " has " + len(allFragsDic[ch]) + " valid fragments/bins and should be removed from the input fragment information !!! ")
TypeError: can only concatenate str (not "int") to str

Here is what my input files look like:

[sarigoel@myotis AMPHIOXUS]$ zcat fithic.biases.gz | head -n2
Sc7u5tJ_517 75000 1.970547623956338
Sc7u5tJ_517 225000 0.40157523166875075
[sarigoel@myotis AMPHIOXUS]$ zcat fithic.fragmentMappability.gz | head -n2
Sc7u5tJ_517 0 75000 17395 1
Sc7u5tJ_517 150000 225000 2437 1
[sarigoel@myotis AMPHIOXUS]$ zcat fithic.interactionCounts.gz | head -n2
Sc7u5tJ_517 75000 Sc7u5tJ_517 75000 1700
Sc7u5tJ_517 75000 Sc7u5tJ_517 225000 5

I used an old HicPro (version 2.10.0) to generate my initial data and used this command/script to convert it:

python3 HiCPro2FitHiC.py -i Sample1_150000.matrix -b Sample1_150000_abs.bed -s Sample1_150000_iced.matrix.biases -o . -r 150000

These files had these lengths:
3776446 Sample1_150000.matrix
3769 Sample1_150000_abs.bed
3769 Sample1_150000_iced.matrix.biases

and first two lines were as below:

==> Sample1_150000.matrix <==
1 1 1700
1 2 5

==> Sample1_150000_abs.bed <==
Sc7u5tJ_517 0 150000 1
Sc7u5tJ_517 150000 246623 2

==> Sample1_150000_iced.matrix.biases <==
1.917118534333063673e+00
3.906869898508548156e-01

The Sample1_150000_iced.matrix.biases file also had nan values, which I guess were converted to -1.

Following the conversion the files kept their original lengths:

[sarigoel@myotis AMPHIOXUS]$ zcat fithic.interactionCounts.gz | wc -l
3776446
[sarigoel@myotis AMPHIOXUS]$ zcat fithic.fragmentMappability.gz | wc -l
3769
[sarigoel@myotis AMPHIOXUS]$ zcat fithic.biases.gz | wc -l
3769

As for the chromosome names, all start with Sc7u5tJ_ and there is no other special character than an underscore, each followed by a scaffold number.

The log file had these lines:

###########
Interactions file read successfully
Observed, Intra-chr in range: pairs= 275495 totalCount= 6213510
Observed, Intra-chr all: pairs= 275495 totalCount= 6213510
Observed, Inter-chr all: pairs= 3500951 totalCount= 7397792
Range of observed genomic distances [0 35250000]

Making equal occupancy bins
Observed intra-chr read counts in range 6213510
Desired number of contacts per bin 62135.1,
Number of bins 100
Equal occupancy bins generated

Looping through all possible fragment pairs in-range_
############

Can you think of a reason that may have caused the error?
Thank you!
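The `TypeError` in the traceback above is raised by the error-reporting line itself: `len(...)` returns an int, which Python cannot concatenate to a str. A minimal sketch of a corrected message builder follows. The function name is hypothetical; only the message text is taken from the traceback.

```python
def empty_chromosome_message(ch, fragments):
    """Build the error text with the fragment count converted to a string."""
    # len() returns an int, so it must be converted before concatenation.
    return ("ERROR - the chromosome " + ch + " has " + str(len(fragments))
            + " valid fragments/bins and should be removed from the input fragment information !!! ")


message = empty_chromosome_message("Sc7u5tJ_517", [])
```

Since that line only runs when fithic wants to report a chromosome with too few valid fragments, the underlying problem is probably a scaffold of that kind in the fragments file rather than the string formatting.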

How to decide the bin size in interaction loops?

Hello,

I am running fithic. In the output, each interaction shows only the midpoints of the two fragments. How can I determine the bin size, so that I know which range of genomic positions potentially interacts with which other range? I assume it is just the output from the previous FitHiChIP results. Thank you!

Zhikai
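For fixed-size bins, a midpoint and the resolution passed with -r are enough to recover the bin boundaries. The sketch below assumes each bin spans half a resolution on either side of its midpoint, which agrees with the fragment rows shown elsewhere in this thread (for example a bin starting at 0 with midpoint 25000 at 50 kb spacing). Both functions are hypothetical helpers, not fithic code.

```python
def bin_range(mid, resolution):
    """Return (start, end) of the fixed-size bin centred on mid."""
    half = resolution // 2
    return (mid - half, mid + half)


def loop_anchors(chr1, mid1, chr2, mid2, resolution):
    """Return both anchors of an interaction as (chrom, start, end) tuples."""
    return ((chr1,) + bin_range(mid1, resolution),
            (chr2,) + bin_range(mid2, resolution))


anchors = loop_anchors("chr1", 25000, "chr1", 475000, 50000)
```

This does not apply to runs on restriction fragments of varying size, where the boundaries have to come from the fragments file instead.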

Matplotlib Backend Error

A common problem on a headless OS, but when I tried to run fithic with the --visual flag:

fithic -f data/yeast_fragments.gz -i data/yeast_counts.gz -o sample -p 10 -l yeast --visual

I got the following error:

/home/asur/.local/lib/python2.7/site-packages/fithic/fithic.py:119: UserWarning: 
This call to matplotlib.use() has no effect because the backend has already
been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
or matplotlib.backends is imported for the first time.
The backend was *originally* set to u'TkAgg' by the following code:
  File "/usr/local/bin/fithic", line 7, in <module>
    from fithic.fithic import main
  File "/home/asur/.local/lib/python2.7/site-packages/fithic/fithic.py", line 25, in <module>
    from pylab import *
  File "/home/asur/.local/lib/python2.7/site-packages/pylab.py", line 1, in <module>
    from matplotlib.pylab import *
  File "/home/asur/.local/lib/python2.7/site-packages/matplotlib/pylab.py", line 257, in <module>
    from matplotlib import cbook, mlab, pyplot as plt
  File "/home/asur/.local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 72, in <module>
    from matplotlib.backends import pylab_setup
  File "/home/asur/.local/lib/python2.7/site-packages/matplotlib/backends/__init__.py", line 14, in <module>
    line for line in traceback.format_stack()

  matplotlib.use('Agg')

While I noticed you did include the work around in your code by importing the appropriate matplotlib backend, it looks like pylab imports the wrong backend before that line. By moving

import matplotlib
matplotlib.use('Agg')

to the top of the fithic.py script, I was able to avoid the error. I suppose you call the optional import before the pylab import?

HTML file has broken links to FitHiC outputs

I noticed the links in the HTML generated are missing the resolution specified when running FitHiC.

e.g. the equal occupancy bin statistics file:

SAMPLE.fithic_pass1.txt

should be (for resolution=5kb):

SAMPLE.fithic_pass1.[res5000.]txt

Can't Open File

When using merge-filter.sh I run into:

python3: can't open file 'CombineNearbyInteraction.py': [Errno 2] No such file or directory

This is because a variable has been capitalized:

script=""$UTILITYFOLDER"CombineNearbyInteraction.py"

So the variable $UTILITYFOLDER is empty, whereas the variable $utilityfolder contains the argument needed. Not sure if this is bash version issue or something.

test run AttributeError

Hi,

I am trying to check whether Fithic is running correctly.
When I run the run_tests-git.sh, it gives me the following error:

Traceback (most recent call last):
File "../fithic.py", line 30, in <module>
import matplotlib
File "/gpfs/share/apps/anaconda3/gpu/5.2.0/envs/fithic/lib/python3.6/site-packages/matplotlib/__init__.py", line 107, in <module>
from . import cbook, rcsetup
File "/gpfs/share/apps/anaconda3/gpu/5.2.0/envs/fithic/lib/python3.6/site-packages/matplotlib/rcsetup.py", line 28, in <module>
from matplotlib.fontconfig_pattern import parse_fontconfig_pattern
File "/gpfs/share/apps/anaconda3/gpu/5.2.0/envs/fithic/lib/python3.6/site-packages/matplotlib/fontconfig_pattern.py", line 15, in <module>
from pyparsing import (Literal, ZeroOrMore, Optional, Regex, StringEnd,
File "/gpfs/share/apps/anaconda3/gpu/5.2.0/envs/fithic/lib/python3.6/site-packages/pyparsing/__init__.py", line 133, in <module>
__version__ = __version_info__.__version__
AttributeError: 'version_info' object has no attribute '__version__'

A similar error appears when I run fithic using only the mandatory arguments.
I wonder if you have some ideas how to fix this?

fithic crash

Hi,
I am trying to run fithic downstream of HiC-pro.
I managed to convert the Hicpro output and the iced base using the HiCPro2FitHiC utility function.
Nevertheless when running FitHiC with the produced files I get the following error which is hard for me to grasp.
Any insight is appreciated
Thanks
Francesco

Reading the contact counts file to generate bins...
Interactions file read. Time took 535.2248961925507
Traceback (most recent call last):
File "/hpcnfs/data/GN2/fgualdrini/tools/anaconda3/envs/EnvFITHIC2/bin/fithic", line 10, in <module>
sys.exit(main())
File "/hpcnfs/data/GN2/fgualdrini/tools/anaconda3/envs/EnvFITHIC2/lib/python3.6/site-packages/fithic/fithic.py", line 310, in main
(binStats,noOfFrags, maxPossibleGenomicDist, possibleIntraInRangeCount, possibleInterAllCount, interChrProb, baselineIntraChrProb)= generate_FragPairs(binStats, fragsFile, resolution)
File "/hpcnfs/data/GN2/fgualdrini/tools/anaconda3/envs/EnvFITHIC2/lib/python3.6/site-packages/fithic/fithic.py", line 542, in generate_FragPairs
maxFrags[ch]=max([int(i)-resolution/2 for i in allFragsDic[ch]])
ValueError: max() arg is an empty sequence

intra vs inter chromosomal counts in input

Hi, this is not a technical issue but rather it's not documented in the manual. I know fithic is for mid-range intrachromosomal contacts. In the input files, specifically the fragments file, does the marginalized count need to be for the intra counts only or for all counts?

Thanks!

q-value is always 1 when running FitHiC2 using allValidPairs input

Hi!

I hope you are doing well. I ran FitHiC2 by first using validPairs2FitHiC-fixedSize.sh for the interactions file, HiCKRy.py for the bias file, and createFitHiCFragments-fixedsize.py for the fragments file. In the output, nearly all the q-values are 1, and the bias values are consistently -1. When running directly from HiCPro2FitHiC, many of the q-values are below 1. I think the difference in results may have to do with setting the percentOfSparseToRemove (-x) parameter when generating the bias file. Do you have any suggestions on how to determine which parameters and cutoffs to use when generating the bias file, as well as for running the FitHiC command, after using validPairs2FitHiC-fixedSize.sh, createFitHiCFragments-fixedsize.py, and HiCKRy.py to generate the input files?

Thanks so much in advance!

Best,
Shanta

Update Readme

Hi, I think the Readme might be a little outdated. The docs mention a ./runall but I didn't find any executable called that in the folder, and installing from pypi worked well enough to where you might just link the data files to run a test, or include them as a unit test in the module.

Also it looks like _tkinter was a requirement so you may want to add that to the requirements.

HiCKRy.py generated extreme bias value

Hi:
When I tried to calculate bias values using HiCKRy.py, I found some extreme values, such as chr1 30875000 2043.101940700636, whereas the bias file generated by ICE normalization in HiC-Pro looks normal: chr1 30875000 0.8739076476642095. My script is simple: python $KR -i $in -f $frag -o $bias
Meanwhile, when I used the bias file generated by HiCKRy.py to find significant inter-chromosomal interactions, I obtained too many interactions (5183320, q<0.01), but with the ICE-normalized bias file I obtained only 493135 (q<0.01).
The significant inter-chromosomal interactions from ICE normalization seem more reliable when compared with the contact heatmap in Juicer.
Maybe I should choose ICE normalization for calling significant inter-chromosomal interactions. Could you give me some suggestions?

Best wishes
Qianzhao

HiCKRy.py Key errors

Hi

I have been trying to run HiCKRy.py on data dumped from Juicer. The contact counts were dumped with no normalisation at 1 kb resolution (we have more than 4 billion contacts). The error I keep getting from HiCKRy.py is as follows:

Creating sparse matrix...
Traceback (most recent call last):
File "HiCKRy.py", line 283, in <module>
main()
File "HiCKRy.py", line 276, in main
matrix,revFrag = loadfastfithicInteractions(args.interactions, args.fragments)
File "HiCKRy.py", line 45, in loadfastfithicInteractions
x.append(fragDic[chrom1][mid1])
KeyError: '1'

The contacts file generated looks like this:

1 87000 1 87000 2.0
1 87000 1 88000 1.0
1 137000 1 139000 1.0
1 181000 1 181000 17.0
1 181000 1 182000 2.0
1 182000 1 182000 1.0
1 187000 1 190000 1.0
1 190000 1 191000 1.0
1 597000 1 598000 1.0
1 598000 1 599000 1.0

The fragment file generated looks like this:

chr1 0 500 1 1
chr1 1000 1500 1 1
chr1 2000 2500 1 1
chr1 3000 3500 1 1
chr1 4000 4500 1 1
chr1 5000 5500 1 1
chr1 6000 6500 1 1
chr1 7000 7500 1 1
chr1 8000 8500 1 1
chr1 9000 9500 1 1

Do you have any suggestions for this or would it be easier to dump the contacts from Juicer with the KR normalisation already applied?

Thanks in advance,

James
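The failing lookup is `fragDic[chrom1][mid1]`, so every chromosome and midpoint in the contacts file has to appear in the fragments file. In the excerpts above the contacts use `1` where the fragments use `chr1`, and the contact coordinates end in 000 where the fragment midpoints end in 500, which would be consistent with the `KeyError`. The sketch below lists the first loci that have no match. `missing_loci` is a hypothetical helper, and it assumes the midpoint sits in column 3 of the fragments file.

```python
def missing_loci(fragment_lines, interaction_lines, limit=5):
    """Return up to `limit` (chrom, mid) pairs used in interactions but absent from fragments."""
    known = set()
    for line in fragment_lines:
        words = line.split()
        if len(words) >= 3:
            known.add((words[0], words[2]))
    missing = []
    for line in interaction_lines:
        words = line.split()
        if len(words) < 4:
            continue
        for locus in ((words[0], words[1]), (words[2], words[3])):
            if locus not in known and locus not in missing:
                missing.append(locus)
                if len(missing) >= limit:
                    return missing
    return missing


# Rows copied from the excerpts above.
fragments = ["chr1 0 500 1 1", "chr1 1000 1500 1 1"]
contacts = ["1 87000 1 87000 2.0", "1 87000 1 88000 1.0"]
unmatched = missing_loci(fragments, contacts)
```

An empty result means every locus in the contacts file can be found in the fragments file. Anything else shows which naming or coordinate convention needs to be aligned first.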

TypeError: coercing to Unicode: need string or buffer, NoneType found

Hello,
I installed fithic via pip and get the following error when running fithic on the command line:

Generating all possible intra-chromosomal fragment pairs and counting the number of all possible inter-chr fragment pairs
------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/public/home/zpxu/miniconda2/bin/fithic", line 11, in <module>
    load_entry_point('fithic==1.1.3', 'console_scripts', 'fithic')()
  File "/public/home/zpxu/miniconda2/lib/python2.7/site-packages/fithic/fithic.py", line 181, in main
    generate_FragPairs(options.fragsfile)
  File "/public/home/zpxu/miniconda2/lib/python2.7/site-packages/fithic/fithic.py", line 756, in generate_FragPairs
    infile = open(infilename, 'r')
TypeError: coercing to Unicode: need string or buffer, NoneType found

Any help is much appreciated.
Thanks.

fithic run error

Dear professor,
I downloaded fithic and used this command to set it up:
python setup.py install --user

After setup I ran the command: sh run_test.sh
but the error below occurred:

observedIntraInRangeSum 2210827 desiredPerBin 22108 noOfBins 100
Plotting Duan_yeast_EcoRI.fithic_pass1.png
/share/nas2/genome/biosoft/Python/2.7.8/lib/python2.7/site-packages/matplotlib-1.4.3-py2.7-linux-x86_64.egg/matplotlib/collections.py:590: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
if self._edgecolors == str('face'):
Writing Duan_yeast_EcoRI.fithic_pass1.txt
Fit a univariate spline to the probability means
baseline intra-chr probability: 1.4239355014175277e-06 baseline inter-chr probability: 1.1954383983420704e-07
Traceback (most recent call last):
File "/home/liufuyan/.local/bin/fithic", line 9, in <module>
load_entry_point('fithic==1.0.8', 'console_scripts', 'fithic')()
File "/home/liufuyan/.local/lib/python2.7/site-packages/fithic-1.0.8-py2.7.egg/fithic/fithic.py", line 184, in main
splineXinit,splineYinit,splineResidual,isOutlier,splineFDRxinit,splineFDRyinit=fit_Spline(x,y,yerr,options.intersfile,sortedInteractions,biasDic,libname+".spline_pass1",1)
File "/home/liufuyan/.local/lib/python2.7/site-packages/fithic-1.0.8-py2.7.egg/fithic/fithic.py", line 276, in fit_Spline
newSplineY = ir.fit_transform(splineX, splineY, increasing=False)
File "/home/liufuyan/.local/lib/python2.7/site-packages/sklearn/base.py", line 436, in fit_transform
return self.fit(X, y, **fit_params).transform(X)
TypeError: fit() got an unexpected keyword argument 'increasing'

I also ran into another issue:

cannot connect to X server localhost

Could you explain how to avoid using the X server?

Thank you !

fuyan

Out of 70041 loci 70041 were discarded with biases not in range [0.5 2]

Hi,
Thank you for such a nice tool for Hi-C data processing!
I am processing DNase-HiC data. I want to apply Fithic2 to my data; I have prepared the input files from HiC-Pro.

For example:
-i
Screen Shot 2022-07-15 at 4 25 27 PM
-f
Screen Shot 2022-07-15 at 4 27 43 PM
-t
Screen Shot 2022-07-15 at 4 28 24 PM

I proceeded to run fithic with these input files; however, I encountered error #42. After reading all the comments on #42 and #39, I filtered the biases using: awk '{if ($3 > "0.5" && $3 < "2"){ print}}' and removed all entries with 0 valid pairs from the fragments file using: awk '{if ($4 > "0" && $5 > "0"){print }}'.

After that, I now have
-t
Screen Shot 2022-07-15 at 4 38 14 PM
-f
Screen Shot 2022-07-15 at 4 37 20 PM

After trimming these files, I used this command to run Fithic:

  • fithic -i fithic.interactionCounts.gz -f fragmentMappability.gz -t biases.gz -r 10000 -o ./fithic-out3/

After Fithic2 ran successfully on my files, when I checked the log file, I noticed that " Out of 70041 loci 70041 were discarded with biases not in range [0.5 2]".

Here, is the log file:
Screen Shot 2022-07-15 at 4 54 35 PM

My significance file has all values equal to zero; this is the significance file:
Screen Shot 2022-07-15 at 4 56 39 PM

I am wondering how to resolve this issue. I don't know where I am making mistakes. I would really appreciate it if you could look into this and help me fix the problem.

Thank you!
Best,
Nisar
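One detail worth checking in the awk filters quoted above: the thresholds are written as quoted strings, so awk may compare the values as text rather than as numbers. The Python sketch below performs the same bias filter with an explicit numeric comparison. `filter_biases` is a hypothetical helper, and the strict bounds mirror the `>` and `<` in the awk command. Whether this is related to the "discarded" message in the log is not established here.

```python
def filter_biases(bias_lines, lower=0.5, upper=2.0):
    """Keep bias rows whose third column is numerically between the bounds."""
    kept = []
    for line in bias_lines:
        words = line.split()
        if len(words) < 3:
            continue
        if lower < float(words[2]) < upper:
            kept.append(line)
    return kept


# Toy data, invented for illustration only.
rows = ["chr1 5000 1.2", "chr1 15000 10.3", "chr1 25000 -1"]
kept = filter_biases(rows)

# Compared as text, "10.3" sorts before "2", so a string-based filter
# would let this out-of-range value through.
text_comparison = "10.3" < "2"
```

Only the first row survives the numeric filter, while the text comparison shows how a value such as 10.3 could slip past a string-based test.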

FitHiC run error

Dear Ferhat,

I am interested in your tool "FitHiC"; however, I ran into a problem when trying to find loops in a specific genome region. It reported this error:
###################
Running generate_FragPairs method ...
Complete generate_FragPairs method [OK]
Running read_ICE_biases method ...
Complete read_ICE_biases method [OK]
Running read_All_Interactions method ...
Error in Ops.factor(chr1, chr2) : level sets of factors are different
##################

my original command is:
FitHiC(fragsfile = "fithic.fragmentMappability",intersfile = "fithic.interactionCounts", outdir=getwd(),biasfile="fithic.biases",libname="ESC",noOfBins=20,distUpThres=14000000,distLowThres=13500000,visual=TRUE)

I don't know what the problem is. Could you give me some help?

Thank you so much!
Best,
Garen

ValueError: max() arg is an empty sequence

Hi,
I am running version Fit-Hi-C 2.0.7.

The input files look like this:
[three screenshots of the input files; images not preserved]

I used the command below
fithic -i fithic.interactionCounts.gz -f fithic.fragmentMappability.gz -t fithic.biases.gz -r 5000 -x intraOnly -U 5000000 -L 20000 -o ./23_10k -l 23_10k-intrachrom -v

and encounter the error
[screenshot of the error; image not preserved]

When I extracted the information for just one chromosome from the three files into new files and ran the same command, it worked.

Please help me out.
Thank you in advance for your help,
Chengming

Merge filter generalizable?

Having multiple adjacent bin pairs all passing some constant significance threshold despite stemming from the same underlying interaction can be an issue in quite a number of other Hi-C analyses, for instance in calling differential interactions by SELFISH as implemented by your group.

The merging filter introduced here seems like a nice approach to this problem, but I was curious whether there has been any testing of whether such a method is suited to pruning the output of other tools?

Thanks!

Any example of using .hic file as the input?

I am struggling to make the output from Juicer work with fithic. I dumped my .hic file and then tried creatFitHiCContact-hic.sh but ended up with nothing. Are there any examples that start from a .hic file to run FitHiC?
Thanks

fragmentMid

Hello,
could you tell me what fragmentMid represents?
How do I get the coordinates of the two loop anchors?

image

error while running FitHiC

Hi,
I am running version Fit-Hi-C 2.0.7.
I used the command below
fithic -f /project/roselai_228/priyatap/HiC_work/fithic_output/fithic.fragmentMappability.gz -i /project/roselai_228/priyatap/HiC_work/fithic_output/fithic.interactionCounts.gz -U 1000000 -r 1000000 -l fithit_SRR1030745_1MB -o /project/roselai_228/priyatap/HiC_work/fithic_output

and encounter the error
Screen Shot 2020-12-11 at 7 26 34 PM

Please help me resolve the issue.
FYI, I used the command below to generate the input for fithic. That command also gave an error, but it generated three zip files, so I used two of them in the command above.

python HiCPro2FitHiC.py -i /project/roselai_228/priyatap/HiC_work/output/hic_results/matrix/SRR1030745/raw/1000000/SRR1030745_1000000.matrix -b /project/roselai_228/priyatap/HiC_work/output/hic_results/matrix/SRR1030745/raw/1000000/SRR1030745_1000000_abs.bed -s /project/roselai_228/priyatap/HiC_work/output/hic_results/matrix/SRR1030745/iced/1000000/SRR1030745_1000000_iced.matrix.biases -o /scratch/priyatap/Hicpro_output

The error I got from HiCPro2FitHiC.py is below:

[image: Screen Shot 2020-12-11 at 6 29 45 PM]

Please help me out.
Thank you in advance for your help,
Priya

Advice on converting pairix or cooler format to fithic format

Dear fithic developers

I have a question about how to convert cooler (.cool) or pairix files into the fithic input format.
The main reason I have cool (or pairix) files is that I'm working on an experimental long-read-based Hi-C procedure, so standard Hi-C pipelines like HiC-Pro are not relevant for me, and I don't think I can use the HiCPro2FitHiC scripts provided with the fithic package. Perhaps I could ask some clarifying questions, so that the answers can help other people who have cool files and want to use fithic?

So basically, what I'm trying to do is convert the .cool file into the interactions and fragments files for fithic.
It seems the interactions file can be created with the cooler dump function, specifically

cooler dump --join fubar.cool

which gives a 7-column "matrix" in which the first three columns are bin_i_chromosome, bin_i_start and bin_i_end, the next three describe the interacting bin, and the last column is the contact count between the two bins. So if you average the start and end of each bin, I think the result corresponds to fragmentMid1 and fragmentMid2 in the interactions file. My first question: it seems you don't count the diagonal cells in the interactions file? In other words, should the contact count for bin_i vs. bin_i be left out of the interactions file?

For the fragments file, I think the 2nd and 5th columns can hold dummy values (since it doesn't matter what they are?), and I don't think they are easy to obtain from a cool file. So marginalizedContactCount is the most important column to fill, and based on the description it seems to be just the sum of all contacts for a given bin? So if one had an N x N Hi-C matrix, marginalizedContactCount would simply be the sum of each row?
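
The conversion described above can be sketched roughly as follows. This is a minimal, unofficial illustration and not a fithic utility: the function name is hypothetical, the input is assumed to be the 7-column text produced by `cooler dump --join` (without a header line), and the output rows follow the 5-column layouts visible in the sample files elsewhere on this page (chromosome, extra field, midpoint, marginalized count, mappable flag for fragments; chr1, mid1, chr2, mid2, count for interactions). Keeping same-bin rows, writing 0 in the second fragment column, and setting the mappable flag to 1 whenever the marginal is positive are all assumptions.

```python
from collections import defaultdict


def cooler_dump_to_fithic(lines):
    """Turn `cooler dump --join` rows into FitHiC-style interaction and fragment rows."""
    interactions = []
    marginals = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated rows
        chr1, start1, end1, chr2, start2, end2, count = fields[:7]
        mid1 = (int(start1) + int(end1)) // 2
        mid2 = (int(start2) + int(end2)) // 2
        n = int(count)
        interactions.append((chr1, mid1, chr2, mid2, n))
        # The dump is upper-triangular, so an off-diagonal pair adds to both bins;
        # a same-bin row adds to its single bin once.
        marginals[(chr1, mid1)] += n
        if (chr2, mid2) != (chr1, mid1):
            marginals[(chr2, mid2)] += n
    fragments = [
        (chrom, 0, mid, total, int(total > 0))
        for (chrom, mid), total in sorted(marginals.items())
    ]
    return interactions, fragments


rows = [
    "chr1\t0\t5000\tchr1\t0\t5000\t10",
    "chr1\t0\t5000\tchr1\t5000\t10000\t3",
]
interactions, fragments = cooler_dump_to_fithic(rows)
```

Bins with no contacts at all never appear in the dump, so they would be missing from the fragment rows produced here; if every bin must be listed, the bin table of the cool file would have to be consulted as well.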

Thank you for the help.

merge_filter reduces significant interactions from 12M to 10k

I am using the HiC-Pro/FitHiC tools to call interactions from two Hi-C replicates for a large genome. FitHiC2 at 20kb resolution (40kb-2Mb) found 12M interactions with q-value <0.005 (from the ".spline_pass2.res20000.significances.txt.gz" file). After running the default merge_filter.sh, the number went down to 10k with FDR=0.05. Is this normal, or is it possible that I need to adjust some parameters in CombineNearbyInteraction.py? I could not find details on this step.
Thank you for the great toolkits and help!
Pavla

Fragment length is not consistent with fithic resolution (-r) in test data

When running the test data, I noticed that the resolution (-r) is set to 100000; however, the input file has a fragment size of 1000000.

Here is the test script:

#/fithic/fithic/tests/run_tests-git.sh
#line 46-49
for i in Dixon_IMR90_HindIII_hg19_w100000; do
    python3 ../fithic.py -r 100000 -l "$i" -i $inI/$i.gz -f $inF/$i.gz -b $noOfBins -p $noOfPasses -o outputs/${i}.interOnly -x interOnly
    python3 ../fithic.py -r 100000 -l "$i" -i $inI/$i.gz -f $inF/$i.gz -b $noOfBins -p $noOfPasses -o outputs/${i}.all -x All
done

Here is the test data:

#/fithic/fithic/tests/contactCounts/Dixon_IMR90_HindIII_hg19_w100000.gz
chr10   500000  chr10   500000  13850
chr10   500000  chr10   1500000 3472
chr10   500000  chr10   10500000        370

Here is log file:

#Dixon_IMR90_HindIII_hg19_w100000.fithic.log
Interactions file read successfully
------------------------------------------------------------------------------------
Observed, Intra-chr in range: pairs= 215762      totalCount= 91387585
Observed, Intra-chr all: pairs= 218642   totalCount= 121700752
Observed, Inter-chr all: pairs= 3878618  totalCount= 99952107
Range of observed genomic distances [1000000 249000000]

Making equal occupancy bins
------------------------------------------------------------------------------------
Observed intra-chr read counts in range 91387585
Desired number of contacts per bin      456937.925,
Number of bins  200
Equal occupancy bins generated

Looping through all possible fragment pairs in-range
------------------------------------------------------------------------------------
Chromosome 'chr1',      250 mappable fragments,         -2487765 possible intra-chr fragment pairs in range,    715750 possible inter-chr fragment pairs
Chromosome 'chr10',     136 mappable fragments,         -733191 possible intra-chr fragment pairs in range,     404872 possible inter-chr fragment pairs
Chromosome 'chr11',     136 mappable fragments,         -733191 possible intra-chr fragment pairs in range,     404872 possible inter-chr fragment pairs
Chromosome 'chr12',     134 mappable fragments,         -711689 possible intra-chr fragment pairs in range,     399186 possible inter-chr fragment pairs
Chromosome 'chr13',     116 mappable fragments,         -532571 possible intra-chr fragment pairs in range,     347652 possible inter-chr fragment pairs
Chromosome 'chr14',     108 mappable fragments,         -461283 possible intra-chr fragment pairs in range,     324540 possible inter-chr fragment pairs
Chromosome 'chr15',     103 mappable fragments,         -419328 possible intra-chr fragment pairs in range,     310030 possible inter-chr fragment pairs
Chromosome 'chr16',     91 mappable fragments,  -326796 possible intra-chr fragment pairs in range,     275002 possible inter-chr fragment pairs
Chromosome 'chr17',     82 mappable fragments,  -264957 possible intra-chr fragment pairs in range,     248542 possible inter-chr fragment pairs
Chromosome 'chr18',     79 mappable fragments,  -245784 possible intra-chr fragment pairs in range,     239686 possible inter-chr fragment pairs
Chromosome 'chr19',     60 mappable fragments,  -141075 possible intra-chr fragment pairs in range,     183180 possible inter-chr fragment pairs
Chromosome 'chr2',      244 mappable fragments,         -2369499 possible intra-chr fragment pairs in range,    700036 possible inter-chr fragment pairs
Chromosome 'chr20',     64 mappable fragments,  -160719 possible intra-chr fragment pairs in range,     195136 possible inter-chr fragment pairs
Chromosome 'chr21',     49 mappable fragments,  -93654 possible intra-chr fragment pairs in range,      150136 possible inter-chr fragment pairs
Chromosome 'chr22',     52 mappable fragments,  -105627 possible intra-chr fragment pairs in range,     159172 possible inter-chr fragment pairs
Chromosome 'chr3',      199 mappable fragments,         -1574304 possible intra-chr fragment pairs in range,    579886 possible inter-chr fragment pairs
Chromosome 'chr4',      192 mappable fragments,         -1465167 possible intra-chr fragment pairs in range,    560832 possible inter-chr fragment pairs
Chromosome 'chr5',      181 mappable fragments,         -1301586 possible intra-chr fragment pairs in range,    530692 possible inter-chr fragment pairs
Chromosome 'chr6',      172 mappable fragments,         -1174947 possible intra-chr fragment pairs in range,    505852 possible inter-chr fragment pairs
Chromosome 'chr7',      160 mappable fragments,         -1016175 possible intra-chr fragment pairs in range,    472480 possible inter-chr fragment pairs
Chromosome 'chr8',      147 mappable fragments,         -857172 possible intra-chr fragment pairs in range,     436002 possible inter-chr fragment pairs
Chromosome 'chr9',      142 mappable fragments,         -799617 possible intra-chr fragment pairs in range,     421882 possible inter-chr fragment pairs
Chromosome 'chrX',      156 mappable fragments,         -965811 possible intra-chr fragment pairs in range,     461292 possible inter-chr fragment pairs
Chromosome 'chrY',      60 mappable fragments,  -141075 possible intra-chr fragment pairs in range,     183180 possible inter-chr fragment pairs
Number of all fragments= 3113
Possible, Intra-chr in range: pairs= -19082983 
Possible, Intra-chr all: pairs= 241996.0 
Possible, Inter-chr all: pairs= 4604945.0 
Desired genomic distance range   [0 inf] 
Range of possible genomic distances  [100000  249450000] 
Baseline intrachromosomal probability is 4.13229970743318e-06 
Interchromosomal probability is 2.1715785964870374e-07 
5th quantile of biases: 0.57080572791248
50th quantile of biases: 1.01076079547
95th quantile of biases: 1.20269227401
Out of 3053 loci 85 were discarded with biases not in range [0.5 2]


Calculating probability means and standard deviations of contact counts
------------------------------------------------------------------------------------
Means and error written to outputs/Dixon_IMR90_HindIII_hg19_w100000.all/Dixon_IMR90_HindIII_hg19_w100000.fithic_pass1.res100000.txt


Fitting a univariate spline to the probability means
-----------------------------------------------------------------------------------
Spline successfully fit

The line 'Possible, Intra-chr in range: pairs= -19082983' looks wrong, since a pair count should not be negative. If I set -r to 1000000, the 'Intra-chr in range: pairs= ' value is a positive number and the number of significant interactions drops greatly. Shouldn't the resolution parameter (fithic -r) be the same as the fragment length of the input (Dixon_IMR90_HindIII_hg19_w100000.gz)?
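
The negative counts in the log can be reproduced with a small piece of arithmetic. The sketch below is a reconstruction, not code taken from fithic: it assumes that in fixed-size mode the possible pairs are counted by stepping the genomic distance in units of -r up to (last midpoint - r/2), with (n - d) pairs at the d-th step, where n is the number of mappable fragments. The function name and the last-midpoint values (1 Mb bins starting at 500000, as in the quoted contact rows) are assumptions. When the bins are really 1 Mb apart but -r is 100000, most steps exceed the number of fragments and contribute negative terms.

```python
def possible_intra_pairs(n_frags, last_mid, resolution):
    """Count pairs as sum over d = 1..steps of (n_frags - d).

    steps is the number of multiples of `resolution` that fit below
    (last_mid - resolution / 2). Hypothetical reconstruction only.
    """
    steps = (last_mid - resolution // 2) // resolution
    return sum(n_frags - d for d in range(1, steps + 1))


# 60 fragments, assumed last midpoint 59500000 (matches the chrY and chr19 log lines)
mismatch_small = possible_intra_pairs(60, 59_500_000, 100_000)
# 250 fragments, assumed last midpoint 249500000 (matches the chr1 log line)
mismatch_large = possible_intra_pairs(250, 249_500_000, 100_000)
# Same 60 fragments with -r equal to the real bin size
consistent = possible_intra_pairs(60, 59_500_000, 1_000_000)
```

With -r set to the actual bin size, the count for 60 fragments becomes 60 * 59 / 2, the ordinary number of unordered pairs, which fits the observation that -r 1000000 yields a positive total.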

list index out of range

The following are the three input files:

$ zcat fat_5000.fithic.fragment.gz |head
NC_052532.1	0	2500	0	0
NC_052532.1	0	7500	0	0
NC_052532.1	0	12500	1106	1
NC_052532.1	0	17500	3828	1
NC_052532.1	0	22500	7946	1
NC_052532.1	0	27500	1786	1
NC_052532.1	0	32500	11554	1
NC_052532.1	0	37500	4999	1
NC_052532.1	0	42500	7694	1
NC_052532.1	0	47500	10932	1
$ zcat fat_5000.fithic.interaction.gz |head
NC_052532.1	12500	NC_052532.1	12500	113
NC_052532.1	12500	NC_052532.1	17500	15
NC_052532.1	12500	NC_052532.1	22500	4
NC_052532.1	12500	NC_052532.1	27500	1
NC_052532.1	12500	NC_052532.1	32500	1
NC_052532.1	12500	NC_052532.1	42500	1
NC_052532.1	12500	NC_052532.1	47500	5
NC_052532.1	12500	NC_052532.1	52500	2
NC_052532.1	12500	NC_052532.1	62500	2
NC_052532.1	12500	NC_052532.1	67500	3
$ zcat fat_5000.fithic.bias.gz|head
NC_052532.1	2500	0.447834
NC_052532.1	7500	0.098977
NC_052532.1	12500	0.150248
NC_052532.1	17500	0.374007
NC_052532.1	22500	0.563625
NC_052532.1	27500	0.239352
NC_052532.1	32500	0.588517
NC_052532.1	37500	0.492011
NC_052532.1	42500	0.661867
NC_052532.1	47500	0.819888

And the following are my error reporting messages:

GIVEN FIT-HI-C ARGUMENTS
=========================
Reading fragments file from: /home/SLY68/2022/hic/juicer/down_analysis/raw/fit-hic2/fat_5000.fithic.fragment.gz
Reading interactions file from: /home/SLY68/2022/hic/juicer/down_analysis/raw/fit-hic2/fat_5000.fithic.interaction.gz
Output path created ./interOnly/
Fixed size option detected... Fast version of FitHiC will be used
Resolution is 5.0 kb
Reading bias file from: /home/SLY68/2022/hic/juicer/down_analysis/raw/fit-hic2/fat_5000.fithic.bias.gz
The number of spline passes is 2
The number of bins is 100
The number of reads required to consider an interaction is 1
The name of the library for outputted files will be FitHiC
Upper Distance threshold is inf
Lower Distance threshold is 0
Graphs will be outputted
Only inter-chromosomal regions will be analyzed
Lower bound of bias values is 0.5
Upper bound of bias values is 2
All arguments processed. Running FitHiC now...
=========================


Reading the contact counts file to generate bins...
Interactions file read. Time took 2983.205014705658
Fragments file read. Time took 0.9499287605285645
Traceback (most recent call last):
  File "/home/SLY68/anaconda3/envs/hicpro/bin/fithic", line 8, in <module>
    sys.exit(main())
  File "/home/SLY68/anaconda3/envs/hicpro/lib/python3.7/site-packages/fithic/fithic.py", line 327, in main
    biasDic = read_biases(biasFile)
  File "/home/SLY68/anaconda3/envs/hicpro/lib/python3.7/site-packages/fithic/fithic.py", line 808, in read_biases
    chrom=words[0]; midPoint=int(words[1]); bias=float(words[2])
IndexError: list index out of range
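
The traceback fails while indexing words[2] in read_biases, which suggests that at least one line of the bias file has fewer than three fields (for example a blank or truncated line further down than the first ten shown by head). That is only a plausible reading of the traceback, not a confirmed diagnosis. A minimal sketch for locating such lines follows; the function names are hypothetical, and whitespace splitting is an assumption about how the file is parsed.

```python
import gzip


def find_malformed_bias_lines(lines, expected_fields=3):
    """Return (line_number, text) for every line with too few whitespace-separated fields."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if len(line.split()) < expected_fields:
            problems.append((number, line.rstrip("\n")))
    return problems


def check_bias_file(path):
    """Scan a gzipped bias file and report the lines that would break a 3-field parser."""
    with gzip.open(path, "rt") as handle:
        return find_malformed_bias_lines(handle)


sample = ["NC_052532.1\t2500\t0.447834\n", "\n", "NC_052532.1\t7500\n"]
bad_lines = find_malformed_bias_lines(sample)
```

Running check_bias_file on the real bias file would list the offending line numbers, which can then be removed or regenerated before rerunning fithic.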

--visual flag has cutoff images

Hi, I noticed that when I use the --visual flag to produce plots, all the plots are slightly cut off. I imagine this is a matplotlib issue when saving the images; maybe adding plt.tight_layout() or something similar before saving the image would help? (Unless it's being drawn with R?)

OverflowError: cannot convert float infinity to integer

Hi! I hope you are doing well. I am receiving this error when I try different upper bound values while running FitHiC:

(fithic) [murthys3 FitHiC]$ fithic -f test/fragmentLists/Dixon_hESC_HindIII_hg18_w40000_chr1.gz -i test/contactCounts/Dixon_hESC_HindIII_hg18_w40000_chr1.gz -o test/tested_outputs_again -L 2 -U 1000 -p 2 -b 100 -r 40000 -x intraOnly
.
.
.
.
.
.
.
.
Traceback (most recent call last):
  File "/home/murthys3/.local/bin/fithic", line 10, in <module>
    sys.exit(main())
  File "/home/murthys3/.local/lib/python3.6/site-packages/fithic/fithic.py", line 310, in main
    (binStats,noOfFrags, maxPossibleGenomicDist, possibleIntraInRangeCount, possibleInterAllCount, interChrProb, baselineIntraChrProb)= generate_FragPairs(binStats, fragsFile, resolution)
  File "/home/murthys3/.local/lib/python3.6/site-packages/fithic/fithic.py", line 654, in generate_FragPairs
    log.write("Range of possible genomic distances [%d %d] \n" % (minPossibleGenomicDist, maxPossibleGenomicDist)),
OverflowError: cannot convert float infinity to integer

It seems like for the fragment file, there is an upper threshold set as infinity that cannot be converted to an integer. Without the upper threshold, the FitHiC program runs successfully. Do you know how I can fix this? This seems to be the case regardless of the input file I have selected.

Thanks.
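
One possible explanation, offered as a guess rather than a confirmed diagnosis: the command uses -U 1000 with 40 kb bins, so no fragment pair can fall inside the range [2, 1000], and a running minimum that starts at infinity is never updated before it is formatted with %d. The sketch below reproduces that failure mode in isolation; the function and variable names are hypothetical and are not taken from fithic.

```python
def min_distance_in_range(distances, lower, upper):
    """Smallest distance within [lower, upper]; stays at infinity if none qualifies."""
    smallest = float("inf")
    for d in distances:
        if lower <= d <= upper:
            smallest = min(smallest, d)
    return smallest


# Distances between 40 kb bins are multiples of 40000, so none is <= 1000.
bin_distances = [40000 * k for k in range(1, 6)]
smallest = min_distance_in_range(bin_distances, lower=2, upper=1000)

try:
    message = "Range starts at %d" % smallest
except OverflowError as exc:
    message = "OverflowError: %s" % exc
```

If this reading is right, choosing -L and -U in base pairs so that the range spans at least one bin-to-bin distance (for example an upper bound of several multiples of the 40000 resolution) should avoid the error.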

How to get different contacts?

I called loops for normal and cancer samples using fithic, and I want to compare the results of the two samples. Could you give me some guidance, or suggest software that can do this comparison? Thank you!

Generate biasfile/KRnorm vector

Hello,
What would you recommend for generating the bias file (i.e. KRnorm) from the Hi-C interaction matrix? All the tools I have found directly compute the KR-normalized matrix, but as far as I understand, FitHiC requires the raw matrix and the bias vector. I have tried retrieving the bias vector from the diagonal of the balanced matrix, but I was wondering if there is a more straightforward approach you would recommend.
Thanks!
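
For reference, the diagonal-based approach mentioned in the question can be sketched as below. It assumes the convention balanced[i][j] = raw[i][j] / (b[i] * b[j]), so that b[i] = sqrt(raw[i][i] / balanced[i][i]); whether FitHiC expects this vector, its reciprocal, or a differently scaled version is not established here and should be checked. The function names are hypothetical. The rescaling to mean 1 is also an assumption, motivated only by the default bias bounds of 0.5 and 2 shown in the logs on this page.

```python
import math


def biases_from_diagonal(raw_diag, balanced_diag):
    """Per-bin bias from the diagonals of the raw and balanced matrices.

    Bins whose diagonal entry is zero in either matrix get None.
    """
    biases = []
    for raw, balanced in zip(raw_diag, balanced_diag):
        if raw > 0 and balanced > 0:
            biases.append(math.sqrt(raw / balanced))
        else:
            biases.append(None)
    return biases


def rescale_to_mean_one(biases):
    """Divide the defined biases by their mean so they are centred on 1."""
    valid = [b for b in biases if b is not None]
    if not valid:
        return list(biases)
    mean = sum(valid) / len(valid)
    return [None if b is None else b / mean for b in biases]


raw_biases = biases_from_diagonal([8, 18, 0], [2, 2, 0])
scaled_biases = rescale_to_mean_one(raw_biases)
```

Tools that expose the balancing weights directly would avoid the detour through the diagonal, which fails for any bin with an empty diagonal entry.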
