pinellolab / crisprme
License: Other
Hi @samuelecancellieri ,
Thank you and your colleagues for developing such an invaluable tool.
I had a first trial with CRISPRme using Docker. It seemed to work and produced lots of results, which I have been trying to understand.
To be exact, I would love to understand the columns in *altMerge.txt.bestCFD.txt, *bestMerge.txt and final_results_*.bestMerge.txt.bestCFD.txt.*.
I am particularly interested in understanding the columns coming from the *altMerge.txt file, which are listed below.
Moreover, if I am going to select the most likely off-targets, which file and which metrics should I rely on to make choices?
Thanks a lot in advance; your help will be greatly appreciated.
#######################columns from *altMerge.txt file ##########
PAM_gen
Var_uniq
#Seq_in_cluster
CFD_ref
Highest_CFD_Risk_Score
Highest_CFD_Absolute_Risk_Score
MMBLG_PAM_gen
MMBLG_CFD
MMBLG_CFD_ref
MMBLG_CFD_Risk_Score
MMBLG_CFD_Absolute_Risk_Score
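One way to shortlist candidate off-targets from a report like this is to load it as a table and sort on a score column. The sketch below assumes the file is tab-delimited with a header row; the column names come from the list above, the helper name top_by_score and the demo rows are made up for illustration, and which column is the right ranking criterion remains a question for the maintainers.

```python
import csv
import io


def top_by_score(handle, score_column, n=10):
    """Return the n rows with the highest numeric value in score_column."""
    reader = csv.DictReader(handle, delimiter="\t")
    scored = []
    for row in reader:
        try:
            score = float(row[score_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric score
        scored.append((score, row))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:n]]


# Tiny made-up table, for illustration only (not real CRISPRme output).
demo = io.StringIO(
    "Var_uniq\tCFD_ref\tHighest_CFD_Risk_Score\n"
    "n\t0.10\t0.00\n"
    "y\t0.05\t0.40\n"
    "n\t0.30\tNA\n"
)
best = top_by_score(demo, "Highest_CFD_Risk_Score", n=1)
print(best[0]["Var_uniq"])
```

Rows whose score cannot be parsed are skipped rather than treated as zero, so a placeholder such as NA never outranks a real value.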
Hello,
We're looking into CRISPRme and I'm wondering if we could get clarification on a few details.
(1) If there are multiple indels within one guide's length of each other that together introduce a new off-target, will CRISPRme find it?
(2) How does CRISPRme deal with PAMless enzymes? Making a TST containing the entire genome seems computationally expensive.
(3) Is there an allele frequency filter that is applied? If not, how does the computation time not balloon with larger reference panels?
(4) Is it correct that bulges are not allowed in the PAM?
Thank you,
Katie
Tried using the webtool to test the example guide targeting BCL11A in the paper, Human genetic diversity alters off-target outcomes of therapeutic gene editing (https://doi.org/10.1038/s41588-022-01257-y)
CRISPRme does identify the CPS1 off-target site as demonstrated in the paper at location chr2:210530658 but it is not annotated with the CPS1 gene or rsID.
Hello,
Thank you for creating this useful tool. I had a quick question: if I ran CRISPRme with parameters X on the web portal and then ran it locally with the same parameters X, should I expect the integrated_results file that is output to be identical? I was specifically wondering if there was any output truncation that happened on the web portal for efficiency even with the integrated_results file (which I think is the complete off-target set).
Thank you
Hi @samuelecancellieri and @kclem -- Is it possible to post links to the converted CRISPRme gnomAD data (e.g. gnomad.genomes.v3.1.sites.chr*.collapsed.vcf.gz)?
Thank you!
Dear team,
We are trying to install this software and are somehow stuck at this step.
conda create -n crisprme python=3.8 crisprme -y
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: |
Could you please help us solve this?
Regards.
Najeeb
Hi Crisprme team,
Thanks for the amazing tool! I am using the latest version, 2.10.0, on conda. I am testing CRISPRme for now, but will most likely get an industry license very soon. Please let me know how that works.
I have a few questions/requests for now.
In which files/tables can I find information about the off-targets' closest gene(s), and whether an off-target lies in a coding region, an intronic region, or a regulatory element? What about gene-type information, such as whether the gene is a tumor suppressor, or whether there is a PAM creation? Where is this information stored? Right now, I am only using the 1000G variants.
Are the reported CFD scores based on homology only, or are they more sophisticated and dependent on where the mismatch occurs?
Lastly, and more importantly, I would like to see the impact of genetic variation on on-target specificity. Will genetic variation reduce the chance of being on target? Are there sub-populations in which the on-target PAM is disabled? This is the type of information I would like to see in the output. I hope this is fairly easy to implement.
Looking forward to hearing from you,
-Davood
Describe the bug
Terminal output:
$ bash crisprme_auto_test_conda.sh
starting download and unzip of data
unzip gencode+encode annotations
start download VCF data and genome (this may take a long time due to connection speed)
download 1000G VCFs
crisprme_auto_test_conda.sh: line 26: 40706 Abort trap: 6 wget -c -q ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/release/20190312_biallelic_SNV_and_INDEL/ALL.chr$i.shapeit2_integrated_snvindels_v2a_27022019.GRCh38.phased.vcf.gz
download hg38
start testing
Launching job /Users/sdm8/Desktop/crisprme_test/crisprme_test/Results/sg1617.6.2.2. The stdout is redirected in log_verbose.txt and stderr is redirected in log_error.txt
Traceback (most recent call last):
File "/Users/sdm8/opt/anaconda3/envs/crisprme/bin/crisprme.py", line 934, in <module>
complete_search()
File "/Users/sdm8/opt/anaconda3/envs/crisprme/bin/crisprme.py", line 673, in complete_search
raise OSError(f"\nCRISPRme run failed! See {os.path.join(outputfolder, 'log_error.txt')} for details\n")
OSError:
CRISPRme run failed! See /Users/sdm8/Desktop/crisprme_test/crisprme_test/Results/sg1617.6.2.2/log_error.txt for details
log_error.txt
[W::bcf_sr_add_reader] No BGZF EOF marker; file '/Users/sdm8/Desktop/crisprme_test/crisprme_test/VCFs/hg38_1000G/ALL.chr7.shapeit2_integrated_snvindels_v2a_27022019.GRCh38.phased.vcf.gz' may be truncated
mv: rename /Users/sdm8/Desktop/crisprme_test/crisprme_test/Genomes/variants_genome/SNPs_genome/hg38_enriched/ to ./hg38+hg38_1000G/: No such file or directory
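The "No BGZF EOF marker" warning above points at an incomplete download, which fits the aborted wget earlier in the terminal output. A quick way to screen the downloaded VCFs before starting a run is to check whether each file ends with the 28-byte empty BGZF block that bgzip appends. This is a minimal sketch; the helper name has_bgzf_eof and the demo files are illustrative and not part of CRISPRme.

```python
import os
import tempfile

# The 28-byte empty BGZF block written at the end of a complete bgzip file.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00"
    " 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file block."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF


# Demo with two synthetic files: one complete, one cut short.
workdir = tempfile.mkdtemp()
complete = os.path.join(workdir, "complete.vcf.gz")
truncated = os.path.join(workdir, "truncated.vcf.gz")
with open(complete, "wb") as handle:
    handle.write(b"payload" + BGZF_EOF)
with open(truncated, "wb") as handle:
    handle.write(b"payload" + BGZF_EOF[:10])

print(has_bgzf_eof(complete), has_bgzf_eof(truncated))
```

A file that fails this check should be downloaded again. Passing it only shows that the tail is intact, not that the whole file is valid.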
To Reproduce
I had to install like this without specifying python version as described in README because I kept getting an error message pertaining to version one of the program:
Encountered problems while solving:
- package crisprme-1.0.1-0 requires crispritz, but none of the providers can be installed
Installed via:
mamba install -c bioconda crisprme
## missing module Bio -
mamba install biopython
I had to download the test data off my VPN (federal govt network). I downloaded it via:
wget https://www.dropbox.com/s/urciozkana5md0z/crisprme_test.tar.gz?dl=1 -O crisprme_test.tar.gz
tar -xvf crisprme_test.tar.gz
Environment (please complete the following information, ONLY applicable if running CRISPRme via command line):
Describe the bug
Status report
Indexing genome(s): Not available
Searching spacer: Not available
Post processing: Not available
Merge targets: Not available
Annotating and generating images: Not available
Integrating results: Not available
Populating database: Not available
The selected result encountered some errors, please remove it and try to submit again
To Reproduce
If running CRISPRme via command line, type the command line call to CRISPRme returning the error
If running CRISPRme via the website, please fill the form below:
Spacer sequences
AGACAGATATTTGCATTGAGATA
Cas protein
Cas12a
PAM
TTTV-23bp-Cas12a
Genome
Hg38
Variants dataset (OPTIONAL)
plus 1000 Genomes Project variants
plus HGDP variants
Thresholds
Mismatches: 6
DNA Bulges: 2
RNA Bulges: 2
Base editing (OPTIONAL)
Start: 7
Stop: 15
Nucleotide: A
Annotation
Expected behavior
I get an error and no results
Screenshots
If running CRISPRme via website, add screenshots to help explain your problem.
Environment (please complete the following information, ONLY applicable if running CRISPRme via command line):
Additional context
I ran your test dataset with Cas9 and had no issues. I wonder if it has something to do with the Cas12a; I also tried different gRNAs for Cas12a with the same outcome
Hi all,
Thanks for the tool! I have a few questions
Are the _alt and _random files from the hg38 genome used as part of the search with the standard VCF sets? If so, can you explain how they are identified in the output?
I am interested in using some other VCF datasets. Can you provide any information on what the tool supports for VCF files? I am planning to format them the same way as your script for cleaning up the gnomAD 3.1 dataset does, but I would prefer not to convert them to multi-allelic records as you do, because of the information loss. Is that required?
Thanks,
Tom
Thank you for this great tool!
I run this command:
nohup crisprme.py complete-search --genome Genomes/hg38/ --vcf list_vcf.txt/ --guide New_guides.txt --pam PAMs/20bp-NGG-SpCas9.txt --annotation Annotations/encode+gencode.hg38.bed --samplesID list_samplesID.txt --gene_annotation Annotations/gencode.protein_coding.bed --bMax 2 --mm 6 --bDNA 2 --bRNA 2 --merge 3 --output New --thread 64 &
I use --thread 64, but the code is still only using 4 cores. How do I change the number of cores utilized?
Also, what does --merge do? I couldn't find that in the --help section.
Thanks again,
-Davood
Describe the bug
When running the complete-search command there is a Memory Error in the Enrichment step. The folder Genomes/hg38+hg38_1000G, which contains the enriched reference FASTAs (e.g. chr3.enriched.fa), is missing the two for the largest chromosomes: chr1.enriched.fa and chr2.enriched.fa. Both chr1.fa and chr2.fa are present in the Genomes/hg38 folder.
To Reproduce
Running final command in the crisprme_auto_test_conda.sh.
command:
crisprme.py complete-search --genome Genomes/hg38/ --vcf list_vcf.txt/ --guide sg1617.txt --pam PAMs/20bp-NGG-SpCas9.txt --annotation Annotations/encode+gencode.hg38.bed --samplesID list_samplesID.txt --gene_annotation Annotations/gencode.protein_coding.bed --bMax 2 --mm 6 --bDNA 2 --bRNA 2 --merge 3 --output sg1617.6.2.2 --thread 4
Expected behavior
Expect a list of off-target sites and some images. Instead some files are created but none including a full list of the off-target sites. No images are created in the img directory.
Environment (please complete the following information, ONLY applicable if running CRISPRme via command line):
Additional context
Running on a c4.4xlarge EC2 instance with a 2 TB volume.
I am running on a clean Docker image with the test script. It gets far into the process and then gets an error.
log_error.txt contains :
./merge_close_targets_cfd.sh: line 34: 3282 Killed python remove_contiguous_samples_cfd.py $fileIn $fileOut $thresh $chrom $position $total $true_guide $snp_info $cfd $sort_pivot $sorting_criteria_scoring $sorting_criteria
CRISPRme ERROR: contigous SNP removal failed (script: ./merge_close_targets_cfd.sh line 31)
./merge_close_targets_cfd.sh: line 34: 3281 Killed python remove_contiguous_samples_cfd.py $fileIn $fileOut $thresh $chrom $position $total $true_guide $snp_info $cfd $sort_pivot $sorting_criteria_scoring $sorting_criteria
CRISPRme ERROR: contigous SNP removal failed (script: ./merge_close_targets_cfd.sh line 31)
Traceback (most recent call last):
File "/opt/conda/opt/crisprme/PostProcess/remove_contiguous_samples_cfd.py", line 662, in
merge_targets()
File "/opt/conda/opt/crisprme/PostProcess/remove_contiguous_samples_cfd.py", line 618, in merge_targets
int(target_data[input_args[4]]),
ValueError: invalid literal for int() with base 10: '+'
CRISPRme ERROR: contigous SNP removal failed (script: ./merge_close_targets_cfd.sh line 31)
mv: cannot stat '/Test123/Results/sg1617_2/final_results_sg1617_2.bestMerge.txt.bestCFD.txt.trimmed': No such file or directory
I can add the full verbose log file if that would help, but here is the end that corresponds to the error...
Sorting file
Sorting file
Sorting file
Sorting done in 0 seconds
Sorting done in 0 seconds
Sorting done in 0 seconds
Merging contiguous targets
Merging contiguous targets
Merging contiguous targets
Any help tracing this down would be helpful, I am looking forward to running the tool on a real sequence!
Let me know if any other information would be helpful...
Tom
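The ValueError in the traceback says that a field expected to hold an integer contained '+' instead, which would be consistent with a strand column sitting at the index the script was told to read. That is only one possible reading of the log. To narrow it down, the intermediate file can be scanned for rows whose field at a given index is not an integer. The helper name non_integer_fields, the column index and the demo rows below are all hypothetical.

```python
def non_integer_fields(lines, column, delimiter="\t"):
    """Yield (line_number, value) for rows whose field at `column` is not an int."""
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        value = fields[column] if column < len(fields) else None
        try:
            int(value)
        except (TypeError, ValueError):
            yield number, value


# Made-up rows: the second one has the strand where a position is expected.
rows = [
    "chr1\t100\t+\n",
    "chr1\t+\t100\n",
    "chr2\n",
]
problems = list(non_integer_fields(rows, column=1))
print(problems)
```

Rows that are too short are reported with a value of None, so missing fields and wrong fields show up in the same pass.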
Hi,
Is it possible to run CRISPRme on a single chromosome? Ideally keeping all chromosomes in the same Genomes/ folder.
Thanks!
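A possible workaround, pending an answer from the maintainers, is to pass --genome a separate folder that holds only the chromosomes of interest, in the spirit of the Genomes/hg38_chr22/ folder used in another report here. The sketch below builds such a folder from symbolic links so the FASTA files are not copied. It assumes files are named like chr22.fa; the helper name make_subset_genome is hypothetical, and whether CRISPRme follows symlinks (especially inside a Docker mount) has not been verified.

```python
import os
import tempfile


def make_subset_genome(source_dir, target_dir, chromosomes):
    """Link the selected <chrom>.fa files from source_dir into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    linked = []
    for chrom in chromosomes:
        src = os.path.abspath(os.path.join(source_dir, f"{chrom}.fa"))
        if not os.path.isfile(src):
            raise FileNotFoundError(src)
        dst = os.path.join(target_dir, f"{chrom}.fa")
        if not os.path.lexists(dst):
            os.symlink(src, dst)
        linked.append(dst)
    return linked


# Demo on a throwaway directory with a fake one-line FASTA.
root = tempfile.mkdtemp()
full = os.path.join(root, "hg38")
os.makedirs(full)
with open(os.path.join(full, "chr22.fa"), "w") as handle:
    handle.write(">chr22\nACGT\n")

links = make_subset_genome(full, os.path.join(root, "hg38_chr22"), ["chr22"])
print(links)
```

If symlinks turn out not to work in a given setup, copying the files into the subset folder is the fallback.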
Hi, I'm really excited to use CRISPRme and think it will be really useful for my project. I was hoping to use the website since I had tested it out before, but recently the jobs I submit don't seem to be updating. Do you know if and when the website will be up and running again to process jobs? Thanks!
Hi!
I am trying to run CRISPRme on chr22 with no VCF files, but the run fails with the error below. Is there any required parameter that I might be missing? Thanks!
To Reproduce
$ crisprme.py complete-search --genome Genomes/hg38_chr22/ --guide sg1617.txt --pam PAMs/20bp-NGG-SpCas9.txt --bMax 2 --mm 6 --bDNA 2 --bRNA 2 --merge 3 --output output --thread 12
--annotation not used
Launching job /data2/crisprme_test/Results/output. The stdout is redirected in log_verbose.txt and stderr is redirected in log_error.txt
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/bin/crisprme.py", line 934, in <module>
complete_search()
File "/home/ubuntu/miniconda3/envs/crisprme/bin/crisprme.py", line 673, in complete_search
raise OSError(f"\nCRISPRme run failed! See {os.path.join(outputfolder, 'log_error.txt')} for details\n")
OSError:
CRISPRme run failed! See /data2/crisprme_test/Results/output/log_error.txt for details
$ head -50 /data2/crisprme_test/Results/output/log_error.txt
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3081, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rsID'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./remove_n_and_dots.py", line 29, in <module>
chunk['rsID'] = chunk['rsID'].str.replace('.', 'NA')
File "/home/ubuntu/miniconda3/envs/crisprme/lib/python3.8/site-packages/pandas/core/frame.py", line 3024, in __getitem__
indexer = self.columns.get_loc(key)
File "/home/ubuntu/miniconda3/envs/crisprme/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3083, in get_loc
raise KeyError(key) from err
KeyError: 'rsID'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3081, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rsID'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./remove_n_and_dots.py", line 29, in <module>
chunk['rsID'] = chunk['rsID'].str.replace('.', 'NA')
File "/home/ubuntu/miniconda3/envs/crisprme/lib/python3.8/site-packages/pandas/core/frame.py", line 3024, in __getitem__
indexer = self.columns.get_loc(key)
File "/home/ubuntu/miniconda3/envs/crisprme/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3083, in get_loc
raise KeyError(key) from err
KeyError: 'rsID'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3081, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'rsID'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
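The KeyError above shows that the table had no rsID column at the point where the script tried to rewrite it, which is plausible for a run without VCFs. A defensive version of that step would touch the column only when it exists. The sketch below uses plain dictionaries rather than pandas and replaces a value that is exactly "."; it illustrates the guard and is not the code in remove_n_and_dots.py.

```python
def replace_dots(rows, column="rsID", placeholder="NA"):
    """Replace '.' in `column` with a placeholder, skipping rows without it."""
    cleaned = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is left untouched
        if column in row and row[column] == ".":
            row[column] = placeholder
        cleaned.append(row)
    return cleaned


# Made-up rows: with variants (rsID present) and reference-only (no rsID).
with_variants = [{"Chromosome": "chr22", "rsID": "."}]
reference_only = [{"Chromosome": "chr22"}]

print(replace_dots(with_variants))
print(replace_dots(reference_only))
```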
Describe the bug
Trying to download test data to test my installation:
$ wget https://www.dropbox.com/s/urciozkana5md0z/crisprme_test.tar.gz?dl=1 -O crisprme_test.tar.gz
--2023-04-04 15:06:23-- https://www.dropbox.com/s/urciozkana5md0z/crisprme_test.tar.gz?dl=1
Resolving www.dropbox.com (www.dropbox.com)... 162.125.8.18
Connecting to www.dropbox.com (www.dropbox.com)|162.125.8.18|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: /s/dl/urciozkana5md0z/crisprme_test.tar.gz [following]
--2023-04-04 15:06:23-- https://www.dropbox.com/s/dl/urciozkana5md0z/crisprme_test.tar.gz
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc9473414cea25b32ec2d6ecc563.dl.dropboxusercontent.com/cd/0/get/B5gpbbCTnzZCiZiieBea8pMWQGbJ23D63L7ejpA053ciEAlNmWDQHGedfiSn7kZEdplAXkoEaZRB9OtFblQ19ZPUT5UjucX52-9NzWR3aBcMec9NK5PvnCdXdFxAJoGgK_BsGDswJW0BdEgwdASC2OveifZdcFvCQ_IMrmBucgU5MBchSE4NLe_s8dmw2Ge2llM/file?dl=1# [following]
--2023-04-04 15:06:24-- https://uc9473414cea25b32ec2d6ecc563.dl.dropboxusercontent.com/cd/0/get/B5gpbbCTnzZCiZiieBea8pMWQGbJ23D63L7ejpA053ciEAlNmWDQHGedfiSn7kZEdplAXkoEaZRB9OtFblQ19ZPUT5UjucX52-9NzWR3aBcMec9NK5PvnCdXdFxAJoGgK_BsGDswJW0BdEgwdASC2OveifZdcFvCQ_IMrmBucgU5MBchSE4NLe_s8dmw2Ge2llM/file?dl=1
Resolving uc9473414cea25b32ec2d6ecc563.dl.dropboxusercontent.com (uc9473414cea25b32ec2d6ecc563.dl.dropboxusercontent.com)... 127.0.0.1
Connecting to uc9473414cea25b32ec2d6ecc563.dl.dropboxusercontent.com (uc9473414cea25b32ec2d6ecc563.dl.dropboxusercontent.com)|127.0.0.1|:443... failed: Connection refused.
Tried via browser as well.
This address: https://www.dropbox.com/s/urciozkana5md0z/crisprme_test.tar.gz?dl=1
This site can’t be reached. ucbbeffe6fcc5ac3d70c313358f9.dl.dropboxusercontent.com refused to connect.
Try:
Checking the connection
Checking the proxy and the firewall
ERR_CONNECTION_REFUSED
Tried going up a directory and I can see the directory crisprme_test, but when I try to download it, it produces an error. The same happens when I go into the directory and try to download any of the files. A red error message at the top says "There was an error downloading your file."
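In the wget transcript above, the download host resolves to 127.0.0.1, a loopback address, so the connection never leaves the local machine. That usually points at a local resolver, hosts file or network filter redirecting the hostname rather than at the server. A small check like the following can flag such addresses; the helper name is_loopback is illustrative.

```python
import ipaddress


def is_loopback(address):
    """Return True if the IP address string refers to the local machine."""
    return ipaddress.ip_address(address).is_loopback


# Addresses copied from the wget output above.
print(is_loopback("127.0.0.1"))     # download host, as resolved locally
print(is_loopback("162.125.8.18"))  # www.dropbox.com
```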
Hello,
I tried to launch the test command:
docker run -v ${PWD}:/DATA -w /DATA -i scancellieri/crisprme crisprme.py complete-search --genome Genomes/hg38/ --vcf list_vcf.txt/ --guide sg1617.txt --pam PAMs/20bp-NGG-SpCas9.txt --annotation Annotations/encode+gencode.hg38.bed --samplesID list_samplesID.txt --gene_annotation Annotations/gencode.protein_coding.bed --bMax 2 --mm 6 --bDNA 2 --bRNA 2 --merge 3 --output sg1617.6.2.2 --thread 4
But received error:
The folder specified for --vcf does not exist
This is what the test directory looks like (command tree -L 2)
├── Annotations
│ ├── encode+gencode.hg38.bed
│ └── gencode.protein_coding.bed
├── clean_all.sh
├── crisprme_auto_test_conda.sh
├── crisprme_auto_test_docker.sh
├── crisprme_auto_test_download_essentials.sh
├── crisprme_auto_test_no_download.sh
├── Dictionaries
├── Genomes
│ ├── hg38
│ └── hg38.chromFa.tar.gz
├── list_samplesID.txt
├── list_vcf.txt
├── PAMs
│ └── 20bp-NGG-SpCas9.txt
├── Results
├── samplesIDs
│ ├── hg38_1000G.samplesID.txt
│ ├── hg38_gnomAD.samplesID.txt
│ └── hg38_HGDP.samplesID.txt
├── sg1617.txt
└── VCFs
├── hg38_1000G
└── hg38_HGDP
I then modified the command to use --vcf VCFs/hg38_1000G, but then got a new error: The folder specified for --pam does not exist
What am I doing wrong?
Thank you!
Paola
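Before launching a run, it can help to confirm that every path on the command line exists relative to the working directory the tool will see (with the Docker command above, that is the host directory mounted at /DATA). The sketch below classifies each argument as a file, a directory, or missing. The helper name classify_inputs is hypothetical, and the check does not reproduce CRISPRme's own validation, so it cannot say why the tool expected a folder.

```python
import os
import tempfile


def classify_inputs(base_dir, relative_paths):
    """Map each relative path to 'dir', 'file' or 'missing' under base_dir."""
    result = {}
    for rel in relative_paths:
        full = os.path.join(base_dir, rel)
        if os.path.isdir(full):
            result[rel] = "dir"
        elif os.path.isfile(full):
            result[rel] = "file"
        else:
            result[rel] = "missing"
    return result


# Demo on a throwaway directory that mimics part of the listing above.
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "Genomes", "hg38"))
with open(os.path.join(base, "list_vcf.txt"), "w") as handle:
    handle.write("hg38_1000G\n")

report = classify_inputs(base, ["Genomes/hg38", "list_vcf.txt", "PAMs/20bp-NGG-SpCas9.txt"])
print(report)
```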
Describe the bug
Hi!
I am trying to run CRISPRme with a converted gnomAD VCF but the run fails. I am using v2.1.0 via conda.
The Genome folder contains chr22 only, and the VCFs folder contains the corresponding VCF for chr22, converted using the gnomAD-converter. A run without VCFs finished without errors.
To Reproduce
The full command line is the following
crisprme.py complete-search --genome Genomes/hg38_chr22 --guide sg1617.txt --pam PAMs/20bp-NGG-SpCas9.txt --annotation Annotations/encode+gencode.hg38.bed --gene_annotation Annotations/gencode.protein_coding.bed --mm 6 --output sg1617_gnomad_chr22_top --thread 12 --vcf list_vcf_gnomad_chr22.txt --samplesID list_samplesID_gnomad.txt
I get the following error message
$ cat Results/sg1617_gnomad_chr22_top/log_error_no_check.txt
Traceback (most recent call last):
File "./process_summaries.py", line 136, in <module>
dict_samples[sample][3] += 1
KeyError: 'raw'
Traceback (most recent call last):
File "./process_summaries.py", line 136, in <module>
dict_samples[sample][3] += 1
KeyError: 'raw'
Traceback (most recent call last):
File "./process_summaries.py", line 136, in <module>
dict_samples[sample][3] += 1
KeyError: 'raw'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CFD.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CRISTA.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_fewest.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CFD.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CRISTA.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_fewest.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CFD.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CRISTA.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_fewest.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CFD.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CRISTA.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_fewest.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CFD.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CRISTA.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_fewest.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CFD.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CRISTA.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_fewest.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CFD.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_CRISTA.txt'
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/crisprme/opt/crisprme/PostProcess/populations_distribution.py", line 87, in <module>
with open(sys.argv[1]) as summary:
FileNotFoundError: [Errno 2] No such file or directory: '/data2/crisprme_test/Results/sg1617_gnomad_chr22_top/.sg1617_gnomad_chr22_top.PopulationDistribution_fewest.txt'
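The first error in this log, KeyError: 'raw', means the label 'raw' was looked up in dict_samples but was never registered there. The later FileNotFoundError entries are probably consequences of that first failure. One hypothesis, which only the maintainers can confirm, is that a sample label in the results does not appear in the samplesID file. The sketch below shows how labels can be tallied while collecting unknown ones instead of crashing; the function name tally_samples and the labels are made up.

```python
from collections import Counter


def tally_samples(observed, known):
    """Count observed labels, separating those absent from the known set."""
    counts = Counter()
    unknown = Counter()
    for label in observed:
        if label in known:
            counts[label] += 1
        else:
            unknown[label] += 1
    return counts, unknown


# Made-up labels, for illustration only.
known_samples = {"sampleA", "sampleB"}
observed_labels = ["sampleA", "raw", "sampleA", "sampleB", "raw"]

counts, unknown = tally_samples(observed_labels, known_samples)
print(dict(counts))
print(dict(unknown))
```

Listing the unknown labels makes it easy to compare them against the samplesID file passed with --samplesID.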
I'm seeking information about acquiring a license to use CRISPRme at my company.
The LICENSE doc says:
Please contact [email protected] and [email protected] for more information.
I've sent emails, but haven't heard back. Is there newer or alternative contact information?
Thanks!
Hi CRISPRme Team,
Quick question: I noticed the installation scripts do not download the 1000G Project VCF for the Y chromosome.
Is this expected behavior? Thank you so much!
Update and Fix Dockerfile:
crisprme_auto_test_docker.sh
Gives me the above error for a few guides (not all). In particular, at line 121 of radar_chart_dict_generator.py (if guide[count] == 'N':), the error is IndexError: string index out of range.
Running against the gnomAD v4.0.0 (converted with CRISPRme) fails in the Integrating Results phase.
Error message
Traceback (most recent call last):
File "/opt/conda/opt/crisprme/PostProcess/./resultIntegrator.py", line 492, in
if float(elem) == 0:
ValueError: could not convert string to float: 'rs635634'
CRISPRme ERROR: result integration failed (script: /opt/conda/opt/crisprme/PostProcess/post_process.sh line 45)
CRISPRme ERROR: postprocessing failed - reference (script: /opt/conda/opt/crisprme/PostProcess/submit_job_automated_new_multiple_vcfs.sh line 848)
Some details
I ran a guide with a set of parameters against the hg38_1000G VCFs successfully and wanted to also run the same against gnomAD VCFs. I downloaded the VCFs and converted them and then ran again with the same parameters but updated the sampleIDs to be the gnomAD sample IDs and VCFs to be the gnomAD VCFs. I get the error that is above. I am running on a fresh Ubuntu VM with 128 GB of RAM against the latest docker. I have reproduced the error twice.
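The traceback shows float() being applied to 'rs635634', an rsID, so a text field ended up where a number was expected. That could happen if the converted gnomAD columns are in a different order than the integrator assumes, though this is a guess. When inspecting the intermediate file, a tolerant parser makes it easy to see which fields are numeric. The helper name to_float and the demo values (apart from the rsID quoted in the error) are illustrative.

```python
def to_float(value):
    """Return value as a float, or None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


# Mixed fields such as might appear in one row of a results file.
fields = ["0", "rs635634", "0.25"]
parsed = [to_float(field) for field in fields]
non_numeric = [field for field, number in zip(fields, parsed) if number is None]

print(parsed)
print(non_numeric)
```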
I have run into a problem. The file
hg38+hg38_1000G_test_20bp-NGG-SpCas9.txt_guides.txt_gencode_encode.hg38.bed_6_2_2_chrX_KI270881v1_alt.total.cluster.txt.tmp_sort.txt
contains the special symbol <90> in its third column, and I ran into the error below. Can this problem be solved? Any ideas on how to fix it would be much appreciated.
run command line
crisprme.py complete-search --genome Genomes/hg38 --vcf list_vcf.txt/ --guide sg1617.txt --pam PAMs/20bp-NGG-SpCas9.txt --annotation Annotations/gencode_encode.hg38.bed --samplesID list_samplesID.txt --gene_annotation Gencode/gencode.protein_coding.bed --bMax 2 --mm 6 --bDNA 2 --bRNA 2 --merge 3 --output sg1617.6.2.2_new --thread 58
8. Annotation
I downloaded the CRISPRme Docker image using the command:
docker pull pinellolab/crisprme
file format
The third column contains the special symbol <90>.
Any help tracing this down would be helpful, I am looking forward to running the tool on a real sequence!
Let me know if any other information would be helpful...
khl
In the output, I didn't get the directory with raw targets (containing the unprocessed results from the search). I do get the directory with images.
Hi,
I ran CRISPRme for a guide sequence and included the 1000 Genomes VCF files to find out how variants affect the OT sites. I found some sites that had no variants mapping to them but were reported as having fewer mismatches/bulges in the "MMBLG_Mismatches | MMBLG_Bulge_Size | MMBLG_Total" columns than in the "Mismatches | Bulge_Size | Total" columns.
The OT sequence is the same in the "DNA" and "MMBLG_DNA" columns, but the alignment to the guide is different.
aggCACTAG-aTTGACaCACAGG vs aggCACTAGA-TTGACaCACAGG
All variation related rows have "n" value for this example, so there is no variant mapping to this genomic region.
Is this a bug or should I interpret the results in a different way?
Thank you,
Meltem
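The two alignments quoted above can be compared directly by counting their symbols. The sketch below assumes that lowercase letters mark mismatches and that "-" marks a bulge position; that convention is inferred from the strings themselves rather than taken from the CRISPRme documentation, and the helper name alignment_counts is illustrative.

```python
def alignment_counts(aligned_dna):
    """Return (lowercase count, dash count) for an aligned target string."""
    mismatches = sum(1 for base in aligned_dna if base.islower())
    bulges = aligned_dna.count("-")
    return mismatches, bulges


# The two alignments of the same site quoted in the question.
first = alignment_counts("aggCACTAG-aTTGACaCACAGG")
second = alignment_counts("aggCACTAGA-TTGACaCACAGG")
print(first, second)
```

Under that assumption, moving the bulge by one position turns one mismatch into a match, so the same genomic sequence can legitimately be reported with two different totals even when no variant overlaps the site.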
Hi there,
I have a question about how *.altMerge.txt is generated. We're interested in this file because we need an exhaustive list of all variants that could result in a hit at a specific locus, not just the top-scoring one.
I ran CRISPRme with the gnomAD data (6 mm, 1 bulge). I noticed that, within each cluster, the first 24 columns are identical while the MMBLG* columns differ. The CRISTA* columns are also identical.
Could you shed some light as to how this file is generated? Should all the sites contained in the MMBLG* be considered alternate sites?
Thanks
Dear Samuele,
thank you for such an interesting tool!
I have tried to run it on Docker, but I am unable to successfully test the Docker application.
This is the error message I get:
<3>WSL (4903) ERROR: CreateProcessEntryCommon:577: execvpe /bin/bash failed 2
<3>WSL (4903) ERROR: CreateProcessEntryCommon:586: Create process not expected to return
Docker is fully functional and working with other images. I have stuck to
Thank you in advance for your help
Malte
Hi,
What is the format of the file containing the PAM, e.g. PAMs/20bp-NGG-SpCas9.txt? I can't seem to find a description of this file.
Thanks!
Hi!
Thanks for the quick reply on the other issues! I managed to run it successfully on chr22.
I am inspecting output.bestMerge.txt. What does Position correspond to, and how does it differ from Cluster_Position?
How can I get the start and end coordinates of the sequence in DNA?
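Until the maintainers confirm the coordinate convention, one hedged way to derive an interval is to take Position as the leftmost coordinate and add the number of genomic bases in the DNA string, skipping "-" gap characters. Both points are assumptions: that Position is the 0-based start of the target, and that dashes in DNA do not consume reference bases. The helper name target_span and the demo values are made up.

```python
def target_span(position, aligned_dna):
    """Return a half-open (start, end) interval for an aligned target.

    Assumes `position` is the 0-based leftmost coordinate (unverified).
    """
    length = sum(1 for base in aligned_dna if base != "-")
    return position, position + length


# Made-up example: four genomic bases and one gap character.
span = target_span(100, "ACG-T")
print(span)
```

Checking a few known sites against a genome browser would show whether the assumed offset is right or needs shifting by one.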
Thanks!