bedparse's Introduction

License: MIT

Bedparse

Bedparse is a simple Python module and CLI tool to perform common operations on BED files.

It offers 11 sub-commands that implement the following functionality:

  • filter: Filtering of transcripts based on annotations
  • join: Joining of annotation files based on transcript names
  • gtf2bed: Conversion from GTF to BED format
  • convertChr: Conversion from UCSC to Ensembl chromosome names (and vice versa)
  • bed12tobed6: Conversion from bed12 to bed6
  • promoter: Promoter reporting
  • introns: Intron reporting
  • cds: CDS reporting
  • 3pUTR and 5pUTR: UTR reporting
  • validateFormat: Check that the file conforms with the BED format
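
To make the bed12-to-bed6 conversion concrete, here is a minimal sketch in plain Python. It assumes the standard BED12 layout (blockCount, blockSizes and blockStarts in columns 10 to 12, with block starts relative to chromStart) and emits one BED6 record per block. The function name bed12_to_bed6 is hypothetical and this is not bedparse's own implementation, whose output may differ in detail.

```python
def bed12_to_bed6(line):
    """Split one tab-separated BED12 record into one BED6 record per block.

    Hypothetical helper for illustration; not part of bedparse.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        raise ValueError(f"expected 12 fields, got {len(fields)}")

    chrom, start, _end, name, score, strand = fields[:6]
    start = int(start)
    block_count = int(fields[9])

    # blockSizes and blockStarts are comma-separated and may end with a comma
    sizes = [int(x) for x in fields[10].split(",") if x]
    offsets = [int(x) for x in fields[11].split(",") if x]
    if len(sizes) != block_count or len(offsets) != block_count:
        raise ValueError("blockCount does not match blockSizes/blockStarts")

    records = []
    for size, offset in zip(sizes, offsets):
        block_start = start + offset  # block starts are relative to chromStart
        records.append(
            "\t".join([chrom, str(block_start), str(block_start + size),
                       name, score, strand])
        )
    return records


if __name__ == "__main__":
    example = "chr1\t1000\t2000\ttx1\t0\t+\t1200\t1800\t0\t2\t300,400,\t0,600,"
    for record in bed12_to_bed6(example):
        print(record)
```

For the example record, the two blocks become the intervals 1000-1300 and 1600-2000 on chr1, each carrying the original name, score and strand.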

Installation

Installing is as simple as:

pip install bedparse

Basic usage

The basic syntax is of the form: bedparse subcommand [parameters].

For a list of all subcommands and a brief explanation of what they do, use: bedparse --help.

For a detailed explanation of each subcommand and a list of its parameters, use the --help option after the subcommand's name, e.g.: bedparse promoter --help

Documentation

Our documentation is hosted on Read the Docs.

We also have a short tutorial to guide you through the basic functions.

Publications

If you use bedparse please cite the following paper:

Leonardi, T. (2019). Bedparse: feature extraction from BED files. Journal of Open Source Software, 4(34), 1228, https://doi.org/10.21105/joss.01228

bedparse's Issues

Unable to parse BED3 file from Ensembl example

I tried using the BED3 file from the Ensembl example and am currently getting:

(tempenv-19e22026634fb) ~ ❯❯❯ cat test.bed
chr1  213941196  213942363
chr1  213942363  213943530
chr1  213943530  213944697
chr2  158364697  158365864
chr2  158365864  158367031
chr3  127477031  127478198
chr3  127478198  127479365
chr3  127479365  127480532
chr3  127480532  127481699
(tempenv-19e22026634fb) ~ ❯❯❯ bedparse cds test.bed
Traceback (most recent call last):
  File "/Users/BenjaminLee/.virtualenvs/tempenv-19e22026634fb/bin/bedparse", line 10, in <module>
    sys.exit(main())
  File "/Users/BenjaminLee/.virtualenvs/tempenv-19e22026634fb/lib/python3.6/site-packages/bedparse/bedparse.py", line 222, in main
    args.func(args)
  File "/Users/BenjaminLee/.virtualenvs/tempenv-19e22026634fb/lib/python3.6/site-packages/bedparse/bedparse.py", line 40, in cds
    utr=bedline(line.split('\t')).cds(ignoreCDSonly=args.ignoreCDSonly)
  File "/Users/BenjaminLee/.virtualenvs/tempenv-19e22026634fb/lib/python3.6/site-packages/bedparse/bedline.py", line 29, in __init__
    raise BEDexception("Only BED3,4,6,12 are supported. "+self.name+" is neither.")
bedparse.BEDexception: Only BED3,4,6,12 are supported. NoName is neither.

Is this behavior anticipated? Either way, could you provide some example files in the repository for experimentation?
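
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the traceback shows each line being split on tab characters (line.split('\t')), so a file whose columns are separated by spaces would be read as a single field, which is not one of the supported counts (3, 4, 6, 12). The listing above shows two spaces between columns, although that may be an artifact of how the terminal output was pasted. The sketch below checks a line for this situation; diagnose and spaces_to_tabs are hypothetical helpers, not part of bedparse.

```python
# Field counts named in the error message above
SUPPORTED_FIELD_COUNTS = {3, 4, 6, 12}


def diagnose(line):
    """Report whether a BED line has a supported number of tab-separated fields."""
    stripped = line.rstrip("\n")
    n_tab = len(stripped.split("\t"))
    if n_tab in SUPPORTED_FIELD_COUNTS:
        return f"ok: {n_tab} tab-separated fields"

    # Fall back to splitting on any whitespace to detect space-delimited files
    n_ws = len(stripped.split())
    if n_ws in SUPPORTED_FIELD_COUNTS:
        return (f"{n_tab} tab-separated field(s), but {n_ws} "
                "whitespace-separated: the delimiter is probably spaces")
    return f"unsupported: {n_tab} tab-separated field(s)"


def spaces_to_tabs(line):
    """Rewrite a whitespace-delimited line with tab delimiters.

    Only safe when no column (e.g. the name) contains spaces itself.
    """
    return "\t".join(line.split())


if __name__ == "__main__":
    line = "chr1  213941196  213942363"
    print(diagnose(line))
    print(diagnose(spaces_to_tabs(line)))
```

If the first call reports a delimiter problem and the second reports ok, converting the file to tab-delimited form before running bedparse cds is worth trying.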

Use dynamically generated CLI docs

While not an issue, you may be interested in using something like sphinxcontrib-programoutput to have your CLI help command generated upon each commit so that it always stays up to date.

The one rub would be that you'd need to switch from markdown to rst, but that should be easy using pandoc. If you're interested and run into any trouble, I'm happy to help out.

tx2genome error when run on BED6

The function throws the following error when used on a BED6 record:

AttributeError: ("'bedline' object has no attribute 'exStarts'", 'occurred at index 0')
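
For context, a BED6 record has no block columns, which is presumably why the exStarts attribute is missing. The sketch below shows one way a transcript-to-genome conversion could tolerate such records by treating the whole interval as a single exon. It is a self-contained illustration under stated assumptions (0-based coordinates, standard BED12 block columns) and the names exon_blocks and tx_to_genome are hypothetical; it is not bedparse's tx2genome.

```python
def exon_blocks(fields):
    """Return genomic (start, end) pairs for each block of a BED record."""
    start, end = int(fields[1]), int(fields[2])
    if len(fields) < 12:
        # BED3/4/6 carry no block columns: treat the record as one exon
        return [(start, end)]
    sizes = [int(x) for x in fields[10].split(",") if x]
    offsets = [int(x) for x in fields[11].split(",") if x]
    return [(start + o, start + o + s) for s, o in zip(sizes, offsets)]


def tx_to_genome(fields, pos):
    """Map a 0-based transcript coordinate to a 0-based genomic coordinate."""
    if pos < 0:
        raise ValueError("coordinate must be non-negative")
    strand = fields[5] if len(fields) >= 6 else "+"
    blocks = exon_blocks(fields)
    if strand == "-":
        # On the minus strand the transcript starts at the last block
        blocks = blocks[::-1]

    remaining = pos
    for b_start, b_end in blocks:
        length = b_end - b_start
        if remaining < length:
            if strand == "-":
                return b_end - 1 - remaining
            return b_start + remaining
        remaining -= length
    raise ValueError("coordinate is past the end of the transcript")


if __name__ == "__main__":
    bed6 = ["chr1", "100", "200", "tx", "0", "+"]
    print(tx_to_genome(bed6, 5))
```

With this fallback a BED6 record behaves like a single-exon transcript instead of raising an AttributeError.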

bedparse convert fails on nanocompore bed file

Hi,
I am having trouble using bedparse to convert an Ensembl BED file to UCSC chromosome names. The BED file was made with nanocompore, and the alignments were made to gencode.v33_transcripts, which contains the transcript version numbers.
When I run the following command on a BED file with the header removed:
bedparse convertChr --assembly hg38 --target ucsc WT_v_KO_DRS.2_sig_sites_GMM_logit_pvalue_context_2_thr_0.01.bed
the script fails on the first line of the BED file with the following error:

Traceback (most recent call last):
  File "/home/samirwatson/miniconda3/envs/guitar/bin/bedparse", line 10, in <module>
    sys.exit(main())
  File "/home/samirwatson/miniconda3/envs/guitar/lib/python3.9/site-packages/bedparse/bedparse.py", line 250, in main
    args.func(args)
  File "/home/samirwatson/miniconda3/envs/guitar/lib/python3.9/site-packages/bedparse/bedparse.py", line 121, in convertChr
    translatedLine=bedline(line.split('\t')).translateChr(assembly=args.assembly, target=args.target, suppress=args.suppressMissing, ignore=args.allowMissing, patches=args.patches)
  File "/home/samirwatson/miniconda3/envs/guitar/lib/python3.9/site-packages/bedparse/bedline.py", line 524, in translateChr
    raise BEDexception("The chromosome of transcript "+self.name+" ("+self.chr+") can't be found in the DB.")
bedparse.BEDexception: The chromosome of transcript ENST00000368723.4_ACCTC (chr1) can't be found in the DB.

I have checked the bed file using the validateFormat function and it passed.
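
A possible reading of this error, stated as an assumption: the chromosome in the failing record is already written as chr1, which looks like UCSC naming, so a lookup that expects Ensembl-style names (such as 1) when converting to the ucsc target would find no match. The toy sketch below illustrates that failure mode and one way to tolerate names that are already in the target convention. The mapping tables, the translate_chr function and its "ensembl" target value are hypothetical and much smaller than whatever tables bedparse uses internally.

```python
# Toy mapping for illustration only; not bedparse's chromosome database
ENSEMBL_TO_UCSC = {"1": "chr1", "2": "chr2", "X": "chrX", "MT": "chrM"}
UCSC_TO_ENSEMBL = {ucsc: ens for ens, ucsc in ENSEMBL_TO_UCSC.items()}


def translate_chr(name, target):
    """Translate a chromosome name to the target naming convention.

    Names already in the target convention are returned unchanged
    instead of being reported as missing.
    """
    tables = {"ucsc": ENSEMBL_TO_UCSC, "ensembl": UCSC_TO_ENSEMBL}
    if target not in tables:
        raise ValueError(f"unknown target: {target!r}")

    table = tables[target]
    if name in table:
        return table[name]
    if name in table.values():
        # Already in the target naming: nothing to convert
        return name
    raise KeyError(f"chromosome {name!r} not found in the {target} table")


if __name__ == "__main__":
    print(translate_chr("1", "ucsc"))
    print(translate_chr("chr1", "ucsc"))
```

If the file does already use UCSC names, checking the first column before running convertChr would confirm whether any conversion is needed at all.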
