bionorm

bionorm normalizes and validates genomic data files prior to further processing or inclusion in a data store such as that of the Legume Federation.

Prerequisites

Python 3.6 or greater is required. This package is tested under Linux and macOS using Python 3.7.

Installation for Users

Install via pip or (better yet) pipx:

pipx install bionorm

bionorm contains some long commands and many options. To enable command-line completion for bionorm commands, execute the following command if you are using bash as your shell:

eval "$(_BIONORM_COMPLETE=source_bash bionorm)"

For Developers

If you plan to develop bionorm, you'll need to install the poetry dependency manager. If you haven't previously installed poetry, execute the command:

curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python

Next, clone the master branch from GitHub:

git clone https://github.com/legumeinfo/bionorm.git

Change to the bionorm/ directory and install with poetry:

poetry install -v

Run bionorm with poetry:

poetry run bionorm

Usage

Installation puts a single script called bionorm in your path. The usage format is:

bionorm [GLOBALOPTIONS] COMMAND [COMMANDOPTIONS] [ARGS]

Global Options

The following options are global in scope and, if used, must be placed before COMMAND. Not all commands support every global option:

-v, --verbose             Log debugging info to stderr.
-q, --quiet               Suppress logging to stderr.
--no-logfile              Suppress logging to file.
-e, --warnings_as_errors  Treat warnings as fatal (for testing).
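
The rule that global options go before COMMAND can be illustrated with a small stand-alone parser. This is only a sketch using Python's standard argparse module, not bionorm's own command-line code. The program name `bionorm-sketch` and the single `index` subcommand with a `fasta` argument are assumptions made for the example.

```python
import argparse


def build_parser():
    """Build a toy parser: global options first, then a subcommand."""
    parser = argparse.ArgumentParser(prog="bionorm-sketch")  # hypothetical name
    # Global options, mirroring the list above.
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("-q", "--quiet", action="store_true")
    parser.add_argument("--no-logfile", action="store_true")
    parser.add_argument("-e", "--warnings_as_errors", action="store_true")
    # Subcommands; only one illustrative command is defined here.
    commands = parser.add_subparsers(dest="command", required=True)
    index = commands.add_parser("index")
    index.add_argument("fasta")  # assumed positional argument
    return parser


def parse(argv):
    """Parse a list of command-line tokens with the toy parser."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse(["-v", "index", "genome.fna"])
    print(args.verbose, args.command, args.fasta)
```

In this sketch a global option placed after the subcommand is handed to the subcommand's parser, which does not know it, so parsing stops with an error. That is the behaviour the placement rule guards against.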

Commands

A listing of commands is available via bionorm --help. The currently implemented commands are:

prefix_fasta     Prefix FASTA files for data store standard.
prefix_gff       Prefix and sort GFF3 file for data store standard.
busco            Perform BUSCO checks.
detector         Detect/correct incongruencies among files.
fasta            Check for GFF/FASTA consistency.
generate_readme  Generates a README file with details of genome.
index            Indexes FASTA file.

Each command has its COMMANDOPTIONS, which may be listed with:

bionorm COMMAND --help

Project Status

Latest Release  Python package
GitHub          GitHub repository
License         License terms
Travis Build    Travis CI
Coverage        Codecov.io test coverage
Code Grade      Codacy.io grade
Dependencies    dependabot dependencies
Pre-commit      pre-commit
Issues          Issues reported

bionorm's People

Contributors

joelb123, ctcncgr

bionorm's Issues

installation seemingly broken

Followed the recommended pipx installation on haldane; result:

[adf@haldane ~]$ bionorm
Traceback (most recent call last):
  File "/home/localhost/adf/.local/bin/bionorm", line 5, in <module>
    from bionorm import cli
  File "/home/localhost/adf/.local/pipx/venvs/bionorm/lib/python3.8/site-packages/bionorm/__init__.py", line 84, in <module>
    from .installer import install # isort:skip
  File "/home/localhost/adf/.local/pipx/venvs/bionorm/lib/python3.8/site-packages/bionorm/installer.py", line 13, in <module>
    from packaging import version
ModuleNotFoundError: No module named 'packaging'
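
A quick way to confirm this kind of failure is to ask the interpreter which modules it can locate. The helper below is a minimal sketch. `missing_modules` is a hypothetical name, not part of bionorm, and the check only says something about bionorm if it is run with the Python interpreter inside the pipx virtual environment shown in the traceback.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # 'packaging' is the module named in the traceback above.
    print(missing_modules(["packaging"]))
```

If `packaging` is reported missing, adding it to the environment (for example with `pipx inject bionorm packaging`) may work around the problem until the package declares the dependency.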

Move datadir_mgr fixture to its own repo

The functions in the pytest datadir_mgr fixture are definitely useful in implementing proper tests for other NCGR projects, including ones that I code for. The code isn't very long (~200 lines), but it deserves its own test suite. Move this to its own repository.

index-gff does not seem to respect --compress option

bionorm index-gff --compress Aeschynomene_evenia/CIAT22838.gnm1.ann0.ZM3R/aesev.CIAT22838.gnm1.ann0.ZM3R.gene_models_main.gff3

File "/home/localhost/adf/.local/pipx/venvs/bionorm/lib/python3.8/site-packages/sh.py", line 865, in handle_command_exit_code
raise exc
sh.ErrorReturnCode_1:

RAN: /erdos/adf/bin/tabix -p gff Aeschynomene_evenia/CIAT22838.gnm1.ann0.ZM3R/aesev.CIAT22838.gnm1.ann0.ZM3R.gene_models_main.gff3

STDOUT:

STDERR:
[tabix] the compression of 'Aeschynomene_evenia/CIAT22838.gnm1.ann0.ZM3R/aesev.CIAT22838.gnm1.ann0.ZM3R.gene_models_main.gff3' is not BGZF
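
The tabix message says the input is not BGZF, which is consistent with the GFF3 file being passed to tabix uncompressed. A pre-flight check along the following lines could catch that before tabix is run. This is a sketch, not bionorm's code. `is_bgzf` is a hypothetical helper that looks for the gzip magic bytes, the FEXTRA flag, and the "BC" extra subfield that marks a BGZF block.

```python
import struct


def is_bgzf(path):
    """Return True if the file starts with a BGZF block header."""
    with open(path, "rb") as handle:
        header = handle.read(12)
        # gzip magic bytes plus the deflate compression method.
        if len(header) < 12 or header[:3] != b"\x1f\x8b\x08":
            return False
        # BGZF requires the FEXTRA flag (bit 2 of the flag byte).
        if not header[3] & 0x04:
            return False
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = handle.read(xlen)
    # Walk the extra subfields looking for SI1='B', SI2='C', SLEN=2.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack("<BBH", extra[pos:pos + 4])
        if (si1, si2, slen) == (66, 67, 2):
            return True
        pos += 4 + slen
    return False
```

Plain text and ordinary gzip files both fail this check, so it distinguishes "not compressed at all" and "compressed with gzip rather than bgzip" from a file tabix can index.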

Get extract-fasta tests running

I admit that this issue is a retrospective one, to document what I've been working on. extract-fasta runs on systems with gffread installed but fails on Travis CI. Without CI testing, we don't know when things break.

To get extract-fasta working under pytest, we have to solve the following problems:

  1. Make an "install" command that, for now, installs only gffread but serves as the gateway for installing other binary dependencies.
  2. Implement downloading of required data files in pytest.
  3. Implement a way for results from previous steps to be cached in pytest.
  4. Put the required data files in the data store.
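
Steps 2 and 3 both come down to "fetch once, reuse afterwards". The sketch below shows that idea in plain Python, under stated assumptions. `cached_file` and its `fetch` callback are hypothetical names, this is not the datadir_mgr fixture's API, and the download itself is left to whatever function the caller supplies.

```python
from pathlib import Path


def cached_file(name, cache_dir, fetch):
    """Return the cached path for name, calling fetch(name) only on a miss.

    fetch is a caller-supplied function that returns the file contents as
    bytes (for example, by downloading them from a data store).
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / name
    if not target.exists():
        # Write to a temporary name first so an interrupted fetch does not
        # leave a truncated file that later looks like a cache hit.
        partial = target.with_name(target.name + ".part")
        partial.write_bytes(fetch(name))
        partial.replace(target)
    return target
```

A test fixture could wrap a helper like this so that data files, or the outputs of earlier test steps, are produced once per cache directory and reused by later tests.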
