phagefilter's People

Contributors

brsi3353, dreycey, klinvill

phagefilter's Issues

Output filtered reads should inherit input file type.

https://github.com/Dreycey/PhageFilter/blob/963095238b15e163024ea98fc598cc89798961a4/src/main.rs#L301C17-L319

Around here, PhageFilter manually writes its output in FASTA format. Input reads will likely be in FASTQ format, where the qualities are definitely relevant and should be preserved in the output. I would love to see the output inherit the input format.

Otherwise, it might be simpler to write only the read IDs and pass the responsibility for read extraction to another tool, such as samtools.
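One way to picture the inherited-format idea is a writer that emits FASTQ whenever qualities were read and falls back to FASTA otherwise. This is only a sketch: `SeqRecord` and `write_record` are hypothetical names for illustration and are not taken from the PhageFilter source.

```rust
use std::io::{self, Write};

/// Hypothetical record type: `qual` is `Some` only when the input was FASTQ.
pub struct SeqRecord {
    pub id: String,
    pub seq: String,
    pub qual: Option<String>,
}

/// Write one record as FASTQ when qualities are present, otherwise as FASTA.
pub fn write_record<W: Write>(out: &mut W, rec: &SeqRecord) -> io::Result<()> {
    match &rec.qual {
        // FASTQ: '@' header, sequence, '+' separator, quality string.
        Some(qual) => writeln!(out, "@{}\n{}\n+\n{}", rec.id, rec.seq, qual),
        // FASTA: '>' header, sequence.
        None => writeln!(out, ">{}\n{}", rec.id, rec.seq),
    }
}
```

Because the format decision travels with each record, the output follows the input without a separate flag.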

Documentation should mention details regarding input reads parsing

It seems that the format of input reads is determined by file extension. I would encourage you to document this behaviour.

Also, this is brittle; ideally it should accept the multiple extension conventions in use to be robust. A better "automagic" approach would be to inspect the beginning of the file contents for a match to either header format.

In the end, you might find it simpler to assume a format and give users the option to declare the other.
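The content-sniffing approach could look roughly like the sketch below, which peeks at the first non-whitespace byte of the stream: `>` suggests FASTA and `@` suggests FASTQ. `ReadFormat` and `sniff_format` are hypothetical names, and the function assumes the input is already uncompressed text.

```rust
use std::io::{self, BufRead};

/// Hypothetical enum for the two supported read formats.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReadFormat {
    Fasta,
    Fastq,
}

/// Guess the format from the first non-whitespace byte.
/// Leading whitespace is consumed; the header byte itself is left in the reader.
/// Returns `Ok(None)` for empty input or an unrecognised first byte.
pub fn sniff_format<R: BufRead>(reader: &mut R) -> io::Result<Option<ReadFormat>> {
    loop {
        let (len, hit) = {
            let buf = reader.fill_buf()?;
            let hit = buf
                .iter()
                .position(|b| !b.is_ascii_whitespace())
                .map(|i| (i, buf[i]));
            (buf.len(), hit)
        };
        if len == 0 {
            return Ok(None); // end of input before any record
        }
        match hit {
            Some((i, byte)) => {
                reader.consume(i); // skip whitespace only
                return Ok(match byte {
                    b'>' => Some(ReadFormat::Fasta),
                    b'@' => Some(ReadFormat::Fastq),
                    _ => None,
                });
            }
            // Buffer held only whitespace: drop it and refill.
            None => reader.consume(len),
        }
    }
}
```

An explicit command-line option could still override the guess when `None` comes back.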

Separate issue, but I am being lazy

GZIP read files do not appear to be supported? This is really awkward with today's sequencer yields.
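Gzip input can be recognised from content rather than extension, since gzip streams begin with the magic bytes 0x1f 0x8b (RFC 1952). The sketch below only detects compression using the standard library. `looks_gzipped` is a hypothetical helper, and actual decompression would need an external crate (for example, `flate2` provides a `MultiGzDecoder`), which is an assumption about how this might be wired in and not something taken from PhageFilter.

```rust
use std::io::{self, BufRead};

/// Return true when the stream starts with the gzip magic bytes 0x1f 0x8b.
/// Nothing is consumed, so the same reader can be handed to a decoder afterwards.
/// Caveat: this relies on the first `fill_buf` call returning at least two bytes,
/// which holds for ordinary buffered files but is not guaranteed by the trait.
pub fn looks_gzipped<R: BufRead>(reader: &mut R) -> io::Result<bool> {
    let buf = reader.fill_buf()?;
    Ok(buf.len() >= 2 && buf[0] == 0x1f && buf[1] == 0x8b)
}
```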

The manual should explain what is encoded in the headers of output reads.

    pub fn get_ext_id(&self, read_id: &String) -> String {
        let genomes = self
            .read_map
            .get(read_id)
            .map(|genome_set| {
                genome_set
                    .iter()
                    .map(AsRef::as_ref)
                    .collect::<Vec<&str>>()
                    .join(",")
            })
            .unwrap_or_default();
        read_id.clone() + " |" + &genomes
    }

The header appears to contain references to mapped genomes, but I do not see this explained in the documentation.
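Reading `get_ext_id` as quoted, an output header seems to be the read ID, then ` |`, then a comma-separated genome list (empty when the read has no entry in the map). A downstream parser under that assumption might look like the sketch below. `parse_ext_id` is a hypothetical helper, and it cannot disambiguate genome names that themselves contain commas or the ` |` separator.

```rust
/// Split a header of the assumed form `<read_id> |<genome1>,<genome2>,...`
/// into the read ID and the list of genome names.
pub fn parse_ext_id(header: &str) -> (&str, Vec<&str>) {
    // Split on the last " |" so an ID that already contains " |" stays intact.
    match header.rsplit_once(" |") {
        Some((id, genomes)) => (
            id,
            genomes.split(',').filter(|g| !g.is_empty()).collect(),
        ),
        // No separator found: treat the whole header as the read ID.
        None => (header, Vec::new()),
    }
}
```

Documenting the separator and list format in the manual would let users write such parsers without reading the source.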

Is it possible to supply a ranking/score for each mapping? This is perhaps an ignorant question, as I do not know what is returned from searching the gSBT.
