pjotrp / bioruby-gff3-plugin
GFF3 plugin for BioRuby - allows parsing big data GFF3
License: MIT License
Hi Pjotr,
thanks for making this happen.
I may have encountered a bug. In the following gff, my genes of interest are on the reverse strand (my genes of interest are those with "Vg" in the name)
http://fourmidable.unil.ch/temp/Si_gnF%2Escaffold10535-splittingyw.gff
(the GFF was first built by MAKER, and then manually edited using Apollo)
(my connection is slow here so the upload may take another 10 mins or so)
My sequences come out looking like this:
Si_Vg3-RA Sequence:Si_gnF.scaffold10535_2342963:2350091 (2342963:2350091)
TACACCAAGGGACAGCAGGAAGAGAATGAACCACTCATGTGTAGTCGAAACGCTTATCATTGTCAAAAATGCGCTTTTAAAGTTCATCAGAAGTTTATTAGAACATGAAATTAATTTGAAAGCTTTATTTTAACAAACTGAACCATATTTTAGCTAATTATATGTAGATAATATTAACAACGTTTTAGACTTAAAGTGTAAAACGTGTATGAATAATTTAATTTAGTTCGGTCCGAAAAACTCTGTTATATTATGCACTTTATTAAAAGTGAATTCCCTTTAAAGTTAAGTTAGACTGTTTTTTATGTTCAAGGTTAATAATATTAAACTAGTAAAGAAACAGACCGTCAGCCGCACCGGCATCGGTACGGACTGGTGCTCGTACGGACCCTTGGCGTCTTGCTCATAGTTATGAGACATAAGCAAGCCTGTGACTGACCACACCTGTGCAACTTTGTCGTTATATGACCTTAAGTTAATTTCCCACAAGAGCAGTAAGTTCATTTTAGTCTCCTCAACAACGTTCGCTTCATATAGTTAGGTGCTATACGAGTATATGTGGTTCTTAATAGCTTGCCAGGCATAAGGTTCTAAGGACTTCTCTTAGAGCTTATAGCGCTATATGGGTACAGCCCTTTCGGTAAACTCTAGTTCAACTTCGTGCCTCACTAGGCCCTAAATAATAAGCTAGCATTGCATGGATGAACCCTCCACTTATACGAGTTCCCATAGCATCCAGTCGACGTCTAGCTGTGAGTCCCGCTTTTGCGCTATCTATCGGTCTCATGAGTCTAAGGGAGGTTGAGTCTCGGAAGAAGGCGATGCAAATTTCGGTACCTCCTAAGGCAGCCACCGTTTACGCTCCAAGATATACTCTAATGCGGCAACGGGGTTGTACATCGGGTTTGCTCCGGTCTATCTCATGGATACAGCAGACATGGGTCGTTTCCAGTAGTAATACTTCAATTCTTCAACTTCTTAATACTCTTCACGGTCCTCGCTGTCGAGATGGTAATGCCATACCTGCAATTTTACTGCTTCCTTTTATACTACTTTGCTTTATTTCAACAAAGCCATTCAATATATATAAAATTAACCATAGATATAAAATGACTATTAATGAAAGAATAAGATATTAGATCAATTTGTTACACTACTGACATTAACAACTTTTAATACTTTGTATGTAAATACTTTCTAATACTTTCTTTTCTTTGCCCTTTGATAAAATAAAATAAAATTTATAGCCATGAAACAGTATATGCCCCAACAGGTTTAATGTAACGTTGGTCTCTTACTATGCTATATATCTGCTTTGAATAAACTTTAACATACTAACAATGTCCTTAATAGGTGCTGAGTGTATCAATAGTGCCCATGGAACTTTTCAAAGTGGTAAGTTAGATGACTCTACTTCTTGCTCTATTGGCAAGTTGGACTTAGTAGGCTAAGAGGGTAACCATGACAGATGTCATATTGCTTTAATTGAAACCGGTTTTATTTGTTTTAGAGGTTGAGAACCAAACCTGGTGAGCTTAATTCGTTGCAGCTCAGTTGACCTTTAGACCACATGTATAAATTATTAGGAAAGAGACTAAGGCTCGTCGCTTCTCATCCAGTCGGCTCGTAATCAGCTTTAAGGCTCGTTCTTTTGAGAAATCTCTGGTTTTTTTCTAGAAAGGTGTCAGTGTCAAGGAGTAGGTCATCGAGATCATCAAGGTCGTCGCTCCTTCTTTTACTTAGACAGTACGTTAGATTCCGCAGTAATGCTTTGTAAAAGTACCGAGGCTTGCAGGGCGACAATGGAATGAAATAGCCAAAGTTTCCGTTCTGCTAGTACTTTAGATTACTCGTGTTGCAATACGTTGAACGGTTCCTGAATGAAGAGGTTTATCGATTTCTTTATGTCTTAGGAAGACTTCCTATACTCTTATGCGACCTCTTTATACATTTGAATTTCTTAGAGTAAGCGTGGTACCTAGCGTTCGTCATGTGACTCA
ATCTCGTTATACATAGGGTTAAATTATTCCGTTGTCACCTTCCGCTTTTACGAACCATGTGGAATGCGCTACGACAGCATGTACGACCTTGACCTGGACGAAAACAGTGATAGCTTTTAACCGATTTCTCACCTGTTCAATTTCCGCTTCTCCGCCGTCTTGAGGAAAGATTTTAAGGGTTTTCGCAAGTGGTTGGTTGTGGCCTAATATAGTTTCTTAAGAAACATTCATAGGTAAATACGTGTAAGTATTAATATCAAGGTCTCAGCTAACGTTGTAAATGAATTTGTAGTTTTGAAAATCTCTTCTCATATACTTTAGCCGGTTTTAACCGTAGTTCGCCTTTAGATTTTCTAATTAAGGTGTCACAAAAGAACTACACATACTTTCTCTTTTTATAATTAATATTTTAATTCAAACATATATAAAATTATATATATAAATGCGAATAAAAATCTAATGTAAAATGTATGTCTAAATATGTGTTGCTGTTCAAAATTATTTTCCTGTTTAAATTTTTTTTTGCGTTATAGAAAACAATTTTGAATATGACAACACATATTCACATACGTTTTATATAAACTTTTATTCGTATTTACATATATAATTTTATGTTCAGTCTATTATAATGTTTTTATTTTTTTTCAACTTGTATTATAGTTCATAATGCATAAAACATATTGTAAAAAAACAATAGTTGTTAAAAGACGCGTGAAAGGCCATAATGTTTATAATAAAAAGAGAAAGTATATGTAGTTCTTTGTGACACCTTAATTAAAAAATCCAAAGACGAACGCCGATTAAAATTGAACTCATAAGTAAATATGTCTTATTGGAATTCTGAGTTCGTAAAAAATTAAGAAAAATTGGTTAAGTTTATAAATAATAATTTTTAATATATTTTTTTAATAAAAATTAAAATTTTAAACGTATTAAATTAGACACGTCTTTGACTAATTCTCAAGTCACCATTGAGTTGTTCTTATGCATTTACACAGGCGTGGAGATCGTAAACGACTTGACAATGCGTTATTAATACAGCAGGGTAGAATGATAGGCCATGTGTCAAAGCCGGCATACTGAGAATTCCCTTTACTTCTCTATCTGTTGATATAAAGAATAAACCGATTGGTTAACGTTGTTCCGATGGAGCTTTTGTTATGAGTCTTTTAAGTCTGTAAGTAGAAACGTGAACCACAGTGACGGGTAGGCTTCTAATAGAGTCAGAAACTCGGTATGAATCTTCCGTTCGACGGCTGCTGCTTTATAGTTGCATACGAGTACCACCGGCGAGACATACTAGATAGGTCTCTGTATGGTTTTAATCAACCTGGTTAAAAGATATTCGAGATGTACTTACTTTTGCTTCGAGTGCTTCAAGCAACGTACCGTCATGTCGTTAAATAGGACTGTCTGGGCGGTTACTAATGCAACGTTGCGCATCGCTTTATGTGATTAATGCTAGTCTCACTAGTCCACTTGAGACGGCACTTCTCATGCGATTTGTCGTAGTAATTATGCTTCGCTGGCCTTACCGCTTTAGAGCGATTGTTCCGCGCGTCACAGTCCATAGATCACTTAGGTTTCTTGATACTGTGGACCATGAGCTTCCCGATGATATATCTAAAACTTTTGACCCAAAAGTTTCCTGAATTACACTTTTACCAACGATCATTACTACGGCATGATGGGTCTATACATATGCAACCGGAACTATCATAAAAGTTGAAGGAGGCATTCGGGTGAAAGCTTCATCCTATACGCCATAGCTCGATGTCCGTCCAAATGCTAAACTAGTTACTCAACACCTTGAGGATAGTTAAACTTCTTTACTCTCTCTTTAGTGTTCCTAGTGCGCAACTCTTTGATCGCGTTCTTGAATTCTAATTTAGGCCTGTCTTCTTATTAAACCTTCCTGTACAGGACAAATTGAGCCATATGCCAAGGTACCAGATAGGAATGCTGTTCGTATCTTAATCTCTCCGGCAACGACCACTCAT
AGAACTTGAAATGAAGTTTTAGTATAATGAATGTTTCTATAATTAATAATAATGTAACGTCGCGACTTTTTTGAAGACTGCTCACTGAGGTTTGACTTTTGTTGACGTAAATTGTTAAAGCTTTTCTATCACTCGAAGGGCTACCTTTACCCACACGGCAAGCAGATAAGAAAGCTCGACGGTCATAAACATTTTTCACTTTAATTGAAATTTTTTCCTCTTGGGTAGTGGAGCTCCTCACCGCAAATGCTTTGCAATAAGACGTTATCTCATGTCTTCGCCAAACCAAAGTATCGTGGAAAGCTCATAGTCTTAATATAACGACCATAACTGTTTTTACCTTATTACGCTCATGGAGATTTTATGCTTTGGTTATAACTATATTTTGTCTTTTTGAAACGTAACTTCTAAGTAGGCTTGTATGGCGTTAGACCTTGCTCATGACCTAATTGTGTAATATCACAGCAAGGGAAGTGGTGCGCTGTTTTATAGAAATTAGAAGTTGGTCATAGATTACTCCCATTATGCGCAGGACACTATTGGAGTCTTTATGTATTTTACTGTTTTCTTTTTCCAGGTAAAAGATAGTTTTATCTTAGACTATGGTGCTTTTTTCTTAGTCAGAACCTTCTATAGCAATGCCCTTAGAGCTTTAGTAGATTAAGATTATTGCTCGCAATATACTTTTATCTGTGTTGTAAACTCAGATTCGTTCATCGTTTCACACTCTATGTTTAGCTGTACTGCAAGCTACGTCACTGTTAAGTGCCATTCTTAGTCGTTGGTAGCGTATTTCTCTACGTTGTGGTGTCATTTGACCTAACCTTTGGTTTGTCATTCCTTTCTTCTCTTCTTTAACATTTACAGGAGTCGCGTCCAGAGTTCAGCCCATGTCAGAAACATCGCCTGCATTCAAAGTCAGAGGGCTCCAATGTTCTGTTGTGAATACAAAAATGACAGCCGTCGCACTCTTCATTATATCTGGTCTTTAATTCTGTAATGAAAATACAATTATGATTACGTCGCGTTCTTCATTTTATACTTGAAACAATGAGTGTTCTTCATGTTATACGTATAGGATGAGGAGAGTTAAAGCTTATGCGGTAATTATTGCTTGGTTTTCTATTTAATTTCCCACATAACGCTATACCTTCTTGGACGTTATGTCCTTTACTTTAACAATAGTGGCCTTCGAGGAGTGTTTCAGGTGTTGACTCTCTATACTATCTCTTGAGGTCGTAATGGTTTGTTACATACCTTCTCTAAGTTTTCCCTTTCTTTAGACACGCTTGAACGTTATTCCGATGACTGCAACGGGTTCACTCCCTAGTCGATTTGAAAGTATAACTACGTAGGGTCGAAAGACTTTATGCGGTCTTTATACTGGTCCACTAGCCAGAGGAGTTAATATGCTTAAATAGTGTTATGTTACAAGTCGTTTTGAGACTTTGGTTGTGGTAGCAACATGTTTTGGGTACCCACTACCATGGTTGGCATGTTCTCGGCACCATGGCAACCCGGTAATTTGGAAGTCTTAGTGTTTCCGTTTCACTTTAACTACAGGACAATCTGCTTCATAGTGTTGCATTCAATGTTTTAAAACATTTTTGTTATTAATTATAAAAAATTAACAAATTCATAATTTTCTGCATTACCATTGATATAGATATACATTACTATTATTAAATACACAATTAATTCTTAGTTTAATTACGAGTTTATTACAGCGTCTTCGAGATTAGAAGATTTGAGATCTGAAACATCGTAAAGTATTAATATAAAATTTATTAAATTTATTAACATCAAGTTTACTTCATTTATTAACAAGATAGTAATCAAAAGCTACGTAATATAGATTGACTCATTTGAGAGAATACTTTACATTACATCTTATAGATAGTTTCAGAGTATATTAAATTGTAGTGACTTAAGAAAAAATGTCGAAGCACGTGAGAGCTATTGCTGTTCTAAGAATGAAAACTATTGGTCGAGATATTGCA
CGTTGATCCTTTTACATTCGTGCACGAAAATTGGTGGATAGGTGTTCTAAGGGTATTAGCATCTTTAATATAAGGCCTTTCGAGCTTTCACCGACAGAATCGATTTCTATGCCTGTTACTGTCGTCTTTACAAATACATACCGACCCGTTAGATCTTTAACTTGACTTCTTTCACCCATTACTAGATTTTCAACGTTATTTGCCTGTCTTGCAGCTCTATGGCCTCTTTCCGGTAGTTCTTTCGTTACCTTTACTTTAGTAAAAGCTTTAGCACGTTAATGGTCTGCCTAGGGAGAGACAGTAAAGTCTCTTCATACCTTATTGGCAGTATAAGCTGCCTTTTGTGCAAGCTAATATACATTCATTTTTAGATTTTATTGAACTATTATAAAATTCATTAATTAATATTTGTCATGCTACCATGTTCTTTCACATTATTTTTAGCTTTATTTAAACAATGATCCGTTTACCACGTTGTATAGCATTACGGTATGCACCAGAGACGCCGTTGATGCTGAGGTCCGCACTATTGCTAAAGGAGTGAGGATTCTTGACAGAGAACTGCTTTGGCCTTCTTAAACGGCGATGGATGCGATACTGTTTGCTCTTAACGGTTCCTGGACGAGGCCTTTTATTTGCTTTTCGGCTTCCTCGGTACACGTAACTTCACGGTCTCGTCGTTGTCTACTTGCAATAGTCGCTGTCTCTTCGCCCATCTTACTACTGTCTCCCGCCTTTAACCCCGATGGTAGTCAGATTAGCGTTCTTTCTTGTACCTGTTCTATCATTTTCTCCAGTGCCAGTATTTTTTATGTTAGTCTTTCTGAGCGTCCTCCCACCTAGTTTGCTCTCAGTTATATCCTTCTTTGTGTTGTAACAGATAGCGTGCTCTCAACATCTTTACCTGCTATTTTAGACAAAGTGATGATGTGGACATGGTCCTACGGAGGTTCTATGGTCCGGGCATCTTTCTCAAGGCTTCTTTATACTAGAAATAACGGACAGCTTCTTGTTACTTAGATACCTGGAGTTTGCATCCCAGCTCCTTCCGCGATTCGGTCTAAAATGAGTTTTCGGACAGTTGTTCTATGTTTTGAAGGTTTAAGGACAGAGAACGTCGCGGCGTATT
(The first three nucleotides are "TAC", which is the complement of ATG.) But the output is not the reverse complement either, so I think there is a snafu in there. These genes of interest are on the reverse strand. Maybe your parser forgets to take the complement?
Also, are you planning on letting users extract more generic labels? (e.g. this GFF has the "gene" label)
Thanks!
yannick
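The distinction the report draws between the complement and the reverse complement can be checked with a few lines of plain Ruby. This is only a sketch: the helper names and the six-base `forward` string are made up for illustration, and nothing here is taken from the plugin's own code. It shows that a minus-strand feature read correctly starts with ATG, while a slice that is reversed but not complemented starts with TAC, which matches the symptom described above and is one possible explanation for it.

```ruby
# Complement each base without changing the order.
def complement(seq)
  seq.tr('ACGTacgt', 'TGCAtgca')
end

# Reverse complement: how a feature on the minus strand is
# normally reported, reading 5' to 3' on that strand.
def reverse_complement(seq)
  complement(seq).reverse
end

# Hypothetical plus-strand slice covering the start of a
# minus-strand gene (illustration only, not from the GFF above).
forward = 'AAGCAT'

puts reverse_complement(forward)  # ATGCTT  (starts with ATG, as expected)
puts forward.reverse              # TACGAA  (reversed only: starts with TAC)
puts complement(forward)          # TTCGTA  (complemented only)
```

Comparing the first few bases of the tool's output against these three variants of the scaffold slice would show which step is being skipped.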
Hi,
Are there any examples of how to change IDs in a GFF3 file?
Thank you in advance,
Michal
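One way to approach the ID question without relying on any particular parser is to rewrite the `ID=` and `Parent=` attributes in column 9 directly. The sketch below is plain Ruby and does not use the plugin's API; `rename_ids` and the `mapping` hash are hypothetical names chosen for the example, and it assumes tab-separated, nine-column feature lines with `;`-separated `key=value` attributes. It does not handle URL-escaped characters in attribute values.

```ruby
# Rewrite ID= and Parent= values in column 9 of a single GFF3 line.
# Comment lines, blank lines and anything that is not a nine-column
# feature line (for example an embedded FASTA section) pass through.
def rename_ids(line, mapping)
  return line if line.start_with?('#') || line.strip.empty?

  cols = line.chomp.split("\t", -1)
  return line unless cols.size == 9

  cols[8] = cols[8].split(';').map do |pair|
    key, value = pair.split('=', 2)
    if %w[ID Parent].include?(key) && value
      # Parent may hold several comma-separated IDs.
      renamed = value.split(',').map { |v| mapping.fetch(v, v) }.join(',')
      "#{key}=#{renamed}"
    else
      pair
    end
  end.join(';')

  cols.join("\t") + "\n"
end

# Hypothetical old => new ID mapping and a made-up feature line.
mapping = { 'gene1' => 'Si_Vg1' }
print rename_ids("ctg1\tmaker\tgene\t1\t100\t.\t-\t.\tID=gene1;Name=x\n", mapping)
```

Applying the function to every line of a file (for instance with `File.foreach`) keeps parent and child features consistent, because both `ID` and `Parent` are looked up in the same mapping.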
gff3-fetch mRNA m_hapla.WS232.genomic.fa m_hapla.WS232.annotations.gff3 > m_hapla.WS232.mRNA.fa
caused
/home/wrk/.gems/gems/bio-gff3-0.8.7/lib/bio/db/gff/digest/gffparser.rb:119:in `block in each_mRNA_seq': undefined method `seqname' for #<Array:0x00000000de79e8> (NoMethodError)
User request for output fasta containing direction information of CDS