wipo-st25-parse README File

This is a module for Biopython that parses WIPO ST.25 sequence files and imports them as SeqRecord objects.

Objective

I had the misfortune of having to do some analysis on a bunch of patent sequence files in this WIPO ST.25 format.

The software provided by the USPTO for this purpose (https://www.uspto.gov/patents-getting-started/patent-basics/types-patent-applications/utility-patent/checker/patentin) has been "enhanced" by .NET. It is not exactly amenable to fitting into a sequence analysis pipeline. It did not open ST.25 files without import error messages that caused it to abort. I did not attempt to fix it.

This is a hacked-together parser for ST.25 sequence files to get them into Biopython objects. It works for the most part, but it's pretty ugly, and the format specification is vague on a bunch of important points, so lots of these files have subtle differences. Your mileage may vary. Use at your own risk. I don't think enough people will have to work with these types of files to justify pulling it back into biopython-master as a module in SeqIO.parse, and if you do work with these files I would suggest keeping a bottle of scotch in your desk drawer.

I am not ambitious enough to script something that would actually write files in this format. It should be easier than writing a parser, but I just do not want to do it.

Usage

Import it and call it with a file handle. It returns an iterator that walks through the file and yields SeqRecord objects, each with the generic_protein, generic_nucleotide, or generic_RNA alphabet.

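# Assumes the module file is ST25SequenceIterator.py and defines a function
# of the same name; adjust the import to match your checkout.
from ST25SequenceIterator import ST25SequenceIterator

# "sequence_listing.txt" is a placeholder path.
with open("sequence_listing.txt") as file_handle:
    for seq_record in ST25SequenceIterator(file_handle):
        ...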
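
For anyone curious what an iterator like this does under the hood, here is a minimal sketch, assuming the usual ST.25 numeric identifiers (<210> for the SEQ ID NO, <212> for the molecule type, <400> for the start of the sequence body). The function name iter_st25_records and the sample listing are made up for illustration. This is not the module's actual code: it yields plain tuples instead of SeqRecord objects so that it runs without Biopython, and it does not handle protein listings, which use three-letter residue codes.

```python
import io
import re


def iter_st25_records(handle):
    """Yield (seq_id, mol_type, sequence) tuples from an ST.25-style listing.

    Hypothetical helper for illustration only; not the module's real parser.
    """
    seq_id = mol_type = None
    chunks = None  # stays None until a <400> tag opens the sequence body
    for line in handle:
        tag = re.match(r"\s*<(\d{3})>\s*(.*)", line)
        if tag:
            code, value = tag.group(1), tag.group(2).strip()
            if code == "210":
                # A new <210> closes any record that is in progress.
                if chunks is not None:
                    yield seq_id, mol_type, "".join(chunks)
                seq_id, mol_type, chunks = value, None, None
            elif code == "212":
                mol_type = value
            elif code == "400":
                chunks = []
        elif chunks is not None:
            # Keep letters only, dropping position counts and spacing.
            chunks.append(re.sub(r"[^A-Za-z]", "", line))
    # Emit the final record, which has no following <210> to close it.
    if chunks is not None:
        yield seq_id, mol_type, "".join(chunks)


# Made-up two-record listing for demonstration.
sample = """<210> 1
<211> 20
<212> DNA
<213> Artificial Sequence

<400> 1
atggccagta cgtagctagc                                              20

<210> 2
<211> 12
<212> RNA
<213> Artificial Sequence

<400> 2
augcgcuaag cu                                                      12
"""

records = list(iter_st25_records(io.StringIO(sample)))
for seq_id, mol_type, sequence in records:
    print(seq_id, mol_type, sequence)
```

The <212> value (DNA, RNA, or PRT) is the field that would decide which alphabet a record gets. Real files deviate from this tidy layout in many small ways, which is where most of the ugliness in the actual parser comes from.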

Requirements

Contributors

trappedinaribosome
