
libschwa-python's People

Contributors

jnothman, timdawborn, wejradford

Forkers

pombredanne

libschwa-python's Issues

Opaque ReaderException when stream and python class names mismatched

I have a stream with a store called tokens made of class Token. If I define a schema with a store tokens whose Ann class has a different name, I get an error:

schwa.dr.exceptions.ReaderException: Store u'tokens' points to <schwa.dr.schema.AnnSchema object at 0x7f410dcdcd60> but the store on the stream points to a lazy type.

Minimal example (given tokenized text with conventional naming):

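import sys

from schwa import dr


# Deliberately named Tok, although the store on the stream holds Token.
class Tok(dr.Ann):
    pass


class Doc(dr.Doc):
    tokens = dr.Store(Tok)


# Reading the first document raises the ReaderException shown above.
r = dr.Reader(sys.stdin, Doc)
next(r)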

dr tail to return last *complete* document

dr tail could be used to track the progress of a file that is still being written, or to inspect the output of a process that broke while dr was writing, if it returned the last complete docs rather than attempting to return a broken doc.
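To illustrate the behaviour being asked for, here is a minimal sketch that skips a truncated trailing record. It assumes a hypothetical framing in which each record is preceded by a 4-byte big-endian length; this is not the docrep wire format, and `last_complete_records` and `frame` are made-up names for illustration only.

```python
import io
import struct
from collections import deque


def last_complete_records(stream, n=1):
    """Return the last n fully written records from a length-prefixed stream.

    Assumed (hypothetical) framing: 4-byte big-endian length, then payload.
    """
    complete = deque(maxlen=n)  # keeps only the n most recent records
    while True:
        header = stream.read(4)
        if len(header) < 4:
            break  # clean EOF, or the length prefix itself was cut off
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) < size:
            break  # record was cut off mid-write, so ignore it
        complete.append(payload)
    return list(complete)


def frame(payload):
    """Prefix a payload with its length (same hypothetical framing)."""
    return struct.pack(">I", len(payload)) + payload


# Two complete records followed by one that was truncated mid-write.
data = frame(b"doc1") + frame(b"doc2") + frame(b"doc3")[:-2]
print(last_complete_records(io.BytesIO(data)))
```

The point of the sketch is only the policy: a record counts as a result once its whole payload has been read, so a half-written final record never surfaces.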

dr count should support --every with an argument

dr count could double as a process monitor, showing the cumulative counts of each store while processing. Using the drcli implementation:

$ bin/candc ... < in.dr | tee out.dr | dr count --every 100 --cumulative --timestamp --docs -s sentences -s nes -s parse_nodes

This would give me a running count of output progress from C&C.
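The reporting logic behind such an option might look like the sketch below. It is not the drcli or libschwa implementation: documents are stood in for by plain dicts mapping store names to lists, and `running_counts` is a hypothetical helper.

```python
def running_counts(docs, stores, every=100, cumulative=True):
    """Yield (docs_seen, counts) after every `every` documents.

    A final partial batch is reported too. With cumulative=False the
    counts are reset after each report.
    """
    totals = {name: 0 for name in stores}
    seen = 0
    for doc in docs:
        seen += 1
        for name in stores:
            totals[name] += len(doc.get(name, ()))
        if seen % every == 0:
            yield seen, dict(totals)  # copy, so later updates don't leak
            if not cumulative:
                totals = {name: 0 for name in stores}
    if seen % every != 0:
        yield seen, dict(totals)


# Five stand-in documents, reported every two documents.
docs = [{"sentences": ["s1", "s2"], "nes": ["n1"]}] * 5
for seen, counts in running_counts(docs, ["sentences", "nes"], every=2):
    print(seen, counts)
```

Because the function is a generator, it emits each report as soon as the batch is complete, which is what makes it usable as a progress monitor on a pipe.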

Serialisation errors should report the field they failed on

I came across this odd error that is partly me being a py3 noob, but is also a bit of a rough edge on the library.

 File "/n/schwafs/home/wradford/repos/libschwa-python/schwa/dr/writer.py", line 68, in write
   self._write_doc_instance(doc, rt)
 File "/n/schwafs/home/wradford/repos/libschwa-python/schwa/dr/writer.py", line 160, in _write_doc_instance
   self._pack_prefixed(self._build_instance(doc, None, doc, rt.doc))
 File "/n/schwafs/home/wradford/repos/libschwa-python/schwa/dr/writer.py", line 156, in _build_instance
   instance[f.field_id] = field.to_wire(val, f, store, doc)
 File "/n/schwafs/home/wradford/repos/libschwa-python/schwa/dr/fields_extra.py", line 61, in to_wire
   return obj.encode(self.encoding)
AttributeError: 'bytes' object has no attribute 'encode'

The problem was that I was trying to serialise a bytes object using a dr.Text field. This is likely to bite py2 natives like myself, so perhaps a warning might be nice.

The main thing I'd like would be some better reporting of what field failed to serialise. While it's fine for me to go digging in the checked-out source and add some print statements, it's much better to present more detail in the error message.
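As a rough sketch of the kind of reporting I mean: wrap the per-field conversion and re-raise with the field name and value type attached. Everything here is a stand-in rather than the library's code: `TextField`, `build_instance` and `FieldSerialisationError` are hypothetical names, and the real writer's signatures differ. Python 3 is assumed.

```python
class FieldSerialisationError(Exception):
    """Hypothetical error that names the field that failed to serialise."""


class TextField:
    """Stand-in for a text field that encodes str values to bytes."""

    def __init__(self, encoding="utf-8"):
        self.encoding = encoding

    def to_wire(self, obj):
        return obj.encode(self.encoding)


def build_instance(obj, fields):
    """Serialise each named field, reporting which one failed."""
    instance = {}
    for field_id, (name, field) in enumerate(fields.items()):
        val = getattr(obj, name)
        try:
            instance[field_id] = field.to_wire(val)
        except Exception as exc:
            raise FieldSerialisationError(
                "could not serialise field %r of %s: value has type %s (%s)"
                % (name, type(obj).__name__, type(val).__name__, exc)
            ) from exc
    return instance


class Token:
    raw = "abc"
    norm = b"abc"  # bytes in a text field, as in the traceback above


fields = {"raw": TextField(), "norm": TextField()}
try:
    build_instance(Token(), fields)
except FieldSerialisationError as exc:
    print(exc)
```

Chaining with `from exc` keeps the original AttributeError in the traceback, so nothing is lost while the message gains the field name.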
