schwa-lab / libschwa-python
Python bindings for libschwa
License: MIT License
I have a stream with a store called tokens made of class Token. If I define a schema with a store tokens whose Ann class has a different name, I get an error:
schwa.dr.exceptions.ReaderException: Store u'tokens' points to <schwa.dr.schema.AnnSchema object at 0x7f410dcdcd60> but the store on the stream points to a lazy type.
Minimal example (given tokenized text with conventional naming):
from schwa import dr
import sys

class Tok(dr.Ann):
    pass

class Doc(dr.Doc):
    tokens = dr.Store(Tok)

r = dr.Reader(sys.stdin, Doc)
next(r)
Only because I have found this wc-like functionality (including the overall total) useful in practice.
dr tail could be used to track the progress of a file that is still being written, or to inspect the output of a process that broke while dr was writing, provided it returns the last complete docs rather than attempting to return a broken doc.
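The "last complete docs" behaviour can be sketched independently of docrep. The example below assumes a hypothetical framing in which each record is a 4-byte big-endian length followed by that many payload bytes. This is not the real docrep wire format, and tail_complete and frame are illustrative names, not part of dr.

```python
import io
import struct
from collections import deque


def tail_complete(stream, n=1):
    """Return the last n complete records, ignoring a truncated final record.

    Assumes a hypothetical framing: 4-byte big-endian length, then payload.
    """
    last = deque(maxlen=n)
    while True:
        header = stream.read(4)
        if len(header) < 4:
            break  # clean end of input, or a header that was cut short
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) < size:
            break  # the writer stopped mid-record; drop the partial record
        last.append(payload)
    return list(last)


def frame(payload):
    """Prefix a payload with its length (hypothetical framing)."""
    return struct.pack(">I", len(payload)) + payload


# Two complete records followed by one that is cut off part-way through.
data = frame(b"doc-1") + frame(b"doc-2") + frame(b"doc-3")[:6]
print(tail_complete(io.BytesIO(data), n=1))
```

The bounded deque keeps memory use proportional to n rather than to the size of the file, which matters when tailing a large stream.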
Double dashes between groups. Makes heterogeneous dr-* apps more consistent to use.
dr count could double as a process monitor, showing the cumulative counts of each store while processing. Using the drcli implementation:
$ bin/candc ... < in.dr | tee out.dr | dr count --every 100 --cumulative --timestamp --docs -s sentences -s nes -s parse_nodes
Would give me a running count of output progress from C&C.
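A rough sketch of the monitoring idea, not the drcli implementation: a generator passes documents through unchanged and writes a timestamped cumulative row every N documents. The function names and the use of SimpleNamespace objects in place of real docrep documents are assumptions made for illustration.

```python
import io
import sys
import time
from collections import Counter
from types import SimpleNamespace


def format_row(ndocs, totals, stores):
    """Tab-separated row: timestamp, docs seen so far, then one total per store."""
    fields = [time.strftime("%H:%M:%S"), str(ndocs)]
    fields.extend(str(totals[name]) for name in stores)
    return "\t".join(fields)


def monitor(docs, stores, every=100, out=sys.stderr):
    """Yield each doc unchanged, reporting cumulative store sizes every `every` docs."""
    totals = Counter()
    for ndocs, doc in enumerate(docs, 1):
        for name in stores:
            # A store missing from a doc counts as empty.
            totals[name] += len(getattr(doc, name, ()))
        if ndocs % every == 0:
            print(format_row(ndocs, totals, stores), file=out)
        yield doc


# Stand-in documents: each has 3 sentences and 1 named entity.
docs = [SimpleNamespace(sentences=[0] * 3, nes=[0]) for _ in range(250)]
log = io.StringIO()
processed = list(monitor(docs, ["sentences", "nes"], every=100, out=log))
print(log.getvalue(), end="")
```

Reporting to a separate stream keeps the progress rows out of the document pipeline, in the same way the tee in the command above keeps out.dr intact.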
I came across this odd error that is partly me being a py3 noob, but is also a bit of a rough edge on the library.
  File "/n/schwafs/home/wradford/repos/libschwa-python/schwa/dr/writer.py", line 68, in write
    self._write_doc_instance(doc, rt)
  File "/n/schwafs/home/wradford/repos/libschwa-python/schwa/dr/writer.py", line 160, in _write_doc_instance
    self._pack_prefixed(self._build_instance(doc, None, doc, rt.doc))
  File "/n/schwafs/home/wradford/repos/libschwa-python/schwa/dr/writer.py", line 156, in _build_instance
    instance[f.field_id] = field.to_wire(val, f, store, doc)
  File "/n/schwafs/home/wradford/repos/libschwa-python/schwa/dr/fields_extra.py", line 61, in to_wire
    return obj.encode(self.encoding)
AttributeError: 'bytes' object has no attribute 'encode'
The problem was that I was trying to serialise a bytes object using a dr.Text field. This is likely to bite py2 natives like myself, so perhaps a warning might be nice.
The main thing I'd like would be some better reporting of what field failed to serialise. While it's fine for me to go digging in the checked-out source and add some print statements, it's much better to present more detail in the error message.
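As a sketch of the kind of message that would help, the function below checks the value type before encoding and names the field in the error. encode_text and its field_name parameter are hypothetical and are not the library's to_wire signature; how the field name would be obtained inside the writer is left open.

```python
def encode_text(value, encoding="utf-8", field_name="<unknown>"):
    """Encode a text value, naming the offending field on failure.

    Hypothetical helper for illustration; not part of schwa.dr.
    """
    if isinstance(value, bytes):
        # On py3, bytes has no .encode(); say so explicitly instead of
        # letting an AttributeError surface from deep inside the writer.
        raise TypeError(
            "field {!r} expects text (str) but got bytes; "
            "decode the value before assigning it".format(field_name)
        )
    try:
        return value.encode(encoding)
    except AttributeError as exc:
        raise TypeError(
            "field {!r} cannot serialise a value of type {}".format(
                field_name, type(value).__name__
            )
        ) from exc


print(encode_text("some text", field_name="norm"))
try:
    encode_text(b"some bytes", field_name="norm")
except TypeError as err:
    print(err)
```

Raising with the field name turns the traceback above into something actionable without needing print statements in the checked-out source.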
schwa.dr.contrib.processing is designed to work with a previous dr-dist that relied on ZMQ REQ/REP. A Python interface for the new source/sink/control model (as in https://github.com/schwa-lab/libschwa/blob/develop/src/lib/schwa/dr-dist/worker_main.h) should be adopted, either in the existing schwa.dr.contrib.processing or in a new framework.