Comments (5)
Alright, so this is from this pull request, which I perhaps didn't think hard enough about: https://github.com/brendonh/pyth/pull/19/files
The stated purpose of that PR is to make it easier to filter out image data, by identifying it as such in the document object. But then none of the writers actually filter it out. Sigh.
I figure the right fix is to make Image a top-level type (instead of a Paragraph subclass), and then update the writers to ignore it. Or something.
from pyth.
Having looked again, it seems the images are being recognized. I see Image objects inside of paragraphs, so the fix may be as easy as having the writers ignore them.
Right. So there are two bugs:
- Image is a Paragraph subclass, so writers interpret it as text instead of (correctly) crashing on an unknown type.
- Writers don't know to skip it.
Someone should fix that! ;-)
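To illustrate the first bug, here is a minimal sketch. It does not use pyth's real classes: `Paragraph`, `Image`, and `render` below are stand-ins written for this example, and the only thing assumed from the thread is that Image subclasses Paragraph. A writer that dispatches with `isinstance()` accepts the subclass and emits the image data as text rather than raising.

```python
class Paragraph:
    """Stand-in for a paragraph; content is a list of strings here."""

    def __init__(self, content=None):
        self.content = list(content or [])


class Image(Paragraph):
    """Stand-in image type that subclasses Paragraph, as described above."""


def render(element):
    # isinstance() is true for subclasses, so an Image passes this check.
    if isinstance(element, Paragraph):
        return "".join(element.content)
    raise TypeError("unknown element type: %s" % type(element).__name__)


print(render(Paragraph(["hello"])))
print(render(Image(["89504e47"])))  # image data is written out as text
```

An unrelated type (say, an `int`) would hit the `TypeError` branch, which is the crash the first bullet says should have happened for images.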
@brendonh You should have highlighted me, since I wrote that original image support... I am not sure making Image a top-level class is the correct approach, since images actually appear in the flow of the paragraphs, and currently pyth checks religiously that a Document contains only Paragraphs.
Before my patch (#19), image data was simply interpreted as plain text, so the output of all the writers hasn't changed. I currently think adding functionality to the writers to ignore or handle the images is the best way forward.
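A rough sketch of what "ignore the images" could look like in a plain-text writer. Everything here is hypothetical: `Text`, `Image`, `Paragraph`, and `write_plain_text` are stand-ins written for illustration, not pyth's actual classes or writer API. The sketch assumes images sit inline in a paragraph's content next to text runs.

```python
class Text:
    """Stand-in for a run of text inside a paragraph."""

    def __init__(self, content):
        self.content = content


class Image:
    """Stand-in for inline image data inside a paragraph."""

    def __init__(self, data):
        self.data = data


class Paragraph:
    """Stand-in paragraph holding Text and Image items in flow order."""

    def __init__(self, content):
        self.content = list(content)


def write_plain_text(paragraphs):
    lines = []
    for paragraph in paragraphs:
        # Drop images explicitly; keep the text runs in their original order.
        parts = [
            item.content
            for item in paragraph.content
            if not isinstance(item, Image)
        ]
        lines.append("".join(parts))
    return "\n".join(lines)


doc = [
    Paragraph([Text("before "), Image("89504e47"), Text("after")]),
    Paragraph([Text("second paragraph")]),
]
print(write_plain_text(doc))
```

Testing for `Image` before anything else means the skip still works if Image happens to subclass another content type, since the image is filtered out before any text handling sees it.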
Ugh, you are right on both counts. Okay, I'll do something about it.