Comments (6)
Why are these StreamRDF... classes in ...riot/system/ and not in .../riot/writer/stream/?
Different meanings of "stream".
org.apache.jena.riot.system.stream is in support of the stream manager - that is, IO streams.
org.apache.jena.riot.system includes StreamRDF - a stream of triples/quads/prefixes.
from jena.
RIOT has its own tokenizer and parsers - the combination is x2 to x4 faster. The tokenizer is the performance bottleneck.
The fastest parsers in Jena run at up to 1M triples/second on binary RDF Thrift. RDF Protobuf is slightly less than 10% slower (making protobuf work for open-ended streams of input seems to create an extra object, and at 1 microsecond a triple this is observable).
The performance of Turtle and N-Triples is approximately 240 kTPS and 400 kTPS respectively. The only difference is that the N-Triples grammar parser is much simpler than all the "if"s for Turtle.
All these are a minimum of x4 faster than JavaCC.
All parsing performance is sensitive to the hardware used, so these figures are relative. (They are from an old Core i5 with a SATA SSD, which has been used consistently for measurements over time.)
Java has to convert to Java chars at some point, which is a copy. In fact, it is faster to convert large buffers using Java's built-in UTF-8 handling than to try to save one copy by converting each RDF term separately. Java checks all input for validity of UTF-8.
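To make the point about bulk conversion concrete, here is a minimal sketch using only the JDK. It is not RIOT's code: the class name, method name, and buffer size are hypothetical, chosen for illustration. It decodes the input in large blocks with the built-in UTF-8 decoder, configured here to reject invalid byte sequences, and a tokenizer would then scan the resulting chars.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class BulkUtf8Decode {
    // Hypothetical buffer size for illustration; not a value taken from RIOT.
    static final int BUFFER_CHARS = 128 * 1024;

    /** Decode the whole stream in large blocks and return the number of chars produced. */
    static long countChars(InputStream in) throws IOException {
        // Strict decoder: invalid UTF-8 raises an exception instead of being replaced.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        char[] buffer = new char[BUFFER_CHARS];
        long total = 0;
        try (Reader reader = new InputStreamReader(in, decoder)) {
            int n;
            while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
                // A tokenizer would scan buffer[0..n) here; this sketch only counts.
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "<http://example/s> <http://example/p> \"caf\u00e9\" .\n"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(countChars(new ByteArrayInputStream(data)));
    }
}
```

The bytes-to-chars copy happens once per block rather than once per RDF term, which is the trade-off described above. This sketch does not measure anything itself.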
If you'd like to improve the tokenizer and provide a PR, that would be great.
@AtesComp Could you provide a test case to illustrate the issue with StreamRDFWriter.getWriterStream? There is a lot of rdf-transform that may be influencing the issue.
Thanks, Andy. Of course, I should have written .../riot/io/stream, or just .../riot/stream, instead of .../riot/writer/stream. I was just locked onto writing out RDF export files. The system directory just didn't click with me.
As for the real issue, I'm noting that the OpenRefine 3.5.2 version uses an older 3.x Jena ARQ that is squashing my dependency on the 4.5.0 Jena ARQ Maven release. I looked at many ways to force it to use the newer jars but to no avail. Since my code is just a lowly red-headed stepchild extension to OpenRefine, I don't have much say in the matter. I'm fairly sure the getWriterStream issue is due to the older jar.
However, the OpenRefine 3.6-SNAPSHOT is up-to-date! So, now, if I can just get them to make an official release, all will be good. Well, mostly. The Jena documentation needs updating. I can live with it for now.
So this issue "StreamRDFWriter getWriterStream()" can be closed?