Comments (15)

phillord commented on August 21, 2024

I have it working now. It's taking me a while to test, because I think my code was dependent on behaviour from the old parser that was actually buggy.

from sophia_rs.

pchampin commented on August 21, 2024

which version of sophia are you using?

pchampin commented on August 21, 2024

Assuming you are using the latest release (0.5.3), I just pushed an experimental branch rio_xml. You might want to try it, and replace xml::RdfXmlParser by xml2::RdfXmlParser in your code, see if that solves this issue -- and possibly #76 as well.

If it does, I will probably switch to this implementation as the default RDF/XML parser.

phillord commented on August 21, 2024

I'm trying to work my way through this. It seems to work and parse much quicker, but it's not a drop in replacement in my code.

Currently my main use for this just dumps graphs out into [Term; 3]. So I do this:

    let triple_iter = sophia::parser::xml::parse_bufread(bufread);

    let triple_result: Result<Vec<_>, _> = triple_iter.collect();
    let triple_v: Vec<[SpTerm; 3]> = triple_result.unwrap();

But xml2 is not a drop-in replacement here, and I haven't managed to work out how to get triples from the xml2::RdfXmlParser. Apologies, I find the API rather confusing! I'd be grateful for any hints.

pchampin commented on August 21, 2024

> It seems to work and parse much quicker,

good

> but it's not a drop in replacement in my code.

not quite, you are right...

> Apologies, I find the API rather confusing! I'd be grateful for any hints.

I should be the one to apologize... I'm sorry you feel that way about the API, and I am open to any suggestion to make it easier.

Now about your problem:

TL;DR

This should work for you:

    let triple_source = sophia::parser::xml2::parse_bufread(bufread);
    let triple_result: Result<Vec<[BoxTerm;3]>, _> = triple_source.collect_triples();
    let triple_v = triple_result.unwrap();

Explanations

  • First, you need to understand that xml::parse_bufread(b) is just a shortcut for xml::RdfXmlParser::default().parse(b), where RdfXmlParser implements the TripleParser trait. So basically, xml::parse_bufread is specified by the trait method TripleParser::parse (and this should be true of the parse_bufread function of any parser module).

  • The contract of this method is to return a TripleSource, which itself is a trait. This trait is implemented by any iterator of triples, but has other implementations. Each parser provides its own implementation of TripleSource. xml's happens to be an iterator, and your code above was relying on that. xml2, on the other hand, has a different implementation (which contributes to making it faster, by the way 😉).

  • Since sophia 0.5.0, TripleSource provides a method similar to Iterator's collect. It is called collect_triples, and can build most implementations of Graph (to be precise: it can build any implementation of CollectibleGraph).

  • Vec<[BoxTerm;3]> happens to implement Graph and CollectibleGraph.
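To illustrate the points above, here is a minimal, self-contained sketch. The traits in it are simplified stand-ins written for this example, not sophia's actual definitions (the real ones are generic over term types and propagate errors). The names `PushParser`, `for_each_triple`, `from_triples` and `triple` are hypothetical.

```rust
// Simplified stand-ins for illustration only; NOT sophia's real traits.
type Triple = [String; 3];

/// Anything that can be built from a stream of triples.
trait CollectibleGraph: Sized {
    fn from_triples<S: TripleSource>(source: S) -> Self;
}

/// A source that hands each triple to a callback.
trait TripleSource: Sized {
    fn for_each_triple<F: FnMut(Triple)>(self, f: F);

    /// Analogue of `Iterator::collect`, but targeting graphs.
    fn collect_triples<G: CollectibleGraph>(self) -> G {
        G::from_triples(self)
    }
}

// Any iterator of triples is a TripleSource (what the old `xml` code relied on).
impl<I: Iterator<Item = Triple>> TripleSource for I {
    fn for_each_triple<F: FnMut(Triple)>(self, mut f: F) {
        for t in self {
            f(t);
        }
    }
}

// A source that is NOT an iterator (hypothetical, in the spirit of `xml2`).
struct PushParser {
    input: Vec<Triple>,
}

impl TripleSource for PushParser {
    fn for_each_triple<F: FnMut(Triple)>(self, mut f: F) {
        for t in self.input {
            f(t);
        }
    }
}

// A plain vector of triples can act as a collectible graph.
impl CollectibleGraph for Vec<Triple> {
    fn from_triples<S: TripleSource>(source: S) -> Self {
        let mut v = Vec::new();
        source.for_each_triple(|t| v.push(t));
        v
    }
}

/// Hypothetical helper to build a triple from string slices.
fn triple(s: &str, p: &str, o: &str) -> Triple {
    [s.to_string(), p.to_string(), o.to_string()]
}

fn main() {
    let data = vec![triple("s", "p", "o")];
    // Both kinds of source support the same `collect_triples` call.
    let from_iter: Vec<Triple> = data.clone().into_iter().collect_triples();
    let from_push: Vec<Triple> = PushParser { input: data }.collect_triples();
    assert_eq!(from_iter, from_push);
}
```

In this sketch, code written against `collect_triples` keeps working when the source stops being an iterator, whereas code that calls `Iterator::collect` directly does not.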

I hope this helps.

pchampin commented on August 21, 2024

FTR, there was an error in my previous comment; Vec<BoxTerm> should have been Vec<[BoxTerm;3]>. I just edited it to fix that.

phillord commented on August 21, 2024

Well, it seems to be working well. The two failures I was getting in my test suite were, I am sure, because of behaviour that was buggy in the old parser. It also fixes #76.

In terms of the API, I think the issue is partly mine. I still do not find Rust entirely natural to use. Especially when implemented through traits, the documentation you need in the Rust doc can be several clicks away or deep in the page. Main thing that would help would be a bit more module documentation and especially examples!

I need to think more on sophia, because at the moment my own https://github.com/phillord/horned-owl duplicates some of the functionality. Too many options.

pchampin commented on August 21, 2024

> Well, it seems to be working well.

Great. I'll make the Rio parser the default in the next release. I'll close both issues then.

> Main thing that would help would be a bit more module documentation and especially examples!

Yep, that's a pertaining item on my TODO list ;)

phillord commented on August 21, 2024

More documentation is on everyone's TODO list:-)

Do you have an ETA for a new release?

pchampin commented on August 21, 2024

> Do you have an ETA for a new release?

I'm hoping to do it by the end of June or beginning of July.

phillord commented on August 21, 2024

Okay, thanks for letting me know!

pchampin commented on August 21, 2024

> I'm hoping to do it by the end of June or beginning of July.

A little later than announced, but v0.6.0 is now out, with parser::xml now based on the Rio parser. Give it a try, and feel free to close this issue (and #76) if your problems are solved.

pchampin commented on August 21, 2024

@phillord are you ok to close this issue? Since the pre-release patch "[seemed] to be working well", I am assuming that your problem is also solved with the current release.

pchampin commented on August 21, 2024

@phillord up?

> are you ok to close this issue? Since the pre-release patch "[seemed] to be working well", I am assuming that your problem is also solved with the current release.

pchampin commented on August 21, 2024

closing: this issue is very old, and the current RDF/XML parser processes http://purl.obolibrary.org/obo/go.owl without any problem
