
Comments (7)

tucotuco commented on May 31, 2024

Oops, it looks like lxml is the only XML parser BeautifulSoup can use:

"Right now, the only supported XML parser is lxml. If you don’t have lxml installed, asking for an XML parser won’t give you one, and asking for “lxml” won’t work either."

http://www.crummy.com/software/BeautifulSoup/bs4/doc/#specifying-the-parser-to-use

So the new question becomes, "Would it be possible to have the reader not depend on BeautifulSoup?"

from python-dwca-reader.

niconoe commented on May 31, 2024

Hi John,

Indeed, you've nailed it: python-dwca-reader depends on BeautifulSoup, and BeautifulSoup needs lxml to parse XML. I have been uncomfortable for a long time with having such a heavy dependency for relatively "peripheral" features.

So one of my medium-term plans was to replace BeautifulSoup with something lighter, or at least to make it optional. Do you urgently need to use python-dwca-reader? In the next few days (let's say a week) I can find time to evaluate whether I can publish a new version that doesn't depend on BeautifulSoup. If it is not too hard and is useful for you, I'd definitely go for it. It's also a good opportunity to test it (and fix it if necessary) on Jython; I don't think that has been done before!

Best,

Nico
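A minimal sketch of the "make it optional" idea mentioned above, not the approach python-dwca-reader actually took: prefer lxml when it is installed and fall back to the standard library otherwise. The names `BACKEND` and `parse_xml` and the sample XML string are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as StdlibET

try:
    # Use lxml when it happens to be installed...
    from lxml import etree as ET
    BACKEND = "lxml"
except ImportError:
    # ...otherwise fall back to the standard library, which needs no extra install.
    ET = StdlibET
    BACKEND = "stdlib"


def parse_xml(text):
    """Parse an XML string with whichever backend was importable (hypothetical helper)."""
    return ET.fromstring(text)


# Hypothetical sample document, used only to exercise the helper.
root = parse_xml("<archive><core rowType='Occurrence'/></archive>")
print("%s backend, core rowType = %s" % (BACKEND, root.find("core").get("rowType")))
```

Both backends expose `fromstring`, `find`, and `get` with the same meaning for this simple case, which is what makes the fallback cheap.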

tucotuco commented on May 31, 2024

I am using python-dwca-reader actively, but the Jython context does not have the same urgency as just using the Readers. I thought about forking the repository and making a version in which BeautifulSoup is optional, but it would probably take me longer than next week to get around to it. If you can do it in that same time frame, that is better. I will gladly test it as soon as it is ready.

niconoe commented on May 31, 2024

Cool, I didn't know you were already using it. I'm happy that my work is useful to others.

I had a quick look, and it does seem possible to make a version of python-dwca-reader that replaces BeautifulSoup/lxml with ElementTree from the standard library. If I'm not mistaken, ElementTree is also available in Jython, so we shouldn't be too far from Jython compatibility. What do you think?
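To illustrate what parsing with ElementTree alone looks like, here is a rough sketch that reads a small Darwin Core Archive descriptor (`meta.xml`) using only the standard library. It is not code from python-dwca-reader: the sample descriptor and the `read_core` helper are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed meta.xml descriptor for illustration.
META_XML = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence" ignoreHeaderLines="1">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/country"/>
  </core>
</archive>"""

# ElementTree addresses namespaced tags as "{namespace-uri}localname".
NS = "{http://rs.tdwg.org/dwc/text/}"


def read_core(meta_xml):
    """Return the core file's row type, data file location and field terms (hypothetical helper)."""
    root = ET.fromstring(meta_xml)
    core = root.find(NS + "core")
    location = core.find(NS + "files/" + NS + "location").text
    # Map each column index to the term it holds.
    fields = {}
    for field in core.findall(NS + "field"):
        fields[int(field.get("index"))] = field.get("term")
    return {"row_type": core.get("rowType"), "location": location, "fields": fields}


core_info = read_core(META_XML)
print(core_info["location"])
```

The only real adjustment compared with a lenient HTML-style parser is that namespaces must be spelled out in every path.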

tucotuco commented on May 31, 2024

I think, "Excellent, go for it." Waiting anxiously.

(In reply to Nicolas Noé's comment of Fri, Aug 14, 2015, 11:16 AM.)

niconoe commented on May 31, 2024

Hi John,

I just released a new version (0.7.0) that completely drops the dependency on BeautifulSoup and lxml. All the APIs that used to return BeautifulSoup objects now return xml.etree.ElementTree.Element (from the standard library). Could you have a look?

I only checked very briefly, but it seems to work under Jython!
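For callers who previously navigated a BeautifulSoup object, a sketch of the equivalent lookups on an `xml.etree.ElementTree.Element` may help. The `metadata` element below is built by hand from a made-up EML fragment; it only stands in for whatever Element the reader returns, and none of these names come from python-dwca-reader itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical EML-like metadata fragment, for illustration only.
EML = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1">
  <dataset>
    <title>Example occurrence dataset</title>
    <creator><individualName><surName>Doe</surName></individualName></creator>
    <creator><individualName><surName>Smith</surName></individualName></creator>
  </dataset>
</eml:eml>"""

metadata = ET.fromstring(EML)

# With BeautifulSoup one would typically write soup.find("title").string;
# with an Element, find() takes a path relative to the element and the text is in .text.
title = metadata.find("dataset/title").text

# BeautifulSoup's find_all("surName") searches the whole tree;
# the closest Element counterpart is iter(), which walks all descendants.
surnames = [element.text for element in metadata.iter("surName")]

print(title)
print(", ".join(surnames))
```

Note that `find` and `findall` only look at direct children unless the path says otherwise (for example `".//surName"`), which is the most common surprise when porting BeautifulSoup code.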

tucotuco commented on May 31, 2024

Confirmed that this works great under Jython and completely solves the issue for me. Closing. Thank you very much.
