Comments (8)

victor-torres commented on May 28, 2024

Even though this optional-requirements strategy is documented in PEP 508, I don't believe it's the best approach if we want mainstream usage of this library.

You suggest adding docs that specify which requirements are necessary for each part of the software, but I don't believe that's a good solution either. We all understand how useful documentation is, but IMHO we should make the software work out of the box whenever possible, following principles such as convention over configuration and batteries included.

You may not consider the JSON Schema validation part of the project's core, but to me, if we are using an external library in our source code, it should be listed in the project's requirements. Having multiple requirement files (not to be confused with test requirement files) could be a little confusing, especially for new users.

If you say this validation thing is not part of the project's core, we could just provide a default interface and publish something else in another package, stored in another repository. Something that could be installed like:

pip install spidermon
pip install spidermon-json-schema-validator

It's way clearer to me.
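A rough sketch of what that split could look like. All names here are hypothetical and not spidermon's actual API: `ItemValidator` stands for the default interface the core package would provide, and `RequiredFieldsValidator` stands in for whatever a separate add-on package would ship.

```python
from abc import ABC, abstractmethod


class ItemValidator(ABC):
    """Hypothetical default interface that would live in the core package."""

    @abstractmethod
    def validate(self, item):
        """Return a list of error messages (empty when the item is valid)."""


class RequiredFieldsValidator(ItemValidator):
    """Stand-in for an implementation shipped by a separate add-on package."""

    def __init__(self, required):
        self.required = list(required)

    def validate(self, item):
        # Report every required field that is absent from the item.
        return [f"missing field: {name}" for name in self.required if name not in item]


validator = RequiredFieldsValidator(["url", "title"])
errors = validator.validate({"url": "https://example.com"})
```

The core would depend only on the interface, so the third-party library is pulled in only by the add-on that needs it.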

What I'm trying to say is that we should try to make things a little easier and more straightforward. It doesn't matter whether you are a new or an experienced developer, you expect things to be easy. If the project provides a JSON Schema validator class, you should be good to go with a simple pip install spidermon, especially since it's such a common case among users.

Not to mention the undesirable runtime errors caused by missing dependencies, which you only discover later and then have to spend time figuring out what's missing. I know, I know, it wouldn't happen if we read the docs, right? But people don't always do that.
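To illustrate the kind of late failure meant here, and one way a library could at least make it explicit, here is a minimal sketch. The `require` helper and `MissingDependencyError` are hypothetical names used for illustration only; they are not part of spidermon.

```python
import importlib


class MissingDependencyError(ImportError):
    """Raised when an optional dependency is not installed (hypothetical)."""


def require(module_name, feature, install_hint):
    """Import an optional module, or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise MissingDependencyError(
            f"{feature} needs the '{module_name}' package. "
            f"Install it with: {install_hint}"
        ) from exc
```

A feature that relies on an optional library would call something like `require("jsonschema", "JSON Schema validation", "pip install jsonschema")` when it is first used, so the user sees what to install instead of a bare ImportError.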

Django is a well-known and widely adopted Python web framework. Its code base is huge, and even though it ships dozens of classes and features you don't use, you're good to go with a pip install django. Other things can be installed as additional packages, but those are not part of the core project; they should be listed as external dependencies that hook into the built-in interfaces and are installed manually.

from spidermon.

rennerocha commented on May 28, 2024

Considering the current implementation of spidermon, item validation is not a core feature (despite being widely used), so you can still use spidermon (creating monitors and custom actions) for your spider without installing these libraries.

A user can also choose a different library for item validation (other than jsonschema or schematics), and IMHO the user should not be forced to install non-essential libraries.

Even the extras_require currently in place installs more than necessary (for example, [monitoring] installs slackclient even if I just want to use boto).
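A finer-grained layout could look roughly like the sketch below. The group names and package lists are assumptions made for illustration; they are not spidermon's actual setup.py.

```python
# Hypothetical extras layout: one group per optional feature,
# so installing one integration does not pull in the others.
EXTRAS_REQUIRE = {
    "slack": ["slackclient"],
    "s3": ["boto"],
    "jsonschema": ["jsonschema"],
    "schematics": ["schematics"],
}

# Convenience group for users who prefer everything included.
EXTRAS_REQUIRE["all"] = sorted(
    {pkg for pkgs in EXTRAS_REQUIRE.values() for pkg in pkgs}
)

# In setup.py this dict would be passed as:
#   setup(..., extras_require=EXTRAS_REQUIRE)
```

With a layout like this, a hypothetical `pip install spidermon[s3]` would install only boto, while `pip install spidermon[all]` would cover the batteries-included case.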

Scrapy doesn't include these extra libraries by default, for example:
https://doc.scrapy.org/en/latest/topics/feed-exports.html#topics-feed-storage-s3

I think the best solution would be to list in the docs the libraries required for each contrib feature the user wants to use.

raphapassini commented on May 28, 2024

After reading @victor-torres's comment I have to agree with him: it's better to have everything included. This makes things easier for developers, and we want them to use the library and monitor their spiders, right?

raphapassini commented on May 28, 2024

I think this is a nice suggestion.
We use validation more often than not, right?

muzaffaryousaf commented on May 28, 2024

@rennerocha @raphapassini I think this is a very good suggestion and we should move towards the unification of these requirements.

rennerocha commented on May 28, 2024

I still don't agree that these external libraries should be installed in the user environment when you type pip install spidermon.

Even the jsonschema library doesn't install every dependency in the environment if it won't be used. If you want more validation formats, you need to install optional requirements.

You mentioned Django, and besides the fact that it is a complete framework (compared to spidermon, a simple extension), it doesn't install everything for the user (you need to install your database libraries, for example). If the majority of users are on PostgreSQL, should Django install PostgreSQL libraries for everyone, even if I don't need them?

Scrapy doesn't install boto. If you want to use S3 storage, you need to install an optional dependency (even though using this feature is a common case).

After pip install spidermon you are good and can create your monitors and your custom actions. What could be easier than that???

rennerocha commented on May 28, 2024

Related to #89, jsonschema is required by PythonExpressionsMonitor, a core feature that is not documented yet (we need more docs!!!), so now I agree that it needs to be included by default when you install spidermon.

rennerocha commented on May 28, 2024

After #100 this is solved.
