Splash support (autologin issue, CLOSED)

teamhg-memex commented on August 21, 2024

Splash support


Comments (5)

madisonb commented on August 21, 2024

I don't understand why Splash is needed in order to support phpbb3-style cookies. If autologin requires Splash, then it is no longer really a Python module and requires a larger architecture to function. While I am not well versed in phpbb3-style cookies, I do not see why faking a request header with all of the proper information cannot be done, which is pretty easy in Scrapy.

We have been very happy with integrating autologin in our scraping architecture, and I think the best use of the module will be to make it standalone as much as possible.


lopuhin commented on August 21, 2024

Thanks for the feedback, @madisonb! Do you use autologin as a library to get the request data and then send it with Scrapy?

Splash support is helpful when we use autologin as a service, perhaps even on a different host, and also crawl via a separate Splash instance. In this case, by using the same Splash instance both in autologin and in the crawler, we get the same IP and the same user agent, and can also log in on sites that are hard to handle without Splash (JS-heavy or Tor).


lopuhin commented on August 21, 2024

Just to clarify: Splash support is intended to be optional, not a requirement.


madisonb commented on August 21, 2024

Precisely, we use autologin/formasaurus in library form but could switch over to autologin as a service if needed, and then use the generated cookies within Scrapy. We don't use Splash instances to crawl the open web, and for Tor we have our spiders configured to work with the network.

Most sites in the past have not cared whether the cookie comes from a different IP, but the phpbb3 sites may, and we may need extra engineering to work with that.
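For readers following the thread, the "reuse login cookies in the crawler" workflow described above could look roughly like the sketch below. It is not code from autologin or from anyone in this discussion: the cookie names, the URL, the user-agent string, and the helper functions `cookie_header` and `build_request_kwargs` are all hypothetical, and the cookies are assumed to arrive as a list of name/value dicts.

```python
def cookie_header(cookies):
    """Join cookie name/value pairs into a single Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def build_request_kwargs(url, cookies, user_agent):
    """Collect what a crawler request needs to reuse a login session.

    The same user agent that was used at login time is sent again,
    since some sites may tie a session to it.
    """
    return {
        "url": url,
        "headers": {"User-Agent": user_agent},
        "cookies": {c["name"]: c["value"] for c in cookies},
    }


# Hypothetical cookies, standing in for whatever the login step returned.
login_cookies = [
    {"name": "phpbb3_example_sid", "value": "abc123"},
    {"name": "phpbb3_example_u", "value": "42"},
]

kwargs = build_request_kwargs(
    "http://forum.example/viewforum.php",
    login_cookies,
    "Mozilla/5.0 (hypothetical crawler UA)",
)
print(cookie_header(login_cookies))
```

In a Scrapy spider, a dict of this shape could be passed on as `scrapy.Request(**kwargs)`, since `Request` accepts `url`, `headers`, and `cookies` arguments. This does nothing about the source IP, which is the part that may still need extra engineering for sites that check it.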


lopuhin commented on August 21, 2024

This is done in #8 by using scrapy and scrapy-splash.

