Comments (4)
Let me die in peace now.
from minet.
I guess the only course of action we have left here is to switch to GET
requests without fetching the body.
It should also be noted that the redirect will fail if you don't spoof UA...
I changed the default method to GET, and spoof_ua is now on by default. follow_meta_refresh is still False by default because it remains quite costly.
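
For readers who want to see the idea in isolation, here is a minimal sketch of resolving a URL with a GET request that never reads the body, while sending a browser-like User-Agent. It is not minet's code: it uses only Python's standard urllib, and the names resolve_without_body and BROWSER_UA (as well as the User-Agent string itself) are hypothetical, chosen for illustration. Only the spoof_ua flag name comes from the discussion above.

```python
import urllib.request

# Hypothetical browser-like User-Agent string, for illustration only.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def resolve_without_body(url, spoof_ua=True, timeout=10.0):
    """Follow redirects with GET and return (final_url, status).

    The response body is never read by this function.
    """
    # When spoof_ua is False, urllib falls back to its own default User-Agent.
    headers = {"User-Agent": BROWSER_UA} if spoof_ua else {}
    request = urllib.request.Request(url, headers=headers, method="GET")

    # urlopen follows HTTP redirects and returns once the final response's
    # headers have been parsed. We leave the block without calling
    # response.read(), so the body is not consumed and the connection closes.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl(), response.status
```

Unlike a HEAD request, this still works against servers that only redirect properly on GET, at the cost of opening a connection on which the body could have been sent. Redirects expressed as meta refresh tags are not handled here, since detecting them requires reading the HTML.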