Comments (4)
Googlebot is getting blocked quite a bit in my rack-attack logs when I use legitbot. Is that because some IPs are missing? For example:
95.216.227.158
95.216.33.117
Is it possible to automate the process of adding new IPs using the host command like they suggest here?
https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot
I actually had more IPs in there, but after calling 'host' on them I realized they were fake; it was just chance that the first ones I tested were genuine Googlebot.
from legitbot.
@inspire22 Legitbot follows the exact verification procedure you linked to, only programmatically. Did you try to follow the steps? These IPs do not pass for me:
$ host 95.216.227.158
158.227.216.95.in-addr.arpa domain name pointer crawl-95-216-227-158.googlebot.com.
$ host crawl-95-216-227-158.googlebot.com
Host crawl-95-216-227-158.googlebot.com not found: 3(NXDOMAIN)
$ host 95.216.33.117
117.33.216.95.in-addr.arpa domain name pointer crawl-95-216-33-117.googlebot.com.
$ host crawl-95-216-33-117.googlebot.com
Host crawl-95-216-33-117.googlebot.com not found: 3(NXDOMAIN)
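The two-step check shown in the transcript above (reverse lookup, then forward lookup of the returned name) can be sketched in Ruby with the standard `resolv` library. This is only an illustrative sketch, not legitbot's actual implementation: the helper name `forward_confirmed?`, the `resolver:` parameter, and the `GOOGLE_DOMAINS` list are assumptions made for this example.

```ruby
require 'resolv'

# Assumed list of domains accepted for Google crawlers; check the linked
# Google documentation for the authoritative list.
GOOGLE_DOMAINS = %w[googlebot.com google.com googleusercontent.com].freeze

# Hypothetical helper (not part of legitbot's API).
# Step 1: the PTR name for the IP must sit under an allowed domain.
# Step 2: that name must resolve forward to the same IP.
def forward_confirmed?(ip, allowed_domains = GOOGLE_DOMAINS, resolver: Resolv::DNS.new)
  name = resolver.getname(ip).to_s
  return false unless allowed_domains.any? { |domain| name.end_with?(".#{domain}") }

  resolver.getaddresses(name).map(&:to_s).include?(ip)
rescue Resolv::ResolvError
  # No PTR record at all: treat as unverified.
  false
end
```

Calling `forward_confirmed?('95.216.227.158')` performs live DNS queries. With the records shown above, the first step would match `googlebot.com`, but the forward lookup would return no addresses, so the result would be `false`.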
from legitbot.
Oops, you're right, thanks! Strange they would match the first step and not the second.
I'd misread your TODO about adding crawlers as being about adding crawler IPs, which is why I jumped on here. My bad, and apologies :)
from legitbot.
By the way, I don't think these IPs belong to Google. Both of them are owned by Hetzner (a well-known European hosting provider):
$ whois 95.216.227.158
…
route: 95.216.0.0/16
org: ORG-HOA1-RIPE
descr: HETZNER-DC
…
$ whois 95.216.33.117
…
route: 95.216.0.0/16
org: ORG-HOA1-RIPE
descr: HETZNER-DC
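As a quick cross-check of the whois output above, Ruby's standard `ipaddr` library can confirm that both addresses fall inside the listed route. This is a minimal sketch; the constant `HETZNER_ROUTE` and the helper `in_hetzner_route?` are names made up for this example.

```ruby
require 'ipaddr'

# Route copied from the whois output above.
HETZNER_ROUTE = IPAddr.new('95.216.0.0/16')

# Hypothetical helper: true when the address lies inside that route.
def in_hetzner_route?(ip)
  HETZNER_ROUTE.include?(IPAddr.new(ip))
end

%w[95.216.227.158 95.216.33.117].each do |ip|
  puts "#{ip} in 95.216.0.0/16: #{in_hetzner_route?(ip)}"
end
```

Both lines print `true`, which is consistent with the addresses being allocated to Hetzner rather than to Google.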
"Strange they would match the first step and not the second."
Someone managed to convince Hetzner to create these reverse DNS records, which surprises me. Faking the corresponding forward records is close to impossible, because Google itself controls the googlebot.com zone.
from legitbot.