
Comments (4)

inspire22 commented on June 23, 2024

I'm seeing Googlebot blocked quite a bit in my rack-attack logs when using legitbot. Is it perhaps because some IPs are missing?
95.216.227.158
95.216.33.117

Is it possible to automate the process of adding new IPs using the host command like they suggest here?
https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot

I originally had more IPs in this list, but after calling 'host' on them I realized they were fake. The first ones I tested just happened to be genuine Googlebot.

from legitbot.

alaz commented on June 23, 2024

@inspire22 Legitbot follows the exact verification procedure you linked to, only programmatically. Did you try to follow the steps? These IPs do not pass for me:

$ host 95.216.227.158
158.227.216.95.in-addr.arpa domain name pointer crawl-95-216-227-158.googlebot.com.
$ host crawl-95-216-227-158.googlebot.com
Host crawl-95-216-227-158.googlebot.com not found: 3(NXDOMAIN)

$ host 95.216.33.117
117.33.216.95.in-addr.arpa domain name pointer crawl-95-216-33-117.googlebot.com.
$ host crawl-95-216-33-117.googlebot.com
Host crawl-95-216-33-117.googlebot.com not found: 3(NXDOMAIN)
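The two-step check shown above (reverse lookup, then forward lookup of the returned name) can be sketched in Python roughly as follows. This is only an illustration of the procedure, not legitbot's actual code: the function names, the injectable `reverse`/`forward` parameters, and the suffix list are assumptions made for this example, and the forward lookup used here only covers IPv4.

```python
import socket
from typing import Callable, List, Optional

# Assumed suffix list for this sketch; check Google's documentation for the
# authoritative set of domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> Optional[str]:
    """Step 1: return the PTR name for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None


def forward_lookup(hostname: str) -> List[str]:
    """Step 2: return the IPv4 addresses hostname resolves to (may be empty)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], Optional[str]] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Forward-confirmed reverse DNS: both steps must agree on the same IP."""
    hostname = reverse(ip)
    if hostname is None:
        return False
    hostname = hostname.rstrip(".").lower()
    # The leading dot in each suffix rejects names like "fakegooglebot.com".
    if not hostname.endswith(GOOGLE_SUFFIXES):
        return False
    # A PTR record alone proves nothing; the name must resolve back to the IP.
    return ip in forward(hostname)
```

An address whose PTR record names a googlebot.com host but whose forward lookup fails, as in the transcript above, is rejected by the last line. The resolvers are passed in as parameters so the logic can be exercised without network access.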


inspire22 commented on June 23, 2024

Oops, you're right, thanks! Strange they would match the first step and not the second.

I'd misread your TODO about adding crawlers as being about adding crawler IPs, which is why I jumped on here. My bad and apologies :)


alaz commented on June 23, 2024

By the way, I don't think these IPs belong to Google. Both of them are owned by Hetzner (a well-known European hosting provider):

$ whois 95.216.227.158
…
route:          95.216.0.0/16
org:            ORG-HOA1-RIPE
descr:          HETZNER-DC
…

$ whois 95.216.33.117
…
route:          95.216.0.0/16
org:            ORG-HOA1-RIPE
descr:          HETZNER-DC

> Strange they would match the first step and not the second.

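Someone managed to convince Hetzner to create these reverse DNS records (I am surprised). Reverse records are controlled by whoever owns the IP block, so they can say anything. Faking the corresponding forward records is close to impossible, as Google itself controls the googlebot.com zone.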
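The whois result can also be checked mechanically. As a minimal sketch, the snippet below uses Python's standard `ipaddress` module to confirm that both reported addresses fall inside the 95.216.0.0/16 route shown in the whois output. The names `HETZNER_ROUTE` and `in_route` are made up for this example.

```python
import ipaddress

# Route taken from the whois output quoted in this thread (descr: HETZNER-DC).
HETZNER_ROUTE = ipaddress.ip_network("95.216.0.0/16")


def in_route(ip: str, route=HETZNER_ROUTE) -> bool:
    """Return True if ip lies inside the given network."""
    return ipaddress.ip_address(ip) in route


if __name__ == "__main__":
    for ip in ("95.216.227.158", "95.216.33.117"):
        print(ip, in_route(ip))
```

This only shows that the addresses sit in the hosting provider's announced range. It does not replace the reverse and forward DNS check.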
