Comments (30)
Hi,
If I understand you correctly: most of the requests are handled by the CDN, but some of them are redirected (via proxy) to your site. Is that right?
from fake-bot-plugin.
This is what RIPE says about this client IP: https://apps.db.ripe.net/db-web-ui/query?searchtext=137.74.122.3
Why would it use bingbot as user-agent? So whatever happens with the redirect (which is a bit odd), this seems to be a legitimate alert.
What is also puzzling for me is that the client and the server (cdn.achatpc.com) are both in France and seem to be hosted by the same company.
So what is happening here exactly?
from fake-bot-plugin.
Is your machine behind a reverse proxy, and is this the IP address of the reverse proxy: 137.74.122.3?
If so, you should use mod_remoteip (if you use Apache) to trust your reverse proxies and read the X-Forwarded-For header, so that your webserver and WAF see the real connecting address.
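To illustrate the idea, here is a minimal Python sketch of how a client address can be recovered from X-Forwarded-For when the connecting peer is a trusted proxy. It is not mod_remoteip's implementation; the function name `effective_client_ip` is hypothetical, and treating 137.74.122.3 as the only trusted proxy is an assumption taken from this thread.

```python
import ipaddress


def effective_client_ip(peer_ip, x_forwarded_for, trusted_proxies):
    """Return the address to treat as the client.

    Walks X-Forwarded-For from right to left, but only for as long as
    the address we currently hold belongs to a trusted proxy.
    """
    trusted = [ipaddress.ip_network(net) for net in trusted_proxies]

    def is_trusted(ip):
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            # Malformed header entries are never trusted.
            return False
        return any(addr in net for net in trusted)

    client = peer_ip
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # The rightmost entry was appended by the proxy closest to us.
    while hops and is_trusted(client):
        client = hops.pop()
    return client


# Hypothetical setup: the CDN node from this thread is the only trusted proxy.
print(effective_client_ip("137.74.122.3", "66.249.66.219", ["137.74.122.3"]))
```

The header is ignored when the peer is not on the trusted list, since any client can send an arbitrary X-Forwarded-For value.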
from fake-bot-plugin.
Good thinking. @azurit this might be something for the readme.
from fake-bot-plugin.
> Hi, if I understand you correctly: most of the requests are handled by the CDN, but some of them are redirected (via proxy) to your site. Is that right?
Yes, you are right. Static content and media content are served via the CDN.
from fake-bot-plugin.
> Is your machine behind a reverse proxy, and is this the IP address of the reverse proxy: 137.74.122.3?
> If so, you should use mod_remoteip (if you use Apache) to trust your reverse proxies and read the X-Forwarded-For header, so that your webserver and WAF see the real connecting address.
Yes. I don't know this module, mod_remoteip. I will have a look at it. Thank you.
from fake-bot-plugin.
> Yes, you are right. Static content and media content are served via the CDN.
As @dune73 stated, there's no reason for the CDN to copy the User-Agent header. I assume the CDN only fetches uncached files, so its requests to your site serve only caching purposes.
from fake-bot-plugin.
> This is what RIPE says about this client IP: https://apps.db.ripe.net/db-web-ui/query?searchtext=137.74.122.3
> Why would it use bingbot as user-agent? So whatever happens with the redirect (which is a bit odd), this seems to be a legitimate alert.
> What is also puzzling for me is that the client and the server (cdn.achatpc.com) are both in France and seem to be hosted by the same company.
> So what is happening here exactly?
OVH confirmed this and gave me the list of internal traffic IPs used for the CDN. This IP is indeed on that list.
from fake-bot-plugin.
OK, got it. The response from @lifeforms and the confirmation from OVH make sense. You are seeing a false positive induced by the CDN that forwards the bingbot's request with its own IP address. The fake-bot-plugin then resolves the CDN's IP as a non-bing IP address and flags it as fake-bot.
Definitely a problem I did not anticipate.
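The failure mode can be sketched in Python as a forward-confirmed reverse DNS check. This is an illustration of the general technique that Google and Bing document for verifying their crawlers, not the plugin's actual Lua code. The function name `is_genuine_bot`, the `BOT_DOMAINS` table, and the PTR name `cdn.example.net` are assumptions made up for the example.

```python
import socket

# Assumed suffix table for illustration; the plugin's real list may differ.
BOT_DOMAINS = {
    "googlebot": (".googlebot.com", ".google.com"),
    "bingbot": (".search.msn.com",),
}


def is_genuine_bot(bot, ip, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check for a claimed crawler.

    `reverse` maps an IP to its PTR hostname, `forward` maps a hostname
    to a list of IPs. Both default to real DNS lookups and can be
    replaced with stubs for offline testing.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip).rstrip(".").lower()
    except (OSError, KeyError):
        return False  # no PTR record: cannot be verified
    if not host.endswith(BOT_DOMAINS[bot]):
        return False  # PTR points outside the crawler's domains
    try:
        return ip in forward(host)  # hostname must resolve back to the IP
    except (OSError, KeyError):
        return False


# Offline stubs: the Googlebot PTR is from this thread, the CDN one is made up.
ptr = {
    "66.249.66.219": "crawl-66-249-66-219.googlebot.com",
    "137.74.122.3": "cdn.example.net",
}
a = {"crawl-66-249-66-219.googlebot.com": ["66.249.66.219"]}

print(is_genuine_bot("googlebot", "66.249.66.219", ptr.__getitem__, a.__getitem__))
print(is_genuine_bot("bingbot", "137.74.122.3", ptr.__getitem__, a.__getitem__))
```

With the CDN's address as input, the check fails even though the original request came from a real crawler, which is the false positive described above.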
from fake-bot-plugin.
Thank you all for the help.
I solved it with mod_remoteip. I now have the client source IP in the headers and logs.
But I can now see this error:
[Mon Feb 21 14:31:56.702061 2022] [:error] [pid 1003560:tid 139909004248832] [client 66.249.66.219:0] [client 66.249.66.219] ModSecurity: Warning. Fake Bot Plugin: Detected fake Googlebot. [file "/etc/modsecurity/plugins/fake-bot-after.conf"] [line "27"] [id "9504110"] [msg "Fake bot detected: Googlebot"] [data "Matched Data: googlebot found within REQUEST_HEADERS:User-Agent: Googlebot-Image/1.0"] [severity "CRITICAL"] [ver "fake-bot-plugin/1.0.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-bot"] [tag "capec/1000/225/22/77/13"] [tag "PCI/6.5.10"] [tag "paranoia-level/1"] [hostname "cdn.achatpc.com"] [uri "/media/catalog/product/2/7/27003969111020_5173564641.jpg"] [unique_id "YhOUTPQirk8Eh64A9KR43AAAF7c"]
When I check IP 66.249.66.219, I can see that it is a Google IP.
I use some Google services: Google Ads and Google Merchant Center. I think Merchant Center may use other IP addresses to check catalog content (images, links, ...). Here, the agent is Googlebot-Image.
from fake-bot-plugin.
@dune73 I don't think the CDN should copy the User-Agent header. It is only caching content so that it can be served locally next time.
from fake-bot-plugin.
@achatpc Googlebot-Image is working fine here (not blocked) so it seems that your ModSecurity still sees a real IP address. Have you checked whether you can do a reverse DNS lookup of IP 66.249.66.219 on the server where your site is running?
from fake-bot-plugin.
@achatpc Are you willing to do some debugging?
from fake-bot-plugin.
@azurit: copying is probably the wrong term. Let's say it forwards the request via a new TCP connection with its own IP as source IP.
Keeping my fingers crossed that Lua sees the correct REMOTE_ADDR.
from fake-bot-plugin.
host 66.249.66.219
219.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-219.googlebot.com.
But I get another error before that:
[Mon Feb 21 15:15:22.022090 2022] [:error] [pid 1020408:tid 139849227036416] [client 176.83.218.174:0] [client 176.83.218.174] ModSecurity: collections_remove_stale: Failed to access DBM file "/var/cache/modsecurity/magento_user-global": Permission denied [hostname "cdn.achatpc.com"] [uri "/static/version1645361515/frontend/Emipro/achatpc/fr_FR/Amasty_Scroll/images/loader.svg"] [unique_id "YhOeehcI88A3x7BZk3E_nQAAGDM"], referer: https://www.achatpc.com/
from fake-bot-plugin.
> @azurit: copying is probably the wrong term. Let's say it forwards the request via a new TCP connection with its own IP as source IP.
> Keeping my fingers crossed that Lua sees the correct REMOTE_ADDR.
If Lua reads the access log, yes, because I have changed the Apache log format:
Before:
LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %O" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
After:
LogFormat "%v:%p %a %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
LogFormat "%a %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%a %l %u %t \"%r\" %>s %O" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
With %a instead of %h, the log prints the client IP address (remote_addr).
from fake-bot-plugin.
@achatpc Please edit the file fake-bot.lua and add this on line 63 (almost at the end of the file):
m.log(2, string.format("Fake Bot Plugin DEBUG: REMOTE_ADDR: %s REMOTE_HOST: %s", remote_addr, remote_host))
so it will look like this:
end
m.log(2, string.format("Fake Bot Plugin DEBUG: REMOTE_ADDR: %s REMOTE_HOST: %s", remote_addr, remote_host))
m.setvar("tx.fake-bot-plugin_bot_name", bot_name)
return string.format("Fake Bot Plugin: Detected fake %s.", bot_name)
Reload the web server and wait for another request from Googlebot-Image. You should then see some debug info in the logs.
The Fake Bot plugin does not use the web server logs, so their format doesn't matter.
from fake-bot-plugin.
[Mon Feb 21 16:12:58.350226 2022] [:error] [pid 1036742:tid 140554708952832] [client 66.249.66.219:0] [client 66.249.66.219] ModSecurity: Fake Bot Plugin DEBUG: REMOTE_ADDR: 66.249.66.219 REMOTE_HOST: 137.74.122.35 [hostname "cdn.achatpc.com"] [uri "/media/catalog/product/cache/993aa024f1a2c812c347b0876f4d0efd/4/1/41810189304034_5675549042.jpg"] [unique_id "YhOr-o-YEHPaOX93HVxKdwAAFhQ"]
[Mon Feb 21 16:12:58.350664 2022] [:error] [pid 1036742:tid 140554708952832] [client 66.249.66.219:0] [client 66.249.66.219] ModSecurity: Warning. Fake Bot Plugin: Detected fake Googlebot. [file "/etc/modsecurity/plugins/fake-bot-after.conf"] [line "27"] [id "9504110"] [msg "Fake bot detected: Googlebot"] [data "Matched Data: googlebot found within REQUEST_HEADERS:User-Agent: Googlebot-Image/1.0"] [severity "CRITICAL"] [ver "fake-bot-plugin/1.0.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-bot"] [tag "capec/1000/225/22/77/13"] [tag "PCI/6.5.10"] [tag "paranoia-level/1"] [hostname "cdn.achatpc.com"] [uri "/media/catalog/product/cache/993aa024f1a2c812c347b0876f4d0efd/4/1/41810189304034_5675549042.jpg"] [unique_id "YhOr-o-YEHPaOX93HVxKdwAAFhQ"]
[Mon Feb 21 16:13:16.494867 2022] [:error] [pid 1031694:tid 140555434333952] [client 92.184.105.227:0] [client 92.184.105.227] ModSecurity: collections_remove_stale: Failed to access DBM file "/var/cache/modsecurity/magento_user-global": Permission denied [hostname "cdn.achatpc.com"] [uri "/static/version1645361515/frontend/Emipro/achatpc/fr_FR/fonts/opensans/regular/opensans-400.woff2"] [unique_id "YhOsDIfnLCUjpOwKAX-suQAACQA"], referer: https://cdn.achatpc.com/static/version1645361515/_cache/merged/fonts_a0ee8e469788a2508da646d4452deae3.min.css
[Mon Feb 21 16:13:16.494927 2022] [:error] [pid 1031694:tid 140555434333952] [client 92.184.105.227:0] [client 92.184.105.227] ModSecurity: collections_remove_stale: Failed to access DBM file "/var/cache/modsecurity/magento_user-ip": Permission denied [hostname "cdn.achatpc.com"] [uri "/static/version1645361515/frontend/Emipro/achatpc/fr_FR/fonts/opensans/regular/opensans-400.woff2"] [unique_id "YhOsDIfnLCUjpOwKAX-suQAACQA"], referer: https://cdn.achatpc.com/static/version1645361515/_cache/merged/fonts_a0ee8e469788a2508da646d4452deae3.min.css
EDIT:
I fixed the ModSecurity: collections_remove_stale: Failed to access DBM file "/var/cache/modsecurity/magento_user-global" issue by setting the correct permissions on the folder.
The fake bot error is caused by REMOTE_HOST: 137.74.122.35
from fake-bot-plugin.
So here is the problem:
Fake Bot Plugin DEBUG: REMOTE_ADDR: 66.249.66.219 REMOTE_HOST: 137.74.122.35
Looks like mod_remoteip is updating REMOTE_ADDR but not REMOTE_HOST.
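For anyone triaging similar logs, a small Python sketch can pull both values out of such a debug line and flag the case seen here, where REMOTE_HOST is an IP literal that differs from REMOTE_ADDR. The helper name `parse_debug` and the `stale_host` field are hypothetical; the line format is the one produced by the debug statement above.

```python
import ipaddress
import re

DEBUG_RE = re.compile(
    r"Fake Bot Plugin DEBUG: REMOTE_ADDR: (\S+) REMOTE_HOST: (\S+)"
)


def _is_ip(value):
    """True if value parses as an IPv4 or IPv6 literal."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def parse_debug(line):
    """Extract REMOTE_ADDR and REMOTE_HOST from a plugin debug line.

    Returns None if the line is not a debug line. `stale_host` is True
    when REMOTE_HOST is an IP address different from REMOTE_ADDR, which
    suggests it still refers to the proxy rather than the client.
    """
    match = DEBUG_RE.search(line)
    if match is None:
        return None
    addr, host = match.groups()
    return {
        "remote_addr": addr,
        "remote_host": host,
        "stale_host": _is_ip(host) and host != addr,
    }


line = ("ModSecurity: Fake Bot Plugin DEBUG: REMOTE_ADDR: 66.249.66.219 "
        "REMOTE_HOST: 137.74.122.35 [hostname \"cdn.achatpc.com\"]")
print(parse_debug(line))
```

A REMOTE_HOST that is a resolved hostname is not flagged, since a hostname differing from the address string is the normal case.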
from fake-bot-plugin.
@achatpc Can you try the current version?
from fake-bot-plugin.
What do you mean by "current version"?
from fake-bot-plugin.
Redownload fake-bot.lua: https://github.com/coreruleset/fake-bot-plugin/blob/main/plugins/fake-bot.lua
from fake-bot-plugin.
Your change fixes the false positive in the log. But I need to check that detection still works.
I tried this from another server:
curl http://www.achatpc.com --header "User-Agent: Googlebot"
There is no error in the log. I think detection is not working.
Or my test is not correct. I cannot see any trace of my curl requests in access.log.
from fake-bot-plugin.
It works on my side (but your test seems correct). Can you check whether you also have the newest version of this file?
https://github.com/coreruleset/fake-bot-plugin/blob/main/plugins/fake-bot-after.conf
from fake-bot-plugin.
Yes, if you are 217.*.*.*, it is working:
[Mon Feb 21 17:31:05.165732 2022] [:error] [pid 1065666:tid 140535896561408] [client 217.*.*.*:36946] [client 217.*.*.*] ModSecurity: Warning. Fake Bot Plugin: Detected fake Googlebot. [file "/etc/modsecurity/plugins/fake-bot-after.conf"] [line "27"] [id "9504110"] [msg "Fake bot detected: Googlebot"] [data "Matched Data: googlebot found within REQUEST_HEADERS:User-Agent: Googlebot"] [severity "CRITICAL"] [ver "fake-bot-plugin/1.0.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-bot"] [tag "capec/1000/225/22/77/13"] [tag "PCI/6.5.10"] [tag "paranoia-level/1"] [hostname "www.achatpc.com"] [uri "/"] [unique_id "YhO-SNRB6SChFwbpfMevmAAIkAU"]
from fake-bot-plugin.
Yes, that was me. :D
from fake-bot-plugin.
OK, perfect. Thanks a lot.
from fake-bot-plugin.
Can I contact you in private?
from fake-bot-plugin.
You can contact me on jozef at sudolsky dot sk .
from fake-bot-plugin.
Thanks for reporting and testing! Closing.
from fake-bot-plugin.