Giter Club home page Giter Club logo

Comments (8)

bjeanes avatar bjeanes commented on June 8, 2024

Thinking on this more, could the IP regex be delegated to stdlib's Resolv::AddressRegex?

[7] pry(main)> Resolv::AddressRegex
=> /(?:(?-mix:\A((?x-mi:0
               |1(?:[0-9][0-9]?)?
               |2(?:[0-4][0-9]?|5[0-5]?|[6-9])?
               |[3-9][0-9]?))\.((?x-mi:0
               |1(?:[0-9][0-9]?)?
               |2(?:[0-4][0-9]?|5[0-5]?|[6-9])?
               |[3-9][0-9]?))\.((?x-mi:0
               |1(?:[0-9][0-9]?)?
               |2(?:[0-4][0-9]?|5[0-5]?|[6-9])?
               |[3-9][0-9]?))\.((?x-mi:0
               |1(?:[0-9][0-9]?)?
               |2(?:[0-4][0-9]?|5[0-5]?|[6-9])?
               |[3-9][0-9]?))\z))|(?:(?x-mi:
      (?:(?x-mi:\A
      (?:[0-9A-Fa-f]{1,4}:){7}
         [0-9A-Fa-f]{1,4}
      \z)) |
      (?:(?x-mi:\A
      ((?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?) ::
      ((?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?)
      \z)) |
      (?:(?x-mi:\A
      ((?:[0-9A-Fa-f]{1,4}:){6,6})
      (\d+)\.(\d+)\.(\d+)\.(\d+)
      \z)) |
      (?:(?x-mi:\A
      ((?:[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4})*)?) ::
      ((?:[0-9A-Fa-f]{1,4}:)*)
      (\d+)\.(\d+)\.(\d+)\.(\d+)
      \z))))/

EDIT: Yes, there is a good reason. The regex is anchored so wouldn't match anything in a logging context.

from logstop.

ankane avatar ankane commented on June 8, 2024

Hey @bjeanes, great suggestion. I created an ipv6 branch for this. Unfortunately, there's a pretty big performance hit (slows down log throughput by around 33%), so need to decide what to do. Also need to make it work with url encoding. Anyways, should should be able to use the regexp from the branch for your use case for now.

https://github.com/ankane/logstop/compare/ipv6

from logstop.

bjeanes avatar bjeanes commented on June 8, 2024

Hah!

Regexp.new(Resolv::AddressRegex.source.gsub('\A', '\b').gsub('\z', '\b'))

^^ that is exactly what I have just deployed to production already!

from logstop.

bjeanes avatar bjeanes commented on June 8, 2024

slows down log throughput by around 33%

I've deployed this and the slowdown is noticeable in web RPMs too. 🤔

so need to decide what to do

I run on Heroku so my logs go via STDOUT. I do wonder if I'm better off with a solution that filters the logs outside of the core Ruby process, even if backed by this gem. That should, at least in theory, allow for some better use of multiple cores...

from logstop.

ankane avatar ankane commented on June 8, 2024

I'm surprised it's noticeable at the RPM level. What difference are you seeing there?

from logstop.

bjeanes avatar bjeanes commented on June 8, 2024

from logstop.

ankane avatar ankane commented on June 8, 2024

Just ran benchmarks with Ruby 2.7 and latest code:

Warming up --------------------------------------
             no ipv6     3.126k i/100ms
                ipv6     1.938k i/100ms
Calculating -------------------------------------
             no ipv6     34.599k (± 3.5%) i/s -    175.056k in   5.065833s
                ipv6     19.874k (± 3.4%) i/s -    100.776k in   5.076570s

It still appears to reduce throughput significantly, but 20k iterations per second is still pretty fast and most of the time spent in an application is not in logging. Will plan to merge once IPv6 has more adoption.

Edit: another approach could be to use a less complex regex if common sources of IPs (like Rack::Request) use a specific format.

from logstop.

bjeanes avatar bjeanes commented on June 8, 2024

Edit: another approach could be to use a less complex regex if common sources of IPs (like Rack::Request) use a specific format.

Yeah, or use a less complex regex regardless. It probably is better (from GDPR etc standpoint) to accidentally filter non-IPs but catch all actual IPs than to only catch IPs but take a large hit in performance. This could even be a configuration option, that swaps in a more accurate regex when the user opts into the trade-off of slower perf.

from logstop.

Related Issues (7)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.