Giter Club home page Giter Club logo

Comments (4)

junyer avatar junyer commented on May 27, 2024

IMO, this is an egregious misuse of regular expressions. Between that and the ~6yo version of RE2, it wouldn't be reasonable for me to spend time (either professionally or personally) digging into why the linear-time constant seems to be larger for the single pattern. There are better (i.e. more readable, more maintainable, more efficient) ways of parsing data in such a format. (The data would be in a better format, ideally, but that may or may not be within your control.)

from re2.

huajosep avatar huajosep commented on May 27, 2024

Thank you for the quick response. Unfortunately, we don’t have much control over both the data and the pattern (parse query) as they come from our customers. We agree that such a pattern is not ideal. To us, it appears that the latency grows exponentially with the increase in the number of match groups in the pattern. If you can confirm that RE2::Match always maintains linear performance in the latest version of RE2, even for the edge case we encountered, it would greatly help us to determine whether an upgrade to RE2 is a viable solution.

from re2.

rsc avatar rsc commented on May 27, 2024

If you have a test case you should be able to try the latest RE2 yourself.

from re2.

junyer avatar junyer commented on May 27, 2024

Also, it seems unsound to draw conclusions about asymptotic complexity from two (2) data points. For now, my guess is that the DFA and NFA execution engines incur overhead (i.e. increase the linear-time constant) due to the large number of (.*?) subexpressions. (Specifically, combining so much ambiguity with so many capturing groups would amplify cost significantly.) Updating the version of RE2 could be a mitigation, but considering it a solution would be a categorical mistake.

from re2.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.