
Comments (4)

ltrzesniewski commented on June 11, 2024

FWIW, here's a faster way to match two words in the same line: ^(?=.*?export)(?=.*?function)
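To illustrate how the two lookaheads behave, here is a minimal sketch using Python's `re` module rather than ripgrep's PCRE2 mode. The helper name `lines_with_both` and the sample lines are made up for the example. Each lookahead starts from the beginning of the line, so the two words can appear in either order.

```python
import re

# Both lookaheads are evaluated at the start of the line, so each word
# may appear anywhere on the line, in either order.
BOTH_WORDS = re.compile(r"^(?=.*?export)(?=.*?function)")

def lines_with_both(lines):
    """Return the lines that contain both 'export' and 'function'."""
    return [line for line in lines if BOTH_WORDS.search(line)]

sample = [
    "export function foo() {}",
    "function bar() {} // export later",
    "export const x = 1",
    "function baz() {}",
]

for line in lines_with_both(sample):
    print(line)
```

Only the first two sample lines are printed, since the last two each contain just one of the words.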

from ripgrep.

ltrzesniewski commented on June 11, 2024

I don't mean to be pedantic, but I'll add a few comments since it's an interesting topic 🙂

The .NET regex engine supports timeouts (there's a constructor overload which takes a timeout parameter). This feature is mainly provided for searching with untrusted patterns in order to avoid DoS attacks. I haven't looked at how it's implemented but I believe a check is inserted when backtracking (and maybe in some other cases such as lookarounds). The check is simple enough but it still needs to retrieve the clock when enabled. In any case this article states:

Regex supports timeouts, and guarantees that it will only do at most O(n) work (where n is the length of the input) between timeout checks

As for PCRE2, it has a PCRE2_AUTO_CALLOUT flag which causes the engine to call a user-provided function before each pattern "item", and you can cancel execution from there, so in theory you could implement timeouts, but that would surely wreck the matching performance.


LangLangBart commented on June 11, 2024

^(?=.*?export)(?=.*?function)

It serves me well, thank you.


BurntSushi commented on June 11, 2024

Or also, rg export | rg function.

Anyway, PCRE2 doesn't support a way to say, "if the search is taking longer than X time, stop." I don't know of any regex engine that supports that. If you think for a moment about how something like that would be implemented, it's pretty clear why: imagine trying to check a timeout value when you're in the middle of some vectorized SIMD loop. It would trash performance. And this sort of timeout check would need to be in virtually every loop everywhere. Ain't going to happen.

PCRE2, like most regex engines, does expose other types of resource limits. But none of them really approximate "time."

Your best bet is to filter out longer lines yourself (rg -v '.{100}' or something), or use a technique that doesn't require using a backtracking engine. Hopefully some day #875 will happen and that will probably be the "right" way to solve your particular problem.
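As a rough sketch of that advice, the Python snippet below drops long lines first and then uses plain substring checks, so no backtracking engine is involved. It is not ripgrep code. The names `contains_all` and `filter_lines` are hypothetical, and the 100-character cutoff simply mirrors the `rg -v '.{100}'` example.

```python
def contains_all(line, words):
    """True if every word occurs somewhere in the line (plain substring search)."""
    return all(word in line for word in words)

def filter_lines(lines, words=("export", "function"), max_len=100):
    """Drop lines with max_len or more characters, then keep lines containing every word.

    The length check plays the role of `rg -v '.{100}'`; the substring
    checks play the role of `rg export | rg function`.
    """
    return [
        line
        for line in lines
        if len(line) < max_len and contains_all(line, words)
    ]

if __name__ == "__main__":
    long_line = "export function " + "x" * 200  # e.g. a minified bundle
    sample = ["export function f() {}", long_line, "export const a = 1"]
    for line in filter_lines(sample):
        print(line)
```

The work per line is bounded by its length, so a pathological line cannot cause the blow-up that a backtracking search might.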
