noxone / regex-generator
Generate regular expressions from sample texts.
Home Page: https://regex-generator.olafneumann.org
License: MIT License
Describe the bug
On browsers that do not support the copy & paste function, the "Copy Regex" button is removed from the DOM. In this case the checkbox "Generate only patterns" is not left-aligned.
To Reproduce
Expected behavior
The checkbox should be left-aligned if the button is not visible.
First of all, very nice project!
Would you consider accepting a file as input instead of reading a very long string?
I had a problem when I pasted some very large text (like HTML).
Add a possibility to "configure" recognizers, or at least the output of a recognizer.
A first idea would be to generate both greedy and lazy patterns. Maybe there are even more options that could be added to some recognizers.
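To illustrate the difference between the two variants, here is a minimal sketch in Python. The sample text and the patterns are made up for illustration and are not taken from the generator's recognizers.

```python
import re

sample = "<b>one</b> and <b>two</b>"

# Greedy: ".*" consumes as much as possible, so the match runs to the last "</b>".
greedy = re.search(r"<b>.*</b>", sample).group(0)

# Lazy: ".*?" consumes as little as possible, so the match stops at the first "</b>".
lazy = re.search(r"<b>.*?</b>", sample).group(0)

print(greedy)
print(lazy)
```

A recognizer option could then simply switch between the `*` and `*?` forms of the same quantifier.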
When editing capturing groups #109, enable the user to do the following things:
?
✅*
✅, +
✅ or {3,5}
Add the possibility to edit the sample text directly in step 2 (maybe remove step 1)
I deleted a bunch of Text from the Sample Textbox
An error occurred with these details:
- Exception: `index: 9, size: 9`
- Commit ID: `8eded646836e376d693d1010a2f56ea22f6c75ef`
- UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/110.0
- Vendor:
- Language: de
- Platform: Win32
- CPU: Windows NT 10.0; Win64; x64
- TouchPoints: 0
- Plugins:
- PDF Viewer
- Chrome PDF Viewer
- Chromium PDF Viewer
- Microsoft Edge PDF Viewer
- WebKit built-in PDF
- StackTrace: rl: index: 9, size: 9
oa@https://regex-generator.olafneumann.org/regex-generator.js?commitId=8eded646836e376d693d1010a2f56ea22f6c75ef:1:110348
ol@https://regex-generator.olafneumann.org/regex-generator.js?commitId=8eded646836e376d693d1010a2f56ea22f6c75ef:1:122391
744/e/Vt.prototype.g1@https://regex-generator.olafneumann.org/regex-generator.js?commitId=8eded646836e376d693d1010a2f56ea22f6c75ef:1:133400
Qi@https://regex-generator.olafneumann.org/regex-generator.js?commitId=8eded646836e376d693d1010a2f56ea22f6c75ef:1:93330
744/e/to.prototype.j@https://regex-generator.olafneumann.org/regex-generator.js?commitId=8eded646836e376d693d1010a2f56ea22f6c75ef:1:151402
654/i/Nu/e.onmouseleave/e.onmouseleave<@https://regex-generator.olafneumann.org/regex-generator.js?commitId=8eded646836e376d693d1010a2f56ea22f6c75ef:1:618714
654/i/_u.prototype.b3m@https://regex-generator.olafneumann.org/regex-generator.js?commitId=8eded646836e376d693d1010a2f56ea22f6c75ef:1:700272
654/i/Nu/e.onmouseleave@https://regex-generator.olafneumann.org/regex-generator.js?commitId=8eded646836e376d693d1010a2f56ea22f6c75ef:1:618659
I'm on Windows 11 Pro
Text like hgf\n\t < leads to errors.
They are currently suppressed so the users are not bothered, but they should be fixed.
It would be very helpful if developers could run the application locally inside Docker.
If the user rage-clicks in the pattern selection part, the page should display a popover explaining what should be changed.
Maybe show the popover if 4 clicks occur within 2 seconds.
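The "4 clicks within 2 seconds" rule could be sketched as a sliding window over click timestamps. This is only an illustration in Python; the class name `RageClickDetector` and its parameters are hypothetical, and the real page would implement this in its own Kotlin/JS code.

```python
from collections import deque


class RageClickDetector:
    """Hypothetical helper: flags a burst of clicks inside a time window."""

    def __init__(self, max_clicks=4, window_seconds=2.0):
        self.max_clicks = max_clicks
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def register_click(self, timestamp):
        """Record a click (time in seconds); return True if the popover should show."""
        self._timestamps.append(timestamp)
        # Drop clicks that are older than the window relative to the newest click.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.max_clicks


detector = RageClickDetector()
fast_results = [detector.register_click(t) for t in (0.0, 0.5, 1.0, 1.5)]

slow_detector = RageClickDetector()
slow_results = [slow_detector.register_click(t) for t in (0.0, 1.0, 2.0, 3.0)]
```

Both thresholds are constructor arguments, so they can be tuned once real usage data is available.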
Is your feature request related to a problem? Please describe.
Currently the language snippets are closed when you open the page. The page should store which snippet boxes are open. Once the page is reloaded, the corresponding boxes should be reopened.
Let's provide another type of UI that is more similar to txt2re, that displays the matches in a better way...
Is your feature request related to a problem? Please describe.
Nope
Describe the solution you'd like
When hovering over the proposed regexes, the name of the pattern that has been found is shown. It would be cool to have a tooltip there with a description of the pattern. Or maybe just the actual pattern that will be generated.
Add regex for
maybe other url stuff
Describe the bug
I'm trying to build the project but I get the following error when I run gradle run
[webpack-cli] TypeError: cli.isMultipleCompiler is not a function
shared | at Command.<anonymous> (/app/node_modules/@webpack-cli/serve/lib/index.js:146:35)
shared | at async Promise.all (index 1)
shared | at async Command.<anonymous> (/app/node_modules/webpack-cli/lib/webpack-cli.js:1674:7)
Seems like this is an issue caused by a recent update from webpack. The solution described here is to upgrade webpack-cli to 4.10.0
webpack/webpack#15951
So I included this line in build.gradle under dependencies:
implementation npm("webpack-cli", "4.10.0")
However, then I get this error:
Execution failed for task ':packageJson'.
> There is already declared version of 'webpack-cli' with version '4.10.0' which does not intersects with another declared version '4.9.2'
I've tried to see if there's another dependency that uses webpack-cli 4.9.2 but I can't figure it out. Any help here is much appreciated. This is a great tool and I'd love to try developing with it!
Hi,
Great idea this - REGEX by example!
I am trying to use the website (https://regex-generator.olafneumann.org/) to help me work out some non-trivial FIND / REPLACE tasks within VS-CODE Editor.
I appreciate that this is more likely a VS-CODE question but I thought I'd ask here as it raises the possibility of the additional feature of 'REPLACE-WITH' on the site.
An example of what I am trying to do *...
Within the text ...
overline{A}
... I wish to replace this with ¬A
Thus, I am replacing the content of the braces and the outer function 'overline' with a ¬ followed by the original contents of the braces.
I am trying to use the web page to do this but the expression that it comes up with doesn't seem to 'find' anything when I use it in the FIND/REPLACE dialogue in VS-CODE.
The expression I've used is...
^overline{[a-zA-Z]}$
Thank you
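A replacement like the one described above needs a capturing group around the content of the braces. The following Python sketch shows the idea. The function name `replace_overline` is hypothetical, and the pattern differs from the one in the question: it drops the ^ and $ anchors (which restrict a match to an entire line) and escapes the braces. VS Code's replace field is, as far as I know, written with `$1` rather than Python's `\1`, so treat this as an illustration of the technique rather than a VS Code recipe.

```python
import re


def replace_overline(text):
    # The braces are escaped so they are matched literally; the parentheses
    # capture the single letter between them for reuse in the replacement.
    return re.sub(r"overline\{([a-zA-Z])\}", r"¬\1", text)


print(replace_overline("overline{A}"))
print(replace_overline("x = overline{B} + C"))
```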
Describe the bug
The gradle build inside a docker container fails when running the unit tests.
To Reproduce
Steps to reproduce the behavior:
Simply follow the steps from the README file
docker build . -t noxone/regexgenerator
> Task :browserTest
Cannot start FirefoxHeadless
Command '/usr/bin/firefox' requires the firefox snap to be installed.
Please install it with:
snap install firefox
FirefoxHeadless stdout:
FirefoxHeadless stderr:
Command '/usr/bin/firefox' requires the firefox snap to be installed.
Please install it with:
snap install firefox
Cannot start FirefoxHeadless
Command '/usr/bin/firefox' requires the firefox snap to be installed.
Please install it with:
snap install firefox
FirefoxHeadless stdout:
FirefoxHeadless stderr:
Command '/usr/bin/firefox' requires the firefox snap to be installed.
Please install it with:
snap install firefox
Cannot start FirefoxHeadless
Command '/usr/bin/firefox' requires the firefox snap to be installed.
Please install it with:
snap install firefox
FirefoxHeadless stdout:
FirefoxHeadless stderr:
Command '/usr/bin/firefox' requires the firefox snap to be installed.
Please install it with:
snap install firefox
FirefoxHeadless failed 2 times (cannot start). Giving up.
java.lang.IllegalStateException: Errors occurred during launch of browser for testing.
- FirefoxHeadless
Please make sure that you have installed browsers.
Or change it via
browser {
testTask {
useKarma {
useFirefox()
useChrome()
useSafari()
}
}
}
> Task :test FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':test'.
> Failed to execute all tests:
:browserTest: java.lang.IllegalStateException: Errors occurred during launch of browser for testing.
- FirefoxHeadless
Please make sure that you have installed browsers.
Or change it via
browser {
testTask {
useKarma {
useFirefox()
useChrome()
useSafari()
}
}
}
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
* Get more help at https://help.gradle.org
BUILD FAILED in 1m 17s
Expected behavior
The build should run without problems and the docker build command should create a usable image.
Additional context
The "normal" build works.
Is your feature request related to a problem? Please describe.
I'm always frustrated when I have the output regex and want to test it against even more text samples.
Describe the solution you'd like
Add a new (optional) step to verify the generated regex against several lines of sample text.
Describe alternatives you've considered
Copy the regex into a page like regex101. But then I would need to switch to another website.
The GitHub runner image ubuntu-latest does not contain Firefox. They want to add it later. Once the runner image is updated, the workflow should be changed to use ubuntu-latest again instead of ubuntu-20.04.
Issue describing the change: actions/runner-images#6399
Pull request that needs to be merged before solving this issue: actions/runner-images#6528
Alternatively install Firefox later: https://www.omgubuntu.co.uk/2022/04/how-to-install-firefox-deb-apt-ubuntu-22-04
Currently the Recognizers in the application are very basic.
The application needs a bigger "library" of possible matches.
As requested by @faustfa in #190 :
How is your programming language called?
C
How does the snippet look like that we shall generate?
The snippet should be compilable and runnable as is. The user should be able to just copy and paste the code and have a fully functional application.
#include <regex.h>
#include <stdio.h> /* needed for fprintf() and puts() */
int useRegex(char* textToCheck) {
    regex_t compiledRegex;
    int reti;
    int actualReturnValue = -1;
    char messageBuffer[100];
    /* Compile regular expression */
    reti = regcomp(&compiledRegex, "^asd[0-9]+asd", REG_EXTENDED | REG_ICASE);
    if (reti) {
        fprintf(stderr, "Could not compile regex\n");
        return -2;
    }
    /* Execute compiled regular expression */
    reti = regexec(&compiledRegex, textToCheck, 0, NULL, 0);
    if (!reti) {
        puts("Match");
        actualReturnValue = 0;
    } else if (reti == REG_NOMATCH) {
        puts("No match");
        actualReturnValue = 1;
    } else {
        regerror(reti, &compiledRegex, messageBuffer, sizeof(messageBuffer));
        fprintf(stderr, "Regex match failed: %s\n", messageBuffer);
        actualReturnValue = -3;
    }
    /* Free memory allocated to the pattern buffer by regcomp() */
    regfree(&compiledRegex);
    return actualReturnValue;
}
Anything special about string literals?
Well, standard C-like string literals...
How can we specify options?
The options are part of the regcomp function call:
REG_ICASE
REG_NEWLINE
Need a warning?
If there is no DOT_ALL, this is a warning.
Describe the bug
Inputting [id]=[6] into rg_raw_input_text generates \[id]=\[[^\]]*], whereas it should generate \[id\]=\[[^\]]*\], per https://stackoverflow.com/a/49111429/9731176 and Visual Studio Code.
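Whether the unescaped closing bracket matters depends on the regex engine. As a small check, the Python sketch below compares both patterns from the report against the sample. It only demonstrates the behaviour of Python's `re` module, where a `]` outside a character class is treated as a literal; other engines or modes may be stricter, which is an argument for always escaping it.

```python
import re

sample = "[id]=[6]"
generated = r"\[id]=\[[^\]]*]"        # pattern reported as generated
fully_escaped = r"\[id\]=\[[^\]]*\]"  # pattern the reporter expected

# Both patterns match the whole sample in Python's re module.
generated_matches = re.fullmatch(generated, sample) is not None
escaped_matches = re.fullmatch(fully_escaped, sample) is not None

print(generated_matches, escaped_matches)
```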
To Reproduce
Steps to reproduce the behavior:
Input [id]=[6] into rg_raw_input_text.
Use rg_button_copy.
Desktop:
PS /home/rokejulianlockhart> uname -a
Linux RQN6C6 6.2.2-1-default #1 SMP PREEMPT_DYNAMIC Thu Mar 9 06:06:13 UTC 2023 (44ca817) x86_64 x86_64 x86_64 GNU/Linux
PS /home/rokejulianlockhart> firefox -v
Mozilla Firefox 110.0.1
In case the name of a language contains a special character the UI will break.
The Kotlin code generates the HTML id from the language name to create the HTML elements to display the language snippets. In case the ID is invalid the page will not work anymore.
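One way to avoid invalid IDs is to encode every character outside a safe set before building the ID. The sketch below is in Python for brevity and is not the project's Kotlin code; the function name `html_id_for_language` and the `language_` prefix are hypothetical.

```python
import re


def html_id_for_language(name):
    """Build an id that only contains letters, digits, '_' and '-'."""
    # Replace each unsafe character with its hexadecimal code point so that
    # names such as "C" and "C#" still produce different ids.
    safe = re.sub(r"[^A-Za-z0-9_-]", lambda m: "_%x_" % ord(m.group(0)), name)
    # The fixed prefix guarantees that the id starts with a letter.
    return "language_" + safe


print(html_id_for_language("C#"))
print(html_id_for_language("C++"))
print(html_id_for_language("VB.net"))
```

The encoding is not strictly collision-free (a name that already contains `_23_` would clash with `#`), so a real implementation might prefer an index-based ID instead.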
Usage statistics show that users often click on boxes that do not allow clicking, because they want to select something. If there are several of these clicks, the page should highlight the area where to click, to indicate what the user is doing "wrong". Or maybe we could change the way the page is styled so that it is more obvious where to click.
What's the new pattern about? What is it able to recognize?
Upper- and lowercase UUID4
How does the pattern look like?
[0-9a-fA-F]{8}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{12}
Is there anything special we need to take care of when recognizing this pattern?
-
Describe the pattern
-
Technical reference
RFC4122. Does not reference the above RegEx (taken from ihateregex.com) but explains each part of a UUID.
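As a quick check of the proposed pattern, the following Python sketch runs it against upper- and lowercase samples. The helper name `find_uuids` and the sample strings are made up for illustration. Note that the pattern as written accepts any hexadecimal digits in the version position, so it is not limited to version 4 UUIDs.

```python
import re

# Pattern copied from the request above.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{12}"
)


def find_uuids(text):
    """Return all substrings of text that look like a UUID."""
    return UUID_PATTERN.findall(text)


print(find_uuids("id=123e4567-e89b-42d3-a456-426614174000;"))
print(find_uuids("ID=123E4567-E89B-42D3-A456-426614174000;"))
```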
What's the new pattern about? What is it able to recognize?
Alphanumeric string.
How does the pattern look like?
[A-Za-z0-9]+
Describe the bug
When a recognizer uses a search-regex the position of the match might be wrong in case characters in front of the main match are taken into account.
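The effect can be reproduced with any regex that includes leading context around the part of interest. The Python sketch below is only an illustration of the general problem, not the recognizer code: the pattern and sample are made up. The position of the whole match differs from the position of the capturing group that holds the main match.

```python
import re

text = "id=42"
# The "=" is only context; the digits in group 1 are the main match.
match = re.search(r"=(\d+)", text)

whole_start = match.start()   # position of "=", i.e. the context
main_start = match.start(1)   # position of the first digit
main_span = match.span(1)

print(whole_start, main_start, main_span)
```

Reporting the group's span instead of the whole match's span would give the position of the main match.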
Is your feature request related to a problem? Please describe.
I'm always frustrated when trying to use regular expressions with the terminal command grep.
Describe the solution you'd like
A copy-and-paste regex snippet for terminal use under the "Usage in programming languages" section.
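Such a snippet mainly has to quote the pattern safely for the shell. A minimal sketch, written in Python to stay consistent with the other examples here: the function name `grep_command` and the file name are hypothetical, and it assumes a POSIX shell and a grep that supports `-E` (extended regular expressions). Patterns that rely on Perl-style classes such as `\d` would need to be translated first.

```python
import shlex


def grep_command(pattern, filename):
    """Build a shell command line that searches filename for pattern."""
    # shlex.quote wraps the pattern in single quotes when it contains characters
    # that the shell would interpret; "-e" protects patterns starting with "-".
    return "grep -E -e " + shlex.quote(pattern) + " " + shlex.quote(filename)


print(grep_command("[0-9]+", "log.txt"))
```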
Currently the workflow "Scan with Detekt" https://github.com/noxone/regex-generator/actions/workflows/analyze-with-detekt.yml uses a slightly different configuration than the actual build job. This needs to be consolidated.
The user guide created with driver.js should be styled so that it fits the overall page style.
Usage analysis shows that users quite often select several "digit" matches instead of selecting "multiple digits". The page should recognize this behavior and then suggest changing the selection.
Describe the bug
When I enter a long URL, the page is broken
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I don't know whether this is a bug or a feature, but I think it could be improved although I know this is not of priority
Hi, could you also add syntax support for Perl, PCRE, PCRE2 and classic C? The world would reward you, and so would I. Thanks so much for your great work.
Is your feature request related to a problem? Please describe.
If the input is too long the parsing of the input and the recognition of patterns takes very long.
Describe the solution you'd like
Limit the number of characters that can be entered by the user.
Describe alternatives you've considered
Use a faster CPU.
The "Copy regex" button does not work on Firefox, so I need to copy the regex from the textbox. If I do that, I get 2 newlines which is really bad for pasting into code.
Enhance BracketedRecognizer results:
Describe the bug
Whenever the generated regex needs to (greedily) repeat a single character, the .* that it's supposed to generate comes out as *.
To Reproduce
Steps to reproduce the behavior:
The most straightforward way to reproduce the error is to select some single characters in step 2. It appears when selecting other spans as well.
Expected behavior
Generated regex should've had .* in place of *.
An error occurred with these details:
can't access property "i3c_1", t is undefined
ff31f7729ea786ed917d48ce22c7c05069e9c652
Describe the bug
When asked to generate a pattern to match a string plus a trailing underscore, the generator outputs a pattern that matches the string plus a hyphen.
I can't seem to reproduce this issue ever since I reloaded the page, but it consistently did as above as long as I didn't reload it.
To Reproduce
Steps to reproduce the behavior:
Enter the sample text TX_RESP_Q008.
Select Multiple characters. Then, on the first line, click the first 'Character (_)', which refers to the third character of the string. Then, click the second 'Character (_)', which refers to the eighth character of the string.
The generated pattern is [a-zA-Z]+_[a-zA-Z]+-Q008 or [a-zA-Z]+_[a-zA-Z]+-, depending on whether you select 'Generate only patterns' or not, which I did (these patterns were copied straight from the website using the 'Copy regex' button).
Expected behavior
The pattern should have looked like this:
[a-zA-Z]+_[a-zA-Z]+_
Screenshots
Not necessary.
Additional context
It was the first URL I opened when launching Edge. I had previously used the website. I first generated the RegEx without the trailing underscore, then I switched to the program RStudio, where I used the sub() function to replace that pattern in a string, but I realized that it kept the underscore. So I corrected the RegEx, reran the function, then came back to the website to select the second underscore, and that's when it happened. I opened the DevTools to copy the trailing underscore from the source and pasted it into Google to make sure that it was really an underscore and not some unusual Unicode character that the generator would have recognized as a hyphen, but it was your run-of-the-mill underscore (here's the character I copied: _).
Once the regular expression has been generated, show ready-to-use source code for different languages so you can copy and paste it in your favourite language.
As mentioned in the readme, I would like to get rid of the substring() calls. Currently there are two source files with substring in the code:
I would like to consider alternatives to substring() to have a more expressive alternative.
Describe the solution you'd like
I would like to have a possibility to define capturing groups in the generated regex.
Describe the bug
The code generated for Python, in section 4, has some mistakes and the regex pattern is also functionally different from the one displayed in section 3.
For the sample [] in section 1, with the "Square brackets" selection in section 2:
the regex in section 3 says: \[\];
the Python code in section 4 says:
import re
def useRegex(input):
    pattern = re.compile(r"\\[\\]", re.IGNORECASE)
    return pattern.match(input)
input is the name of a built-in function and should not be used as a variable name (even though the code does work this way).
To Reproduce
Steps to reproduce the behaviour:
1) Make an expression that requires a character to be escaped with a backslash;
2) Scroll down and view the Python code; copy it and use it in Python;
3) Notice that the r-string part \\ is interpreted by Python as \\, instead of \.
Expected behaviour
The expected behaviour is that the Python code either uses a normal string instead of an r-string, or uses an r-string and does not unnecessarily escape the backslash. In the latter case, the same regex text as in section 3 should be used.
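The difference between the two raw strings can be checked directly. This is a small sketch based on the sample from this report; the variable names are chosen for illustration. In the reported pattern, `\\` matches a literal backslash and `[\\]` is a character class containing a backslash, so it matches two backslashes rather than a pair of square brackets.

```python
import re

sample = "[]"

reported = re.compile(r"\\[\\]")  # pattern from the generated Python code
expected = re.compile(r"\[\]")    # same text as the regex shown in section 3

reported_matches_sample = reported.match(sample) is not None
expected_matches_sample = expected.match(sample) is not None
# The reported pattern matches a string of two backslashes instead.
reported_matches_backslashes = reported.match("\\\\") is not None

print(reported_matches_sample, expected_matches_sample, reported_matches_backslashes)
```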
Would you consider supporting Python code?
If you need some help integrating with Python, I would be happy to help.
When I paste in:
| | | 13024a.htm
to generate a regex for it, the generator uses the pipe as a plain character. The pipe is the alternation operator, so it should be escaped in the regex.
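The effect of the unescaped pipe can be shown with a short Python sketch. It uses the sample from this report and the standard `re.escape` function as a stand-in for whatever escaping the generator applies; the variable names are chosen for illustration.

```python
import re

sample = "| | | 13024a.htm"

# Used as a pattern without escaping, each "|" separates alternatives, so the
# pattern no longer describes the sample text as a whole.
unescaped_matches = re.fullmatch(sample, sample) is not None

# With the special characters escaped, the pattern matches the sample literally.
escaped_pattern = re.escape(sample)
escaped_matches = re.fullmatch(escaped_pattern, sample) is not None

print(unescaped_matches, escaped_matches)
```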
I'm forced to use this stupid language at work, so it would be nice if you could add support for it.
How is your programming language called?
VB.net
How does the snippet look like that we shall generate?
Imports System.Text.RegularExpressions
Public Module SampleModule
    Public Function useRegex(ByVal input As String) As Boolean
        Dim re = New Regex("regex", RegexOptions.IgnoreCase Or RegexOptions.Singleline Or RegexOptions.Multiline)
        Return re.IsMatch(input)
    End Function
End Module
Anything special about string literals?
VB.net uses only " for string literals. If you want to use " in a string you have to use ""
e.g.
MsgBox("Hello "" World")
Will show a MsgBox with Hello " World.
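The quoting rule described above could be implemented by a small helper in the code generator. The sketch below is in Python for illustration only; the function name `to_vbnet_string_literal` is hypothetical. It assumes, as stated above, that doubling the quote is the only escaping needed, so backslashes in a regex are passed through unchanged.

```python
def to_vbnet_string_literal(value):
    """Wrap value in double quotes, doubling any embedded double quote."""
    return '"' + value.replace('"', '""') + '"'


print(to_vbnet_string_literal('Hello " World'))
print(to_vbnet_string_literal(r"\d+"))
```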
How can we specify options?
See above
Need a warning?
No
Describe the bug
An error is received when the string is being identified for conversion.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expected to get the following regex in Section 4:
[2/10/23\s\d\d:\d\d:\d\d:\d\d\d\sCST]\s+[0-9]+\s[A-Za-z]+\s+O
but instead I got the error reported and in the 2nd screenshot
Additional context
An error occurred with these details:
Cannot read properties of undefined (reading 'v3b_1')
d0df4af8c79e688174f426dc5363d72d75fa22ce