firasdib / regex101

This repository is currently only used for issue tracking for www.regex101.com

regex101's Issues

site colors

I find the new colors are cool but make the site hard to read.
All the best,
Slewitan, an aging programmer

Explanation panel over Information panel

In my opinion the explanation panel is important, but the information panel currently sits below it. If you are using a regexp, you more or less already know about regexps, and the most important thing is to know what results your regexp gives. So I think it would be better if the information panel were above the explanation panel. Also, why is it called "Information" and not "Matches"?

This is just my suggestion, as a lover of this tool :)

Mismatch between the highlight and the match list?

http://regex101.com/r/iR0bX7

Firefox 29.0.1 display issue

Match highlights are not properly rendered on Firefox (see screenshot). And it seems the cursor isn't accurate in the textarea as well (i.e. I have to click a few characters to the left to place my cursor in the right position)

codegen: Dart Language Support

Support for Dart Code Generation, etc.

Dart Site

Insert parenthesis around highlighted text

While writing an expression I often find myself thinking the sub-expr I just wrote needs to be in a group. So I hit the left arrow key while holding shift to highlight then hit shift left parenthesis, thinking that both left and right would be inserted around that text. Currently, all text is replaced. I'd like to see this behaviour changed. The cursor placement after char insertion is also important. For example after hitting left parenthesis, the cursor should be placed after the left to easily configure the group for non-capture, look-around etc, whereas hitting the right should place it after the right to enable entering repeat modifier.

Another suggestion would be to auto include the right parenthesis any time the left is entered. As it is, after entering the left I have to then enter the right and replace the cursor within, otherwise the editor shows the expression to be in error.

Case insensitive modifier (/i) doesn't work with Cyrillic characters in PCRE/Python flavors

Example (/i modifier enabled):

Test matches tEsT
Проба doesn't match пРоБа (Cyrillic characters)
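
One possible explanation, offered here as an assumption rather than a confirmed diagnosis, is that the engine folds case byte by byte instead of per Unicode character. A minimal Python 3 sketch of the difference (the variable names are illustrative only):

```python
import re

pattern = "Проба"
text = "пРоБа"

# str patterns are Unicode-aware in Python 3, so IGNORECASE folds Cyrillic letters.
unicode_match = re.search(pattern, text, re.IGNORECASE)

# bytes patterns only fold ASCII letters, so the UTF-8 encoded text does not match.
bytes_match = re.search(
    pattern.encode("utf-8"), text.encode("utf-8"), re.IGNORECASE
)

print(unicode_match is not None)  # True
print(bytes_match is not None)    # False
```

If the site behaves like the second call, enabling the engine's Unicode mode would be the thing to check.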

Multilanguage

I can help with Spanish...

Four backslashes in PHP

PHP needs four backslashes to match single backslash. For example, if I test [^\\]\/[a-z]*$ against foobar in your wonderful project, it gives 'No matches'. Yet:

php > var_dump(preg_match('/[^\\]\/[a-z]*/i', 'foobar'));
int(1)

php > var_dump(preg_match('/[^\\\\]\/[a-z]*$/i', 'foobar'));
int(0)

and, as expected:

php > var_dump(preg_match('/[^\\\\]\/[a-z]*$/i', 'foobar/i'));
int(1)

php > var_dump(preg_match('/[^\\\\]\/[a-z]*$/i', 'foobar\/i'));
int(0)

This is a known PHP easter egg ))
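
The same two layers of escaping (string literal first, regex engine second) exist in other languages. As a rough Python sketch, not PHP itself, the pattern below needs four backslashes in a normal string literal but only two in a raw string. The slash is left unescaped because Python has no regex delimiters.

```python
import re

# Layer 1: the string literal. Layer 2: the regex engine.
raw = r"[^\\]/[a-z]*$"        # raw string: the regex engine receives \\
escaped = "[^\\\\]/[a-z]*$"   # normal string: four backslashes in the source
assert raw == escaped         # both spell the same pattern text

pattern = re.compile(raw, re.IGNORECASE)
results = {
    "foobar": bool(pattern.search("foobar")),          # no slash at all
    "foobar/i": bool(pattern.search("foobar/i")),      # slash after a non-backslash
    "foobar\\/i": bool(pattern.search("foobar\\/i")),  # slash after a backslash
}
print(results)
```

The three results line up with the last three PHP calls above (0, 1, 0).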

Error in generated JS code

The generated JavaScript code for replace uses \n to reference captured groups. It should be $n. Example here: http://regex101.com/#javascript

Versions

Suggestion for future consideration, tracking newer versions. Even something as subtle as.

or "This regex has a fork"

In the case where an expression has more than one fork, on-click possibly show the "Community" tab filtered so that you only see newer forks of this expression.

This would be extremely helpful for improving community submitted regex's.

There should be an entire pattern match in Match information

Match information should contain the entire pattern match in addition to the captured groups.

Example:

Pattern: f[os]
Text: foo
Entire pattern match: fo

It corresponds to $& in Perl or matcher.group(0) in Java (after matcher.find() == true).
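
For comparison, a short Python sketch of the same idea, where group 0 is the entire match and the numbered groups follow it. The grouped variant of the pattern is added here purely for illustration.

```python
import re

# The example from this issue: pattern f[os] against the text "foo".
m = re.search(r"f[os]", "foo")
entire_match = m.group(0)

# Illustrative variant with capture groups: group 0 is still the whole match.
g = re.search(r"(f)([os])", "foo")
whole, captured = g.group(0), g.groups()

print(entire_match)  # fo
print(whole, captured)
```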

Checkbox to turn off line wrapping

First off, regex101 is the best regex website by far (and I've probably used 20+ other sites and programs in the past 5 years)!

Sometimes I need to run a regex on a loooooooong string (think 1000+ characters per line), usually from some SQL dump or CSV file. The original text file looks very clean with all of its perfectly aligned columns. Unfortunately regex101 automatically wraps my result, which makes the text very hard to read.

I can force the textarea of the test string to not wrap with the CSS property white-space: nowrap; using Chrome Developer Tool, but I noticed the color highlighting and the whitespace visualization doesn't update with the no-wrap CSS hack (the position of the visualization is still set to wrapped text). It also doesn't display a horizontal scrollbar so I can't even view the rest of the line.

I've noticed that the color highlighting and the whitespace visualization works nicely with vertical scrolling so I assume this is possible with a horizontal scrolling as well.

Multiple-Match Mode

There is one usability feature I have been meaning to write to you about. I realize that you can tell the tester to try multiple matches by adding a 'g' flag. However, that will not be obvious for many users because unlike JavaScript, PHP does not have that flag. For this reason, I would suggest adding that feature as a toggle in settings or at the top of the pattern box.

What do you think?
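
To illustrate the point that some languages select "global" matching by function rather than by flag, here is a Python sketch; Python, like PHP, has no g flag. The sample pattern and text are made up for the example.

```python
import re

pattern = re.compile(r"\d+")
text = "a1 b22 c333"

# Single-match mode: stop at the first match (compare PHP's preg_match).
first = pattern.search(text).group(0)

# Multiple-match mode: collect every match (compare PHP's preg_match_all).
all_matches = pattern.findall(text)

print(first)        # 1
print(all_matches)  # ['1', '22', '333']
```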

Turning colorize syntax on/off does not rehighlight matches in document.

If I turn colorize syntax off, then turn it back on, the matches from my query are not re-colorized. Once I modify the regex string somehow, the match colorization comes back. This is unpredictable; sometimes flipping the colorization switch on/off doesn't cause the bug, sometimes it does.

Using Chrome 35.

Unresponsive or high CPU with recursive subpattern

http://regex101.com/r/tW6eK9

I know the incorrect group is being matched; I noticed the problem when adding a new capture group to the front of the expression. Clicking or typing anywhere in the expression causes high CPU use for about 20 seconds, during which the page is unresponsive. This is in Chrome; I have not tested other browsers.

add a search/filter on quick reference

The new quick reference is great for those who know what they are looking for. But not everyone has that depth of knowledge.

I think just adding a little search bar that will filter out the items would be very helpful.

Alternatives are not correctly parsed

/(a|c|g|t)*/ should match 'gattaca', right?

but it doesn't...
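
As a sanity check of the expected behaviour, here is a small Python sketch (Python's re module, not the site's own engine). The pattern matches the whole string, and the repeated group keeps only its last iteration.

```python
import re

m = re.fullmatch(r"(a|c|g|t)*", "gattaca")

whole = m.group(0)       # the entire string is consumed
last_group = m.group(1)  # a repeated group retains its final iteration

print(whole)       # gattaca
print(last_group)  # a
```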

php code generator is not escaping backslash

With the regexp ^(?:(.*)\\)?([^\\]+)$ (for catching a namespace and class name), the PHP code generator should generate ^(?:(.*)\\\\)?([^\\\\]+)$.
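
A minimal sketch of the escaping step such a generator could apply, written in Python. The helper name to_php_single_quoted is hypothetical and is not the site's actual code. It relies only on the fact that a PHP single-quoted string treats \\ and \' as escapes, and it deliberately ignores delimiter escaping.

```python
def to_php_single_quoted(pattern: str) -> str:
    """Escape a regex body for use inside a PHP single-quoted string.

    Hypothetical helper: doubles every backslash, then escapes single quotes.
    Escaping of the regex delimiter is out of scope for this sketch.
    """
    return pattern.replace("\\", "\\\\").replace("'", "\\'")


source_pattern = r"^(?:(.*)\\)?([^\\]+)$"
php_body = to_php_single_quoted(source_pattern)
print(php_body)  # ^(?:(.*)\\\\)?([^\\\\]+)$
```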

Mouse-over with small menu

Mousing-over incorrectly identifies regex parts in the middle DIV when using small menus.

No issue

Just wanted to say that http://regex101.com is fantastic. Great work.

Rather than an embed, the IRC tab should open a new tab in your browser.

Personally, I'm not too much of a fan of the skin-breaking embed, and suggest that the 'irc' tab which currently embeds webchat.freenode.net directly, instead opens a new tab in your browser.

I think nowadays most work settings / programmers have multiple monitors, I often find myself coding on one screen, and having multiple instances of my browser, or even irc open, on the other.

Because it is currently an embed, I cannot review both things at the same time. For instance, when someone suggests that I change my regex, I can't directly see what my current regex is; I have to switch tabs to verify, and if I forget what was said I have to switch back again. A minor point, but also kind of tedious.

It also doesn't go with the skin, I don't really think this needs elaboration, light cyan doesn't go with a blackish background.

That said, this might get slightly annoying, since you may accidentally open multiple instances of IRC in your browser. So another option, instead of linking to it directly, is to open an information page (like a FAQ) with a link, so users can decide for themselves what they want to do with it.

regex section height too small

I've noticed that the regex section has a max-height. This is somewhat irritating when using the x modifier with long expressions since we need to scroll up and down in a 100px height box. Could we remove this max-height or set a higher limit?

Fix: remove/edit:

#richtext_regex_container {
    max-height: 100px;
}

Public source code?

Will you ever put the source code online? Or maybe a part of it? You might get support from the community.

Incomplete token?

I'm quite the beginner to regex, but I think I found a bug where any Unicode code point written with more than 4 hex digits causes an "Incomplete token" error to appear in the Explanation window. E.g. \x{FFEF} is OK, but \x{10000} is not.

The regular expression I had originally tried is from Stack Overflow:

http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url

Issue with highlighting the match in test string

The first character in each line is highlighted even though there is no match. Match information is returning empty which is correct.

Please have a look at http://regex101.com/r/xF0gF8

Python auto-generated example code should use compiled regex

The compiled regex should be used instead of basic usage.

Simply replace:

import re
p = re.compile(r'\r\n|\r|\n')
str = "Unix text string\nOS X text string\rDos text string\r\n"
subst = "\r\n"

re.sub(p, subst, str)

by:

import re
p = re.compile(r'\r\n|\r|\n')
in_str = "Unix text string\nOS X text string\rDos text string\r\n"
subst = "\r\n"

out_str = p.sub(subst, in_str)

You can specify a name for the result: out_str.

Note: str is the name of a Python built-in type (not a reserved keyword), so using it as a variable name shadows the built-in.

Reference: https://docs.python.org/2/library/re.html#re.RegexObject.sub

Ruby flavor

Please add ruby support.

Javascript is UTF-16, PCRE is UTF-8

[\x7F-\xFF]+ should be able to match ߖ, and it does so in PCRE since everything is UTF-8. However, because strings are represented as UTF-16 in JavaScript, it does not accept ߖ as a valid match.
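
The difference can be reproduced in Python, assuming the PCRE behaviour described here comes from matching the UTF-8 bytes rather than code points. In this sketch the bytes pattern stands in for byte-wise matching, and the str pattern stands in for matching on code units or code points.

```python
import re

ch = "ߖ"
utf8 = ch.encode("utf-8")  # two bytes, both in the range 0x80-0xFF

# Byte-wise matching: each UTF-8 byte falls inside [\x7F-\xFF].
byte_match = re.fullmatch(rb"[\x7F-\xFF]+", utf8)

# Code point matching: the character's code point is above U+00FF.
codepoint_match = re.fullmatch(r"[\x7F-\xFF]+", ch)

print(len(utf8), byte_match is not None, codepoint_match is not None)
```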

Python auto-generated example code shouldn't use `str` as a variable name

This shadows the built-in name str, masking it, so it has the potential for confusion if the code is used as is. Maybe test_str or similar would be better?

(Great site otherwise, by the way 👍)
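
A small sketch of how the shadowing can bite. The function names are made up for the example, and the shadowing is kept inside a function so it does not leak into the rest of the module.

```python
def call_with_shadowed_name():
    # Rebinding the name hides the built-in type for the rest of this scope.
    str = "Unix text string\n"
    try:
        return str(42)  # now calls a string object, not the built-in type
    except TypeError as exc:
        return f"TypeError: {exc}"


def call_with_safe_name():
    test_str = "Unix text string\n"
    return str(len(test_str))  # the built-in is still available


shadowed_result = call_with_shadowed_name()
safe_result = call_with_safe_name()
print(shadowed_result)
print(safe_result)
```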

Community Submissions should have their regex language in the title

Hey there!

My suggestion should be fairly straightforward, but aside from filtering on language, perhaps it would be wise to mention what kind of Regex it is (pcre, js or python) in the title itself?

For clarification, I'm talking about this screen, where I suggest, that for example, instead of the title being "REGEX101.COM ID GRABBER", it would be "[PCRE] REGEX101.COM ID GRABBER" instead, so you quickly know what you're clicking on, and for which language it is.

Which additionally can be hidden when you're only sorting on one language.

Failed to detect a certain broken lookbehind.

For example, this is invalid:

(?<!&(\w\w|\w\w\w\w))

But Regex101 doesn't spot the problem.
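
For reference, Python's re module rejects this pattern at compile time. Python's rule (the whole lookbehind must have a fixed width) is stricter than PCRE's, so this is only an illustration of the kind of check involved. The compiles helper is a made-up name.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# The nested alternation makes the lookbehind variable-width.
broken_ok = compiles(r"(?<!&(\w\w|\w\w\w\w))")

# Splitting it into two fixed-width lookbehinds expresses the same condition.
split_ok = compiles(r"(?<!&\w\w)(?<!&\w\w\w\w)")

print(broken_ok, split_ok)  # False True
```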

Quiz

The quiz has been discontinued, or at least not included in the current release. This is because the interest seemed to be very low and it required a lot of work to re-implement it for this new version. If people really are interested, leave a comment and I will reconsider it.

wrong highlighting

Please test. Look at the end of highlighting. It is not (")

pattern: (data-title=")(.+?)(")
string: iframe width="640" height="360" data-title="%D0%94%D0%B8%D1%81%D0%BA%D1%83%D1%81%D1%81%D0%B8%D0%BE%D0%BD%D0%BD%D1%8B%D0%B9%20%D1%81%D0%B5%D0%BC%D0%B8%D0%BD%D0%B0%D1%80%20%D0%BF%D0%BE%20%D1%82%D0%B5%D0%BC%D0%B5%3A%22%D0%9E%20%D1%81%D0%B2%D0%BE%D0%B1%D0%BE%D0%B4%D0%B5%20%D0%B2%D0%BD%D0%B5%D1%88%D0%BD%D0%B5%D0%B9%20%D0%B8%20%D1%81%D0%B2%D0%BE%D0%B1%D0%BE%D0%B4%D0%B5%20%D0%B2%D0%BD%D1%83%D1%82%D1%80%D0%B5%D0%BD%D0%BD%D0%B5%D0%B9%22" data-picprefix="sd" data-video_id="iVJee5GTqz8" frameborder="0" allowfullscreen></iframe

Support for vim regex patterns

Any chance of seeing support for vim regex pattern in regex101?

Select dialect through URL

I used to access the webpage through the http://regex101.com/#python URL, which saved me the effort of selecting the dialect I use through mouse clicking. The new layout prevents me doing so. I consider this to be a regression.

use `unicode` raw string with Python 2.7

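With Python 2.7, you must use a unicode raw string for a Unicode regex, e.g.:

# -*- coding: utf-8 -*-
import re
p = re.compile(ur'(Summer|été)')
in_str = u"C’est l’été…"
subst = ur"[\1]"  # raw, so \1 is a backreference rather than the character \x01

out_str = p.sub(subst, in_str)  # pass in_str, not the built-in str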
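
On Python 3, where every str is already Unicode and the ur prefix no longer exists, the same substitution might look like the sketch below. This is an adaptation for illustration, not the code the site generates.

```python
import re

p = re.compile(r"(Summer|été)")
in_str = "C’est l’été…"
subst = r"[\1]"  # raw string keeps \1 as a backreference

out_str = p.sub(subst, in_str)
print(out_str)  # C’est l’[été]…
```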

Flags and Modifiers Help

Somehow show what each of the flags and modifiers do, as there is currently no explanation in plain sight.

Maybe on click of the input box open the Quick Reference pane and switch to the "Flags and Modifiers" section.

Not all quotes replace

http://regex101.com/r/fT5aU1#pcre

I made a PHP file using preg_replace, and there all quotes are replaced, not only the first.
https://gist.github.com/versusbassz/ee4adc22322d4e7e853c

Is this a bug, or did I misunderstand something?

sorryformyenglish

Code generated for PHP doesn't handle newlines properly

The following regular expression (PCRE):

^[a-z]+$

And the following input:

a
b
c

Will generate the following PHP code:

$re = '/^[a-z]+$/m'; 
$str = 'a\nb\nc'; 

preg_match_all($re, $str, $matches);

The problem is that escape sequences do not work inside single quotes. The \n characters stay as they are, as a literal \ followed by an n. The string needs to be wrapped in double quotes instead for it to work, i.e. it should be:

$re = '/^[a-z]+$/m'; 
$str = "a\nb\nc";

To improve it further, you could check the return value:

$match = preg_match_all($re, $str, $matches);

if ($match) {
    print_r($matches);
    // ...
}
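
The effect is easy to reproduce outside PHP. In this Python sketch, a raw string plays the role of the PHP single-quoted string (backslash sequences stay literal) and a normal string plays the role of the double-quoted one.

```python
import re

pattern = re.compile(r"^[a-z]+$", re.MULTILINE)

literal_backslashes = r"a\nb\nc"  # 7 characters: no real newlines
real_newlines = "a\nb\nc"         # 5 characters: three lines

no_matches = pattern.findall(literal_backslashes)
line_matches = pattern.findall(real_newlines)

print(no_matches)    # []
print(line_matches)  # ['a', 'b', 'c']
```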

Select All Substitution

If you have a large amount of text that you replace with a regex, it can be very annoying to have to scroll and copy the substituted text. A button to select all, so that you could just copy it, would be great (something implemented like http://stackoverflow.com/questions/1173194/select-all-div-text-with-single-mouse-click). Opening a new tab containing only the result would also work (as in, a link to a temporary webpage that is just text, so the user could press Ctrl+A and copy that way). Even converting the #subst_result div to an input box would work (I have a bookmarklet that implements that for just this occasion).

Partial matching

I'm not sure to what degree this is possible (as I'm not familiar with the internal workings), but it would be great if it were possible to see partial matches, optionally filtered by a minimum match length (to prevent it from matching absolutely everything).

When debugging regular expressions, this would make it very easy to spot at what point a regex stops matching, when it should be matching entirely.

Add replacement functionality

A la Javascript,
str = str.replace(/^(my)(regex101)$/, '$1improved$2');

Specifically,

1. Add an option for replacement.
2. If replacement is enabled:
   a. Add a second input line, beneath the regex, to allow for the replace string
   b. Add a second text area, to the right of the current text area, showing the result of the replace
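
For comparison with the JavaScript line above, a Python sketch of the same replacement. Note the different backreference syntax: \1 in Python versus $1 in JavaScript. The input string is made up for the example.

```python
import re

text = "myregex101"

# Same pattern as the JavaScript example; backreferences are \1 and \2 here.
result = re.sub(r"^(my)(regex101)$", r"\1improved\2", text)

print(result)  # myimprovedregex101
```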

Generated Python code broken

I was provided the following generated code:

import re
p = re.compile(ur'Node (\d+).*label (\w+).*property (.*)', re.IGNORECASE)
test_str = u"Node 534760 already exists with label Entity and property \"uid\"=[me:0]"

re.findAll(p, test_str)

This will not run: the re module does not have a function named findAll. Instead, it should be findall.
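
A corrected version might look like the following sketch, written for Python 3 (so without the ur and u prefixes) and calling findall on the compiled pattern.

```python
import re

p = re.compile(r"Node (\d+).*label (\w+).*property (.*)", re.IGNORECASE)
test_str = 'Node 534760 already exists with label Entity and property "uid"=[me:0]'

has_camel_case = hasattr(re, "findAll")  # the misspelled name does not exist
matches = p.findall(test_str)            # one tuple of groups per match

print(has_camel_case)  # False
print(matches)
```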

Add ability to change delimiters

Hi,
I don't remember whether, in the previous version, this was handled automatically when a regexp with custom delimiters was found.
In this version the delimiters are fixed to '/', and it would be really nice if they could be changed to other characters (e.g. '#'), as I often test regexps with many slashes and I do not want to escape all of them every time.

Thank you!

Browser UI hangs on some regexps

The browser (Firefox, Safari) hangs on some regexps.

This is the string I was using in my tests:

PB-RE-MVL14050100-02-Example document to test with.xlsx

Regex1:

(?P<OBS>.+)\-(?P<Document_Categorie>.+)\-(?P<Origin>.+)(?P<Jaar>.+)(?P<Maand>.+)(?P<Dag>.+)(?P<Volgnummer>.+)\-(?P<DrawingIndex>.+)\-(?P<DrawingFreeText>.+)\.(?P<DrawingStatus>.+)

Regexp 2 (detail):

(?P<OBS>CM|JV|OM|PB|PM|TM)\-(?P<Document_Categorie>BE|BG|CN|CO|CY|DR|FI|GW|OV|PB|PI|PP|QU|RE|TO|VG|WI)\-(?P<Origin>ANI|ARC|ARD|ASM|BBR|BHA|CHA|CHU|CIJ|CMU|CWA|DOD|DWA|EKL|FGI|FOO|GHE|GSC|GVE|HBA|HST|HVE|IWO|JHA|JJA|JME|JMH|JSC|KLU|MVL|PAU|PLO|RBR|RDE|RTE|RWS|TPR|TTI|WBO|WPO|WSV|WVI|WZW)(?P<Jaar>11|12|13|14|15|16|17|18|19|20|21|22|23|24)(?P<Maand>01|02|03|04|05|06|07|08|09|10|11|12)(?P<Dag>01|02|03|04|05|06|07|08|09|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31)(?P<Volgnummer>[0-9][0-9])\-(?P<DrawingIndex>[0-9][0-9])\-(?P<DrawingFreeText>.{1,30})\.(?P<DrawingStatus>xls|xlsx|pdf|xnk|DOC|mpp|pptx|JPG|ZIP|xlsm|PNG|avi|key|htm|plt|mdb|wmv|mpeg|LNK|thmx|skp|GIF|label|xml|ods|TXT|msg|AVI|ppt|bmp|DWG|eps|THMX|gif|gvsp|dwg|dwf|lnk|nwd|txt|XLS|vsd|BMP|mm|zip|ppsx|PDF|HTM|jpg|dwfx|rvt|png|DOCX|doc|docx|XML)

It's used for naming convention tests.
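
One common mitigation, assuming the hang comes from the many unbounded .+ groups in Regex 1 forcing heavy backtracking (an assumption; the cause is not confirmed here), is to give each group a bounded, unambiguous shape. The sketch below is a simplified stand-in for Regex 2: it keeps the group names but replaces the long alternation lists with character classes.

```python
import re

# Simplified stand-in for Regex 2: character classes instead of the full lists.
pattern = re.compile(
    r"(?P<OBS>[A-Z]{2})-(?P<Document_Categorie>[A-Z]{2})-(?P<Origin>[A-Z]{3})"
    r"(?P<Jaar>\d{2})(?P<Maand>\d{2})(?P<Dag>\d{2})(?P<Volgnummer>\d{2})"
    r"-(?P<DrawingIndex>\d{2})-(?P<DrawingFreeText>.{1,30})"
    r"\.(?P<DrawingStatus>\w+)"
)

name = "PB-RE-MVL14050100-02-Example document to test with.xlsx"
fields = pattern.fullmatch(name).groupdict()

for key, value in fields.items():
    print(f"{key}: {value}")
```

Because every group except the free text has a fixed width, the engine has very few ways to split the string and little to backtrack over.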

'\r' is invalid

With the regex string '\s{2,} ', the 'Test String' content area is invalid for some data, like the '======' below:

ssdfsadf

\r is invalid

^[\Q\E]$

missing terminating ] for character class - offset: 8

A few notes on UX/UI design

Introduction

I found regex101.com back in January when I was working on some incredibly challenging regular expressions (think 39 match items in about half as many lines–built in a for loop). I needed an easy way to test my work on the actual data in real-time. Regex101 proved to be a lifesaver. Needless to say, I've put the site through its paces and I really appreciate the service. This is why I've taken a few minutes to put down my thoughts for further discussion.

I'm really disliking certain aspects of the new theme as compared to before. Perhaps this is partly a case of "Who moved my cheese?!?" but perhaps some of these comparisons will be useful. The things I feel affect the usability are: Fixed height, lower contrast/less whitespace, and changes in element emphasis.

My Comparison

The first consideration is what parts of the site do I use most?

Regex box
Text box
Match information
Quick reference
Occasionally:
Substitution
Explanation
Match mode explanations

Secondly, how have these aspects changed?

The quick reference box is now collapsed and restructured
Match information is now on the right-hand side, easier to see, as is the Explanation
Substitution, collapsed item, about the same
Match mode explanations apparently gone

Thirdly, what design elements do I feel have negatively impacted usability?
A fixed height has caused the restructuring of the quick reference box. This forces the user to click twice to see any information, three clicks total to get to an example usage, and many more clicks if multiple things need to be referenced. Previously, because height was not a consideration, there were two(?) possible views for the quick reference box: basic and advanced. In order to see it, I would two-finger scroll down (MacBook Pro), view the reference, and two-finger scroll back up. This was much simpler, and it was the second thing I noticed when I visited the newly themed site.

Less whitespace, combined with lower contrast in the light theme, affects my desire to look at the site. This, combined with the next item, decreases my desire to use the site by reducing the clarity of the visual hierarchy of the page.

Changed element emphasis has made it more challenging to navigate the single page. I really like the fact that the new design attempts to put all of the information you need right in front of you. I think the right side panel is pretty genius, really. However, by putting everything on one page, evenly spaced from other items, with the same font size and color, and the same background and heading colors, I suddenly have to think about what's important on the page because it's not blatantly, visually obvious. This is not good. Previously, there was a strong visual hierarchy for the difference between the information currently in the left side panel and the information in the panels to its right. That is now gone, and significantly contributes to the mental confusion.

Conclusion

There have been a lot of great changes to the design, theme, and usability. There have also been changes that have challenged me in my use of the site as compared to before. I do not have all of the answers, yet I submit my analysis in the hope that it might contribute in some small way to the success of this much-appreciated service, regex101.com. Thank you.

Typo for "lookahead"

The new hover info is great! A quick typo I noticed:

"Lokahead": Start of positive "lokahead" assertion