Bug in props parsing about nicegui HOT 7 CLOSED

ed2050 commented on July 22, 2024

Bug in props parsing

from nicegui.

Comments (7)

ed2050 commented on July 22, 2024

Sorry, I accidentally hit post before the text was finished. Please see the modified text above.

falkoschindler commented on July 22, 2024

Hi @ed2050, thanks for reporting this issue!

Actually, we originally used shlex for parsing props, but replaced it with regex for performance reasons: #341. Since the .props() method is potentially called thousands of times, performance matters quite a lot. So even though I'd like to fix the bug with equal signs in prop values, it might be better to improve the regex rather than switch back to shlex. But maybe there's also a way to reach similar performance with shlex. I'm totally open to suggestions.

Just for the record: A trivial workaround would be to use quotes around the value, like .props(f'src="{ imgfile }"').

falkoschindler commented on July 22, 2024

I just experimented with this simple implementation (see "props" branch):

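import shlex
from typing import Any, Dict, Optional

@staticmethod
def _parse_props(text: Optional[str]) -> Dict[str, Any]:
    dictionary: Dict[str, Any] = {}
    # shlex handles quoting and escaping; each token is either `key` or `key=value`
    for token in shlex.split(text or ''):
        words = token.split('=', 1)  # split on the first "=" only, so values may contain "="
        dictionary[words[0]] = True if len(words) == 1 else words[1]
    return dictionary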

This takes about twice as long as the regex version for long strings like 'dark color=red label="First Name" hint="Your \"given\" name" input-style="{ color: #ff0000 }"', and about five times as long for short strings like 'dark':

       short  long
shlex  0.030  0.260
regex  0.006  0.130
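For anyone who wants to reproduce this kind of comparison, here is a minimal, self-contained timing sketch. Note that `PATTERN` is a simplified stand-in written for this example, not NiceGUI's actual regex, and `parse_shlex`, `parse_regex`, and `time_parsers` are hypothetical names. Absolute numbers will differ from the table above and from machine to machine.

```python
import json
import re
import shlex
import timeit
from typing import Any, Dict, Optional

# Simplified stand-in pattern (an assumption for this sketch, not NiceGUI's regex).
PATTERN = re.compile(r'''
    ([:\w\-]+)                          # group 1: key
    (?:=(?:
        ("[^"\\]*(?:\\.[^"\\]*)*")      # group 2: double-quoted value
        |
        ([\w\-.,%:/=]+)                 # group 3: unquoted value
    ))?
    (?:$|\s)                            # end of string or whitespace
''', re.VERBOSE)

SHORT = 'dark'
LONG = r'dark color=red label="First Name" hint="Your \"given\" name" input-style="{ color: #ff0000 }"'


def parse_shlex(text: Optional[str]) -> Dict[str, Any]:
    """shlex-based parser, following the implementation quoted above."""
    props: Dict[str, Any] = {}
    for token in shlex.split(text or ''):
        words = token.split('=', 1)
        props[words[0]] = True if len(words) == 1 else words[1]
    return props


def parse_regex(text: Optional[str]) -> Dict[str, Any]:
    """Regex-based parser built on the stand-in pattern."""
    props: Dict[str, Any] = {}
    for match in PATTERN.finditer(text or ''):
        key, quoted, unquoted = match.groups()
        if quoted is not None:
            props[key] = json.loads(quoted)  # strip quotes, unescape \" sequences
        else:
            props[key] = unquoted if unquoted is not None else True
    return props


def time_parsers(number: int = 2_000) -> Dict[str, Dict[str, float]]:
    """Return total seconds for `number` calls of each parser on each input."""
    results: Dict[str, Dict[str, float]] = {}
    for name, func in (('shlex', parse_shlex), ('regex', parse_regex)):
        results[name] = {
            label: timeit.timeit(lambda: func(text), number=number)
            for label, text in (('short', SHORT), ('long', LONG))
        }
    return results


print(time_parsers())
```

Both parsers should return the same dictionary for the two sample strings, so the timing compares like with like.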

falkoschindler commented on July 22, 2024

I think I fixed the problem by adding "=" to the list of allowed characters in an unquoted props string.
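To illustrate the kind of change described here, the sketch below builds two versions of a simplified props regex: one whose unquoted-value character class lacks "=", and one that includes it. The pattern and the names `build_pattern` and `parse` are assumptions made up for this example; the actual NiceGUI pattern may differ in its details.

```python
import json
import re
from typing import Any, Dict


def build_pattern(unquoted_chars: str) -> re.Pattern:
    """Build a simplified props regex (hypothetical, for illustration only)."""
    return re.compile(
        r'([:\w\-]+)'                      # group 1: key
        r'(?:=(?:'
        r'("[^"\\]*(?:\\.[^"\\]*)*")'      # group 2: double-quoted value
        r'|'
        rf'([{unquoted_chars}]+)'          # group 3: unquoted value
        r'))?'
        r'(?:$|\s)'                        # end of string or whitespace
    )


def parse(pattern: re.Pattern, text: str) -> Dict[str, Any]:
    props: Dict[str, Any] = {}
    for match in pattern.finditer(text):
        key, quoted, unquoted = match.groups()
        if quoted is not None:
            props[key] = json.loads(quoted)
        else:
            props[key] = unquoted if unquoted is not None else True
    return props


WITHOUT_EQUALS = build_pattern(r'\w\-.,%:/')    # "=" not allowed in unquoted values
WITH_EQUALS = build_pattern(r'\w\-.,%:/=')      # "=" added to the allowed characters

text = 'src=foo=bar.jpg'
broken = parse(WITHOUT_EQUALS, text)            # the "src" key is lost
fixed = parse(WITH_EQUALS, text)                # {'src': 'foo=bar.jpg'}
quoted = parse(WITHOUT_EQUALS, 'src="foo=bar.jpg"')  # quoting works in both versions

print(broken, fixed, quoted)
```

In this sketch the quoted form parses correctly even with the narrower character class, which is why quoting the value works as a workaround.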

ed2050 commented on July 22, 2024

This post suggests a modest improvement if you want to speed up your regex solution even further (2x on simple strings). The _parse_props code is at the end.

Timings

I did timing tests for shlex vs nicegui regex. Looks like a 25x speed difference on my machine:

> python3 -m timeit -s 'import shlex ; text = """src=foo=bar.jpg href="foo bar.jpg" rab=oof"""' -c 'shlex.split (text)'
5000 loops, best of 5: 49 usec per loop

> python3 -m timeit -s 'import nicegui.element as e; regex = e.PROPS_PATTERN ; text = """src=foo=bar.jpg href="foo bar.jpg" rab=oof""" ' -c 'regex.search (text).groups ()'
200000 loops, best of 5: 2 usec per loop

Interesting that your tests with the current regex version have closed that gap: only about 2:1 for long strings and 5:1 for short ones. But you probably measured the complete function, not just splitting the string.
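One more thing that may contribute to the gap: `regex.search(text).groups()` returns the groups of the first match only, while `shlex.split(text)` processes the whole string. Assuming the pattern matches one key/value pair per match (as the `finditer` usage later in this post suggests), a like-for-like comparison would iterate over all matches. The sketch below shows the difference with a simplified stand-in pattern, which is an assumption for this example and not NiceGUI's `PROPS_PATTERN`.

```python
import re
import shlex

# Simplified stand-in pattern (an assumption, not NiceGUI's PROPS_PATTERN).
PATTERN = re.compile(r'''
    ([:\w\-]+)                          # group 1: key
    (?:=(?:
        ("[^"\\]*(?:\\.[^"\\]*)*")      # group 2: double-quoted value
        |
        ([\w\-.,%:/=]+)                 # group 3: unquoted value
    ))?
    (?:$|\s)                            # end of string or whitespace
''', re.VERBOSE)

TEXT = 'src=foo=bar.jpg href="foo bar.jpg" rab=oof'

first_only = PATTERN.search(TEXT).groups()                # groups of the first pair only
all_pairs = [m.groups() for m in PATTERN.finditer(TEXT)]  # one tuple per pair
tokens = shlex.split(TEXT)                                # one token per pair

print(len(all_pairs), len(tokens))
```

Timing the `finditer` list comprehension instead of `search` would give a fairer number for the regex side.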

Improvement

Speeding up shlex doesn't appear to be possible. At least without lib hacking.

If you really want a speedup, you can handle simple cases separately. Checking a string for a quote char is very fast: 25 ns for a 100 char string on my machine. It's much faster to check for quote and use str.split () if none present. Otherwise fall back to regex as desired.

if '"' not in text :
    words = text.split ()
else :
    words = [ m.groups () for m in PROPS_PATTERN.finditer (text) ]

This speeds up splitting simple cases by 10x:

# has quotes, parse with regex
> python3 -m timeit -s 'import nicegui.element as e; text = """src=foo=bar.jpg href="foo bar.jpg" rab=oof""" ; t2 = "foo=bar rab=oof" ; regex = e.PROPS_PATTERN ; quote = """_"_""" [1]' -c 'quote in text and regex.search (text).groups () or text.split ()'
200000 loops, best of 5: 1.98 usec per loop

# no quotes, parse with str.split
> python3 -m timeit -s 'import nicegui.element as e; text = "foo=bar rab=oof" ; regex = e.PROPS_PATTERN ; quote = """_"_""" [1]' -c 'quote in text and regex.search (text).groups () or text.split ()'
2000000 loops, best of 5: 186 nsec per loop

Is it worth it? Well... maybe. By the time you add all the other code in _parse_props, the difference is only 2x on simple strings (1.7 us per call vs 3.3 us). Significant, but _parse_props is probably not a bottleneck in the overall app. But you were concerned about a 2x difference between shlex and regex, so maybe.

Implementation

Here's an implementation of _parse_props with this speedup. It's 2x faster on simple strings (no quotes) by skipping the regex. On strings with quotes, speed is unchanged.

import json
import nicegui.element as e

def _parse_props (text) :

    props = {}
    text = text or ''    # treat None like an empty props string
    # first split text into list of pairs
    if '"' in text :
        # quoted values present: get match groups from regex and keep values that aren't None
        pairs = [ [ x for x in w.groups () if x ] for w in e.PROPS_PATTERN.finditer (text) ]
    else :
        # fast path: no quotes, so plain whitespace splitting is enough
        pairs = [ t.split ('=', 1) for t in text.split () ]

    # now transform pairs into props dict
    for key, *value in pairs :
        value = value and value [0] or ''    # handle empty values
        if value and value.startswith ('"') and value.endswith ('"') :
            value = json.loads (value)    # strip the quotes and unescape the string
        props [key] = value or True    # bare keys become boolean flags

    return props
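To try the idea without NiceGUI installed, here is a self-contained sketch of the same fast path. The regex is a simplified stand-in and the function names `parse_regex_only` and `parse_with_fast_path` are hypothetical; the point is only to check that the two code paths agree on quote-free input.

```python
import json
import re
from typing import Any, Dict, Optional

# Simplified stand-in pattern (an assumption, not NiceGUI's PROPS_PATTERN).
PATTERN = re.compile(r'''
    ([:\w\-]+)                          # group 1: key
    (?:=(?:
        ("[^"\\]*(?:\\.[^"\\]*)*")      # group 2: double-quoted value
        |
        ([\w\-.,%:/=]+)                 # group 3: unquoted value
    ))?
    (?:$|\s)                            # end of string or whitespace
''', re.VERBOSE)


def parse_regex_only(text: Optional[str]) -> Dict[str, Any]:
    """Always use the regex, regardless of the input."""
    props: Dict[str, Any] = {}
    for match in PATTERN.finditer(text or ''):
        key, quoted, unquoted = match.groups()
        if quoted is not None:
            props[key] = json.loads(quoted)
        else:
            props[key] = unquoted if unquoted is not None else True
    return props


def parse_with_fast_path(text: Optional[str]) -> Dict[str, Any]:
    """Use str.split when no double quote is present, else fall back to the regex."""
    text = text or ''
    if '"' in text:
        return parse_regex_only(text)
    props: Dict[str, Any] = {}
    for token in text.split():
        key, _, value = token.partition('=')  # split on the first "=" only
        props[key] = value or True            # bare keys become boolean flags
    return props


samples = ['dark', 'foo=bar rab=oof', 'src=foo=bar.jpg dense']
for sample in samples:
    assert parse_with_fast_path(sample) == parse_regex_only(sample)

print(parse_with_fast_path('label="First Name" dark'))
```

One caveat with this sketch: the fast path accepts anything `str.split` produces, so input the regex would reject (for example a trailing `key=` with no value) can come out differently on the two paths.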

ed2050 commented on July 22, 2024

@falkoschindler

"Actually, we originally used shlex for parsing props, but replaced it with regex for performance reasons: #341."

That's unfortunate. I hate when the clear, simple, obvious solution isn't fast enough.
Regexes are tricky. I've used and abused them a lot. Very easy for bugs to sneak in.

falkoschindler commented on July 22, 2024

Thanks for your detailed investigation, @ed2050!
We'll leave the implementation as it is for now, since it is working fine and the performance is good enough. Using a simple .split() for strings without quotes is an interesting idea though. 👍🏻
