Comments (7)

ed2050 commented on July 22, 2024

Sorry, I accidentally hit post before the text was finished. Please see the modified text above.

from nicegui.

falkoschindler commented on July 22, 2024

Hi @ed2050, thanks for reporting this issue!

Actually, we originally used shlex for parsing props, but replaced it with regex for performance reasons: #341. Since the .props() method is potentially called thousands of times, performance does matter quite a lot. So even though I'd like to fix the bug with equal signs in prop values, it might be better to improve the regex rather than switching to shlex. But maybe there's also a way to reach similar performance with shlex. I'm totally open to suggestions.

Just for the record: A trivial workaround would be to use quotes around the value, like .props(f'src="{ imgfile }"').

falkoschindler commented on July 22, 2024

I just experimented with this simple implementation (see "props" branch):

@staticmethod
def _parse_props(text: Optional[str]) -> Dict[str, Any]:
    dictionary: dict[str, Any] = {}
    for token in shlex.split(text or ''):
        words = token.split('=', 1)
        dictionary[words[0]] = True if len(words) == 1 else words[1]
    return dictionary

Compared to this, the regex needs around 50% less time for long strings like 'dark color=red label="First Name" hint="Your \"given\" name" input-style="{ color: #ff0000 }"', and even 80% less for short strings like 'dark' (so shlex takes roughly 2x and 5x as long, respectively):

       short  long
shlex  0.030  0.260
regex  0.006  0.130

falkoschindler commented on July 22, 2024

I think I fixed the problem by adding "=" to the list of allowed characters in an unquoted props string.
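
To illustrate what allowing "=" in unquoted values changes, here is a minimal sketch. The pattern is a hypothetical stand-in and not nicegui's actual PROPS_PATTERN; `build_pattern` and `parse_props` are illustrative names, and backslash escapes inside quotes are deliberately not handled.

```python
import re
from typing import Any, Dict


def build_pattern(unquoted_chars: str) -> "re.Pattern[str]":
    """Build a key[=value] pattern; the value is "quoted" or a run of unquoted_chars."""
    return re.compile(
        r'([\w\-]+)(?:=(?:"([^"]*)"|([' + unquoted_chars + r']+)))?(?:$|\s)'
    )


def parse_props(text: str, pattern: "re.Pattern[str]") -> Dict[str, Any]:
    props: Dict[str, Any] = {}
    for match in pattern.finditer(text):
        key, quoted, unquoted = match.groups()
        if quoted is not None:
            props[key] = quoted
        elif unquoted is not None:
            props[key] = unquoted
        else:
            props[key] = True  # bare key such as "dark"
    return props


# Assumed character sets for illustration only.
strict = build_pattern(r'\w\-.,%:/')    # "=" not allowed in unquoted values
loose = build_pattern(r'\w\-.,%:/=')    # "=" allowed in unquoted values

print(parse_props('src=foo=bar.jpg', strict))  # "src" is not parsed as intended
print(parse_props('src=foo=bar.jpg', loose))
```

With the stricter character set the unquoted value stops at the second "=", so the match for `src` fails; adding "=" to the set lets the whole value through.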

ed2050 commented on July 22, 2024

This post contains a modest improvement in case you want to speed up your regex solution even further (2x on simple strings). The _parse_props code is at the end.

Timings

I did timing tests for shlex vs nicegui regex. Looks like a 25x speed difference on my machine:

> python3 -m timeit -s 'import shlex ; text = """src=foo=bar.jpg href="foo bar.jpg" rab=oof"""' -c 'shlex.split (text)'
5000 loops, best of 5: 49 usec per loop

> python3 -m timeit -s 'import nicegui.element as e; regex = e.PROPS_PATTERN ; text = """src=foo=bar.jpg href="foo bar.jpg" rab=oof""" ' -c 'regex.search (text).groups ()'
200000 loops, best of 5: 2 usec per loop

Interesting that your tests with the current regex version have closed that gap: only about a 2:1 difference for long strings and 5:1 for short ones. But you probably measured the complete function, not just splitting the string.

Improvement

Speeding up shlex doesn't appear to be possible, at least not without hacking the library.

If you really want a speedup, you can handle simple cases separately. Checking a string for a quote char is very fast: 25 ns for a 100 char string on my machine. It's much faster to check for quote and use str.split () if none present. Otherwise fall back to regex as desired.

if '"' not in text :
    words = text.split ()
else :
    words = [ m.groups () for m in PROPS_PATTERN.finditer (text) ]

This speeds up splitting simple cases by 10x:

# has quotes, parse with regex
> python3 -m timeit -s 'import nicegui.element as e; text = """src=foo=bar.jpg href="foo bar.jpg" rab=oof""" ; t2 = "foo=bar rab=oof" ; regex = e.PROPS_PATTERN ; quote = """_"_""" [1]' -c 'quote in text and regex.search (text).groups () or text.split ()'
200000 loops, best of 5: 1.98 usec per loop

# no quotes, parse with str.split
> python3 -m timeit -s 'import nicegui.element as e; text = "foo=bar rab=oof" ; regex = e.PROPS_PATTERN ; quote = """_"_""" [1]' -c 'quote in text and regex.search (text).groups () or text.split ()'
2000000 loops, best of 5: 186 nsec per loop

Is it worth it? Well... maybe. By the time you add all the other code in _parse_props, the difference is only 2x on simple strings (1.7 us per call vs 3.3 us). Significant, but _parse_props is probably not a bottleneck in the overall app. But you were concerned about a 2x difference between shlex and regex, so maybe.

Implementation

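Here's an implementation of _parse_props with this speedup. It's 2x faster on simple strings (no quotes) by skipping the regex. On strings with quotes, speed is unchanged.

import json
import nicegui.element as e    # provides PROPS_PATTERN, as in the timing commands above

def _parse_props (text) :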

    props = {}
    # first split text into list of pairs
    if '"' in text :
        # get match groups from regex and keep values that aren't None
        pairs = [ [ x for x in w.groups () if x ] for w in e.PROPS_PATTERN.finditer (text) ]
    else :
        pairs = [ t.split ('=', 1) for t in text.split () ]

    # now transform pairs into props dict
    for key, *value in pairs :
        
        value = value and value [0] or ''    # handle empty values
        if value and value.startswith ('"') and value.endswith ('"') :
            value = json.loads(value)
        props [key] = value or True
        
    return props
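
One thing worth checking before adopting a fast path is that it agrees with the regex path on quote-free input. The following is a rough sketch of such a check. The pattern is a hypothetical stand-in (not nicegui's PROPS_PATTERN), and `parse_with_regex` and `parse_with_fast_path` are illustrative names.

```python
import re
from typing import Any, Dict

# Hypothetical stand-in pattern: key, optionally followed by ="quoted" or =unquoted.
PATTERN = re.compile(r'([\w\-]+)(?:=(?:"([^"]*)"|([^\s"]+)))?(?:$|\s)')


def parse_with_regex(text: str) -> Dict[str, Any]:
    props: Dict[str, Any] = {}
    for match in PATTERN.finditer(text):
        key, quoted, unquoted = match.groups()
        if quoted is not None:
            props[key] = quoted
        elif unquoted is not None:
            props[key] = unquoted
        else:
            props[key] = True
    return props


def parse_with_fast_path(text: str) -> Dict[str, Any]:
    if '"' in text:
        return parse_with_regex(text)  # quotes present: fall back to the regex
    props: Dict[str, Any] = {}
    for token in text.split():
        key, sep, value = token.partition('=')
        props[key] = value if sep else True
    return props


# Both paths should agree on these quote-free inputs.
for sample in ['dark', 'foo=bar rab=oof', 'src=foo=bar.jpg dense']:
    assert parse_with_fast_path(sample) == parse_with_regex(sample), sample
```

In this sketch the two paths can still disagree on unusual input, for example an empty value (`key=`) or a key containing characters outside `[\w-]`, so a real implementation would need to decide which behavior is wanted there.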

ed2050 commented on July 22, 2024

@falkoschindler

"Actually, we originally used shlex for parsing props, but replaced it with regex for performance reasons: #341."

That's unfortunate. I hate when the clear, simple, obvious solution isn't fast enough.
Regexes are tricky. I've used and abused them a lot. Very easy for bugs to sneak in.

falkoschindler commented on July 22, 2024

Thanks for your detailed investigation, @ed2050!
We'll leave the implementation as it is for now, since it is working fine and the performance is good enough. Using a simple .split() for strings without quotes is an interesting idea though. 👍🏻
