Comments (9)

jschroed91 avatar jschroed91 commented on August 23, 2024

Hmm... very interesting @danepowell.

Is it happening every time? Are you able to share examples of the HTML inputs that are going into htmldiff? Or is it happening for any input?

from php-htmldiff.

SavageTiger avatar SavageTiger commented on August 23, 2024

Also it might be good to know what version of PHP this is happening on.

danepowell avatar danepowell commented on August 23, 2024

It happens every time I run a certain command. Like I mentioned, it's hard to know what input is being passed to the library since I'm not calling it directly, but I'll try to find out.

I've reproduced it on PHP 7.1 and 7.2. Haven't tried other versions.

My hunch is that this is an out-of-memory error. I think I've seen this happen before with excessively long strings being passed to mbstring. But I really can't know for sure.

danepowell avatar danepowell commented on August 23, 2024

I've narrowed it down quite a bit.

This is the line generating the segfault:

return preg_match("/^[a-zA-Z0-9\pL]+$/u", $str);

The latest release works fine if I simply change that line back to mimic the old version:

return ctype_alnum($str);

I'm still having trouble figuring out if or where that line is being called. Even if I put print('hello world'); or exit; statements in there, nothing seems to happen. It's like the line isn't being called at all, and yet it's generating a segfault. It's also possible the output is getting swallowed by SSH/TTY or something.

Edit: now it's segfaulting even if I have the old line in there, so it's possible I'm chasing a red herring here. I'll see what else I can find. Sorry for the noise...
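
For anyone comparing the two lines quoted above, here is a minimal sketch of how they differ in behaviour. The function names `isAlnumUnicode` and `isAlnumAscii` are hypothetical wrappers made up for illustration; they are not php-htmldiff's actual method names. Only the two expressions inside them come from this thread.

```php
<?php
// Hypothetical wrapper around the newer check quoted above:
// ASCII alphanumerics plus any Unicode letter (\pL), matched in UTF-8 mode (/u).
function isAlnumUnicode(string $str): bool
{
    return preg_match('/^[a-zA-Z0-9\pL]+$/u', $str) === 1;
}

// Hypothetical wrapper around the older check quoted above:
// byte-oriented and dependent on the current locale.
function isAlnumAscii(string $str): bool
{
    return ctype_alnum($str);
}

// Print both results side by side for a few sample strings.
foreach (['abc123', 'héllo', 'foo bar', ''] as $sample) {
    printf(
        "%-10s unicode=%s ascii=%s\n",
        var_export($sample, true),
        var_export(isAlnumUnicode($sample), true),
        var_export(isAlnumAscii($sample), true)
    );
}
```

The two checks should agree on plain ASCII input and differ mainly on non-ASCII letters, which the regex version accepts. What `ctype_alnum` does with those bytes depends on the locale, so no output is asserted for that case here.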

SavageTiger avatar SavageTiger commented on August 23, 2024

Maybe this is relevant: https://bugs.php.net/bug.php?id=65009&edit=1

If this is the case, the best solution is to reimplement this method in another, hopefully also Unicode-compatible, way.

danepowell avatar danepowell commented on August 23, 2024

I've refined my theory about what's happening here: I think this is actually a bug in xhprof that's being triggered by php-htmldiff. I'm in pretty well over my head on this though; can you review the evidence and let me know if you agree, and then I can try to open an upstream ticket with xhprof?

Here is a core dump and backtrace that I managed to pull: https://pastebin.com/4iVHND1G

The germane / proximal lines:

#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:38
38 ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: No such file or directory.
(gdb) bt
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:38
#1 0x00007f04b302b743 in memcpy (__len=39, __src=, __dest=0x7f04b0ae8358) at /usr/include/x86_64-linux-gnu/bits/string3.h:53
#2 zend_string_init (persistent=0, len=, str=) at /usr/include/php/20160303/Zend/zend_string.h:160
#3 hp_compile_file (file_handle=0x7ffc213c11a0, type=8) at /beetbox/workspace/7.1/xhprof-php7/extension/xhprof.c:1594

It appears that xhprof performs some sort of compilation or hooks into the opcode compilation. My theory is that xhprof is causing this segfault when it tries to read/compile that preg_match line in php-htmldiff. That would explain why php-htmldiff is involved despite apparently never being called.

To verify this, I tried repeatedly enabling/disabling the xhprof extension. Whenever xhprof is enabled, PHP segfaults. Whenever xhprof is disabled, no segfaults. So xhprof definitely seems involved (unless this is a memory problem or something more exotic).

SavageTiger avatar SavageTiger commented on August 23, 2024

As far as I can tell, the segfault happens while memory is being allocated, and I imagine you came to the same conclusion. I'm not sure whether this gdb dump has any value to us or to the xhprof devs.

The best thing might be to create a standalone PHP file with a huge string in it and apply the same preg_match to it. If you get a segfault, you can supply that file to the xhprof maintainers as a reproducible test case.
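
A standalone file along those lines might look roughly like the sketch below. The function name `runAlnumCheck`, the default length, and the reported fields are all assumptions made for illustration; only the pattern itself is taken from the line quoted earlier in this thread. Nothing here is confirmed to trigger the segfault.

```php
<?php
// Hypothetical standalone reproduction script; names and sizes are illustrative.
function runAlnumCheck(int $length): array
{
    // Build a long, purely alphanumeric subject so the pattern has to scan all of it.
    $subject = str_repeat('a', $length);

    // Same pattern as the preg_match line quoted earlier in this thread.
    $result = preg_match('/^[a-zA-Z0-9\pL]+$/u', $subject);

    return [
        'php'    => PHP_VERSION,
        'xhprof' => extension_loaded('xhprof'), // whether the profiler is loaded for this run
        'length' => strlen($subject),
        'result' => $result,                    // 1 = match, 0 = no match, false = PCRE error
        'error'  => preg_last_error(),          // PREG_NO_ERROR when the call completed normally
    ];
}

// Subject length comes from the first CLI argument when given, otherwise an assumed default.
$length = isset($argv[1]) ? max(1, (int) $argv[1]) : 1000000;
print_r(runAlnumCheck($length));
```

Running the file once with the xhprof extension loaded and once without, at a few different lengths, would show whether the crash depends on the subject size, on the extension, or on neither.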

danepowell avatar danepowell commented on August 23, 2024

Whatever this is, it seems to not be a problem with this library so I can close this issue and declutter your queue. If you have any other insight I'd love to hear it though.

I'm 99% sure php-htmldiff is not even being called when the segfault occurs so I doubt this has anything to do with input to preg_match. I think this library just happened to trigger some sort of memory bug elsewhere in the stack. I opened an xhprof issue just in case that's the culprit: phacility/xhprof#102

danepowell avatar danepowell commented on August 23, 2024

Just closing the loop on this... I still have no idea what caused the segfaults, but a colleague started experiencing them even with the older version of caxy/php-htmldiff, so I'm pretty sure this library is completely innocent :)
