Giter Club home page Giter Club logo

Comments (19)

Synchro avatar Synchro commented on August 11, 2024 1

That's reasonable for cached content, or pages that don't change - but I'm using it for email where my template remains the same, but it's different every time its rendered, and it's likely to be rendered hundreds of thousands of times, so processing the output is very inefficient.

from emogrifier.

oliverklee avatar oliverklee commented on August 11, 2024

Actually, I'm not quite sure whether this behavior is intended (or avoidable). We'd need to look into that to find this out.

As a workaround, you could use a different delimiter, e.g. # or ^.

from emogrifier.

nelson-rivera avatar nelson-rivera commented on August 11, 2024

Thanks for the advice, I'll use one of those

from emogrifier.

Synchro avatar Synchro commented on August 11, 2024

Given that this is very likely to crop up when using emogrifier with any templating system, it would be useful to be able to escape template markup before processing, then unescape it afterwards. This way you could support any templating system, regardless of delimiters. Common delimiters are {/}, [[/]], *| / |*, ~{ / }~, so you can see there is a fair amount of variation, and characters such as | and * are likely to play havoc with regexes unless handled very carefully. The hard question is how best to escape them, given that they can occur in any context as they do not have to be valid HTML - for example between </tr> and <tr>. An HTML-safe approach might be to wrap them in HTML comments, but that may cause problems for template tags that appear inside attribute values, or indeed as attributes.

from emogrifier.

oliverklee avatar oliverklee commented on August 11, 2024

@Synchro, would you be willing to provide a PR for this (including unit tests that cover the new behavior)?

from emogrifier.

steffenschebesta avatar steffenschebesta commented on August 11, 2024

Yes. we run into the same problems with Twig.
Emogrifier converts <a href="{{ var | filter }}"> to <a href="%7B%7B%20var%20%7C%20filter%20%7D%7D">

I believe this is a problem in PHP's DOMDocument rather than emogrifier itself, though. A fix would be highly appreciated.

from emogrifier.

oliverklee avatar oliverklee commented on August 11, 2024

@jubi Would you be willing to provide a pull request for this?

from emogrifier.

oliverklee avatar oliverklee commented on August 11, 2024

So Emogrifier should not fix attribute values that are not properly encoded as HTML, I think.

from emogrifier.

steffenschebesta avatar steffenschebesta commented on August 11, 2024

@oliverklee
One workaround could be something like this (not very clean):

public function saveLinks(){
    $xmlDocument = $this->createXmlDocument();
    $xpath = new \DOMXPath($xmlDocument);
    $this->clearAllCaches();

    $nodesWithHrefAttributes = $xpath->query('//*[@href]');
    $this->links = array();
    $id = 0;
    if ($nodesWithHrefAttributes !== FALSE) {
        foreach ($nodesWithHrefAttributes as $node) {
            $this->links[] = $node->getAttribute('href');
            $node->setAttribute('href', 'emogrify_stored_link_'.$id++);
        }
    }
}

public function restoreLinks($html){
    $id = 0;
    foreach ($this->links as $link) {
        $html = str_replace('emogrify_stored_link_'.$id++, $link, $html);
    }
    return $html;
}

from emogrifier.

Synchro avatar Synchro commented on August 11, 2024

I think a more general way to do it would be to ignore everything within defined delimiters. One way of doing this is to search for everything that matches the delimiters, like \{\{.*\}\} and replace it with an identifier as in your example. Then run the inliner, which should ignore the identifiers as being plain text, then do a search and replace to put everything back again. The only problem with this approach is if delimiters can appear inside delimited strings, like {{$var|default:"{{"}}, and the proper way to deal with that is to hook into the template compiler itself.

from emogrifier.

steffenschebesta avatar steffenschebesta commented on August 11, 2024

@Synchro
I disagree. We are already using a DOM parser. Falling back to regex does not seem a good approach to me. Plus the problem only occurs in hrefs anyway. Doing it your way would match every variable, even those, that would never be affected by the initial problem.

from emogrifier.

oliverklee avatar oliverklee commented on August 11, 2024

Using a regular expression approach to first find all href attribute values and then restore them afterwards would be fine with me.

from emogrifier.

Synchro avatar Synchro commented on August 11, 2024

It's not using a regex to parse HTML - it's using a regex to parse template syntax. Does it not also happen in other places like src and CSS url() items? I can foresee other situations where template syntax could confuse a DOM parser, for example if you had something like:

{{if $x->type == "heading"}}<th>{{else}}<td>{{/if}}

(ok, so that's a bit contrived!)

from emogrifier.

oliverklee avatar oliverklee commented on August 11, 2024

Actually, I think a better way would be to first render the template and then put it through Emogrifier. This would solve the problem of passing a template that might not be valid (X)HTML.

from emogrifier.

oliverklee avatar oliverklee commented on August 11, 2024

I think as long as we are using a DOM-based approach in Emogrifier, we won't be able to work with templates that are invalid XHTML.

However, for templates that are valid XHTML, the only remaining problem are href and src attributes that automatically get URL-encoded. So saving and restoring those should suffice.

I don't know whether url() items in CSS are affected. @Synchro or @jubi, would you be willing to write a short proof-of-concept unit test for this in a separate PR? If it passes, we can include it as a safety net against regressions.

from emogrifier.

Synchro avatar Synchro commented on August 11, 2024

My rough idea for a template syntax workaround is this:

function mogrify($source) {
        //Change to match your template delimiters
    $ldelim = preg_quote('[[');
    $rdelim = preg_quote(']]');
    $tags = [];
    preg_match_all('/'.$ldelim.'.*?'.$rdelim.'/', $source, $tags);
    $repl = [];
    foreach ($tags[0] as $tag) {
        $repl[$tag] = md5($tag);
    }
    //Replace tags with hashes
    $source = str_replace(array_keys($repl), array_values($repl), $source);
    //Do the CSS inlining
    $emogrifier = new \Pelago\Emogrifier();
    $emogrifier->setHtml($source);
    $emogrifier->disableInvisibleNodeRemoval();
    $source = $emogrifier->emogrify();
    //Put tags back in
    $source = str_replace(array_values($repl), array_keys($repl), $source);
    return $source;
}

This, in theory at least, should be reasonably robust, and it's simple and fast. It hides all template markup from emogrifier then restores it afterwards. Most common template tag contexts should survive. It could get messy if you have template tags that generate style sheets, in which case you have a chicken and egg problem... This works with simple content examples that emogrifier otherwise makes a mess of, for example:

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8" />
    <title>s</title>
    <style>
    p {color:red;}
    </style>
</head>
<body>
[[thing]]
<p>hello</p>
<a href="[[blah]]">asdas</a>
</body>
</html>

With the above code, this results in correct-looking output:

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>s</title>
</head>
<body>
[[thing]]
<p style="color: red;">hello</p>
<a href="[[blah]]">asdas</a>
</body>
</html>

I will see how well this fares with more complex examples.

from emogrifier.

oliverklee avatar oliverklee commented on August 11, 2024

Hi @Synchro , thanks for the implementation proposal.

Actually, the PR #170 should fix this by not encoding the text nodes anymore.

from emogrifier.

monstrfolk avatar monstrfolk commented on August 11, 2024

This is still an issue

from emogrifier.

oliverklee avatar oliverklee commented on August 11, 2024

@monstrfolk Could you please open a separate issue for this including the steps to reproduce this, the actual result and the expected result? Thanks!

from emogrifier.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.