darylldoyle / svg-sanitizer
A PHP SVG/XML Sanitizer
License: GNU General Public License v2.0
I'm investigating this library and associated plugins for my employer on behalf of some clients. I'm curious: are there any known SVG issues that this does not catch? I know that SVGs can never be made 100% safe, but I'm trying to ascertain the level of mitigation this library provides, and to determine the language and certainty that can be provided to clients if it's integrated into our processes.
INPUT
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 145 45">
<metadata id="metadata"></metadata>
<g id="whatever">
<path id="path123"/>
</g>
</svg>
OUTPUT
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 145 45">
<metadata></metadata>
<g>
<path/>
</g>
</svg>
If there is a doctype, then this is reported as an error.
For example, take any SVG from the SVG 1.1 structural specification:
<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg width="8cm" height="3cm"
xmlns="http://www.w3.org/2000/svg" version="1.1">
<desc>Local URI references within ancestor's 'defs' element.</desc>
<defs>
<linearGradient id="Gradient01">
<stop offset="20%" stop-color="#39F" />
<stop offset="90%" stop-color="#F3F" />
</linearGradient>
</defs>
<rect x="1cm" y="1cm" width="6cm" height="1cm"
fill="url(#Gradient01)" />
<!-- Show outline of canvas using 'rect' element -->
<rect x=".01cm" y=".01cm" width="7.98cm" height="2.98cm"
fill="none" stroke="blue" stroke-width=".02cm" />
</svg>
This produces the error
{
"message": "Suspicious node 'svg'",
"line": -1
}
Hello, I'd like to ask about PHP 5.6, which my application uses.
When I install the package no error is shown, and when I use it no error is shown either.
Is there a possible issue with using it on PHP 5.6? I ask because the package declares support for "php": "^7.0 || ^8.0".
Thanks
I was testing out my project on PHP 8 and got a warning about libxml_disable_entity_loader being deprecated:
https://www.php.net/manual/en/function.libxml-disable-entity-loader.php says:
Warning This function has been DEPRECATED as of PHP 8.0.0. Relying on this function is highly discouraged.
I've only taken a cursory glance at it, but it seems like the solution would be to pass an empty function or some other no-op into https://www.php.net/manual/en/function.libxml-set-external-entity-loader.php instead.
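A minimal sketch of the no-op loader idea described above. The helper name disableExternalEntityLoading() is hypothetical and not part of the library; it assumes the libxml extension is loaded and that a resolver returning null is an acceptable way to refuse every external entity.

```php
<?php
// Hypothetical helper (not part of svg-sanitizer): blocks external entity
// loading without triggering the PHP 8 deprecation warning.
function disableExternalEntityLoading(): bool
{
    if (\PHP_VERSION_ID < 80000) {
        // Before PHP 8 the old switch is not deprecated yet.
        libxml_disable_entity_loader(true);
        return true;
    }

    // PHP 8+: install a resolver that never resolves anything.
    return libxml_set_external_entity_loader(
        static function ($publicId, $systemId, $context) {
            return null;
        }
    );
}
```

With libxml 2.9.0 or newer, external entity loading is already disabled by default, so on such systems this would mostly act as an extra safeguard.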
Is there any reason why, in the context of WordPress, this library does not use KSES? Is it just to keep it portable, or are there other reasons?
(A very shallow inspection of the code reveals functionality which is very similar to what KSES does.)
Hi! IMHO the whitelisted attributes and tags should be case sensitive, e.g. viewbox becomes viewBox.
Hi,
not sure if I am doing anything wrong here. The sanitizer removes the DOCTYPE, which breaks any entities being used, e.g. in this Adobe export file. After sanitizing the file and opening it directly in a browser, it produces errors like "Entity 'ns_extend' not defined".
before
<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 17.0.0, SVG Export Plug-In . SVG Version: 6.00 Build 0) -->
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd" [
<!ENTITY ns_extend "http://ns.adobe.com/Extensibility/1.0/">
<!ENTITY ns_ai "http://ns.adobe.com/AdobeIllustrator/10.0/">
<!ENTITY ns_graphs "http://ns.adobe.com/Graphs/1.0/">
<!ENTITY ns_vars "http://ns.adobe.com/Variables/1.0/">
<!ENTITY ns_imrep "http://ns.adobe.com/ImageReplacement/1.0/">
<!ENTITY ns_sfw "http://ns.adobe.com/SaveForWeb/1.0/">
<!ENTITY ns_custom "http://ns.adobe.com/GenericCustomNamespace/1.0/">
<!ENTITY ns_adobe_xpath "http://ns.adobe.com/XPath/1.0/">
]>
<svg version="1.1" id="Layer_1" xmlns:x="&ns_extend;" xmlns:i="&ns_ai;" xmlns:graph="&ns_graphs;"
xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px" width="32px" height="32px"
viewBox="0 0 32 32" enable-background="new 0 0 32 32" xml:space="preserve">
...
</svg>
after
<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 17.0.0, SVG Export Plug-In . SVG Version: 6.00 Build 0) -->
<svg version="1.1" id="Layer_1" xmlns:x="&ns_extend;" xmlns:i="&ns_ai;" xmlns:graph="&ns_graphs;"
xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px" width="32px" height="32px"
viewBox="0 0 32 32" enable-background="new 0 0 32 32" xml:space="preserve">
...
</svg>
From version 15.0 CDATA nodes are removed.
Example document:
<svg>
<defs>
<style type="text/css"><![CDATA[
.fil0 {fill:#FF0000}
]]></style>
</defs>
</svg>
Result from 15.0 (15.1, 15.2):
<svg>
<defs>
<style type="text/css"></style>
</defs>
</svg>
Suspicious node '#cdata-section'
Result before 15.0 (14.1):
<svg>
<defs>
<style type="text/css"><![CDATA[
.fil0 {fill:#00994E}
]]></style>
</defs>
</svg>
Can't find a way to add #cdata-section to safe nodes, as the list is hardcoded:
$safeNodes = [
'#text',
];
Since foreignObject is not allowed, the only places you can use HTML or MathML would be title or desc (at least in a text/html context). Since both of those are invisible, it doesn't seem compelling to allow usage of HTML or MathML elements at all?
Allowing HTML tag names makes it easier to find bypasses, especially when the output is used inline in text/html.
There are a lot of erroneous "Suspicious attribute 'href'" entries in the log ($xmlIssues).
This makes it difficult to use the function getXmlIssues() to check whether an SVG string was correct.
This is caused by $element->getAttribute('href') returning an empty string for elements that don't have an href attribute, in combination with the function isHrefSafeValue treating an empty href as unsafe.
Adding an empty check to the function isHrefSafeValue solves this problem.
//Allow empty URI.
if (empty($value)){
return true;
}
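As an illustration of the root cause described above, here is a small sketch using plain DOM calls. The helper collectHrefValues() is hypothetical; it only shows that hasAttribute() can distinguish a missing href from an empty one, which is an alternative to the empty check and not necessarily how the library resolved it.

```php
<?php
// Hypothetical helper: returns the href values of elements that really carry one.
function collectHrefValues(DOMDocument $doc): array
{
    $values = [];
    foreach ($doc->getElementsByTagName('*') as $element) {
        // getAttribute() returns '' for a missing attribute, so ask first.
        if ($element->hasAttribute('href')) {
            $values[] = $element->getAttribute('href');
        }
    }
    return $values;
}

$doc = new DOMDocument();
$doc->loadXML('<svg xmlns="http://www.w3.org/2000/svg"><a href="#x"><rect/></a></svg>');

// The <rect> has no href, yet getAttribute() still yields an empty string.
$missingHref = $doc->getElementsByTagName('rect')->item(0)->getAttribute('href');
$realHrefs = collectHrefValues($doc);
```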
This is really a strange scenario - however in the end it occasionally caused segmentation faults...
vendor/bin/phpunit --no-coverage --filter=/doctypeAndEntityAreRemoved/
PHPUnit 6.5.14 by Sebastian Bergmann and contributors.
[1] 63007 segmentation fault vendor/bin/phpunit --no-coverage --filter=/doctypeAndEntityAreRemoved/
PR #53 adds the corresponding entityTest.svg (used in test case doctypeAndEntityAreRemoved), which defines an XML entity using <!DOCTYPE fortiguard [ <!ENTITY lab "cool, text as an image">]>.
It turned out that the sequence of removing the doctype, followed by using \DOMXPath on the document, causes a segmentation fault (at least on PHP 7.2 and 7.4, using libxml2 2.9.10 (always) and 2.9.12 (occasionally)). This is the call stack:
Now that #31 has been merged, is there a chance a new tag could be created so I can update drupal/svg_sanitizer to pin composer.json to versions after the security fix?
Thanks in advance
Reference issue in drupal.org tracker https://www.drupal.org/project/svg_sanitizer/issues/3064351
Thanks again for this library!
We have exported an SVG with Adobe and the sanitizer does not like it. It gives the following errors:
There are sanitization issues with this SVG file:
Suspicious attribute 'space' in line 4
Suspicious attribute 'enable-background' in line 4
Generator is: Adobe Illustrator 19.1.0, SVG Export Plug-In . SVG Version: 6.00 Build 0)
Issue #63 is for the enable-background.
But the space attribute is something weird.
Relevant code of the SVG:
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" id="Layer_1" version="1.1" x="0px" y="0px" width="600px" height="600px" viewBox="0 0 600 600" xml:space="preserve">
If you sanitize this, the getXmlIssues() function will return the error above:
Suspicious attribute 'space' in line 4
Somehow the code strips the xml: prefix.
I did not find the problem or a solution for this.
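One possible explanation, offered here only as an assumption and not confirmed by the issue, is that the attribute's local name is compared instead of its qualified name. The sketch below uses plain DOM calls to show that the two differ for xml:space.

```php
<?php
// Load a root element carrying a namespaced attribute.
$doc = new DOMDocument();
$doc->loadXML('<svg xmlns="http://www.w3.org/2000/svg" xml:space="preserve"/>');

// Record the qualified name and the local name of every attribute.
$names = [];
foreach ($doc->documentElement->attributes as $attribute) {
    $names[] = [$attribute->nodeName, $attribute->localName];
}
```

For xml:space, nodeName keeps the prefix while localName is just "space", so a whitelist lookup on the local name would not match an entry stored as "xml:space".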
Version: 0.16.0
Given:
$oddSvg = <<<SVG
<svg>
<defs>
<style>
.\37 15ca94c-fc50-4dc6-8356-e55b8cb855fa { fill: #1d526b; }
</style>
</defs>
</svg>
SVG;
(new enshrined\svgSanitize\Sanitizer())->sanitize($oddSvg); // => ""
Expected: There should be no changes.
Actual: It returns an empty string.
Context:
Apparently some SVG generators use UUID class names. According to the CSS spec, class selectors can lead with escaped digits (TIL).
I'm not sure I'll have time to look into a solution, but I wanted to file this so others know. Ideally any tool that starts its CSS selectors with numbers should be thrown to the wolves.
As the title states, all ARIA attributes are stripped from the SVG. This shouldn't be the case and they should be preserved.
See https://css-tricks.com/accessible-svgs/ and http://rosel.li/wceu for info on this issue.
Initially reported at https://wordpress.org/support/topic/svg-sanitizing-strips-out-aria-attributes/
https://github.com/darylldoyle/svg-sanitizer/blob/master/src/svg-scanner.php
symfony/console package
Related: #43
"The XML declaration is only required if you are using a non-Unicode character encoding"
ref. https://oreillymedia.github.io/Using_SVG/extras/ch01-XML.html
Hello,
The SVG parser might be vulnerable to XXE.
This kind of construction:
https://github.com/cariagency/spip-logo-svg/blob/master/vendor/enshrined/svg-sanitize/src/Sanitizer.php#L231
Might be vulnerable to a race condition because libxml_disable_entity_loader() is not thread-safe, as described here:
http://legalhackers.com/advisories/zend-framework-XXE-vuln.txt
Using payload below to bypass XSS filter:
<?xml version="1.0" standalone="no"?>
<svg viewBox="0 0 100 100" xmlns="http://www.w3.org/2000/svg">
<a href="javascript	:alert(document.domain)">
<circle cx="0" cy="0" r="300"/>
</a>
</svg>
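The payload above relies on whitespace inside the scheme name. A minimal sketch of one way to normalise a value before checking its scheme follows; hasScriptScheme() is a hypothetical helper, not the library's actual fix, and it assumes that stripping ASCII control characters and spaces is an acceptable approximation of how browsers ignore tabs and newlines in URLs.

```php
<?php
// Hypothetical helper: detects a javascript: scheme even when it is
// obfuscated with tabs, newlines or other ASCII control characters.
function hasScriptScheme(string $href): bool
{
    // Remove every byte in the range 0x00-0x20 before comparing.
    $normalized = (string) preg_replace('/[\x00-\x20]+/', '', $href);

    // Case-insensitive match at the very start of the value.
    return stripos($normalized, 'javascript:') === 0;
}
```

A check like this would only cover the javascript: scheme; other schemes would need their own handling.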
Video POC: https://www.youtube.com/watch?v=MIAiX4gkp6U&feature=youtu.be
Just wanted to let you know, I just made a Drupal module that integrates with your svg-sanitizer. Thanks for the library!
We have no tests currently in place for removal of remote references or minification.
These should be added at some point soon, see: https://codeclimate.com/github/darylldoyle/svg-sanitizer/coverage/58ab95a234f76d000101501b
We have exported an SVG with Adobe and the sanitizer does not like it. It gives the following errors:
There are sanitization issues with this SVG file:
Suspicious attribute 'space' in line 4
Suspicious attribute 'enable-background' in line 4
Generator is: Adobe Illustrator 19.1.0, SVG Export Plug-In . SVG Version: 6.00 Build 0)
I will make a PR for the enable-background attribute.
The space attribute is something different, because it is in the list. It looks like XPath or something else is removing the xml: prefix when walking over the elements. I will open a separate issue for this.
While visiting the demo website, I noticed it wasn't using https. Same with the link in your README
<svg xmlns="https://www.w3.org/2000/svg" viewBox="0 0 100 100">
<path d="M30,1h40l29,29v40l-29,29h-40l-29-29v-40z" stroke="#000" fill="none"/>
<path d="M31,3h38l28,28v38l-28,28h-38l-28-28v-38z" fill="#a23"/>
<text x="50" y="68" font-size="48" fill="#FFF" text-anchor="middle"><![CDATA[410]]></text>
</svg>
Copy the above into the sanitiser and you will see the correct SVG. However, when loading on an actual webpage, having the W3C XML namespace pointing to https will cause the SVG to fail to load.
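The namespace is an identifier rather than a URL to fetch, so it has to match the http form exactly. A small sketch of such a check, using a hypothetical helper hasSvgNamespace() that is not part of the library:

```php
<?php
// Hypothetical helper: true only if the root element is in the SVG namespace.
function hasSvgNamespace(string $svg): bool
{
    $svgNamespace = 'http://www.w3.org/2000/svg';

    $doc = new DOMDocument();
    // LIBXML_NONET keeps the parser from touching the network.
    if (!@$doc->loadXML($svg, LIBXML_NONET) || $doc->documentElement === null) {
        return false;
    }

    return $doc->documentElement->namespaceURI === $svgNamespace;
}
```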
We’ve run into an issue with SVGs exported from Figma that utilise the filterUnits attribute.
<filter ... filterUnits="userSpaceOnUse">
Are we able to include this in the attributes whitelist, or is there a reason for its exclusion?
When uploading SVG file (via Safe SVG WordPress plugin) attributes for "blur" are removed from elements.
Original:
task-states.zip
Sanitized:
task-states-sanitized.zip
infinity-next/infinity-next#121 (comment)
http://www.w3.org/TR/SVG/types.html#DataTypeFuncIRI
Not sure how this could be fixed without castrating the SVG format.
CraftCMS (https://github.com/craftcms/cms) is violating the GPLv2 on your project svg-sanitizer.
It seems tag 0.15.0 addressed a security vulnerability, see corresponding advisory GHSA-fqx8-v33p-4qcc (CVE-2022-23638).
The corresponding commit at 17e12ba contains a new test case, tests/data/htmlTest.svg.
svg.svg in browser, mime-type image/svg+xml
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="-1 -1 2 2">
<!--><img src onerror=alert(1)><!-->
<?x ><img src onerror=alert(1)><?x?>
<p/><![CDATA[ ><img src onerror=alert(1)> ]]>
<font face=""/><![CDATA[ ><img src onerror=alert(1)> ]]>
</svg>
→ no problem since <img> is not an SVG element
→ not a vulnerability
svg.html in browser, mime-type text/html
<html>
<body>
<div>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="-1 -1 2 2">
<!--><img src onerror=alert(1)><!-->
<?x ><img src onerror=alert(1)><?x?>
<p/><![CDATA[ ><img src onerror=alert(1)> ]]>
<font face=""/><![CDATA[ ><img src onerror=alert(1)> ]]>
</svg>
</div>
</body>
</html>
→ valid concern, since HTML is used in inline SVG
→ scripts are executed in browser
→ cross-site scripting vulnerability
<img>) seems to be fine, see https://developer.mozilla.org/en-US/docs/Web/SVG/Element#cdata
and #comment nodes seems to be superfluous and leads to regressions like in #70
It seems that tag 0.16.0 did not actually fix a real security vulnerability, see corresponding advisory GHSA-xrqq-wqh4-5hg2 (CVE-2023-28426).
The provided new test SVG files in commit cce18bc still seem to be fine (encoded correctly) even when being processed with the previous version 0.15.4 of this library.
svg.svg in browser, mime-type image/svg+xml
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" xml:space="preserve">
<form action="javascript:alert('1')">
<input type="submit"></input>
</form>
<form action="javascript:alert('1')">
<input type="submit" onclick="javascript:alert('1')"/>
</form>
</svg>
→ the <form> tag does not look nice, but is without any functionality inside an SVG context
→ the nested <form> tag in a CDATA section is correctly encoded (this is what the security advisory is referring to)
→ not a vulnerability
svg.html in browser, mime-type text/html
<html>
<body>
<div>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" xml:space="preserve">
<form action="javascript:alert('1')">
<input type="submit"></input>
</form>
<form action="javascript:alert('1')">
<input type="submit" onclick="javascript:alert('1')"/>
</form>
</svg>
</div>
</body>
</html>
→ the <form> tag does not look nice, but is without any functionality inside an SVG context
→ the nested <form> tag in a CDATA section is correctly encoded (this is what the security advisory is referring to)
→ not a vulnerability
<form> tag) was handled in #74 and commit 355a65d - which was totally fine to be adjusted, but did not qualify as a security vulnerability
CDATA sections was handled in a previous fixes already, see issue #71
ezyang/htmlpurifier seems to be superfluous in this scenario
Happens when trying to sanitize the attached image.
The SVG validates as valid XML, but I'm not smart enough to know if it's poorly done SVG or a bug in SVG sanitization.
Looks like this commit is where the behavior changed as it worked fine before it: 504da82
Can the mask-type attribute be added with no vulns?
Hi Daryll, as discussed we still need some help from you:
Thank you!
`php ../svg-sanitizer/src/svg-scanner.php "../myTestSVG.svg"
Fatal error: Uncaught Error: Class 'enshrined\svgSanitize\data\XPath' not found in C:\svg-sanitizer\src\Sanitizer.php:214
Stack trace:
#0 C:\svg-sanitizer\src\svg-scanner.php(126): enshrined\svgSanitize\Sanitizer->sanitize('<?xml version="...')
#1 {main}
thrown in C:\svg-sanitizer\src\Sanitizer.php on line 214`
I've used different versions from version 11 to 13.3 and they all produce this error. I have also tried reinstalling the package via Composer but still receive the same error.
If you have an SVG file that begins with <?xml version="1.0" encoding="utf-8"?>, then Sanitizer will remove that when saving the clean XML elements back out to the XML file.
It happens in this line:
$clean = $this->xmlDocument->saveXML($this->xmlDocument->documentElement, LIBXML_NOEMPTYTAG);
The reason is that all of the XML header info (version, encoding, etc.) is saved on $this->xmlDocument. Changing that line to:
$clean = $this->xmlDocument->saveXML($this->xmlDocument, LIBXML_NOEMPTYTAG);
makes it behave as you'd expect. Just making sure that wasn't a conscious design decision before I submit a pull request.
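The difference can be reproduced outside the library with plain DOMDocument calls. This is only a sketch of the behaviour described; the variable names are illustrative, and passing null as the node is used here as a way to serialise the whole document.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(
    '<?xml version="1.0" encoding="utf-8"?>'
    . '<svg xmlns="http://www.w3.org/2000/svg"><path/></svg>'
);

// Serialising only the root element drops the XML declaration.
$rootOnly = $doc->saveXML($doc->documentElement, LIBXML_NOEMPTYTAG);

// Serialising the whole document keeps the declaration.
$wholeDocument = $doc->saveXML(null, LIBXML_NOEMPTYTAG);
```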
Using both WP 4.9.8 and 5.0.2 and Safe SVG version 1.8.1, I get the following error when dragging a SVG file into the Media window.
The afflicted file is at: http://test.richb-hanover.com/downloads/monguin.svg (because I cannot upload a .svg to github(!))
What other information could I collect for diagnosing this?
Hello! Glad someone finally took on the SVG mess on WP :)
I ran in to this issue though, causing the SVG image not to display:
Before
<clipPath id="SVGID_2_">
<use xlink:href="#SVGID_1_" overflow="visible"/>
</clipPath>
After (upload)
<clipPath id="SVGID_2_">
</clipPath>
Similar with the image element, though there only the xlink:href attribute is removed:
Before
<image overflow="visible" width="66" height="77" xlink:href="data:image/jpeg;base64,/9j/4AA...jbf8ADaP/2Q==" transform="matrix(0.48 0 0 0.48 521.2959 384.499)">
After
<image overflow="visible" width="95" height="77" transform="matrix(0.48 0 0 0.48 638.2959 384.499)">
Which makes me wonder whether xlink:href should be removed when it contains just an #id or a base64 data image.
SVGZ files can be handled by running the content through gzdecode() before sanitisation and then running it through gzencode() afterwards.
I'm fairly certain we can use 0 === mb_strpos($contents , "\x1f" . "\x8b" . "\x08") to check whether it's a gzipped string or not.
Simple, but should this be handled by the library or not?
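A rough sketch of the decode, sanitise, re-encode flow suggested above. Both function names are hypothetical, the sanitiser is passed in as a callable so the sketch does not depend on the library's API, and a byte-wise strncmp() is swapped in for mb_strpos() because the input is binary. It assumes the zlib extension is available.

```php
<?php
// Hypothetical helper: checks for the gzip magic bytes plus the deflate method byte.
function isGzipped(string $contents): bool
{
    return strncmp($contents, "\x1f\x8b\x08", 3) === 0;
}

// Hypothetical wrapper: $sanitize is any callable taking and returning an SVG string.
function sanitizeMaybeGzipped(string $contents, callable $sanitize): string
{
    if (!isGzipped($contents)) {
        return $sanitize($contents);
    }

    $decoded = @gzdecode($contents);
    if ($decoded === false) {
        throw new InvalidArgumentException('Input looks gzipped but could not be decoded.');
    }

    // Re-compress so an SVGZ input stays an SVGZ output.
    return gzencode($sanitize($decoded));
}
```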
When the attached file is uploaded via this plugin, the arrow on the vertical axis is rotated 90degrees - instead of pointing up it points to the right.
Original:
scheduling.zip
Sanitized:
scheduling-sanitized.zip
It's hard to get any useful diff, as the plugin also changes all whitespace in the file...
Hi @darylldoyle!
I hope everything is great.
We would like to use your library, however we are working on a non-open source software.
The current license (GPL v2) is blocking us from using the library.
Could you please consider changing the license to MIT or to Apache 2.0?
Or perhaps having a double license like GPL v2 + MIT or GPL v2 + Apache 2 would be fine for you?
In that case more software developers will be able to use the library.
To complete #10: the <use... tag is often used with relative ids: https://developer.mozilla.org/fr/docs/Web/SVG/Element/use
For example, to make stars where each part is a repetition of the main one.
There is a hook approach in DOMPurify that doesn't exist here: cure53/DOMPurify#233 (comment)
The cleaning should be done by cleaning xlink:href and href, which also seems to be done by cleanXlinkHrefs() and cleanHrefs(), doesn't it?
Since I don't think this is currently possible, it would be nice to be able to use setAllowedAttrs() to detect a starting pattern inside an href attribute, like data:image/*.
E.g. this gets flagged as a false positive:
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 735 70" width="735" height="70">
<defs>
<image width="735" height="70" id="img1" href="data:image/png;base64,iVBORw0KGgoAAAANSUhEUrPcSqAAAAABJRU5ErkJggg=="/>
</defs>
</svg>
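A sketch of what such a starting-pattern check could look like, independent of the library's API. The helper isBase64RasterDataUri() and the list of accepted image types are assumptions made for illustration; SVG data URIs are deliberately left out because they can themselves carry markup.

```php
<?php
// Hypothetical helper: accepts only base64-encoded raster images in data URIs.
function isBase64RasterDataUri(string $href): bool
{
    $pattern = '#^data:image/(?:png|gif|jpe?g|webp);base64,[A-Za-z0-9+/]+={0,2}$#';

    return preg_match($pattern, $href) === 1;
}
```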
Currently won't work on PHP 5.2 due to namespacing. Not sure what the pros/cons of dropping namespacing to support 5.2 could be.
Hello @darylldoyle thank you for this great library, it works really well.
I would like to use this library as a validation step when uploading an SVG. In general it works OK, but there are some "feature requests" which could probably improve the handling. The use case is that we want to decide whether or not to allow the upload of the SVG, not to clean up the uploaded SVG.
So we tried to sanitize and then check if there are any xmlIssues or if sanitize returns false. So far so good, but we stumbled here over some challenges:
xmlIssue here
It would be nice to enhance the doc block comments with the description of the parameters and return values.
Is there anything which is on the roadmap for future releases?
Thank you in advance.
Should <animate> tags be whitelisted? I'm not sure if you've chosen to strip them out specifically, but 'animate' is not in the allowed tags array. I've used the svg_allowed_tags filter provided by the WordPress plugin to add it for now.
PHP 8 comes out in a few days - from what I can tell, this library works except for a few deprecation warnings.
https://stitcher.io/blog/new-in-php-8
PR #45 resolves it.