Comments (29)
Hi, thanks for the detailed post. I was not aware of a CVE being assigned to this issue and I agree that this is not a security issue. The documentation in the README explains this situation in detail: https://github.com/cebe/markdown#security-considerations-
I have requested the CVE to be rejected.
from markdown.
I got the bug tag, never been more happy in my life lol
from markdown.
Thanks for the detailed report. As pointed out by @samdark this is due to the nature of markdown and not a bug. You may remove HTML support by creating your own Markdown flavor class, which does not render the HTML, but the safest solution is to use HTML Purifier or similar tools.
I added a section in the README about this: https://github.com/cebe/markdown/blob/master/README.md#security
from markdown.
Markdown doesn't ensure output is secure in any way by design. It is allowing HTML so you don't need to craft it like that, just use <script
directly.
from markdown.
@cebe This still does not explain why it mitigates simplistic attacks such as:
`<script>alert(1);</script>`
into:
<code><script>alert(1);</script></code>
If you are saying that the design of the parser allows this, then the simplistic attacks as stated would not work.
from markdown.
Please re-read markdown specification. It says explicitly that it allows any HTML by design and it's unsafe by definition to allow users to enter markdown w/o further escaping/cleanup.
from markdown.
It's not the job of the markdown parser to escape/cleanup output.
from markdown.
@samdark I think you should incorporate HTMLPurifier into the code itself if that is what would be needed to fix the XSS issues in it. That would probably save you guys a lot of headache along with people like me who find bugs in stuff that they use sometimes. Thank you for your input I'll wait for @cebe
from markdown.
Well, the rendering is technically correct according to the markdown spec. If you use multiple backticks you need to leave space before and after the code.
But it seems other parsers are doing it differently....
https://johnmacfarlane.net/babelmark2/?normalize=1&text=something+%60%60%60%3Cscript%3E%60%60%60
from markdown.
No problem. I also put in a request and just received this response from Mitre:
Thank you for your submission. CVE-2018-1000874 has been updated to a status of DISPUTED, with the rationale you have provided. These changes should be made public on cve.mitre.org within 1-2 hours.
from markdown.
FYI: The issue is not reported by snyk anymore.
from markdown.
I don't think escaping HTML is the job of markdown processing library.
from markdown.
@samdark then why does it escape it if the payload isnt crafted?
from markdown.
It's by design of markdown: https://daringfireball.net/projects/markdown/syntax#autoescape
from markdown.
That's expected. You should process result of markdown conversion with something like http://htmlpurifier.org/ if you want to allow users to enter text.
from markdown.
For example, see https://github.com/yiisoft-contrib/yiiframework.com/blob/master/widgets/views/comments.php#L44 and https://github.com/yiisoft-contrib/yiiframework.com/blob/a88510f17e6fdf225248b822022798baa679d78b/components/Formatter.php#L100
from markdown.
@samdark I shouldn't need to rely on a second library because one of them doesn't do its job properly
from markdown.
https://michelf.ca/blog/2010/markdown-and-xss/
from markdown.
The point of a parser is to render the data in a safe way. You're saying I should rely on a second library to render the data in a safe way, why should I have to do that when this parser is already here and should do that for me?
from markdown.
The point of a parser is to render the data in a safe way.
No since it's a markdown parser and markdown wasn't meant to be safe.
You're saying I should rely on a second library to render the data in a safe way
Yes.
why should I have to do that when this parser is already here and should do that for me?
This parser converts markdown to HTML. Markdown, by definition, allows any HTML and that is unsafe if you allow users to enter data you render. There are valid use cases for non-filtered HTML. For example, you can allow full markdown for admin only and that's huge flexibility. You can do things like this by embedding HTML and CSS into blog post.
from markdown.
@cebe I think security topic should be emphasized in readme pointing to HTMLPurifier.
from markdown.
Code tags are expected to display the raw text inside them, that means every HTML inside code tags is escaped properly. In your case above you had multiple code tags: ```
was converted to <code>`</code>
, so the script tag was not inside the code tag in that case.
from markdown.
So then if that is the case, there is a bug in your code with multiple backticks.
from markdown.
related to #99
from markdown.
Any idea when we will have the 1.2.1?
Thanks,
from markdown.
I just stumbled across CVE-2018-1000874 referencing this issue and wanted to leave a note with my thoughts. I disagree with the CVE-2018-1000874 being assigned as an XSS vulnerability in cebe/markdown
.
Dispute of Vulnerability CVE-2018-1000874
I agree with @samdark that it is not the responsibility of the markdown processor to perform any escaping, cleanup, or sanitization outside of anything that might be required by the markdown specification of each respective markdown flavor.
Such additional escaping, cleanup, and sanitization would be no different than expecting json_decode()
to strip out <script>
tags. Suppose I have stored a user's comments as {"commented_at":"2019-07-03 12:45:00","comment":"Payload: <script>alert(1);</script>"}
. Nobody would expect json_decode()
to prevent the XSS vulnerability that is exposed when I save unsanitized user payload and then render it out to the browser as HTML. The vulnerability lies not within the decoder/parser, but in the code that invokes the decoder/parser and then presents the result to the browser as HTML.
Asking for the markdown parser to also perform sanitization isn't necessarily an invalid request, but it is a feature request, not a vulnerability bug to fix. However, since good sanitization libraries already exist, are easy to integrate, and have no coupling with the act of parsing markdown, it seems unnecessary and unwise to build such functionality into the parser.
Backtick Parsing
I readily concede that the Traditional Markdown specifications do not unambiguously describe the case demonstrated in this issue with respect to multiple backticks, but I think the current behavior is incorrect, and I agree with @cebe's renaming of the issue from "Cross site scripting vulnerability" to "Inconsistent behavior of multiple backticks".
Traditional Markdown Specification
The relevant portion of the Traditional Markdown specification says:
To include a literal backtick character within a code span, you can use multiple backticks as the opening and closing delimiters...
It goes on to show an example that begins and ends with two backticks with a single backtick contained inside:
``There is a literal backtick (`) here.``
produces:
<p><code>There is a literal backtick (`) here.</code></p>
Interpretation
My interpretation is that "you can use multiple backticks" seems to allow any number N of backticks to be selected as the open/close delimiter and that you are not required to have a string of N-1 backticks contained within them for the N backticks to be considered open/close delimiters.
CommonMark Specification for Comparison
This interpretation is consistent with the CommonMark specification, which attempts to provide a "standard, unambiguous syntax specification for Markdown." From the Code spans section of the specification:
A backtick string is a string of one or more backtick characters (
`
) that is neither preceded nor followed by a backtick.A code span begins with a backtick string and ends with a backtick string of equal length. The contents of the code span are the characters between the two backtick strings...
Result of Interpretation
The result of this interpretation is that on the respective Payload lines, `
, ``
, ```
in the case demonstrated in this issue should be considered open/close delimiters so that:
Payload: `<script>alert(1);</script>`
Payload: ``<script>alert(1);</script>``
Payload: ```<script>alert(1);</script>```
produces:
<p>Payload: <code><script>alert(1);</script></code></p>
<p>Payload: <code><script>alert(1);</script></code></p>
<p>Payload: <code><script>alert(1);</script></code></p>
from markdown.
This is currently being reported by snyk It does mention it's "fixed" by way of the readme edit but I guess because these commits aren't published as a version yet it's still flagging it.
Might this get a push to a 1.2.1.1 / 1.2.2 so we can clear this up too?
from markdown.
I agree with the dispute, anyone coming here because of the CVE use a purification system before allowing anything from end users. They tend to do stupid stuff.
from markdown.
I have contacted snyk and asked them to remove the report.
from markdown.
Related Issues (20)
- Php 7 compatibility
- Inline HTML support with <details> and <summary> in Github flavored markdown HOT 6
- ListTrait looping through all lines HOT 2
- Code block with empty newline does not working HOT 2
- Potential XSS in link rendering HOT 2
- GFM: Line break using a backslash HOT 1
- Difference traditional and GitHub markdown HOT 3
- Can this be used to pretty print Markdown? HOT 4
- Definition list is not rendered HOT 2
- Get all headings for table of contents HOT 6
- Support for PHP source directory HOT 7
- Tags separated by spaces are rendered together HOT 1
- Table without body at the end of markdown not detected
- Confirm (modified) PSR2 coding standard HOT 3
- renderLink() breaks URLs with query params HOT 1
- h is singled out in parsing
- [FEATURE] Header links HOT 8
- Getting Fatal error Out of memory on ReflectionClass HOT 2
- Maintenance status? HOT 8
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from markdown.