
sanitize's People

Contributors

beyang, eduncan-sp, eduncan911, frankbille, jhvst, kennygrant, pawka, retailify, tabacco, xboston, yarcat

sanitize's Issues

Sanitize doesn't adequately protect HTML

This has the makings of a great sanitization library, and the code is clear and well written, but from a quick read-through it appears to have some vulnerabilities.

https://github.com/OWASP/CheatSheetSeries/blob/master/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.md

To quote the first cheat sheet: "Even if you use an HTML entity encoding method everywhere, you are still most likely vulnerable to XSS. You MUST use the escape syntax for the part of the HTML document you're putting untrusted data into."

It might be useful to develop a test suite based on this:
https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet

For example, escaping only < and > isn't enough. OWASP used to publish the following list of substitutions, but even this is no longer sufficient.

&  -> &amp;
<  -> &lt;
>  -> &gt;
"  -> &quot;
'  -> &#x27;
/  -> &#x2F;
\n -> <br>
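
As a minimal sketch of the list above, the substitutions can be written with the standard library's strings.NewReplacer. The name escapeBody is hypothetical and not part of sanitize. In line with the quoted cheat sheet, this would only be appropriate for untrusted data placed in HTML body text, not in attributes, URLs, scripts, or styles.

```go
package main

import (
	"fmt"
	"strings"
)

// owaspReplacer applies the seven substitutions from the list above.
// strings.Replacer scans the input once and never re-scans its own
// output, so the "&" in "&lt;" is not escaped a second time.
var owaspReplacer = strings.NewReplacer(
	"&", "&amp;",
	"<", "&lt;",
	">", "&gt;",
	`"`, "&quot;",
	"'", "&#x27;",
	"/", "&#x2F;",
	"\n", "<br>",
)

// escapeBody is a hypothetical helper (not part of sanitize) that
// escapes untrusted text for the HTML body context only.
func escapeBody(s string) string {
	return owaspReplacer.Replace(s)
}

func main() {
	fmt.Println(escapeBody(`<a href="/x">Tom & 'Jerry'</a>`))
}
```

The single-pass behaviour matters here: chaining several strings.ReplaceAll calls in the wrong order would double-escape the ampersands.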

Also have a look at how https://github.com/microcosm-cc/bluemonday does it.

This is another OWASP cheat sheet that might be valuable:

https://github.com/OWASP/CheatSheetSeries/blob/master/cheatsheets/Input_Validation_Cheat_Sheet.md

Multiple lines are joined after sanitizing.

Go code:

package main

import (
    "fmt"

    "github.com/kennygrant/sanitize"
)

func main() {
    content := `<p>LINE 1<br />
LINE 2<br />
LINE 3</p>`
    fmt.Println(sanitize.HTML(content))
}

This prints:

LINE 1LINE 2LINE 3

The newlines are missing. I can fix this myself, but I'd like to know whether you would merge my PR, since the latest commit is a year old.
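
To make the expected behaviour concrete, here is a standalone sketch that turns each <br> into a newline before removing the remaining tags. It is not the library's implementation and not the proposed fix: stripKeepingBreaks is a hypothetical name, and regular expressions are not a safe way to sanitize untrusted HTML. It only illustrates the output this report is asking for.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// brTag matches <br>, <br/> and <br /> in any case, plus trailing
	// whitespace, so a source newline after the tag is not doubled.
	brTag = regexp.MustCompile(`(?i)<br\s*/?>\s*`)
	// anyTag matches any remaining tag. Illustration only.
	anyTag = regexp.MustCompile(`<[^>]*>`)
)

// stripKeepingBreaks is a hypothetical helper: it converts line-break
// tags to "\n" and then drops every other tag.
func stripKeepingBreaks(s string) string {
	s = brTag.ReplaceAllString(s, "\n")
	s = anyTag.ReplaceAllString(s, "")
	return strings.TrimSpace(s)
}

func main() {
	content := `<p>LINE 1<br />
LINE 2<br />
LINE 3</p>`
	fmt.Println(stripKeepingBreaks(content))
}
```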

override defaults

A couple of defaults I'd like to be able to override:

  • lower case strings: I'd like to preserve mixed case
  • use of - separator: I'd like to use _
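
One possible shape for these overrides is an options struct whose zero value keeps the current defaults. The sketch below is an assumption about how that could look, not the library's API: NameOptions and cleanName are hypothetical names, and the cleaning rules are simplified to "keep letters and digits, collapse everything else into one separator".

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// NameOptions is a hypothetical options type. The zero value gives
// lower-case output with "-" as the separator.
type NameOptions struct {
	PreserveCase bool // keep mixed case instead of lower-casing
	Separator    rune // separator to use; 0 means '-'
}

// cleanName keeps letters and digits and collapses every other run of
// characters into a single separator, with none at either end.
func cleanName(s string, opt NameOptions) string {
	if opt.Separator == 0 {
		opt.Separator = '-'
	}
	if !opt.PreserveCase {
		s = strings.ToLower(s)
	}
	var b strings.Builder
	pendingSep := false
	for _, r := range s {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			// Emit a separator only between two kept runs.
			if pendingSep && b.Len() > 0 {
				b.WriteRune(opt.Separator)
			}
			pendingSep = false
			b.WriteRune(r)
		} else {
			pendingSep = true
		}
	}
	return b.String()
}

func main() {
	fmt.Println(cleanName("My File Name", NameOptions{}))
	fmt.Println(cleanName("My File Name", NameOptions{PreserveCase: true, Separator: '_'}))
}
```

Because the zero value reproduces the defaults, existing callers would not need to change.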

sanitize.HTMLAllowing() breaks when encountering a self-closing iframe tag

package main

import (
	"fmt"

	"github.com/kennygrant/sanitize"
)

func main() {
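	// input1 closes the iframe explicitly; input2 uses a self-closing iframe.
	input1 := `<iframe></iframe><script>alert('uh oh');</script><p>hello</p>`
	input2 := `<iframe /><script>alert('uh oh');</script><p>hello</p>`

	// Only <p> is allowed, so both inputs should reduce to <p>hello</p>.
	allowedTags := []string{"p"}

	output1, _ := sanitize.HTMLAllowing(input1, allowedTags)
	fmt.Println(output1) // <p>hello</p>

	// With the self-closing iframe, everything after it comes back escaped.
	output2, _ := sanitize.HTMLAllowing(input2, allowedTags)
	fmt.Println(output2) // &lt;script&gt;alert(&#39;uh oh&#39;);&lt;/script&gt;&lt;p&gt;hello&lt;/p&gt;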
}
