Sanitize HTML using a whitelist of allowed elements and attributes. The HTML is parsed with parse5, which implements the HTML5 parsing algorithm, so documents should be parsed the same way your browser parses them.
var sanitized = safeHtml("<div onclick=\"javascript:alert('Oh no!')\">Hello <script>alert('Whoops!')</script>World</div>");
// sanitized is now "<div>Hello World</div>"
Written by Thomas Parslow (almostobsolete.net and tomparslow.co.uk) as part of Active Inbox (activeinboxhq.com).
You might also want to check out sanitize-html, which has more features and has been around longer.
npm install --save safe-html
var safeHtml = require('safe-html');
var config = {
  allowedTags: ["div", "span", "b", "i", "a"],
  allowedAttributes: {
    'class': {
      allTags: true
    },
    'href': {
      allowedTags: ["a"],
      filter: function (value) {
        // Only let through http and https URLs
        if (/^https?:/i.exec(value)) {
          return value;
        }
      }
    }
  }
};
var sanitized = safeHtml("...potentially bad html...", config);
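As a minimal sketch, the `href` filter from the configuration above can be pulled out and exercised on its own, without the library. The name `httpOnlyFilter` is hypothetical and not part of safe-html; the sketch also assumes, as the config example implies, that a filter signals rejection by returning nothing.

```javascript
// Hypothetical standalone copy of the `href` filter shown in the config above.
// Assumption: returning undefined means "reject this attribute value".
function httpOnlyFilter(value) {
  // Keep the value only if it starts with "http:" or "https:" (case-insensitive).
  if (/^https?:/i.exec(value)) {
    return value;
  }
  // Falling through returns undefined, exactly as the original filter does.
}

// Try the filter against a few representative attribute values.
var samples = [
  "https://example.com/",
  "HTTP://example.com/",
  "javascript:alert(1)",
  "/relative/path"
];

samples.forEach(function (value) {
  console.log(JSON.stringify(value) + " -> " + String(httpOnlyFilter(value)));
});
```

Note that this pattern also rejects relative URLs such as `/relative/path`, so a whitelist that needs to keep site-internal links would need a broader check.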
WARNING: SECURITY IS HARD
I am not perfect and I make mistakes; you are not perfect and you make mistakes. If you're using this in anything security-critical, be cautious and think very carefully about what you're doing.
Fixed or improved something? Great! Send me a pull request through GitHub, or get in touch on Twitter (@almostobsolete) or by email at tom@almostobsolete.net.