sanitize-html provides a simple HTML sanitizer with a clear API.

Kei

Jun 24, 2024

sanitize-html is tolerant. It is well suited for cleaning up HTML fragments such as those created by CKEditor and other rich text editors. It is especially handy for removing unwanted CSS when copying and pasting from Word.

sanitize-html allows you to specify the tags you want to permit, and the permitted attributes for each of those tags. If an attribute is known to be non-boolean and its value is empty, it will be removed. For example, checked can be empty, but href cannot.
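The allow-list idea above can be sketched as a plain options object plus a lookup. The option names `allowedTags` and `allowedAttributes` are the ones sanitize-html uses; the helper `keepsAttribute` and the `NON_BOOLEAN_ATTRIBUTES` set are hypothetical and only illustrate the rule. This is not the library's own code.

```javascript
// Option names `allowedTags` and `allowedAttributes` come from sanitize-html.
// Everything else in this block is a hypothetical illustration.
const options = {
  allowedTags: ['b', 'i', 'a', 'input'],
  allowedAttributes: {
    a: ['href'],
    input: ['checked'],
  },
};

// Illustrative subset of attributes that must carry a value (assumption).
const NON_BOOLEAN_ATTRIBUTES = new Set(['href', 'src']);

// Decide whether an attribute would survive under the rules described above.
function keepsAttribute(opts, tag, name, value) {
  // A tag that is not permitted cannot keep any attributes.
  if (!opts.allowedTags.includes(tag)) return false;
  const allowed = opts.allowedAttributes[tag] || [];
  if (!allowed.includes(name)) return false;
  // An empty value is only acceptable for boolean-style attributes.
  if (value === '' && NON_BOOLEAN_ATTRIBUTES.has(name)) return false;
  return true;
}

console.log(keepsAttribute(options, 'a', 'href', 'https://example.com')); // true
console.log(keepsAttribute(options, 'a', 'href', ''));                     // false
console.log(keepsAttribute(options, 'input', 'checked', ''));              // true
console.log(keepsAttribute(options, 'a', 'onclick', 'x()'));               // false
```

With the real package, an object shaped like `options` would be passed as the second argument, as in `sanitizeHtml(dirty, options)`.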

If a tag is not permitted, the contents of the tag are not discarded. There are some exceptions to this, discussed below in the "Discarding the entire contents of a disallowed tag" section.

The syntax of poorly closed p and img elements is cleaned up.

href attributes are validated to ensure they only contain http, https, ftp and mailto URLs. Relative URLs are also allowed. The same rules apply to src attributes.
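To make the scheme check concrete, here is a minimal sketch of this kind of validation. The function `isAllowedUrl` and the constant `ALLOWED_SCHEMES` are hypothetical names chosen for illustration; the library's real implementation may differ in its details.

```javascript
// Schemes named in the text above. The constant name is hypothetical.
const ALLOWED_SCHEMES = ['http', 'https', 'ftp', 'mailto'];

// Hypothetical helper: returns true when a URL is relative or uses an allowed scheme.
function isAllowedUrl(value) {
  // Drop control characters and whitespace that could disguise a scheme.
  const cleaned = value.replace(/[\x00-\x20]+/g, '');
  // A scheme is a letter followed by letters, digits, '+', '-' or '.', then ':'.
  const match = cleaned.match(/^([a-zA-Z][a-zA-Z0-9.+-]*):/);
  if (!match) return true; // no scheme, so treat it as a relative URL
  return ALLOWED_SCHEMES.includes(match[1].toLowerCase());
}

console.log(isAllowedUrl('https://example.com/page')); // true
console.log(isAllowedUrl('/docs/page.html'));          // true
console.log(isAllowedUrl('javascript:alert(1)'));      // false
```

The important design choice is that the check is an allow-list: anything with a scheme that is not explicitly listed is rejected, rather than trying to enumerate dangerous schemes.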

You can also restrict the src of an iframe tag to particular URLs by filtering on hostname.
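Hostname filtering for iframes can be pictured roughly as follows. In sanitize-html the allow-list is supplied through the `allowedIframeHostnames` option; the helper `isAllowedIframeSrc` below is a hypothetical stand-in built on the standard `URL` class, and it assumes relative or malformed sources should be rejected.

```javascript
// Hypothetical helper: true only when src is an absolute URL whose hostname
// appears in the allow-list exactly.
function isAllowedIframeSrc(src, allowedHostnames) {
  let url;
  try {
    url = new URL(src);
  } catch (err) {
    return false; // relative or malformed sources are rejected in this sketch
  }
  return allowedHostnames.includes(url.hostname);
}

const hosts = ['www.youtube.com', 'player.vimeo.com'];
console.log(isAllowedIframeSrc('https://www.youtube.com/embed/abc', hosts));     // true
console.log(isAllowedIframeSrc('https://www.youtube.com.evil.example/', hosts)); // false
```

Comparing the parsed hostname exactly, instead of searching the raw string, is what stops look-alike hosts such as the second example from slipping through.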

HTML comments are not preserved. Additionally, sanitize-html escapes ALL text content - this means that ampersands, greater-than, and less-than signs are converted to their equivalent HTML character references (& --> &amp;, < --> &lt;, and so on). Additionally, in attribute values, quotation marks are escaped as well (" --> &quot;).
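The character replacements listed above can be written out as two small functions. The names `escapeText` and `escapeAttributeValue` are hypothetical, and the sketch assumes its input is raw, not yet escaped text (feeding it already-escaped text would escape the ampersands a second time).

```javascript
// Hypothetical helper: escape the three characters that are special in text content.
function escapeText(text) {
  return text
    .replace(/&/g, '&amp;') // ampersand first, so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: attribute values additionally escape double quotes.
function escapeAttributeValue(value) {
  return escapeText(value).replace(/"/g, '&quot;');
}

console.log(escapeText('Tom & Jerry <3'));             // Tom &amp; Jerry &lt;3
console.log(escapeAttributeValue('say "hi" & leave')); // say &quot;hi&quot; &amp; leave
```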

Requirements

sanitize-html is intended for use with Node.js and supports Node 10+. All of its npm dependencies are pure JavaScript. sanitize-html is built on the excellent htmlparser2 module.

Regarding TypeScript

sanitize-html is not written in TypeScript and there is no plan to directly support it. There is a community supported typing definition, @types/sanitize-html, however.

npm install -D @types/sanitize-html

If esModuleInterop=true is not set in your tsconfig.json file, you have to import it with:

import * as sanitizeHtml from 'sanitize-html';

When using TypeScript, there is a minimum supported version of >=4.5 because of a dependency on the htmlparser2 types.

Any questions or problems while using @types/sanitize-html should go to its maintainers, following that project's contribution guidelines.

How to use

Browser

Think first: why do you want to use it in the browser? Remember, servers must never trust browsers. You can't sanitize HTML for saving on the server anywhere else but on the server.

But, perhaps you'd like to display sanitized HTML immediately in the browser for preview. Or ask the browser to do the sanitization work on every page load. You can if you want to!

  • Install the package:

npm install sanitize-html

or

yarn add sanitize-html

The primary change in the 2.x version of sanitize-html is that it no longer includes a build that is ready for browser use. Developers are expected to include sanitize-html in their project builds (e.g., webpack) as they would any other dependency. So while sanitize-html is no longer ready to link to directly in HTML, developers can now more easily process it according to their needs.

    Written by Kei

    Co-founder of Glasp