Which characters actually need to be HTML-escaped for XSS defence?

Five characters matter. For content that lands in an HTML text node, escape `&` (to `&amp;`), `<` (to `&lt;`), and `>` (to `&gt;`); that alone prevents a string from opening a new tag or closing an existing one. For content that lands inside a quoted HTML attribute value, also escape `"` (to `&quot;`) and `'` (to `&#39;`) so the value cannot prematurely terminate the attribute. The HTML Entities tool has separate Text-safe and Attribute-safe scopes so you don't over-escape when you don't need to and don't under-escape when you do.
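A minimal sketch of the two scopes in TypeScript may make the difference concrete. The function name `escapeForHtml`, the `EscapeScope` type, and the table names are illustrative assumptions, not the tool's actual API.

```typescript
// Hypothetical names for illustration; not the tool's real exports.
type EscapeScope = "text" | "attribute";

// Characters that matter inside an HTML text node.
const TEXT_TABLE: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
};

// Inside a quoted attribute value, both quote flavours matter as well.
const ATTRIBUTE_TABLE: Record<string, string> = {
  ...TEXT_TABLE,
  '"': "&quot;",
  "'": "&#39;",
};

function escapeForHtml(input: string, scope: EscapeScope): string {
  const table = scope === "attribute" ? ATTRIBUTE_TABLE : TEXT_TABLE;
  const pattern = scope === "attribute" ? /[&<>"']/g : /[&<>]/g;
  // Each matched character is looked up in the table for the chosen scope.
  return input.replace(pattern, (ch) => table[ch] ?? ch);
}

// escapeForHtml('Tom & "Jerry" <3', "text")      -> 'Tom &amp; "Jerry" &lt;3'
// escapeForHtml('Tom & "Jerry" <3', "attribute") -> 'Tom &amp; &quot;Jerry&quot; &lt;3'
```

In text scope the quotes are left untouched, which keeps the source readable; in attribute scope they are escaped so the value cannot close the attribute early.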

When should I use the "All non-ASCII" scope?

Pick it when the HTML needs to survive a strict-ASCII pipeline: older XML tooling, email templates that still run through 7-bit transports, or diff-friendly pull requests where reviewers prefer to see `caf&#233;` instead of a raw `é`. Every codepoint above 0x7E becomes a decimal numeric entity (`&#NNNN;`). Astral plane characters such as 🔥 are emitted as a single numeric entity for their actual codepoint (`&#128293;`) rather than a surrogate pair, so round-tripping remains lossless.

Why does the decoder silently leave unknown entities like "&notAThing;" alone?

Permissive decoding matches how every real browser handles unknown entities — browsers never throw, they just render the literal. If your pipeline actually needs a loud error when an entity is unrecognised, use the strict decode mode from the transform layer (exposed to tests and programmatic callers). The UI decoder is permissive so that pasting legitimate HTML which happens to contain a stray `&foo;` doesn't produce a useless "invalid input" toast.

How are emoji and astral-plane characters handled?

Every emoji and supplementary-plane character goes through `String.fromCodePoint`, which correctly handles codepoints above 0xFFFF, unlike the older `String.fromCharCode`, which quietly truncates. Encoding an astral codepoint in ascii scope emits a single decimal entity for the real codepoint (e.g. `&#128293;` for 🔥). Decoding either the decimal form `&#128293;` or the hex form `&#x1F525;` yields the correct single character back.

Is the text I paste sent to your servers?

No. The entire encode/decode transform runs locally in your browser. The tool never uses DOMParser, `innerHTML`, or a fetch call — it walks the input character-by-character against a hand-maintained entity map and the `String.fromCodePoint` primitive. The Share button encodes state into the URL fragment (`#s=…`), and browsers do not transmit URL fragments to servers by design.

HTML Entities Encode / Decode

A live, client-side HTML entity encoder and decoder. Paste raw text, pick a scope, and get safe HTML out — or paste HTML and get the raw characters back. Everything runs in your browser; nothing leaves the tab.

Why a dedicated HTML entity tool

Cross-site scripting is still the single most common web vulnerability because developers misjudge which characters are dangerous in which context. A value that is perfectly safe inside a URL can be explosive inside an attribute; a value that is safe inside a text node can still break an attribute if you forget to escape the quote it lives in. The encoder here is opinionated about that distinction on purpose: the scope toggle forces you to state where the output will be embedded, and it only escapes the characters that matter for that context.

Meanwhile, every "online HTML entity tool" that shows up in search either runs your input through a server endpoint (risk), wraps it in ads (noise), or silently uses the browser's textarea.innerHTML trick for decoding (which explodes on anything but ASCII and sometimes executes attribute side-effects). This tool does none of that. The transform is a pure TypeScript function you can read, audit, and reuse from Node, and every character path is covered by a unit test.

Encode HTML entities

Three scopes, each tuned to a specific embedding context:

Text-safe (minimal). Escapes only &, <, and >. Use this when you are interpolating user text into an HTML text node — the body of a <p>, the content of a <div>, anywhere between opening and closing tags. Escaping the quote characters in this context is unnecessary and makes the source harder to read.
Attribute-safe. Escapes &, <, >, ", and '. Use this when the value goes inside a quoted attribute: title="...", alt='...', data-*="...". Both quote flavours get escaped so the output is safe regardless of which quote the attribute uses. The ' character becomes &#39; because not every ancient parser recognises the named &apos;.
All non-ASCII. Escapes the three text-critical characters plus every codepoint above 0x7E as a decimal numeric entity (&#NNNN;). Use this when your downstream pipeline is strict ASCII: email templates, XML tools that choke on UTF-8, pull-request diffs where reviewers prefer decomposed ASCII. Astral plane characters such as 🔥 become a single numeric entity for their real codepoint (&#128293;), not a surrogate pair.
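
As a rough illustration of the All non-ASCII scope, the sketch below walks the input by codepoint and emits decimal entities for anything above 0x7E. The function name `encodeNonAscii` is a hypothetical stand-in; the tool's real transform may be organised differently.

```typescript
// Hypothetical helper; a sketch of the "All non-ASCII" scope only.
function encodeNonAscii(input: string): string {
  let out = "";
  // for..of iterates a string by codepoint, so an astral character
  // arrives as one unit instead of two surrogate halves.
  for (const ch of input) {
    const cp = ch.codePointAt(0) as number;
    if (ch === "&") out += "&amp;";
    else if (ch === "<") out += "&lt;";
    else if (ch === ">") out += "&gt;";
    else if (cp > 0x7e) out += `&#${cp};`; // decimal numeric entity
    else out += ch;
  }
  return out;
}

// encodeNonAscii("café") -> "caf&#233;"
// encodeNonAscii("🔥")   -> "&#128293;"
```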

Decode HTML entities

Decoding is always permissive. The decoder recognises:

Named entities from a hand-maintained table of ~80 names that cover everything you meet in the wild: the core five (&amp; &lt; &gt; &quot; &apos;), &nbsp;, the common punctuation (&hellip; &ndash; &mdash; &ldquo; &rdquo; &laquo; &raquo; …), currency (&euro; &pound; &yen; &cent;), legal/symbols (&copy; &trade; &reg; &deg;), and the full Latin-1 accented alphabet (&Auml; &auml; &szlig; &Oslash; &ccedil; …). The full HTML5 reference has ~2,200 names, most of which are APL and maths glyphs nobody hand-writes; the tool ships the pragmatic subset.
Numeric entities in decimal (&#123;) and hex (&#xAB;); hex is case-insensitive, so &#X1f525; decodes too. Codepoints above 0x10FFFF are rejected as invalid; everything else round-trips through String.fromCodePoint, which correctly handles the astral plane.

Unknown named entities such as &notAThing; pass through unchanged by default, in keeping with browsers, which never raise an error on an unrecognised reference. Programmatic callers that need strict validation can set { strict: true } on the transform function; the UI sticks with permissive so legitimate HTML containing the odd stray ampersand sequence doesn't produce a useless error toast.
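
The permissive and strict behaviours described above could look roughly like the sketch below. Everything named here is an assumption for illustration: `decodeEntities`, the `DecodeOptions` shape, and the tiny `NAMED` table stand in for the tool's real transform and its ~80-name table. How the tool treats an out-of-range codepoint in permissive mode is also assumed; here it is left as literal text.

```typescript
// Tiny illustrative table; the tool's real table is not reproduced here.
const NAMED: Record<string, string> = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'",
  nbsp: "\u00a0", copy: "\u00a9",
};

interface DecodeOptions {
  strict?: boolean; // assumed option name, mirroring { strict: true }
}

function decodeEntities(input: string, options: DecodeOptions = {}): string {
  const pattern = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;
  return input.replace(pattern, (match: string, body: string) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const cp = isHex ? parseInt(body.slice(2), 16) : parseInt(body.slice(1), 10);
      if (cp > 0x10ffff) {
        // Outside the Unicode range: error in strict mode, literal otherwise.
        if (options.strict) throw new RangeError(`Invalid codepoint in ${match}`);
        return match;
      }
      return String.fromCodePoint(cp);
    }
    if (Object.prototype.hasOwnProperty.call(NAMED, body)) return NAMED[body];
    // Unknown name: error in strict mode, pass through otherwise.
    if (options.strict) throw new Error(`Unknown entity ${match}`);
    return match;
  });
}
```

Because the replacement runs in a single left-to-right pass, `&amp;lt;` decodes to `&lt;` and is not decoded a second time.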

The astral-plane gotcha

Most hand-rolled HTML decoders on the web use String.fromCharCode, which only covers the Basic Multilingual Plane (U+0000–U+FFFF). Feed it a codepoint above 0xFFFF (any emoji, many CJK extensions, ancient scripts) and it quietly truncates. This tool uses String.fromCodePoint everywhere and has a dedicated test covering 🔥 (U+1F525) in both decimal and hex forms. Surrogate-pair encoding is handled by reading the source via for..of (which iterates by codepoint, not code unit) so ascii scope emits &#128293; rather than two mangled surrogate halves.
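
The truncation is easy to reproduce. The short sketch below compares the two built-ins on U+1F525; the helper `codeUnits` and the constant names are illustrative only.

```typescript
// Helper (hypothetical name) that lists the UTF-16 code units of a string.
function codeUnits(s: string): number[] {
  const units: number[] = [];
  for (let i = 0; i < s.length; i++) units.push(s.charCodeAt(i));
  return units;
}

const FIRE = 0x1f525;

// fromCharCode keeps only the low 16 bits, producing U+F525 instead of U+1F525.
const viaCharCode = String.fromCharCode(FIRE);

// fromCodePoint produces the correct surrogate pair for the astral codepoint.
const viaCodePoint = String.fromCodePoint(FIRE);

// codeUnits(viaCharCode)  -> [0xF525]
// codeUnits(viaCodePoint) -> [0xD83D, 0xDD25]
// Array.from(viaCodePoint).length -> 1 (one codepoint, two code units)
```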

Why client-side matters

HTML entity tools are often reached for when something is broken — a production log line with a mysterious &amp;, a CMS export with mismatched escapes, an email template that renders gibberish. Those inputs often contain sensitive data by accident: customer names, internal URLs, auth tokens glued into error messages. A client-side tool with no network path eliminates that chain of custody entirely. The JSON-LD on this page declares "offers": { "price": "0" } because the tool genuinely ships at zero cost and zero server hops.

What's not supported (yet)

Full HTML5 entity reference. We ship the ~80 names that matter. The full 2,200-entry table is an opt-in we'll add when there's real demand.
Context-aware escaping inside <script> or <style>. HTML entity escaping doesn't work in those contexts — you need JavaScript string escaping or CSS escaping instead. See the JSON Escape tool for the former.
URL encoding. For %20 and friends, use the URL Encode tool.

Related tools

URL Encode / Decode — percent-encode and decode URLs and form bodies.
Base64 Encode / Decode — round-trip arbitrary bytes through ASCII.
JSON Escape / Unescape — escape strings for JSON string literals and source code.
