HTML entities are the escape syntax used to represent special characters in HTML source. When you want a literal less-than sign < in your rendered page instead of the start of a tag, you write &lt; in the source. When you want a copyright symbol, you can write &copy; (named), &#169; (decimal numeric), or &#xA9; (hex numeric); all three render as ©. This tool converts between plain text and any of those entity forms, in both directions.
Type or paste a string into the input, pick an action (encode or decode), and for encoding, pick a mode:
- Basic: escape only the five XML-unsafe characters (&, <, >, ", '). The minimum needed for safe insertion into HTML.
- Named: escape the XML-unsafe characters plus any other character that has a recognizable HTML named entity (©, €, —, “, etc.).
- All non-ASCII: escape every character above ASCII 127 as a decimal numeric entity. The result is pure 7-bit ASCII and will survive any encoding pipeline.
The decoder handles all three forms at once — named, decimal numeric, and hex numeric — and is lenient about a missing trailing semicolon, so messy real-world input decodes correctly. Unknown named entities are passed through unchanged so you can see exactly what the tool couldn’t handle.
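As an illustration of how a lenient, regex-based decoder of this kind can work, here is a minimal Python sketch. It is not the tool's implementation: the `NAMED` table, the `ENTITY_RE` pattern, and the `decode_entities` function are hypothetical names, and the table holds only a handful of entries for demonstration.

```python
import re

# Hypothetical curated table; the tool's actual list is not shown in this document.
NAMED = {
    "amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'",
    "copy": "\u00a9", "euro": "\u20ac", "mdash": "\u2014",
    "ldquo": "\u201c", "rdquo": "\u201d",
}

# One pattern covers hex numeric, decimal numeric, and named forms.
# The trailing semicolon is optional, which is what makes the decoder lenient.
ENTITY_RE = re.compile(r"&(?:#[xX]([0-9a-fA-F]+)|#([0-9]+)|([A-Za-z][A-Za-z0-9]*));?")


def decode_entities(text: str) -> str:
    def replace(match: re.Match) -> str:
        hex_digits, dec_digits, name = match.groups()
        if name is not None:
            # Unknown names are returned untouched so failures stay visible.
            return NAMED.get(name, match.group(0))
        code_point = int(hex_digits, 16) if hex_digits is not None else int(dec_digits)
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Out-of-range code points are also passed through unchanged.
            return match.group(0)

    return ENTITY_RE.sub(replace, text)
```

Calling `decode_entities("&copy 2024 &bogus;")` in this sketch decodes the semicolon-less `&copy` and leaves `&bogus;` as it was. One limitation of such a simple pattern: a semicolon-less name directly followed by letters or digits (for example `&euro100`) is read as one unknown name and passed through.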
When to use each encode mode
Basic mode is what you need when you’re inserting untrusted text into a text node or a quoted attribute value and want to prevent XSS. The five XML-unsafe characters are the only ones that can break out of those contexts into HTML or attribute syntax, so escaping just those is enough there. It is not enough for unquoted attribute values, URL attributes such as href, or the contents of script and style elements, which need context-specific handling. Basic escaping is what modern templating engines do by default (React’s {variable}, Astro’s {variable}, Vue’s {{ variable }}), and it’s what you should do if you’re ever constructing HTML strings by hand.
Named mode is for hand-edited HTML where readability matters. A &copy; in the source is easier to recognize than &#169;, and both render identically. Named mode escapes the XML-unsafe characters plus every other character that has a recognizable named entity in the tool’s curated list: copyright, trademark, currency symbols, smart quotes, em dashes, and so on. It produces more human-readable output than the numeric form at the cost of not covering every possible character.
All-non-ASCII mode is for transport safety. If you have a string containing characters outside the ASCII range and you need to push it through a system that might re-encode, misencode, or strip non-ASCII bytes (some older email systems, certain databases with wrong column encodings, legacy file formats), encoding everything above 127 as a numeric entity gives you a pure 7-bit ASCII representation that will always survive. The downside is verbosity — a string of emoji becomes a much longer string of numeric entities.
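The three modes can be sketched as a single character-by-character pass. The Python below is a minimal illustration under stated assumptions, not the tool's code: `BASIC`, `NAMED`, and `encode_entities` are hypothetical names, the named table is a tiny sample, the apostrophe is written as `&#39;` by choice, and the sketch assumes the five XML-unsafe characters are escaped in every mode.

```python
# Hypothetical tables; the tool's real entity list is not reproduced here.
BASIC = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;"}
NAMED = {
    "\u00a9": "&copy;", "\u20ac": "&euro;", "\u2014": "&mdash;",
    "\u201c": "&ldquo;", "\u201d": "&rdquo;", "\u2019": "&rsquo;",
}


def encode_entities(text: str, mode: str = "basic") -> str:
    """Encode text in 'basic', 'named', or 'all' mode."""
    out = []
    for ch in text:
        if ch in BASIC:
            # Assumption: the XML-unsafe characters are escaped in all modes.
            out.append(BASIC[ch])
        elif mode == "named" and ch in NAMED:
            out.append(NAMED[ch])
        elif mode == "all" and ord(ch) > 127:
            # Python iterates by code point, so an emoji yields one entity.
            out.append(f"&#{ord(ch)};")
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch, `encode_entities("\u00a9 2024", "named")` gives `&copy; 2024` and the same input in `"all"` mode gives `&#169; 2024`. In named mode, characters missing from the table are left as they are, which mirrors the coverage trade-off described above.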
Example: escaping user-submitted content
A user submits the comment <script>alert("hi")</script> — thanks!. You want to display it verbatim on the page without executing the script. Encode with basic mode:
&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt; — thanks!
This is safe to insert into an HTML page because none of the output can be interpreted as markup. The em dash is left alone because basic mode doesn’t touch characters that aren’t XML-unsafe; your HTML file’s UTF-8 encoding handles the dash natively.
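If you are doing the same escaping in code rather than through the tool, Python's standard library offers an equivalent of basic mode. The snippet is only an illustration: `html.escape` writes the apostrophe as `&#x27;`, which may differ from the entity this tool emits, and the variable names are made up for the example.

```python
import html

# The user-submitted comment from the example; \u2014 is the em dash.
comment = '<script>alert("hi")</script> \u2014 thanks!'

# quote=True (the default) also escapes double and single quotes,
# which matters when the text ends up inside an attribute value.
escaped = html.escape(comment, quote=True)
print(escaped)
```

The em dash passes through untouched, just as in basic mode, because `html.escape` only rewrites `&`, `<`, `>`, `"`, and `'`.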
Example: legacy system round-trip
You have a string with smart quotes and an em dash that needs to survive a legacy email pipeline. Encode with “all non-ASCII” mode:
input: “Don’t forget — it’s important”
output: &#8220;Don&#8217;t forget &#8212; it&#8217;s important&#8221;
In “all” mode, every smart quote, apostrophe, and em dash becomes a decimal numeric entity, so the double quotes become &#8220; and &#8221;. In “named” mode, they become &ldquo; and &rdquo; instead. The two outputs decode to the same string, but the numeric form is slightly more portable because it doesn’t depend on the receiving system knowing the named entity.
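The same kind of round trip can be reproduced with Python's standard library, which is a handy way to check what a legacy pipeline will receive. This is a sketch, not the tool's implementation: it relies on the built-in `xmlcharrefreplace` error handler, which only rewrites non-ASCII characters and does not escape `&`, `<`, or `>`.

```python
import html

# Smart quotes (\u201c, \u201d), apostrophe (\u2019), and em dash (\u2014).
original = "\u201cDon\u2019t forget \u2014 it\u2019s important\u201d"

# Characters that cannot be encoded as ASCII are replaced with decimal
# numeric character references; the result is a pure 7-bit ASCII string.
ascii_safe = original.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(ascii_safe)

# Decoding on the receiving side restores the original text.
restored = html.unescape(ascii_safe)
```

Because the handler leaves literal ampersands alone, input that already contains entity-like text should have `&` escaped first if an exact round trip matters.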
Example: decoding mixed input
A scraper gives you a string mixing all three entity forms:
input: Price: &euro;100 &mdash; see &#167; 3.2 at &#x2022; point 5
output: Price: €100 — see § 3.2 at • point 5
The decoder handles all three forms, named (&euro;, &mdash;), decimal numeric (&#167;), and hex numeric (&#x2022;), without needing mode hints. Unknown named entities are passed through so you can see what failed.
What this tool does not do
It does not parse HTML — the decoder is a simple regex-based entity processor, not a DOM parser. If you paste a full HTML document in, it’ll decode the entities in the text but won’t extract content or handle tags. For that, use a real HTML parser.
It does not include the full HTML5 named entity list (~2200 entries). The curated list covers common punctuation, symbols, and currency that appear in real-world content. If you need an obscure entity like &forall; or &beth;, use the “all non-ASCII” mode, which produces correct numeric entities for any Unicode character. For URL-safe character escaping instead, the URL encoder / decoder handles percent-encoding; for stripping tags entirely from an HTML snippet, the strip HTML tool does that pass.