What Is an HTML Encoder?
An HTML encoder converts characters that have a special meaning in HTML into HTML entities — safe text references that a browser renders as literal characters instead of interpreting as markup. An HTML decoder does the reverse, turning entities back into the characters they represent. This tool does both directions instantly, entirely in your browser, so you can encode HTML online without installing anything or sending your data to a server.
You reach for an HTML encoder whenever you need to display code, user comments, or arbitrary text inside a web page without the browser mistaking angle brackets and ampersands for real tags. Paste your text, click Encode, and copy the result — or paste entity-laden text and click Decode to get the original back.
What Are HTML Entities?
HTML entities are character references used in HTML documents to represent characters that either have special meaning in HTML syntax or can’t be easily typed on a standard keyboard. Every HTML entity begins with an ampersand (&) and ends with a semicolon (;).
The five characters that almost always need encoding are &amp; for the ampersand (&), &lt; for the less-than sign (<), &gt; for the greater-than sign (>), &quot; for double quotes ("), and &#39; (or &apos;) for single quotes ('). These are the reserved characters of HTML: the ones that, left raw, change how the parser reads your document.
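As a rough illustration of that five-character mapping, here is a minimal Python sketch. The names RESERVED and encode_reserved are made up for this example and are not part of this tool; in real code the standard library's html.escape does the same job.

```python
# Hypothetical helper for illustration only; prefer html.escape in real code.
RESERVED = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
})


def encode_reserved(text: str) -> str:
    """Replace each of the five reserved characters in a single pass."""
    return text.translate(RESERVED)


print(encode_reserved("<a title='Q&A'>"))
# prints: &lt;a title=&#39;Q&amp;A&#39;&gt;
```

Because str.translate visits each input character exactly once, the ampersands it introduces are never escaped a second time.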
Why Encode HTML?
Encoding HTML serves three main purposes in web development:
Security (XSS prevention). The most important reason is stopping Cross-Site Scripting (XSS) attacks. If user input containing HTML or JavaScript is rendered without encoding, an attacker can inject a <script> tag that runs in your visitors’ browsers. Encoding turns that payload into inert text: <script> becomes &lt;script&gt; and is displayed, not executed.
Correct rendering. Characters like < and > would otherwise be interpreted as the start of HTML tags. Encoding them guarantees they appear on screen exactly as written — essential for tutorials, documentation, and anything that shows code samples.
Special and non-ASCII characters. Symbols outside the basic ASCII range — em dashes, copyright and trademark signs, currency symbols, accented letters — can be represented with entities to ensure consistent rendering regardless of the page’s character encoding.
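To make the security point concrete, the sketch below wraps an untrusted comment in a paragraph after encoding it. The payload and the render_comment helper are assumptions made for this example; only html.escape comes from the Python standard library.

```python
import html


def render_comment(comment: str) -> str:
    """Wrap untrusted text in a paragraph, encoding it first (illustrative helper)."""
    return "<p>" + html.escape(comment) + "</p>"


payload = "<script>alert(1)</script>"
print(render_comment(payload))
# prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The browser receives the payload as text inside the paragraph, so it is shown to the reader rather than run.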
How to Use the HTML Encoder / Decoder
- Paste your text or HTML entities into the input area.
- Click Encode to convert special characters to HTML entities, or Decode to convert entities back to characters. Press Ctrl+Enter to run the primary action from the keyboard.
- Copy the result with the Copy button or Ctrl+Shift+C. Use Ctrl+L to clear and start over.
Named vs. Numeric vs. Hexadecimal Entities
The same character can be written three ways, and this tool decodes all of them:
- Named: a human-readable alias, e.g. &copy; for ©. Easiest to read, but only characters with an assigned name can use this form.
- Numeric (decimal): &# followed by the Unicode code point in base 10 and a semicolon, e.g. &#169; for ©.
- Numeric (hexadecimal): &#x followed by the code point in base 16 and a semicolon, e.g. &#xA9; for ©. Hex is common because Unicode charts list code points in hexadecimal.
Numeric references can encode any Unicode character — including emoji and rare symbols that have no named entity — which is why encoders often fall back to them.
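The three spellings can be generated and checked with a few lines of Python. This is a sketch, not how this tool is implemented: reference_forms is a hypothetical helper, and the name lookup uses the standard library's html.entities.codepoint2name table, which only covers the HTML 4 names.

```python
import html
from html.entities import codepoint2name


def reference_forms(char: str) -> dict:
    """Return the named (if any), decimal, and hex references for one character."""
    code = ord(char)
    name = codepoint2name.get(code)  # None when the HTML 4 table has no name
    return {
        "named": f"&{name};" if name else None,
        "decimal": f"&#{code};",
        "hex": f"&#x{code:X};",
    }


forms = reference_forms("©")
print(forms)
# prints: {'named': '&copy;', 'decimal': '&#169;', 'hex': '&#xA9;'}

# All three spellings decode to the same character.
assert all(html.unescape(ref) == "©" for ref in forms.values())
```

For a character without a name in that table, such as an emoji, the "named" entry is None and only the two numeric forms are available.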
Common HTML Entities Reference
| Character | Named Entity | Decimal | Hex | Description |
|---|---|---|---|---|
| & | &amp; | &#38; | &#x26; | Ampersand |
| < | &lt; | &#60; | &#x3C; | Less than |
| > | &gt; | &#62; | &#x3E; | Greater than |
| " | &quot; | &#34; | &#x22; | Double quote |
| ' | &apos; | &#39; | &#x27; | Single quote / apostrophe |
| (space) | &nbsp; | &#160; | &#xA0; | Non-breaking space |
| © | &copy; | &#169; | &#xA9; | Copyright |
| ® | &reg; | &#174; | &#xAE; | Registered trademark |
| ™ | &trade; | &#8482; | &#x2122; | Trademark |
| — | &mdash; | &#8212; | &#x2014; | Em dash |
| € | &euro; | &#8364; | &#x20AC; | Euro sign |
Encoding HTML Programmatically
Once you understand what the tool does, you can reproduce it in code. A few common approaches:
JavaScript (browser). Let the DOM escape for you:
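// Serializing a text node escapes &, < and > (plus non-breaking spaces) but not quotes, so use the result as element content, not inside attribute values.
const encode = (s) => { const d = document.createElement('div'); d.textContent = s; return d.innerHTML; };
// Decoding through a textarea keeps any markup in the input as text instead of parsing it into elements.
const decode = (s) => { const t = document.createElement('textarea'); t.innerHTML = s; return t.value; };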
For Node.js there’s no DOM, so use a maintained library such as he or html-entities, which ship the full named-entity table.
Python. The standard library has it built in:
import html
html.escape("<a href=\"x\">&") # '&lt;a href=&quot;x&quot;&gt;&amp;'
html.unescape("&lt;p&gt;") # '<p>'
PHP. Use htmlspecialchars() for the five reserved characters (on versions before PHP 8.1, pass ENT_QUOTES so single quotes are included), or htmlentities() to encode everything with a named equivalent; html_entity_decode() reverses either.
Prefer these battle-tested functions over hand-rolled string replacement — subtle mistakes in encoding order (encoding & last instead of first) are a classic source of bugs.
HTML Encoding vs. URL Encoding vs. Unicode Escaping
These three are easy to confuse because they all “escape” characters, but they target different contexts:
- HTML encoding makes text safe for display inside an HTML document. A space stays a space; < becomes &lt;.
- URL encoding (percent-encoding) makes text safe inside a URL or query string. A space becomes %20, and & becomes %26. Use the URL encoder/decoder for that.
- Unicode escaping (\uXXXX) makes characters safe inside source-code string literals: JSON, JavaScript, or Java. The Unicode escape tool handles that form.
Applying the wrong one — or applying two of them to the same string — produces garbled output, so pick the encoding that matches where the text will live.
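Running one string through all three schemes shows how different the results are. The sample text is an arbitrary choice for this sketch, and json.dumps stands in for Unicode escaping because it emits \uXXXX sequences for non-ASCII characters by default.

```python
import html
import json
from urllib.parse import quote

text = "Tom & Jerry <3 café"

html_form = html.escape(text)      # for an HTML document
url_form = quote(text, safe="")    # for a URL or query string
json_form = json.dumps(text)       # for a JSON/JavaScript string literal

print(html_form)   # Tom &amp; Jerry &lt;3 café
print(url_form)    # Tom%20%26%20Jerry%20%3C3%20caf%C3%A9
print(json_form)   # "Tom & Jerry <3 caf\u00e9"
```

Each scheme leaves alone the characters that the other two escape, which is why none of them can substitute for another.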
Common Errors and How to Fix Them
Double-encoding. If content is already encoded and you encode it again, &amp; becomes &amp;amp; and shows up on the page as a literal &amp;. When you see stray amp; in your output, decode once and stop re-encoding already-safe strings.
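Double-encoding is easy to reproduce with Python's html module. The sample string is arbitrary, and the variable names are only for this sketch.

```python
import html

once = html.escape("Q&A")     # 'Q&amp;A'
twice = html.escape(once)     # 'Q&amp;amp;A', which a browser displays as Q&amp;A

print(once)
print(twice)

# Decoding once removes exactly one layer of encoding.
assert html.unescape(twice) == once
assert html.unescape(once) == "Q&A"
```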
Missing semicolons. An entity without its trailing semicolon (&lt instead of &lt;) may or may not be recognized depending on the parser. Always terminate entities with a semicolon; if you inherit malformed input, decode it here and re-encode cleanly.
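Python's html.unescape, which follows the HTML5 rules for character references, is one parser where this inconsistency shows up. The inputs below are illustrative, and other decoders may behave differently.

```python
import html

print(html.unescape("&lt;b&gt;"))  # <b>
print(html.unescape("&lt"))        # <       (older name, accepted without the semicolon)
print(html.unescape("&euro"))      # &euro   (left as-is: this name needs the semicolon)
print(html.unescape("&euro;"))     # €
```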
Mojibake from wrong charset. A character like ’ rendering as â€™ usually means the page is served with the wrong charset: its UTF-8 bytes are being read as Windows-1252. Serve UTF-8 (<meta charset="utf-8">) and the underlying characters render without needing entities at all.
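The charset mismatch can be reproduced in Python by encoding text as UTF-8 and decoding the bytes as Windows-1252 (cp1252). That codec pair is an assumption about the most common cause; other mismatches produce different garbage.

```python
original = "It\u2019s"  # "It’s" with a curly apostrophe (U+2019)

# Read the UTF-8 bytes with the wrong codec, as a misconfigured page would.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)  # Itâ€™s

# Reversing the wrong decode recovers the original text.
assert garbled.encode("cp1252").decode("utf-8") == original
```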
Encoding the ampersand last. When rolling your own encoder, always replace & first; otherwise you re-escape the ampersands you just introduced. This tool handles ordering for you.
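The ordering problem is easy to see with two hand-rolled encoders that differ only in when they handle the ampersand. Both function names are hypothetical, and both cover only three characters to keep the sketch short.

```python
def encode_ampersand_first(text: str) -> str:
    """Correct order: escape & before any new ampersands are introduced."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def encode_ampersand_last(text: str) -> str:
    """Buggy order: the & inside &lt; and &gt; gets escaped a second time."""
    return text.replace("<", "&lt;").replace(">", "&gt;").replace("&", "&amp;")


print(encode_ampersand_first("<b>"))  # &lt;b&gt;
print(encode_ampersand_last("<b>"))   # &amp;lt;b&amp;gt;
```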
Best Practices for HTML Encoding
- Always encode user-generated content before rendering it in HTML — treat every external string as untrusted.
- Use your framework’s built-in encoding functions (template auto-escaping, htmlspecialchars, html.escape) rather than manual string replacement.
- Be aware of context-specific encoding: HTML body text, HTML attributes, JavaScript strings, and CSS values each require different escaping rules.
- Don’t double-encode; if content is already encoded, encoding it again will corrupt it.
- Layer a Content Security Policy (CSP) on top of encoding for defense in depth against XSS.
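To illustrate why context matters, the sketch below places the same input into an attribute value twice: once escaped only for body text, once with quotes escaped as well. The input string is a made-up example, and the sketch covers only the body and attribute contexts; JavaScript strings and CSS values need their own escapers.

```python
import html

user_input = 'x" onmouseover="alert(1)'

# quote=False escapes only &, < and >, which is enough for element content...
body_safe = html.escape(user_input, quote=False)
# ...but an attribute value also needs its quotes escaped (the default).
attr_safe = html.escape(user_input)

print(f'<input value="{body_safe}">')  # the raw " ends the attribute early
print(f'<input value="{attr_safe}">')  # the whole value stays inside the quotes
```

In the first line of output the attacker's onmouseover becomes a real attribute; in the second it remains part of the value.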