What are HTML entities?
An HTML entity is a text string that represents a character in HTML. Entities exist because HTML uses certain characters as part of its own syntax — angle brackets (< >) delimit tags, ampersands (&) start entity references, and quote marks (") delimit attribute values. If you want to display these characters literally in a web page, you must use their entity equivalents so the browser doesn't interpret them as HTML.
Named vs numeric entities
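A character reference can be written in up to three forms:
- Named entity: a human-readable name, such as &amp;, &lt;, &copy;
- Numeric entity (decimal): the Unicode code point in decimal, such as &#38;, &#60;, &#169;
- Numeric entity (hexadecimal): the code point in hex, prefixed with x, such as &#x26;, &#x3C;, &#xA9;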
All three forms render identically in the browser. Named entities are more readable; numeric entities work for any Unicode character, whether it has a named alias or not.
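The relationship between a character and its numeric forms can be checked in a few lines. This is a minimal sketch: toDecimalEntity and toHexEntity are hypothetical helper names, not part of any library, and they only look at the first code point of the string they receive.

```javascript
// Hypothetical helpers: build the decimal and hexadecimal numeric entity
// for the first Unicode code point of a string.
function toDecimalEntity(char) {
  return `&#${char.codePointAt(0)};`;
}

function toHexEntity(char) {
  return `&#x${char.codePointAt(0).toString(16).toUpperCase()};`;
}

console.log(toDecimalEntity('©'), toHexEntity('©')); // &#169; &#xA9;
console.log(toDecimalEntity('<'), toHexEntity('<')); // &#60; &#x3C;
```

Because codePointAt reads a full code point rather than a single UTF-16 unit, the same helpers also work for characters outside the Basic Multilingual Plane, which have no named alias.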
The most important entities every developer should know
| Character | Named entity | Numeric entity | When you need it |
|---|---|---|---|
| & | &amp; | &#38; | Displaying a literal ampersand |
| < | &lt; | &#60; | Displaying a less-than sign |
| > | &gt; | &#62; | Displaying a greater-than sign |
| " | &quot; | &#34; | Double quote inside an attribute value |
| ' | &apos; | &#39; | Single quote inside an attribute value |
| (non-breaking space) | &nbsp; | &#160; | Non-breaking space (prevents line wrapping) |
| © | &copy; | &#169; | Copyright symbol |
| ® | &reg; | &#174; | Registered trademark |
XSS prevention: why encoding user input is critical
Cross-Site Scripting (XSS) is one of the most common web vulnerabilities. It occurs when an application includes user-supplied data in an HTML page without encoding it first. An attacker who can inject <script> tags into your page can execute arbitrary JavaScript in the victim's browser: stealing cookies, hijacking sessions, or redirecting users.
// Vulnerable — user input inserted directly into HTML
const comment = getUserInput(); // "Nice post <script>stealCookies()</script>"
element.innerHTML = comment; // XSS! The script executes.
// Safe — encode before inserting
function escapeHtml(str) {
  return str
    .replace(/&/g, '&amp;') // must run first so the entities added below are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}
element.innerHTML = escapeHtml(comment); // Renders as literal text, no execution

In modern frameworks (React, Vue, Angular), template expressions are HTML-escaped by default. The danger arises when you use escape hatches like dangerouslySetInnerHTML in React or v-html in Vue with untrusted content. Always encode user input, or sanitize it if some markup must be preserved, before passing it to these APIs.
HTML encoding vs URL encoding
These are two different encoding schemes for two different contexts:
- HTML encoding: converts characters to &name; or &#decimal; format. Used when inserting text into HTML markup.
- URL encoding: converts characters to %XX hexadecimal format. Used when inserting values into URLs and query strings.
// The same string, encoded differently for different contexts:
Input: 'Hello & "World" <2024>'
HTML encoded (for page content):
Hello &amp; &quot;World&quot; &lt;2024&gt;
URL encoded (for a query parameter):
Hello%20%26%20%22World%22%20%3C2024%3E

Using the wrong encoding for the context (URL-encoding content destined for HTML, or HTML-encoding values destined for a URL) results in garbled output or unexpected behavior.
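Some values pass through both contexts, for example a user-supplied search term that ends up in a link. The sketch below shows one way to layer the two encodings. It assumes a hypothetical /search endpoint with q and lang parameters, and buildSearchLink is an illustrative name rather than an established API. encodeURIComponent is the standard JavaScript function for URL-encoding a single query value.

```javascript
// Minimal HTML escaper for text content and quoted attribute values.
function escapeHtml(str) {
  return str
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: builds a link to an assumed /search endpoint.
function buildSearchLink(query) {
  // Step 1: URL-encode the value so it is safe inside the query string.
  const url = '/search?q=' + encodeURIComponent(query) + '&lang=en';
  // Step 2: HTML-encode the finished URL for the href attribute,
  // and HTML-encode the raw query for the visible link text.
  return '<a href="' + escapeHtml(url) + '">' + escapeHtml(query) + '</a>';
}

console.log(buildSearchLink('Hello & "World" <2024>'));
// <a href="/search?q=Hello%20%26%20%22World%22%20%3C2024%3E&amp;lang=en">Hello &amp; &quot;World&quot; &lt;2024&gt;</a>
```

The order matters: the query value is URL-encoded first because that is the innermost context, and the whole URL is then HTML-encoded because it sits inside an attribute. The & that separates the two parameters becomes &amp; in the markup, and the browser decodes it back to & before following the link.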
When browsers auto-encode vs when you must do it manually
Browsers handle escaping for you when you set element.textContent: the value is stored as a text node and never parsed as HTML, so it always renders as literal text. They do not when you use element.innerHTML: content assigned there is parsed as HTML. When generating HTML programmatically (templating, string concatenation, server-side rendering outside a framework), you are responsible for encoding all dynamic values.
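When HTML is assembled from strings, one way to make encoding the default is a tagged template that escapes every interpolated value. The following is a minimal sketch under that assumption. The html tag is a hypothetical helper, not a built-in, and it only covers text content and quoted attribute values, not script, style, or URL contexts.

```javascript
// Minimal HTML escaper; String() lets numbers and other values pass through.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical tagged template: literal parts are trusted and kept as written,
// interpolated values are treated as untrusted and escaped.
function html(strings, ...values) {
  return strings.reduce(
    (out, part, i) => out + part + (i < values.length ? escapeHtml(values[i]) : ''),
    ''
  );
}

const comment = 'Nice post <script>stealCookies()</script>';
const markup = html`<p class="comment">${comment}</p>`;
console.log(markup);
// <p class="comment">Nice post &lt;script&gt;stealCookies()&lt;/script&gt;</p>
```

The markup written by the developer stays intact while the dynamic value is rendered as literal text, which mirrors what framework template expressions do by default.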