HTML Encoding (Character Sets)

If the encoding is wrong, text may appear as garbled symbols such as �.
1. What Is Character Encoding?
Character encoding defines:
- How characters are stored as bytes
- How browsers read those bytes and display them as text
Example characters:
- English: A B C
- Hindi: नमस्ते
- Symbols: ₹ © →
- Emojis: 😀 🚀 ❤️
All these need a proper character set to display correctly.
2. UTF-8 (Recommended & Default)
What Is UTF-8?
- Most widely used encoding on the web
- Can represent every Unicode character, so it covers all languages
- Supports symbols & emojis
- Backward compatible with ASCII
UTF-8 is the standard for modern web development.
3. Setting Character Encoding in HTML (Mandatory)
Add <meta charset="UTF-8"> inside the <head> section.
Full Example
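A minimal sketch of a complete page, reusing the sample characters from section 1. The title and paragraph text are illustrative, and the file itself is assumed to be saved as UTF-8 by the editor.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Declare the encoding first, before any text content such as <title> -->
  <meta charset="UTF-8">
  <title>Encoding Demo</title>
</head>
<body>
  <p>English: A B C</p>
  <p>Hindi: नमस्ते</p>
  <p>Symbols: ₹ © →</p>
  <p>Emojis: 😀 🚀 ❤️</p>
</body>
</html>
```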
- All characters display correctly
4. What Happens If Encoding Is Missing?
Without Charset
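For comparison, a sketch of a similar page with the declaration left out. Whether the text actually breaks depends on the browser and on how the file is served: an HTTP Content-Type header that names a charset, or the browser's own detection, can still rescue it. Treat this as an illustration of the risk rather than a guaranteed failure. The page content is illustrative.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- No <meta charset> here: the browser has to guess the encoding -->
  <title>Missing Charset Demo</title>
</head>
<body>
  <p>Price: ₹500 ©</p>
</body>
</html>
```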
Output may look like this when UTF-8 bytes are decoded as Windows-1252: © becomes Â© and ₹ becomes â‚¹.
- Browser guesses encoding → wrong result
5. Common Character Sets
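| Encoding | Description | Recommendation |
|---|---|---|
| UTF-8 | Universal, modern; covers all of Unicode | Best choice |
| ASCII | 128 characters: basic English letters, digits, punctuation | Limited |
| ISO-8859-1 | Western European languages | Legacy |
| UTF-16 | Unicode, 2 or 4 bytes per character | Rare in HTML |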
- Always choose UTF-8
6. Encoding & Emojis (Important)
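Emojis are Unicode characters, so typing them directly into a page requires a Unicode encoding, which on the web means UTF-8.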
- Without UTF-8 → emojis may break
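A short sketch of two ways to put an emoji on a page: typed directly, which relies on the file being saved and declared as UTF-8, and written as a numeric character reference, which uses only ASCII characters in the source. The paragraph text is illustrative.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Emoji Demo</title>
</head>
<body>
  <!-- Typed directly: depends on the page being decoded as UTF-8 -->
  <p>Launch day 🚀</p>
  <!-- Numeric character references for U+1F600 (decimal) and U+1F680 (hex) -->
  <p>Hello &#128512; and liftoff &#x1F680;</p>
</body>
</html>
```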
7. Encoding & HTML Entities
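Even with UTF-8, entities such as &copy; (©) and &rarr; (→) are still valid, and &lt; and &amp; remain necessary for characters that have special meaning in markup.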
- UTF-8 + entities = maximum compatibility
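A fragment illustrating this: the same symbols written as literal UTF-8 characters and as entities, plus the entities that stay necessary for escaping markup. It is meant to sit inside the <body> of a page that already declares UTF-8, and the sentence text is illustrative.

```html
<!-- These two paragraphs render the same three symbols -->
<p>Literal: © → ₹</p>
<p>Entities: &copy; &rarr; &#8377;</p>

<!-- Escaping is still required for characters with special meaning in HTML -->
<p>Use &lt;p&gt; for paragraphs &amp; &lt;br&gt; for line breaks.</p>
```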
8. Encoding in External Files
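CSS
If a stylesheet contains non-ASCII characters, put @charset "UTF-8"; on its very first line, before any other rule.
JavaScript
JS files usually inherit the page's encoding automatically, but make sure your editor saves them as UTF-8.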
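A sketch of a page that loads external files. The file names styles.css and app.js are hypothetical, and both files are assumed to be saved as UTF-8.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>External Files Demo</title>
  <!-- styles.css (hypothetical name) would begin with: @charset "UTF-8"; -->
  <link rel="stylesheet" href="styles.css">
</head>
<body>
  <p>नमस्ते 😀</p>
  <!-- app.js (hypothetical name) is saved as UTF-8; no charset attribute is needed -->
  <script src="app.js"></script>
</body>
</html>
```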
Common Mistakes
- Forgetting <meta charset="UTF-8">
- Placing charset after content
- Using old encodings
- File saved in wrong encoding
Best practice: the charset declaration should be the first thing inside <head>. Browsers only look for it within the first 1024 bytes of the file.
Key Points to Remember
- Character encoding defines how text is displayed
- UTF-8 supports all languages & emojis
- <meta charset="UTF-8"> is mandatory
- Prevents broken or unreadable text
- Essential for international websites
