Technology Fundamentals

Character Encoding

Definition

Character encoding is a system that maps each character in a given set to something else, such as a natural number, a sequence of octets, or a pattern of electrical pulses, so that text can be stored in computers and transmitted over telecommunication networks.
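
As a minimal sketch of this mapping, the Python snippet below looks up the number (Unicode code point) and the octets (UTF-8 bytes) for a few characters. The helper name describe is hypothetical and exists only for this illustration; ord and str.encode are standard Python.

```python
def describe(ch: str) -> tuple[int, bytes]:
    """Return the Unicode code point and the UTF-8 octets for one character."""
    return ord(ch), ch.encode("utf-8")

# 'A', LATIN SMALL LETTER E WITH ACUTE, EURO SIGN
for ch in "A\u00e9\u20ac":
    code_point, octets = describe(ch)
    # Print only ASCII-safe output: the code point and the octets in hex.
    print(f"U+{code_point:04X} -> {octets.hex(' ')}")
```

Each character maps to exactly one code point, while the number of octets depends on the encoding chosen (here UTF-8).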

Why It Matters

Without an agreed character encoding, a computer has no way of knowing which characters the bytes of a text file represent. The encoding determines how those bytes are interpreted as the characters you see on screen.
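
To illustrate, the sketch below decodes the same five bytes under two different encodings. The byte values are an example chosen for this illustration (the UTF-8 encoding of the word "café"); the variable names are arbitrary.

```python
# The bytes below are the UTF-8 encoding of "café".
raw = bytes([0x63, 0x61, 0x66, 0xC3, 0xA9])

# Read as UTF-8, the last two bytes form one character (U+00E9).
as_utf8 = raw.decode("utf-8")

# Read as Latin-1, every byte is its own character, so the last two
# bytes become U+00C3 and U+00A9 instead.
as_latin1 = raw.decode("latin-1")

print(len(as_utf8), len(as_latin1))
```

The bytes never change; only the interpretation does, which is why a file's encoding has to be known or declared before its text can be displayed correctly.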

Contextual Example

UTF-8 is the most common character encoding on the web. It can represent every character in the Unicode standard, including letters and symbols from the world's writing systems as well as emoji, while remaining backward-compatible with ASCII: any valid ASCII text is also valid UTF-8, byte for byte.
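
A short sketch of both properties, using sample characters picked for this illustration: UTF-8 uses between one and four bytes per character, and plain ASCII text produces identical bytes under both encodings.

```python
# 'A', 'é' (U+00E9), '€' (U+20AC), and a grinning-face emoji (U+1F600)
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Number of bytes each character occupies in UTF-8.
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(utf8_lengths)  # [1, 2, 3, 4]

# ASCII-only text encodes to the same bytes in ASCII and in UTF-8.
ascii_text = "Hello, world"
ascii_compatible = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print(ascii_compatible)  # True
```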

Common Misunderstandings

  • Seeing garbled text (like "â€" instead of a dash) is often a sign of a character encoding mismatch, where text saved in one encoding (for example, UTF-8) is read using another (for example, Windows-1252).
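  • ASCII is an early 7-bit encoding with only 128 characters: unaccented English letters, digits, punctuation, and control codes. Unicode is not itself an encoding but a character set that assigns a number (code point) to each character. UTF-8 is one of several encodings of Unicode, alongside UTF-16 and UTF-32, and is the one most widely used today.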
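
Both points above can be reproduced in a few lines. The sketch below assumes the mismatch is UTF-8 text read as Windows-1252 (Python codec name cp1252), which is one common cause but not the only one; the variable names are arbitrary.

```python
dash = "\u2014"  # EM DASH, code point U+2014

# Mismatch: save as UTF-8, then read the bytes as Windows-1252.
utf8_bytes = dash.encode("utf-8")      # b'\xe2\x80\x94'
garbled = utf8_bytes.decode("cp1252")  # U+00E2, U+20AC, U+201D

# If no bytes were lost, reversing the two steps recovers the original.
repaired = garbled.encode("cp1252").decode("utf-8")

# One code point, several Unicode encodings with different byte sequences.
encodings = {name: dash.encode(name).hex()
             for name in ("utf-8", "utf-16-be", "utf-32-be")}
print(encodings)

# ASCII has no representation for this character at all.
try:
    dash.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```

The repair step only works when the wrong decoding happened to preserve every byte; in general it is safer to fix the declared encoding at the source than to patch text afterwards.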

Last Updated: December 17, 2025