Illustration of text characters mapping to Unicode code points and UTF-8 hexadecimal byte sequences.

UTF-8 & Unicode

Code points and UTF-8 bytes

Enter text to see each Unicode scalar value and how it is encoded in UTF-8. Surrogate pairs in JavaScript strings are shown as a single code point where possible.

The output shows all UTF-8 bytes (hex), the total number of UTF-8 bytes, and a table with one row per character:

Char | Code point | UTF-8 bytes | Length
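
The per-character rows (Char, Code point, UTF-8 bytes, Length) could be produced roughly as follows. This is a minimal sketch, not the page's actual code: `describeText` and the row field names are hypothetical, and it assumes a runtime with the standard `TextEncoder` (modern browsers and Node.js). Iterating a string with `for...of` walks it by code point, so a valid surrogate pair yields one row. A lone surrogate is not a Unicode scalar value; it still gets its own row, but `TextEncoder` encodes it as U+FFFD (EF BF BD).

```javascript
// Hypothetical helper: one row per code point, with its UTF-8 bytes in hex.
function describeText(text) {
  const encoder = new TextEncoder(); // TextEncoder always produces UTF-8
  const rows = [];
  // for...of iterates by code point, so a surrogate pair arrives as one
  // two-unit string instead of two separate halves.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const bytes = Array.from(encoder.encode(ch));
    rows.push({
      char: ch,
      // U+ followed by at least four uppercase hex digits
      codePoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
      // Space-separated two-digit hex bytes
      utf8: bytes
        .map((b) => b.toString(16).toUpperCase().padStart(2, "0"))
        .join(" "),
      length: bytes.length,
    });
  }
  return rows;
}

// Total UTF-8 byte count for the whole input (sum of the row lengths).
function totalUtf8Bytes(text) {
  return describeText(text).reduce((sum, row) => sum + row.length, 0);
}
```

For example, `describeText("\u20AC")` (the euro sign) gives a single row with code point `U+20AC`, bytes `E2 82 AC`, and length 3.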

UTF-8 uses 1 byte for ASCII (U+0000–U+007F), 2 bytes for U+0080–U+07FF, 3 bytes for U+0800–U+FFFF, and 4 bytes for U+10000–U+10FFFF.
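
The four ranges above correspond directly to the bit layout of a UTF-8 sequence. The sketch below encodes one code point by hand to show where the byte counts come from. `encodeCodePoint` is a hypothetical name used for illustration only; in practice `TextEncoder` does this work. The 3-byte range includes the surrogate block U+D800–U+DFFF, which is not encodable in well-formed UTF-8, so the sketch rejects those values.

```javascript
// Hypothetical helper: encode a single Unicode scalar value as UTF-8 bytes.
function encodeCodePoint(cp) {
  const isSurrogate = cp >= 0xd800 && cp <= 0xdfff;
  if (!Number.isInteger(cp) || cp < 0 || cp > 0x10ffff || isSurrogate) {
    throw new RangeError("not a Unicode scalar value");
  }
  // 1 byte: 0xxxxxxx (U+0000–U+007F)
  if (cp <= 0x7f) return [cp];
  // 2 bytes: 110xxxxx 10xxxxxx (U+0080–U+07FF)
  if (cp <= 0x7ff) return [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)];
  // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (U+0800–U+FFFF)
  if (cp <= 0xffff) {
    return [0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f)];
  }
  // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (U+10000–U+10FFFF)
  return [
    0xf0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3f),
    0x80 | ((cp >> 6) & 0x3f),
    0x80 | (cp & 0x3f),
  ];
}
```

Each continuation byte carries 6 payload bits, which is why the range boundaries fall at 7, 11, 16, and 21 bits.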