How do I convert text to Unicode escape sequences?

Paste your text into the input area and click the 'Escape' button. Every character will be converted to its \uXXXX representation. You can then copy the result for use in your code.

What are Unicode code points?

A Unicode code point is a unique number assigned to each character in the Unicode standard. Code points are written as U+ followed by the number in hexadecimal (e.g., U+0041 for 'A'). As of Unicode 15, the standard covers over 149,000 characters from 161 modern and historic scripts.

How do I unescape Unicode sequences?

Paste your escaped text containing \uXXXX sequences into the input area and click the 'Unescape' button. The tool will convert all escape sequences back to their original characters.

Is my data safe when converting?

Yes. All processing happens entirely in your browser using JavaScript. Your data is never sent to any server.

Unicode Escape/Unescape

What is Unicode?

Unicode is a universal character encoding standard that aims to assign a unique code point to every character across all writing systems. Maintained by the Unicode Consortium, the standard (as of Unicode 15) defines over 149,000 characters covering 161 modern and historic scripts, as well as symbols, emoji, and control characters. Unicode replaced the fragmented set of legacy character encodings (like ASCII, Latin-1, Shift_JIS) with a single, unified system.

Unicode code points are written in the format U+XXXX, where XXXX stands for four to six hexadecimal digits. For example, the Latin letter “A” is U+0041, the Greek letter alpha is U+03B1, and the emoji grinning face is U+1F600. The full Unicode range spans from U+0000 to U+10FFFF.
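As a small illustration of the U+XXXX notation, the following Python sketch prints the code point of each character in a string. The helper name `code_points` is made up for this example and is not part of the tool.

```python
def code_points(text: str) -> list[str]:
    """Return a U+XXXX label for each character in text."""
    # ord() gives the code point as an integer; format it as
    # uppercase hexadecimal, zero-padded to at least four digits.
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points("A\u03b1\U0001F600"))  # ['U+0041', 'U+03B1', 'U+1F600']
```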

What is Unicode Escaping?

Unicode escaping is the process of converting characters into a text-based representation using their hexadecimal code points. The most common format is \uXXXX, used in JavaScript, Java, C#, and JSON. Because \uXXXX holds only four hexadecimal digits, characters outside the Basic Multilingual Plane (above U+FFFF) are written either as a surrogate pair of two \uXXXX escapes or, in languages that support it, with an extended syntax such as \u{1F600}.
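A minimal Python sketch of this conversion is shown here. It escapes every character, and splits code points above U+FFFF into a surrogate pair. The function name `escape_unicode` is an assumption for illustration; the tool's own implementation runs in JavaScript and may differ.

```python
def escape_unicode(text: str) -> str:
    """Convert every character to \\uXXXX, using surrogate pairs above U+FFFF."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            # Split the 20-bit offset into a high and a low surrogate.
            offset = cp - 0x10000
            high = 0xD800 + (offset >> 10)
            low = 0xDC00 + (offset & 0x3FF)
            out.append(f"\\u{high:04X}\\u{low:04X}")
        else:
            out.append(f"\\u{cp:04X}")
    return "".join(out)


print(escape_unicode("A\u03b1\U0001F600"))  # \u0041\u03B1\uD83D\uDE00
```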

Unicode escaping is needed when you want to include non-ASCII characters in source code files that use ASCII encoding, represent special characters in JSON strings, transmit Unicode data through systems that only support ASCII, or debug character encoding issues.

How to Use the Unicode Escape/Unescape Tool

1. Paste your text or escaped sequences into the input area
2. Click “Escape” to convert text to \uXXXX sequences, “Unescape” to convert back, or “Code Points” to view the Unicode code point for each character
3. Copy the result with the “Copy” button or Ctrl+Shift+C

Unicode Escape Formats Across Languages

Language	Escape Format	Example (for “A”)
JavaScript/JSON	`\u0041`	`\u0041`
Python	`\u0041` or `\U00000041`	`\u0041`
Java	`\u0041`	`\u0041`
C#	`\u0041`	`\u0041`
HTML	`&#65;` or `&#x41;`	`&#x41;`
CSS	`\0041`	`\0041`
Ruby	`\u0041` or `\u{41}`	`\u0041`
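
To make the differences between these formats concrete, here is a rough Python sketch that spells one character in several of the notations from the table. The helper `escape_formats` and its dictionary keys are hypothetical names chosen for this example, and it only accepts characters up to U+FFFF to keep the \uXXXX form valid.

```python
def escape_formats(ch: str) -> dict[str, str]:
    """Return several escape spellings for one BMP character (illustrative helper)."""
    if len(ch) != 1 or ord(ch) > 0xFFFF:
        raise ValueError("expected a single character in the range U+0000 to U+FFFF")
    cp = ord(ch)
    return {
        "\\uXXXX": f"\\u{cp:04X}",        # JavaScript/JSON, Java, C#, Python, Ruby
        "Python \\U": f"\\U{cp:08X}",     # eight-digit form
        "HTML decimal": f"&#{cp};",       # numeric character reference, decimal
        "HTML hex": f"&#x{cp:X};",        # numeric character reference, hexadecimal
        "CSS": f"\\{cp:04X}",             # backslash plus hex digits
        "Ruby braces": f"\\u{{{cp:X}}}",  # brace form
    }


for name, value in escape_formats("A").items():
    print(f"{name:14} {value}")
```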

Common Use Cases for Unicode Escaping

Internationalization (i18n): When building multilingual applications, Unicode escaping ensures that non-Latin characters in translation files and resource bundles are correctly preserved regardless of the file encoding.

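JSON Data: The JSON specification requires that quotation marks, backslashes, and control characters (U+0000 to U+001F) be escaped inside strings. \uXXXX escapes are also the standard way to include non-ASCII characters in JSON payloads that must stay pure ASCII, for example when UTF-8 encoding isn’t available.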
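
As one example of this behaviour, Python's standard `json` module escapes non-ASCII characters by default and can be told to emit them as-is. This sketch only demonstrates that one library; other JSON encoders may choose different defaults.

```python
import json

text = "caf\u00e9 \U0001F600"

# ensure_ascii=True is the default: non-ASCII characters become \uXXXX
# escapes, with a surrogate pair for the emoji.
ascii_json = json.dumps(text)
# ensure_ascii=False keeps the characters and relies on the transport encoding.
raw_json = json.dumps(text, ensure_ascii=False)

print(ascii_json)  # "caf\u00e9 \ud83d\ude00"
print(raw_json)

# Both forms decode to the same string.
assert json.loads(ascii_json) == json.loads(raw_json) == text
```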

Debugging Encoding Issues: When text appears garbled or contains unexpected characters, viewing the Unicode code points helps identify whether the issue is a wrong encoding, a missing font, or corrupted data.
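
A short Python sketch of this kind of inspection follows. It lists the code point and official name of each character, and uses UTF-8 text mistakenly decoded as Latin-1 as an assumed example of garbling. The `describe` helper is hypothetical.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """List 'U+XXXX NAME' for each character, to make hidden differences visible."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


original = "\u00e9"  # e with acute accent
# Simulate a wrong-encoding bug: UTF-8 bytes read back as Latin-1.
garbled = original.encode("utf-8").decode("latin-1")

print(describe(original))  # ['U+00E9 LATIN SMALL LETTER E WITH ACUTE']
print(describe(garbled))   # ['U+00C3 LATIN CAPITAL LETTER A WITH TILDE', 'U+00A9 COPYRIGHT SIGN']
```

Seeing two code points (U+00C3, U+00A9) where one was expected points to a decoding mismatch rather than a missing font.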

Source Code Portability: Escaping non-ASCII characters in source code ensures that the code works correctly even if the file is opened in an editor or system that doesn’t support UTF-8.

Understanding UTF-8, UTF-16, and Code Points

Unicode defines code points, but the actual byte representation depends on the encoding:

UTF-8 uses 1 to 4 bytes per character and is the dominant encoding on the web
UTF-16 uses 2 or 4 bytes per character and is used internally by JavaScript and Java
UTF-32 uses exactly 4 bytes per character, providing direct code point mapping
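
The byte counts above can be checked with a few lines of Python. This is only a sketch; the helper name `encoded_sizes` is invented for the example, and the little-endian codec variants are used so that no byte order mark is counted.

```python
def encoded_sizes(ch: str) -> dict[str, int]:
    """Return the number of bytes one character needs in each encoding."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


for ch in "A\u03b1\u20ac\U0001F600":
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
# U+0041 {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}
# U+03B1 {'utf-8': 2, 'utf-16-le': 2, 'utf-32-le': 4}
# U+20AC {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4}
# U+1F600 {'utf-8': 4, 'utf-16-le': 4, 'utf-32-le': 4}
```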

The \uXXXX escape format corresponds to UTF-16 code units. Characters in the Basic Multilingual Plane (U+0000 to U+FFFF) use a single \uXXXX escape, while characters above U+FFFF (like emoji) require a surrogate pair of two \uXXXX escapes.
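
Going the other way, an unescaper has to recombine surrogate pairs into a single code point. The Python sketch below shows one possible approach using a regular expression. The name `unescape_unicode` is an assumption for illustration, and a lone surrogate escape is simply passed through as a single code unit.

```python
import re

# Match a high+low surrogate pair first, otherwise any single \uXXXX escape.
_ESCAPE = re.compile(
    r"\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})"
    r"|\\u([0-9a-fA-F]{4})"
)


def _replace(match: re.Match) -> str:
    high, low, single = match.groups()
    if single is not None:
        return chr(int(single, 16))
    # Combine the pair: each surrogate carries 10 bits of the offset.
    cp = 0x10000 + ((int(high, 16) - 0xD800) << 10) + (int(low, 16) - 0xDC00)
    return chr(cp)


def unescape_unicode(text: str) -> str:
    """Replace \\uXXXX escapes with the characters they represent."""
    return _ESCAPE.sub(_replace, text)


print(unescape_unicode(r"\u0048\u0069 \uD83D\uDE00"))  # Hi followed by the grinning face emoji
```

For the pair \uD83D\uDE00 the arithmetic is 0x10000 + (0x3D << 10) + 0x200 = 0x1F600.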
