ASCII / Unicode Codepoint Lookup
Look up the ASCII/Unicode codepoint for any character, or go the other way and turn a list of codepoints back into text. Shows decimal, hex (U+XXXX), octal, binary, HTML entity, JS escape, and URL-encoded forms for every code point.
Result
- Codepoint H: U+0048 dec 72 0x48 0o110 0b1001000 HTML &#72; / &#x48; JS \u0048 URL %48
- Codepoint i: U+0069 dec 105 0x69 0o151 0b1101001 HTML &#105; / &#x69; JS \u0069 URL %69
- Codepoint !: U+0021 dec 33 0x21 0o41 0b100001 HTML &#33; / &#x21; JS \u0021 URL %21
- Codepoint (space): U+0020 dec 32 0x20 0o40 0b100000 HTML &#32; / &#x20; JS \u0020 URL %20
- Codepoint 👋: U+01F44B dec 128075 0x1F44B 0o372113 0b11111010001001011 HTML &#128075; / &#x1F44B; JS \u1F44B URL %1F44B
Step-by-step
- Iterate the input by Unicode code point (not UTF-16 code unit) so surrogate pairs collapse into one entry.
- For each code point, emit U+HEX, decimal, octal, binary, HTML numeric entity, JS \u escape, and the URL-percent byte.
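The two steps above can be sketched in a few lines of JavaScript. This is an illustrative sketch, not the tool's actual source: `describeCodePoints` and its field names are hypothetical. For code points above U+FFFF it emits the `\u{...}` escape form, since a `\u` escape followed by five hex digits would be read as a four-digit escape plus a literal character. The URL field is filled only for code points up to U+007F, where a single percent-encoded byte is valid.

```javascript
// Hypothetical helper (not the tool's own code): one row of notations per code point.
function describeCodePoints(text) {
  const rows = [];
  // for...of walks a string by code point, so a surrogate pair yields one entry.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const hex = cp.toString(16).toUpperCase();
    rows.push({
      char: ch,
      uPlus: "U+" + hex.padStart(4, "0"),
      dec: cp,
      hex: "0x" + hex,
      oct: "0o" + cp.toString(8),
      bin: "0b" + cp.toString(2),
      htmlDec: "&#" + cp + ";",
      htmlHex: "&#x" + hex + ";",
      // \uXXXX holds exactly four hex digits; larger values need \u{...}.
      js: cp <= 0xffff ? "\\u" + hex.padStart(4, "0") : "\\u{" + hex + "}",
      // Assumption: a single %XX byte is only emitted for ASCII; null otherwise.
      url: cp <= 0x7f ? "%" + hex.padStart(2, "0") : null,
    });
  }
  return rows;
}

// Usage: three rows for "Hi" plus the waving-hand emoji, although the string has four UTF-16 code units.
console.log(describeCodePoints("Hi\u{1F44B}"));
```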
How to use this calculator
- Switch the direction depending on whether you have text or codepoints.
- Paste any text; emoji and supplementary-plane characters (codepoint > U+FFFF) are handled by code point, not by UTF-16 code unit.
- When entering codepoints, mix and match notations: U+0041, 0x41, 65, &#65;, \u0041 are all accepted.
- Use the per-codepoint rows to grab the exact HTML entity or JS escape you need, with no further hand-conversion.
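As a rough illustration of the codepoints-to-text direction, the sketch below parses a list of mixed notations. The exact grammar the tool accepts is not documented here, so the patterns are assumptions: it handles `U+0041`, `0x41`, bare decimal, and `\u0041`, and additionally accepts HTML numeric entities and the `\u{...}` form. The names `parseCodePoint` and `codePointsToText` are hypothetical. Surrogate values are mapped to U+FFFD, mirroring the behaviour described under Limitations.

```javascript
// Assumed notation grammar: each entry is [pattern, radix of the captured digits].
const PATTERNS = [
  [/^U\+([0-9A-F]{1,6})$/i, 16],     // U+0041
  [/^0x([0-9A-F]{1,6})$/i, 16],      // 0x41
  [/^\\u([0-9A-F]{4})$/i, 16],       // \u0041
  [/^\\u\{([0-9A-F]{1,6})\}$/i, 16], // \u{1F44B}
  [/^&#x([0-9A-F]{1,6});$/i, 16],    // &#x41;
  [/^&#([0-9]{1,7});$/, 10],         // &#65;
  [/^([0-9]{1,7})$/, 10],            // 65 (bare digits are read as decimal)
];

// Hypothetical helper: turn one token into a code-point integer.
function parseCodePoint(token) {
  const t = token.trim();
  for (const [re, radix] of PATTERNS) {
    const m = re.exec(t);
    if (!m) continue;
    const cp = parseInt(m[1], radix);
    if (cp > 0x10ffff) throw new RangeError("Beyond U+10FFFF: " + token);
    // Surrogates are not scalar values; substitute the replacement character.
    if (cp >= 0xd800 && cp <= 0xdfff) return 0xfffd;
    return cp;
  }
  throw new SyntaxError("Unrecognised code point notation: " + token);
}

// Hypothetical helper: tokens may be separated by whitespace and/or commas.
function codePointsToText(input) {
  const cps = input.split(/[\s,]+/).filter(Boolean).map(parseCodePoint);
  return String.fromCodePoint(...cps);
}

console.log(codePointsToText("U+0048, 0x69 33")); // prints "Hi!"
```

Treating bare digits as decimal is a design choice: it keeps `65` meaning "A", at the cost of requiring a prefix for hex input.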
About this calculator
Every character on the modern web (Latin letters, CJK ideographs, emoji, math symbols) has a Unicode code point. This tool maps in both directions: type some text and read out the U+HEX code point for each character, or paste a list of code points (in any common notation) and reconstruct the text. It's the lookup table you reach for when you're writing a regex that needs to allow only certain script ranges, sanity-checking a copy-paste that arrived as mojibake, or building a font fallback test page. For each code point you get the decimal, hex, octal, binary, HTML numeric entity (both forms), JavaScript \u escape, and the single-byte URL-percent representation: the full set you'd otherwise have to assemble by pasting into several tools.
How it works: the formula
codepoint(ch) = ch.codePointAt(0)
text(cps) = String.fromCodePoint(...cps)
The Unicode standard maps every abstract character to a unique 21-bit integer (the code point) in the range U+0000 to U+10FFFF. JavaScript exposes the code point via String.prototype.codePointAt and the reverse via String.fromCodePoint; these correctly handle UTF-16 surrogate pairs so supplementary-plane characters (most emoji) are not split.
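A minimal sketch of both directions, contrasting code points with UTF-16 code units. The helper names `toCodePoints` and `fromCodePoints` are hypothetical. Note that `String.fromCodePoint` is called with spread arguments rather than passed directly to `Array.prototype.map`, because `map` also supplies the index and the array as extra arguments.

```javascript
// Hypothetical helpers illustrating the two formulas above.
function toCodePoints(text) {
  // Array.from iterates a string by code point, not by UTF-16 code unit.
  return Array.from(text, (ch) => ch.codePointAt(0));
}

function fromCodePoints(cps) {
  // Spread the array so each code point becomes one argument.
  return String.fromCodePoint(...cps);
}

const wave = "\u{1F44B}"; // waving hand, U+1F44B
console.log(wave.length);                       // 2 (UTF-16 code units)
console.log(wave.charCodeAt(0).toString(16));   // "d83d" (high surrogate only)
console.log(toCodePoints(wave));                // [ 128075 ]
console.log(fromCodePoints([72, 105, 33]));     // "Hi!"
```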
Worked examples
- Inputs:
- text = "A"
- Output:
- U+0041 dec 65 0x41
- Inputs:
- text = "中"
- Output:
- U+4E2D dec 20013 0x4E2D
- Inputs:
- text = "👋"
- Output:
- U+1F44B dec 128075 0x1F44B
Limitations
- Grapheme clusters (composed emoji, combining marks) are reported by their constituent code points, not as a single glyph. This matches the iterator behaviour of every modern JS engine.
- Codepoints in the surrogate range (U+D800–U+DFFF) are not scalar values and cannot be encoded; entering them returns the Unicode replacement character.
- URL %XX bytes shown are correct for code points ≤ U+007F only. For higher code points use proper UTF-8 percent-encoding (encodeURIComponent).
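To make the URL limitation concrete, the sketch below percent-encodes the UTF-8 bytes of a string and compares the result with `encodeURIComponent`. The helper `percentEncodeUtf8` is hypothetical; `TextEncoder` is assumed to be available as a global, as it is in current browsers and Node.js.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of the input.
function percentEncodeUtf8(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

console.log(percentEncodeUtf8("\u4E2D"));     // "%E4%B8%AD" (three bytes)
console.log(percentEncodeUtf8("\u{1F44B}"));  // "%F0%9F%91%8B" (four bytes)
console.log(encodeURIComponent("\u{1F44B}")); // "%F0%9F%91%8B"
console.log(encodeURIComponent("A"));         // "A" (unreserved characters are left as-is)
```

The two approaches agree for non-ASCII input; they differ only in that `encodeURIComponent` leaves unreserved ASCII characters unescaped.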
Every numeric value here is derived directly from the Unicode code-point integer; there is no rounding or transcoding loss.
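The grapheme-cluster limitation can be seen by grouping code points per user-perceived character. The sketch below uses `Intl.Segmenter`, which is assumed to be available (recent browsers and Node.js 16 or later); `graphemeBreakdown` is a hypothetical helper and not something this tool provides.

```javascript
// Hypothetical helper: list the code points that make up each grapheme cluster.
function graphemeBreakdown(text) {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), ({ segment }) => ({
    grapheme: segment,
    codePoints: Array.from(
      segment,
      (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
    ),
  }));
}

// "e" followed by a combining acute accent: one visible glyph, two code points.
const result = graphemeBreakdown("e\u0301");
console.log(result.length);        // 1
console.log(result[0].codePoints); // [ 'U+0065', 'U+0301' ]
```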