Unicode typefaces 

Unicode typefaces (also known as UCS fonts and Unicode fonts) are typefaces that contain a wide range of characters (letters, digits, symbols, ideograms, logograms, and so on) drawn from many of the world's languages and scripts and mapped to the standard Universal Character Set (UCS). Unlike most conventional computer fonts, which are specific to a particular language or legacy character set and contain only a small subset of the UCS characters, these fonts attempt to include many thousands of glyphs, so that a single typeface can be used across multilingual documents.

The Unicode standard does not specify the typeface (a collection of graphical shapes called glyphs) itself. Instead, it defines each abstract character as a specific number (known as a code point) and also defines the required changes of shape depending on the context in which the glyph is used (e.g., combining characters, precomposed characters and letter-diacritic combinations). The choice of font, which governs how the abstract UCS characters are converted into a bitmap or vector output that can be viewed on a screen or printed, is left up to the user. If the chosen font does not contain a glyph for a code point used in the document, a question mark ("?"), a box, or some other substitute character is typically displayed.
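The difference between a precomposed character and a letter-diacritic combination can be illustrated with a short Python sketch that uses only the standard library's unicodedata module. The helper name code_points is an illustrative choice, not part of any standard; the example shows two code point sequences for the same visible letter and the normalization forms that convert between them.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
combining = "e\u0301"    # 'e' followed by COMBINING ACUTE ACCENT, two code points


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(precomposed))  # ['U+00E9']
print(code_points(combining))    # ['U+0065', 'U+0301']

# Normalization converts between the two spellings of the same character.
composed = unicodedata.normalize("NFC", combining)      # compose
decomposed = unicodedata.normalize("NFD", precomposed)  # decompose
print(composed == precomposed)   # True
print(decomposed == combining)   # True
```

Whether both spellings are drawn identically depends on the font and the rendering engine, which is why a font's support for combining marks matters as much as its list of precomposed glyphs.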

Currently (August 2008), no single "Unicode font" includes all the characters defined in the present revision of the ISO 10646 (Unicode) standard. In fact, it would be impossible to create such a font in any common font format, as Unicode includes over 100,000 characters, while no widely used font format supports more than 65,535 glyphs. So while one could make a set of related fonts to cover all of Unicode, a single Unicode font is not possible at this time.
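The arithmetic behind this limit can be sketched in a few lines of Python. The two constants are the figures quoted above (100,000 is only the stated lower bound), and the function name min_fonts_needed is hypothetical. The sketch assumes exactly one glyph per character, which is optimistic, since real fonts often need more glyphs than characters.

```python
import math

MAX_GLYPHS_PER_FONT = 65_535       # per-font glyph limit quoted in the text
CHARACTERS_LOWER_BOUND = 100_000   # "over 100,000 characters" in the text


def min_fonts_needed(characters, max_glyphs=MAX_GLYPHS_PER_FONT):
    """Smallest number of fonts that can hold `characters` glyphs,
    assuming one glyph per character (an optimistic simplification)."""
    return math.ceil(characters / max_glyphs)


print(min_fonts_needed(CHARACTERS_LOWER_BOUND))  # 2
```

Under these assumptions at least two related fonts are needed, which is consistent with the statement that a set of fonts, rather than a single font, is required.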

Many Unicode fonts are continually updated to incorporate characters which were previously omitted or which were added in a newer version of the standard. Additionally, fonts may be updated to correct errors in past versions.

The UCS has over 1.1 million code points, but only the first 65,536 (Plane 0, the Basic Multilingual Plane, or BMP) had entered into common use before 2000. (See the Mapping of Unicode characters article for more information on the other planes, including Plane 1: SMP, Plane 2: SIP, Plane 14: SSP, and Planes 15 and 16: reserved for the PUA.)
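The relationship between code points and planes can be made concrete with a small Python sketch. It relies on the fact that the code space runs from U+0000 to U+10FFFF in 17 planes of 65,536 code points each; the names plane_of and plane_name are illustrative, and the abbreviations are the ones used in the paragraph above.

```python
PLANE_SIZE = 0x10000        # 65,536 code points per plane
PLANE_COUNT = 17            # planes 0 through 16
MAX_CODE_POINT = 0x10FFFF   # last code point of plane 16

# Abbreviations as used in the text; other planes are left unnamed here.
PLANE_NAMES = {0: "BMP", 1: "SMP", 2: "SIP", 14: "SSP", 15: "PUA", 16: "PUA"}


def plane_of(code_point):
    """Return the plane number (0-16) that contains code_point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a UCS code point: {code_point:#x}")
    return code_point >> 16  # same as code_point // PLANE_SIZE


def plane_name(code_point):
    """Return the abbreviation for the code point's plane, or 'other'."""
    return PLANE_NAMES.get(plane_of(code_point), "other")


print(PLANE_SIZE * PLANE_COUNT)   # 1114112, the "over 1.1 million" figure
print(plane_of(0x9AA8))           # 0
print(plane_name(0x20000))        # SIP
```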

The first Unicode font (with a very large character set, supporting many Unicode blocks) was Lucida Sans Unicode, developed by Charles Bigelow and Kris Holmes in March 1993 (it shipped with Windows NT 3.1). The second was the Unihan font, developed by Ross Paterson in 1993. The third was the Everson Mono Unicode font, developed by Michael Everson and released in 1995.


Issues

There are typographical ambiguities in Unicode, so that some of the unified Han characters (seen in Chinese, Japanese, and Korean) are typographically different in different regions. For example, code point U+9AA8 (骨) is drawn differently in simplified Chinese and in traditional Chinese. This has implications for the idea that a single typeface can satisfy the needs of all locales.[1]

Application of Unicode typefaces

Despite these issues, Unicode is now the base character set for many new standards and protocols, and is built into the architecture of operating systems (Microsoft Windows, Apple Mac OS X, and many versions of Unix), programming languages (Ada, Perl, Python, Java, Common LISP, APL), libraries (IBM International Components for Unicode (ICU), along with the Pango, Graphite, Scribe, Uniscribe, and ATSUI rendering engines), and font formats (TrueType and OpenType). Many other standards are also being upgraded for Unicode compliance.

Utility software

Utility software can be used to see exactly which characters are included in a font file.
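As a rough sketch of what such a utility does, the following Python code reads a font's character map and prints the covered code points as ranges. It assumes the third-party fontTools package is installed (an assumption; it is not mentioned in this article), and the function names and the font path in the usage comment are hypothetical.

```python
def font_code_points(path):
    """Return the sorted code points that the font file at `path` maps to glyphs.

    Assumes the third-party fontTools package (pip install fonttools).
    """
    from fontTools.ttLib import TTFont  # imported lazily; third-party

    font = TTFont(path)
    try:
        cmap = font.getBestCmap() or {}  # {code point: glyph name}
        return sorted(cmap)
    finally:
        font.close()


def to_ranges(code_points):
    """Collapse code points into inclusive (first, last) runs."""
    ranges = []
    for cp in sorted(set(code_points)):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], cp)  # extend the current run
        else:
            ranges.append((cp, cp))           # start a new run
    return ranges


def format_ranges(ranges):
    """Render runs as 'U+XXXX' or 'U+XXXX-U+YYYY' strings."""
    return [
        f"U+{first:04X}" if first == last else f"U+{first:04X}-U+{last:04X}"
        for first, last in ranges
    ]


# Hypothetical usage (the file name is a placeholder):
#   print(format_ranges(to_ranges(font_code_points("SomeFont.ttf"))))
print(format_ranges(to_ranges([0x41, 0x42, 0x43, 0x61])))
# ['U+0041-U+0043', 'U+0061']
```

Grouping the result into runs keeps the output short for fonts that cover whole blocks, while isolated code points still appear individually.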

List of Unicode fonts

Of the many Unicode fonts available, the few listed below are the ones most commonly used around the world on mainstream computing platforms. More Unicode fonts can be found in the "Unicode fonts" section of the List of typefaces article.

List of Unicode Fonts
Font | Char(s) | Glyphs | Kernpairs | Version | Font Family | Font style | Font type | Serif style | License | Notes
Arial | 1,419 | 1,674 | 909 | 3.00 | Arial | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows.
Arial Unicode MS | 38,917 | 50,377 | 0 | 1.00 | Arial | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Office.
Bitstream Cyberbit | 32,910 | 29,934 | 935 | 2.0 beta | Bitstream Cyberbit | Roman | TTF | Cove | Freeware | For non-commercial use only.
Cardo | 2,879 | 2,882 | 216 | 0.098 (2004) | Cardo | Regular | TTF | Cove | Freeware | For non-commercial and non-profit uses only.
Caslon Roman | 3,684 | 3,686 | 0 | 001.000 16-12-2001 | Caslon | Roman | TTF | | BSD-like license |
Code2000 | 51,239 | 61,864 | 115 | 1.16 | Code2000 | Regular | TTF | Any | Shareware | Register after "reasonable" period (author's words).
Charis SIL | 1,958 | 3,084 | 0 | 4.002 | Charis SIL | Regular | TTF | Any | OFL |
Chryſanþi Unicode (Chrysanthi Unicode) | 4,818 | 4,383 | 0 | 3.1 | Chrysanthi | Regular | TTF | Cove | Freeware | Commercial use must be first approved by author.
ClearlyU | 9,538 | | 0 | 1.9 | | | | | Freeware |
DejaVu Sans | 5,223 | 5,427 | 2,558 | 2.18 | DejaVu | Book | OTF+TTO | Normal Sans | Bitstream Vera license and public domain for additions |
Doulos SIL | 1,958 | 3,083 | 0 | 4.014 | Doulos SIL | Regular | TTF | Any | OFL |
Everson Mono Unicode | 4,893 | 4,899 | 0 | 3.2b4 | Everson Mono | Regular | TTF | Any | Shareware | Monospaced width.
FreeSerif | 3,914 | 5,257 | 0 | 1.52 | FreeSerif | Medium | TTF | Cove | GPL | Sans serif (FreeSans) and monospaced (FreeMono) variants.
Gentium Regular | 1,469 | 1,699 | 2,857 | 1.0.2 (2005) | Gentium | Regular | TTF | Any | OFL |
GNU Unifont | 63,446 | 63,446 | 0 | 5.1.20080914 | Unifont | Medium | Bitmap | Any | GPL | Included more than 27,000 Hanzi glyphs from WenQuanYi Bitmap Song
Junicode | 2,235 | 2,256 | 0 | 0.6.12 | Junicode | Regular | TTF | Any | GPL |
Linux Libertine | 1,982 | 1,985 | 0 | 2.2.0 | Linux Libertine | Regular | OTF+TTO | Any | GPL, OFL |
Lucida Grande | 2,245 | 2,826 | 0 | 5.0d8e1 (Revision 1.002) | Lucida Grande | Regular | OTF | Normal Sans | Proprietary | Included with Mac OS X. Any proportion.
Lucida Sans Unicode | 1,765 | 1,776 | 0 | 2.00 | Lucida Sans | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows.
Microsoft Sans Serif | 2,301 | 2,257 | 0 | 1.41 | Microsoft Sans Serif | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows.
New Gulim | 46,567 | 49,284 | 0 | 3.10 | New Gulim | Regular | TTF | Obtuse Cove | Proprietary | Included with Microsoft Office 2000. Any Proportion.
Tahoma | 1,912 | 2,034 | 674 | 3.14 | Tahoma | Regular | OTF+TTO | Normal Sans | Proprietary | Included with Microsoft Windows.
Times New Roman | 2,790 | 3,380 | 867 | 5.01 | Times New Roman | Regular | OTF+TTO | Cove | Proprietary | Included with Microsoft Windows Vista.
TITUS Cyberbit Basic | 9,341 | 10,044 | 0 | 3.0 (2000) (Revision 4.00) | TITUS Cyberbit | Regular | TTF | Cove | Freeware |
WenQuanYi Bitmap Song | 41,270 | 131,980 | 0 | 0.9.9 | WenQuanYi Bitmap Song | Regular | Multi-strike Bitmap Font | Song(Serif) Style for Chinese | GPL | It has full coverage to GB18030 Hanzi at 11-16px font sizes
WenQuanYi Zen Hei | 34,930 | 36,974 | 0 | 0.8.34 | WenQuanYi Zen Hei and WenQuanYi Zen Hei Mono | Regular | TTC | Hei(Sans) Style for Chinese | GPL | Zen Hei and Zen Hei Mono co-exist in a single TTC file; also with embedded bitmaps
Y.OzFontN | 21,360 | 59,678 | 0 | 9.41 | Y.OzFontN | Regular | TTF | Any | Freeware | Sans-serif (for Japanese) and Monospace (for Latin).
Notes
† OTF+TTO: OpenType font with TrueType outlines.
‡ Kernpairs: OpenType fonts sometimes contain not a pair-by-pair kerning table but a class-based kerning table, in which groups of similarly shaped characters are treated as one kerning group. For example, V and W have nearly the same left and right geometry. So a value of "0" does not mean that kerning is unsupported.

Comparison of fonts

The tables below list, for each Unicode block (or range), the number of characters included in the versions of the fonts listed above. A row label such as "Basic Latin (128: 0000–007F)" means that the range called Basic Latin has 128 assigned code points, numbered 0000 to 007F in hexadecimal. Each cell then shows how many of those code points the font covers.

✓ = Most or some of the characters in that range are present in the font.
X = No characters are included in the font for that range or Unicode block.
- = Data not available.
Cells shaded red are the most complete of the fonts listed, though they may not give complete coverage.
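
The way a cell value is derived can be sketched in Python: given the set of code points a font covers, count how many fall inside each block's range, and report "X" when none do. The block boundaries are taken from the row labels in the first table; the names BLOCKS and block_coverage, and the sample font, are hypothetical.

```python
# Inclusive (first, last) code points, as given in the row labels.
BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "Latin-1 Supplement": (0x0080, 0x00FF),
    "Greek and Coptic": (0x0370, 0x03FF),
}


def block_coverage(code_points, blocks=BLOCKS):
    """Map each block name to the number of covered code points, or 'X' if none."""
    covered = set(code_points)
    result = {}
    for name, (first, last) in blocks.items():
        count = sum(1 for cp in covered if first <= cp <= last)
        result[name] = count if count else "X"
    return result


# A made-up font covering only U+0020-U+007E and U+00A0-U+00FF.
sample_font = list(range(0x20, 0x7F)) + list(range(0xA0, 0x100))
print(block_coverage(sample_font))
# {'Basic Latin': 95, 'Latin-1 Supplement': 96, 'Greek and Coptic': 'X'}
```

The values 95 and 96 in this made-up example coincide with the figures most fonts show for the first two rows, which would be consistent with those fonts omitting glyphs for the control codes, although the tables themselves do not state the reason.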

0000-077F

Unicode Fonts
Range | Arial | Arial Unicode MS | Bitstream Cyberbit | Cardo | Caslon Roman | Code2000 | Charis SIL | Chrysanthi Unicode | ClearlyU | DejaVu Fonts | Doulos SIL | Everson Mono | FreeSerif | Gentium | GNU Unifont | Junicode | Linux Libertine | Lucida Grande | Lucida Sans Unicode | Microsoft Sans Serif | New Gulim | Tahoma | Times New Roman | TITUS Cyberbit Basic | Y.OzFontN
Basic Latin (128: 0000–007F) 95 95 128 95 96 95 95 95 95 95 95 95 95 95 128 97 96 98 95 95 128 95 95 95 95
Latin-1 Supplement (128: 0080–00FF) 96 96 128 96 96 96 96 96 96 96 96 96 96 96 128 96 96 96 96 96 96 96 96 96 96
Latin Extended-A (128: 0100–017F) 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128
Latin Extended-B (208: 0180–024F) 28 148 208 52 178 208 194 188 178 208 194 183 173 178 156 179 194 183 119 179 7 29 28 183 28
IPA Extensions (96: 0250–02AF) 1 89 96 96 94 96 96 94 94 96 96 96 96 94 89 94 94 96 89 94 X 2 1 96 55
Spacing Modifier Letters (80: 02B0–02FF) 9 57 80 80 63 80 80 63 62 63 80 80 29 56 57 63 48 80 57 9 10 9 9 80 16
Combining Diacritical Marks (112: 0300–036F) 5 72 112 112 82 112 104 82 82 92 104 107 72 82 72 106 62 106 68 82 X 82 5 106 32
Greek and Coptic (144: 0370–03FF) 73 105 144 124 110 127 14 76 110 127 14 118 95 82 105 80 106 106 91 112 73 73 73 128 76
Cyrillic (256: 0400–04FF) 118 226 256 2 238 255 209 238 244 255 209 246 247 80 230 X 142 244 153 246 94 122 118 247 66
Cyrillic Supplement (48: 0500–052F) X X X X 16 20 16 X 16 20 16 16 16 1 X X X 16 X 16 X X X 16 X
Armenian (96: 0530–058F) X 85 X X 85 86 X 85 86 86 X 86 X X 85 X X X X X X X X 86 X
Hebrew (112: 0590–05FF) 52 82 47 86 83 86 X 60 82 54 X 82 44 X 82 X X 82 51 52 X 52 52 83 X
Arabic (256: 0600–06FF) 208 194 65 10 X 185 X 69 201 111 X 3 63 X 62 X X X X 208 X 206 208 185 X
Syriac (80: 0700–074F) X X X X X 50 X X X X X X X X X X X X X X X X X 76 X
Arabic Supplement (48: 0750–077F) X X X X X X X X X X X X X X X X X X X X X X X X X

0780-139F

Range | Arial | Arial Unicode MS | Bitstream Cyberbit | Cardo | Caslon Roman | Code2000 | Charis SIL | Chrysanthi Unicode | ClearlyU | DejaVu Fonts | Doulos SIL | Everson Mono | FreeSerif | Gentium | GNU Unifont | Junicode | Linux Libertine | Lucida Grande | Lucida Sans Unicode | Microsoft Sans Serif | New Gulim | Tahoma | Times New Roman | TITUS Cyberbit Basic | Y.OzFontN
Thaana (0780–07BF) X X X X X 50 X X 49 X X X 49 X X X X X X X X X X 50 X
N'ko (07C0–07FF) X X X X X X X X X 54 X X X X X X X X X X X X X X X
Devanagari (0900–097F) X 104 X X X 110 X 104 103 X X X 93 X 104 X X X X X X X X 106 X
Bengali (0980–09FF) X 89 X X X 91 X 89 X X X X 91 X X X X X X X X X X X X
Gurmukhi (0A00–0A7F) X 75 X X X 77 X X X X X X 73 X X X X X X X X X X X X
Gujarati (0A80–0AFF) X 78 X X X 83 X 78 X X X X X X X X X X X X X X X X X
Oriya (0B00–0B7F) X 79 X X X 81 X X X X X X X X X X X X X X X X X X X
Tamil (0B80–0BFF) X 61 X X X 71 X X X X X X 49 X X X X X X X X X X X X
Telugu (0C00–0C7F) X 80 X X X 80 X X X X X X 42 X X X X X X X X X X X X
Kannada (0C80–0CFF) X 80 X X X 86 X X X X X X X X X X X X X X X X X X X
Malayalam (0D00–0D7F) X 78 X X X 78 X X X X X X 79 X X X X X X X X X X X X
Sinhala (0D80–0DFF) X X X X X X X X X X X X X X X X X X X X X X X X X
Thai (0E00–0E7F) X 87 91 X 86 87 X X 87 1 X X 87 X 87 X X 87 X 87 X 87 X 87 X
Lao (0E80–0EFF) X 65 X X X 65 X X 65 65 X X X X 65 X X X X X X X X X X
Tibetan (0F00–0FFF) X 168 X X X X X 168 55 X X X X X 34 X X X X X X X X X X
Burmese (Myanmar) (1000–109F) X X X X X 78 X X X X X X X X X X X X X X X X X X X
Georgian (10A0–10FF) X 78 X 1 X 81 X X 78 83 X 80 X X 40 1 X X X X X X X 83 X
Hangul Jamo (1100–11FF) X 240 X X X 240 X X 240 X X X X X 67 X X X X X 250 X X X X
Ethiopic (Ge'ez) (1200–137F) X X X X X 356 X X 345 X X X 346 X 348 X X X X X X X X 364 X
Ethiopic Supplement (1380–139F) X X X X X 26 X X X X X X X X X X X X X X X X X X X

13A0-1DBF

Range | Arial | Arial Unicode MS | Bitstream Cyberbit | Cardo | Caslon Roman | Code2000 | Charis SIL | Chrysanthi Unicode | ClearlyU | DejaVu Fonts | Doulos SIL | Everson Mono | FreeSerif | Gentium | GNU Unifont | Junicode | Linux Libertine | Lucida Grande | Lucida Sans Unicode | Microsoft Sans Serif | New Gulim | Tahoma | Times New Roman | TITUS Cyberbit Basic | Y.OzFontN
Cherokee (13A0–13FF) X X X X 85 85 X X 85 X X 85 X X X X X X X X X X X X X
Unified Canadian Aboriginal Syllabics (1400–167F) X X X X X 630 X X 630 404 X 630 X X X X X X X X X X X X
Ogham (1680–169F) X X X X 29 29 X X 29 X X 29 X X 29 X X X X X X X X 32 X
Runic (16A0–16FF) X X X 81 81 81 X 83 81 X X 81 X X 81 81 X X X X X X X 81 X
Tagalog (Baybayin) (1700–171F) X X X X X X X X X X X X X X X X X X X X X X X X X
Hanunoo (1720–173F) X X X X X 2 X X X X X X X X X X X X X X X X X X X
Buhid (1740–175F) X X X X X 20 X X X X X X X X X X X X X X X X X X X
Tagbanwa (1760–177F) X X X X X X X X X X X X X X X X X X X X X X X X X
Khmer (1780–17FF) X X X X X 114 X X X X X X X X X X X X X X X X X X X
Mongolian (1800–18AF) X X X X X 155 X X X X X X X X X X X X X X X X X X X
Limbu (1900–194F) X X X X X 66 X X X X X X X X X X X X X X X X X X X
Tai Le (1950–197F) X X X X X X X X X X X X X X X X X X X X X X X X X
Tai Lue (1980–19DF) X X X X X X X X X X X X X X X X X X X X X X X X X
Khmer Symbols (19E0–19FF) X X X X X 32 X X X X X X X X X X X X X X X X X X X
Buginese (1A00–1A1F) X X X X X 30 X X X X X X X X X X X X X X X X X X X
Phonetic Extensions (1D00–1D7F) X X X 17 X 109 128 X X 105 128 107 14 X X 108 X 108 X X X X X 108 X
Phonetic Extensions Supplement (1D80–1DBF) X X X X X X 64 X X 38 64 107 5 X X X X X X X X X X X X

1DC0-257F

Range | Arial | Arial Unicode MS | Bitstream Cyberbit | Cardo | Caslon Roman | Code2000 | Charis SIL | Chrysanthi Unicode | ClearlyU | DejaVu Fonts | Doulos SIL | Everson Mono | FreeSerif | Gentium | GNU Unifont | Junicode | Linux Libertine | Lucida Grande | Lucida Sans Unicode | Microsoft Sans Serif | New Gulim | Tahoma | Times New Roman | TITUS Cyberbit Basic | Y.OzFontN
Combining Diacritical Marks Supplement (1DC0–1DFF) X X X 2 X 13 1 X X 6 1 X X X X X X X X X X X X X X
Latin Extended Additional (1E00–1EFF) 96 246 96 88 246 246 246 246 246 246 246 246 246 246 246 246 246 246 8 246 8 246 96 247 8
Greek Extended (1F00–1FFF) X 233 X 233 233 233 X 233 233 233 X 233 233 233 233 232 232 233 X 233 X 233 X 236 4
General Punctuation (2000–206F) 38 63 112 65 69 105 73 56 77 104 73 97 50 39 77 72 56 87 67 27 25 27 38 68 91
Superscripts and Subscripts (2070–209F) 1 28 48 9 28 29 34 31 28 34 34 29 25 28 28 29 36 29 28 1 6 1 1 29 29
Currency Symbols (20A0–20CF) 6 13 48 6 16 22 22 15 16 22 22 18 13 14 14 1 3 18 12 16 5 16 6 12 18
Combining Diacritical Marks for Symbols (20D0–20FF) X 18 48 X 20 27 X X 20 4 X 27 1 X 20 1 X 2 X X X X X X 27
Letterlike Symbols (2100–214F) 6 57 80 13 59 78 2 59 59 75 2 74 46 1 57 3 35 32 57 6 10 6 6 13 75
Number Forms (2150–218F) 6 48 64 4 49 50 49 49 49 50 49 49 45 4 48 49 50 49 4 4 26 4 6 28 49
Arrows (2190–21FF) 7 91 112 14 100 112 19 92 100 112 19 112 73 X 91 X 32 20 91 X 13 X 7 21 112
Mathematical Operators (2200–22FF) 17 242 256 24 242 256 17 242 242 245 17 256 219 2 242 13 45 18 242 14 43 14 17 80 256
Miscellaneous Technical (2300–23FF) 4 123 256 36 57 228 2 4 154 64 2 207 20 X 123 2 1 14 10 X 5 X 4 8 209
Control Pictures (2400–243F) X 37 64 X 39 39 X X 39 2 X 39 1 X 37 X X 1 37 X X X X X 4
Optical Character Recognition (2440–245F) X 11 32 X X 11 X X 10 X X 11 X X 11 X X X X X X X X X 11
Enclosed Alphanumerics (2460–24FF) X 139 160 X 73 160 X X 139 10 X 159 10 X 139 160 73 1 X X 82 X X 112 160
Box Drawing (2500–257F) 40 128 128 1 128 128 X 128 X X X 128 106 X 128 X X X 128 X 97 X 40 112 128

2580-2DDF

Arial Arial Unicode MS Bitstream Cyberbit Cardo Caslon Roman Code2000 Charis SIL Chrysanthi Unicode ClearlyU DejaVu Fonts Doulos SIL Everson Mono FreeSerif Gentium GNU Unifont Junicode Linux Libertine