What is a typeface or complete set of characters?

From alternates to X-height, this list of typography terms and definitions covers just about everything you’d want to know about fonts and typography.

A font contains all the information needed to position and image the characters that it represents. How a computer operating system and an application program team up to use this information is covered in detail in Chapter 7. Here we're just concerned with what's inside a font and what it means to you as you set type.

The most important constituents of a font are the character outlines themselves. The entire collection of characters in a font is called its character set. For most alphanumeric fonts (that is, the ones used for text containing letters and numerals), character sets are standardized to a degree. Nearly all of these fonts share a basic set of characters, although they may contain optional extra characters as well. Figure 4.2 shows the core character set of a standard text font as well as some common variants used by various font vendors. Fonts based on Unicode (see the section on OpenType fonts on page 55) may contain additional characters beyond these basic collections.

Figure 4.2 At the top is the standard character set of a PostScript Type 1 font used by most vendors. Although such a font can nominally contain 256 characters, 33 "slots" in the font are taken up by commands such as backspace and delete, and 2 by the word space and nonbreaking space. Below it are the additions made to create the standard character sets for OpenType fonts from Adobe and Bitstream. Monotype uses the same character set as Adobe for its Basic OpenType fonts, with the exception of the characters noted at the bottom.

The character outlines in a font are size independent. Inside each font a width table lists the horizontal space allotted to each character, as measured in fractions of an em. Computer programs use these widths to calculate how to fill lines with type, adding up the cumulative widths of the characters on a line until the line is filled.
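The line-filling arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the widths in `WIDTHS` are invented values in thousandths of an em, not taken from any real font, and the function names are hypothetical.

```python
# Hypothetical width table: advance widths in thousandths of an em.
# These numbers are made up for illustration, not read from a real font.
WIDTHS = {"a": 500, "b": 550, "c": 450, " ": 250}
DEFAULT_WIDTH = 500  # assumed fallback for characters missing from the table


def text_width_pts(text, point_size):
    """Add up the em-relative widths and scale them to the point size."""
    units = sum(WIDTHS.get(ch, DEFAULT_WIDTH) for ch in text)
    return units * point_size / 1000


def fill_line(words, measure_pts, point_size):
    """Take words until the next one would overflow the line measure."""
    line = []
    for word in words:
        candidate = " ".join(line + [word])
        if line and text_width_pts(candidate, point_size) > measure_pts:
            break
        line.append(word)
    # Return the words that fit and the words left for the next line.
    return line, words[len(line):]
```

Because the widths are stored relative to the em, the same table serves every point size: doubling the point size doubles the computed width.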

A font may also contain tables for the widths of other members in its family. This is typically the case for the "regular," or roman text-weight, member of a family. These tables enable a computer program to compose type for all four members of a family—regular, italic, bold, and bold italic—using only the regular font. The computer's operating system, using the widths of the other family members, can synthesize false italics, bolds, and bold italics for onscreen display, relying on width tables in the regular font for getting the spacing and positioning right. The typesetting program, which relies only on the character widths, follows suit and can make appropriate decisions about how much text will fit on a line and how lines should be broken. When it comes time to print, all the necessary fonts will have to be present, as their outlines will be needed to image the type (see Figure 4.3). But to simply compose the type onscreen, only the regular-weight font is needed. The relationship between application and operating system is detailed in Chapter 7.

Figure 4.3 In this illustration, the top four lines of screen type were generated from their actual fonts. The computer generated the second set of four lines by interpolating the outlines of the plain roman font. You can see that the "italics" are simply obliqued roman characters.

The high-resolution lines at the bottom show what you get if you try to print the two samples. With all the fonts available, printing proceeds normally. But without the outlines for the other three members of the family, the printer uses the plain roman font for all four lines.

A font also contains a kerning table, which lists specific letter pairs and how the typesetting program should adjust the spacing between them. Kerning adjustments are also expressed in fractions of an em, which enables them to function at any point size. For more information about kerning, see Chapter 11.
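As a rough sketch of how a program might consult such a table, the Python below adds pair adjustments to the plain character widths. The widths and kerning values are invented for illustration (in thousandths of an em), and the names `WIDTHS`, `KERNING`, and `kerned_width_pts` are hypothetical.

```python
# Hypothetical advance widths and kerning pairs, in thousandths of an em.
WIDTHS = {"A": 700, "V": 700, "o": 500}
KERNING = {("A", "V"): -80, ("V", "A"): -80, ("V", "o"): -40}


def kerned_width_pts(text, point_size):
    """Sum the character widths, then apply each adjacent pair's adjustment."""
    units = sum(WIDTHS[ch] for ch in text)
    # zip(text, text[1:]) walks over every adjacent letter pair.
    units += sum(KERNING.get(pair, 0) for pair in zip(text, text[1:]))
    return units * point_size / 1000
```

Pairs that are not listed in the table simply receive no adjustment, which is why only troublesome combinations need an entry.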

Font Formats

Ultimately, what's inside a font depends on its format. The word format has two meanings in computer type. First, it can refer to the platform for which the font was designed. For example, two fonts with the same data for the same typeface may have different file formats depending on whether they're designed for use on an Apple Macintosh or a Windows PC. Until the development of the OpenType font format, fonts were created to meet the data-structuring needs of one platform or the other, and a font designed for one machine would not work on the other. A single OpenType font file will work on either a Mac or a PC.

Another kind of font format reflects how the typographic information itself is described and organized. The three leading font formats today are PostScript, TrueType, and OpenType.

PostScript Fonts

PostScript fonts are written in the PostScript page description language, and they need to be processed by a PostScript interpreter before they can be imaged. (See "The PostScript Model" in Chapter 1 for more information on PostScript interpreters.) For high-resolution printers and imagesetters, this interpreter is generally built into the device itself; it's a separate onboard computer dedicated to turning PostScript code into printable output. For lower-resolution devices, such as computer monitors and desktop printers, PostScript fonts can be imaged by a PostScript interpreter built into the operating system. PostScript fonts are generally accompanied by a set of bitmapped fonts for screen display, and unless these screen fonts are installed alongside the outline fonts, your computer cannot image their type. Even though your computer may not use the screen fonts' bitmapped images, it relies on the font metrics contained within the screen fonts to compose type using their companion outline fonts. This is an artifact of older technology, but it continues to function perfectly well.

The several kinds of PostScript fonts are distinguished from one another by number. The only one you're likely to come across is Type 1, and it's only mentioned here because of references you may come across to "PostScript Type 1" fonts. In publishing and typesetting contexts, when you talk about a PostScript font, it's assumed you're talking about the Type 1 variety.

Until the advent of the OpenType font format, PostScript fonts were the standard of the publishing industry. Today the PostScript format has been completely overtaken by OpenType, and most type vendors, including Adobe, have converted their entire libraries of PostScript fonts into the OpenType format. PostScript fonts continue to be fully supported by applications and operating systems, which is a good thing, because there are literally millions of them still in circulation and daily use. They are, however, platform specific, and different versions of a font are required for Macintosh and Windows.

TrueType Fonts

For a few years in the late 1980s, the typesetting world had in PostScript a single, standard font format for the first time in its history. It wasn't to last. For a combination of primarily commercial but also technological reasons, Apple Computer and Microsoft collaborated to create a new font format: TrueType. The new format enabled both companies to build outline font-imaging capabilities into their respective operating systems without being beholden to Adobe.

TrueType introduced many improvements over the PostScript format. The most prominently touted was its hinting, instructions added to the font that tell the character outlines how to reshape themselves at low and medium resolutions in order to create character images of maximum clarity. (For more on hinting, see "Imaging PostScript Fonts" in Chapter 1.) Because of the high quality of these hints, TrueType fonts were and still are typically delivered without any hand-drawn, bitmapped screen fonts. Screen type generated from the font's character outlines is generally quite legible even in small point sizes.

TrueType also allowed for larger character sets. The PostScript font format had used a numbering system to identify the characters in its fonts based on a single byte of computer data, yielding a maximum of 256 distinct ID numbers. (Fonts of this kind are still referred to as single-byte fonts.) TrueType introduced a two-byte numbering system, which allowed much larger character sets by creating over 65,000 unique ID numbers.
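The arithmetic behind these limits is simple enough to check directly. The short Python sketch below (the function name is illustrative only) counts how many distinct ID numbers fit in a given number of bytes.

```python
def id_count(num_bytes):
    """Number of distinct values that fit in num_bytes bytes (8 bits each)."""
    return 2 ** (8 * num_bytes)


single_byte_ids = id_count(1)  # one-byte numbering, as in PostScript fonts
two_byte_ids = id_count(2)     # two-byte numbering, as in TrueType fonts
```

One byte yields 256 ID numbers and two bytes yield 65,536, which is the "over 65,000" figure cited above.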

This made plenty of room for alternate forms of characters as well as allowing languages that rely on huge character sets (such as Chinese, Japanese, and Korean) to be supported by a single font. TrueType fonts are still included as a part of major operating systems, but most independent digital font foundries have shifted to OpenType because it allows a single font file to work under multiple operating systems. TrueType fonts are still platform specific, and a TrueType font created for use on a Mac will not work on a Windows PC, and vice versa. TrueType fonts use a different technology than PostScript fonts do for describing the outline shapes of characters, but any system that can image type from PostScript fonts can also image type from TrueType fonts.

Macintosh Dfonts

Many Macintosh-specific fonts use a file structure that predates OS X. In this structure, the file contents are divided into two parts: a data fork and a resource fork. Older versions of the Mac OS used data in the resource fork to tell (among other things) what application created a specific file. Mac OS X does this by reading a file's filename extension, such as .doc. Dfonts are a variety of TrueType font that have no resource fork, and they are included in OS X for the sake of font compatibility with other computers running the UNIX operating system. (OS X, unlike Microsoft Windows, is based on UNIX.)

You can use dfonts just as you would any other Macintosh TrueType font. Documents formatted with them will not, however, display correctly on Macs running operating systems that predate OS X.

OpenType Fonts

OpenType is a hybrid font format created by Adobe and Microsoft. It reconciles the differences in the PostScript and TrueType formats, allowing them to exist together in a single file. OpenType fonts are also written in a file format that allows the same font file to be used on either a Macintosh or a Windows PC. Crudely put, an OpenType font is a TrueType font with a "pocket" for PostScript data. An OpenType font can contain TrueType font data, PostScript font data, or (theoretically) both. Thus it has the potential to combine the best of both formats in a transparent way. The operating system of your computer will sort out the data in an OpenType font and use what's appropriate for it.

A problem with OpenType fonts, as with the TrueType fonts that preceded them, is that from the outside there's no way to know what's inside. The original generation of PostScript fonts generally contained a standard character set with standard features. The TrueType format and, to an even greater extent, the OpenType format offer a wide range of optional features that may or may not be built into every font, although the core character set used in the original PostScript fonts has generally been retained. An OpenType font can contain anywhere from a handful of characters to more than 65,000. There's no way of knowing what a particular font contains or what it can do unless the features of the font are documented in some way.

OpenType fonts also enable a variety of so-called layout features, which give a typesetting program the ability to automatically substitute one character for another. Using an appropriate OpenType font, for example, a program can automatically convert the keystroke sequence 1/2 into a proper fraction: ½. Layout features are discussed in detail on pages 62–64.
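To make the idea of automatic substitution concrete, here is a loose Python sketch that swaps typed digit-slash-digit sequences for single fraction characters. It is only an analogy: a real OpenType layout feature substitutes glyphs inside the font rather than rewriting the text, and the `FRACTIONS` table and function name here are assumptions made for illustration.

```python
import re

# Hypothetical substitution table: typed sequence -> single fraction character.
FRACTIONS = {
    "1/2": "\u00bd",  # one half
    "1/4": "\u00bc",  # one quarter
    "3/4": "\u00be",  # three quarters
}

# A single digit, a slash, a single digit, not embedded in a longer number.
_FRACTION_PATTERN = re.compile(r"(?<!\d)(\d)/(\d)(?!\d)")


def apply_fraction_substitution(text):
    """Replace known fraction sequences; leave everything else untouched."""
    return _FRACTION_PATTERN.sub(
        lambda match: FRACTIONS.get(match.group(0), match.group(0)), text
    )
```

Sequences that have no entry in the table, such as 2/3 in this sketch, pass through unchanged, much as a font without the needed glyph cannot perform the substitution.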

Web Fonts

The term web font does not refer to a specific font format but to fonts that have been extensively hinted for optimum legibility when displayed on computer screens and other electronic devices. Some have been designed from scratch for electronic display, while others have been adapted retroactively.

Popular web standards permit designers to specify the use of particular fonts when their pages are displayed, even though these fonts are not embedded in the file or necessarily available on the device displaying it. In this sense, web fonts are also understood to be those that exist on web servers for real-time use for imaging online documents that call for them. Some of these are available for free, but others are available only under license, with a fee paid for their use; they are, in effect, rented.

Web fonts are also discussed in Chapter 17, in the context of the Cascading Style Sheet standard used to structure many web documents.

Unicode: The Underlying Technology

All computer programs identify characters by number. International standards correlate every number to a unique character, so that a computer file from Europe, for example, can be properly typeset in Asia. It took decades before a single standard international numbering system was established: Unicode. Both TrueType and OpenType fonts use Unicode numbers to identify their constituent characters.

The goal of Unicode is to assign a unique ID number to every character, linguistic symbol, or ideogram in all of the world's languages, living or dead. The number of such IDs now exceeds 100,000.

To facilitate backward compatibility, and to support legacy documents, today's computing systems still suffer from vestiges of earlier numbering systems. The first of these was ASCII (the American Standard Code for Information Interchange), which used the numbers 0 through 127, as shown in Figure 4.4. The original desktop computing systems (including Microsoft DOS and Windows and the Apple Macintosh OS) used one-byte numbering systems that were consistent through the ASCII range but differed in the ID numbers assigned to the other 128 characters a font could contain. This made communications between the two platforms needlessly complicated, with characters often incorrectly displayed on a nonnative system.

Figure 4.4 Computers identify characters by numbers, and all systems agree on the meanings of 0 through 127, the so-called ASCII character set. The numbers 0 through 31, not shown here, are either unassigned or assigned to nonprinting commands such as return and backspace. The ASCII character set is printed on most English-language computer keyboards.

For technical reasons, the ID numbers assigned by Unicode are written in hexadecimal format. Hexadecimal, in addition to using the numerals 0 through 9 to express numbers, also uses the letters A through F. This allows 16 values to be expressed with a single character, like so: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. The letters following 9 represent 10, 11, 12, 13, 14, and 15, respectively, in our everyday counting system. In hexadecimal, the value expressed as 0010 (Unicode values are conventionally written with at least four "digits") is the equivalent of 16 in our normal base-10 system.
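For readers who want to check the notation, the conversion can be tried in Python, whose built-in `int`, `ord`, and `format` functions handle it directly. The two helper names below are illustrative only.

```python
def from_hex(label):
    """Convert a hexadecimal label such as '0010' to an everyday base-10 number."""
    return int(label, 16)


def unicode_label(char):
    """Return a character's Unicode number as a hexadecimal label, padded to four digits."""
    return format(ord(char), "04X")
```

With these helpers, `from_hex("0010")` gives 16, and `unicode_label("A")` gives `0041`, the same number a font browsing window shows for a capital A.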

Fortunately, you don't need to know anything more than this about hexadecimal notation, and even the preceding paragraph is added only to explain why Unicode character numbers look so peculiar when seen in a font browsing window.

Both Windows and the Mac OS now support Unicode as well as continuing to support the numbering schemes used in older font formats. This happens more or less transparently, although how you access certain characters in certain fonts will vary according to their format. This is described in detail later in the chapter, in the section "Finding the Characters You Need."

Character vs. Glyph

An important aspect of Unicode is that it recognizes that a single character may have several forms, each one of which is represented by a distinct glyph, as shown in Figure 4.5. Unicode's main concern is clear communication, not typography per se, so it does not distinguish between a simple roman A and a decorated A used for design purposes. For Unicode, the goal is simply to accurately depict a capital A as a capital A. All capital As, then, have the same Unicode number—0041—although they may be represented by alternate glyphs. Tracking which glyph you've chosen to use is the job of your typesetting or page layout application.

Figure 4.5 A single character with a single Unicode number can have several forms, each represented by a unique glyph. Here, a lowercase g (Unicode number 0067) from the typeface Hypatia Sans Pro can be represented by any of five alternate glyphs.

For this reason, computer tools used for browsing the contents of fonts are often called glyph palettes, and a given font's glyph set can be far larger than its character set.
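One way to picture the difference is as a lookup table in which each Unicode number points to a list of glyphs. The Python sketch below is purely illustrative: the glyph names are invented, and a real font organizes this information quite differently.

```python
# Hypothetical mapping from Unicode number to the glyphs that can represent it.
# The glyph names are invented for illustration.
GLYPHS_BY_CODE_POINT = {
    0x0041: ["A", "A.swash"],
    0x0067: ["g", "g.alt1", "g.alt2", "g.alt3", "g.alt4"],
}


def character_count(table):
    """Each Unicode number counts as one character."""
    return len(table)


def glyph_count(table):
    """Every alternate form counts as a separate glyph."""
    return sum(len(glyphs) for glyphs in table.values())
```

In this toy table two characters are served by seven glyphs, which is the sense in which a glyph set can outgrow its character set.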


The legacy left by evolving font standards continues to bedevil the movement of document files between different computer systems. The only way to be sure that a typeset document appears on one platform exactly as it was designed on another platform is to create it using the same OpenType fonts from the same vendor on both platforms.

Font-Encoding Issues

How numbers are assigned to the characters within a typeface is referred to as a font's encoding. Before they supported Unicode, the Macintosh and Windows operating systems used different encoding schemes.

Not only did the pre-Unicode operating systems use different character-numbering schemes, but they also used different subsets of the basic Latin 1 character collection as their standard character sets. The Macintosh set and encoding scheme were called MacRoman; the Windows character set and encoding scheme were called Win ANSI. Although a vendor might sell identical fonts for both platforms, the Mac would allow its users to access one group of characters within a font, and Windows another. Figure 4.6 shows the characters that were unique to each platform.
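The mismatch between the two schemes can still be observed today. The Python sketch below uses the language's built-in `mac_roman` and `cp1252` codecs as stand-ins for MacRoman and Win ANSI (an assumption made for illustration; these codecs describe text encodings, not the contents of any particular font).

```python
def decodable_chars(encoding):
    """Collect the characters an encoding assigns to the byte values 0 through 255."""
    chars = set()
    for value in range(256):
        try:
            chars.add(bytes([value]).decode(encoding))
        except UnicodeDecodeError:
            pass  # this slot is unassigned in the encoding
    return chars


mac_chars = decodable_chars("mac_roman")
win_chars = decodable_chars("cp1252")

mac_only = mac_chars - win_chars  # reachable only through the Mac scheme
win_only = win_chars - mac_chars  # reachable only through the Windows scheme

# The same stored number can mean different characters on each platform.
sample = bytes([0xD2])
as_mac = sample.decode("mac_roman")  # left double quotation mark
as_win = sample.decode("cp1252")     # capital O with grave accent
```

Both schemes agree throughout the ASCII range, so the trouble is confined to the upper 128 numbers, where an opening quotation mark typed on one platform could turn up as an accented letter on the other.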

Figure 4.6 Of the basic PostScript Type 1 character set shown in Figure 4.2, only Windows programs have direct keyboard access to the group of characters shown at the top here. Only Macintosh programs can use the keyboard to access the ones in the middle group. The bottom group includes characters in the basic MacRoman encoding that appear to be in every Mac font, but they are actually borrowed from the Symbol font.

Today's operating systems on both platforms allow access to all of these characters. But both the Mac and PC lack keystroke combinations that allow you to easily type their formerly inaccessible characters. For the sake of backward compatibility, and in respect for people's keyboarding habits, both operating systems act as if their old encoding schemes were still in use. To get access to the Unicode characters, you have to use special techniques (discussed in the next section).

Although Unicode is not a font encoding per se, it does provide applications on any platform with a standard way of indicating which characters to use. To assure the accurate representation of text as it travels through other computer systems, using Unicode-based fonts is a must.

Footnote: The Mac's "Borrowed Characters"

When you're working with PostScript fonts (and many TrueType fonts) on a Macintosh, the MacRoman encoding borrows certain characters from the Symbol font (see Figure 4.6). Such characters seem to be a part of every font you use. The keystroke combination Option-D, for example, always yields a lowercase Greek delta. But the numbers assigned to these characters in the MacRoman encoding scheme point to blank "slots" in a Mac font. Calls for these numbers are diverted by the operating system to the Symbol font. That explains why these characters never match the style of the typeface you're working in (unless it happens to be Times Roman, upon whose design the seriffed Symbol characters are based).

This curious situation is unique to the Mac and unique to this small handful of characters. It's been largely corrected in most OpenType equivalents of older PostScript fonts through the incorporation of these formerly borrowed characters into their expanded character sets. The Mac OS now explicitly shows that it's using the Symbol font when you use the original keyboard commands to set these characters.
