Raph Levien <raph@acm.org>
10 Jul 1999
See also: Gnome-Font API documentation.
Owen Taylor is working on gscript, which has some overlap with the functions described in this interface. We're working on unifying the two APIs as much as possible.
This document describes the Gnome-Text API. Gnome-Text is a work in progress. Thus, some of the specifics may change. Nonetheless, this document should be useful as a guideline.
Overview
Gnome-Text is a module for extremely high quality text formatting. It is based on Unicode, and will eventually be capable of rendering a wide variety of scripts. Because the rendering done by Gnome-Text is highly dependent on the details of the font, you will need fonts obtained from Gnome-Font before you can use Gnome-Text.
The input to Gnome-Text is a sequence of Unicode characters, along with markup information. The markup information contains information about the font, font size, weight, italics, and other attributes.
The output of Gnome-Text is a sequence of GnomeTextLine data structures, each of which contains a sequence of glyphs, each with a precise (x, y) location. Gnome-Text does not itself render glyphs to a display or printer. Instead, it relies on a separate renderer. The GnomeTextLine data structure is designed to be simple enough that it should be easy to implement renderers. To obtain the actual glyph shapes (for example, as Type1 charstrings, Bezier outlines, or bitmaps), the renderer must negotiate with Gnome-Font.
In particular, the renderers provided with Gnome-Print and the Gnome Canvas (the latter being developed in the Gill CVS tree at the time of this writing) should take care of all these details.
However, the use of Gnome-Text is not dependent on using Gnome-Font or Gnome-Print. You need only provide font metrics corresponding to the Gnome-Font interface, then render the resulting GnomeTextLine text using your own renderer.
Unicode with markup
The primary input to Gnome-Text is UTF-8 encoded Unicode text, with added markup. The markup is given as an array of GnomeTextAttrEl structures:
    typedef struct _GnomeTextAttrEl GnomeTextAttrEl;

    struct _GnomeTextAttrEl {
      int char_pos; /* offset in (possibly wide) characters from start of string */
      GnomeTextAttr attr;
      int attr_val;
    };

Each such attribute stays in effect from the char_pos specified until the next attribute with the same attr value, or the end of the string. The char_pos is an offset specified as the number of characters. In the UTF-8 encoding, a Unicode character may be encoded as from 1 to 6 bytes. Thus, converting between character offset and byte offset requires scanning the actual character data.
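Because char_pos counts characters rather than bytes, a caller that needs a byte offset has to walk the string. The following is a minimal sketch of that scan, not code from Gnome-Text: the function name utf8_char_to_byte_offset is hypothetical, and the input is assumed to be well-formed, NUL-terminated UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Gnome-Text. Returns the byte offset
   of the character at index char_pos in a NUL-terminated, well-formed
   UTF-8 string, or the string's byte length if char_pos is past the end. */
size_t utf8_char_to_byte_offset(const char *s, int char_pos)
{
    size_t i = 0;
    int chars_seen = 0;

    while (s[i] != '\0' && chars_seen < char_pos) {
        /* Step over the lead byte of the current character. */
        i++;
        /* Skip continuation bytes, which have the bit pattern 10xxxxxx. */
        while (((unsigned char)s[i] & 0xC0) == 0x80)
            i++;
        chars_seen++;
    }
    return i;
}
```

For the three-character string "a", U+00E9, "b" (four bytes in UTF-8), character offset 2 maps to byte offset 3. The cost is linear in the offset, which is why the conversion cannot be done without looking at the character data.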
Each attribute value is given as an integer, designed to fit comfortably in 32 bits. Thus, attributes that take on fractional values are generally encoded with some fixed-point encoding, most often multiplying by 1000.
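As an illustration of the multiply-by-1000 convention, here is a small sketch of the conversion in both directions. The helper names attr_from_double and attr_to_double are made up for this example and are not part of the Gnome-Text interface.

```c
/* Hypothetical helpers: convert between a fractional value and the
   integer form (value * 1000) used for attribute values. */
int attr_from_double(double v)
{
    /* Round to nearest, including for negative values. */
    return (int)(v * 1000.0 + (v >= 0 ? 0.5 : -0.5));
}

double attr_to_double(int attr_val)
{
    return attr_val / 1000.0;
}
```

With these helpers, an 11.5 unit font size becomes 11500 and a scale factor of 0.8 becomes 800.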
The specific attribute values are as follows:
GNOME_TEXT_FONT_LIST
The attribute value is a GnomeTextFontListHandle, obtained from gnome_text_intern_font_list().
GNOME_TEXT_SIZE
The attribute value is the font size, in units of 0.001 user units. The exact meaning of the user unit is up to the application, but for printing, 1/72 inch is reasonable, and for display, 1 pixel is reasonable. Thus, an 11.5 point font is typically represented with 11500.
GNOME_TEXT_XSCALE
The attribute value is a scaling value multiplied by 1000. The default value is 1000, which corresponds to no scaling. A value of 800 will yield a condensed font, while 1300 will be extended. Optical scaling is generally not recommended; using a real condensed or extended font is preferable.
GNOME_TEXT_OBLIQUING_UGH
The attribute value is the tangent of the obliquing angle, multiplied by 1000. Thus, 1000 corresponds to a 45 degree angle, and 176 corresponds to a 10 degree angle. Obliquing is generally not recommended (hence the _UGH). Using a real italic font is preferable. Obliquing may be acceptable for some sans serif fonts.
GNOME_TEXT_WEIGHT
The attribute value is a weight code. These are currently defined as follows:
    typedef enum {
      GNOME_FONT_LIGHTEST = -3,
      GNOME_FONT_EXTRA_LIGHT = -3,
      GNOME_FONT_THIN = -2,
      GNOME_FONT_LIGHT = -1,
      GNOME_FONT_BOOK = 0, /* also known as "regular" or "roman" */
      /* gap here so that if book is missing, light wins over medium */
      GNOME_FONT_MEDIUM = 2,
      GNOME_FONT_SEMI = 3, /* also known as "demi" */
      GNOME_FONT_BOLD = 4,
      /* gap here so that if bold is missing, semi wins over heavy */
      GNOME_FONT_HEAVY = 6,
      GNOME_FONT_EXTRABOLD = 7,
      GNOME_FONT_BLACK = 8,
      GNOME_FONT_EXTRABLACK = 9, /* also known as "ultra" */
      GNOME_FONT_HEAVIEST = 9
    } GnomeFontWeight;

For applications that are not typographically sophisticated, making only GNOME_FONT_BOOK and GNOME_FONT_BOLD available is reasonable. These codes were selected to fit the Adobe font catalog, and should be a reasonable guideline for future font design.
In general, GnomeText will try to choose a font weight that is closest to the weight code specified. Ties are resolved in favor of the lighter weight.
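The matching rule (closest weight code, ties going to the lighter weight) can be sketched as follows. This is an illustration of the rule as stated, not the code Gnome-Text uses; the function name choose_weight and its signature are assumptions.

```c
#include <stdlib.h>

/* Sketch of the matching rule described above; the name and signature
   are illustrative, not Gnome-Text's. Returns the entry of available[]
   closest to requested, preferring the lighter (smaller) code on a tie.
   n must be at least 1. */
int choose_weight(int requested, const int *available, int n)
{
    int best = available[0];
    int i;

    for (i = 1; i < n; i++) {
        int d_new = abs(available[i] - requested);
        int d_best = abs(best - requested);

        if (d_new < d_best || (d_new == d_best && available[i] < best))
            best = available[i];
    }
    return best;
}
```

With this rule the gaps in the weight codes do their job: a request for book (0) in a family that only has light (-1) and medium (2) selects light, and a request for bold (4) in a family with only semi (3) and heavy (6) selects semi.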
GNOME_TEXT_ITALICS
The attribute is a boolean indicating whether the upright (false) or italic (true) variant of the font is chosen.
GNOME_TEXT_KERNING
The attribute is a boolean indicating whether kerning is enabled. Todo: in a<kern-on>b<kern-off>c, is kerning enabled between ab or bc? ab seems to make more sense to me. In most applications, kerning improves the appearance of text and should be enabled.
GNOME_TEXT_LIGATURES
The attribute is an enumeration selecting no ligation (GNOME_TEXT_LIG_NONE), normal ligation (GNOME_TEXT_LIG_NORMAL), or maximal ligation (GNOME_TEXT_LIG_MAX).
In almost all cases, the normal Latin ligatures of "fi", "fl", "ff", "ffi", and "ffl" improve the appearance of text and should be enabled. Some fonts, such as Emigre's Mrs Eaves, contain a great many additional ligatures (including the relatively common "st" and "ct") that may be desired in some contexts, but not all. These ligatures are unusual enough that they should only be chosen when explicitly desired.
Ligatures are also very important for non-Latin scripts. In most cases, leaving the ligature attribute set to GNOME_TEXT_LIG_NORMAL will result in good quality.
GNOME_TEXT_TRACKING
Tracking refers to the addition of extra space between glyphs. A very large value of tracking might look like t h i s .
The attribute value is the additional letterspace to add before the affected glyph, specified in 0.001 em units. It is specified in this way so that a<track=100>b<track=0>c will add .1 em units of space between the a and b, not the b and c.
Tracking does not affect the positioning of the first character in a line.
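The tracking rules can be made concrete with a short sketch. It assumes that 1 em equals the current font size and that each glyph has a known advance width; the function name place_with_tracking and its parameters are hypothetical and only illustrate the arithmetic.

```c
/* Illustrative only: these names are not part of the Gnome-Text API.
   advance[i] is glyph i's normal advance width in user units, and
   tracking[i] is the tracking attribute in effect at glyph i, in
   0.001 em. font_size is the em size in user units. Fills x[] with
   each glyph's position on the line. */
void place_with_tracking(const double *advance, const int *tracking,
                         int n, double font_size, double *x)
{
    double pen = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        /* Extra space goes before the affected glyph, except for the
           first glyph on the line, which is never shifted. */
        if (i > 0)
            pen += tracking[i] * font_size / 1000.0;
        x[i] = pen;
        pen += advance[i];
    }
}
```

For a<track=100>b<track=0>c at a font size of 10 with advances of 5 units each, the tracking values are {0, 100, 0} and the positions come out as 0, 6, and 11: one extra unit (0.1 em) lands between a and b, and none between b and c.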
GNOME_TEXT_SMALL_CAPS
The attribute is a boolean indicating whether small caps are selected. When small caps are selected, lowercase characters become small capitals. It might look like THIS.
Small capitals are a very nice typographic refinement, and it is recommended that they be used where appropriate. In particular, TLA-heavy text often has a more uniform appearance when small caps are used.
GNOME_TEXT_GLYPH_ALTERNATE
This attribute is used to select glyph alternates, such as swash versions of characters, etc.
The meaning of a glyph alternate is specific to a particular sequence of characters. The space of glyph alternates is informally managed, although zero always means the normal alternate.
The presence of specific glyph alternates is highly dependent on the font.
The present GnomeFont interface does not include a glyph alternate mechanism. Todo: add this.
GNOME_TEXT_RISE
This attribute is useful for superscripts and subscripts, and some other effects as well (like setting the word "TeX" correctly).
The attribute value is a relative vertical offset in 0.001 em units, with positive values going upwards. Thus, one possible value for superscript is to multiply the font size by 0.8, and use a rise value of 200 (this is not a recommendation).
GNOME_TEXT_HYPHENATE
This boolean attribute controls whether GnomeText will attempt to hyphenate words when breaking lines. In general, hyphenation will allow better quality spacing and is thus recommended.
GNOME_TEXT_LANGUAGE
This attribute must be set when hyphenation is enabled. It is a big-endian encoding of a 4 byte ASCII string into a 32-bit word. The first two characters (i.e. high order bytes) are the major language code, and the second two characters (i.e. low order bytes) are a subcode, typically a country. For example, US English is "enUS", or 0x656e5553. In the case of a missing subcode, NUL values are to be used. Thus, Inuktitut is coded as "iu", represented as 0x69750000.
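The packing described above is easy to get wrong by hand, so here is a minimal sketch of it. The function name pack_language is hypothetical; Gnome-Text may or may not provide an equivalent.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: packs a language tag of up to four ASCII
   characters (such as "enUS" or "iu") into a 32-bit word, with the
   first character in the high-order byte and NUL padding for missing
   characters. Characters beyond the fourth are ignored. */
uint32_t pack_language(const char *tag)
{
    uint32_t code = 0;
    size_t len = strlen(tag);
    size_t i;

    for (i = 0; i < 4; i++) {
        unsigned char c = (i < len) ? (unsigned char)tag[i] : 0;
        code = (code << 8) | c;
    }
    return code;
}
```

Calling pack_language("enUS") yields 0x656e5553 and pack_language("iu") yields 0x69750000, matching the values given in the text.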
Values for the language codes are specified in RFC 1766. Currently, only four character language codes are supported. An extension mechanism for longer language codes may be added if needed (as would appear to be the case for Klingon, if this language is to be hyphenated). ISO 639 (with 1989 revisions) specifies language codes, and ISO 3166 specifies country codes used for subcodes. Note: the HTML links provided here are believed good, but not guaranteed.
Different language codes will in general result in different hyphenation patterns. For example, the word "General" is hyphenated as "Gen-eral" in English (enUS), or "Ge-ne-ral" in German (deDE) (thanks to Angelika Levien for the example).
The default language code is "enUS".
The language code is primarily used for hyphenation, but may also be used to select glyph variants for high-quality typography, particularly in CJK languages.
Pass-through attributes
All attribute codes greater than GNOME_TEXT_ATTR_MAX are "pass-through" attributes. These are attributes present in the original character stream, and are passed through to the glyph stream. In general, any attribute used for rendering, but not essential for layout, is best implemented as a pass-through.
The following pass-through attributes are recommended for most rendering back-ends:
GNOME_TEXT_UNDERLINE_UGH: a boolean indicating whether underlining is selected (perhaps a value of 2 should indicate a double underline). Underlining is rarely recommended in high-quality text composition (hence the _UGH).
GNOME_TEXT_STRIKETHROUGH: a boolean indicating whether strikethrough is selected.
GNOME_TEXT_COLOR: a 32-bit RGBA value indicating the foreground color of the text.
GNOME_TEXT_BG_COLOR: a 32-bit RGBA value indicating the background color (useful for highlighting selections, etc.).
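The rule that separates layout attributes from pass-through attributes is a single comparison against GNOME_TEXT_ATTR_MAX. The sketch below shows how a renderer or layout stage might apply it. The type definitions are stand-ins so that the example compiles on its own, and the numeric value given to GNOME_TEXT_ATTR_MAX is a placeholder, not the value in the Gnome-Text headers.

```c
/* Stand-in definitions so the sketch compiles on its own. The numeric
   values are placeholders, not the ones in the Gnome-Text headers. */
typedef int GnomeTextAttr;
#define GNOME_TEXT_ATTR_MAX 100   /* placeholder value */

typedef struct {
    int char_pos;
    GnomeTextAttr attr;
    int attr_val;
} GnomeTextAttrEl;

/* Nonzero if the attribute is only carried through to the glyph stream. */
int attr_is_pass_through(GnomeTextAttr attr)
{
    return attr > GNOME_TEXT_ATTR_MAX;
}

/* Counts the pass-through elements in a markup array of length n. */
int count_pass_through(const GnomeTextAttrEl *attrs, int n)
{
    int i, count = 0;

    for (i = 0; i < n; i++)
        if (attr_is_pass_through(attrs[i].attr))
            count++;
    return count;
}
```

An application can therefore define its own rendering attributes above the maximum without the layout code needing to know about them.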
The text pipeline
The text pipeline consists of four basic stages:
- gnome_text_layout_new: conversion of characters to glyphs, ligation, kerning, and hyphenation.
- gnome_text_just_hs: line breaking
- gnome_text_lines_from_layout: justification and reordering
- rendering
The input to gnome_text_layout_new() is Unicode text with markup, as described above. The handles to fonts in the markup should already have been obtained from gnome_text_intern_font_list().
The gnome_text_layout_new() function does a lot of work. It converts Unicode characters to font-specific glyph numbers, joins glyphs together in ligatures (for example, the "fi" ligature is common in many Latin fonts and improves appearance), spaces glyphs according to kerning rules in the font, and identifies potential line breaks, including hyphenation points.
The result is a GnomeTextLayout structure. This structure contains:
- A list of glyphs, each with an X position
- Markup for the glyphs (font changes and other attributes)
- A list of potential line breaks
Each potential line break contains two x position values. Assuming that only one break is chosen, x0 is the width of the first line, while x1 is the "starting position" of the second line relative to the original unbroken line. Formally, x1 - x0 is the length of the unbroken line minus the sum of the lengths of the two broken lines.
In the case of a plain word break, the sum of the lengths of the two lines is less than that of the unbroken line (the space is deleted), so x1 > x0. In the case of an inserted hyphen, the sum of the lengths is greater than the unbroken line (the hyphen is inserted), so x1 < x0.
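The arithmetic behind x0 and x1 can be written out in a few lines. The struct and function names below are stand-ins for illustration, and the sketch assumes the second line's width is simply the unbroken width minus x1, which follows from reading x1 as the second line's starting position on the unbroken line.

```c
/* Illustrative type; the field names x0 and x1 follow the text, but
   the struct itself is a stand-in rather than Gnome-Text's own. */
typedef struct {
    int x0;   /* width of the line that ends at this break */
    int x1;   /* where the next line starts, measured on the unbroken line */
} BreakPos;

/* Width of the second line, given the unbroken line's total width. */
int second_line_width(BreakPos b, int total)
{
    return total - b.x1;
}

/* Change in total length caused by the break: negative when material
   (a space) is dropped, positive when material (a hyphen) is added. */
int break_length_delta(BreakPos b)
{
    return b.x0 - b.x1;
}
```

As a worked case, take two words of width 10 separated by a space of width 5, so the unbroken line is 25 wide. The word break has x0 = 10 and x1 = 15: both lines are 10 wide and the total shrinks by 5. For a 20-wide word hyphenated in the middle with a 3-wide hyphen, x0 = 13 and x1 = 10: the lines are 13 and 10 wide and the total grows by 3.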
Each line break also contains a "diff" of glyphs that are modified. In the case of plain word breaks, there is no diff. In the case of hyphenated line breaks, the diff consists of inserting the hyphen at the end of the first line. Sometimes, diffs can get more complex. For example, in hyphenating the word "traffic", the unbroken glyphs may be "t r a ffi c", and the glyphs of the broken lines may be "t r a f -" and "fi c".
In summary, GnomeText's method for representing line breaks is simple and general, and handles a wide range of line breaking patterns with no loss of precision.
The output of gnome_text_layout_new also contains markup for bidi level (even is left-to-right, odd is right-to-left). This is ignored during line breaking, but applied between line breaking and rendering to reverse right-to-left substrings as needed.
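One standard way to turn bidi levels into a visual glyph order is rule L2 of the Unicode Bidirectional Algorithm: from the highest level down to the lowest odd level, reverse every contiguous run of glyphs at that level or higher. The sketch below implements that rule for a single line. It is offered as an illustration of the reordering step, not as the code Gnome-Text uses, and the function names are hypothetical.

```c
/* Reverses glyphs[lo..hi] in place. */
static void reverse_range(int *glyphs, int lo, int hi)
{
    while (lo < hi) {
        int tmp = glyphs[lo];
        glyphs[lo++] = glyphs[hi];
        glyphs[hi--] = tmp;
    }
}

/* Sketch of level-based reordering for one line (rule L2 of the
   Unicode Bidirectional Algorithm). glyphs[] is in logical order on
   entry and visual order on return; levels[] holds the bidi level of
   each glyph in logical order and is left untouched. */
void reorder_line(int *glyphs, const int *levels, int n)
{
    int max_level = 0, min_odd = -1, level, i;

    for (i = 0; i < n; i++) {
        if (levels[i] > max_level)
            max_level = levels[i];
        if ((levels[i] & 1) && (min_odd < 0 || levels[i] < min_odd))
            min_odd = levels[i];
    }
    if (min_odd < 0)
        return;   /* no right-to-left glyphs: nothing to reorder */

    for (level = max_level; level >= min_odd; level--) {
        i = 0;
        while (i < n) {
            if (levels[i] >= level) {
                int start = i;
                while (i < n && levels[i] >= level)
                    i++;
                reverse_range(glyphs, start, i - 1);
            } else {
                i++;
            }
        }
    }
}
```

For glyphs 1 through 6 with levels {0, 1, 2, 2, 1, 0}, the result is 1 5 3 4 2 6: the right-to-left run is reversed as a whole, while the left-to-right pair embedded in it keeps its internal order.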
The line break algorithm (for example, gnome_text_just_hs) takes these line breaks (using only the x0 and x1 information from each line break) and chooses where to break the lines. gnome_text_just_hs is a simple line breaker that assumes a constant width paragraph, but more sophisticated line breakers are certainly possible. In particular, I plan to implement gnome_text_just_hq soon, an adaptation of the libhnj line breaking algorithm, which does TeX-style whole paragraph optimization of line breaks. Other interesting extensions include fitting paragraphs to more complex shapes.
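To show how little a line breaker needs beyond x0 and x1, here is a greedy first-fit breaker for a constant line width. It is a sketch written for this document, not the implementation of gnome_text_just_hs; the names BreakPos and break_lines_greedy are hypothetical. It assumes that a line starting after break p and ending at break b has width b.x0 - p.x1.

```c
/* Stand-in for the (x0, x1) pair carried by each potential break. */
typedef struct {
    int x0;   /* width of the line ending at this break */
    int x1;   /* start of the following line on the unbroken line */
} BreakPos;

/* Greedy first-fit breaker. breaks[] must be in left-to-right order and
   total_width is the width of the unbroken paragraph. Writes the indices
   of the chosen breaks into chosen[] (capacity n_breaks) and returns how
   many were chosen. */
int break_lines_greedy(const BreakPos *breaks, int n_breaks,
                       int line_width, int total_width, int *chosen)
{
    int n_chosen = 0;
    int line_start = 0;   /* x1 of the previous chosen break, 0 at first */
    int last_fit = -1;    /* latest break that still fits on this line */
    int i = 0;

    while (i < n_breaks) {
        if (breaks[i].x0 - line_start <= line_width) {
            last_fit = i;
            i++;
        } else if (last_fit >= 0) {
            /* Break i overflows: end the line at the last fitting break,
               then re-examine break i against the new line. */
            chosen[n_chosen++] = last_fit;
            line_start = breaks[last_fit].x1;
            last_fit = -1;
        } else {
            /* Nothing fits: accept an overfull line and move on. */
            chosen[n_chosen++] = i;
            line_start = breaks[i].x1;
            i++;
        }
    }
    /* Break once more if the rest of the paragraph would overflow. */
    if (last_fit >= 0 && total_width - line_start > line_width)
        chosen[n_chosen++] = last_fit;
    return n_chosen;
}
```

For four 10-wide words separated by 5-wide spaces (total width 55, breaks at (10, 15), (25, 30), and (40, 45)), a line width of 25 selects only the middle break, giving two lines of width 25 each. A whole-paragraph optimizer would consume exactly the same input but weigh all combinations of breaks instead of committing line by line.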
The result of the line break algorithm is simply a list of the line breaks which have been chosen. It is the responsibility of the third phase in the GnomeText pipeline to justify the wordspacing and apply the diffs specified for the chosen line breaks. When doing bidirectional text processing, this phase is also responsible for a final reordering of right-to-left substrings.
The interface for the third phase is gnome_text_lines_from_layout(). The result is a list of GnomeTextLine structures. These consist of a list of glyphs (each with an X position) along with some markup. It is this data structure that is passed to the renderer.
The GnomeText interface itself does not contain any methods for rendering. Different renderers are used for different output contexts. I foresee implementing renderers for PostScript output, to an RGB buffer (especially useful in antialiased canvas items), to X (using either client-side rendered fonts or server-side X fonts if available), and perhaps others.
Open Issues
The fact that fonts are interned into a global data structure makes me a tad uncomfortable. Perhaps it would be better to have a GnomeTextContext of some kind that held interned fonts?
Color, strikethrough, and underlining are "olestra" attributes, i.e. they are simply passed through without processing. Should there be a more general mechanism for olestra attributes? I can easily imagine, for example, that both background and foreground colors would be desired in Gtk+ (for highlighting selections). Note: currently addressed.
We also need to return a mapping that tells which glyphs correspond to which characters. Among other things, this is necessary for resolving (x, y) coordinates (i.e. a mouse click) back to a character position. This mapping is one-to-one for pure ASCII, but can easily be many-to-one (for ligatures), and very likely one-to-many as well, once we get into the "interesting" scripts. It's possible to get one-to-many from European Unicode too: for example, U+0133 (LATIN SMALL LIGATURE IJ) will usually be rendered using a glyph for "i" and a glyph for "j". Even U+00C6 (LATIN CAPITAL LETTER AE) may be rendered using a glyph for "A" and a glyph for "E" in some fonts.
I have not yet decided on a data structure to represent this mapping. It is not currently implemented.
Correct handling of glyphs added by Gnome-Text (e.g. hyphens) is tricky. This is in some ways a "zero-to-one" mapping.
It occurs to me that providing this mapping may render explicit handling of olestra attributes unnecessary. However, not passing them through may be cumbersome.