Rules for Simple Placement of Japanese Ruby (Draft)

The latest version of this document is now maintained at the W3C.

Foreword

Ruby is the name given to the small annotations in Japanese content that are rendered alongside base text, usually to provide a pronunciation guide, but sometimes to provide other information. (See the article “What is ruby” by the Internationalization Working Group for more information.)

The Difficulties of Ruby Processing

When performing ruby layout in Japanese, the following factors need to be considered in order to decide on the position:

  1. How to handle the correspondence between the base characters and the ruby
  2. What to do when the string of base characters is longer than the ruby string
  3. What to do when the string of base characters is shorter than the ruby string
  4. When the ruby string protrudes from the base character string, whether it can be allowed to be laid over the characters preceding or following, and whether this affects the position of the base characters

  5. When the ruby string protrudes from the base character string, and the base character string is at the start or the end of the line, whether the base character string or the ruby string should be aligned with the line edge
  6. When there are multiple base characters, whether there can be a line wrap opportunity between them

In movable type typography, such matters were resolved based on generic principles, and could always be corrected during the proofreading phase. Essentially, each case was adjusted individually in a flexible manner.

In computer-based typesetting, the layout needs to be more or less determined based on predetermined rules, but it remained necessary to adjust the results in certain cases, for example by changing the association between base characters and the ruby string, or by switching to a different placement policy.

Web Ruby placement

When thinking about computing placement for web content, it is not practical to decide on the positioning case by case as was done in movable type typography. It is therefore necessary to decide upon comprehensive rules that provide solutions to all the problems listed above, so that placement may be determined fully automatically. A system that covered all the possibilities that existed in movable type typesetting would need to be very complex.

However, when considering the ideal positioning of ruby, it seems inevitable that exceptions will occur, causing issues.

In such cases, rather than ideal positioning, we must at least make sure that the positioning causes no misunderstanding; there are also practical limits to how complex the system can be in order to be practically implementable.

The following is a proposal for a simple processing system. The target audience is implementers and specification writers. It is expected that a full system may be more complex than what is described here, both due to the interaction with other features or other writing systems, and because those designing such a system may wish to provide alternative options. Note that the terminology is based on that defined in JLReq.

Matters considered by the simple placement rules

Here are the fundamental assumptions underlying the simple placement rules.

  1. Ruby is used to display the reading or the meaning of the base characters. Therefore, the number one priority here is to avoid misreadings.
  2. The method detailed in this document attempts to reduce exceptions as much as possible. Therefore, there is no requirement for complex processing.
  3. The method is agnostic to horizontal vs vertical writing, and will use the same logic in either case.
  4. The method places the ruby string relative to the base character string the same way when they occur in the middle, start, or end of the line. Moreover, this method does not change the relative position of the ruby string to the base character string depending on preceding or subsequent characters. In other words, this method calculates a position for the ruby relative to the base string that does not change depending on context.
  5. Generally speaking, the processing method is based on JIS X 4051 (Formatting rules for Japanese documents). However, in some cases, optional steps are used.
  6. The ruby font size is set to half of the base character’s size as a default. However, the method supports using different sizes than 1/2.
  7. While there are cases where ruby appears on both sides of the base string, the method defined here only handles ruby on one side. Handling both sides is left as a future exercise.

Types of ruby

Ruby in Japanese may be divided into the following 3 different types, based on the relationship between the ruby and the base characters (see JLReq “3.3.1 Usage of Ruby”).

  1. Mono-ruby
  2. Jukugo-ruby
  3. Group-ruby

Which one to use depends on the relationship between the ruby and the base characters. Mono-ruby is used to connect ruby to a single base character, Jukugo-ruby is used when multiple base characters each have a corresponding ruby and at the same time the whole group needs to be processed together, and group-ruby is used when ruby is attached to a group of base characters together (see fig. 1). Each is used when specified.
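The three types can be pictured as a small data model. The following Python sketch is only an illustration: the names RubyType and RubyAnnotation are hypothetical, and counting base characters with len() is a simplification that assumes each base character is a single code point.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class RubyType(Enum):
    MONO = "mono"      # one ruby string attached to a single base character
    JUKUGO = "jukugo"  # one ruby string per base character, processed as a whole
    GROUP = "group"    # one ruby string attached to the whole base string


@dataclass(frozen=True)
class RubyAnnotation:
    kind: RubyType
    base: str               # base character string
    ruby: Tuple[str, ...]   # ruby strings, in base order

    def __post_init__(self) -> None:
        # Assumption: len(base) equals the number of base characters.
        if self.kind is RubyType.MONO and (len(self.base) != 1 or len(self.ruby) != 1):
            raise ValueError("mono-ruby needs one base character and one ruby string")
        if self.kind is RubyType.JUKUGO and len(self.ruby) != len(self.base):
            raise ValueError("jukugo-ruby needs one ruby string per base character")
        if self.kind is RubyType.GROUP and len(self.ruby) != 1:
            raise ValueError("group-ruby needs a single ruby string")
```

For example, RubyAnnotation(RubyType.JUKUGO, "東京", ("とう", "きょう")) is accepted, while declaring the same two base characters as mono-ruby is rejected.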

Rules for Simple Placement of Japanese Ruby

Ruby character size and character placement

The size of the ruby characters and their placement in the inline direction relative to the base characters is as follows:

  1. The size of the ruby is by default set to half of the size of the base characters.
  2. In vertical text, ruby is placed to the right of the base characters, and the character frame of the ruby is placed flush against the character frame of the base characters. (see fig. 2)
  3. In horizontal text, ruby is placed above the base characters, and the character frame of the ruby is placed flush against the character frame of the base characters. (see fig. 3)
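The three rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: RubyBox, ruby_box, and the writing_mode strings are hypothetical names, and the ratio parameter reflects the stated default of one half while allowing other sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RubyBox:
    font_size: float  # ruby font size, in the same unit as the base size
    side: str         # "right" in vertical text, "over" in horizontal text
    gap: float        # distance between the ruby frame and the base frame


def ruby_box(base_font_size: float, writing_mode: str, ratio: float = 0.5) -> RubyBox:
    """Return the ruby size and side for a base font size and writing mode."""
    if writing_mode == "vertical":
        side = "right"
    elif writing_mode == "horizontal":
        side = "over"
    else:
        raise ValueError(f"unknown writing mode: {writing_mode!r}")
    # The ruby frame sits flush against the base frame, hence a gap of zero.
    return RubyBox(font_size=base_font_size * ratio, side=side, gap=0.0)
```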

The following sections describe in detail the placement of mono-ruby, jukugo-ruby, and group-ruby. However, since jukugo-ruby is more complex, it is explained last.

Placement of mono-ruby

Mono-ruby is placed as follows:

  1. When the ruby is made of two or more characters, each character in the ruby string is placed immediately next to its neighboring character, without any inter-letter spacing. Furthermore, when the ruby is composed of characters such as Grouped numerals (cl-24), Unit symbols (cl-25), Western word space (cl-26), or Western characters (cl-27) which have their own individual width, they are placed based on each character’s metrics. (see fig. 4)

  2. The center of the ruby string and of the base character string are aligned in the inline direction. (see fig. 5)
  3. Since the base character and its associated ruby form a single unit there is no line wrapping opportunity inside a mono-ruby.
  4. When the ruby string is longer than the base character string, the part of the ruby string that extends beyond the base characters must not hang over the characters preceding or following, if they are ideographic characters (cl-19), Hiragana (cl-15), Katakana (cl-16), etc. Space is introduced accordingly between these preceding or following characters and the base characters. (see fig. 5) However, in the following cases, the ruby characters do hang over the preceding or following characters. (see fig. 6)

  5. When the ruby string is longer than the base character string, and the ruby falls at the start of the line, then the start of the ruby string is aligned with the line’s start edge (see fig. 7), while if the ruby falls at the end of the line, then the end of the ruby string is aligned with the line’s end edge (see fig. 8).
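One way to read rules 1, 2, 4, and 5 together is that the base character and its ruby occupy a single unit whose inline extent is the longer of the two, with both centered inside it. The Python sketch below follows that reading. MonoRubyLayout and place_mono_ruby are hypothetical names, advances are assumed to be given in a common unit, and the cases in which ruby is allowed to hang over neighboring characters are not modelled.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MonoRubyLayout:
    unit_length: float  # inline extent reserved for base and ruby together
    base_offset: float  # start of the base character inside the unit
    ruby_offset: float  # start of the ruby string inside the unit


def place_mono_ruby(base_advance: float, ruby_advances: Sequence[float]) -> MonoRubyLayout:
    """Center the ruby string and the base character on each other."""
    # Rule 1: ruby characters are set solid, so the length is the sum of advances.
    ruby_length = sum(ruby_advances)
    # Rule 4 (no overhang case): the unit grows to the longer of the two strings.
    unit_length = max(base_advance, ruby_length)
    # Rule 2: both strings are centered in the unit, so their centers coincide.
    return MonoRubyLayout(
        unit_length=unit_length,
        base_offset=(unit_length - base_advance) / 2,
        ruby_offset=(unit_length - ruby_length) / 2,
    )
```

With a 16-unit base character and three 8-unit ruby characters, the ruby starts at offset 0 and the base at offset 4. Placing this unit at the start of a line aligns the ruby with the line's start edge, as in rule 5, without changing the relative position of ruby and base.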

Placement of group-ruby

Group-ruby is placed as follows:

  1. When the ruby string and the base character string are composed of characters such as Hiragana (cl-15), Katakana (cl-16), Ideographic characters (cl-19), and so on, excluding characters like Grouped numerals (cl-24), Unit symbols (cl-25), Western word space (cl-26), or Western characters (cl-27) which have their own individual width, the way they are positioned depends on how their respective lengths would compare if they were each laid out without any inter-letter spacing:

  2. When the base character string is composed of characters like Grouped numerals (cl-24), Unit symbols (cl-25), Western word space (cl-26), or Western characters (cl-27) which have their own individual width, and the ruby string is composed of characters such as Hiragana (cl-15), Katakana (cl-16), Ideographic characters (cl-19), and so on, the placement depends on the following (see fig. 13):

  3. When the ruby string is composed of characters like Grouped numerals (cl-24), Unit symbols (cl-25), Western word space (cl-26), or Western characters (cl-27) which have their own individual width, and the base character string is composed of characters such as Hiragana (cl-15), Katakana (cl-16), Ideographic characters (cl-19), and so on, the placement depends on the following (see fig. 13):
  4. When the ruby string is longer than the base character string and protrudes, whether and how it hangs over characters preceding or following the base character string is handled in the same way as with mono-ruby (see fig. 14). Also, when the ruby string is longer than the base character string, protrudes, and is located at the start or end of the line, the resulting layout is also identical to that of mono-ruby.

  5. In the case of group ruby, the base character string and its associated ruby string are treated as a unit, so there is no line wrapping opportunity inside either string.
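Rule 1 makes the placement depend on how the two strings compare when each is set without inter-letter spacing. The Python sketch below covers only that comparison step. The function name and the returned labels are hypothetical, and how the resulting slack is distributed between characters is deliberately left out.

```python
from typing import Sequence, Tuple


def compare_group_ruby(
    base_advances: Sequence[float], ruby_advances: Sequence[float]
) -> Tuple[str, float]:
    """Compare base and ruby lengths when both are set without inter-letter spacing.

    Returns a label and the slack, i.e. the difference in length that the
    shorter string would have to absorb.
    """
    base_length = sum(base_advances)
    ruby_length = sum(ruby_advances)
    slack = abs(base_length - ruby_length)
    if ruby_length < base_length:
        return ("ruby-shorter", slack)
    if ruby_length > base_length:
        return ("ruby-longer", slack)
    return ("equal", 0.0)
```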

Placement of Jukugo-ruby

Jukugo-ruby is placed as follows:

  1. With jukugo-ruby, each base character is associated with its own ruby string. When each of these ruby strings, laid out without inter-letter spacing, is shorter than its corresponding base character, placement is determined as follows:

  2. For simple ruby implementations, if even a single ruby string is longer than its corresponding base character when laid out without inter-letter spacing, the resulting layout would look identical to group-ruby. (see fig. 17 and 18).

  3. With jukugo-ruby, individual base characters and their associated ruby string are treated as a unit, and line wrap opportunities are allowed between two base characters. When such a line wrap occurs, if a single base character that is part of the jukugo is placed alone at the end or at the start of a line, it is laid out identically to mono-ruby; conversely when several base characters that are part of the jukugo are placed together at the end or start of a line, they are laid out together as has been described in this section about jukugo-ruby (see fig. 19).
    Fig. 19: Example of wrapping jukugo-ruby
  4. When the ruby string is longer than the base character string and protrudes, whether and how it hangs over characters preceding or following the base character string is handled in the same way as with mono-ruby. Also, when the ruby string is longer than the base character string, protrudes, and is located at the start or end of the line, the resulting layout is also identical to that of mono-ruby.
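Rules 2 and 3 amount to a small decision procedure, sketched below in Python. The function names and the mode labels are hypothetical. The sketch also assumes that, after a line wrap, each multi-character fragment is evaluated again on its own, which is one possible reading of rule 3.

```python
from typing import List, Sequence, Tuple

# One entry per base character: (base advance, advances of its ruby characters).
Pair = Tuple[float, Sequence[float]]


def jukugo_mode(pairs: Sequence[Pair]) -> str:
    """Pick the layout mode for a run of jukugo base characters."""
    if len(pairs) == 1:
        # Rule 3: a base character standing alone is laid out like mono-ruby.
        return "mono"
    if any(sum(ruby) > base for base, ruby in pairs):
        # Rule 2: a single over-long ruby string makes the result look like group-ruby.
        return "group-like"
    # Rule 1: every ruby string fits its own base character.
    return "per-character"


def modes_after_wrap(pairs: Sequence[Pair], break_after: int) -> List[str]:
    """Layout modes of the two fragments when the line wraps after break_after characters."""
    if not 0 < break_after < len(pairs):
        raise ValueError("a wrap must leave at least one base character on each line")
    return [jukugo_mode(pairs[:break_after]), jukugo_mode(pairs[break_after:])]
```

For three base characters of 16 units whose ruby strings measure 16, 24, and 8 units, the unbroken run is "group-like", and wrapping after the first character yields a "mono" fragment followed by a "group-like" fragment.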

Ruby and Accessibility

Accessibility Improvements Using Ruby

Ruby plays a role in improving accessibility for people with visual impairments or other sources of reading difficulty. Therefore, this section examines the relationship between ruby and accessibility.

Reading difficulties can be caused by a variety of factors, and therefore, requirements to improve accessibility also vary. For example, here are some common requirements:

Ruby Display Requirements for Accessibility

Based on the above, we can gather the following ruby display requirements for accessibility:

  1. Support for general-ruby is required.
  2. Support for para-ruby is required. Moreover, as the number of kanji known increases with the level of studies, based on the content and on the level of the reader, it must be possible to only display ruby for kanji assigned to a particular school year (or later).
  3. Support for hiding ruby is required.
  4. Considering the cost of production, distribution, and of user management, it is necessary to support ruby-less display, general-ruby display, and para-ruby display with the same content.
  5. A method to clearly visually distinguish the ruby characters and their base characters, such as displaying them in different colors, is required.
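Requirements 1 to 4 can be illustrated with a display filter applied to a single annotated source. The Python sketch below rests on several assumptions that go beyond the text above: general-ruby is taken to mean ruby shown only on items an editor has flagged, para-ruby is taken to mean ruby available on every annotated kanji, and the grade field, mode strings, and function name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotated:
    base: str
    ruby: str
    grade: Optional[int]  # school year in which the kanji is taught; None if not on the list
    general: bool         # assumption: flagged for display in general-ruby mode


def show_ruby(item: Annotated, mode: str, min_grade: int = 1) -> bool:
    """Decide whether to display the ruby of one item in a given display mode."""
    if mode == "none":
        return False              # requirement 3: ruby can be hidden
    if mode == "general":
        return item.general       # requirement 1: general-ruby display
    if mode == "para":
        # Requirement 2: show ruby only for kanji taught in min_grade or later.
        # Kanji without a grade are treated as harder than any graded kanji.
        return item.grade is None or item.grade >= min_grade
    raise ValueError(f"unknown display mode: {mode!r}")
```

Because the decision is made at display time, the same content serves all three modes, which is the point of requirement 4.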