Ruby is the name given to the small annotations
in Japanese content that are rendered alongside base text,
usually to provide a pronunciation guide,
but sometimes to provide other information.
(See the article “What is ruby”
by the Internationalization Working Group
for more information.)
The Difficulties of Ruby Processing
When performing ruby layout in Japanese,
the following factors need to be considered
in order to decide on the position:
How to handle the correspondence between the base characters and the ruby
What to do when the string of base characters is longer than the ruby string
What to do when the string of base characters is shorter than the ruby string
When the ruby string protrudes from the base character string,
whether it can be allowed to be laid over the characters preceding or following,
and whether this affects the position of the base characters
When the ruby string protrudes from the base character string,
and the base character string is at the start or the end of the line,
whether the base character string or the ruby string should be aligned with the line edge
When there are multiple base characters,
whether there can be a line wrap opportunity between them
In movable type typography,
such matters were resolved based on generic principles,
and could always be corrected during the proofreading phase.
Essentially, each case was adjusted individually in a flexible manner.
In computer-based typesetting,
the layout has to be largely determined by predetermined rules,
but it remains necessary to adjust the results in certain cases,
for example by changing the association between base characters
and the ruby string,
or by switching to a different placement policy.
Web Ruby placement
When thinking about computing placement for web content,
it is not practical to decide on the positioning
case by case as was done in movable type typography.
It is therefore necessary to decide upon comprehensive rules
that provide solutions to all the problems listed above,
so that placement may be determined fully automatically.
To cover all the possibilities that existed in movable type typesetting,
such a system would need to be very complex.
However, even when aiming for the ideal positioning of ruby,
it seems inevitable that exceptions will occur and cause issues.
In such cases, rather than ideal positioning,
we must at least make sure that the positioning causes no misunderstanding.
There are also limits to how complex a system can be
while remaining practically implementable.
The following is a proposal for a simple processing system.
The target audience is implementers and specification writers.
It is expected that a full system may be more complex than what is described here,
both because of the interaction with other features or other writing systems,
and because those designing such systems may wish to provide alternative options.
Note that the terminology is based on that defined in
JLReq.
Matters considered by the simple placement rules
Here are the fundamental assumptions underlying the simple placement rules.
Ruby is used to display the reading or the meaning of the base characters.
Therefore, the number one priority here is to avoid misreadings.
The method detailed in this document attempts to reduce exceptions as much as possible.
Therefore, there is no requirement for complex processing.
The method is agnostic to horizontal vs vertical writing,
and will use the same logic in either case.
The method places the ruby string relative to the base character string
the same way when they occur in the middle, start, or end of the line.
Moreover, this method does not change the relative position
of the ruby string to the base character string
depending on preceding or subsequent characters.
In other words, this method calculates a position
for the ruby relative to the base string
that does not change depending on context.
Generally speaking, the processing method is based on JIS X 4051
(Formatting rules for Japanese documents).
However, in some cases, optional steps are used.
The ruby font size is set to half of the base character’s size as a default.
However, the method supports using sizes other than 1/2.
While cases of ruby on both sides of the base string exist,
the method defined here only handles ruby on one side.
Handling both sides is left as a future exercise.
Types of ruby
Ruby in Japanese may be divided into the following 3 different types,
based on the relationship between the ruby and the base characters
(see JLReq “3.3.1 Usage of Ruby”).
Mono-ruby
Jukugo-ruby
Group-ruby
Which one to use depends on the relationship
between the ruby and the base characters.
Mono-ruby is used to connect ruby to a single base character,
Jukugo-ruby is used when multiple base characters each have a corresponding ruby
and at the same time the whole group needs to be processed together,
and group-ruby is used when ruby is attached to a group of base characters together (see fig. 1).
Each type is applied where the content specifies it.
Rules for Simple Placement of Japanese Ruby
Ruby character size and character placement
The size of the ruby characters
and their placement in the inline direction relative to the base characters are as follows:
The size of the ruby is by default set to
half of the size of the base characters.
In vertical text, ruby is placed to the right of the base characters,
and the character frame of the ruby is placed flush
against the character frame of the base characters.
(see fig. 2)
In horizontal text, ruby is placed above the base characters,
and the character frame of the ruby is placed flush
against the character frame of the base characters.
(see fig. 3)
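As a minimal sketch of these defaults, the Python function below returns the ruby size and the side on which the ruby is placed. The name `ruby_box` and the `ratio` parameter are illustrative assumptions, not something defined by JLReq or JIS X 4051.

```python
def ruby_box(base_size, writing_mode, ratio=0.5):
    """Return (ruby_size, side) for a base character size and a writing mode.

    `ratio` defaults to one half, as in the text; other values are allowed.
    """
    sides = {"vertical": "right", "horizontal": "above"}
    if writing_mode not in sides:
        raise ValueError("writing_mode must be 'vertical' or 'horizontal'")
    return base_size * ratio, sides[writing_mode]
```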
The following sections describe in detail the placement of
mono-ruby,
jukugo-ruby,
and group-ruby.
However, since jukugo-ruby is more complex,
it is explained last.
Placement of mono-ruby
Mono-ruby is placed as follows:
When the ruby is made of two or more characters,
each character in the ruby string is placed
immediately next to its neighboring character,
without any inter-letter spacing.
Furthermore, when the ruby is composed of characters such as
Grouped numerals (cl-24),
Unit symbols (cl-25),
Western word space (cl-26),
or Western characters (cl-27)
which have their own individual width,
they are placed based on each character’s metrics.
(see fig. 4)
The center of the ruby string and of the base character string
are aligned in the inline direction.
(see fig. 5).
Since the base character and its associated ruby form a single unit
there is no line wrapping opportunity inside a mono-ruby.
When the ruby string is longer than the base character string,
the part of the ruby string that extends beyond the base characters
must not hang over the characters preceding or following,
if they are
ideographic characters (cl-19),
Hiragana (cl-15),
Katakana (cl-16),
etc.
Space is introduced accordingly
between these preceding or following characters and the base characters.
(see fig. 5)
However, in the following cases,
the ruby characters do hang over the preceding or following characters.
(see fig. 6)
If the character preceding the base character is one of:
Closing brackets (cl-02),
Full stops (cl-06),
Commas (cl-07),
Full-width ideographic space (cl-14),
or Middle dots (cl-05),
then the ruby must hang over
the blank portion at the end of the character.
(This blank portion is usually half the character’s width,
except in the case of Middle dots (cl-05)
where it is a fourth of the character width).
However, if this blank part has been compressed
due to justification or similar processing of the line,
then the ruby may only hang over the resulting
compressed blank space
(e.g. if it was reduced from half to a quarter em,
hang at most a quarter em).
If the character following the base character is one of:
Opening brackets (cl-01),
Full-width ideographic space (cl-14),
or Middle dots (cl-05),
then the ruby must hang over
the blank portion at the start of the character.
(This blank portion is usually
half the character’s width for Opening brackets (cl-01),
or a quarter of the character’s width for Middle dots (cl-05).)
However, if this blank part has been compressed
due to justification or similar processing of the line,
then the ruby may only hang over the resulting
compressed blank space
(e.g. if it was reduced from half to a quarter em,
hang at most a quarter em).
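The protrusion and overhang rules above can be sketched as follows. This is an illustrative interpretation, not a normative algorithm. Lengths are expressed in em of the base font, and the function and table names are hypothetical. Character classes not listed in the text are assumed to allow no overhang. The half-em blank before a following full-width ideographic space (cl-14) is also an assumption, since the text only gives values for cl-01 and cl-05.

```python
# Blank portion (in em) at the END of a preceding character that ruby may overhang.
OVERHANG_AFTER_PREV = {
    "cl-02": 0.5,   # closing brackets
    "cl-06": 0.5,   # full stops
    "cl-07": 0.5,   # commas
    "cl-14": 0.5,   # full-width ideographic space
    "cl-05": 0.25,  # middle dots
}
# Blank portion (in em) at the START of a following character that ruby may overhang.
OVERHANG_BEFORE_NEXT = {
    "cl-01": 0.5,   # opening brackets
    "cl-14": 0.5,   # full-width ideographic space (assumed value)
    "cl-05": 0.25,  # middle dots
}


def _allowed(cls, table, actual_blank):
    """Overhang allowed for a neighbour, limited by any compressed blank space."""
    nominal = table.get(cls, 0.0)  # unlisted classes (kanji, kana, ...) allow none
    if actual_blank is not None:
        nominal = min(nominal, actual_blank)
    return nominal


def mono_ruby_side_spacing(base_len, ruby_len, prev_class=None, next_class=None,
                           prev_blank=None, next_blank=None):
    """Return (space_before, space_after) to insert around the base character.

    The ruby is centered on the base, so it protrudes equally on both sides.
    Whatever part of the protrusion cannot hang over a neighbour becomes space.
    """
    protrusion = max(0.0, (ruby_len - base_len) / 2)
    over_prev = min(protrusion, _allowed(prev_class, OVERHANG_AFTER_PREV, prev_blank))
    over_next = min(protrusion, _allowed(next_class, OVERHANG_BEFORE_NEXT, next_blank))
    return protrusion - over_prev, protrusion - over_next
```

For example, a one-em base character with three half-size ruby characters (1.5 em) protrudes by a quarter em on each side. Between two kana this yields a quarter em of space on each side, while after a comma and before an opening bracket no space is needed.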
When the ruby string is longer than the base character string,
and the ruby falls at the start of the line,
then the start of the ruby string is aligned with the line’s start edge
(see fig. 7),
while if the ruby falls at the end of the line,
then the end of the ruby string is aligned with the line’s end edge
(see fig. 8).
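One way to read the line-edge rule, consistent with the assumption that the ruby stays centered on its base regardless of context, is that the base string moves inward by the amount of protrusion. The sketch below computes the start positions of both strings along the line. `place_at_line_edge` and its parameters are hypothetical names, and lengths are in em.

```python
def place_at_line_edge(base_len, ruby_len, line_len, edge):
    """Return (base_start, ruby_start) measured from the line's start edge.

    `edge` is "start" or "end". When the ruby is longer than the base,
    the ruby touches the line edge and the base is inset by the protrusion.
    """
    protrusion = max(0.0, (ruby_len - base_len) / 2)
    if edge == "start":
        base_start = protrusion
    elif edge == "end":
        base_start = line_len - protrusion - base_len
    else:
        raise ValueError("edge must be 'start' or 'end'")
    # The ruby stays centered on the base in every case.
    ruby_start = base_start + (base_len - ruby_len) / 2
    return base_start, ruby_start
```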
Placement of group-ruby
When the ruby string and the base character string would be the same length,
both are laid out without inter-letter spacing
and placed such that their respective centers in the inline direction are aligned
(see fig. 9).
When the ruby string is shorter than the base character string,
space is inserted between every character in the ruby string
as well as at the start and the end of the ruby string
so that it becomes the same length as the base character string,
then their centers in the inline direction are aligned.
The size of the space inserted between each of the ruby characters
is twice the size of the space inserted at the end and at the start
(see fig. 10).
However, the size of the space inserted at the start and end
must not exceed half the size of one base character,
and the space inserted between each ruby character is enlarged to compensate
(see fig. 11).
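The 1:2:1 distribution and its cap can be sketched as below. This is an illustration under stated assumptions: all ruby characters are taken to have the same size, `ruby_edge_and_gap` is a hypothetical name, and a single ruby character is simply centered because there is no gap between ruby characters that could absorb the excess.

```python
def ruby_edge_and_gap(base_len, ruby_count, ruby_char_size, base_char_size):
    """Return (edge_space, gap) for a ruby string shorter than its base string.

    edge_space is inserted at the start and at the end of the ruby string,
    gap is inserted between each pair of adjacent ruby characters.
    """
    slack = base_len - ruby_count * ruby_char_size
    if slack < 0:
        raise ValueError("ruby string is longer than the base string")
    if ruby_count == 1:
        return slack / 2, 0.0  # assumption: a lone ruby character is centered
    edge = slack / (2 * ruby_count)  # makes gap exactly twice the edge space
    cap = base_char_size / 2
    if edge > cap:
        edge = cap  # cap the edges; the gaps below absorb the rest
    gap = (slack - 2 * edge) / (ruby_count - 1)
    return edge, gap
```

With a three-em base string and two half-em ruby characters, the edge space is half an em and the gap is one em. With a four-em base string the uncapped edge space would be 0.75 em, so it is capped at 0.5 em and the gap grows to 2 em.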
When the ruby string is longer than the base character string,
space is inserted between every character in the base character string
as well as at the start and the end of the base character string
so that it becomes the same length as the ruby string,
then their centers in the inline direction are aligned.
The size of the space inserted between each of the base characters
is twice the size of the space inserted at the end and at the start
(see fig. 12).
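The spacing of the base characters in this case follows from the 1:2:1 ratio. As a short worked derivation, with symbols introduced here for illustration only:

```latex
% B: length of the base string without spacing, R: length of the ruby string,
% m: number of base characters,
% e: space at each end, g: space between adjacent base characters
\begin{align*}
g &= 2e, \\
2e + (m-1)\,g &= R - B
\;\Longrightarrow\;
e = \frac{R-B}{2m}, \qquad g = \frac{R-B}{m}.
\end{align*}
% Example: m = 2 base characters of 1 em (B = 2 em) with six half-size
% ruby characters (R = 3 em) give e = 0.25 em and g = 0.5 em.
```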
When their respective lengths would be the same,
both are laid out without inter-letter spacing
and placed such that their respective centers in the inline direction are aligned.
When the ruby string is shorter than the base character string,
space is inserted between every character in the ruby string
as well as at the start and the end of the ruby string
so that it becomes the same length as the base character string,
then their centers in the inline direction are aligned.
The size of the space inserted between each of the ruby characters
is twice the size of the space inserted at the end and at the start.
When the ruby string is longer than the base character string,
both are laid out without inter-letter spacing
and placed such that their respective centers in the inline direction are aligned.
In this case, the ruby string protrudes from the base character string.
When their respective lengths would be the same,
both are laid out without inter-letter spacing
and placed such that their respective centers in the inline direction are aligned.
When the ruby string is shorter than the base character string,
both are laid out without inter-letter spacing
and placed such that their respective centers in the inline direction are aligned.
When the ruby string is longer than the base character string,
space is inserted between every character in the base character string
as well as at the start and the end of the base character string
so that it becomes the same length as the ruby string,
then their centers in the inline direction are aligned.
The size of the space inserted between each of the base characters
is twice the size of the space inserted at the end and at the start.
When the ruby string is longer than the base character string and protrudes,
whether and how it hangs over characters preceding or following
the base character string
is handled in the same way as with mono-ruby
(see fig. 14).
Also, when the ruby string is longer than the base character string,
protrudes, and is located at the start or end of the line,
the resulting layout is also identical to that of mono-ruby.
In the case of group-ruby,
the base character string and its associated ruby string
are treated as a unit,
so there is no line wrapping opportunity inside either string.
Placement of Jukugo-ruby
Jukugo-ruby is placed as follows:
With jukugo-ruby, each base character is associated with its own ruby string.
When each of these ruby strings, laid out without inter-letter spacing,
is no longer than its corresponding base character,
placement is determined as follows:
When the ruby string associated with an individual base character is 1 character long,
the ruby character and the base character
are placed such that their respective centers in the inline direction are aligned
(see fig. 16).
When the ruby string associated with an individual base character is 2 characters long or more,
the ruby string is laid out without inter-letter spacing,
and placed such that its center and the center of its base character are aligned in the inline direction
(see fig. 16).
For simple ruby implementations,
if even a single ruby string is longer than its corresponding base character
when laid out without inter-letter spacing,
the resulting layout would look identical to group-ruby.
(see fig. 17 and 18).
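The decision between per-character placement and the group-ruby fallback can be sketched as follows. The function name and return shape are hypothetical, lengths are in em, and a ruby string exactly as long as its base character is assumed to count as fitting.

```python
def jukugo_layout(pairs):
    """Decide how to lay out a jukugo-ruby.

    `pairs` is a list of (base_char_size, ruby_len), one per base character,
    where ruby_len is the ruby string's length without inter-letter spacing.
    Returns ("group", None) when any ruby string is longer than its base
    character, else ("per-character", ruby_starts), where each ruby start is
    measured from the start of the base string.
    """
    if any(ruby_len > base for base, ruby_len in pairs):
        return "group", None
    starts = []
    cursor = 0.0
    for base, ruby_len in pairs:
        # Center each ruby string on its own base character.
        starts.append(cursor + (base - ruby_len) / 2)
        cursor += base
    return "per-character", starts
```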
With jukugo-ruby, individual base characters and their associated ruby string are treated as a unit,
and line wrap opportunities are allowed between two base characters.
When such a line wrap occurs,
if a single base character that is part of the jukugo is placed alone at the end or at the start of a line,
it is laid out identically to mono-ruby;
conversely when several base characters that are part of the jukugo
are placed together at the end or start of a line,
they are laid out together as has been described in this section about jukugo-ruby
(see fig. 19).
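As an illustration of the wrapping rule, the sketch below splits a jukugo at a line break and reports how each fragment is laid out. `fragments_after_wrap` is a hypothetical helper, and only a single break inside the jukugo is considered.

```python
def fragments_after_wrap(char_count, break_after=None):
    """Return a list of (mode, base_char_count), one entry per line fragment.

    `break_after` is the number of base characters left on the first line,
    or None when the jukugo is not broken across lines.
    """
    if break_after is None:
        parts = [char_count]
    elif 0 < break_after < char_count:
        parts = [break_after, char_count - break_after]
    else:
        raise ValueError("break_after must fall between two base characters")
    # A lone base character is laid out like mono-ruby, several like jukugo-ruby.
    return [("mono" if n == 1 else "jukugo", n) for n in parts]
```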
When the ruby string is longer than the base character string and protrudes,
whether and how it hangs over characters preceding or following
the base character string
is handled in the same way as with mono-ruby.
Also, when the ruby string is longer than the base character string,
protrudes, and is located at the start or end of the line,
the resulting layout is also identical to that of mono-ruby.
Ruby and Accessibility
Accessibility Improvements Using Ruby
Ruby plays a role in improving accessibility for people with visual impairments
or other sources of reading difficulties.
Therefore, this section examines the relationship between ruby and accessibility.
Reading difficulties can be caused by a variety of factors,
and therefore, requirements to improve accessibility also vary.
For example, here are some common requirements:
To accommodate young children who cannot read any kanji,
general-ruby must be added to all kanji.
As their studies progress, readers know a greater number of kanji.
After they have read general-ruby many times,
ruby on only the difficult kanji becomes sufficient.
Therefore para-ruby, attached to only some of the kanji, is required.
Some people have difficulties in visually distinguishing between
ruby characters and the base characters to which they are attached,
and misread the combination as a different character altogether.
There must be a display method that enables clearly distinguishing between the two.
Also, for those who already know how to read the kanji,
there must be a way to hide the ruby.
As inline parenthesised annotations can be used instead,
there is no strong need for double-sided ruby.
Ruby Display Requirements for Accessibility
Based on the above,
we can gather the following
ruby display requirements for accessibility:
Support for general-ruby is required.
Support for para-ruby is required.
Moreover, as the number of kanji known increases with the level of studies,
based on the content and on the level of the reader,
it must be possible to only display ruby for kanji
assigned to a particular school year (or later).
Support for hiding ruby is required.
Considering the cost of production, distribution, and of user management,
it is necessary to support ruby-less display, general-ruby display, and para-ruby display
with the same content.
A method to clearly visually distinguish between the ruby characters and their base characters,
such as displaying them in different colors,
is required.
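The requirement to serve ruby-less, general-ruby, and para-ruby display from the same content can be sketched as a filter applied at display time. Everything named here is hypothetical: `RubyMode`, `visible_ruby`, and the grade table are illustrations, base strings are assumed to contain only kanji, and kanji missing from the table are assumed to be taught later than any listed grade.

```python
from enum import Enum


class RubyMode(Enum):
    NONE = "none"        # ruby-less display
    GENERAL = "general"  # ruby on every annotated kanji
    PARA = "para"        # ruby only on kanji at or above a school year


def visible_ruby(annotations, mode, min_grade=None, kanji_grade=None):
    """Return the (base, reading) pairs whose ruby should be displayed.

    `kanji_grade` maps a kanji to the school year in which it is taught;
    the caller supplies it, this sketch does not ship real data.
    """
    if mode is RubyMode.NONE:
        return []
    if mode is RubyMode.GENERAL:
        return list(annotations)
    if min_grade is None or kanji_grade is None:
        raise ValueError("para-ruby needs a grade threshold and a grade table")
    shown = []
    for base, reading in annotations:
        # Use the hardest kanji of the base string to decide.
        hardest = max(kanji_grade.get(ch, float("inf")) for ch in base)
        if hardest >= min_grade:
            shown.append((base, reading))
    return shown
```

Because the filter runs on the same annotated content in every mode, a single document can be distributed to readers at different levels.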