Cuneify (cfy) is the Oracc subsystem for
rendering transliterations in cuneiform. Its primary purpose is
to produce online cuneiform versions of Oracc text editions. It
can be controlled via the project configuration file, via ATF
protocols, and via the period field of project
metadata. At its simplest, Cuneify can auto-configure itself by
looking up the period in its table of configuration
files.
At the more complex end of the scale, Cuneify works with the ATF processor to support selecting the cuneiform font down to the level of individual characters as well as selecting font-specific or sign list-specific character variants using graphetics tags. Given appropriate annotation in the transliteration or in its own configuration files, Cuneify can also lay out cuneiform text in a manner that partially but imperfectly mimics the original.
Cuneify implements the ligature recommendations of UTR56.
There is a test/demo page [./demo/] that also functions as a quick reference for Cuneify's features.
Cfy keys (cfy-key) control how cfy renders
cuneiform: the font (fnt), font-feature-settings
(ffs), and an appropriate default magnification for the font
(mag). A cfy-key may also use a script tag (scr) to manage splits
and mergers, and it includes a sign list member
(asl).
A cfy-key is a string in which the members are joined by
dashes, using '0' for empty members and '*' for wildcard
members. By convention the string always begins with the member
cfy:
cfy-gudea-0-150-middle-osl
When cfy reads a text, it starts with a default cfy-key. As it reads configuration data from the various sources the default key is updated and the new data becomes the default. The '*' value can be used for members that should be inherited from the previous cfy-key; it is usually safer always to specify all members of the cfy-key.
In the 'asl' member of the key, the special value '.' means use the project sign list.
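The key format and the '*' inheritance rule can be illustrated with a minimal Python sketch. It is not part of cfy. The member order (fnt, ffs, mag, scr, asl) is an assumption inferred from the example key above, and the function names are hypothetical.

```python
from typing import Dict

# Assumption: member order inferred from the example 'cfy-gudea-0-150-middle-osl'.
MEMBERS = ("fnt", "ffs", "mag", "scr", "asl")

def parse_cfy_key(key: str) -> Dict[str, str]:
    """Split a cfy-key into named members; '0' and '*' are kept as-is."""
    parts = key.split("-")
    if parts[0] != "cfy" or len(parts) != len(MEMBERS) + 1:
        raise ValueError(f"not a cfy-key: {key!r}")
    return dict(zip(MEMBERS, parts[1:]))

def update_default(default: Dict[str, str], new: Dict[str, str]) -> Dict[str, str]:
    """Return the new default key; '*' members are inherited from the old default."""
    return {m: default[m] if new[m] == "*" else new[m] for m in MEMBERS}

default = parse_cfy_key("cfy-gudea-0-150-middle-osl")
# Inherit font and script, change the magnification, use the project sign list.
merged = update_default(default, parse_cfy_key("cfy-*-0-200-*-."))
print(merged)
```

The sketch assumes that member values never contain a dash themselves; a key with dashed font names would need a different splitting strategy.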
Cuneify implements a simple substitution facility. The basic syntax is:
LHS_ELEMENTS '=>' RHS_ELEMENTS '.'
LHS_ELEMENTS are matched against the input and consist of
graphemes and other element tokens; note that cfy
substitutions only work with transliteration and there is no
access to lemmatization. Special elements are prefixed with
&, e.g., &w which matches a
word-space element.
RHS_ELEMENTS can be a sequence of elements as for
the LHS_ELEMENTS, but they have an additional element type,
the assignment. The target of the assignment is expressed
with @DIGITS. In the simple case the target is
replaced with the corresponding LHS_ELEMENT.
The UTR56 recommendation to drop a space between adjacent ligatured graphemes which are separate words can thus be expressed in two ways:
saŋ &w ŋal₂ => saŋ ŋal₂ .
saŋ &w ŋal₂ => @1 @3 .
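How such a rule rewrites a token sequence can be sketched in Python. This is an illustrative toy, not cfy's matcher: it assumes the input is already split into tokens, matches LHS elements by exact equality, and treats @N as a 1-based index into the matched LHS, as the example above suggests. The function names are hypothetical.

```python
import re
from typing import List, Tuple

def parse_rule(rule: str) -> Tuple[List[str], List[str]]:
    """Parse 'LHS_ELEMENTS => RHS_ELEMENTS .' into two token lists."""
    body = rule.strip()
    if not body.endswith("."):
        raise ValueError("rule must end with '.'")
    lhs_text, rhs_text = body[:-1].split("=>")
    lhs, rhs = lhs_text.split(), rhs_text.split()
    if not lhs:
        raise ValueError("rule needs at least one LHS element")
    return lhs, rhs

def apply_rule(tokens: List[str], lhs: List[str], rhs: List[str]) -> List[str]:
    """Replace each occurrence of lhs in tokens with rhs, resolving @N targets."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        window = tokens[i:i + len(lhs)]
        if window == lhs:
            for element in rhs:
                target = re.fullmatch(r"@([0-9]+)", element)
                if target:
                    out.append(window[int(target.group(1)) - 1])  # @N is 1-based
                else:
                    out.append(element)
            i += len(lhs)
        else:
            out.append(tokens[i])
            i += 1
    return out

line = ["lu₂", "&w", "saŋ", "&w", "ŋal₂"]
for rule in ("saŋ &w ŋal₂ => saŋ ŋal₂ .", "saŋ &w ŋal₂ => @1 @3 ."):
    print(apply_rule(line, *parse_rule(rule)))
```

Both rules drop the word-space element between saŋ and ŋal₂ and leave the rest of the line untouched.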
Cuneify configuration files have the extension
.ccf. System .ccf files are associated with a
period in cfy's
internal table of Period::Font data [https://github.com/oracc/oracc2/blob/main/bin/xx/cun/perfnt.g].
The sequence in which .ccf files are read (or not) is designed to allow configuration of manuscripts in proxy projects even when they have their own embedded rules:
There is a single project configuration option to specify the name of a cfy configuration file for the project:
<option name="cfy-ccf" value="CFY-CCF"/>
If the CFY-CCF path does not begin with /, it is
looked for in the project 00lib directory, then in the system
$ORACC/lib/data/ directory.
There is a single ATF protocol to specify the name of the cfy configuration file for the ATF document:
#cfy: ccf CFY-CCF
The look-up rules for CFY-CCF are similar to those for project configuration, except that the same directory as the text is checked first, then the build directory for the text (if it is different), i.e., the look-up order is:
[TEXTDIR]/CFY-CCF
[ORACC]/bld/[PROJECT]/[FOUR]/[PQX]/CFY-CCF
[ORACC]/[PROJECT]/00lib/CFY-CCF
[ORACC]/lib/CFY-CCF
Where [FOUR] is the first four characters of
the PQX number, e.g., P010.
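The look-up order for an ATF-level .ccf file can be sketched as a function that lists candidate paths. This is a hedged illustration only: the function name and parameters are hypothetical, and the handling of absolute paths is an assumption carried over from the project option.

```python
from typing import List

def atf_ccf_candidates(ccf: str, textdir: str, oracc: str,
                       project: str, pqx: str) -> List[str]:
    """List the places to try for a '#cfy: ccf' file, in look-up order."""
    if ccf.startswith("/"):
        # Assumption: an absolute path is used as given, as for the project option.
        return [ccf]
    four = pqx[:4]  # first four characters of the PQX number, e.g. 'P010'
    dirs = [
        textdir,
        f"{oracc}/bld/{project}/{four}/{pqx}",
        f"{oracc}/{project}/00lib",
        f"{oracc}/lib",
    ]
    candidates: List[str] = []
    for d in dirs:
        path = f"{d.rstrip('/')}/{ccf}"
        if path not in candidates:  # skip the build dir when it is the text dir
            candidates.append(path)
    return candidates

for path in atf_ccf_candidates("my.ccf", "/tmp/work", "/home/oracc",
                               "rinap/rinap2", "P010123"):
    print(path)
```

A caller would use the first candidate that exists on disk.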
Common elements can occur on the left hand side (LHS) and right hand side (RHS) of substitutions.
In addition to the common elements, RHS elements may include assignments. A simple assignment consists of @digits, where the digits give the index of an LHS element.
There is provision in the configuration grammar for assigning values in an assignment but this is not yet fully implemented and it is not clear that it will be necessary.
There are two groups of formatting elements: ruling elements and justification elements. The justification elements are mutually exclusive: only one can be active at any given time. The ruling elements are independent; none, any, or all of them can be active.
Formatting elements are normally given with the cfy-format keyword, with a separate cfy-format line in the config file for each formatting element.
Justification elements can also be given in substitutions; ruling elements may not.
Justification formatting has two versions: a keyword for use
with cfy-format and an element form prefixed with
& for use in substitutions.
| Option | Element | Description |
|---|---|---|
| Justification Options | | |
| left | &Jl | Left-aligned |
| right | &Jr | Right-aligned |
| centre | &Jc | Center-aligned |
| spread | &Js | Justified across the line on word boundaries; this is Cuneify's default |
| penult | &Jp | As for spread, but with an additional block of space before the last word |
| char-spread | &Jcs | Justified across the line along character boundaries. This is implemented with CSS trickery because text-justify is only supported in Firefox and may be dropped from CSS. |
| char-penult | &Jcp | As for char-spread but with an extra block of space before the last character. This applies to compound signs as well. |
| Ruling Options | | |
| boxed | n/a | Box the text with a thick ruling |
| colrule | n/a | Provide gutter rules between columns |
| ruled | n/a | Provide a light ruling between lines |
Font switching in ATF is done with %-commands as
with language switching. All cfy font switches consist of one
or more digits 0..9: Oracc reserves single-digit codes for
system use, which means that user font switches always consist
of two or more digits 0..9.
Each cfy font switch must be associated with a cfy-key; this
association may be made in any of the cfy configuration loci,
but the effect of a cfy-key selected via a font switch is local:
it lasts only until the next font switch or the end of the
current grapheme-group, cell or line. The special font switch
%00 terminates the local cfy-key selection.
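The digit rules for font switches can be summarised in a small Python sketch. The function name and the returned labels are hypothetical; the sketch only classifies a token and does not model how cfy associates a switch with a cfy-key.

```python
import re

def classify_font_switch(token: str) -> str:
    """Classify a %-command as a cfy font switch according to its digits."""
    match = re.fullmatch(r"%([0-9]+)", token)
    if match is None:
        return "not-a-font-switch"  # e.g. a language switch
    digits = match.group(1)
    if digits == "00":
        return "terminator"  # ends the local cfy-key selection
    if len(digits) == 1:
        return "system"      # single-digit codes are reserved by Oracc
    return "user"            # two or more digits

for token in ("%1", "%00", "%42", "%akk"):
    print(token, classify_font_switch(token))
```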
The fnt, ffs, and scr members must all be predefined in fonts.css. Cuneify tries not to emit conflicting class information, and because ffs, scr, cvNN and salt all use font-feature-settings, Cuneify prioritizes these: if there is salt/cvNN only those are emitted in the CSS; if there is ffs it takes precedence over scr (because scr typically combines several features into one font-feature-settings value).
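The priority among the font-feature-settings sources can be written as a short decision function. This is a sketch of the rule as stated, not cfy's code; the function name, parameters, and return shape are hypothetical.

```python
from typing import List, Optional

def feature_classes(salt: Optional[str] = None, cvnn: Optional[str] = None,
                    ffs: Optional[str] = None, scr: Optional[str] = None) -> List[str]:
    """Pick which font-feature-settings classes to emit, by priority."""
    variants = [v for v in (salt, cvnn) if v]
    if variants:
        return variants  # salt/cvNN present: only those are emitted
    if ffs:
        return [ffs]     # ffs takes precedence over scr
    if scr:
        return [scr]
    return []

print(feature_classes(salt="salt1", ffs="ss01", scr="middle"))
print(feature_classes(ffs="ss01", scr="middle"))
print(feature_classes(scr="middle"))
```

The class names in the usage lines are placeholders; real values must be predefined in fonts.css.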
Cuneify can check that the characters it is outputting are
actually in the selected font, and this can be leveraged to do
coverage checking. Use of the coverage-only mode is only
available from the command line and is most easily managed with
the script cfycov.sh which can be invoked with no
arguments to get a help screen.
Coverage always needs a .uni file, which is a simple list of
Unicode characters in the font. Oracc installed fonts have such
a list in /home/oracc/lib/data, so one could
do:
ln -sf /home/oracc/lib/data/<FONTNAME>.uni f.uni
This makes using cfycov.sh more convenient.
Then one could say, e.g.:
cfycov.sh -u f.uni -p rinap/rinap2
This command checks the entire rinap/rinap2
project for coverage against the list in f.uni.
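The idea behind coverage checking can be sketched in Python as a set difference. This does not reproduce cfycov.sh. It assumes that a .uni list contains the literal characters separated by whitespace, which may not match the real file layout, and the function names are hypothetical.

```python
from typing import List, Set

def load_uni(uni_text: str) -> Set[str]:
    """Collect the characters listed in a .uni file (assumed: literal characters)."""
    return {ch for ch in uni_text if not ch.isspace()}

def missing_characters(cuneiform: str, covered: Set[str]) -> List[str]:
    """Return the characters of the output that the font does not cover."""
    return sorted({ch for ch in cuneiform
                   if not ch.isspace() and ch not in covered})

# Two covered signs; the text uses one covered and one uncovered sign.
covered = load_uni("\U00012000\n\U0001202D\n")
missing = missing_characters("\U00012000 \U00012097", covered)
print([f"U+{ord(ch):04X}" for ch in missing])
```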