U+0304.
ⲁ̄ = U+2C81 U+0304 ?
U+FE24 after first character and U+FE25 after last character.
ⲓ︤ⲥ︥ = U+2C93 U+FE24 U+2CA5 U+FE25 ?
If the Bindestrich covers more than two characters, the in-between character(s) should be followed by U+FE26.
ⲓ︤ⲏ︦ⲙ︥ = U+2C93 U+FE24 U+2C8F U+FE26 U+2C99 U+FE25 ?
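The Bindestrich rule above (U+FE24 after the first character, U+FE26 after each in-between character, U+FE25 after the last) can be sketched in a few lines of Python. This is only an illustration, not part of cmcl2unicode: the function names `bind` and `codepoints` are hypothetical, and the sketch assumes the input contains only base letters with no combining marks already attached.

```python
# Combining marks used for a stroke spanning several letters.
MACRON_LEFT_HALF = "\uFE24"    # after the first character
MACRON_RIGHT_HALF = "\uFE25"   # after the last character
CONJOINING_MACRON = "\uFE26"   # after each in-between character


def bind(letters: str) -> str:
    """Return `letters` joined under one Bindestrich (hypothetical helper).

    Assumes `letters` holds base characters only.
    """
    if len(letters) < 2:
        raise ValueError("a Bindestrich needs at least two characters")
    marks = (
        [MACRON_LEFT_HALF]
        + [CONJOINING_MACRON] * (len(letters) - 2)
        + [MACRON_RIGHT_HALF]
    )
    return "".join(letter + mark for letter, mark in zip(letters, marks))


def codepoints(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


print(codepoints(bind("\u2C93\u2CA5")))
print(codepoints(bind("\u2C93\u2C8F\u2C99")))
```

The two printed lines can be compared directly with the code point sequences listed in the examples above.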
U+0305. This practice is not recommended and should be explicitly declared.
ⲁ̅ = U+2C81 U+0305 ?
ⲓ̅ⲥ̅ = U+2C93 U+0305 U+2CA5 U+0305 ?
ⲓ̅ⲏ̅ⲙ̅ = U+2C93 U+0305 U+2C8F U+0305 U+2C99 U+0305 ?
U+0305
must be that of marking letters as numerals.
ⲁ̅ = U+2C81 U+0305 ?
ⲃ̅ = U+2C83 U+0305 ?
ⲅ̅ = U+2C85 U+0305 ?
ⲇ̅ = U+2C87 U+0305 ?
Pay attention: U+0305 is very similar to U+FE26 (at least in the Antinoou font), but these two strokes must not be mixed up, and their uses must not be confused!
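Because the two strokes look alike on screen, a check at the code point level can help. The following Python sketch is not part of cmcl2unicode; the function name `overline_inside_bindestrich` is hypothetical, and the heuristic assumes a Bindestrich always opens with U+FE24 and closes with U+FE25. It reports every U+0305 found between the two.

```python
OVERLINE = "\u0305"           # reserved here for marking numerals
MACRON_LEFT_HALF = "\uFE24"   # opens a Bindestrich
MACRON_RIGHT_HALF = "\uFE25"  # closes a Bindestrich


def overline_inside_bindestrich(text: str) -> list[int]:
    """Return the indexes of U+0305 marks found inside a Bindestrich run."""
    positions = []
    inside = False
    for index, ch in enumerate(text):
        if ch == MACRON_LEFT_HALF:
            inside = True
        elif ch == MACRON_RIGHT_HALF:
            inside = False
        elif ch == OVERLINE and inside:
            positions.append(index)
    return positions


# U+0305 used where U+FE26 was expected: flagged.
print(overline_inside_bindestrich("\u2C93\uFE24\u2C8F\u0305\u2C99\uFE25"))
# Correctly encoded Bindestrich and a plain numeral: nothing flagged.
print(overline_inside_bindestrich("\u2C93\uFE24\u2C8F\uFE26\u2C99\uFE25"))
print(overline_inside_bindestrich("\u2C81\u0305"))
```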
Special attention must be paid to diacritics, particularly to superlinear strokes (see above). When converting from CMCL to Unicode, the converter works properly and guesses the correct form in most cases. It will, however, fail to convert correctly in the opposite direction, especially in the more complex cases.
For example, CMCL a_ (Coptonew: a_) will be correctly converted to Antinoou ⲁ̄ (Unicode U+2C81 U+0304), but the reverse conversion does not restore the original: Antinoou ⲁ̄ (Unicode U+2C81 U+0304) will be converted to CMCL a+ (Coptonew: a+). This should not be considered a bug, and no fix will be provided in the future.
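One way to picture why the round trip cannot be exact is a many-to-one mapping: if two CMCL spellings produce the same Unicode sequence, the reverse step has to pick one of them. The Python sketch below uses two small, purely hypothetical tables. They are not the converter's real tables, and the assumption that CMCL a+ also maps to U+2C81 U+0304 is inferred from the example above, not documented.

```python
# Hypothetical forward table: two CMCL inputs, one Unicode output.
# The "a+" entry is an assumption made for this illustration.
TO_UNICODE = {
    "a_": "\u2C81\u0304",
    "a+": "\u2C81\u0304",
}

# Hypothetical reverse table: only one CMCL form can be chosen.
FROM_UNICODE = {
    "\u2C81\u0304": "a+",
}


def round_trip(cmcl: str) -> str:
    """Convert CMCL to Unicode and back using the tables above."""
    return FROM_UNICODE[TO_UNICODE[cmcl]]


print(round_trip("a_"))  # the original "a_" is not recovered
print(round_trip("a+"))
```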
The same is true for other combinations, e.g.:
CMCL (Coptonew) | Antinoou (Unicode) | CMCL (Coptonew) |
---|
ASCII shortcut | Unicode output |
---|
Regex | Verbose explanation | Meaning | Replace policy | Examples |
---|---|---|---|---|
`&([0-9]{1,2})n;` | an integer of one or two digits followed by n | Lacuna of known length | plus-minus character (±, U+00B1) followed by the number of missing characters, enclosed in brackets | &2n; = [±2] |
`&([0-9]{1,2})\?;` | an integer of one or two digits followed by ? | Lacuna of supposed length | space and dot repeated for the supposed length, enclosed in parentheses | &2?; = ( . .) |
`&\?(cap\|capitale);` | ? followed by the string cap or capitale | Unknown capital character | space followed by dot (same output as entity &1?;) | &?cap; = . |
`&[0-9]{1,2}b;` | an integer of one or two digits followed by b | Blank space of known length | Not to be rendered | &2b; = |
`&([a-z]{1})\?;` | one alphabetic character followed by a question mark | Uncertain alphabetic character | The alphabetic character followed by a subliteral dot (U+0323) | &a?; = ạ |
`&coppa;` | string: coppa | Character coppa | Character coppa (U+03D9) | &coppa; = ϙ |
`&(basilios\|Crs\|Cs\|eiote\|` | One of the following strings (comma separated): basilios, Crs, Cs, eiote, ekklHsia, fq, i:lHm, iHl, iHs, ilHm, is, isrl, iws, js, monaCos, oute, pna | The same string (CMCL encoding system) converted to Unicode | | &ekklHsia; = ⲉⲕⲕⲗⲏⲥⲓⲁ |
`&ebol_compresso;` | string: ebol_compresso | CMCL's ebol equivalent in Unicode | ⲉⲃⲟⲗ | &ebol_compresso; = ⲉⲃⲟⲗ |
`&etcompresso;` | string: etcompresso | CMCL's et equivalent in Unicode | ⲉⲧ | &etcompresso; = ⲉⲧ |
`&Hspir;` | string: Hspir | Heta with combining dot above (U+2C8F U+0307) | ⲏ̇ | &Hspir; = ⲏ̇ |
`&.b;` | | Simple dot | . | &.b; = . |
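
As an illustration of how the replace policies in this table could be applied, here is a minimal Python sketch covering three of the entities (lacuna of known length, lacuna of supposed length, uncertain character). It is not the converter's own code: the names `RULES` and `expand_entities` are hypothetical, and the sketch leaves out the conversion of the letters themselves from CMCL to Unicode.

```python
import re

# Each rule pairs a regex from the table with a function building its replacement.
RULES = [
    # &2n; -> plus-minus sign and the number of missing characters, in brackets
    (re.compile(r"&([0-9]{1,2})n;"),
     lambda m: f"[\u00B1{m.group(1)}]"),
    # &2?; -> " ." repeated for the supposed length, in parentheses
    (re.compile(r"&([0-9]{1,2})\?;"),
     lambda m: "(" + " ." * int(m.group(1)) + ")"),
    # &a?; -> the letter followed by a combining dot below (U+0323)
    (re.compile(r"&([a-z]{1})\?;"),
     lambda m: m.group(1) + "\u0323"),
]


def expand_entities(text: str) -> str:
    """Apply every rule in turn to `text` (hypothetical helper)."""
    for pattern, build in RULES:
        text = pattern.sub(build, text)
    return text


print(expand_entities("&2n;"))
print(expand_entities("&2?;"))
print(expand_entities("&a?;"))
```

The three patterns do not overlap (digits plus n, digits plus ?, one letter plus ?), so the order in which the rules run does not matter in this sketch.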
cmcl2unicode is open-source software, available for download or fork on GitHub. Please report any issue you might encounter here.
This software is archived in Zenodo. Please cite it by referring to the DOI: 10.5281/zenodo.76262299
An Archaeological Atlas of Coptic Literature Literary Texts in their Geographical Context: Production, Copying, Usage, Dissemination and Preservation