This document doesn't pretend to be the written Torah for a font designer. It is just a compilation of many notes, both paper and electronic, which used to clutter my desk and computer, and which would inevitably have been lost had I not organized them into a single document and put it on the Web. The primary purpose of this document is to serve me. But since Culmus is an open-source project, and the notion of "source" for artistic items is quite obscure, I hereby declare this document to be part of the source. As such, it immediately attains the right to be published here.
Basic Hebrew characters - see chart
The Unicode standard reserves a range of 112 characters in 0x0590-0x05FF. This range includes basic Hebrew letters with final forms, diacritics and cantillation marks (Tiberian system), special Hebrew punctuation (maqaf, sof-pasuk, geresh and gershayim) and Yiddish digraphs. Please note that complete Yiddish support also requires four vowels from Alphabetic Presentation Forms.
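The figures above are easy to check. The following is a minimal sketch, assuming Python 3 and its standard `unicodedata` module; the constant names are mine and purely illustrative. It counts the code points in the reserved range and picks the final forms out of the basic letters by their Unicode names.

```python
import unicodedata

# Ranges quoted in the text above.
HEBREW_BLOCK = range(0x0590, 0x05FF + 1)    # whole reserved range
HEBREW_LETTERS = range(0x05D0, 0x05EA + 1)  # basic letters, final forms included

block_size = len(HEBREW_BLOCK)
letters = [chr(cp) for cp in HEBREW_LETTERS]

# Final forms carry the word FINAL in their Unicode character names.
final_forms = [ch for ch in letters if "FINAL" in unicodedata.name(ch)]

print(f"block size: {block_size}")
print(f"letters: {len(letters)}, of which final forms: {len(final_forms)}")
for ch in final_forms:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Note that the block size counts reserved positions, not assigned characters; how many positions are actually assigned depends on the Unicode version.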
Hebrew ligatures and special forms - see chart
Another range of Unicode is called "Alphabetic Presentation Forms" and part of it (0xFB1D - 0xFB4F) is devoted to Hebrew ligatures and special forms. This includes all letters combined with dagesh, several Yiddish and Ladino ligatures, wide letters and special forms.
Notes: (1) The Yiddish language requires the following vowels: 0xFB1D, 0xFB1F, 0xFB2E, 0xFB2F.
(2) The only Ladino ligature included in Unicode is aleph-lamed (0xFB4F). Unlike this one, other common ligatures can be produced using the basic forms.
(3) The Unicode standard doesn't define the purpose of the alternative ayin (0xFB20). Contrary to the official chart, I draw the outline of this glyph so that it doesn't descend below the baseline. This form of the letter ayin can be used when you need to position a diacritical mark below it.
A. A New Israeli Sheqel sign is defined at 0x20AA.
B. The Microsoft specification defines five combinations (0xE801 - 0xE805) in the Private Use Area, which include vav with holam haser, final kaf with shva and qamats, and lamed with holam haser with or without dagesh. All these combinations can be produced by an OpenType rendering engine and are therefore not necessary, but for convenience I include final kaf with shva and qamats (0xE802 - 0xE803).
C. Microsoft also recommends including the following characters: LTR (0x200E), RTL (0x200F) and dotted circle (0x25CC). Their reasoning is described at http://www.microsoft.com/typography/otfntdev/hebrewot/other.htm .
D. In the spirit of Microsoft's recommendation, I also include the following characters: Zero Width Non-Joiner (0x200C) and Zero Width Joiner (0x200D). Their purpose is as follows: ZWJ may force the ligation of aleph and lamed into the aleph-lamed ligature, which would normally not occur. ZWNJ may prevent the same conversion when the "JUD " (Ladino) language tag is in use, which would otherwise force the ligation. Microsoft has an opinion on this topic too: http://www.microsoft.com/typography/otfntdev/glyphs.htm .
E. Always make sure that your font includes the character "zero" (0x0030), because the presence of this character declares the font as ASCII-enabled. If your font is not marked as ASCII-enabled, most software will use only its Hebrew part and substitute punctuation and digits from some other font.
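To make item D more concrete, here is a small sketch, assuming Python 3, of the two character sequences involved. The variable and function names are illustrative only. The strings merely encode the request; whether a ligature is actually formed or suppressed depends on the font's OpenType tables and on the rendering engine.

```python
ALEF = "\u05D0"
LAMED = "\u05DC"
ZWNJ = "\u200C"  # Zero Width Non-Joiner
ZWJ = "\u200D"   # Zero Width Joiner

def codepoints(text):
    """Return the code points of text in U+XXXX notation."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

# Asks for the aleph-lamed ligature where it would not normally occur.
request_ligature = ALEF + ZWJ + LAMED

# Asks to keep the letters separate where the ligature would otherwise form.
block_ligature = ALEF + ZWNJ + LAMED

print(codepoints(request_ligature))
print(codepoints(block_ligature))
```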
Biblical typesetting, apart from the difficulty of positioning diacritics properly and plenty of other problems, requires several very rare glyph forms. A very good explanation of some of them can be found on the site of Mordechai Pinchas Sofer, in the section "Scribal oddities".
The broken vav appears only once in the entire Bible (Numbers 25:12, in the word "shalom"). An explanation of its meaning can be found at http://www.sofer.co.uk/html/broken_vav.html.
The reversed nun (nun hafucha) appears twice in the Bible (Numbers 10:35-36). An explanation can be found at http://www.sofer.co.uk/html/nun_hafucha.html. Note that the tradition doesn't define whether the letter should be flipped upside down or mirrored right-to-left. In my edition of the Bible, the nun is simply turned 180°, probably because this way the typesetter could reuse a common glyph and avoid producing a special one.
Some designers argue that the bowed lamed was introduced in order to squeeze more lines onto a sheet of paper in the times when paper was rare and expensive. For this reason, the bowed lamed nowadays makes the text look old-fashioned and is sometimes considered bad typographic style. If you still wish to use it, please download "Frank Curled Lamed" from the Developers' area.
I decided to introduce my own grouping of glyphs, to help users easily find out which features are supported by each font.
- 22 Hebrew letters + 5 final forms (U+05D0 - U+05EA)
- Digits (U+0030 - U+0039)
I sometimes tend to design old-style digits, which usually have descenders in 3, 4, 5, 7 and 9 and ascenders in 6 and 8. When the font has heavy horizontal elements (as in Frank-Ruehl and most classic Ashkenazi-style fonts), designing a digit such as 5 can be challenging, because its two horizontal strokes are too close to each other and produce an excessively black glyph. In this case, turning to the old-style form can give better results.
Basic: Exclamation mark (U+0021), double quote (U+0022), single quote (U+0027), comma (U+002C), hyphen (U+002D), period (U+002E), colon (U+003A), semicolon (U+003B), question mark (U+003F), ellipsis (U+2026).
Hebrew: Geresh (U+05F3), gershayim (U+05F4).
Dashes: En-dash (U+2013), em-dash (U+2014), direct speech dash (U+2015).
Quotes: Left quote (U+2018), right quote (U+2019), single base quote (U+201A), left double quote (U+201C), right double quote (U+201D), double base quote (U+201E).
In older books, which are mostly typeset in Drugulin and sometimes in Frank-Ruehl, double quotes and geresh/gershayim are usually aligned with the mean line of the letters. I deliberately choose to raise quotes considerably, as I want them to be distinctive and highly visible, just like any other punctuation. Geresh and gershayim are also raised, but to a smaller extent. In their case, one of my concerns is that in standalone counting a geresh can be confused with "yod".
Ordinary parentheses (U+0028, U+0029), left and right brackets (U+005B, U+005D)
- Mathematical symbols
Number sign (U+0023), percent sign (U+0025), asterisk (U+002A), plus (U+002B), slash and backslash (U+002F, U+005C), less-than, equals and greater-than signs (U+003C - U+003E), minus (U+2212), alternative plus (U+FB29).
People commonly use a hyphen instead of a minus, but a hyphen is not really a minus, and it is also significantly shorter. The minus has the same width and leads as the plus.
- Currency symbols
New sheqel (U+20AA), dollar sign (U+0024), euro (U+20AC), pound (U+00A3).
The new sheqel is naturally a must, and the dollar and euro are frequently used on Hebrew internet sites too. As for the pound, somebody once asked me for it, and I think it's nice.
- Diacritics (nikud, shin and sin dots, dagesh, rafe and varika)
- Precomposed forms with dagesh
- Forms of sin/shin with dot and with/without dagesh
- Yiddish and Ladino letters
- Microsoft precomposed forms (0xE801-0xE803)
- Misc symbols (NIS, zero-width spaces, dotted circle, alternative ayin and alternative plus)
- Cantillation marks
- Masoretic letterforms
- Wide forms
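The grouping above lends itself to a simple coverage check. The following is a minimal sketch, assuming Python 3 and assuming that the set of code points a font supports has already been extracted by some font tool; the names `GLYPH_GROUPS`, `coverage_report` and `example_font` are hypothetical, and only a few of the groups are shown. The code points are the ones listed in this document.

```python
# A few of the groups listed above, keyed by name (illustrative subset).
GLYPH_GROUPS = {
    "Hebrew letters": set(range(0x05D0, 0x05EA + 1)),
    "Digits": set(range(0x0030, 0x0039 + 1)),
    "Hebrew punctuation": {0x05F3, 0x05F4},
    "Dashes": {0x2013, 0x2014, 0x2015},
    "Currency symbols": {0x20AA, 0x0024, 0x20AC, 0x00A3},
}

def coverage_report(font_codepoints):
    """Map each group name to the sorted list of code points the font lacks."""
    return {
        name: sorted(required - font_codepoints)
        for name, required in GLYPH_GROUPS.items()
    }

# Hypothetical font: letters, digits and the sheqel sign only.
example_font = (
    set(range(0x05D0, 0x05EA + 1))
    | set(range(0x0030, 0x0039 + 1))
    | {0x20AA}
)

for name, missing in coverage_report(example_font).items():
    if missing:
        status = "missing " + ", ".join(f"U+{cp:04X}" for cp in missing)
    else:
        status = "complete"
    print(f"{name}: {status}")
```

An empty list for a group means the font covers that group completely; anything else tells you exactly which glyphs are still to be drawn.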
Some people say that Hebrew fonts don't need kerning. Indeed, non-kerned Hebrew fonts are quite bearable. Nevertheless, kerning never hurts, even considering that software rarely supports it. I will not present a list of the most recommended kerning pairs, because these can vary significantly with style. If you want an example, take a newspaper and look for the letters "יב" or "יג" in a headline set in Haim.
Now to the work...
First you will need a couple of pages of random garbage. I created a Perl script, heblorem.pl, which outputs such garbage in ISO-8859-8 encoding. The special feature of this script is that its output contains every possible combination of adjacent letters, so no pair can be missed. Now load the garbage into some program which supports kerned fonts (or doesn't, but only for the first pass), print it on a good laser printer, and start squinting. The rest is obvious.
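For readers who want to experiment without the Perl script, the idea can be sketched as follows. This is not a port of heblorem.pl, whose internals are not described here, but a minimal Python 3 illustration of the same property: every ordered pair of letters appears somewhere in the output. As a simplification, each pair is emitted as a two-letter "word", and final forms are allowed in any position. The function name and its parameters are my own invention.

```python
import random

# The 27 basic letters, final forms included (U+05D0 - U+05EA).
LETTERS = [chr(cp) for cp in range(0x05D0, 0x05EA + 1)]

def kerning_test_text(seed=0, pairs_per_line=12):
    """Return text in which every ordered pair of letters occurs as a word."""
    rng = random.Random(seed)
    pairs = [a + b for a in LETTERS for b in LETTERS]
    rng.shuffle(pairs)  # random order, so the page looks like garbage
    lines = [
        " ".join(pairs[i:i + pairs_per_line])
        for i in range(0, len(pairs), pairs_per_line)
    ]
    return "\n".join(lines)

text = kerning_test_text()

# Encode the same way the original script's output is described.
encoded = text.encode("iso-8859-8")
print(f"{len(text.split())} pairs, {len(encoded)} bytes")
```

Writing `encoded` to a file gives a test page that can be loaded into a layout program; a different `seed` gives a different arrangement of the same pairs.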