3. Fonts
It can seem like anarchy. There is an unknown number of fonts, each
encoded with its own table and driven by arbitrary keyboard
layouts and outputs. In my opinion, Tamil can seriously compete
with any other language for the maximum number of font tables.
Adding to this commotion are the dynamic fonts for web
pages, which let anyone get away with a non-standard font
as long as their pages are viewable.
On top of all this comes the official Indian Standard Code
for Information Interchange (ISCII), the Government of India
sponsored "unifying"
scheme to bring all Indian fonts under the Devanagari umbrella.
Anyone familiar with the way the characters are written in
Tamil and in Devanagari script will understand the lack of any
rationale in this approach.
Needless to say, this only adds to the
confusion. A good analysis of this and of Unicode for Tamil,
once again written by Sivaraj, can be found at
. For those not familiar with the Tamil script, a good
introduction written by Sivaraj is at
.
Let us ignore the anarchy for a moment and get a picture
of the frequently used font encodings. There are two main
contenders, and luckily they will converge soon. The first and
most popular is the Tamil Standard Code for Information
Interchange (TSCII), developed by volunteers throughout the world.
The other is the pair of TAmil Monolingual (TAM) and TAmil Bilingual
(TAB) encodings proposed by the Tamil Nadu Government.
TAM is of limited use in an OS environment and we can
safely ignore it. Almost all Linux efforts (Console, KDE and
GNOME localizations) use TSCII.
3.1. TSCII
TSCII is a glyph-based, 8-bit bilingual encoding. It
uses a unique set of glyphs: the usual lower ASCII set
(Roman letters with standard punctuation marks) occupies the
first 128 slots (0-127) and the Tamil glyphs occupy the
upper, 8-bit segment (slots 128-255). A good overview of the
early font encoding schemes and the rationale behind the TSCII
approach can be found at
http://www.geocities.com/Athens/5180/tscii.html.
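The slot layout described above can be illustrated with a small sketch.
It only relies on the split stated in this section (slots 0-127 hold the
ASCII set, slots 128-255 hold Tamil glyphs) and does not reproduce the
actual TSCII table. The function name split_tscii_runs and the sample
byte values are hypothetical, chosen purely for illustration.

```python
def split_tscii_runs(data: bytes):
    """Split an 8-bit byte string into (kind, bytes) runs.

    kind is "ascii" for slots 0-127 and "tamil" for slots 128-255.
    This only separates the two halves of the code page; it does
    not map any slot to a particular Tamil glyph.
    """
    runs = []
    for b in data:
        kind = "ascii" if b < 128 else "tamil"
        if runs and runs[-1][0] == kind:
            # Same half of the table as the previous byte: extend the run.
            runs[-1][1].append(b)
        else:
            # Crossing between the lower and upper half: start a new run.
            runs.append((kind, bytearray([b])))
    return [(kind, bytes(buf)) for kind, buf in runs]


# Arbitrary placeholder bytes in the upper half, not specific glyphs.
sample = b"Tamil: " + bytes([0xAB, 0xAC]) + b" ok"
for kind, chunk in split_tscii_runs(sample):
    print(kind, chunk)
```

Because the lower half is plain ASCII, English text in a TSCII-encoded
file stays readable even in a tool that knows nothing about TSCII; only
the runs marked "tamil" need a TSCII font or converter.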
The home URL for TSCII volunteers is
http://www.tamil.net/tscii.
This site discusses the TSCII
encoding and provides tools including fonts, keyboard
drivers, editors and inter-conversion tools for various
platforms. The font encoding table according to TSCII-1.6 can be
found at http://www.tamil.net/tscii/charset16.gif.
The current version of TSCII is 1.6, and a revision is
expected soon that will fix some anomalies in how
various slots are used for encoding. This version, 1.7, will be fully
backward compatible with 1.6 and is expected to gain
popularity. The
TSCII discussion group is currently brainstorming
modifications to TSCII-1.6. You may be able to participate in
the discussions by becoming a member, and you may also be
able to download various beta tools from there. The font encoding
table according to TSCII-1.7 (draft) can be found at
.
3.2. TAB
TAB is a character-based bilingual standard proposed by
the Government of Tamil Nadu. The TAB bilingual encoding table can be
found at http://www.tamilnet99.org/annex4.htm.
Tools for TAB encoding (mostly restricted to the Windows
platform) can also be downloaded from pages near this one.
3.3. Miscellaneous fonts and encodings
There are too many other fonts and encodings, and unfortunately they are
not well documented. Discussing them is beyond the scope of this document.