Friday, April 27, 2012

The Turkish Alphabet

The official Turkish alphabet is composed of the following 29 letters.

a b c ç d e f g ğ h ı i j k l m n o ö p r s ş t u ü v y z


The orphan letters

Like anything else that is official, this is not the complete story. Real-life Turkish text also contains the following letters.

â î û

These three letters, written with a circumflex, are long or stressed versions of a, i, and u. Their status is disputed in multiple ways. They are usually considered to be variations of a, i, and u instead of full members of the alphabet. Some people oppose their use altogether, while others use them regularly. TDK (the Turkish Language Association) changes the rules governing their use every decade or so. I am afraid the disputes are usually ideological rather than scientific in nature. This holds true for most discussions regarding the Turkish language.

Note that the circumflex is called düzeltme işareti, inceltme işareti, şapka işareti, or uzatma işareti in Turkish. The numerous names given to this diacritic are not unrelated to the confusion regarding its use.

The current set of rules governing the use of the circumflex is complex and not followed reliably. You may see the same word written using either âîû or their counterparts without the circumflex. If you are going to perform any processing on Turkish text, it is a good idea to convert âîû to aiu first.
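The conversion of âîû to aiu could be sketched in Python roughly as follows. The function name strip_circumflex is made up for this illustration, and the handling of capital letters is my own assumption (in particular, Î is mapped to the dotted capital İ because î is a variant of the dotted i). The text is normalized to NFC first so that a vowel followed by a separate combining circumflex (U+0302) is caught as well.

```python
import unicodedata

# Circumflexed vowels and their plain counterparts.
# The uppercase entries are an assumption of this sketch; Î is mapped to
# the dotted capital İ because î is a variant of the dotted i.
CIRCUMFLEX_MAP = str.maketrans({
    "â": "a", "î": "i", "û": "u",
    "Â": "A", "Î": "İ", "Û": "U",
})


def strip_circumflex(text: str) -> str:
    """Replace â, î, û (and their capitals) with the plain vowels."""
    # NFC composes "a" + U+0302 into the single character "â",
    # so both encodings of the same letter are handled by the table.
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(CIRCUMFLEX_MAP)
```

With this in place, strip_circumflex("hâlâ") gives "hala" and strip_circumflex("millî") gives "milli". Keep in mind that the conversion loses information, since a few word pairs differ only by the circumflex.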


The foreign letters

Turkish text may contain letters from other languages, usually to write foreign proper nouns such as the names of people and places. Here is a list of all foreign letters with frequencies exceeding 0.001%.

w x q


Notes

Letter frequencies were computed from columnist articles published in the Hurriyet and Zaman newspapers between 2001 and 2011.
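For readers who want to repeat a similar count on their own text, here is a rough Python sketch. It is not the script used for the numbers in this post. The function names are hypothetical, and it assumes that a frequency means the share of a letter among all letters in the text and that case is ignored. Lowercasing is done with the Turkish rules for I and İ, because Python's default lower() would turn I into i.

```python
from collections import Counter

TURKISH_ALPHABET = set("abcçdefgğhıijklmnoöprsştuüvyz")
ORPHAN_LETTERS = set("âîû")


def turkish_lower(text: str) -> str:
    # Turkish lowercases I to ı and İ to i; handle these before lower().
    return text.replace("I", "ı").replace("İ", "i").lower()


def letter_frequencies(text: str) -> dict:
    """Return each letter's share among all letters in the text."""
    letters = [ch for ch in turkish_lower(text) if ch.isalpha()]
    total = len(letters)
    if total == 0:
        return {}
    return {ch: n / total for ch, n in Counter(letters).items()}


def foreign_letters(text: str, threshold: float = 0.00001) -> list:
    """List letters outside the alphabet and âîû above the threshold.

    The default threshold of 0.00001 corresponds to 0.001%.
    """
    known = TURKISH_ALPHABET | ORPHAN_LETTERS
    freqs = letter_frequencies(text)
    return sorted(ch for ch, f in freqs.items()
                  if ch not in known and f > threshold)
```

For example, foreign_letters("New York ve Washington") returns ["w"]. On a short text almost any foreign letter will pass the threshold, so the 0.001% cutoff is only meaningful on a large corpus.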

See Also
http://en.wikipedia.org/wiki/Turkish_alphabet
http://en.wikipedia.org/wiki/Circumflex
http://tr.wikipedia.org/wiki/Düzeltme_işareti
