User:Rkamal1/Language selection

From LING073

I am a CS major, so I would like to partner with someone who is more familiar with linguistics; I have no preference as to who. My ranked language preferences are Central Kurdish, Gurani, and Mazandarani. All three are Indo-Iranian languages, the branch of Indo-European I am interested in working with.

Central Kurdish

  • ISO 639-3: ckb [1]
  • Status: 2 (Provincial)
  • Script: Variant of Arabic
  • No. of speakers: 4 million

Central (Sorani) Kurdish is spoken mostly in the Kurdistan Autonomous Region of Iraq, but it is also spoken in Iran and in parts of Sweden [2]. It is agglutinative and "highly inflectional," using both prefixes and suffixes [3]. Interestingly, the language has no grammatical gender.

I have a personal connection to the language as it is my native tongue, which makes it my first preference. On the other hand, it is an official language of Iraq [4] and has some Apertium support, so it may not be the quintessential under-resourced language for the purposes of this course. In particular, I cannot develop a morphological analyzer for the language, as one already exists. Still, I am listing it because Prof. Washington hinted at his willingness to make exceptions where there are personal reasons to choose a language.


Texts in Sorani Kurdish are easy to find in sources such as dictionaries [5] and newspapers [6]. I also already own PDFs of Sorani books.


Gurani

  • ISO 639-3: hac [7]
  • Status: 7 (Shifting)
  • Script: Variant of Arabic
  • No. of speakers: 300,000

Gurani (not to be confused with Guarani) is spoken in areas near the Iraq-Iran border such as Hewraman and Kermanshah. It is a variety of Kurdish that is similar to, but not mutually intelligible with, the Sorani dialect. Its morphology is agglutinative and inflectional, with both prefixes and suffixes [8]. Like Sorani, the language is genderless.


Many authentic texts in Gurani come in the form of epic poems and religious scripture, such as that of the Ahl-e Haqq. I am not sure how to access these online, but I may be able to find them with the help of the professor or TA. There is also an OLAC record that contains IPA transcriptions of Gurani audio [9].


Mazandarani

  • ISO 639-3: mzn [10]
  • Status: 5* (Developing)
  • Script: Variant of Arabic
  • No. of speakers: 2,270,000

Mazandarani is spoken mainly in Mazandaran Province, north of Tehran, Iran. Speakers tend to also know Farsi, the official language of Iran, but literacy in Mazandarani itself (their L1) is low. Use of the language is declining because children need to be educated in Farsi rather than Mazandarani.

From the limited information available on its morphology, the language appears to be agglutinative, but I cannot be sure. It certainly has inflectional morphology, using both prefixes and suffixes.


I have found a Mazandarani glossary [11] and a collection of Tabari (Mazandarani) poems [12].