Difference between revisions of "Central Kurdish"

From LING073
(Computational Resources)
(Developed Resources)
 
(48 intermediate revisions by 3 users not shown)
Line 3: Line 3:
 
Below is a list of resources relevant to the Sorani Kurdish language. Resources are categorized according to type of content. I own italicized resources in PDF format. I have also flagged resources I have not yet obtained.
 
  
== Computational Resources ==
+
For the [https://wikis.swarthmore.edu/ling073/Initial_corpus_assembly corpus assembly] lab, a [https://github.swarthmore.edu/Ling073-sp21/ling073-ckb-corpus repository]  of plain text files has been created using excerpts from some of the resources listed below.
 +
 
 +
== External Resources ==
 +
 
 +
=== Computational Resources ===
 +
 
 +
* [https://www.branah.com/kurdish Keyboard Layout] (seems to be quite common)
  
 
* .txt lists of words, word-level digrams, and character-level trigrams
 
  
* [https://www.branah.com/kurdish Keyboard Layout]: seems to be the common layout
+
* Latin-based phonetic keyboard layout that I own on my computer
 +
{{comment|: does this mean that it came with your OS or that it's a custom one you (or someone else?) created? -Jonathan}}
  
* Latin-based key layout that I own on my computer
+
* [https://arxiv.org/pdf/1811.10278.pdf Rule-based Kurdish Text Transliteration System]: Latin-Arabic orthography conversion
  
== Dictionaries ==
+
* [https://arxiv.org/abs/1911.13087 Kurdish Speech to Text]: early attempt to develop speech recognition for Sorani Kurdish
 +
 
 +
* [https://arxiv.org/abs/2010.06041 Towards Machine Translation for the Kurdish Language]: Sorani machine translation model
 +
 
 +
=== Dictionaries ===
  
 
* [https://sites.fas.harvard.edu/~iranian/Sorani/sorani_3_vocabulary.pdf Sorani Vocabulary]: vocab list with Latin script transliterations, by Harvard
 
  
* [https://www.thekurdishdictionary.com/en The Kurdish Dictionary]: translates to and from any combination of ckb/kur/engl via query
+
* [https://www.thekurdishdictionary.com/en The Kurdish Dictionary]: translates to and from any combination of ckb/kur/eng via query
  
* ''Dictionnaire Fondamental Kurde-Français-Sorani'': French-Sorani dictionary with Latin transliterations and common phrases
+
* ''Dictionary of Scientific Terms'': includes Sorani definitions of terms like "atom"
  
== Grammatical Descriptions ==
+
* ''Dictionnaire Fondamental Kurde-Français-Sorani'': French-Sorani dict with phrases & alphabet
  
== Scientific Works ==
+
=== Grammatical Descriptions ===
  
== Corpora ==
+
* [https://en.wikipedia.org/wiki/Sorani_grammar Sorani Grammar]: high-level description of important grammatical properties of Sorani
  
=== Books and Encyclopediae ===
+
* [https://ia600901.us.archive.org/13/items/KurdishBasicCourseDialectOfSulaimaniaIraq/Kurdish%20Basic%20Course%20(Dialect%20of%20Sulaimania,%20Iraq)_text.pdf Kurdish Basic Course]: has helpful chapters on grammar and word lists
  
* [https://ckb.wikipedia.org/wiki/%D8%AF%DB%95%D8%B3%D8%AA%D9%BE%DB%8E%DA%A9 Sorani Wikipedia]: many articles averaging a few paragraphs in length
+
* [https://sites.fas.harvard.edu/~iranian/Sorani/sorani_1_grammar.pdf A Reference Grammar with Selected Readings]: extensive descriptions of Sorani grammar
  
* [https://www.bible.com/bible/503/GEN.1.KSS Sorani Bible]: one of a few different versions
+
=== Scientific Works ===
  
* [https://www.jw.org/ckb/ JW Website]: few dozen entries accompanied by voice narration
+
* [https://arxiv.org/pdf/1809.10763.pdf Building a Lemmatizer and a Spell-Checker for Sorani Kurdish]: includes background on Sorani morphology
  
* [https://www.hawlergov.org/app/ Hawler Gov]: various entries on the Kurdish capital
+
* [https://www.loc.gov/catdir/cpso/romanization/kurdish.pdf Kurdish Romanization Table]: potentially useful information on Sorani alphabet and orthography
  
* ''شازاده چکۆله'': Sorani version of ''The Little Prince'', by Aso Abdullah
+
* [https://www.aclweb.org/anthology/W17-1208.pdf Kurdish Interdialect Machine Translation]: suggests a method for Kurmanji-Sorani machine translation
  
=== News Sites ===
+
* [https://www.researchgate.net/publication/261379031_Building_a_Test_Collection_for_Sorani_Kurdish Building a Test Collection for Sorani Kurdish]: outlines a Test Collection project + list of affixes
  
This is an aggregation of news sites, all written in Sorani, that contain written media on the order of ~300 words per article, though long-form works of journalism can also be found.
+
=== Corpora ===
  
* [https://www.knnc.net/Details.aspx?jimare=35292 KNN]
+
* [http://www.language-archives.org/language/ckb OLAC Resources] (how do I access these?)
 +
{{comment|: Try clicking the links on that page and then looking for "Identifier (URI)" - Daniel}}
  
* [https://nrttv.com/News.aspx?id=40606&MapID=3 NRT]
+
* [https://ckb.wikipedia.org/wiki/%D8%AF%DB%95%D8%B3%D8%AA%D9%BE%DB%8E%DA%A9 Sorani Wikipedia]: many articles averaging a few paragraphs in length
  
* [https://www.rudaw.net/ Rudaw]
+
* [https://www.bible.com/bible/503/GEN.1.KSS Sorani Bible]: one of a few different versions of the Bible
  
* [https://www.xendan.org/?__cf_chl_jschl_tk__=40b26372944f5b2f5030a42ce2e242824802bf7f-1613677479-0-AdIOUD4PpqZJRxo5UoABrk12U7vL-b-MImh9McpzkptoSp9XEfaSUWZ8CMCghXsnze0cZ_uNou_nz_-dlrtYGSZedfD70albLiukFs_f9mQpfQ4eZtm6GftmTY4oPlVII7xBMhGVVlZdd-y4EqXa0jcXmmvumK-YPYb-72mOZe6Qmfr-Jz_fHbzlznT-kWYV93t-VPAMTzKTp_HzaVf5G3b7dkBX2EpBV302Z28AIubfw293-cWCuVQUOn8cewTE01fknrRdM3p8-m32G7wy1YleNKq-6OR8CuMBYzJ2jxvBmniY9NYxlQ-AUlApfeHigQ Xendan]
+
* [https://www.jw.org/ckb/ JW Website]: few dozen entries accompanied by voice narration
 +
 
 +
* [https://www.hawlergov.org/app/ Hawler Gov]: various entries on the Kurdish capital
 +
 
 +
* ''شازاده چکۆله'': Sorani version of ''The Little Prince'', by Aso Abdullah
  
* [https://www.awene.com/detail?article=44481 Awene]
+
* News sites: [https://www.awene.com/detail?article=44481 Awene], [https://www.knnc.net/Details.aspx?jimare=35292 KNN], [https://nrttv.com/News.aspx?id=40606&MapID=3 NRT], [https://www.rudaw.net/ Rudaw], [https://www.xendan.org/?__cf_chl_jschl_tk__=40b26372944f5b2f5030a42ce2e242824802bf7f-1613677479-0-AdIOUD4PpqZJRxo5UoABrk12U7vL-b-MImh9McpzkptoSp9XEfaSUWZ8CMCghXsnze0cZ_uNou_nz_-dlrtYGSZedfD70albLiukFs_f9mQpfQ4eZtm6GftmTY4oPlVII7xBMhGVVlZdd-y4EqXa0jcXmmvumK-YPYb-72mOZe6Qmfr-Jz_fHbzlznT-kWYV93t-VPAMTzKTp_HzaVf5G3b7dkBX2EpBV302Z28AIubfw293-cWCuVQUOn8cewTE01fknrRdM3p8-m32G7wy1YleNKq-6OR8CuMBYzJ2jxvBmniY9NYxlQ-AUlApfeHigQ Xendan]: aggregation of sites with ~300 words/article
  
=== Blog Pages ===
+
* [https://sites.google.com/site/abasshiwan2/ Abas Shiwan's blog] (with Latin script transliterations)
  
Long-form posts that exceed ~1000 words can be found here. Each link goes to a specific person's blog archive, containing anywhere between 10-100 articles.
+
* [http://ala-hooshiar.blogspot.com/ Ala Hooshiar's blog]
  
* [https://sites.google.com/site/abasshiwan2/ Abas Shiwan] (with Latin script transliterations)
+
* [http://amjad-shakely.blogspot.com/ Amjad Shakely's blog]
  
* [http://ala-hooshiar.blogspot.com/ Ala Hooshiar]
+
== Developed Resources ==
  
* [http://amjad-shakely.blogspot.com/ Amjad Shakely]
+
* [https://github.swarthmore.edu/Ling073-sp21/ling073-ckb-keyboard Keyboard] (see [https://wikis.swarthmore.edu/ling073/Central_Kurdish/Keyboard Wiki page])
 +
* [https://github.swarthmore.edu/Ling073-sp21/ling073-ckb Transducer] (see [https://wikis.swarthmore.edu/ling073/Central_Kurdish/Transducer Wiki page])
 +
* [https://github.swarthmore.edu/Ling073-sp21/ling073-ckb-eng Resources for Kurdish-English Machine Translation] (see [https://wikis.swarthmore.edu/ling073/Central_Kurdish_and_English Wiki page])

Latest revision as of 08:56, 13 April 2021


Below is a list of resources relevant to the Sorani Kurdish language. Resources are categorized according to type of content. I own italicized resources in PDF format. I have also flagged resources I have not yet obtained.

For the corpus assembly lab, a repository of plain text files has been created using excerpts from some of the resources listed below.

External Resources

Computational Resources

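  • Keyboard Layout (seems to be quite common)
  • .txt lists of words, word-level digrams, and character-level trigrams
  • Latin-based phonetic keyboard layout that I own on my computer
    Comment: Does this mean that it came with your OS or that it's a custom one you (or someone else?) created? -Jonathan
  • Rule-based Kurdish Text Transliteration System: Latin-Arabic orthography conversion
  • Kurdish Speech to Text: early attempt to develop speech recognition for Sorani Kurdish
  • Towards Machine Translation for the Kurdish Language: Sorani machine translation model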

Dictionaries

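  • Sorani Vocabulary: vocab list with Latin script transliterations, by Harvard
  • The Kurdish Dictionary: translates to and from any combination of ckb/kur/eng via query
  • Dictionary of Scientific Terms: includes Sorani definitions of terms like "atom"
  • Dictionnaire Fondamental Kurde-Français-Sorani: French-Sorani dictionary with phrases and alphabet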

Grammatical Descriptions

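  • Sorani Grammar: high-level description of important grammatical properties of Sorani
  • Kurdish Basic Course: has helpful chapters on grammar and word lists
  • A Reference Grammar with Selected Readings: extensive descriptions of Sorani grammar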

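Scientific Works

  • Building a Lemmatizer and a Spell-Checker for Sorani Kurdish: includes background on Sorani morphology
  • Kurdish Romanization Table: potentially useful information on Sorani alphabet and orthography
  • Kurdish Interdialect Machine Translation: suggests a method for Kurmanji-Sorani machine translation
  • Building a Test Collection for Sorani Kurdish: outlines a test collection project and includes a list of affixes

Corpora

  • OLAC Resources (how do I access these?)
    Comment: Try clicking the links on that page and then looking for "Identifier (URI)" - Daniel
  • Sorani Wikipedia: many articles averaging a few paragraphs in length
  • Sorani Bible: one of a few different versions of the Bible
  • JW Website: a few dozen entries accompanied by voice narration
  • Hawler Gov: various entries on the Kurdish capital
  • شازاده چکۆله: Sorani version of The Little Prince, by Aso Abdullah
  • News sites (Awene, KNN, NRT, Rudaw, Xendan): aggregation of sites with ~300 words per article
  • Abas Shiwan's blog (with Latin script transliterations)
  • Ala Hooshiar's blog
  • Amjad Shakely's blog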

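Developed Resources

  • Keyboard (see Wiki page)
  • Transducer (see Wiki page)
  • Resources for Kurdish-English Machine Translation (see Wiki page)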