Difference between revisions of "Central Kurdish"
From LING073
Below is a list of resources relevant to the Sorani Kurdish language. Resources are categorized according to type of content. I own italicized resources in PDF format. I have also flagged resources I have not yet obtained. | Below is a list of resources relevant to the Sorani Kurdish language. Resources are categorized according to type of content. I own italicized resources in PDF format. I have also flagged resources I have not yet obtained. | ||
− | == Computational Resources == | + | For the [https://wikis.swarthmore.edu/ling073/Initial_corpus_assembly corpus assembly] lab, a [https://github.swarthmore.edu/Ling073-sp21/ling073-ckb-corpus repository] of plain text files has been created using excerpts from some of the resources listed below. |
+ | |||
+ | == External Resources == | ||
+ | |||
+ | === Computational Resources === | ||
+ | |||
+ | * [https://www.branah.com/kurdish Keyboard Layout] (seems to be quite common) | ||
* .txt lists of words, word-level digrams, and character-level trigrams | * .txt lists of words, word-level digrams, and character-level trigrams | ||
− | + | * Latin-based phonetic keyboard layout that I own on my computer | |
− | + | {{comment|: does this mean that it came with your OS or that it's a custom one you (or someone else?) created? -Jonathan}} | |
− | * Latin-based keyboard layout that I own on my computer | ||
* [https://arxiv.org/pdf/1811.10278.pdf Rule-based Kurdish Text Transliteration System]: Latin-Arabic orthography conversion | * [https://arxiv.org/pdf/1811.10278.pdf Rule-based Kurdish Text Transliteration System]: Latin-Arabic orthography conversion | ||
Line 17: | Line 22: | ||
* [https://arxiv.org/abs/2010.06041 Towards Machine Translation for the Kurdish Language]: Sorani machine translation model | * [https://arxiv.org/abs/2010.06041 Towards Machine Translation for the Kurdish Language]: Sorani machine translation model | ||
− | == Dictionaries == | + | === Dictionaries === |
* [https://sites.fas.harvard.edu/~iranian/Sorani/sorani_3_vocabulary.pdf Sorani Vocabulary]: vocab list with Latin script transliterations, by Harvard | * [https://sites.fas.harvard.edu/~iranian/Sorani/sorani_3_vocabulary.pdf Sorani Vocabulary]: vocab list with Latin script transliterations, by Harvard | ||
* [https://www.thekurdishdictionary.com/en The Kurdish Dictionary]: translates to and from any combination of ckb/kur/eng via query | * [https://www.thekurdishdictionary.com/en The Kurdish Dictionary]: translates to and from any combination of ckb/kur/eng via query | ||
+ | |||
+ | * ''Dictionary of Scientific Terms'': includes Sorani definitions of terms like "atom" | ||
* ''Diccionaire Fondamental Kurde-Français-Sorani'': French-Sorani dict with phrases & alphabet | * ''Diccionaire Fondamental Kurde-Français-Sorani'': French-Sorani dict with phrases & alphabet | ||
− | == Grammatical Descriptions == | + | === Grammatical Descriptions === |
+ | |||
+ | * [https://en.wikipedia.org/wiki/Sorani_grammar Sorani Grammar]: high-level description of important grammatical properties of Sorani | ||
* [https://ia600901.us.archive.org/13/items/KurdishBasicCourseDialectOfSulaimaniaIraq/Kurdish%20Basic%20Course%20(Dialect%20of%20Sulaimania,%20Iraq)_text.pdf Kurdish Basic Course]: has helpful chapters on grammar and word lists | * [https://ia600901.us.archive.org/13/items/KurdishBasicCourseDialectOfSulaimaniaIraq/Kurdish%20Basic%20Course%20(Dialect%20of%20Sulaimania,%20Iraq)_text.pdf Kurdish Basic Course]: has helpful chapters on grammar and word lists | ||
Line 31: | Line 40: | ||
* [https://sites.fas.harvard.edu/~iranian/Sorani/sorani_1_grammar.pdf A Reference Grammar with Selected Readings]: extensive descriptions of Sorani grammar | * [https://sites.fas.harvard.edu/~iranian/Sorani/sorani_1_grammar.pdf A Reference Grammar with Selected Readings]: extensive descriptions of Sorani grammar | ||
− | == Scientific Works == | + | === Scientific Works === |
* [https://arxiv.org/pdf/1809.10763.pdf Building a Lemmatizer and a Spell-Checker for Sorani Kurdish]: includes background on Sorani morphology | * [https://arxiv.org/pdf/1809.10763.pdf Building a Lemmatizer and a Spell-Checker for Sorani Kurdish]: includes background on Sorani morphology | ||
Line 39: | Line 48: | ||
* [https://www.aclweb.org/anthology/W17-1208.pdf Kurdish Interdialect Machine Translation]: suggests a method for Kurmanji-Sorani machine translation | * [https://www.aclweb.org/anthology/W17-1208.pdf Kurdish Interdialect Machine Translation]: suggests a method for Kurmanji-Sorani machine translation | ||
− | + | * [https://www.researchgate.net/publication/261379031_Building_a_Test_Collection_for_Sorani_Kurdish Building a Test Collection for Sorani Kurdish]: outlines a Test Collection project + list of affixes | |
− | === | + | === Corpora === |
+ | |||
+ | * [http://www.language-archives.org/language/ckb OLAC Resources] (how do I access these?) | ||
+ | {{comment|: Try clicking the links on that page and then looking for "Identifier (URI)" - Daniel}} | ||
* [https://ckb.wikipedia.org/wiki/%D8%AF%DB%95%D8%B3%D8%AA%D9%BE%DB%8E%DA%A9 Sorani Wikipedia]: many articles averaging a few paragraphs in length | * [https://ckb.wikipedia.org/wiki/%D8%AF%DB%95%D8%B3%D8%AA%D9%BE%DB%8E%DA%A9 Sorani Wikipedia]: many articles averaging a few paragraphs in length | ||
Line 53: | Line 65: | ||
* ''شازاده چکۆله'': Sorani version of ''The Little Prince'', by Aso Abdullah | * ''شازاده چکۆله'': Sorani version of ''The Little Prince'', by Aso Abdullah | ||
− | + | * News sites: [https://www.awene.com/detail?article=44481 Awene], [https://www.knnc.net/Details.aspx?jimare=35292 KNN], [https://nrttv.com/News.aspx?id=40606&MapID=3 NRT], [https://www.rudaw.net/ Rudaw], [https://www.xendan.org/?__cf_chl_jschl_tk__=40b26372944f5b2f5030a42ce2e242824802bf7f-1613677479-0-AdIOUD4PpqZJRxo5UoABrk12U7vL-b-MImh9McpzkptoSp9XEfaSUWZ8CMCghXsnze0cZ_uNou_nz_-dlrtYGSZedfD70albLiukFs_f9mQpfQ4eZtm6GftmTY4oPlVII7xBMhGVVlZdd-y4EqXa0jcXmmvumK-YPYb-72mOZe6Qmfr-Jz_fHbzlznT-kWYV93t-VPAMTzKTp_HzaVf5G3b7dkBX2EpBV302Z28AIubfw293-cWCuVQUOn8cewTE01fknrRdM3p8-m32G7wy1YleNKq-6OR8CuMBYzJ2jxvBmniY9NYxlQ-AUlApfeHigQ Xendan]: aggregation of sites with ~300 words/article | |
− | + | * [https://sites.google.com/site/abasshiwan2/ Abas Shiwan's blog] (with Latin script transliterations) | |
− | + | * [http://ala-hooshiar.blogspot.com/ Ala Hooshiar's blog] | |
− | * [ | + | * [http://amjad-shakely.blogspot.com/ Amjad Shakely's blog] |
− | + | == Developed Resources == | |
− | * [ | + | * [https://github.swarthmore.edu/Ling073-sp21/ling073-ckb-keyboard Keyboard] (see [https://wikis.swarthmore.edu/ling073/Central_Kurdish/Keyboard Wiki page]) |
+ | * [https://github.swarthmore.edu/Ling073-sp21/ling073-ckb Transducer] (see [https://wikis.swarthmore.edu/ling073/Central_Kurdish/Transducer Wiki page]) | ||
+ | * [https://github.swarthmore.edu/Ling073-sp21/ling073-ckb-eng Resources for Kurdish-English Machine Translation] (see [https://wikis.swarthmore.edu/ling073/Central_Kurdish_and_English Wiki page]) |
Latest revision as of 09:56, 13 April 2021
Below is a list of resources relevant to the Sorani Kurdish language, categorized by type of content. Italicized resources are ones I own in PDF format. I have also flagged resources I have not yet obtained.
For the corpus assembly lab, a repository of plain text files has been created using excerpts from some of the resources listed below.
External Resources
Computational Resources
- Keyboard Layout (seems to be quite common)
- .txt lists of words, word-level digrams, and character-level trigrams
- Latin-based phonetic keyboard layout that I own on my computer
- does this mean that it came with your OS or that it's a custom one you (or someone else?) created? -Jonathan
- Rule-based Kurdish Text Transliteration System: Latin-Arabic orthography conversion
- Kurdish Speech to Text: early attempt to develop speech recognition for Sorani Kurdish
- Towards Machine Translation for the Kurdish Language: Sorani machine translation model
Dictionaries
- Sorani Vocabulary: vocab list with Latin script transliterations, by Harvard
- The Kurdish Dictionary: translates to and from any combination of ckb/kur/eng via query
- Dictionary of Scientific Terms: includes Sorani definitions of terms like "atom"
- Dictionnaire Fondamental Kurde-Français-Sorani: French-Sorani dictionary with phrases and alphabet
Grammatical Descriptions
- Sorani Grammar: high-level description of important grammatical properties of Sorani
- Kurdish Basic Course: has helpful chapters on grammar and word lists
- A Reference Grammar with Selected Readings: extensive descriptions of Sorani grammar
Scientific Works
- Building a Lemmatizer and a Spell-Checker for Sorani Kurdish: includes background on Sorani morphology
- Kurdish Romanization Table: potentially useful information on Sorani alphabet and orthography
- Kurdish Interdialect Machine Translation: suggests a method for Kurmanji-Sorani machine translation
- Building a Test Collection for Sorani Kurdish: outlines a Test Collection project + list of affixes
Corpora
- OLAC Resources (how do I access these?)
- Try clicking the links on that page and then looking for "Identifier (URI)" - Daniel
- Sorani Wikipedia: many articles averaging a few paragraphs in length
- Sorani Bible: one of a few different versions of the Bible
- JW Website: a few dozen entries accompanied by voice narration
- Hawler Gov: various entries on the Kurdish capital
- شازاده چکۆله: Sorani version of The Little Prince, by Aso Abdullah
- Abas Shiwan's blog (with Latin script transliterations)