Difference between revisions of "Central Kurdish and English/Lexical selection"

From LING073
Latest revision as of 22:45, 25 April 2021


ckb → eng

There are several cases of one-to-many mappings of lexemes from Sorani Kurdish to English. The objective of this assignment is to choose the correct mapping in each situation, depending on the context in which a lexeme is used. To pick a default for each pair of translations, I consulted the frequency distribution of English lemmas to see which one is more commonly used.

زمان → tongue, language

The word translates to "language" if the sentence contains the name of a language. The default translation is "tongue."

  • زمانی سەگەکە → "the dog's tongue"
  • زمانی کوردی → "the Kurdish language"
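
The rule above can be sketched in Python as a sentence-level context check. This is only an illustration, not the rule format used for the assignment: the function name `translate_ziman` and the short list of language names are assumptions made for the example, and tokens are matched as whole words after splitting on whitespace.

```python
# Illustrative set of language names (an assumption, not taken from the page):
# Kurdish, English, Arabic, Persian.
LANGUAGE_NAMES = {"کوردی", "ئینگلیزی", "عەرەبی", "فارسی"}


def translate_ziman(sentence: str) -> str:
    """Pick an English gloss for زمان from sentence-level context."""
    tokens = sentence.split()
    # "language" if any token in the sentence is a known language name.
    if any(token in LANGUAGE_NAMES for token in tokens):
        return "language"
    # Otherwise fall back to the default given in the text.
    return "tongue"


print(translate_ziman("زمانی کوردی"))
print(translate_ziman("زمانی سەگەکە"))
```

Exact token matching keeps the sketch short; inflected forms of a language name would need morphological analysis before the lookup.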

ژن → woman, wife

This is not a real solution; the default translation is simply set to "woman" because it is more frequent.

  • ژن و مێرد → "husband and wife"
  • ژنەکە هات → "the woman came"

بیر → (water) well, thought

The translation defaults to "thought" unless the word appears close to the Kurdish word for "water" or "oil," in which case it is "well."

  • بیری ئاو → "water well"
  • بیرێکی باش → "a good thought"
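
One way to read "close to" is as a fixed window of neighbouring tokens. The Python sketch below illustrates that reading under stated assumptions: the window size of two tokens, the function name `translate_bir`, and the exact-match context words ئاو ("water") and نەوت ("oil") are all choices made for the example, not details from the page.

```python
# Context words that trigger the "well" reading: "water" and "oil".
CONTEXT_WORDS = {"ئاو", "نەوت"}

# Assumed window size: how many tokens on each side count as "close".
WINDOW = 2


def translate_bir(tokens: list[str], index: int) -> str:
    """Pick an English gloss for the form of بیر at tokens[index]."""
    start = max(0, index - WINDOW)
    end = index + WINDOW + 1
    # Neighbouring tokens inside the window, excluding the word itself.
    neighbours = tokens[start:index] + tokens[index + 1:end]
    if any(token in CONTEXT_WORDS for token in neighbours):
        return "well"
    # Default given in the text.
    return "thought"


print(translate_bir("بیری ئاو".split(), 0))
print(translate_bir("بیرێکی باش".split(), 0))
```

A wider window catches more distant context words but also raises the chance of a spurious match, so the value would need tuning against real sentences.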