Central Kurdish/Transducer

From LING073
Revision as of 08:54, 19 March 2021 by Rkamal1 (talk | contribs)


Code

GitHub Repository

Evaluation

Stems

The number of stems in each category is listed below:

  • 8 N-Stems
  • 4 Definite/Plural
  • 4 Verbs_Inf (infinitives)
  • 4 V-Stems_1
  • 4 V-Stems_2
  • 6 Subject_Prn
  • 4 Imperatives
  • 6 Prns
  • 3 Adj-Stem
  • 2 Comparatives
  • 3 Prepositions
  • 3 Conjunctions
  • 2 Adverbs
  • 2 Npast

Coverage

The total coverage over the corpus was 7.3%. Three common words, for water, earth, and god, were then added (all just ن<n> ↔ {{{2}}}).
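For reference, coverage is taken here to mean the share of corpus tokens that receive at least one analysis. The sketch below shows one way such a figure could be computed. It assumes the analyser output is in Apertium stream format (`^surface/analysis$`, with unknown tokens marked by a leading `*`). The function name `coverage` and the sample tokens are hypothetical and are not taken from the repository.

```python
import re

# Assumption: one lexical unit looks like ^surface/analysis1/analysis2$
# and an unknown token looks like ^surface/*surface$ (Apertium stream format).
# Escaped characters inside a unit are not handled in this sketch.
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage(stream: str) -> float:
    """Return the fraction of tokens that received at least one analysis."""
    units = LEXICAL_UNIT.findall(stream)
    if not units:
        return 0.0
    # A token counts as known when its analysis part does not start with '*'.
    known = sum(1 for _surface, analyses in units if not analyses.startswith("*"))
    return known / len(units)


if __name__ == "__main__":
    # Hypothetical output for four tokens, two of them unknown.
    sample = "^a/a<n>$ ^b/*b$ ^c/c<v>/c<n>$ ^d/*d$"
    print(f"{coverage(sample):.1%}")
```

Counting tokens rather than distinct word forms means that adding a few very frequent words can move the figure noticeably.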

Tests

The transducer currently passes 70 of 101 tests (69%). It appears to handle noun morphology and most verb morphology well. The remaining 31 tests fail for the following reasons:

  • Some words containing the letter 'ە' appear to be encoded inconsistently in Unicode, which causes otherwise straightforward tests to fail.
  • The izafa enclitic was not implemented. All other grammar points were attempted in some way.
  • Some verbs, particularly هاتن (to come), have irregular stems and/or different imperative/non-past stems. Because only one lexicon was used for both types of verbs, this could not be accounted for.
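The encoding issue in the first point could be investigated by listing the code points of a failing word. The sketch below is only a diagnostic aid under a stated assumption: that the mismatch comes from 'ە' being typed as ARABIC LETTER HEH (U+0647) followed by a ZERO WIDTH NON-JOINER (U+200C) instead of ARABIC LETTER AE (U+06D5). This is a common substitution in Kurdish text, but it has not been confirmed as the cause here. The helper names `describe` and `normalize_ae` are hypothetical.

```python
import unicodedata

HEH = "\u0647"   # ARABIC LETTER HEH
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
AE = "\u06d5"    # ARABIC LETTER AE, the Central Kurdish vowel


def describe(word: str) -> list[str]:
    """List each character of a word as 'U+XXXX NAME' for inspection."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in word]


def normalize_ae(word: str) -> str:
    """Replace the assumed HEH + ZWNJ spelling with the single letter AE."""
    return word.replace(HEH + ZWNJ, AE)


if __name__ == "__main__":
    suspect = HEH + ZWNJ  # looks like the AE letter but is two code points
    print(describe(suspect))
    print(describe(normalize_ae(suspect)))
```

If the two spellings do turn out to be mixed, applying the same normalisation to the lexicon, the tests, and the corpus would keep them consistent with each other.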

Notes