Chechen/Transducer

From LING073

Transducer Evaluation

Transducer on GitHub

Transducer: https://github.swarthmore.edu/Ling073-sp19/ling073-che.git

Note

All 73 tests in the che.yaml file pass.

Hard-coded Forms

  • йоӏ<n><cl_j><abs><pl> ↔ мехкарий: girl, girls
  • стаг<n><cl_j2><abs><pl> ↔ нах: person, people
  • These two plural forms are hard-coded because they are irregular and do not follow any rule.
  • а<conj> ↔ а: and
  • This form is also hard-coded because conjunctions undergo no morphological change.
  • иза<prn><pers><abs><sg> ↔ иза: he/she/it
  • уьш<prn><pers><abs><pl> ↔ уьш: they
  • и<prn><pers><abs><pl> ↔ и: "that"
  • These forms are all pronouns.
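The entries above all follow the same "lemma<tags> ↔ surface form" notation. As a minimal sketch of how such a pair could be split into its parts for checking or tabulating, the snippet below uses a hypothetical helper, parse_pair, which is not part of the transducer or of any course tooling.

```python
import re

# Matches one "<tag>" inside an analysis string such as "стаг<n><cl_j2><abs><pl>".
TAG_RE = re.compile(r"<([^<>]+)>")


def parse_pair(line):
    """Split 'lemma<tag>... ↔ surface' into (lemma, tags, surface).

    Hypothetical helper for illustration only.
    """
    analysis, surface = (part.strip() for part in line.split("↔", 1))
    lemma = analysis.split("<", 1)[0]   # everything before the first tag
    tags = TAG_RE.findall(analysis)     # tag names without angle brackets
    return lemma, tags, surface


# Three of the hard-coded pairs listed above.
pairs = [
    "йоӏ<n><cl_j><abs><pl> ↔ мехкарий",
    "стаг<n><cl_j2><abs><pl> ↔ нах",
    "а<conj> ↔ а",
]
parsed = [parse_pair(p) for p in pairs]
```

Each element of parsed is a tuple of the lemma, the list of tags, and the surface form, which makes it easy to spot entries whose tag sets differ from what is expected.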

Additional Top Words

Top Words                             Coverage Before   Coverage After
а<conj> ↔ а                           1.59%             15.73%
ала<v><iv><past><rem><wit> ↔ элира    15.73%            17.03%
латта<v><iv><pres> ↔ лаьтта           17.03%            18.23%
Дала<np><abs><sg> ↔ Дала              18.23%            20.45%
масса<n><cl_j2><erg><sg> ↔ массо      20.45%            21.27%
иза<prn><pers><p3><pl><abs> ↔ иза     21.27%            22.09%
уьш<prn><pers><p3><pl><abs> ↔ уьш     22.09%            22.62%
и<prn><dem><dst> ↔ и                  22.62%            23.44%
тайпа<n><cl_d><dat><sg> ↔ тайпана     23.44%            24.17%
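The contribution of each added word is the difference between its two coverage figures. The following is a small sketch of that arithmetic using the values from the table; the names rows and gains are illustrative only, and Decimal is used so that the percentages subtract exactly.

```python
from decimal import Decimal

# (surface form, coverage before, coverage after), copied from the table above.
rows = [
    ("а", "1.59", "15.73"),
    ("элира", "15.73", "17.03"),
    ("лаьтта", "17.03", "18.23"),
    ("Дала", "18.23", "20.45"),
    ("массо", "20.45", "21.27"),
    ("иза", "21.27", "22.09"),
    ("уьш", "22.09", "22.62"),
    ("и", "22.62", "23.44"),
    ("тайпана", "23.44", "24.17"),
]


def gains(rows):
    """Return (form, gain in percentage points) for each table row."""
    return [(form, Decimal(after) - Decimal(before)) for form, before, after in rows]


for form, gain in gains(rows):
    print(f"{form}: +{gain} points")

# Total gain across all nine additions.
total_gain = sum(gain for _, gain in gains(rows))
```

On these figures the conjunction а accounts for 14.14 of the 22.58 percentage points gained, so most of the improvement comes from a single high-frequency word.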

Evaluation

  • Total number of stems in the transducer: 55
  • Current coverage over the combined corpus: 24.17%
  • The current list of top unknown words returned by aq-covtest:
      • 64 т
      • 60 у
      • 37 х
      • 22 Веза
      • 21 д
      • 20 Т
      • 20 аккха
      • 18 е
      • 18 ехь
      • 18 ера
      • 15 ду
      • 12 к
      • 12 хир
      • 12 аьлла
      • 11 хи
      • 11 дерриге
      • 11 адам
      • 10 де
      • 10 Иштта
      • 10 цу
  • Number of tests that pass in yaml files:

yaml file            # of passed tests   # of total tests
che.yaml             73                  73
common words.yaml    8                   34
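To make the coverage figure and the unknown-word list in this section concrete, here is a naive sketch that assumes coverage means the share of corpus tokens that receive at least one analysis. It does not reproduce how aq-covtest works internally: a plain set of known surface forms stands in for the compiled transducer, and coverage_report and the toy data are hypothetical.

```python
from collections import Counter


def coverage_report(tokens, known_forms, top_n=20):
    """Return (coverage in percent, most frequent unknown tokens).

    Assumption: a token counts as covered if it is in known_forms,
    which stands in for "the transducer returns an analysis".
    """
    unknown = Counter(t for t in tokens if t not in known_forms)
    total = len(tokens)
    covered = total - sum(unknown.values())
    pct = 100.0 * covered / total if total else 0.0
    return pct, unknown.most_common(top_n)


# Toy data for illustration, not taken from the actual corpus.
tokens = ["а", "иза", "т", "а", "у", "т", "уьш", "т"]
known = {"а", "иза", "уьш"}

pct, top_unknown = coverage_report(tokens, known)
print(f"coverage: {pct:.2f}%")
for word, count in top_unknown:
    print(count, word)
```

The printed list has the same "count word" shape as the unknown-word list above, with the most frequent unanalysed tokens first.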

Generator Evaluation

Initial Evaluation of Morphological Generation

# of passing analysis tests: 73
# of failing analysis tests: 73
Current coverage: 26.04%
# of passing generation tests: 69
# of failing generation tests: 68

Final Evaluation of Morphological Generation

# of passing generation tests: 68
# of failing generation tests: 7
# of twol rules added: 20
Current coverage: 23.88%