Berik/Transducer
From LING073
Code: https://github.swarthmore.edu/jspring1/ling073-bkl
Evaluation
- Current corpus coverage: 34.65%
- Number of stems: 112
  - This count excludes pronouns (4?) and verbs (3?)
- Top unknown words:
  - jei (84)
  - aa (64)
  - jeta (42)
  - jeiserem (36)
  - Jei (35)
  - ge (31)
  - taterisi (11)
  - aane (11)
  - jebe (9)
  - bosna (9)
  - anes (9)
  - asal (8)
  - asala (8)
  - Aamai (8)
  - temawer (8)
  - Jepga (8)
  - aaiserem (8)
  - enggame (7)
  - Taterisi (7)
  - Sanbagiri (6)
- Analyzer tests passing: 85/117 (72.65%)
- Generator tests passing: 85/152 (55.92%)
- Corpus tests passing: 14/33 (42.42%)
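The percentages above follow directly from the pass counts, and the unknown-word list is a frequency count of tokens that receive no analysis. The following is a minimal sketch of both calculations, not the course's actual evaluation script. The function names `pass_rate` and `coverage_report` are hypothetical, and the toy token list is illustrative only. Matching is case-sensitive, which is assumed here because the list above counts "jei" and "Jei" separately.

```python
from collections import Counter


def pass_rate(passed: int, total: int) -> float:
    """Percentage of passing tests, rounded to two decimal places."""
    return round(100 * passed / total, 2)


def coverage_report(tokens, known):
    """Naive coverage: share of tokens found in `known` (case-sensitive),
    plus the unknown tokens ranked by frequency."""
    unknown = Counter(t for t in tokens if t not in known)
    covered = len(tokens) - sum(unknown.values())
    return round(100 * covered / len(tokens), 2), unknown.most_common()


# Pass counts taken from the evaluation list above.
print(pass_rate(85, 117))  # analyzer tests
print(pass_rate(85, 152))  # generator tests
print(pass_rate(14, 33))   # corpus tests

# Toy input (hypothetical, not the real corpus).
toy_tokens = ["jei", "gam", "Yesus", "jei", "aa"]
toy_known = {"gam", "Yesus"}
print(coverage_report(toy_tokens, toy_known))
```

A real coverage figure would come from running the corpus through the transducer itself; the set lookup here only stands in for "the analyzer returned at least one analysis".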
Notes
Initial corpus coverage was 21.98%.
Adding the following entries raised coverage to 34.65%, a gain of 12.67 percentage points:
- jamere<locl> ↔ jamere
- Yesus<n> ↔ Yesus
- Yusuf<n> ↔ Yusuf
- Maria<n> ↔ Maria
- Daud<n> ↔ Daud
- angtane<n> ↔ angtane
- raja<n> ↔ raja
- taman<n> ↔ taman
- kapka<adj> ↔ kapka
- se<imp> ↔ se
- je<prn><pos> ↔ jemna
- je<prn><subj> ↔ jam
- je<prn>+wer<post> ↔ jewer
- gam<part> ↔ gam
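Each entry above pairs an analysis (lemma plus tags) with a surface form, and the transducer maps in both directions. As a rough illustration of that two-way mapping, the pairs can be loaded into a pair of lookup tables. This is a toy stand-in under stated assumptions, not how the transducer in the repository is built: the names `PAIRS`, `generate`, and `analyse` are hypothetical, and a plain dictionary cannot model productive morphology.

```python
# Analysis <-> surface-form pairs copied from the list above.
PAIRS = [
    ("jamere<locl>", "jamere"),
    ("Yesus<n>", "Yesus"),
    ("Yusuf<n>", "Yusuf"),
    ("Maria<n>", "Maria"),
    ("Daud<n>", "Daud"),
    ("angtane<n>", "angtane"),
    ("raja<n>", "raja"),
    ("taman<n>", "taman"),
    ("kapka<adj>", "kapka"),
    ("se<imp>", "se"),
    ("je<prn><pos>", "jemna"),
    ("je<prn><subj>", "jam"),
    ("je<prn>+wer<post>", "jewer"),
    ("gam<part>", "gam"),
]

# Generation direction: analysis -> surface form.
GENERATE = dict(PAIRS)

# Analysis direction: surface form -> list of analyses
# (a list, because one form may have several analyses in general).
ANALYSE = {}
for analysis, form in PAIRS:
    ANALYSE.setdefault(form, []).append(analysis)


def generate(analysis):
    """Return the surface form for an analysis, or None if unknown."""
    return GENERATE.get(analysis)


def analyse(form):
    """Return all analyses for a surface form (empty list if unknown)."""
    return ANALYSE.get(form, [])


print(generate("je<prn><pos>"))
print(analyse("jewer"))
print(analyse("jei"))  # still unanalysed, as in the unknown-word list
```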