Biak/Transducer
From LING073
Code
Our code is hosted in our GitHub repository.
Analyser Evaluation
- As of now, our transducer passes all 76 tests generated from our Wikipedia page.
- Our initial transducer was capable of analyzing approximately 20% of our corpus file.
- Additions following the initial analysis:
  - Added common conjunctions, which raised coverage of our original corpus to 21.86% and then to 22.6%.
  - Added common adverbs and the noun for "village", which brought coverage to 23.1%.
  - Added the verb "to say", which improved coverage to 23.9%.
Form | Meaning |
---|---|
ido<cnjsub> ↔ ido | when |
inja<cnjcoo> ↔ inja | so |
mnu<n> ↔ mnu | village |
wer<adv> ↔ wer | again |
kwar<adv> ↔ kwar | already |
kaku<adv> ↔ kaku | very |
obe<v><tv><p3><sg> ↔ dobe | say |
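
Each row of the table pairs an analysis (a stem followed by tags in angle brackets) with its surface form. As a rough illustration of that notation, the Python sketch below splits such a pair into its parts. The function name `parse_pair` and the layout of the returned tuple are assumptions made for this example and are not part of our transducer.

```python
import re

# Matches one tag in angle brackets, e.g. "<adv>".
TAG_RE = re.compile(r"<([^<>]+)>")

def parse_pair(row):
    """Split 'stem<tag>... ↔ surface' into (stem, tags, surface)."""
    # Raises ValueError if the row does not contain exactly one "↔".
    analysis, surface = (part.strip() for part in row.split("↔"))
    stem = analysis.split("<", 1)[0]
    tags = TAG_RE.findall(analysis)
    return stem, tags, surface

print(parse_pair("obe<v><tv><p3><sg> ↔ dobe"))
print(parse_pair("mnu<n> ↔ mnu"))
```

For the verb row this separates the stem `obe` from its four tags and keeps `dobe` as the surface form, which makes it easy to see where the analysis and the surface form differ.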
- Our transducer has 69 stems.
- Current coverage: 23.9%, 2,985 words.
- We pass 76 tests of bhw.yaml
- We pass 6 tests of commonwords.yaml
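
The coverage figures above are the share of corpus words that receive at least one analysis. A minimal sketch of that calculation follows, assuming a naive definition (analysed tokens divided by total tokens) and a plain set of known surface forms in place of the real transducer. The names `coverage`, `known_forms`, and `tokens` are hypothetical, and the tool we actually used may tokenise or count differently.

```python
def coverage(tokens, known_forms):
    """Return (analysed, total, percent) for a list of tokens."""
    total = len(tokens)
    if total == 0:
        raise ValueError("corpus is empty")
    analysed = sum(1 for tok in tokens if tok.lower() in known_forms)
    return analysed, total, 100.0 * analysed / total

# Surface forms taken from the table above; a real run would query the transducer.
known_forms = {"ido", "inja", "mnu", "wer", "kwar", "kaku", "dobe"}
# Placeholder corpus: "xxx" and "yyy" stand in for words with no analysis.
tokens = ["ido", "xxx", "mnu", "yyy"]

analysed, total, percent = coverage(tokens, known_forms)
print(f"{analysed}/{total} = {percent:.1f}%")  # 2/4 = 50.0%
```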
Generator Evaluation
Initial Evaluation of Morphological Generation
- Our initial run of the generation tests produced 107 tests, of which we passed 75 and failed 32.
- Initial coverage: 23.9%, 2,985 words.
Final Evaluation of Morphological Generation
- Some of the tests were removed as we homed in on the correct grammatical forms.
- After updates to our transducer, we passed 80 out of 86 tests.
- We could not pass the remaining 6 tests because they depend on a complex rule, involving the form of the preceding word in the sentence, that we do not yet have the tools to write.
- We added 3 twol rules.
- Final coverage: 31.06% - 4,442/14,148
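
A generation test checks that a given analysis produces the expected surface form. The sketch below shows the general idea in Python, under the assumption that each test is a simple analysis-to-form pair. The function `run_generation_tests` and the dictionary `toy_output` are hypothetical stand-ins and do not reflect how our actual test runner or generator is implemented.

```python
def run_generation_tests(expected, generate):
    """Compare generated surface forms with the expected ones."""
    passed, failed = [], []
    for analysis, surface in expected.items():
        (passed if generate(analysis) == surface else failed).append(analysis)
    return passed, failed

# Expected pairs taken from the table in the analyser section.
expected = {
    "ido<cnjsub>": "ido",
    "mnu<n>": "mnu",
    "obe<v><tv><p3><sg>": "dobe",
}

# Toy stand-in for the generator: it returns the bare stem for the verb,
# so the verb test fails on purpose.
toy_output = {
    "ido<cnjsub>": "ido",
    "mnu<n>": "mnu",
    "obe<v><tv><p3><sg>": "obe",
}

passed, failed = run_generation_tests(expected, toy_output.get)
print(f"passed {len(passed)}, failed {len(failed)}")  # passed 2, failed 1
```

Expressed as pass rates, the figures reported above are about 70.1% for the initial run (75 of 107) and about 93.0% for the final run (80 of 86).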