Our GitHub repository can be found here: Our GitHub
- As of now, our transducer passes 76/76 tests generated from our Wikipedia page.
- Our initial transducer was capable of analyzing approximately 20% of our corpus file.
- Additions following initial analysis
- Added common conjunctions, allowing our transducer to cover 21.86%, then 22.6% of our original corpus.
- Added common adverbs and the noun for "village", which brought coverage to 23.1%.
- Added the verb for "to say", which improved coverage to 23.9%.
|ido<cnjsub> ↔ ido||when|
|inja<cnjcoo> ↔ inja||so|
|mnu<n> ↔ mnu||village|
|wer<adv> ↔ wer||again|
|kwar<adv> ↔ kwar||already|
|kaku<adv> ↔ kaku||very|
|obe<v><tv><p3><sg> ↔ dobe||say|
- Our transducer has 69 stems.
- Current coverage: 23.9%, 2,985 words.
- We pass 76 tests in bhw.yaml.
- We pass 6 tests in commonwords.yaml.
Initial Evaluation of Morphological Generation
- On our initial run of the generation tests, 107 tests were generated; we passed 75 and failed 32.
- Initial coverage: 23.9%, 2,985 words.
Final Evaluation of Morphological Generation
- Some of the tests were removed as we homed in on grammatical forms.
- After updates to our transducer, we passed 77 out of 80 tests.
- The remaining 3 tests fail because they depend on a complex rule, conditioned on the form of the preceding word in the sentence, that we do not yet have the tools to write.
- We added 3 twol rules.
- Final coverage: 31.06%, 4,442/14,148 words.
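
The coverage figures above are the share of corpus tokens that receive at least one analysis. The following is a minimal sketch of that calculation, not our actual evaluation script. It assumes the analyser output is in Apertium stream format, where each token appears as ^surface/analysis$ and an unknown token is echoed back with a leading asterisk. The function name coverage, the sample string, and the unknown token xyz are hypothetical and used only for illustration.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (assumption: the analyser was run with stream-format output)
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage(analysed_text):
    """Return (known, total) token counts for a string of analyser output."""
    known = 0
    total = 0
    for _surface, analyses in LEXICAL_UNIT.findall(analysed_text):
        total += 1
        # Unknown tokens are echoed back with a leading asterisk, e.g. ^xyz/*xyz$
        if not analyses.startswith("*"):
            known += 1
    return known, total


# Hypothetical sample: three stems from the table above plus one unknown token.
sample = "^ido/ido<cnjsub>$ ^mnu/mnu<n>$ ^xyz/*xyz$ ^kaku/kaku<adv>$"
known, total = coverage(sample)
print(f"{known}/{total} = {100 * known / total:.2f}%")  # prints "3/4 = 75.00%"
```

Running the same count over the full analysed corpus would give a known/total pair comparable to the coverage figures reported above.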