Difference between revisions of "Biak/Transducer"
From LING073
(→Evaluation) |
|||
(19 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
==Code== | ==Code== | ||
Our github repository can be found here: [https://github.swarthmore.edu/Ling073-sp21/ling073-bhw Our Github] | Our github repository can be found here: [https://github.swarthmore.edu/Ling073-sp21/ling073-bhw Our Github] | ||
− | ==Evaluation== | + | |
+ | ==Analyser Evaluation== | ||
*As of now, our Transducer passes 76/76 tests generated from our Wikipedia page. | *As of now, our Transducer passes 76/76 tests generated from our Wikipedia page. | ||
*Our initial transducer was capable of analyzing approximately 20% of our corpus file. | *Our initial transducer was capable of analyzing approximately 20% of our corpus file. | ||
*Additions following initial analysis | *Additions following initial analysis | ||
** Added common conjunctions, allowing our transducer to cover 21.86%, then 22.6% of our original corpus. | ** Added common conjunctions, allowing our transducer to cover 21.86%, then 22.6% of our original corpus. | ||
+ | ** Added common adverbs and the noun for village, which brought coverage to 23.1% | ||
+ | ** Added verb for "to say," which improved coverage to 23.9% | ||
{| class="wikitable" | {| class="wikitable" | ||
|+ Common words | |+ Common words | ||
Line 11: | Line 14: | ||
! Form !! Meaning | ! Form !! Meaning | ||
|- | |- | ||
− | | {{morphTest|ido | + | | {{morphTest|ido{{tag|cnjsub}}|ido}} || when |
|- | |- | ||
− | | {{morphTest|inja | + | | {{morphTest|inja{{tag|cnjcoo}}|inja}} || so |
+ | |- | ||
+ | | {{morphTest|mnu{{tag|n}}|mnu}} || village | ||
+ | |- | ||
+ | | {{morphTest|wer{{tag|adv}}|wer}} || again | ||
+ | |- | ||
+ | | {{morphTest|kwar{{tag|adv}}|kwar}} || already | ||
+ | |- | ||
+ | | {{morphTest|kaku{{tag|adv}}|kaku}} || very | ||
+ | |- | ||
+ | | {{morphTest|obe{{tag|v}}{{tag|tv}}{{tag|p3}}{{tag|sg}}|dobe}} || say | ||
|} | |} | ||
+ | * Our transducer has 69 stems. | ||
+ | * Current coverage: 23.9%, 2,985 words. | ||
+ | * We pass 76 tests of bhw.yaml | ||
+ | * We pass 6 tests of commonwords.yaml | ||
+ | |||
+ | ==Generator Evaluation== | ||
+ | ===Initial Evaluation of Morphological Generation=== | ||
+ | *On our initial run of the generation tests, 107 tests were generated, of which we passed 75 and failed 32 | ||
+ | *Initial coverage: 23.9%, 2,985 words. | ||
+ | |||
+ | ===Final Evaluation of Morphological Generation=== | ||
+ | *Some of the tests were removed as we homed in on grammatical forms
+ | *After updates to our transducer, we passed 77 out of 80 tests. | ||
+ | *We couldn't pass the remaining tests because of a complex rule that we don't have the tools to write yet, involving the form of the prior word in a sentence. | ||
+ | *We added 3 twol rules. | ||
+ | *Final coverage: 31.06% - 4,442/14,148 | ||
+ | |||
− | |||
− | [[Category:Sp21_Transducers]] | + | [[Category:Sp21_Transducers]][[Category:Biak]] |
Latest revision as of 13:30, 4 April 2021
Code
Our GitHub repository can be found here: https://github.swarthmore.edu/Ling073-sp21/ling073-bhw
Analyser Evaluation
- As of now, our transducer passes 76/76 tests generated from our Wikipedia page.
- Our initial transducer was able to analyze approximately 20% of our corpus file.
- Additions following the initial analysis:
  - Added common conjunctions, which raised coverage to 21.86% and then 22.6% of our original corpus.
  - Added common adverbs and the noun for "village," which brought coverage to 23.1%.
  - Added the verb for "to say," which improved coverage to 23.9%.
Form | Meaning |
---|---|
ido<cnjsub> ↔ ido | when |
inja<cnjcoo> ↔ inja | so |
mnu<n> ↔ mnu | village |
wer<adv> ↔ wer | again |
kwar<adv> ↔ kwar | already |
kaku<adv> ↔ kaku | very |
obe<v><tv><p3><sg> ↔ dobe | say |
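
Each row above pairs an analysis (a lemma followed by angle-bracket tags) with a surface form. As a minimal sketch of how such a pair could be taken apart for inspection, the Python below splits one row into its lemma, tag list, and surface form. The function names `parse_analysis` and `parse_pair` are hypothetical helpers introduced for illustration; they are not part of our transducer or of any Apertium or HFST tool.

```python
import re

# Matches one tag such as <adv> or <p3> and captures the name inside the brackets.
TAG_RE = re.compile(r"<([^<>]+)>")


def parse_analysis(analysis):
    """Split an analysis like 'obe<v><tv><p3><sg>' into (lemma, tags)."""
    lemma = analysis.split("<", 1)[0]   # everything before the first tag
    tags = TAG_RE.findall(analysis)     # tag names, in order
    return lemma, tags


def parse_pair(row):
    """Split a row of the form 'analysis ↔ surface' into a small dict."""
    analysis, surface = (part.strip() for part in row.split("↔"))
    lemma, tags = parse_analysis(analysis)
    return {"lemma": lemma, "tags": tags, "surface": surface}


if __name__ == "__main__":
    print(parse_pair("obe<v><tv><p3><sg> ↔ dobe"))
    print(parse_pair("wer<adv> ↔ wer"))
```

For the last row of the table this yields the lemma `obe`, the tags `v`, `tv`, `p3`, `sg`, and the surface form `dobe`, which makes it easy to see that the third-person singular form adds a prefix to the stem.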
- Our transducer has 69 stems.
- Current coverage: 23.9%, 2,985 words.
- We pass all 76 tests in bhw.yaml.
- We pass 6 tests in commonwords.yaml.
Generator Evaluation
Initial Evaluation of Morphological Generation
- Our initial run of the generation tests produced 107 tests, of which we passed 75 and failed 32.
- Initial coverage: 23.9%, 2,985 words.
Final Evaluation of Morphological Generation
- We removed some of the tests as we homed in on the grammatical forms.
- After updates to our transducer, we passed 77 out of 80 tests.
- We could not pass the remaining 3 tests because they depend on a complex rule, conditioned on the form of the preceding word in the sentence, that we do not yet have the tools to write.
- We added 3 twol rules.
- Final coverage: 31.06% - 4,442/14,148
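
The pass rates and coverage figures in this section are simple ratios, so they can be rechecked with a few lines of Python. This is only a sketch of the arithmetic, not the script that produced our numbers; `percent` is a hypothetical helper. Note that the counts 4,442/14,148 work out to roughly 31.40%, slightly above the reported 31.06%, so the reported percentage may have been computed by the coverage tool on somewhat different counts.

```python
def percent(part, total):
    """Return part/total as a percentage rounded to two decimal places."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * part / total, 2)


# Figures taken from the evaluation notes above.
initial_pass_rate = percent(75, 107)    # initial generation tests
final_pass_rate = percent(77, 80)       # final generation tests
final_coverage = percent(4442, 14148)   # final corpus coverage from raw counts

if __name__ == "__main__":
    print(f"Initial generation pass rate: {initial_pass_rate}%")  # 70.09%
    print(f"Final generation pass rate:   {final_pass_rate}%")    # 96.25%
    print(f"Final coverage from counts:   {final_coverage}%")     # 31.4%
```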