Difference between revisions of "Biak/Transducer"

From LING073
(Evaluation)
 
(23 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
==Code==
 
 
Our github repository can be found here: [https://github.swarthmore.edu/Ling073-sp21/ling073-bhw Our Github]
 
==Evaluation==
+
 
 +
==Analyser Evaluation==
 
*As of now, our Transducer passes 76/76 tests generated from our Wikipedia page.
 
 
*Our initial transducer was capable of analyzing approximately 20% of our corpus file.
 
 
*Additions following initial analysis
 
** Added common conjunctions, allowing our transducer to cover 21.86%, then 22.6% of our original corpus. {{morphTest|ido<cnjsub>|ido}}, which translates to when, and {{morphTest|inja<cnjcoo>|inja}} which translates to so.
+
** Added common conjunctions, allowing our transducer to cover 21.86%, then 22.6% of our original corpus.  
 +
** Added common adverbs and the noun for village, which brought coverage to 23.1%
 +
** Added verb for "to say," which improved coverage to 23.9%
 +
{| class="wikitable"
 +
|+ Common words
 +
|-
 +
! Form !! Meaning
 +
|-
 +
| {{morphTest|ido{{tag|cnjsub}}|ido}} || when
 +
|-
 +
| {{morphTest|inja{{tag|cnjcoo}}|inja}} || so
 +
|-
 +
| {{morphTest|mnu{{tag|n}}|mnu}} || village
 +
|-
 +
| {{morphTest|wer{{tag|adv}}|wer}} || again
 +
|-
 +
| {{morphTest|kwar{{tag|adv}}|kwar}} || already
 +
|-
 +
| {{morphTest|kaku{{tag|adv}}|kaku}} || very
 +
|-
 +
| {{morphTest|obe{{tag|v}}{{tag|tv}}{{tag|p3}}{{tag|sg}}|dobe}} || say
 +
|}
 +
* Our transducer has 69 stems.
 +
* Current coverage: 23.9%, 2,985 words.
 +
* We pass 76 tests of bhw.yaml
 +
* We pass 6 tests of commonwords.yaml
 +
 
 +
==Generator Evaluation==
 +
===Initial Evaluation of Morphological Generation===
 +
*On our initial run of the generation tests, 107 tests were generated, of which we passed 75 and failed 32
 +
*Initial coverage: 23.9%, 2,985 words.
 +
 
 +
===Final Evaluation of Morphological Generation===
 +
*Some of the tests were removed as we homed in on grammatical forms.
 +
*After updates to our transducer, we passed 77 out of 80 tests.
 +
*We could not pass the remaining 3 tests: they require a rule conditioned on the form of the preceding word in the sentence, which we do not yet have the tools to write.
 +
*We added 3 twol rules.
 +
*Final coverage: 31.06% - 4,442/14,148
 +
 
  
==Notes==
 
  
[[Category:Sp21_Transducers]]
+
[[Category:Sp21_Transducers]][[Category:Biak]]

Latest revision as of 14:30, 4 April 2021

Code

Our GitHub repository can be found here: https://github.swarthmore.edu/Ling073-sp21/ling073-bhw

Analyser Evaluation

  • As of now, our Transducer passes 76/76 tests generated from our Wikipedia page.
  • Our initial transducer was capable of analyzing approximately 20% of our corpus file.
  • Additions following initial analysis
    • Added common conjunctions, raising coverage of our original corpus to 21.86%, then to 22.6%.
    • Added common adverbs and the noun for "village", which brought coverage to 23.1%.
    • Added the verb for "to say", which improved coverage to 23.9%.
Common words

  Form                             Meaning
  ido<cnjsub> ↔ ido                when
  inja<cnjcoo> ↔ inja              so
  mnu<n> ↔ mnu                     village
  wer<adv> ↔ wer                   again
  kwar<adv> ↔ kwar                 already
  kaku<adv> ↔ kaku                 very
  obe<v><tv><p3><sg> ↔ dobe        say

  • Our transducer has 69 stems.
  • Current coverage: 23.9%, 2,985 words.
  • We pass 76 tests in bhw.yaml.
  • We pass 6 tests in commonwords.yaml.
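
Each entry in the Common words table pairs an analysis (a lemma followed by angle-bracket tags) with a surface form. As a minimal sketch of how such an analysis string breaks down, the Python below splits each one into its lemma and tag list. It is only an illustration and is not taken from our repository: the names `COMMON_WORDS` and `split_analysis` are made up here, and the rows are copied from the table above.

```python
import re

# Rows copied from the "Common words" table: (analysis, surface form, gloss).
COMMON_WORDS = [
    ("ido<cnjsub>", "ido", "when"),
    ("inja<cnjcoo>", "inja", "so"),
    ("mnu<n>", "mnu", "village"),
    ("wer<adv>", "wer", "again"),
    ("kwar<adv>", "kwar", "already"),
    ("kaku<adv>", "kaku", "very"),
    ("obe<v><tv><p3><sg>", "dobe", "say"),
]

# A tag is any run of characters enclosed in one pair of angle brackets.
TAG = re.compile(r"<([^<>]+)>")


def split_analysis(analysis):
    """Split 'lemma<tag1><tag2>' into ('lemma', ['tag1', 'tag2'])."""
    lemma = analysis.split("<", 1)[0]  # everything before the first tag
    tags = TAG.findall(analysis)       # tag names in order, without brackets
    return lemma, tags


for analysis, surface, gloss in COMMON_WORDS:
    lemma, tags = split_analysis(analysis)
    print(f"{surface:6} lemma={lemma:6} tags={tags} gloss={gloss}")
```

For the last row, `split_analysis("obe<v><tv><p3><sg>")` returns `("obe", ["v", "tv", "p3", "sg"])`, which makes it easy to see that the surface form dobe differs from the lemma obe only in the third person singular marking.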

Generator Evaluation

Initial Evaluation of Morphological Generation

  • On our initial run of the generation tests, 107 tests were generated, of which we passed 75 and failed 32.
  • Initial coverage: 23.9%, 2,985 words.

Final Evaluation of Morphological Generation

  • Some of the tests were removed as we homed in on grammatical forms.
  • After updates to our transducer, we passed 77 out of 80 tests.
  • We could not pass the remaining 3 tests: they require a rule conditioned on the form of the preceding word in the sentence, which we do not yet have the tools to write.
  • We added 3 twol rules.
  • Final coverage: 31.06% - 4,442/14,148
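
To put the generation results on a common scale, the reported counts can be turned into pass rates. The Python below is a small sketch of that arithmetic using only the numbers reported above; the helper name `pass_rate` is hypothetical and does not come from our test tooling.

```python
def pass_rate(passed, total):
    """Percentage of tests passed, rounded to two decimal places."""
    return round(100 * passed / total, 2)


# Counts reported in the initial and final evaluations.
initial_passed, initial_failed = 75, 32
initial_total = initial_passed + initial_failed  # 107 generated tests
final_passed, final_total = 77, 80

initial_rate = pass_rate(initial_passed, initial_total)
final_rate = pass_rate(final_passed, final_total)

# Change in corpus coverage, in percentage points, from the reported figures.
coverage_gain = round(31.06 - 23.9, 2)

print(f"initial pass rate: {initial_rate}%")
print(f"final pass rate:   {final_rate}%")
print(f"coverage gain:     {coverage_gain} percentage points")
```

The two pass rates are not directly comparable as a measure of progress, since the final test set is smaller (80 tests instead of 107) after some tests were removed.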