Difference between revisions of "Waray/Final Project"

From LING073
(Later Anaylser Evaluation)
 
(7 intermediate revisions by the same user not shown)
Line 5: Line 5:
 
===Initial Anaylser Evaluation===
 
===Initial Anaylser Evaluation===
 
* Coverage: 742 / 1239 (~0.59887005649717514124)
 
* Coverage: 742 / 1239 (~0.59887005649717514124)
* Remaining Unknown Words: 487
+
* Remaining Unknown Forms: 487
  
 
====Lexical Information====
 
====Lexical Information====
Line 13: Line 13:
 
* Pattern entries: 9
 
* Pattern entries: 9
  
====Counts for individual lexicon====
+
=====Counts for individual lexicon=====
 
* NounRoot: 23
 
* NounRoot: 23
 
* Determiners: 13
 
* Determiners: 13
Line 26: Line 26:
 
* All anonymous lexicons: 6
 
* All anonymous lexicons: 6
  
====Tests====
+
=====Tests=====
 
* '''war.yaml''': Total passes: 57, Total fails: 14, Total: 71
 
* '''war.yaml''': Total passes: 57, Total fails: 14, Total: 71
 
* '''commonWords.yaml''': Total passes 4, Total fails: 16, Total: 20
 
* '''commonWords.yaml''': Total passes 4, Total fails: 16, Total: 20
  
====Current Unknown Words====
+
=====Current Unknown Words=====
 
TOP UNKNOWN WORDS:
 
TOP UNKNOWN WORDS:
 
     15 ^nagkaada/*nagkaada$
 
     15 ^nagkaada/*nagkaada$
Line 53: Line 53:
 
       6 ^aga/*aga$
 
       6 ^aga/*aga$
  
===Notes===
+
====Notes====
 
* Tests for verbalized nouns not implemented yet
 
* Tests for verbalized nouns not implemented yet
  
==Later Generator Evaluation==
+
===Initial Generator Evaluation===
 
Analyzer:
 
Analyzer:
 
* Total passes: 57
 
* Total passes: 57
Line 70: Line 70:
  
 
===Later Anaylser Evaluation===
 
===Later Anaylser Evaluation===
* Coverage: 935 / 1401 (~0.66738044254104211278)
+
* Coverage: 1025 / 1494 (~0.68607764390896921017)
* Remaining Unknown Words: 466
+
* Remaining unknown forms: 469
  
===Lexical Information===
+
====Lexical Information====
* Lexicons: 28
+
* Lexicons: 29
* Lexicon entries: 269
+
* Lexicon entries: 525
* Patterns: 4
+
* Patterns: 19
* Pattern entries: 18
+
* Pattern entries: 19
  
Counts for individual lexicons:
+
====Counts for individual lexicons====
* NounRoot: 62
+
* NounRoot: 194
 
* Determiners: 13
 
* Determiners: 13
 
* PluralDet: 2
 
* PluralDet: 2
 
* VerbPrefixes: 7
 
* VerbPrefixes: 7
* VerbStems: 24
+
* VerbStems: 94
 
* Pronouns: 34
 
* Pronouns: 34
* ProperNouns: 5
+
* ProperNouns: 15
 
* Adverbs: 11
 
* Adverbs: 11
 
* Auxiliary: 1
 
* Auxiliary: 1
Line 92: Line 92:
 
* Num-Lex: 23
 
* Num-Lex: 23
 
* Conjunctions: 2
 
* Conjunctions: 2
* Adjectives: 18
+
* Adjectives: 61
 
* Numeral: 10
 
* Numeral: 10
 
* Num-SecondLex: 23
 
* Num-SecondLex: 23
 
* All anonymous lexicons: 12
 
* All anonymous lexicons: 12
  
====Current Unknown Words====
+
=====Current Unknown Words=====
TTOP UNKNOWN WORDS:
+
TOP UNKNOWN WORDS:
 
     15 ^nagkaada/*nagkaada$
 
     15 ^nagkaada/*nagkaada$
 
     12 ^hito/*hito$
 
     12 ^hito/*hito$
    11 ^uyon/*uyon$
 
 
     11 ^linarang/*linarang$
 
     11 ^linarang/*linarang$
 
     10 ^may/*may$
 
     10 ^may/*may$
Line 107: Line 106:
 
       7 ^tagsa/*tagsa$
 
       7 ^tagsa/*tagsa$
 
       7 ^ngatanan/*ngatanan$
 
       7 ^ngatanan/*ngatanan$
 +
      7 ^man/*man$
 
       7 ^kalamrag/*kalamrag$
 
       7 ^kalamrag/*kalamrag$
 
       7 ^basi/*basi$
 
       7 ^basi/*basi$
Line 118: Line 118:
 
       5 ^Jehova/*Jehova$
 
       5 ^Jehova/*Jehova$
 
       4 ^tanom/*tanom$
 
       4 ^tanom/*tanom$
       4 ^os/*os$
+
       4 ^nagkikiwa/*nagkikiwa$
  
===Notes===
+
====Notes====
* Added two more twol rules to handle verb infixes.
+
* Added more twol rules  
 +
* Added 215 more stems
  
 
[[Category:Sp21_FinalProjects]] [[Category:Waray]]
 
[[Category:Sp21_FinalProjects]] [[Category:Waray]]

Latest revision as of 02:28, 29 May 2021

Developed Resources

Expanding Our Morphological Transducer

Initial Analyser Evaluation

  • Coverage: 742 / 1239 (~0.59887005649717514124)
  • Remaining Unknown Forms: 487
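
The coverage figure is the share of corpus forms that received at least one analysis. A minimal sketch of that arithmetic, using the counts reported above; the function name `coverage` is illustrative and not part of the project's tooling.

```python
def coverage(known: int, total: int) -> float:
    """Return the fraction of corpus forms that received an analysis."""
    if total <= 0:
        raise ValueError("total must be positive")
    return known / total


# Counts taken from the initial evaluation above.
initial = coverage(742, 1239)
print(f"{initial:.4f}")  # 0.5989
```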

Lexical Information

  • Lexicons: 17
  • Lexicon entries: 120
  • Patterns: 1
  • Pattern entries: 9
Counts for individual lexicons
  • NounRoot: 23
  • Determiners: 13
  • PluralDet: 2
  • VerbPrefixes: 5
  • VerbStems: 5
  • Pronouns: 31
  • ProperNouns: 1
  • Adverbs: 11
  • Auxiliary: 1
  • Punctuation: 22
  • All anonymous lexicons: 6
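
As a quick consistency check, the per-lexicon counts can be summed and compared with the reported number of lexicon entries. The sketch below hard-codes the figures listed above; the dictionary and variable names are illustrative only.

```python
# Per-lexicon entry counts from the initial evaluation.
initial_counts = {
    "NounRoot": 23,
    "Determiners": 13,
    "PluralDet": 2,
    "VerbPrefixes": 5,
    "VerbStems": 5,
    "Pronouns": 31,
    "ProperNouns": 1,
    "Adverbs": 11,
    "Auxiliary": 1,
    "Punctuation": 22,
    "All anonymous lexicons": 6,
}

reported_entries = 120  # "Lexicon entries" as reported above
total_entries = sum(initial_counts.values())
print(total_entries, total_entries == reported_entries)  # 120 True
```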
Tests
  • war.yaml: Total passes: 57, Total fails: 14, Total: 71
  • commonWords.yaml: Total passes: 4, Total fails: 16, Total: 20
Current Unknown Words

TOP UNKNOWN WORDS:

    15 ^nagkaada/*nagkaada$
    12 ^hito/*hito$
    11 ^uyon/*uyon$
    11 ^linarang/*linarang$
    10 ^may/*may$
    10 ^langit/*langit$
     9 ^klase/*klase$
     9 ^hayop/*hayop$
     8 ^katubigan/*katubigan$
     7 ^tagsa/*tagsa$
     7 ^ngatanan/*ngatanan$
     7 ^kalamrag/*kalamrag$
     7 ^basi/*basi$
     6 ^nalupad/*nalupad$
     6 ^liso/*liso$
     6 ^iton/*iton$
     6 ^haluag/*haluag$
     6 ^espasyo/*espasyo$
     6 ^bawbaw/*bawbaw$
     6 ^aga/*aga$
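
Each line of this list is a frequency count followed by a lexical unit in Apertium stream format, where an analysis beginning with `*` marks a form the analyser does not recognise. A minimal sketch of reading such a list into (form, count) pairs; the helper name `parse_unknowns` and the regular expression are assumptions made for illustration, not part of the project's scripts.

```python
import re

# Matches lines such as "    15 ^nagkaada/*nagkaada$":
# a count, then a surface form whose only "analysis" is the form prefixed by "*".
LINE_RE = re.compile(r"^\s*(\d+)\s+\^(.+?)/\*(.+?)\$\s*$")


def parse_unknowns(text: str) -> list[tuple[str, int]]:
    """Extract (form, count) pairs from a top-unknown-words listing."""
    pairs = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        # Keep only lines where the starred analysis repeats the surface form.
        if match and match.group(2) == match.group(3):
            pairs.append((match.group(2), int(match.group(1))))
    return pairs


sample = """
    15 ^nagkaada/*nagkaada$
    12 ^hito/*hito$
     6 ^aga/*aga$
"""
unknowns = parse_unknowns(sample)
print(unknowns)  # [('nagkaada', 15), ('hito', 12), ('aga', 6)]
```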

Notes

  • Tests for verbalized nouns not implemented yet

Initial Generator Evaluation

Analyzer:

  • Total passes: 57
  • Total fails: 14
  • Total tests: 71

Generator:

  • Total passes: 56
  • Total fails: 18
  • Total tests: 74
  • Currently, we have 4 rules in our twol file to handle verb conjugation.
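
The pass and fail totals above can be turned into pass rates for easier comparison between the analyzer and generator runs. This is only a sketch of the arithmetic; `pass_rate` is a hypothetical helper, not a function of the test runner.

```python
def pass_rate(passes: int, fails: int) -> float:
    """Return the share of tests that passed."""
    total = passes + fails
    if total == 0:
        raise ValueError("no tests were run")
    return passes / total


# (passes, fails) as reported above.
results = {"analyzer": (57, 14), "generator": (56, 18)}
for name, (passes, fails) in results.items():
    print(f"{name}: {passes}/{passes + fails} = {pass_rate(passes, fails):.1%}")
# analyzer: 57/71 = 80.3%
# generator: 56/74 = 75.7%
```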

Later Analyser Evaluation

  • Coverage: 1025 / 1494 (~0.68607764390896921017)
  • Remaining unknown forms: 469
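
Because the corpus size differs between the two evaluations (1239 forms initially, 1494 later), the change is easiest to read as a difference in coverage ratios rather than in raw counts. A small sketch of that comparison, using only the figures reported on this page; the variable names are illustrative.

```python
# (known, total) pairs as reported in the two evaluations.
initial_known, initial_total = 742, 1239
later_known, later_total = 1025, 1494

initial_cov = initial_known / initial_total
later_cov = later_known / later_total

# Gain expressed in percentage points.
gain_points = (later_cov - initial_cov) * 100
print(f"{initial_cov:.2%} -> {later_cov:.2%} (+{gain_points:.2f} points)")
```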

Lexical Information

  • Lexicons: 29
  • Lexicon entries: 525
  • Patterns: 19
  • Pattern entries: 19

Counts for individual lexicons

  • NounRoot: 194
  • Determiners: 13
  • PluralDet: 2
  • VerbPrefixes: 7
  • VerbStems: 94
  • Pronouns: 34
  • ProperNouns: 15
  • Adverbs: 11
  • Auxiliary: 1
  • Punctuation: 22
  • Num-Lex: 23
  • Conjunctions: 2
  • Adjectives: 61
  • Numeral: 10
  • Num-SecondLex: 23
  • All anonymous lexicons: 12
Current Unknown Words

TOP UNKNOWN WORDS:

    15 ^nagkaada/*nagkaada$
    12 ^hito/*hito$
    11 ^linarang/*linarang$
    10 ^may/*may$
     8 ^katubigan/*katubigan$
     7 ^tagsa/*tagsa$
     7 ^ngatanan/*ngatanan$
     7 ^man/*man$
     7 ^kalamrag/*kalamrag$
     7 ^basi/*basi$
     6 ^nalupad/*nalupad$
     6 ^liso/*liso$
     6 ^haluag/*haluag$
     6 ^espasyo/*espasyo$
     6 ^bawbaw/*bawbaw$
     5 ^kapawa/*kapawa$
     5 ^ginlarang/*ginlarang$
     5 ^Jehova/*Jehova$
     4 ^tanom/*tanom$
     4 ^nagkikiwa/*nagkikiwa$

Notes

  • Added more twol rules
  • Added 215 more stems