Difference between revisions of "Navajo and English"

From LING073

Revision as of 11:05, 6 May 2022

Resources for machine translation between Navajo and English

Developed Resources

External Resources


NAV -> ENG Evaluation

Coverage Analysis

  • Monolingual transducer coverage of small corpus: 407 / 1216 (~33.47%)
  • Bilingual transducer coverage of small corpus: 533 / 1345 (~39.63%)
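The coverage percentages above are the share of corpus tokens that received an analysis. A minimal sketch of that arithmetic, assuming the two counts are "tokens analysed" and "total tokens" (the function name `coverage` is illustrative and not part of any Apertium tool):

```python
def coverage(known_tokens: int, total_tokens: int) -> float:
    """Return coverage as a percentage of tokens that received an analysis."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return 100.0 * known_tokens / total_tokens


# Counts taken from the two bullets above.
print(f"monolingual: {coverage(407, 1216):.2f}%")
print(f"bilingual:   {coverage(533, 1345):.2f}%")
```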

Sentence Evaluation


 Original sentence: Dibé bikééʼ dínááh.
 Intended Translation: Go after the sheep.
 Biltrans Output: ^Rusa<n>/Deer<n>$ ^i<prn><pers><p3><sg><spc><giv>/the<det><def><sp><giv>$ ^ne<det><dem>/@ne<det><dem>$ ^ores<v><iv><p3><sg>/stand<vblex><iv><p3><sg>
 Translation Output: #Wood *bikééʼ #go.


 Original sentence: Nimá dóó nizhéʼé bíighah nídaah.
 Intended Translation: Sit beside your mother and father.
 Biltrans Output: ^Ras<n>/Day<n>$ ^i<prn><pers><p3><sg><spc><giv>/the<det><def><sp><giv>$ ^dares<n>/sun<n>$ ^i<prn><pers><p3><sg><spc>/the<det><def><sp>$ ^sam<v><iv><p3><sg>/hot<vblex><iv><p3><sg>$ ^kaku<adv>/very<adv>$ ^inja<cnjcoo>/so<cnjadv>$ ^mbrow<v><iv><p3><du>/thirsty<vblex><iv><p3><du>
 Translation Output: #Mother and #father *bíighah #sit.


 Original sentence: Chidí biyiʼ ayóo deesdoi.  
 Intended Translation: It is very hot inside the vehicle.
 Biltrans Output: ^Ikak<n>/Snake<n>$ ^i<prn><pers><p3><sg><spc><giv>/the<det><def><sp><giv>$ ^yasne<det><dem>/@yasne<det><dem>$ ^nas<v><tv><p3><sg>/smell<vblex><tv><p3><sg>$ ^i<prn><pers><p3><sg>/prpers<prn><subj><p3><m><sg>
 Translation Output: #Automobile *biyiʼ *ayóo #hot.


 Original sentence: Kodi atooʼ hólǫ́.
 Intended Translation: Here is some stew.
 Biltrans Output: ^Vark<v><iv><p3><pc>/Lie<vblex><iv><p3><pc>/Live<vblex><iv><p3><pc>$ ^ro<pr>/at<pr>$ ^mnu<n>/village<n>$ ^i<prn><pers><p3><sg><spc>/the<det><def><sp>$ ^ne<det><dem>/@ne<det><dem>$^.<sent>/.<sent>$
 Translation Output: Here #stew *hólǫ́.


 Original sentence: Atooʼ łaʼ naa deeshkááł. 
 Intended Translation: I will give you some stew.
 Biltrans Output: ^Mnai<v><iv><p3><sg>/Stop<vblex><iv><p3><sg>$ ^kwar<adv>/already<adv>$^?<sent>/?<sent>$^.<sent>/.<sent>$
 Translation Output: #Stew #some #around #give.


 Original sentence: Wóláchííʼ bighan binaa ałhéénílyeed. 
 Intended Translation: Go around the ant mound.
 Biltrans Output: ^Sampe<adv>/Then<adv>$ ^fur<v><tv><p1><pl><ex>/build<vblex><tv><p1><pl><ex>$ ^rum<n>/house<n>$ ^na<prn><pers><p3><pl><inan><spc>/the<det><def><sp>$ ^ne<det><dem>/@ne<det><dem>$ ^ra<pr>/until<pr>$ ^bro<v><iv><p3><pl><inan>/empty<adj><p3><pl><inan>$ ^romawa<n>/boy<n>$ ^si<prn><pers><p3><pl><an><spc>/the<det><def><sp>$ ^farkor<v><iv><p3><pl><an>/study<vblex><iv><p3><pl><an>$^.<sent>/.<sent>$^.<sent>/.<sent>$
 Translation Output: #Ant #house *binaa #go. 


 Original sentence: Jooł nikídílniihí tsáskʼeh biyaa íímááz. 
 Intended Translation: The basketball rolled underneath the bed.
 Biltrans Output: ^Kain<v><iv><p1><pl><ex>/Sit<vblex><iv><p1><pl><ex>$ ^do<pr>/at<pr>$ ^Sepse<n>/Sepse<n>$ ^fa<pr>/to<pr>$ ^fafyar<v><iv><p1><pl><ex>/tell<vblex><iv><p1><pl><ex>$^.<sent>/.<sent>$^.<sent>/.<sent>$
 Translation Output: #basketball #Ball #bed *biyaa #roll. 


 Original sentence: Shiyázhí, hoghandi naanishísh ałtso íinilaa?
 Intended Translation: My child, did you finish your homework?
 Biltrans Output: ^Snewar<n>/Belly<n>$ ^det<det><pos><px3sg><sg><spc>/prpers<det><pos><px3sg><sg><spc>$ ^ba<v><iv><p3><sg>/big<adj><p3><sg>$^.<sent>/.<sent>$^.<sent>/.
 Translation Output: #Little, #home *naanishísh #completed #finish?


 Original sentence: Shiyázhí, nízhiʼ naaltsoos bikááʼ íníleeh.
 Intended Translation: My child, write your name on the paper.
 Biltrans Output: ^Snai<v><tv><p3><sg>/Enlighten<vblex><tv><p3><sg>$ ^aya<prn><pers><p1><sg>/prpers<prn><subj><p1><mf><sg>$ ^ro<pr>/at<pr>$ ^marandan<n>/trip<n>$ ^det<det><pos><px1sg><sg><spc>/prpers<det><pos><px1sg><sg><spc>$^.<sent>/.<sent>$^.<sent>/.<sent>
 Translation Output: #Little, *nízhiʼ #paper *bikááʼ #write.


 Original sentence: Naaltsoos tsitsʼaaʼ naaltsoos atseedzį́ biiʼ hadéébįįd.
 Intended Translation: The cardboard box is filled with newspapers.
 Biltrans Output: ^Ikak<n>/Snake<n>$ ^i<prn><pers><p3><sg><spc><giv>/the<det><def><sp><giv>$ ^ne<det><dem>/@ne<det><dem>$ ^snonsnon<n>/name<n>$ ^det<det><pos><px3sg><sg><spc>/prpers<det><pos><px3sg><sg><spc>$ ^Kormsamba<n>/Kormsamba<n>$^.<sent>/.<sent>$^.<sent>/.<sent>$
 Translation Output: #Paper *tsitsʼaaʼ #paper *atseedzį́ *biiʼ #fill
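
In the translation outputs above, Apertium prefixes problem words with a marker: `*` for a word the analyser does not know, `@` for a lemma with no bilingual dictionary entry, and `#` for a word the target-language generator could not produce. The following is a rough sketch for tallying these markers in one output line. It assumes whitespace tokenization, and the label names and the `marker_counts` helper are illustrative only.

```python
from collections import Counter

# Apertium debug markers and the (illustrative) labels used for counting.
MARKERS = {"*": "unknown", "@": "untranslated", "#": "ungenerated"}


def marker_counts(output: str) -> Counter:
    """Count tokens by the marker they start with; unmarked tokens are 'clean'."""
    counts = Counter()
    for token in output.split():
        counts[MARKERS.get(token[0], "clean")] += 1
    return counts


# First translation output from the list above.
print(marker_counts("#Wood *bikééʼ #go."))
```

Summing these counts over all ten sentences gives a quick view of whether the remaining errors are mostly missing analyser entries (`*`) or missing generator forms (`#`).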



  • Added a disambiguation rule that selects the article reading of the word "na" over the pronoun reading when it is preceded by a noun.
    • Brought the ambiguity of the corpus from ~1.04 to ~1.03. Note that this is an increase from our original ambiguity score, because more words have been added.
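
The ambiguity score can be read as the mean number of analyses per token. Here is a minimal sketch of that calculation over Apertium stream-format analyser output, where each lexical unit looks like `^surface/analysis1/analysis2$`. It assumes no escaped `/`, `^` or `$` characters inside units, and the sample string is made up for illustration rather than taken from the corpus.

```python
import re

# One lexical unit: everything between '^' and the next '$'.
LEXICAL_UNIT = re.compile(r"\^([^$]*)\$")


def mean_ambiguity(stream: str) -> float:
    """Return the mean number of analyses per lexical unit."""
    counts = []
    for unit in LEXICAL_UNIT.findall(stream):
        _surface, *analyses = unit.split("/")
        # A unit with no analysis at all still counts as one reading.
        counts.append(max(len(analyses), 1))
    if not counts:
        raise ValueError("no lexical units found")
    return sum(counts) / len(counts)


# Hypothetical input: one two-way ambiguous token and one unambiguous token.
sample = "^na/na<det><def>/na<prn><pers>$ ^rum/rum<n>$"
print(mean_ambiguity(sample))
```

A disambiguation rule that removes one reading from an ambiguous token lowers this mean, which is the kind of change reported above.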

Structural Transfer

  • Added a rule that inserts an implicit subject when no explicit one is present.
  • Changed the rule to specify the type of pronoun (subject or object).
  • Added a rule that specifies def/dem determiner phrases.
  • Added a rule that correctly translates the adverb endings on determiners.
  • These changes brought our WER from 72.22% to 27.78% and our PER from 63.89% to 22.22%.
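
For reference, WER is the word-level edit distance between the system output and the reference translation, divided by the reference length; PER ignores word order and compares the two sentences as bags of words. The sketch below uses one common formulation of each, assuming whitespace tokenization. The evaluation script used for the figures above may tokenize or normalize differently, so this is not guaranteed to reproduce them.

```python
from collections import Counter


def edit_distance(ref: list, hyp: list) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # delete r
                           cur[j - 1] + 1,           # insert h
                           prev[j - 1] + (r != h)))  # match or substitute
        prev = cur
    return prev[-1]


def wer(ref: list, hyp: list) -> float:
    """Word error rate as a percentage of the reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def per(ref: list, hyp: list) -> float:
    """Position-independent error rate as a percentage of the reference length."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return 100.0 * (max(len(ref), len(hyp)) - matches) / len(ref)


ref = "go after the sheep".split()
hyp = "after the sheep go".split()  # same words, different order
print(wer(ref, hyp), per(ref, hyp))
```

Because PER ignores position, a hypothesis with the right words in the wrong order scores well on PER but poorly on WER, which is why PER is the lower of the two figures in each pair reported here.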

Adding Stems

  • Added ~100 new stems

Polished RBMT System

  • Precision: 87.77293%
  • Recall: 94.81132%
  • Coverage over large corpus: 7072 / 14287 (~49.50%)
  • Stems in transducer: 382
  • Over bhw.longer.txt:
    • Word Error Rate (WER): 80.28 %
    • Position-independent word error rate (PER): 71.83 %
    • Percentage of unknown words: 15.54 %
    • Number of position-independent correct words: 81/284
    • Coverage: 211 / 250 (84.40%)
  • Over bhw.corpus.large.txt:
    • Coverage: 5817 / 13768 (~42.25%)