Difference between revisions of "Miskito and English"

From LING073

Latest revision as of 03:20, 23 May 2021


Note: Resources for machine translation between Miskito (https://wikis.swarthmore.edu/ling073/Miskito) and English (https://wikis.swarthmore.edu/ling073/English)

Developed Resources

  • Miq-Eng lexical transfer repo: https://github.swarthmore.edu/Ling073-sp21/ling073-miq-eng
  • Miq transducer repo: https://github.swarthmore.edu/Ling073-sp21/ling073-miq
  • apertium-eng repo: https://github.com/apertium/apertium-eng
  • Corpus repo: https://github.swarthmore.edu/Ling073-sp21/ling073-miq-eng-corpus

Initial Miq → Eng Evaluation

Corpus Coverage

Monolingual coverage: 52 / 116 (~0.44827586206896551724)

Bilingual coverage: 37 / 118 (~0.31355932203389830508)
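
The coverage figures above are ratios of known tokens to all tokens. A minimal sketch of how such a ratio could be computed from analyser output, assuming the usual Apertium stream format (`^surface/analysis$`, with unknown forms analysed as `*surface`). The function name `coverage` and the sample string are hypothetical, and escaped characters inside lexical units are not handled.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")


def coverage(analysed_text):
    """Return (known, total) over all lexical units in analyser output."""
    known = total = 0
    for unit in LEXICAL_UNIT.findall(analysed_text):
        _surface, _, analyses = unit.partition("/")
        total += 1
        # Assumption: an unknown form carries a single analysis starting with "*".
        if analyses and not analyses.startswith("*"):
            known += 1
    return known, total


# Hypothetical sample: two known forms and one unknown form.
sample = "^Yang/Yang<prn><p1>$ ^wîna/wîna<n>$ ^xyz/*xyz$"
known, total = coverage(sample)
print(f"coverage: {known} / {total} (~{known / total:.4f})")
```

With the counts reported above, the same division gives 52 / 116 ≈ 0.448 for the monolingual case and 37 / 118 ≈ 0.314 for the bilingual case.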

Sentence Analysis

Sentence 1

Yang wal wîna kalila piras

  • Intended: I don’t eat meat and chicken.
    • ^Yang<prn><p1>/I<prn><p1><sg>$ ^wal<post>/with<pr>$ ^wîna<n>/meat<n>$ ^kalila<n>/chicken<n>$ ^piaia<v><pres><neg>/eat<vblex><pres><neg>$^.<sent>/.<sent>$
    • #I with #meat #chicken #eat

Sentence 2

Yang siknis brisna mango.

  • Intended: I’m allergic to mangos.
    • ^Yang<prn><p1>/I<prn><p1><sg>$ ^siknis<n>/allergy<n>$ ^briaia<v><pres><p1>/have<vbhaver><pres><p1>$ ^mango<n>/mango<n>$^.<sent>/.<sent>$
    • #I #allergy #have #mango

Sentence 3

Dia prais brisma cuartocana timia kumi?

  • Intended: How much is a room?
    • ^Dia<det><itg>/What<det><itg>$ ^prais<n>/cost<n>$ ^briaia<v><pres><p2>/have<vbhaver><pres><p2>$ ^cuartocana<n>/room<n>$ ^timia<n>/night<n>$ ^kumi<adj>/one<adj>$^?<sent>/?<sent>$^.<sent>/.<sent>$
    • #What #cost #have #room #one #night?

Sentence 4

  • Intended: Is there a place to sleep?
  • Lexical transfer: *Manuapo *ansara #place *barsara *yapaya?
  • Miskito: Manuapo ansara pleiski barsara yapaya?

Sentence 5

  • Intended: Is there a place to wash clothes?
  • Lexical transfer: *Nara *pleis *kum *apo *cuala *tuskaya?
  • Miskito: Nara pleis kum apo cuala tuskaya?

Sentence 6

  • Intended: Tomorrow, I will stay at this lovely hotel.
  • Lexical transfer: #I *yauhka *takash *keisna *naha *hotel *pain.
  • Miskito: Yang yauhka takash keisna naha hotel pain.

Sentence 7

  • Intended: Is there a place to rent bikes?
  • Lexical transfer: *Pleeskum *barsekei *baisikil *rin *monaya?
  • Miskito: Pleeskum barsekei baisikil rin monaya?

Sentence 8

  • Intended: What are you doing tonight?
  • Lexical transfer: *Dia *dow *keisma *na #night.
  • Miskito: Dia dow keisma na timia.

Sentence 9

  • Intended: What do you think of the government?
  • Lexical transfer: #You *dia *lukisma *naho *gabament?
  • Miskito: Man dia lukisma naho gabament?

Sentence 10

  • Intended: Today is Friday.
  • Lexical transfer: *Naiwa *lika *praidi.
  • Miskito: Naiwa lika praidi.
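
The lexical transfer lines above use prefix markers on each token. By Apertium convention, `*` marks a word unknown to the analyser, `@` a lemma missing from the bilingual dictionary, and `#` a form the generator could not produce; the page itself does not spell this out, so treat that reading as an assumption. A small sketch that tallies the markers in one output line (the name `count_markers` is hypothetical):

```python
from collections import Counter

# Conventional Apertium debug markers, read as a prefix on each token.
MARKERS = {"*": "unknown", "@": "untranslated", "#": "generation_failure"}


def count_markers(line):
    """Count tokens per marker type in one line of translation output."""
    counts = Counter()
    for token in line.split():
        # Tokens without a marker prefix are counted as "ok".
        counts[MARKERS.get(token[0], "ok")] += 1
    return counts


# Lexical transfer output for Sentence 6 above.
print(count_markers("#I *yauhka *takash *keisna *naha *hotel *pain."))
```

Run over all ten sentences, a tally like this shows whether the main gap is in the analyser (`*`) or in generation (`#`).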

Final Evaluation

Additions

  • 200+ stems added
  • 2 new lexical selection rules
    • sangni
    • taya
  • 3 new disambiguation rules in miq transducer for possessive nouns/verbs and object pronouns
  • 1 new transfer rule for possessive nouns
    • Also debugged plural nouns transfer rule

Evaluations

  • Precision: 100.00000%
  • Recall: 87.00787%
  • Coverage over large corpus
    • ~70.08%
    • Tokens in corpus: 13,819
    • Stems in Transducer: 344
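
For reference, precision and recall are usually defined from counts of true positives, false positives, and false negatives. The sketch below assumes those standard definitions. The counts are hypothetical, chosen only because they reproduce the reported percentages; the page does not give the real counts, and the function name `precision_recall` is illustrative.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Standard definitions: P = TP / (TP + FP), R = TP / (TP + FN)."""
    predicted = true_pos + false_pos
    expected = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / expected if expected else 0.0
    return precision, recall


# Hypothetical counts, not taken from the source.
p, r = precision_recall(true_pos=221, false_pos=0, false_neg=33)
print(f"Precision: {p * 100:.5f}%")
print(f"Recall: {r * 100:.5f}%")
```

A precision of 100% alongside a lower recall means every analysis produced was correct, but some expected analyses were missing.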

MT miq → eng

Tests run on miq.longer.txt from our bilingual repo

  • Word error rate (WER): 91.11 %
  • Position-independent word error rate (PER): 73.33 %
  • Percentage of unknown words: 3.53 %
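
WER and PER can be sketched with their textbook definitions: WER is the word-level edit distance divided by the reference length, and PER ignores word order by comparing bags of words. This is an illustration under those assumptions, not the code of the evaluation script used for the figures above, whose tokenisation and normalisation may differ. All function names are hypothetical.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance (insert, delete, substitute cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # delete r
                            curr[j - 1] + 1,          # insert h
                            prev[j - 1] + (r != h)))  # substitute or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate; ref must be a non-empty list of words."""
    return edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Position-independent error rate; ref must be non-empty."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


# Hypothetical pair: same words, two of them swapped.
reference = "what are you doing tonight".split()
hypothesis = "what doing you are tonight".split()
print(f"WER: {wer(reference, hypothesis) * 100:.2f} %")
print(f"PER: {per(reference, hypothesis) * 100:.2f} %")
```

Because PER ignores order, it never exceeds WER for the same pair; a large gap between the two, as in the miq → eng figures, points to word-order errors in addition to wrong word choices.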

Trimmed coverage over miq.longer.txt

  • coverage: 95 / 98 (~0.96938775510204081633)

MT eng → miq

Tests run on eng.longer.txt from our bilingual repo

  • Word error rate (WER): 100.00 %
  • Position-independent word error rate (PER): 96.55 %
  • Percentage of unknown words: 46.67 %

Trimmed coverage over eng.longer.txt

  • coverage: 67 / 109 (~0.61467889908256880734)