Difference between revisions of "Miskito and English"

From LING073
 
[[Category:Sp21_TranslationPairs]]
 
<i>Note: Resources for machine translation between [https://wikis.swarthmore.edu/ling073/Miskito Miskito] and [https://wikis.swarthmore.edu/ling073/English English]</i>

==Developed Resources==
*[https://github.swarthmore.edu/Ling073-sp21/ling073-miq-eng Miq-Eng lexical transfer repo]
*[https://github.swarthmore.edu/Ling073-sp21/ling073-miq Miq transducer repo]
*[https://github.com/apertium/apertium-eng apertium-eng repo]
*[https://github.swarthmore.edu/Ling073-sp21/ling073-miq-eng-corpus Corpus repo]

== Initial Miq → Eng Evaluation ==
===Corpus Coverage===
Monolingual coverage: 52 / 116 (~0.448)

Bilingual coverage: 37 / 118 (~0.314)

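The coverage figures above are ratios of tokens that received an analysis to all tokens in the corpus. A minimal sketch of that arithmetic, assuming coverage is simply known tokens divided by total tokens (the function name <code>coverage</code> is illustrative and not part of the project's tooling):

```python
def coverage(known: int, total: int) -> float:
    """Return the fraction of tokens that received an analysis."""
    if total <= 0:
        raise ValueError("total must be positive")
    return known / total


# Counts reported for the initial evaluation.
monolingual = coverage(52, 116)
bilingual = coverage(37, 118)

print(f"monolingual: {monolingual:.3f}")
print(f"bilingual:   {bilingual:.3f}")
```

Rounding to three decimals keeps the figures comparable without implying more precision than a corpus of roughly a hundred tokens supports.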
===Sentence Analysis===
====Sentence 1====
Yang wal wîna kalila piras
* Intended: I don’t eat meat and chicken.
** ^Yang<prn><p1>/I<prn><p1><sg>$ ^wal<post>/with<pr>$ ^wîna<n>/meat<n>$ ^kalila<n>/chicken<n>$ ^piaia<v><pres><neg>/eat<vblex><pres><neg>$^.<sent>/.<sent>$
** #I with #meat #chicken #eat

====Sentence 2====
Yang siknis brisna mango.
* Intended: I’m allergic to mangos.
** ^Yang<prn><p1>/I<prn><p1><sg>$ ^siknis<n>/allergy<n>$ ^briaia<v><pres><p1>/have<vbhaver><pres><p1>$ ^mango<n>/mango<n>$^.<sent>/.<sent>$
** #I #allergy #have #mango

====Sentence 3====
Dia prais brisma cuartocana timia kumi?
* Intended: How much is a room?
** ^Dia<det><itg>/What<det><itg>$ ^prais<n>/cost<n>$ ^briaia<v><pres><p2>/have<vbhaver><pres><p2>$ ^cuartocana<n>/room<n>$ ^timia<n>/night<n>$ ^kumi<adj>/one<adj>$^?<sent>/?<sent>$^.<sent>/.<sent>$
** #What #cost #have #room #one #night?

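The analysis lines for Sentences 1 to 3 are in the Apertium stream format, where each lexical unit is written as <code>^source<tags>/target<tags>$</code>. A rough sketch of how such a line could be split into source and target readings is given below. It assumes one target reading per unit and no escaped characters, and the names <code>Reading</code>, <code>LexicalUnit</code> and <code>parse_units</code> are hypothetical helpers, not Apertium APIs:

```python
import re
from typing import List, NamedTuple, Tuple


class Reading(NamedTuple):
    lemma: str
    tags: Tuple[str, ...]


class LexicalUnit(NamedTuple):
    source: Reading
    target: Reading


UNIT_RE = re.compile(r"\^([^$]*)\$")  # text between ^ and $
TAG_RE = re.compile(r"<([^>]+)>")     # each <tag>


def parse_reading(text: str) -> Reading:
    """Split 'lemma<t1><t2>' into the lemma and its tags."""
    lemma = text.split("<", 1)[0]
    return Reading(lemma, tuple(TAG_RE.findall(text)))


def parse_units(line: str) -> List[LexicalUnit]:
    """Parse every ^source/target$ unit found in a line."""
    units = []
    for body in UNIT_RE.findall(line):
        # Assumption: a single '/' separates source from target.
        source, _, target = body.partition("/")
        units.append(LexicalUnit(parse_reading(source), parse_reading(target)))
    return units


line = (
    "^Yang<prn><p1>/I<prn><p1><sg>$ ^siknis<n>/allergy<n>$ "
    "^briaia<v><pres><p1>/have<vbhaver><pres><p1>$ "
    "^mango<n>/mango<n>$^.<sent>/.<sent>$"
)
for unit in parse_units(line):
    print(unit.source.lemma, "->", unit.target.lemma, unit.target.tags)
```

Listing the units this way makes it easier to see where source and target tags diverge, for example <code><v></code> on the Miskito side against <code><vbhaver></code> on the English side for ''briaia''.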
====Sentence 4====
* Intended: Is there a place to sleep?
* Lexical transfer: *Manuapo *ansara #place *barsara *yapaya?
* Miskito: Manuapo ansara pleiski barsara yapaya?

====Sentence 5====
* Intended: Is there a place to wash clothes?
* Lexical transfer: *Nara *pleis *kum *apo *cuala *tuskaya?
* Miskito: Nara pleis kum apo cuala tuskaya?

====Sentence 6====
* Intended: Tomorrow, I will stay at this lovely hotel.
* Lexical transfer: #I *yauhka *takash *keisna *naha *hotel *pain.
* Miskito: Yang yauhka takash keisna naha hotel pain.

====Sentence 7====
* Intended: Is there a place to rent bikes?
* Lexical transfer: *Pleeskum *barsekei *baisikil *rin *monaya?
* Miskito: Pleeskum barsekei baisikil rin monaya?

====Sentence 8====
* Intended: What are you doing tonight?
* Lexical transfer: *Dia *dow *keisma *na #night.
* Miskito: Dia dow keisma na timia.

====Sentence 9====
* Intended: What do you think of the government?
* Lexical transfer: #You *dia *lukisma *naho *gabament?
* Miskito: Man dia lukisma naho gabament?

====Sentence 10====
* Intended: Today is Friday.
* Lexical transfer: *Naiwa *lika *praidi.
* Miskito: Naiwa lika praidi.

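The lexical transfer lines above carry Apertium's debug markers. Assuming the usual convention (<code>*</code> for a word the analyser does not know, <code>#</code> for a form the generator could not produce, <code>@</code> for a word with no bilingual dictionary entry), a small tally like the following sketch shows at a glance where a sentence fails. The function name and the category labels are illustrative only:

```python
from collections import Counter

# Assumed marker meanings; the labels on the right are our own names.
MARKERS = {
    "*": "unknown",
    "#": "generation_error",
    "@": "transfer_error",
}


def tally_markers(output: str) -> Counter:
    """Count tokens by the debug marker they start with."""
    counts = Counter()
    for token in output.split():
        counts[MARKERS.get(token[0], "ok")] += 1
    return counts


print(tally_markers("*Manuapo *ansara #place *barsara *yapaya?"))
print(tally_markers("#I with #meat #chicken #eat"))
```

For Sentence 4 this reports four unknown words and one generation error, which points at missing stems in the transducer rather than at the transfer rules.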
==Final Evaluation==
===Additions===
* 200+ stems added
* 2 new lexical selection rules
** sangni
** taya
* 3 new disambiguation rules in miq transducer for possessive nouns/verbs and object pronouns
* 1 new transfer rule for possessive nouns
** Also debugged plural nouns transfer rule

===Evaluations===
* Precision: 100.00000%
* Recall: 87.00787%

* Coverage over large corpus
** ~70.08%
** Tokens in corpus: 13,819
** Stems in transducer: 344

====MT miq → eng====
<i>Tests run on miq.longer.txt from our bilingual repo</i>

* Word error rate (WER): 91.11%
* Position-independent word error rate (PER): 73.33%
* Percentage of unknown words: 3.53%

Trimmed coverage over miq.longer.txt
* Coverage: 95 / 98 (~0.969)

====MT eng → miq====
<i>Tests run on eng.longer.txt from our bilingual repo</i>

* Word error rate (WER): 100.00%
* Position-independent word error rate (PER): 96.55%
* Percentage of unknown words: 46.67%

Trimmed coverage over eng.longer.txt
* Coverage: 67 / 109 (~0.615)
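
WER and PER differ only in whether word order counts. The sketch below illustrates the two measures under common definitions: WER as word-level edit distance divided by reference length, and PER as the same ratio computed on bags of words. The evaluation script used for the figures above may define PER slightly differently, and the example sentences are toy data, not taken from the test files:

```python
from collections import Counter
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp) / len(ref)


def per(reference: str, hypothesis: str) -> float:
    """Assumed definition: (max(|ref|, |hyp|) - shared words) / |ref|."""
    ref, hyp = reference.split(), hypothesis.split()
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


# Toy example: right words, wrong order.
reference = "today is friday"
hypothesis = "friday is today"
print(f"WER: {wer(reference, hypothesis):.2%}")
print(f"PER: {per(reference, hypothesis):.2%}")
```

A large gap between the two, as in the miq → eng results, suggests that many of the right words are produced but in the wrong order, while near-identical values, as in eng → miq, suggest the words themselves are mostly wrong or missing.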

Latest revision as of 03:20, 23 May 2021

