Difference between revisions of "Miskito and English"
From LING073
Latest revision as of 03:20, 23 May 2021
Note: Resources for machine translation between Miskito and English
Developed Resources
- Miq-Eng lexical transfer repo (https://github.swarthmore.edu/Ling073-sp21/ling073-miq-eng)
- Miq transducer repo (https://github.swarthmore.edu/Ling073-sp21/ling073-miq)
- apertium-eng repo (https://github.com/apertium/apertium-eng)
- Corpus repo (https://github.swarthmore.edu/Ling073-sp21/ling073-miq-eng-corpus)
Initial Miq → Eng Evaluation
Corpus Coverage
Monolingual coverage: 52 / 116 (~0.44827586206896551724)
Bilingual coverage: 37 / 118 (~0.31355932203389830508)
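The coverage figures above read as "forms with at least one analysis / total forms". The page does not show how they were produced, so the following is only a minimal sketch of that ratio. It assumes analyser output in the Apertium stream format, where an unknown form appears as `^form/*form$`. The function name `coverage` and the sample string are illustrative, not taken from the repos.

```python
import re

# One lexical unit in Apertium stream format: ^ ... $
UNIT = re.compile(r"\^(.*?)\$")

def coverage(analysed: str) -> tuple[int, int]:
    """Return (known, total) for a string of analyser output.

    Assumption: a unit whose analysis starts with '*' (i.e. contains '/*')
    is unknown to the transducer; every other unit counts as covered.
    """
    units = UNIT.findall(analysed)
    known = sum(1 for unit in units if "/*" not in unit)
    return known, len(units)

# Hypothetical sample: one analysed form, one unknown form.
sample = "^Yang/Yang<prn><p1>$ ^yauhka/*yauhka$"
known, total = coverage(sample)
print(f"coverage: {known} / {total} (~{known / total:.4f})")

# The ratios reported on this page, recomputed from their counts.
print(f"monolingual: {52 / 116:.4f}")
print(f"bilingual:   {37 / 118:.4f}")
```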
Sentence Analysis
Sentence 1
Yang wal wîna kalila piras
- Intended: I don’t eat meat and chicken.
- ^Yang<prn><p1>/I<prn><p1><sg>$ ^wal<post>/with<pr>$ ^wîna<n>/meat<n>$ ^kalila<n>/chicken<n>$ ^piaia<v><pres><neg>/eat<vblex><pres><neg>$^.<sent>/.<sent>$
- #I with #meat #chicken #eat
Sentence 2
Yang siknis brisna mango.
- Intended: I’m allergic to mangos.
- ^Yang<prn><p1>/I<prn><p1><sg>$ ^siknis<n>/allergy<n>$ ^briaia<v><pres><p1>/have<vbhaver><pres><p1>$ ^mango<n>/mango<n>$^.<sent>/.<sent>$
- #I #allergy #have #mango
Sentence 3
Dia prais brisma cuartocana timia kumi?
- Intended: How much is a room?
- ^Dia<det><itg>/What<det><itg>$ ^prais<n>/cost<n>$ ^briaia<v><pres><p2>/have<vbhaver><pres><p2>$ ^cuartocana<n>/room<n>$ ^timia<n>/night<n>$ ^kumi<adj>/one<adj>$^?<sent>/?<sent>$^.<sent>/.<sent>$
- #What #cost #have #room #one #night?
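The analysis lines for Sentences 1 to 3 pair each source reading with its target reading in the form `^lemma<tags>/lemma<tags>$`. As a reading aid, here is a small sketch that splits such a line into (source, target) pairs. The names `Reading`, `parse_reading` and `parse_biltrans` are hypothetical, and the sketch assumes exactly one translation per unit, as in the lines shown above.

```python
import re
from typing import List, NamedTuple, Tuple

class Reading(NamedTuple):
    lemma: str
    tags: List[str]

UNIT = re.compile(r"\^(.*?)\$")   # one ^...$ lexical unit
TAG = re.compile(r"<([^>]+)>")    # one <tag>

def parse_reading(text: str) -> Reading:
    """Split 'briaia<v><pres><p1>' into its lemma and tag list."""
    lemma = text.split("<", 1)[0]
    return Reading(lemma, TAG.findall(text))

def parse_biltrans(line: str) -> List[Tuple[Reading, Reading]]:
    """Return (source, target) readings for each unit in the line.

    Assumption: each unit holds a single '/'-separated translation.
    """
    pairs = []
    for unit in UNIT.findall(line):
        source, _, target = unit.partition("/")
        pairs.append((parse_reading(source), parse_reading(target)))
    return pairs

# Sentence 2 from this page.
line = ("^Yang<prn><p1>/I<prn><p1><sg>$ ^siknis<n>/allergy<n>$ "
        "^briaia<v><pres><p1>/have<vbhaver><pres><p1>$ "
        "^mango<n>/mango<n>$^.<sent>/.<sent>$")
for source, target in parse_biltrans(line):
    print(source.lemma, source.tags, "->", target.lemma, target.tags)
```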
Sentence 4
- Intended: Is there a place to sleep?
- Lexical transfer: *Manuapo *ansara #place *barsara *yapaya?
- Miskito: Manuapo ansara pleiski barsara yapaya?
Sentence 5
- Intended: Is there a place to wash clothes?
- Lexical transfer: *Nara *pleis *kum *apo *cuala *tuskaya?
- Miskito: Nara pleis kum apo cuala tuskaya?
Sentence 6
- Intended: Tomorrow, I will stay at this lovely hotel.
- Lexical transfer: #I *yauhka *takash *keisna *naha *hotel *pain.
- Miskito: Yang yauhka takash keisna naha hotel pain.
Sentence 7
- Intended: Is there a place to rent bikes?
- Lexical transfer: *Pleeskum *barsekei *baisikil *rin *monaya?
- Miskito: Pleeskum barsekei baisikil rin monaya?
Sentence 8
- Intended: What are you doing tonight?
- Lexical transfer: *Dia *dow *keisma *na #night.
- Miskito: Dia dow keisma na timia.
Sentence 9
- Intended: What do you think of the government?
- Lexical transfer: #You *dia *lukisma *naho *gabament?
- Miskito: Man dia lukisma naho gabament?
Sentence 10
- Intended: Today is Friday.
- Lexical transfer: *Naiwa *lika *praidi.
- Miskito: Naiwa lika praidi.
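In the output lines above, tokens carry Apertium's usual debugging prefixes: `*` for a form the analyser does not know and `#` for a lemma that was translated but could not be generated (`@`, which does not occur above, marks a missing bilingual-dictionary entry). A rough tally per sentence can be sketched as follows. The function name and category labels are illustrative.

```python
from collections import Counter

# Conventional Apertium marker prefixes, mapped to illustrative labels.
MARKERS = {
    "*": "unknown",           # not in the source-language analyser
    "@": "untranslated",      # no bilingual-dictionary entry
    "#": "generation_error",  # target form could not be generated
}

def count_markers(output: str) -> Counter:
    """Count whitespace-separated tokens by their leading marker."""
    counts = Counter()
    for token in output.split():
        counts[MARKERS.get(token[0], "ok")] += 1
    return counts

# Sentence 8 and Sentence 1 from this page.
print(count_markers("*Dia *dow *keisma *na #night."))
print(count_markers("#I with #meat #chicken #eat"))
```

Read this way, Sentences 4 to 10 fail mostly at analysis (unknown stems), while Sentences 1 to 3 are analysed and translated but fail at generation.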
Final Evaluation
Additions
- 200+ stems added
- 2 new lexical selection rules
  - sangni
  - taya
- 3 new disambiguation rules in miq transducer for possessive nouns/verbs and object pronouns
- 1 new transfer rule for possessive nouns
  - Also debugged plural nouns transfer rule
Evaluations
- Precision: 100.00000%
- Recall: 87.00787%
- Coverage over large corpus
  - ~70.08%
  - Tokens in corpus: 13,819
  - Stems in transducer: 344
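The page does not say how precision and recall were computed. One common setup for a morphological transducer compares the set of (form, analysis) pairs it produces against a hand-annotated gold set. The sketch below assumes that setup; the pairs are toy data built from forms that appear on this page, not the actual test set.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (surface form, analysis)

def precision_recall(predicted: Set[Pair], gold: Set[Pair]) -> Tuple[float, float]:
    """Precision = correct / predicted; recall = correct / gold."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Toy gold annotations (hypothetical).
gold = {
    ("Yang", "Yang<prn><p1>"),
    ("brisna", "briaia<v><pres><p1>"),
    ("mango", "mango<n>"),
    ("timia", "timia<n>"),
}
# The transducer returns only correct analyses but misses one form.
predicted = gold - {("timia", "timia<n>")}

p, r = precision_recall(predicted, gold)
print(f"Precision: {p:.2%}  Recall: {r:.2%}")
```

Under this reading, 100% precision with lower recall means every analysis the transducer produced was correct, but some annotated analyses were not produced at all.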
MT miq → eng
Tests run on miq.longer.txt from our bilingual repo
- Word error rate (WER): 91.11 %
- Position-independent word error rate (PER): 73.33 %
- Percentage of unknown words: 3.53 %
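For readers unfamiliar with the two error rates: WER is the word-level edit distance between the system output and a reference translation, divided by the reference length, and PER is the same idea with word order ignored. The sketch below uses one common formulation of each. It is not the evaluation tool that produced the numbers above, and the function names are illustrative.

```python
from collections import Counter
from typing import List

def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance over words (insert, delete, substitute = 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,               # delete r
                           cur[j - 1] + 1,            # insert h
                           prev[j - 1] + (r != h)))   # match or substitute
        prev = cur
    return prev[-1]

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: edit distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)

def per(reference: str, hypothesis: str) -> float:
    """Position-independent error rate (bag-of-words formulation)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)

# Reordered output: every word is right, but the order is wrong.
print(f"WER: {wer('a b c', 'c b a'):.2%}")
print(f"PER: {per('a b c', 'c b a'):.2%}")
```

Because PER ignores order, a large gap between WER and PER (as in the miq → eng figures) suggests that many words are translated correctly but placed in the wrong position.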
Trimmed coverage over miq.longer.txt
- coverage: 95 / 98 (~0.96938775510204081633)
MT eng → miq
Tests run on eng.longer.txt from our bilingual repo
- Word error rate (WER): 100.00 %
- Position-independent word error rate (PER): 96.55 %
- Percentage of unknown words: 46.67 %
Trimmed coverage over eng.longer.txt
- coverage: 67 / 109 (~0.61467889908256880734)