Difference between revisions of "Tiwi and English"

From LING073
(Tiwi → English Evaluation)
(Final Tiw → Eng Evaluation)
 
(24 intermediate revisions by 2 users not shown)
Line 1: Line 1:
  
<i> Note: Resources for machine translation between [https://wikis.swarthmore.edu/ling073/Tiwi Tiwi] and [https://wikis.swarthmore.edu/ling073/English English]" </i>
+
<i> Note: Resources for machine translation between [https://wikis.swarthmore.edu/ling073/Tiwi Tiwi] and [https://wikis.swarthmore.edu/ling073/English English] </i>
  
 
== External Resources ==
 
== External Resources ==
Line 13: Line 13:
 
[https://github.swarthmore.edu/Ling073-sp21/ling073-tiw-eng-corpus Bilingual Corpus]
 
[https://github.swarthmore.edu/Ling073-sp21/ling073-tiw-eng-corpus Bilingual Corpus]
  
== Tiw → Eng Evaluation ==
+
[https://wikis.swarthmore.edu/ling073/Tiwi_and_English/Contrastive_Grammar Contrastive Grammar]
 +
 
 +
[https://wikis.swarthmore.edu/ling073/Tiwi_and_English/Lexical_selection Lexical selection]
 +
 
 +
[https://wikis.swarthmore.edu/ling073/Tiwi_and_English/Structural_transfer Structural Transfer]
 +
 
 +
== Initial Tiw → Eng Evaluation ==
 
The coverage of our monolingual transducer:  
 
The coverage of our monolingual transducer:  
 
coverage: 388 / 1133 (~0.34245366284201235658)
 
coverage: 388 / 1133 (~0.34245366284201235658)
Line 38: Line 44:
 
==== Sentence 4 ====
 
==== Sentence 4 ====
 
* kamini naki?: what is this?
 
* kamini naki?: what is this?
** ^kami<n><prn><itg><m>/what<n><prn><itg><m>/which<n><prn><itg><m>$ ^naki<n><prn><dem><sg>/this<n><prn><dem><sg>$^?<sent>/?<sent>$^.<sent>/.<sent>$
+
** ^kami<prn><itg><m>/what<prn><itg><m>/which<prn><itg><m>$ ^naki<prn><dem><sg>/this<prn><dem><sg>$^?<sent>/?<sent>$^.<sent>/.<sent>$
 
** "#what #this?"
 
** "#what #this?"
  
 
==== Sentence 5 ====
 
==== Sentence 5 ====
 
* ngarra minimarti: he is generous
 
* ngarra minimarti: he is generous
** ^ngarra<n><prn><m><p3><sg>/he<n><prn><m><p3><sg>$ ^minimarti<adj><m>/generous<adj>$^.<sent>/.<sent>$
+
** ^ngarra<prn><m><p3><sg>/he<prn><m><p3><sg>$ ^minimarti<adj><m>/generous<adj>$^.<sent>/.<sent>$
 
** "#generous #he"
 
** "#generous #he"
  
Line 53: Line 59:
 
==== Sentence 7 ====
 
==== Sentence 7 ====
 
* ngarra kijinga: he is small
 
* ngarra kijinga: he is small
** ^ngarra<n><prn><m><p3><sg>/he<n><prn><m><p3><sg>$ ^kiji<n><f>/small<n><f>$^.<sent>/.<sent>$
+
** ^ngarra<prn><m><p3><sg>/he<n><prn><m><p3><sg>$ ^kiji<adj><f>/small<n><f>$^.<sent>/.<sent>$
 
** "#he #small"
 
** "#he #small"
  
 
==== Sentence 8 ====
 
==== Sentence 8 ====
 
* yirrikipayi ngarra tuwara: the crocodile's tail
 
* yirrikipayi ngarra tuwara: the crocodile's tail
** ^yirrikipayi<n><m>/crocodile<n><m>$ ^ngarra<n><prn><m><p3><sg>/he<n><prn><m><p3><sg>$ ^tuwara<n>/tail<n>$^.<sent>/.<sent>$
+
** ^yirrikipayi<n><m>/crocodile<n><m>$ ^ngarra<prn><m><p3><sg>/he<prn><m><p3><sg>$ ^tuwara<n>/tail<n>$^.<sent>/.<sent>$
 
** "#crocodile #he #tail"
 
** "#crocodile #he #tail"
  
 
==== Sentence 9 ====
 
==== Sentence 9 ====
 
* awurra wawurruwi: those are men
 
* awurra wawurruwi: those are men
** ^awurra<n><prn><dem><pl>/those<n><prn><dem><pl>$ ^wawurru<n><pl>/man<n><pl>$^.<sent>/.<sent>$
+
** ^awurra<prn><dem><pl>/those<prn><dem><pl>$ ^wawurru<n><pl>/man<n><pl>$^.<sent>/.<sent>$
 
** "#those men"
 
** "#those men"
  
 
==== Sentence 10 ====
 
==== Sentence 10 ====
* ngiya paruwan: i am hungry  
+
* ngiya paruwani: i am hungry  
** ^ngiya<n><prn><p1><sg>/i<n><prn><p1><sg>$ ^paruwani<adj>/hungry<adj>$^.<sent>/.<sent>$
+
** ^ngiya<prn><p1><sg>/i<prn><p1><sg>$ ^paruwani<adj>/hungry<adj>$^.<sent>/.<sent>$
 
** "#hungry #i"
 
** "#hungry #i"
 +
 +
== Additions ==
 +
=== Expanded Morphological coverage ===
 +
* Added temporal prefixes:
 +
** watu-
 +
*** awatupirni: He fights in the morning
 +
** ki-
 +
*** akipirni: He fights in the evening
 +
* Added reciprocal and reflexive suffixes:
 +
** -ajirri
 +
*** ngaripirnajirri: They hit each other
 +
** -amiya
 +
*** ngaripirnamiya: I hit myself
 +
 +
=== Developed Lexical Selection Rules===
 +
==== Case 1 ====
 +
moyila → unlucky, non-pay week
 +
 +
Example sentences:
 +
 +
{| class="wikitable"
 +
|-
 +
! Tiwi Sentence !! English Translation
 +
|-
 +
|  Ngiya '''moyila''' naki awarra jurra api ngiya karrikamini kunawini ||  This is my '''non-pay week''' so I have no money.
 +
|-
 +
| Ngiya '''moyila''' ngirimi kapi jupuluwu japini. ||  I had '''bad luck''' at cards last night.
 +
|}
 +
 +
==== Case 2 ====
 +
pijara → eye, bullet
 +
 +
Example sentences:
 +
 +
{| class="wikitable"
 +
|-
 +
! Tiwi Sentence !! English Translation
 +
|-
 +
|  Ngiya ngurru-wuriyi kularlaga, api ngi-ri-marruriyi jurruwarli yukurri '''pijara''' ||  I went hunting and took a gun and four '''bullets'''.
 +
|-
 +
| Ngarra yi-pirraya '''pijara''' pili jan ||  He washed his '''eyes''' because they were sore.
 +
|}
 +
=== Additional Structural Transfer Rules ===
 +
 +
==== Insert "Be" Verbs ====
 +
===== Tiw-Eng =====
 +
(tiw) kamini naki? → (eng) what is this?
 +
===== Tagger =====
 +
{{transferMorphTest|tiw|eng|kami{{tag|prn}}{{tag|itg}}{{tag|m}} naki{{tag|prn}}{{tag|dem}}{{tag|sg}} | what {{tag|prn}}{{tag|itg}}{{tag|mf}}{{tag|sp}} be {{tag|vbser}}{{tag|pres}}{{tag|p3}}{{tag|sg}} this{{tag|prn}}{{tag|dem}}{{tag|mf}}{{tag|sg}}    }}
 +
 +
===== Biltrans =====
 +
^kami<prn><itg><m>/what<prn><itg><m>/which<prn><itg><m>$ ^naki<prn><dem><sg>/this<prn><dem><sg>$^.<sent>/.<sent>$
 +
===== Transfer =====
 +
^what<prn><itg><m>$ ^be<vbser><pres><p3><sg>$ ^this<prn><dem><sg>$^.<sent>$
 +
 +
==== Word to Word Phrase ====
 +
===== Tiw-Eng =====
 +
{{transferTest|tiw|eng|jupijupi awi kiyija mirrawu | soup and a little bit of tobacco}}
 +
===== Tagger =====
 +
{{transferMorphTest|tiw|eng|jupijupi{{tag|n}} awi {{tag|cnjcoo}} kiyija{{tag|prn}}{{tag|qnt}} mirrawu {{tag|n}} | soup{{tag|n}}{{tag|sg}} and {{tag|cnjcoo}} a little bit {{tag|adv}} of {{tag|pr}} tobacco {{tag|n}}{{tag|sg}} }}
 +
===== Biltrans =====
 +
^jupijupi<n>/soup<n>$ ^awi<cnjcoo>/and<cnjcoo>$ ^kiyija<prn><qnt>/a little bit<adv>$ ^mirrawu<n>/tobacco<n>$^.<sent>/.<sent>$
 +
===== Transfer =====
 +
^soup<n><sg>$ ^and<cnjcoo>$ ^a little bit<adv>$ ^of<pr>$ ^tobacco<n><sg>$^.<sent>$
 +
 +
== Final Tiw → Eng Evaluation ==
 +
=== Transducer Coverage ===
 +
* Precision and recall against the annotated.basic corpus: not available (the evaluation currently fails with a division-by-zero error)
 +
* Number of words in large text: 3754
 +
* Coverage over large text: 0.39
 +
* Total number of stems: 118
 +
 +
=== Translator Coverage ===
 +
* Word error rate (WER): 94.64 %
 +
* Position-independent word error rate (PER): 93.45 %
 +
* Trimmed coverage
 +
** longer corpora: 0.44
 +
** large corpora: 0.30
 +
* Number of tokens:
 +
** longer: 154
 +
** large: 3501
  
 
[[Category: Tiwi]][[Category:English]][[Category:Sp21_TranslationPairs]]
 
[[Category: Tiwi]][[Category:English]][[Category:Sp21_TranslationPairs]]

Latest revision as of 14:37, 8 May 2021

Note: Resources for machine translation between Tiwi and English

External Resources

Github Repo for Language Pair

Tiwi Transducer

English Transducer

Developed Resources

Bilingual Corpus

Contrastive Grammar

Lexical selection

Structural Transfer

Initial Tiw → Eng Evaluation

The coverage of our monolingual transducer: coverage: 388 / 1133 (~0.34245366284201235658)

The coverage of our bilingual transducer: coverage: 20 / 30 (~0.66666666666666666667)
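As a quick check on the two figures above, coverage is the number of tokens that received an analysis divided by the total number of tokens. The sketch below is a minimal illustration and not the course's coverage script; the function name `coverage` is made up for this example.

```python
def coverage(known: int, total: int) -> float:
    """Fraction of tokens that received at least one analysis."""
    if total == 0:
        raise ValueError("empty corpus: coverage is undefined")
    return known / total


# Counts taken from the evaluation above.
print(f"monolingual: {coverage(388, 1133):.4f}")
print(f"bilingual:   {coverage(20, 30):.4f}")
```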

Sentence Analysis

Sentence 1

  • pirripakijiti: They floated
    • ^akijiti<vblex><iv><past><s_pl3>/float<vblex><iv><past><s_pl3>$^.<sent>/.<sent>$
    • "#float"

Sentence 2

  • pirripakirlumurri: They were tired
    • ^akirlumurri<vblex><iv><past><s_pl3>/be tired<vblex><iv><past><s_pl3>$^.<sent>/.<sent>$
    • "#be tired"

Sentence 3

  • pirriwapa: They ate
    • ^wapa<vblex><iv><past><s_pl3>/eat<vblex><iv><past><s_pl3>$^.<sent>/.<sent>$
    • "#eat"

Sentence 4

  • kamini naki?: what is this?
    • ^kami<prn><itg><m>/what<prn><itg><m>/which<prn><itg><m>$ ^naki<prn><dem><sg>/this<prn><dem><sg>$^?<sent>/?<sent>$^.<sent>/.<sent>$
    • "#what #this?"

Sentence 5

  • ngarra minimarti: he is generous
    • ^ngarra<prn><m><p3><sg>/he<prn><m><p3><sg>$ ^minimarti<adj><m>/generous<adj>$^.<sent>/.<sent>$
    • "#generous #he"

Sentence 6

  • jupijupi awi kiyija mirrawu: soup and a little bit of tobacco
    • ^jupijupi<n>/soup<n>$ ^awi<cnjcoo>/and<cnjcoo>$ ^kiyija<n><prn><qnt>/little<n><prn><qnt>$ ^mirrawu<n>/tobacco<n>$^.<sent>/.<sent>$
    • "#soup and #little #tobacco"

Sentence 7

  • ngarra kijinga: he is small
    • ^ngarra<prn><m><p3><sg>/he<n><prn><m><p3><sg>$ ^kiji<adj><f>/small<n><f>$^.<sent>/.<sent>$
    • "#he #small"

Sentence 8

  • yirrikipayi ngarra tuwara: the crocodile's tail
    • ^yirrikipayi<n><m>/crocodile<n><m>$ ^ngarra<prn><m><p3><sg>/he<prn><m><p3><sg>$ ^tuwara<n>/tail<n>$^.<sent>/.<sent>$
    • "#crocodile #he #tail"

Sentence 9

  • awurra wawurruwi: those are men
    • ^awurra<prn><dem><pl>/those<prn><dem><pl>$ ^wawurru<n><pl>/man<n><pl>$^.<sent>/.<sent>$
    • "#those men"

Sentence 10

  • ngiya paruwani: i am hungry
    • ^ngiya<prn><p1><sg>/i<prn><p1><sg>$ ^paruwani<adj>/hungry<adj>$^.<sent>/.<sent>$
    • "#hungry #i"
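
The biltrans lines in the sentence analysis above follow the Apertium stream format: each lexical unit sits between `^` and `$`, the source reading comes first, and target readings follow after `/`. A rough sketch of a reader for such lines is given below. It assumes lemmas contain no escaped `/`, `^`, or `$` characters, and the helper names `parse_reading` and `parse_biltrans` are invented for illustration.

```python
import re

LU_RE = re.compile(r"\^(.*?)\$")    # one lexical unit: ^ ... $
TAG_RE = re.compile(r"<([^<>]+)>")  # one tag: <...>


def parse_reading(reading):
    """Split 'lemma<tag1><tag2>' into (lemma, [tag1, tag2])."""
    lemma = reading.split("<", 1)[0]
    return lemma, TAG_RE.findall(reading)


def parse_biltrans(line):
    """Return [(source_reading, [target_readings])] for each lexical unit."""
    units = []
    for body in LU_RE.findall(line):
        source, *targets = body.split("/")
        units.append((parse_reading(source), [parse_reading(t) for t in targets]))
    return units


# Sentence 5 from the analysis above.
line = ("^ngarra<prn><m><p3><sg>/he<prn><m><p3><sg>$ "
        "^minimarti<adj><m>/generous<adj>$^.<sent>/.<sent>$")
for source, targets in parse_biltrans(line):
    print(source, "->", targets)
```

Listing the readings this way makes mismatches easy to spot, such as the `<prn>` source tags paired with `<n><prn>` target tags in Sentence 7.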

Additions

Expanded Morphological coverage

  • Added temporal prefixes:
    • watu-
      • awatupirni: He fights in the morning
    • ki-
      • akipirni: He fights in the evening
  • Added reciprocal and reflexive suffixes:
    • -ajirri
      • ngaripirnajirri: They hit each other
    • -amiya
      • ngaripirnamiya: I hit myself

Developed Lexical Selection Rules

Case 1

moyila → unlucky, non-pay week

Example sentences:

{| class="wikitable"
|-
! Tiwi Sentence !! English Translation
|-
| Ngiya '''moyila''' naki awarra jurra api ngiya karrikamini kunawini || This is my '''non-pay week''' so I have no money.
|-
| Ngiya '''moyila''' ngirimi kapi jupuluwu japini. || I had '''bad luck''' at cards last night.
|}

Case 2

pijara → eye, bullet

Example sentences:

{| class="wikitable"
|-
! Tiwi Sentence !! English Translation
|-
| Ngiya ngurru-wuriyi kularlaga, api ngi-ri-marruriyi jurruwarli yukurri '''pijara''' || I went hunting and took a gun and four '''bullets'''.
|-
| Ngarra yi-pirraya '''pijara''' pili jan || He washed his '''eyes''' because they were sore.
|}
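
The idea behind both cases is that a context word in the sentence decides which English sense is picked. The toy sketch below illustrates that idea only; it is not the pair's actual lexical selection rule file. The rule table, the choice of default sense, and the trigger tokens are all assumptions: the triggers are simply tokens copied from the example sentences, not verified glosses.

```python
# Hypothetical rule table: a default English sense per ambiguous Tiwi lemma,
# plus (trigger tokens -> sense) overrides. Triggers are tokens copied from
# the example sentences above and are NOT verified glosses.
RULES = {
    "pijara": {
        "default": "eye",
        "overrides": [({"jurruwarli"}, "bullet")],
    },
    "moyila": {
        "default": "unlucky",
        "overrides": [({"kunawini"}, "non-pay week")],
    },
}


def tokenize(sentence):
    """Lowercase and strip basic punctuation from whitespace-split tokens."""
    return [tok.strip(".,?!").lower() for tok in sentence.split()]


def select(lemma, sentence):
    """Pick a sense for `lemma` based on trigger tokens in `sentence`."""
    rule = RULES[lemma]
    tokens = set(tokenize(sentence))
    for triggers, sense in rule["overrides"]:
        if triggers & tokens:
            return sense
    return rule["default"]


print(select("pijara", "Ngarra yi-pirraya pijara pili jan"))
print(select("moyila", "Ngiya moyila naki awarra jurra api ngiya karrikamini kunawini"))
```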

Additional Structural Transfer Rules

Insert "Be" Verbs

Tiw-Eng

(tiw) kamini naki? → (eng) what is this?

Tagger

(tiw) kami<prn><itg><m> naki<prn><dem><sg> → (eng) what <prn><itg><mf><sp> be <vbser><pres><p3><sg> this<prn><dem><mf><sg>

Biltrans

^kami<prn><itg><m>/what<prn><itg><m>/which<prn><itg><m>$ ^naki<prn><dem><sg>/this<prn><dem><sg>$^.<sent>/.<sent>$

Transfer

^what<prn><itg><m>$ ^be<vbser><pres><p3><sg>$ ^this<prn><dem><sg>$^.<sent>$
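
The effect of this rule can be sketched outside Apertium as a pass over the target-side lexical units that inserts a copula between an interrogative pronoun and a following demonstrative pronoun. This is only an illustration of the pattern; the real rule lives in the pair's transfer file, and the function names `insert_be` and `render` are invented here.

```python
def insert_be(units):
    """units: list of (lemma, tags) target-side lexical units."""
    out = []
    for i, (lemma, tags) in enumerate(units):
        out.append((lemma, tags))
        nxt = units[i + 1] if i + 1 < len(units) else None
        is_itg_prn = "prn" in tags and "itg" in tags
        # Insert the copula only when a demonstrative pronoun follows directly.
        if is_itg_prn and nxt and "prn" in nxt[1] and "dem" in nxt[1]:
            out.append(("be", ["vbser", "pres", "p3", "sg"]))
    return out


def render(units):
    """Write units back out in ^lemma<tag>...$ form, space-separated."""
    return " ".join(
        "^" + lemma + "".join(f"<{t}>" for t in tags) + "$" for lemma, tags in units
    )


units = [("what", ["prn", "itg", "m"]), ("this", ["prn", "dem", "sg"]), (".", ["sent"])]
print(render(insert_be(units)))
```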

Word to Word Phrase

Tiw-Eng

(tiw) jupijupi awi kiyija mirrawu → (eng) soup and a little bit of tobacco

Tagger

(tiw) jupijupi<n> awi <cnjcoo> kiyija<prn><qnt> mirrawu <n> → (eng) soup<n><sg> and <cnjcoo> a little bit <adv> of <pr> tobacco <n><sg>

Biltrans

^jupijupi<n>/soup<n>$ ^awi<cnjcoo>/and<cnjcoo>$ ^kiyija<prn><qnt>/a little bit<adv>$ ^mirrawu<n>/tobacco<n>$^.<sent>/.<sent>$

Transfer

^soup<n><sg>$ ^and<cnjcoo>$ ^a little bit<adv>$ ^of<pr>$ ^tobacco<n><sg>$^.<sent>$

Final Tiw → Eng Evaluation

Transducer Coverage

  • Precision and recall against the annotated.basic corpus: not available (the evaluation currently fails with a division-by-zero error)
  • Number of words in large text: 3754
  • Coverage over large text: 0.39
  • Total number of stems: 118

Translator Coverage

  • Word error rate (WER): 94.64 %
  • Position-independent word error rate (PER): 93.45 %
  • Trimmed coverage
    • longer corpora: 0.44
    • large corpora: 0.30
  • Number of tokens:
    • longer: 154
    • large: 3501
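
For readers unfamiliar with the two error rates, the sketch below shows one common way to compute them. WER is the word-level edit distance divided by the reference length. PER ignores word order and counts only unmatched words. This is an assumption-laden illustration, not the evaluation tool used for the figures above, and PER definitions vary slightly between tools.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def wer(ref, hyp):
    """Word error rate: edit distance over reference length."""
    if not ref:
        raise ValueError("empty reference")
    return edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Position-independent error rate (one common definition)."""
    if not ref:
        raise ValueError("empty reference")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


# Sentence 5: reference translation vs. the initial system output, markers removed.
ref = "he is generous".split()
hyp = "generous he".split()
print(f"WER = {wer(ref, hyp):.2%}, PER = {per(ref, hyp):.2%}")
```

On this pair the two metrics diverge sharply, because the output contains the right words in the wrong order. The near-identical WER and PER reported above suggest that most errors in the final system come from missing or untranslated words rather than from reordering.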