Difference between revisions of "Kaingang and Portuguese/Structural Transfer"

From LING073
Revision as of 14:08, 19 April 2019

Pre-evaluation

Statistics about input files
-------------------------------------------------------
Number of words in reference: 63
Number of words in test: 63 
Number of unknown words (marked with a star) in test: 26
Percentage of unknown words: 41.27 %

Results when removing unknown-word marks (stars)
-------------------------------------------------------
Edit distance: 38
Word error rate (WER): 60.32 %
Number of position-independent correct words: 25
Position-independent word error rate (PER): 60.32 %

Results when unknown-word marks (stars) are not removed
-------------------------------------------------------
Edit distance: 63
Word Error Rate (WER): 100.00 %
Number of position-independent correct words: 0
Position-independent word error rate (PER): 100.00 %

Statistics about the translation of unknown words
-------------------------------------------------------
Number of unknown words which were free rides: 25
Percentage of unknown words that were free rides: 96.15 %
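
The percentages in the report above can be re-derived from the raw counts it prints. The sketch below is a check of that arithmetic, not the evaluation script's own code. It assumes PER reduces to (reference words minus position-independent correct words) divided by reference words, which is reasonable here because test and reference both have 63 words. The function and variable names are illustrative.

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

# Counts copied from the evaluation report (stars removed).
reference_words = 63   # "Number of words in reference"
unknown_words = 26     # "Number of unknown words ... in test"
edit_distance = 38     # "Edit distance"
pi_correct = 25        # "Number of position-independent correct words"
free_rides = 25        # "Number of unknown words which were free rides"

wer = percentage(edit_distance, reference_words)
# Assumption: with equal-length test and reference, PER is the share
# of reference words that have no position-independent match.
per = percentage(reference_words - pi_correct, reference_words)
unknown_rate = percentage(unknown_words, reference_words)
free_ride_rate = percentage(free_rides, unknown_words)

print(f"WER: {wer:.2f} %")
print(f"PER: {per:.2f} %")
print(f"Unknown words: {unknown_rate:.2f} %")
print(f"Free rides: {free_ride_rate:.2f} %")
```

All four values agree with the report, and WER equals PER because the edit distance (38) happens to equal the number of reference words without a position-independent match (63 - 25).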

  • WER: 60.32%
  • PER: 60.32%
  • Coverage:
$ aq-covtest ling073-kgp-por-corpus/kgp.tests.txt ling073-kgp-por/kgp-por.automorf.bin
Number of tokenised words in the corpus: 82
Coverage: 64.63%
Top unknown words in the corpus:
3	 vỹ
2	 fi
1	 kafã
1	 Nũgnũj
1	 kur
1	 ẽgno
1	 tũg
1	 São
1	 Pau
1	 o
1	 rã
1	 jur
1	 tá
1	 Téj
1	 ki
1	 panh
1	 kãfór
1	 kyrũ
1	 jãmré
1	 ũn
Translation time: 0.0028295516967773438 seconds
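
As a rough illustration of what the coverage figure measures, the sketch below counts known tokens in morphological analyser output, where Apertium marks an unanalysed token with a leading asterisk (for example `^vỹ/*vỹ$`). This is an assumption about how such a number can be computed, not a description of how `aq-covtest` works internally. The sample string is hand-built from words on this page, and the function name `coverage` is hypothetical.

```python
import re

def coverage(analysed: str) -> float:
    """Percentage of lexical units that received at least one analysis."""
    units = re.findall(r"\^(.*?)\$", analysed)
    # An unknown token is echoed back with a leading '*', e.g. ^vỹ/*vỹ$.
    known = sum(1 for unit in units if "/*" not in unit)
    return 100.0 * known / len(units)

# Hand-built sample: two known tokens and two from the unknown-word list.
sample = "^tóg/tóg<su>$ ^vỹ/*vỹ$ ^he/he<v><tv>$ ^fi/*fi$"
print(f"Coverage: {coverage(sample):.2f}%")

# The reported 64.63% of 82 tokens corresponds to 53 known tokens.
print(f"Coverage: {100.0 * 53 / 82:.2f}%")
```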

Examples for implementation

Sentence:

“Inh kã’u ã,” he tóg. “Inh mỹ ã tóg tỹ, ũ nĩ ven nĩ vẽ, hã ra ã tóg tỹ, ã hã nĩ”, he tóg.

Portuguese Translation:

“Você me assustou, pensei que você era um outro mas é você mesmo”, ele falou para mim.

Tagger output:

Biltrans output:

“^Inh<prn><pes>/Eu<prn><tn><pes>/Meu<prn><tn><pes>/De mim<prn><tn><pes>$ ^kã’u<v><tv>/assustar<vblex>$ ^ã<prn>/teu<prn><tn>/seu<prn><tn>/você<prn><tn>/tu<prn><tn><p2><mf><sg>$^,<cm>/,<cm>$” ^he<v><tv>/dizer<vblex>$ ^tóg<su>/ele<prn><tn>$^.<sent>/.<sent>$ “^Inh<prn><pes>/Eu<prn><tn><pes>/Meu<prn><tn><pes>/De mim<prn><tn><pes>$ ^mỹ<cir>/para<pr>$ ^ã<prn>/teu<prn><tn>/seu<prn><tn>/você<prn><tn>/tu<prn><tn><p2><mf><sg>$ ^tóg<su>/ele<prn><tn>$ ^tỹ<cir>/por<pr>/com<pr>$^,<cm>/,<cm>$ ^ũ<prn>/alguém<prn><tn>$ ^nĩ<a>/no momento<adv>$ ^ven<v><tv><sg>/mostrar<vblex>$ ^nĩ<a>/no momento<adv>$ ^vẽ<a>/é<vblex>/era<vblex>/ser<vblex>$^,<cm>/,<cm>$ ^hã<o>/igual<adj>/parecido<adj>$ ^ra<cir>/para<pr>/apesar do<cnjadv>$ ^ã<prn>/teu<prn><tn>/seu<prn><tn>/você<prn><tn>/tu<prn><tn><p2><mf><sg>$ ^tóg<su>/ele<prn><tn>$ ^tỹ<cir>/por<pr>/com<pr>$^,<cm>/,<cm>$ ^ã<prn>/teu<prn><tn>/seu<prn><tn>/você<prn><tn>/tu<prn><tn><p2><mf><sg>$ ^hã<o>/igual<adj>/parecido<adj>$ ^nĩ<a>/no momento<adv>$”^,<cm>/,<cm>$ ^he<v><tv>/dizer<vblex>$ ^tóg<su>/ele<prn><tn>$^.<sent>/.<sent>$^.<sent>/.<sent>
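
The biltrans output is easier to inspect once each lexical unit is split into its source reading and its candidate target readings. In the Apertium stream format a unit is delimited by `^` and `$`, and the readings inside it are separated by `/`. The sketch below is a minimal reader for that layout. The names `Reading`, `parse_reading` and `parse_biltrans` are illustrative, and escaped delimiters are deliberately not handled.

```python
import re
from typing import NamedTuple

class Reading(NamedTuple):
    lemma: str
    tags: list

def parse_reading(text: str) -> Reading:
    """Split 'lemma<tag1><tag2>' into the lemma and its tag list."""
    lemma = text.split("<", 1)[0]
    tags = re.findall(r"<([^>]+)>", text)
    return Reading(lemma, tags)

def parse_biltrans(stream: str) -> list:
    """Return (source, [targets]) pairs for every ^...$ unit.

    Simplification: backslash-escaped '/', '^' or '$' are not handled.
    """
    units = []
    for body in re.findall(r"\^(.*?)\$", stream):
        source, *targets = body.split("/")
        units.append((parse_reading(source),
                      [parse_reading(t) for t in targets]))
    return units

# Excerpt copied from the biltrans output above.
excerpt = "^tỹ<cir>/por<pr>/com<pr>$^,<cm>/,<cm>$ ^ũ<prn>/alguém<prn><tn>$"
for source, targets in parse_biltrans(excerpt):
    print(source.lemma, source.tags, "->", [t.lemma for t in targets])
```

Listing the units this way makes the ambiguous entries stand out, such as `tỹ` with both `por` and `com` as candidates.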

Chunker output:

apertium-transfer: Rule 1 .<sent>/.<sent>
apertium-transfer: Rule 2 mỹ<cir>/para<pr>
apertium-transfer: Rule 3 ã<prn>/tu<prn><tn><p2><mf><sg> tóg<su>/ele<prn><tn>
apertium-transfer: Rule 2 tỹ<cir>/por<pr>/com<pr>
apertium-transfer: Rule 2 ra<cir>/para<pr>/apesar do<cnjadv>
apertium-transfer: Rule 3 ã<prn>/tu<prn><tn><p2><mf><sg> tóg<su>/ele<prn><tn>
apertium-transfer: Rule 2 tỹ<cir>/por<pr>/com<pr>
apertium-transfer: Rule 1 .<sent>/.<sent>
apertium-transfer: Rule 1 .<sent>/.<sent>
“^default<default>{^Eu<prn><tn><pes>$}$ ^default<default>{^assustar<vblex>$}$ ^default<default>{^tu<prn><tn><p2><mf><sg>$}$^default<default>{^,<cm>$}$” ^default<default>{^dizer<vblex>$}$ ^default<default>{^ele<prn><tn>$}$^sent<SENT>{^.<sent>$}$ “^default<default>{^Eu<prn><tn><pes>$}$ ^pr<SP>{^para<pr>$}$ ^prn<SN><CD>{^tu<prn><tn><p2><mf><sg><2>$}$ ^pr<SP>{^por<pr>$}$^default<default>{^,<cm>$}$ ^default<default>{^alguém<prn><tn>$}$ ^default<default>{^no momento<adv>$}$ ^default<default>{^mostrar<vblex>$}$ ^default<default>{^no momento<adv>$}$ ^default<default>{^é<vblex>$}$^default<default>{^,<cm>$}$ ^default<default>{^igual<adj>$}$ ^pr<SP>{^para<pr>$}$ ^prn<SN><CD>{^tu<prn><tn><p2><mf><sg><2>$}$ ^pr<SP>{^por<pr>$}$^default<default>{^,<cm>$}$ ^default<default>{^tu<prn><tn><p2><mf><sg>$}$ ^default<default>{^igual<adj>$}$ ^default<default>{^no momento<adv>$}$”^default<default>{^,<cm>$}$ ^default<default>{^dizer<vblex>$}$ ^default<default>{^ele<prn><tn>$}$^sent<SENT>{^.<sent>$}$^sent<SENT>{^.<sent>$}$


Interchunk output:

apertium-interchunk: Rule 1 prn<SN><CD>{^tu<prn><tn><p2><mf><sg><2>$}
apertium-interchunk: Rule 2 prn<SN><CD>{^tu<prn><tn><p2><mf><sg><2>$} pr<SP>{^por<pr>$}
apertium-interchunk: Rule 1 prn<SN><CD>{^tu<prn><tn><p2><mf><sg><2>$}
apertium-interchunk: Rule 2 prn<SN><CD>{^tu<prn><tn><p2><mf><sg><2>$} pr<SP>{^por<pr>$}
“^default<default>{^Eu<prn><tn><pes>$}$ ^default<default>{^assustar<vblex>$}$ ^default<default>{^tu<prn><tn><p2><mf><sg>$}$^default<default>{^,<cm>$}$” ^default<default>{^dizer<vblex>$}$ ^default<default>{^ele<prn><tn>$}$^sent<SENT>{^.<sent>$}$ “^default<default>{^Eu<prn><tn><pes>$}$ ^pr<SP>{^para<pr>$}$ ^pr<SP>{^por<pr>$}$ ^prn<SN><obj>{^tu<prn><tn><p2><mf><sg><2>$}$^default<default>{^,<cm>$}$ ^default<default>{^alguém<prn><tn>$}$ ^default<default>{^no momento<adv>$}$ ^default<default>{^mostrar<vblex>$}$ ^default<default>{^no momento<adv>$}$ ^default<default>{^é<vblex>$}$^default<default>{^,<cm>$}$ ^default<default>{^igual<adj>$}$ ^pr<SP>{^para<pr>$}$ ^pr<SP>{^por<pr>$}$ ^prn<SN><obj>{^tu<prn><tn><p2><mf><sg><2>$}$^default<default>{^,<cm>$}$ ^default<default>{^tu<prn><tn><p2><mf><sg>$}$ ^default<default>{^igual<adj>$}$ ^default<default>{^no momento<adv>$}$”^default<default>{^,<cm>$}$ ^default<default>{^dizer<vblex>$}$ ^default<default>{^ele<prn><tn>$}$^sent<SENT>{^.<sent>$}$^sent<SENT>{^.<sent>$}$
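
Comparing the chunker and interchunk outputs shows the visible effect of interchunk Rule 2: a `prn<SN><CD>` chunk followed by a `pr<SP>` chunk comes out as `pr<SP>` followed by `prn<SN><obj>`. The sketch below imitates that single reordering on a chunk string. It is not the actual rule from the pair's `.t2x` file, which is not shown on this page, and all function names are hypothetical.

```python
import re

# A chunk looks like ^name<tag>...{^lu$ ^lu$}$ in the stream format.
CHUNK = re.compile(r"\^([^<{]+)((?:<[^>]+>)*)\{(.*?)\}\$")

def parse_chunks(stream: str) -> list:
    """Return (name, tags, body) triples for every chunk in the stream."""
    return [(name, re.findall(r"<([^>]+)>", tags), body)
            for name, tags, body in CHUNK.findall(stream)]

def move_pronoun_after_preposition(chunks: list) -> list:
    """Swap each SN.CD chunk with a following SP chunk, retagging CD as obj."""
    out, i = [], 0
    while i < len(chunks):
        cur = chunks[i]
        nxt = chunks[i + 1] if i + 1 < len(chunks) else None
        if nxt and cur[1] == ["SN", "CD"] and nxt[1] == ["SP"]:
            out.append(nxt)
            out.append((cur[0], ["SN", "obj"], cur[2]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out

def render(chunks: list) -> str:
    """Serialise chunks back to stream format, separated by spaces."""
    return " ".join(
        "^" + name + "".join(f"<{t}>" for t in tags) + "{" + body + "}$"
        for name, tags, body in chunks
    )

# Excerpt copied from the chunker output above.
chunker_out = ("^pr<SP>{^para<pr>$}$ "
               "^prn<SN><CD>{^tu<prn><tn><p2><mf><sg><2>$}$ "
               "^pr<SP>{^por<pr>$}$")
print(render(move_pronoun_after_preposition(parse_chunks(chunker_out))))
```

On this excerpt the result has the same chunk order and tags as the corresponding span of the interchunk output.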


Postchunk output:

kgp-por output:

Post-evaluation

  • WER:
  • PER:
  • Coverage: