Latin and Mandarin Chinese/Structural transfer

From LING073

Revision as of 21:34, 14 April 2018

This is the page for the structural transfer of Latin and Mandarin Chinese. The main page for this language pair can be found here.

Pre-evaluation

Latin corpus coverage

Number of tokenised words in the corpus: 380

Coverage: 88.68%

Top unknown words in the corpus:

2 potest

2 facet

2 possit

2 quo

1 tibi

1 Mariaene

1 quid

1 audit

1 possum

1 matrae

1 James

1 loquent

1 audire

1 not

1 Videbasne

1 duo

1 poterunt

1 eae

1 aliquid

1 posset

Chinese corpus coverage

Number of tokenised words in the corpus: 447

Coverage: 100.00%
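The coverage figures above can be reproduced from analyser output. The following is a minimal sketch, assuming coverage is the share of tokens that receive at least one analysis and that the analyser emits Apertium stream format, where an unknown word appears as `^word/*word$`. The function name `coverage` and the sample string are illustrative and not taken from the project.

```python
import re
from typing import Tuple

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage(analysed_text: str) -> Tuple[int, int, float]:
    """Return (known tokens, total tokens, coverage in percent)."""
    units = LEXICAL_UNIT.findall(analysed_text)
    total = len(units)
    # Unknown words carry a single pseudo-analysis that starts with '*'.
    known = sum(1 for _surface, analyses in units if not analyses.startswith("*"))
    percent = 100.0 * known / total if total else 0.0
    return known, total, percent


# Hypothetical analyser output: two known tokens and one unknown token.
sample = "^Maria/Maria<np><ant><f><sg><nom>$ ^potest/*potest$ ^./.<sent>$"
known, total, percent = coverage(sample)
print(f"{known}/{total} = {percent:.2f}%")
```

Under this definition, a coverage of 88.68% on 380 tokens would correspond to 337 known tokens (337 / 380 ≈ 0.8868).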

lat → zho

WER: 586.79%

PER: 586.79%

zho → lat

WER: 97.42%

PER: 93.55%
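For reference, the sketch below shows how a word error rate of this kind is commonly computed: the word-level edit distance between the reference and the system output, divided by the length of the reference. This assumes the standard definition; the page does not say which evaluation tool produced the numbers above, and the function name `word_error_rate` is illustrative.

```python
from typing import List


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """Word-level Levenshtein distance divided by reference length, in percent."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(reference)


# A one-word reference against a three-word output needs three edits.
print(word_error_rate(["a"], ["b", "c", "d"]))
```

Because the denominator is the reference length, the rate can exceed 100% whenever the output contains many more tokens than the reference.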

Implementation

lat → zho

For translation into Chinese, we implemented macros that strip the case tags from nouns and add the associative particle 的 after nouns that are in the genitive case in Latin.

Maria caput pueri videt.

"Maria sees the boy's head."

tagger: ^Maria<np><ant><f><sg><nom>$ ^caput<n><nt><sg><nom>$ ^puer<n><m><sg><gen>$ ^videre<vblex><pri><act><p3><sg>$^.<sent>$

biltrans: ^Maria<np><ant><f><sg><nom>/小红<np><ant><f><sg><nom>$ ^caput<n><nt><sg><nom>/头<n><nt><sg><nom>$ ^puer<n><m><sg><gen>/男孩<n><m><sg><gen>$ ^videre<vblex><pri><act><p3><sg>/看到<vblex><pri><act><p3><sg>$^.<sent>/。<sent>$

chunker: apertium-transfer: Rule 2 caput<n><nt><sg><nom>/头<n><nt><sg><nom>

apertium-transfer: Rule 2 puer<n><m><sg><gen>/男孩<n><m><sg><gen>

apertium-transfer: Rule 1 .<sent>/。<sent>

^default<default>{^小红<np><ant><f><sg><nom>$}$ ^noun<SN>{^头<n>$ }$ ^noun<SN>{^男孩<n>$ ^的<pr>$}$ ^default<default>{^看到<vblex><pri><act><p3><sg>$}$^sent<SENT>{^。<sent>$}$

interchunk: Rule 1 noun<SN>{^男孩<n>$ ^的<pr>$}

^default<default>{^小红<np><ant><f><sg><nom>$}$ ^noun<SN>{^头<n>$ }$ ^noun<SN>{^男孩<n>$ ^的<pr>$}$ ^default<default>{^看到<vblex><pri><act><p3><sg>$}$^sent<SENT>{^。<sent>$}$

postchunk: ^小红<np><ant><f><sg><nom>$ ^头<n>$ ^男孩<n>$ ^的<pr>$ ^看到<vblex><pri><act><p3><sg>$^。<sent>$

lat-zho: #小红 头 男孩 的 #看到。
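The effect of the noun macro on the example above can be imitated outside Apertium. The following is a rough Python sketch, not the actual transfer rule: it assumes, based on the chunker output shown, that common nouns keep only the `<n>` tag and that genitive nouns are followed by `^的<pr>$`. The function name `transfer_noun` is hypothetical.

```python
import re

TAG = re.compile(r"<([^>]+)>")


def transfer_noun(target_lu: str) -> str:
    """Imitate the noun macro on a target-side lexical unit such as 男孩<n><m><sg><gen>."""
    lemma = target_lu.split("<", 1)[0]
    tags = TAG.findall(target_lu)
    if "n" not in tags:
        # Anything that is not a common noun passes through unchanged,
        # as with the default chunks in the trace above.
        return f"^{target_lu}$"
    # Assumption: only the part-of-speech tag survives, as in ^头<n>$.
    output = f"^{lemma}<n>$"
    if "gen" in tags:
        # Genitive nouns are followed by the associative particle.
        output += " ^的<pr>$"
    return output


for unit in ["小红<np><ant><f><sg><nom>", "头<n><nt><sg><nom>", "男孩<n><m><sg><gen>"]:
    print(transfer_noun(unit))
```

As in the trace above, this sketch leaves word order untouched: the possessor chunk 男孩 的 still follows 头.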