Latin and Mandarin Chinese/Structural transfer
From LING073
Revision as of 19:53, 11 April 2018
This is the page for the structural transfer of Latin and Mandarin Chinese. The main page for this language pair can be found here.
Pre-evaluation
Latin corpus coverage
Number of tokenised words in the corpus: 380
Coverage: 88.68%
Top unknown words in the corpus:
2 potest
2 facet
2 possit
2 quo
1 tibi
1 Mariaene
1 quid
1 audit
1 possum
1 matrae
1 James
1 loquent
1 audire
1 not
1 Videbasne
1 duo
1 poterunt
1 eae
1 aliquid
1 posset
Chinese corpus coverage
Number of tokenised words in the corpus: 447
Coverage: 100.00%
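The coverage figures above are the share of tokens for which the analyser returned at least one analysis. For the Latin corpus, 88.68% of 380 tokens corresponds to 337 known and 43 unknown tokens. The following is a minimal sketch of that calculation, not the script actually used for this page. The function name `coverage_report` and the `(surface_form, is_known)` input format are assumptions made for illustration.

```python
from collections import Counter

def coverage_report(analyses):
    """Summarise analyser coverage.

    `analyses` is a hypothetical input format: a list of
    (surface_form, is_known) pairs, one pair per token.
    Returns (total tokens, coverage in percent, unknown forms by frequency).
    """
    total = len(analyses)
    unknown = Counter(form for form, known in analyses if not known)
    known_count = total - sum(unknown.values())
    coverage = 100.0 * known_count / total if total else 0.0
    return total, coverage, unknown.most_common()

# Toy input, not taken from the real corpus.
toy = [("puella", True), ("potest", False), ("potest", False), ("et", True)]
total, coverage, top_unknown = coverage_report(toy)
print(f"Number of tokenised words in the corpus: {total}")
print(f"Coverage: {coverage:.2f}%")
for form, count in top_unknown:
    print(count, form)

# Arithmetic check against the Latin figure reported above.
print(round(100 * 337 / 380, 2))  # 88.68
```

Sorting the unknown forms by frequency gives the same kind of "top unknown words" list as shown above, which is a convenient order for deciding which stems to add to the analyser first.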
lat → zho
WER: 586.79%
PER: 586.79%
zho → lat
WER: 97.42%
PER: 93.55%
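WER (word error rate) and PER (position-independent error rate) are both normalised by the length of the reference translation, so either can exceed 100% when the system output contains many more tokens than the reference. This is one way a value such as 586.79% can arise. The sketch below shows one common formulation of each metric. It is not necessarily the exact definition used by the evaluation tool behind the numbers above, and the function names are chosen for illustration.

```python
from collections import Counter

def word_edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def wer(ref, hyp):
    """Word error rate in percent, relative to the reference length."""
    return 100.0 * word_edit_distance(ref, hyp) / len(ref)

def per(ref, hyp):
    """Position-independent error rate in percent (assumed definition):
    words are matched as bags, ignoring order."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return 100.0 * (max(len(ref), len(hyp)) - matches) / len(ref)

# Reordered output: WER penalises the word order, PER does not.
ref = ["a", "b", "c"]
hyp = ["c", "b", "a"]
print(f"WER: {wer(ref, hyp):.2f}%")  # WER: 66.67%
print(f"PER: {per(ref, hyp):.2f}%")  # PER: 0.00%

# Output much longer than the reference: both rates exceed 100%.
print(f"WER: {wer(['x'], ['x', 'y', 'z']):.2f}%")  # WER: 200.00%
print(f"PER: {per(['x'], ['x', 'y', 'z']):.2f}%")  # PER: 200.00%
```

Under these definitions PER is never higher than WER, so identical WER and PER values, as in the lat → zho direction, suggest that word order contributes nothing to the measured error there.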