Difference between revisions of "Nuosu and Mandarin/Structural transfer"

From LING073

Revision as of 15:39, 29 April 2022

Pre-evaluation

Test file: 'iii-zho.tests.txt'
Reference file 'zho.tests.txt'

Statistics about input files


Number of words in reference: 57
Number of words in test: 61
Number of unknown words (marked with a star) in test: 17
Percentage of unknown words: 27.87 %

Results when removing unknown-word marks (stars)


Edit distance: 44
Word error rate (WER): 77.19 %
Number of position-independent correct words: 33
Position-independent word error rate (PER): 49.12 %

Results when unknown-word marks (stars) are not removed


Edit distance: 44
Word Error Rate (WER): 77.19 %
Number of position-independent correct words: 33
Position-independent word error rate (PER): 49.12 %

Statistics about the translation of unknown words


Number of unknown words which were free rides: 0
Percentage of unknown words that were free rides: 0.00 %
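
The percentages above can be reproduced from the raw counts. The following is a minimal sketch, not the evaluation tool's own code: the function names are hypothetical, and the PER formula (missed reference words plus surplus test words, divided by reference length) is an assumption chosen because it matches the reported 49.12 %.

```python
from collections import Counter


def word_edit_distance(ref, test):
    """Levenshtein distance between two word lists (insert/delete/substitute cost 1)."""
    prev = list(range(len(test) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, t in enumerate(test, start=1):
            cost = 0 if r == t else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a test word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def position_independent_correct(ref, test):
    """Number of words shared by both lists, ignoring order (multiset intersection)."""
    return sum((Counter(ref) & Counter(test)).values())


def wer(edit_distance, ref_words):
    """Word error rate as a percentage of the reference length."""
    return 100.0 * edit_distance / ref_words


def per(correct, ref_words, test_words):
    """Position-independent error rate (assumed formula, see the note above)."""
    errors = (ref_words - correct) + max(0, test_words - ref_words)
    return 100.0 * errors / ref_words


def unknown_percentage(unknown, test_words):
    """Share of test words that the translator left marked as unknown."""
    return 100.0 * unknown / test_words


# Counts taken from the pre-evaluation output above.
print(f"WER: {wer(44, 57):.2f} %")
print(f"PER: {per(33, 57, 61):.2f} %")
print(f"Unknown: {unknown_percentage(17, 61):.2f} %")
```

Because WER depends on word order while PER does not, the gap between 77.19 % and 49.12 % suggests that a good share of the errors are reordering errors, which is what structural transfer rules are meant to address.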


Implementation

We added most of the unknown words to the Nuosu transducer, the Mandarin transducer, and our bilingual dictionary. We also implemented rules #1, #2, #4, and #5 from the Nuosu and Mandarin contrastive grammar page (https://wikis.swarthmore.edu/ling073/Nuosu_and_Mandarin/Contrastive_grammar).