Difference between revisions of "Fijian and English"

From LING073
*Notes:
 
:-Unknown words in English: ''my'' and ''your'' in ''my eye'' and ''your father''. (With Fijian bound nouns such as 'eye' and 'father', possessive pronouns are suffixes attached to the noun stem rather than separate words. How should the English possessive pronouns be translated?)
 
 
:-The English indefinite determiner ''a'' is not translated correctly. In most cases it corresponds to ''e dua a'' in Fijian (gloss: it one ART); for example, ''a village'' = ''e dua a ’oro''. How can ''e dua a'' be added to the dictionary as a single unit?
 
  
 

Revision as of 21:17, 4 April 2018

Resources for machine translation between Fijian and English

fij → eng evaluation

Current WER and PER:

Test file: 'fij-eng.tests.txt'
Reference file 'eng.tests.txt'

Statistics about input files
-------------------------------------------------------
Number of words in reference: 56
Number of words in test: 59
Number of unknown words (marked with a star) in test: 0
Percentage of unknown words: 0.00 %

Results when removing unknown-word marks (stars)
-------------------------------------------------------
Edit distance: 50
Word error rate (WER): 89.29 %
Number of position-independent correct words: 12
Position-independent word error rate (PER): 83.93 %

Results when unknown-word marks (stars) are not removed
-------------------------------------------------------
Edit distance: 50
Word Error Rate (WER): 89.29 %
Number of position-independent correct words: 12
Position-independent word error rate (PER): 83.93 %

Statistics about the translation of unknown words
-------------------------------------------------------
Number of unknown words which were free rides: 0
Percentage of unknown words that were free rides: 0%
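
For readers unfamiliar with the metric, the WER figure above can be checked by hand: it is the word-level edit distance divided by the number of reference words. The following is a minimal Python sketch of that calculation, not the evaluation script itself; the function names and the toy sentences are invented for illustration.

```python
def edit_distance(ref_words, test_words):
    """Word-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(test_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, test_word in enumerate(test_words, start=1):
            cost = 0 if ref_word == test_word else 1
            curr.append(min(prev[j] + 1,          # drop a reference word
                            curr[j - 1] + 1,      # add a test word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def wer(distance, n_ref):
    """Word error rate as a percentage of the reference length."""
    return 100.0 * distance / n_ref


# Toy sentences, invented for illustration (not taken from the test files).
toy_ref = "the village is big".split()
toy_test = "the village big".split()
toy_distance = edit_distance(toy_ref, toy_test)

# Counts taken from the report above: edit distance 50, 56 reference words.
reported_wer = round(wer(50, 56), 2)
print(toy_distance, reported_wer)
```

With the counts from the report (50 edits against 56 reference words) this reproduces the 89.29 % shown above.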

eng → fij evaluation

Current WER and PER:

Test file: 'eng-fij.tests.txt'
Reference file 'fij.tests.txt'

Statistics about input files
-------------------------------------------------------
Number of words in reference: 62
Number of words in test: 56
Number of unknown words (marked with a star) in test: 2
Percentage of unknown words: 3.57 %

Results when removing unknown-word marks (stars)
-------------------------------------------------------
Edit distance: 52
Word error rate (WER): 83.87 %
Number of position-independent correct words: 13
Position-independent word error rate (PER): 79.03 %

Results when unknown-word marks (stars) are not removed
-------------------------------------------------------
Edit distance: 52
Word Error Rate (WER): 83.87 %
Number of position-independent correct words: 13
Position-independent word error rate (PER): 79.03 %

Statistics about the translation of unknown words
-------------------------------------------------------
Number of unknown words which were free rides: 0
Percentage of unknown words that were free rides: 0.00 %
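
The PER figures in both reports can be reproduced from the word counts alone. The sketch below uses a formula that is an assumption: it was chosen because it matches the reported numbers in both directions, not because it was read from the evaluation script. The function names are hypothetical.

```python
from collections import Counter


def position_independent_correct(ref_words, test_words):
    """Count words shared by both sentences, ignoring word order."""
    shared = Counter(ref_words) & Counter(test_words)  # multiset intersection
    return sum(shared.values())


def per(n_ref, n_test, n_correct):
    """Position-independent error rate as a percentage.

    Assumed formula: surplus test words are subtracted from the
    correct count before dividing by the reference length.
    """
    surplus = max(0, n_test - n_ref)
    return 100.0 * (1 - (n_correct - surplus) / n_ref)


# Counts taken from the two reports on this page.
fij_eng_per = round(per(n_ref=56, n_test=59, n_correct=12), 2)
eng_fij_per = round(per(n_ref=62, n_test=56, n_correct=13), 2)
print(fij_eng_per, eng_fij_per)
```

Under this assumption the sketch gives 83.93 for fij → eng and 79.03 for eng → fij, the same values as in the reports.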

Lexical Selection

eng → fij one-to-many mapping

  • Case 1: Pelu and lo’i describe two different kinds of bending action.

(eng) bend → (fij) pelu (e.g. bend of metal)

(eng) bend → (fij) lo’i (e.g. bend at a joint)

  • Case 2:

(This is a disambiguation problem) The English third person singular pronoun it has the same form in the nominative and the accusative case, while Fijian uses a different form for subject and object.

(eng) it → (fij) e (subj)

(eng) it → (fij) koya (obj)
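
The two cases above both need context to pick the right Fijian word. The following is a rough Python sketch of that idea, not the rule format used by the translation system. The role labels, the JOINT_NOUNS cue list, and the function name are all hypothetical and only serve to illustrate the selection logic.

```python
# Hypothetical cue words: a body-part object suggests bending at a joint.
JOINT_NOUNS = {"arm", "knee", "elbow", "finger", "leg"}


def select_fijian(lemma, role=None, obj=None):
    """Pick a Fijian translation for an English lemma from its context.

    role: "subj" or "obj" (hypothetical labels for grammatical role)
    obj:  the English object of the verb, if known
    """
    if lemma == "it":
        # Case 2: subject form versus object form.
        return "koya" if role == "obj" else "e"
    if lemma == "bend":
        # Case 1: bending at a joint versus bending something like metal.
        return "lo’i" if obj in JOINT_NOUNS else "pelu"
    return lemma  # no rule for this word; leave it unchanged


print(select_fijian("it", role="subj"))
print(select_fijian("it", role="obj"))
print(select_fijian("bend", obj="knee"))
print(select_fijian("bend", obj="wire"))
```

When no context is available, the sketch falls back to e for it and pelu for bend; those defaults are arbitrary choices made for the example.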

fij → eng one-to-many mapping

  • Case 1:

(This is a disambiguation problem) The word levu can be used either as an adjective meaning "big" or as a verb meaning "be a lot".

(fij) levu → (eng) big (adj)

(fij) levu → (eng) be a lot (v)

  • Case 2:

(This is a disambiguation problem) Fijian pronouns do not distinguish gender.

(fij) koya → (eng) him

(fij) koya → (eng) her

(fij) koya → (eng) it
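
The levu mapping can be resolved by part of speech, while koya cannot be resolved from the Fijian word alone and needs a default. The Python sketch below illustrates the difference. The table layout, the part-of-speech labels, and the choice of it as the default are assumptions made for the example, not the system's actual dictionary.

```python
# One-to-many entries keyed by (Fijian lemma, part of speech).
TRANSLATIONS = {
    ("levu", "adj"): ["big"],
    ("levu", "v"): ["be a lot"],
    ("koya", "prn"): ["him", "her", "it"],
}

# Hypothetical default used when the source gives no further information.
DEFAULTS = {("koya", "prn"): "it"}


def translate_fijian(lemma, pos):
    """Return an English translation for a tagged Fijian word."""
    options = TRANSLATIONS.get((lemma, pos), [])
    if not options:
        return "*" + lemma  # unknown word, marked with a star
    if len(options) == 1:
        return options[0]   # the part of speech already disambiguates
    return DEFAULTS.get((lemma, pos), options[0])


print(translate_fijian("levu", "adj"))
print(translate_fijian("levu", "v"))
print(translate_fijian("koya", "prn"))
```

A fixed default will be wrong whenever the referent is a person, so a fuller solution would need information from the surrounding text.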

Contrastive Grammar

https://wikis.swarthmore.edu/ling073/Fijian_and_English/Contrastive_Grammar

Developed Resources for Machine Translation

https://github.swarthmore.edu/hwang11/ling073-fij-eng

https://github.swarthmore.edu/hwang11/ling073-fij-eng-corpus