Ladino and English
Note: Resources for machine translation between Ladino and English
Contents
External Resources
Developed Resources
Lad → Eng Evaluation
Sentence Analysis
Sentence 1
El mirava en el cielo y en la estrellería: He was looking at heaven and at the stars
^El<prn><pers><p3><m><sg><nom>/Prpers<prn><subj><p3><m><sg>$ ^mirar<v><iv><pii><p1><sg>/look<vblex><pii><p1><sg>$ ^en<pr>/at<pr>/on<pr>/in<pr>$ ^el<det><def><m><sg>/the<det><def><sp>$ ^cielo<n><m><sg>/heavens<n><sg>$ ^y<cnjcoo>/and<cnjcoo>$ ^en<pr>/at<pr>/on<pr>/in<pr>$ ^el<det><def><f><sg>/the<det><def><sp>$ ^estrellería<n><f><sg>/stars<n><sg>$^.<sent>/.<sent>$
He #look at the #heavens and at the #stars
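The analysis lines in this section appear to follow the Apertium stream format, in which each lexical unit is written as `^source/target1/target2$` and tags are wrapped in angle brackets. As a minimal sketch under that assumption, the following Python splits such a line into lemmas and tags. The helper names `parse_reading` and `parse_units` are hypothetical, and escaped slashes inside a unit are not handled.

```python
import re

# One lexical unit: everything between ^ and $ (non-greedy).
UNIT = re.compile(r"\^(.*?)\$")
# One tag: text between < and >.
TAG = re.compile(r"<([^<>]+)>")

def parse_reading(reading):
    """Split 'lemma<tag1><tag2>' into (lemma, [tag1, tag2])."""
    lemma = reading.split("<", 1)[0]
    return lemma, TAG.findall(reading)

def parse_units(line):
    """Return [(source_reading, [target_readings])] for one analysis line."""
    units = []
    for body in UNIT.findall(line):
        # The first field is the source side; the rest are translations.
        source, *targets = body.split("/")
        units.append((parse_reading(source), [parse_reading(t) for t in targets]))
    return units

sample = "^en<pr>/at<pr>/on<pr>/in<pr>$ ^cielo<n><m><sg>/heavens<n><sg>$"
parsed = parse_units(sample)
for source, targets in parsed:
    print(source, "->", targets)
```

Parsing the units this way makes it easier to see which source tags (for example `<pii>` or `<pret>`) reach the target side without being realised in the output.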
Sentence 2
No me mires: Don't look at me
^No<adv>/Not<adv>$ ^me<prn><pers><p1><sg><pro>/prpers<prn><obj><p1><mf><sg>$ ^mirar<v><iv><prs><p2><sg>/look<vblex><prs><p2><sg>$^.<sent>/.<sent>$
Not me #look
Sentence 3
Yo mirí en el korason de la estrellería: I looked at the heart of the stars
^Yo<prn><pers><p1><sg><nom>/Prpers<prn><subj><p1><mf><sg>$ ^mirar<v><iv><pret><p1><sg>/look<vblex><pret><p1><sg>$ ^en<pr>/at<pr>/on<pr>/in<pr>$ ^el<det><def><m><sg>/the<det><def><sp>$ ^korason<n><m><sg>/heart<n><sg>$ ^de<pr>/of<pr>/from<pr>$ ^*la/*la$ ^estrellería<n><f><sg>/stars<n><sg>$^.<sent>/.<sent>$
I #look at the heart of the #stars
Sentence 4
Eyas no somportaría la dolor: They (fem.) would not bear the pain
^Eyas<prn><pers><p3><f><pl><nom>/Prpers<prn><subj><p3><f><pl>$ ^no<adv>/not<adv>$ ^somportar<v><tv><cni><p1><sg>/bear<vblex><cni><p1><sg>$ ^*la/*la$ ^dolor<n><f><sg>/pain<n><sg>$^.<sent>/.<sent>$
They not #bear the pain
Sentence 5
Eyos kantan: They (masc.) sing
^Eyos<prn><pers><p3><m><pl><nom>/Prpers<prn><subj><p3><m><pl>$ ^kantar<v><iv><pres><p3><pl>/sing<vblex><pres><p3><pl>$^.<sent>/.<sent>$
They #sing
Sentence 6
Eya kantó: She sang
^Eya<prn><pers><p3><f><sg><nom>/Prpers<prn><subj><p3><f><sg>$ ^kantar<v><iv><pret><p3><sg>/sing<vblex><pret><p3><sg>$^.<sent>/.<sent>$
She #sing
Sentence 7
Yo bivire en Yisrael: I will live in Israel
^Yo<prn><pers><p1><sg><nom>/Prpers<prn><subj><p1><mf><sg>$ ^bivir<v><iv><fut><p1><sg>/live<vblex><fut><p1><sg>$ ^en<pr>/at<pr>/on<pr>/in<pr>$ ^Yisrael<np>/Israel<np>$^.<sent>/.<sent>$
I #live at #Israel
Sentence 8
Nozotros komeriamos en la kavané: We would eat in the coffeehouse
^Nozotros<prn><pers><p1><m><pl><nom>/Prpers<prn><subj><p1><mf><pl>$ ^komer<v><tv><cni><p1><pl>/eat<vblex><cni><p1><pl>$ ^en<pr>/at<pr>/on<pr>/in<pr>$ ^el<det><def><f><sg>/the<det><def><sp>$ ^kavané<n><f><sg>/coffeehouse<n><sg>$^.<sent>/.<sent>$
We #eat at the #coffeehouse
Sentence 9
Tu biviras kuatro mezes: You will live four months
^Tu<prn><pers><p2><sg><nom>/Prpers<prn><subj><p2><mf><sg>$ ^bivir<v><iv><fut><p2><sg>/live<vblex><fut><p2><sg>$ ^kuatro<num>/four<num><pl>$ ^mes<n><m><pl>/month<n><pl>$^.<sent>/.<sent>$
You #live four months
Sentence 10
Eyos no komieron el limón: They did not eat the lemon
^Eyos<prn><pers><p3><m><pl><nom>/Prpers<prn><subj><p3><m><pl>$ ^no<adv>/not<adv>$ ^komer<v><tv><pret><p3><pl>/eat<vblex><pret><p3><pl>$ ^el<det><def><m><sg>/the<det><def><sp>$ ^limón<n><m><sg>/lemon<n><sg>$^.<sent>/.<sent>$
They not #eat the lemon.
Initial Overall Analysis
The coverage of the monolingual transducer on lad.sentences.txt (which contains more than the 10 sentences listed above) is ~0.39583, or about 39.6%. The coverage of the bilingual transducer on the same file is ~0.33568, or about 33.6%. Further adaptation is done here.
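As a rough illustration of how a coverage figure like these can be obtained, the sketch below counts lexical units in stream-format output and treats a unit as unknown when any of its readings starts with `*`, as in `^*la/*la$` above. This is an assumption about the counting method, not the script actually used here; the function name `coverage` is hypothetical, and punctuation is counted as a known unit.

```python
import re

# One lexical unit: everything between ^ and $ (non-greedy).
UNIT = re.compile(r"\^(.*?)\$")

def coverage(stream):
    """Fraction of lexical units that have no reading starting with '*'."""
    units = UNIT.findall(stream)
    if not units:
        return 0.0
    known = sum(
        1 for unit in units
        if not any(reading.startswith("*") for reading in unit.split("/"))
    )
    return known / len(units)

# Fragment of Sentence 3: four units, one of them unknown.
sample = ("^de<pr>/of<pr>/from<pr>$ ^*la/*la$ "
          "^estrellería<n><f><sg>/stars<n><sg>$^.<sent>/.<sent>$")
print(coverage(sample))  # 0.75
```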
Final Evaluation
Additions
I added four rules to the verb morphology and added adjective inflection. I added four rules to the morphological disambiguator, which distinguish the verb readings from the preposition readings of komo, de, and para, and the adjective reading of querido from its verb and noun readings. I also added two new transfer rules to the lad-eng.rtx file: one makes 'te dio' come out as 'gave you', and the other makes 'la Espanya' come out as 'Spain'.
Precision and Recall
Totals: 162 forms, 182 tp, 9 fp, 0 tn, 162 fn
Precision: 95.28796%
Recall: 52.90698%
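The two percentages follow from the totals using the standard definitions, precision = tp / (tp + fp) and recall = tp / (tp + fn). A minimal check in Python (the function name `precision_recall` is only illustrative):

```python
def precision_recall(tp, fp, fn):
    """Standard precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Counts taken from the Totals line above.
p, r = precision_recall(tp=182, fp=9, fn=162)
print(f"Precision: {p:.5%}")  # Precision: 95.28796%
print(f"Recall: {r:.5%}")     # Recall: 52.90698%
```

The high precision combined with recall near 53% suggests that the analyses the transducer does produce are mostly right, while a large share of the expected analyses is still missing.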
Monolingual Transducer Coverage
Coverage over lad.corpus.large.txt: 289488 / 649936 (~0.44541)
remaining unknown forms: 360448
649936 words in the corpus
393 stems, including 22 punctuation.
MT Coverage
Word error rate (WER) on lad.longer.txt: 86.55 %
Position-independent word error rate (PER) on lad.longer.txt: 76.82 %
Number of position-independent correct words: 150
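For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. WER is the word-level edit distance divided by the reference length; PER ignores word order and compares the two sentences as bags of words. The evaluation script behind the figures above is not named on this page, so its exact definitions may differ; the functions `wer` and `per` are illustrative only.

```python
from collections import Counter

def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

def per(reference, hypothesis):
    """Position-independent error rate based on bag-of-words overlap."""
    ref, hyp = reference.split(), hypothesis.split()
    matches = sum((Counter(ref) & Counter(hyp)).values())
    errors = max(len(ref), len(hyp)) - matches
    return errors / len(ref)

# Sentence 2: reference translation versus the system output.
reference = "Don't look at me"
hypothesis = "Not me #look"
print(wer(reference, hypothesis))  # 1.0
print(per(reference, hypothesis))  # 0.75
```

In this example "me" is correct but misplaced, so it counts as an error for WER and as a match for PER, which is why PER is the lower of the two, as in the figures above.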