Magahi/Disambiguation
From LING073
Revision as of 21:26, 4 April 2021
Initial Evaluation of Ambiguity
./disambiguation-test.sh
Ambiguity before disambiguation: ~1.11354466858789625360
Ambiguity after disambiguation: ~1.04495677233429394813
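The contents of disambiguation-test.sh are not shown here. As a rough sketch of what such a figure could mean, the mean number of analyses per lexical unit can be computed from Apertium stream-format output. The helper name `mean_ambiguity` is hypothetical, and the sketch assumes that surface forms contain no escaped slashes.

```shell
# Hypothetical helper (not the course script): mean analyses per lexical unit.
# Reads Apertium stream format (^surface/analysis1/analysis2$ ...) on stdin.
mean_ambiguity() {
  # Put each ^...$ lexical unit on its own line.
  grep -o '\^[^$]*\$' |
    LC_ALL=C awk -F'/' '
      # Field 1 is the surface form; every further field is one analysis.
      { units += 1; analyses += NF - 1 }
      END { if (units > 0) printf "%.4f\n", analyses / units }
    '
}
```

It could be run as `cat corpus.txt | lt-proc /path/to/mag.automorf.bin | mean_ambiguity`. Whether the result matches the script's figures depends on how the script counts unknown words and punctuation.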
Corpus
Forms in corpus with more than one analysis
cat corpus.txt | lt-proc /path/to/mag.automorf.bin | sed 's/$\W*\^/$\n^/g' | grep '\/.*\/'
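The pipeline above puts each lexical unit on its own line and keeps those containing at least two slashes, that is, a surface form followed by two or more analyses. The sed expression relies on `\W` and on `\n` in the replacement, which may not be available in every sed. A possible variant without them is sketched below; the function name `ambiguous_units` is an assumption made for illustration.

```shell
# Hypothetical helper: print only lexical units with more than one analysis.
# Reads Apertium stream format on stdin.
ambiguous_units() {
  # Extract each ^...$ unit onto its own line, then keep units
  # with at least two slashes (surface form plus two or more analyses).
  grep -o '\^[^$]*\$' | grep '/.*/'
}
```

It would replace the sed and grep stages, for example `cat corpus.txt | lt-proc /path/to/mag.automorf.bin | ambiguous_units`.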
Transducer
Number of all forms: 4293
hfst-expand mag.automorf.hfst | wc -l
Number of unique forms: 2985
hfst-expand mag.automorf.hfst | cut -f1 -d':' | sort -u | wc -l
Ambiguity of about 1.44 (4293/2985).
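The two counts and their ratio can also be obtained in one pass. The following is a minimal sketch: the function name `transducer_ambiguity` is hypothetical, and it assumes, like the commands above, that no surface form contains a literal colon.

```shell
# Hypothetical helper: reads "form:analysis" pairs (one per line) on stdin
# and prints "<all forms> <unique forms> <ratio>".
transducer_ambiguity() {
  # Keep the surface form, then count how often each form occurs.
  cut -f1 -d':' | sort | uniq -c |
    LC_ALL=C awk '
      { total += $1; unique += 1 }
      END { if (unique > 0) printf "%d %d %.2f\n", total, unique, total / unique }
    '
}
```

It could be run as `hfst-expand mag.automorf.hfst | transducer_ambiguity`. With the counts reported here, 4293 divided by 2985 is roughly 1.438, which is 1.44 when rounded to two decimal places and 1.43 when truncated.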
Multiple Analyses:
hfst-expand mag.automorf.hfst | cut -f1 -d':' | sort > /tmp/totalforms
hfst-expand mag.automorf.hfst | cut -f1 -d':' | sort -u > /tmp/uniqforms
diff /tmp/totalforms /tmp/uniqforms
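The diff above shows the extra copies of each form that has more than one analysis. An alternative that avoids temporary files is `uniq -d`, which prints each repeated line once. This is a sketch, with the function name `multi_analysis_forms` introduced only for illustration.

```shell
# Hypothetical helper: list each surface form that has more than one analysis.
# Reads "form:analysis" pairs (one per line) on stdin.
multi_analysis_forms() {
  # uniq -d prints one copy of every line that occurs more than once
  # in sorted input.
  cut -f1 -d':' | sort | uniq -d
}
```

It could be run as `hfst-expand mag.automorf.hfst | multi_analysis_forms`.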
Sentences
^i/<prn><dem>/<prn><dem><att>/<prn><pers><prx><p3><sg>$ ^laṛikā/laṛikā<n>/laṛikā<n><obl>/laṛikā<n><obl>$ ^likho/likh<v><pres><s_p3>/likh<v><pres><o_p2><hi>$^./.<sent>$
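To see how ambiguous each token of a sentence like this is, the analyses per lexical unit can be counted. The sketch below uses a hypothetical function name, `analyses_per_unit`, and assumes that surface forms contain no escaped slashes.

```shell
# Hypothetical helper: print each surface form and its number of analyses,
# separated by a tab. Reads Apertium stream format on stdin.
analyses_per_unit() {
  grep -o '\^[^$]*\$' |
    LC_ALL=C awk -F'/' '
      # substr($1, 2) drops the leading "^" from the surface form;
      # the remaining fields are the analyses.
      { printf "%s\t%d\n", substr($1, 2), NF - 1 }
    '
}
```

Applied to the sentence above, this counts three analyses for i, three for laṛikā (two of them identical), two for likho, and one for the full stop.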