Difference between revisions of "Magahi/Disambiguation"

From LING073

Revision as of 20:18, 4 April 2021

Initial Evaluation of Ambiguity

./disambiguation-test.sh

Ambiguity before disambiguation: ~1.11354466858789625360
Ambiguity after disambiguation: ~1.04495677233429394813

Corpus

Forms in corpus with more than one analysis

lt-proc /path/to/mag.automorf.bin < corpus.txt | sed 's/\$\W*\^/$\n^/g' | grep '/.*/'
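
A per-token ambiguity figure like the ones reported under Initial Evaluation of Ambiguity can be approximated from the same one-cohort-per-line stream by counting analyses per cohort. The following is only a sketch: `mean_analyses` is a hypothetical helper, not part of `disambiguation-test.sh`, and it assumes that every cohort sits on its own line and that no surface form or analysis contains a literal `/`.

```shell
# Hypothetical helper: mean number of analyses per cohort.
# Reads cohorts such as ^form/analysis1/analysis2$ from stdin, one per line.
mean_analyses() {
    awk -F'/' '
        /^\^/ { cohorts++; analyses += NF - 1 }   # NF - 1 = number of analyses
        END   { if (cohorts > 0) printf "%.4f\n", analyses / cohorts }
    '
}

# Example with two made-up cohorts: one with two analyses, one with a single analysis.
printf '%s\n' \
    '^likho/likh<v><pres><s_p3>/likh<v><pres><o_p2><hi>$' \
    '^./.<sent>$' | mean_analyses
```

In practice the input would be the output of the `lt-proc | sed` pipeline above, without the final `grep`, so that unambiguous cohorts are counted too.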

Transducer

Number of all forms: 4293

hfst-expand mag.automorf.hfst | wc -l

Number of unique forms: 2985

hfst-expand mag.automorf.hfst | cut -f1 -d':' | sort -u | wc -l

Ambiguity of approximately 1.44 analyses per unique form (4293/2985).
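
The ratio can be recomputed directly from the two counts. A minimal sketch, with the counts hard-coded from the figures above (in practice they would come from the two `hfst-expand` pipelines); the variable names are illustrative:

```shell
# Counts as reported above.
total=4293    # all form:analysis pairs
unique=2985   # distinct surface forms

# awk handles the floating-point division; print the quotient to two decimals.
ratio=$(awk -v t="$total" -v u="$unique" 'BEGIN { printf "%.2f", t / u }')
echo "$ratio"
```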

Multiple Analyses:

hfst-expand mag.automorf.hfst | cut -f1 -d':' | sort > /tmp/totalforms
hfst-expand mag.automorf.hfst | cut -f1 -d':' | sort -u > /tmp/uniqforms
diff /tmp/totalforms /tmp/uniqforms
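
The `diff` above lists each surplus occurrence of a form. As an alternative, the forms can be counted in one pass without temporary files. This is a sketch only: `ambiguous_forms` is a hypothetical helper, and it assumes `form:analysis` lines on stdin (the shape the `cut -f1 -d':'` calls above rely on) and surface forms without whitespace.

```shell
# Hypothetical helper: list surface forms that have more than one analysis,
# each preceded by its number of analyses (tab-separated).
ambiguous_forms() {
    cut -f1 -d':' | sort | uniq -c | awk '$1 > 1 { print $1 "\t" $2 }'
}

# Example with made-up transducer output: likho has two analyses, "." has one.
printf '%s\n' \
    'likho:likh<v><pres><s_p3>' \
    'likho:likh<v><pres><o_p2><hi>' \
    '.:.<sent>' | ambiguous_forms
```

With the real transducer the helper would be fed by `hfst-expand mag.automorf.hfst | ambiguous_forms`.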

Sentences

  • ^i/<prn><dem>/<prn><dem><att>/<prn><pers><prx><p3><sg>$ ^laṛikā/laṛikā<n>/laṛikā<n><obl>/laṛikā<n><obl>$ ^likho/likh<v><pres><s_p3>/likh<v><pres><o_p2><hi>$^./.<sent>$