Tongan/Disambiguation
Initial Evaluation of Ambiguity
number of tokens: 1698
number of analyses: 3768
ambiguity: 2.219
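The figures above are consistent with ambiguity being the average number of analyses per token, i.e. the number of analyses divided by the number of tokens. A minimal sketch of that arithmetic follows; the function name `ambiguity` is only illustrative and is not part of any tool used on this page.

```python
def ambiguity(num_analyses: int, num_tokens: int) -> float:
    """Average number of analyses per token (assumed definition)."""
    if num_tokens <= 0:
        raise ValueError("token count must be positive")
    return num_analyses / num_tokens


# Figures reported in the initial evaluation.
initial = ambiguity(3768, 1698)
print(f"{initial:.3f}")  # prints 2.219
```

A value of 1.0 would mean that every token has exactly one analysis, so anything above that measures how much ambiguity remains.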
Example of Ambiguity
In Tongan, nouns are pluralized with plural noun markers (or plural classifiers, as I have identified them). There are several different plural markers, each used with different kinds of nouns. Because number is marked this way, the noun itself does not change form. The word <fale> (house) could therefore be analyzed as <fale><n><sg>, <fale><n><pl>, or <fale><n><du>. The only way to tell which reading is correct is to look at the neighboring words, specifically the word that comes before the noun. My ambiguity was very high because every noun received singular, plural, and dual analyses; nouns are now assigned the correct number based on the plural marker (or lack of one) that precedes them.
Originally, the sentence "ou kai puaka" (I eat pig) was analyzed like this:
<ou> <prn><prepd><p1><pres><exl><sg> <kai> <v> <puaka> <n><sg> <n><pl> <n><du>
After the disambiguation rules are applied, the <n><pl> and <n><du> readings are removed, and only the <n><sg> reading remains.
Disambiguation
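SET PL = (n sg) OR (n du) ; # Readings to remove when the noun is plural: noun sg or noun dual
SET DU = (n sg) OR (n pl) ; # Readings to remove when the noun is dual: noun sg or noun plural
SET SG = (n du) OR (n pl) ; # Readings to remove when the noun is singular: noun dual or noun plural

SECTION

# If there is a plural noun marker before a noun, remove the (n sg) and (n du) tags.
REMOVE $$PL IF (-1 (cl pl)) ;

# If there is a dual noun marker before a noun, remove the (n sg) and (n pl) tags.
REMOVE $$DU IF (-1 (cl du)) ;

# If there is neither a plural nor a dual marker before a noun, remove the (n du) and (n pl) tags.
REMOVE $$SG IF (not -1 (cl pl) or (cl du)) ;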
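To make the effect of the three rules above easier to follow, here is a small Python sketch that imitates them on a list of tokens. It is not how CG-3 works internally: the `Reading` and `Cohort` types and the `has_reading` and `disambiguate` functions are hypothetical names introduced only for illustration, and `PL_MARKER` is a placeholder surface form standing in for any word tagged (cl pl).

```python
from typing import List, Optional, Tuple

Reading = Tuple[str, ...]            # e.g. ("n", "sg")
Cohort = Tuple[str, List[Reading]]   # (surface form, candidate readings)


def has_reading(cohort: Optional[Cohort], tags: Reading) -> bool:
    """True if any reading in the cohort carries all of the given tags."""
    if cohort is None:
        return False
    return any(set(tags) <= set(reading) for reading in cohort[1])


def disambiguate(sentence: List[Cohort]) -> List[Cohort]:
    """Imitate the three REMOVE rules by looking at the preceding word."""
    result = []
    for i, (surface, readings) in enumerate(sentence):
        prev = sentence[i - 1] if i > 0 else None
        if has_reading(prev, ("cl", "pl")):
            unwanted = [("n", "sg"), ("n", "du")]   # plural marker: keep n pl
        elif has_reading(prev, ("cl", "du")):
            unwanted = [("n", "sg"), ("n", "pl")]   # dual marker: keep n du
        else:
            unwanted = [("n", "du"), ("n", "pl")]   # no marker: keep n sg
        kept = [r for r in readings
                if not any(set(u) <= set(r) for u in unwanted)]
        # Never delete the last remaining reading of a word.
        result.append((surface, kept or readings))
    return result


sentence = [
    ("ou", [("prn", "prepd", "p1", "pres", "exl", "sg")]),
    ("kai", [("v",)]),
    ("puaka", [("n", "sg"), ("n", "pl"), ("n", "du")]),
]
print(disambiguate(sentence)[2])  # ('puaka', [('n', 'sg')])

marked = [
    ("PL_MARKER", [("cl", "pl")]),   # placeholder for a plural marker
    ("fale", [("n", "sg"), ("n", "pl"), ("n", "du")]),
]
print(disambiguate(marked)[1])  # ('fale', [('n', 'pl')])
```

The sketch only inspects the immediately preceding word, mirroring the -1 position used in the rules, so a marker separated from its noun by another word would not be picked up.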
Evaluation of Ambiguity after Disambiguation
Because the disambiguation rules affect every noun, the overall ambiguity of my corpus dropped substantially, from 2.219 to 1.121.
number of analyses: 1903
ambiguity: 1.121