Tongan/Disambiguation

From LING073

Initial Evaluation of Ambiguity

number of tokens: 1698

number of analyses: 3768

ambiguity: 2.219
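
The ambiguity figure is the number of analyses divided by the number of tokens. A minimal Python sketch of that calculation, using the corpus totals reported above, might look like this. The function name `ambiguity` is my own illustrative choice and is not part of the analyzer's tooling.

```python
def ambiguity(num_analyses, num_tokens):
    """Average number of analyses per token."""
    if num_tokens == 0:
        raise ValueError("corpus has no tokens")
    return num_analyses / num_tokens


# Corpus totals from the initial evaluation
print(round(ambiguity(3768, 1698), 3))  # prints 2.219
```

A value of 1.0 would mean every token has exactly one analysis, so the further the figure is above 1.0, the more disambiguation work remains.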

Example of Ambiguity

In Tongan, nouns are pluralized with plural noun markers (or plural classifiers, as I have tagged them). There are several plural markers, each used with different kinds of items. Because number is marked this way, the noun itself does not change form. The word <fale> (house) could therefore be analyzed as <fale><n><sg>, <fale><n><pl>, or <fale><n><du>. The only indication of which analysis is correct comes from the neighboring words, specifically the word that comes before the noun. My initial ambiguity was very high because every noun received singular, plural, and dual analyses. Nouns are now assigned the correct number based on the plural or dual marker (or the lack of one) that precedes them.

Originally, the analyzer produced the following output for the sentence "ou kai puaka" (I eat pig):

<ou>
      <prn><prepd><p1><pres><exl><sg>
<kai>
      <v>
<puaka>
      <n><sg>
      <n><pl>
      <n><du>

After the disambiguation rules are applied, the <n><pl> and <n><du> readings are removed and only the <n><sg> reading remains.

Disambiguation

SET PL = (n sg) OR (n du) ; # Readings to remove when a plural marker precedes: noun sg or noun dual
SET DU = (n sg) OR (n pl) ; # Readings to remove when a dual marker precedes: noun sg or noun plural
SET SG = (n du) OR (n pl) ; # Readings to remove when no marker precedes: noun dual or noun plural

SECTION

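# If a plural noun marker precedes a noun, remove the (n sg) and (n du) readings.
REMOVE $$PL IF (-1 (cl pl)) ;

# If a dual noun marker precedes a noun, remove the (n sg) and (n pl) readings.
REMOVE $$DU IF (-1 (cl du)) ;

# If neither a plural nor a dual marker precedes a noun, remove the (n du) and (n pl) readings.
REMOVE $$SG IF (NOT -1 (cl pl) OR (cl du)) ;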
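
To make the logic of the three rules easier to follow, here is a rough Python simulation of what they do to a sentence. It is only a sketch of my understanding of the rules, not how the constraint grammar engine is implemented. The data layout and function names are hypothetical, and the marker forms "ngaahi" with the reading (cl pl) in the example are assumptions made for illustration.

```python
# Readings are tuples of tags; a cohort is (word form, list of readings).
PL = [("n", "sg"), ("n", "du")]  # removed after a plural marker
DU = [("n", "sg"), ("n", "pl")]  # removed after a dual marker
SG = [("n", "du"), ("n", "pl")]  # removed when no marker precedes


def has_tags(reading, tags):
    """True if the reading contains every tag in `tags`."""
    return all(tag in reading for tag in tags)


def prev_matches(cohorts, i, tags):
    """True if a cohort exists at position i-1 and one of its readings has `tags`."""
    if i == 0:
        return False
    return any(has_tags(r, tags) for r in cohorts[i - 1][1])


def remove(readings, targets):
    """Drop readings matching any target; leave the cohort alone if that would empty it."""
    kept = [r for r in readings if not any(has_tags(r, t) for t in targets)]
    return kept if kept else readings


def disambiguate(cohorts):
    """Apply the three REMOVE rules, in order, to every cohort."""
    result = []
    for i, (form, readings) in enumerate(cohorts):
        pl_before = prev_matches(cohorts, i, ("cl", "pl"))
        du_before = prev_matches(cohorts, i, ("cl", "du"))
        if pl_before:
            readings = remove(readings, PL)
        if du_before:
            readings = remove(readings, DU)
        if not (pl_before or du_before):
            readings = remove(readings, SG)
        result.append((form, readings))
    return result


noun = [("n", "sg"), ("n", "pl"), ("n", "du")]
sentence = [
    ("ou", [("prn", "prepd", "p1", "pres", "exl", "sg")]),
    ("kai", [("v",)]),
    ("puaka", list(noun)),
]
print(disambiguate(sentence)[2])  # ('puaka', [('n', 'sg')])

# Assumed example: a plural marker tagged (cl pl) before a noun.
marked = [("ngaahi", [("cl", "pl")]), ("fale", list(noun))]
print(disambiguate(marked)[1])  # ('fale', [('n', 'pl')])
```

Non-noun words pass through unchanged because none of their readings contain the noun tags that the rules target.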


Evaluation of Ambiguity after Disambiguation

Because the disambiguation rules affect every noun, the overall ambiguity of my corpus decreased substantially, from 2.219 to 1.121 analyses per token.

number of analyses: 1903

ambiguity: 1.121