Difference between revisions of "Waray/Disambiguation"

From LING073
(Initial Level(s) of Ambiguity)
(Morphological Disambiguation)
 
(27 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
==Morphological Disambiguation==
 
==Morphological Disambiguation==
===Initial Level(s) of Ambiguity===
+
===GitHub Repository===
Ambiguity before disambiguation: ~1.05008077544426494346
+
* [https://github.swarthmore.edu/Ling073-sp21/ling073-war/blob/master/apertium-war.war.rlx Link to <code>.rlx</code> file]
 +
* [https://github.swarthmore.edu/Ling073-sp21/ling073-war Link to parent repository]
 +
 
 +
===Initial Evaluation of Ambiguity===
 +
Level of ambiguity before disambiguation: ~1.05008077544426494346
  
 
===Differentiating a Pronoun from a Determiner===
 
===Differentiating a Pronoun from a Determiner===
There are two readings of 'nira': while common nouns usually proceed the word 'nira', the following words can help determine whether or not the word is a pronoun or a determiner. In the determiner case, 'nira' is usually followed by at least two—usually proper—nouns (a compound subject).  
+
There are two readings of 'hira' ('they') and its Class II possessive variant 'nira' ('their'): pronoun and determiner. While a common noun usually precedes the word, it is the words that follow that help determine whether it is a pronoun or a determiner. In the determiner case, the word is usually followed by at least two nouns, usually proper nouns (a compound subject).
 +
 
 
====Pronoun Reading====
 
====Pronoun Reading====
* Kwarta nira [Their money]
+
; Kwarta nira [''Their money''] *
 +
: Before Disambiguation:
 +
::* ^Kwarta/kwarta<n>$ '''^nira/hira<prn><pers><p3><pl><pos>/ni<det><pos>+ra<det><pl>$'''^./.<sent>$
 +
: After Disambiguation:
 +
::* ^Kwarta<n>$ ^'''hira<prn><pers><p3><pl><pos>$'''^.<sent>$
 +
 
 +
; Ngan amo ini an gintikangan pagbuhat nira [''They have begun to do this'']
 +
: Before Disambiguation:
 +
::* ^Ngan/ngan<det><pl>$ ^amo/*amo$ ^ini/ini<prn><det><dem>$ ^an/an<det><nom>$ ^gintikangan/*gintikangan$ ^pagbuhat/*pagbuhat$ '''^nira/hira<prn><pers><p3><pl><pos>/ni<det><pos>+ra<det><pl>$'''^./.<sent>$
 +
: After Disambiguation:
 +
::* ^Ngan<det><pl>$ ^*amo$ ^ini<prn><det><dem>$ ^an<det><nom>$ ^*gintikangan$ ^*pagbuhat$ '''^hira<prn><pers><p3><pl><pos>$'''^.<sent>$
  
 
====Determiner Reading====
 
====Determiner Reading====
* Kwarta nira Rico ngan Bobong [Rico and Bobong's money OR Money of Rico and Bobong]
+
; Kwarta nira Rico ngan Bobong [''Rico and Bobong's money OR Money of Rico and Bobong''] *
* Mgbantay pa hiya ha balay nira ni Rhoda ngan Romulo. [He will still guard Rhoda and Romulo’s house.]
+
: Before Disambiguation:
 +
::* ^Kwarta/kwarta<n>$ '''^nira/hira<prn><pers><p3><pl><pos>/ni<det><pos>+ra<det><pl>$''' ^Rico/Rico<np><ant><m>$ ^ngan/ngan<det><pl>$ ^Bobong/Bobong<np><ant><m>$^./.<sent>$
 +
: After Disambiguation:
 +
::* ^Kwarta<n>$ '''^ni<det><pos>+ra<det><pl>$''' ^Rico<np><ant><m>$ ^ngan<det><pl>$ ^Bobong<np><ant><m>$^.<sent>$
 +
 
 +
; Mgbantay pa hiya ha balay nira ni Rhoda ngan Romulo [''He will still guard Rhoda and Romulo’s house'']
 +
: Before Disambiguation:
 +
::* ^Mgbantay/*Mgbantay$ ^pa/pa<adv>$ ^hiya/hiya<prn><pers><p3><sg><nom>$ ^ha/ha<det><obl>$ ^balay/balay<n>$ '''^nira/hira<prn><pers><p3><pl><pos>/ni<det><pos>+ra<det><pl>$''' ^ni/ni<det><pos>$ ^Rhoda/Rhoda<np><ant><f>$ ^ngan/ngan<det><pl>$ ^Romulo/Romulo<np><ant><man>$^./.<sent>$
 +
: After Disambiguation:
 +
::* ^*Mgbantay$ ^pa<adv>$ ^hiya<prn><pers><p3><sg><nom>$ ^ha<det><obl>$ ^balay<n>$ '''^ni<det><pos>+ra<det><pl>$''' ^ni<det><pos>$ ^Rhoda<np><ant><f>$ ^ngan<det><pl>$ ^Romulo<np><ant><man>$^.<sent>$
 +
 
 +
''* Not in corpus''
 +
 
 +
===Added Rules===
 +
# If word precedes at least one proper noun, select determiner reading
 +
#: SELECT Determiner IF (0 (prn)) (1 (np)) ;
 +
# If word precedes a determiner and at least one proper noun, select determiner reading
 +
#:SELECT Determiner IF (0 (prn)) (1 (det)) (2 (np)) ;
 +
# If word is followed by the end of the sentence, remove determiner reading (pronoun case)
 +
#: REMOVE Determiner IF (0 (prn)) (1 EOS) ;
 +
 
 +
===Final Evaluation of Ambiguity===
 +
''(As of Apr 27, 2021)''
 +
Level of ambiguity after disambiguation: ~1.05008077544426494346 **
 +
 
 +
* Total number of forms: 136
 +
* Number of unique forms: 133
 +
 
 +
''** The ambiguity level is unchanged because the lexicon does not yet contain enough context for the rules to disambiguate properly''
 +
 
  
 
[[Category:Sp21_Disambiguation]] [[Category:Waray]]
 
[[Category:Sp21_Disambiguation]] [[Category:Waray]]

Latest revision as of 15:30, 29 April 2021

Morphological Disambiguation

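GitHub Repository

  • Link to .rlx file: https://github.swarthmore.edu/Ling073-sp21/ling073-war/blob/master/apertium-war.war.rlx
  • Link to parent repository: https://github.swarthmore.edu/Ling073-sp21/ling073-war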

Initial Evaluation of Ambiguity

Level of ambiguity before disambiguation: ~1.05008077544426494346
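
The page does not say how this figure was computed. A common way to measure ambiguity is the mean number of analyses per form, and the sketch below assumes that definition. The function names are hypothetical, and the parser is deliberately naive: it assumes no escaped `/`, `^` or `$` characters in the Apertium stream, and it counts an unknown word (`*amo`) as one analysis.

```python
import re

def count_analyses(stream: str) -> tuple[int, int]:
    """Return (number of forms, total number of analyses) in an Apertium stream."""
    # Each lexical unit looks like ^surface/analysis1/analysis2$
    units = re.findall(r"\^(.*?)\$", stream)
    analyses = 0
    for unit in units:
        _surface, *readings = unit.split("/")
        analyses += len(readings)
    return len(units), analyses

def ambiguity(stream: str) -> float:
    """Mean number of analyses per form (assumed definition of ambiguity)."""
    forms, analyses = count_analyses(stream)
    return analyses / forms if forms else 0.0

example = ("^Kwarta/kwarta<n>$ "
           "^nira/hira<prn><pers><p3><pl><pos>/ni<det><pos>+ra<det><pl>$"
           "^./.<sent>$")
print(count_analyses(example))  # 3 forms, 4 analyses
print(ambiguity(example))
```

Under this definition a value of 1.0 would mean every form has exactly one analysis, so a value just above 1 indicates that only a few forms are ambiguous.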

Differentiating a Pronoun from a Determiner

There are two readings of 'hira' ('they') and its Class II possessive variant 'nira' ('their'): pronoun and determiner. While a common noun usually precedes the word, it is the words that follow that help determine whether it is a pronoun or a determiner. In the determiner case, the word is usually followed by at least two nouns, usually proper nouns (a compound subject).

Pronoun Reading

Kwarta nira [Their money] *
Before Disambiguation:
  • ^Kwarta/kwarta<n>$ ^nira/hira<prn><pers><p3><pl><pos>/ni<det><pos>+ra<det><pl>$^./.<sent>$
After Disambiguation:
  • ^Kwarta<n>$ ^hira<prn><pers><p3><pl><pos>$^.<sent>$
Ngan amo ini an gintikangan pagbuhat nira [They have begun to do this]
Before Disambiguation:
  • ^Ngan/ngan<det><pl>$ ^amo/*amo$ ^ini/ini<prn><det><dem>$ ^an/an<det><nom>$ ^gintikangan/*gintikangan$ ^pagbuhat/*pagbuhat$ ^nira/hira<prn><pers><p3><pl><pos>/ni<det><pos>+ra<det><pl>$^./.<sent>$
After Disambiguation:
  • ^Ngan<det><pl>$ ^*amo$ ^ini<prn><det><dem>$ ^an<det><nom>$ ^*gintikangan$ ^*pagbuhat$ ^hira<prn><pers><p3><pl><pos>$^.<sent>$

Determiner Reading

Kwarta nira Rico ngan Bobong [Rico and Bobong's money OR Money of Rico and Bobong] *
Before Disambiguation:
  • ^Kwarta/kwarta<n>$ ^nira/hira<prn><pers><p3><pl><pos>/ni<det><pos>+ra<det><pl>$ ^Rico/Rico<np><ant><m>$ ^ngan/ngan<det><pl>$ ^Bobong/Bobong<np><ant><m>$^./.<sent>$
After Disambiguation:
  • ^Kwarta<n>$ ^ni<det><pos>+ra<det><pl>$ ^Rico<np><ant><m>$ ^ngan<det><pl>$ ^Bobong<np><ant><m>$^.<sent>$
Mgbantay pa hiya ha balay nira ni Rhoda ngan Romulo [He will still guard Rhoda and Romulo’s house]
Before Disambiguation:
  • ^Mgbantay/*Mgbantay$ ^pa/pa<adv>$ ^hiya/hiya<prn><pers><p3><sg><nom>$ ^ha/ha<det><obl>$ ^balay/balay<n>$ ^nira/hira<prn><pers><p3><pl><pos>/ni<det><pos>+ra<det><pl>$ ^ni/ni<det><pos>$ ^Rhoda/Rhoda<np><ant><f>$ ^ngan/ngan<det><pl>$ ^Romulo/Romulo<np><ant><man>$^./.<sent>$
After Disambiguation:
  • ^*Mgbantay$ ^pa<adv>$ ^hiya<prn><pers><p3><sg><nom>$ ^ha<det><obl>$ ^balay<n>$ ^ni<det><pos>+ra<det><pl>$ ^ni<det><pos>$ ^Rhoda<np><ant><f>$ ^ngan<det><pl>$ ^Romulo<np><ant><man>$^.<sent>$

* Not in corpus

Added Rules

  1. If word precedes at least one proper noun, select determiner reading
    SELECT Determiner IF (0 (prn)) (1 (np)) ;
  2. If word precedes a determiner and at least one proper noun, select determiner reading
    SELECT Determiner IF (0 (prn)) (1 (det)) (2 (np)) ;
  3. If word is followed by the end of the sentence, remove determiner reading (pronoun case)
    REMOVE Determiner IF (0 (prn)) (1 EOS) ;
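
To make the intent of the three rules concrete, here is a rough Python simulation of their effect on the example sentences. It is not the constraint grammar engine and only approximates its behaviour. It rests on several assumptions: `Determiner` is taken to match any reading containing a `det` tag, `EOS` is taken to mean a following `<sent>` token or the end of the input, and context is always checked against the original, not yet disambiguated, readings. All function names are hypothetical.

```python
import re

def parse(stream):
    """Split an Apertium stream into (surface, [readings]) cohorts."""
    cohorts = []
    for unit in re.findall(r"\^(.*?)\$", stream):
        surface, *readings = unit.split("/")
        cohorts.append((surface, readings))
    return cohorts

def tags(reading):
    """Set of tags in a reading, e.g. 'ni<det><pos>' gives {'det', 'pos'}."""
    return set(re.findall(r"<([^>]+)>", reading))

def has(cohorts, i, tag):
    """True if position i exists and any of its readings carries the tag."""
    return 0 <= i < len(cohorts) and any(tag in tags(r) for r in cohorts[i][1])

def at_eos(cohorts, i):
    """Assumed meaning of EOS: past the last token, or a <sent> token."""
    return i >= len(cohorts) or has(cohorts, i, "sent")

def disambiguate(stream):
    cohorts = parse(stream)
    result = []
    for i, (surface, readings) in enumerate(cohorts):
        det = [r for r in readings if "det" in tags(r)]
        # Only act on ambiguous cohorts that have a pronoun reading: (0 (prn))
        if det and len(readings) > 1 and has(cohorts, i, "prn"):
            rule1 = has(cohorts, i + 1, "np")
            rule2 = has(cohorts, i + 1, "det") and has(cohorts, i + 2, "np")
            if rule1 or rule2:
                readings = det                      # SELECT Determiner
            elif at_eos(cohorts, i + 1):
                rest = [r for r in readings if r not in det]
                readings = rest or readings         # REMOVE Determiner
        result.append((surface, readings))
    return result

sentence = ("^Kwarta/kwarta<n>$ "
            "^nira/hira<prn><pers><p3><pl><pos>/ni<det><pos>+ra<det><pl>$ "
            "^Rico/Rico<np><ant><m>$ ^ngan/ngan<det><pl>$ "
            "^Bobong/Bobong<np><ant><m>$^./.<sent>$")
for surface, readings in disambiguate(sentence):
    print(surface, readings)
```

In this simulation 'nira' keeps only its determiner reading when a proper noun follows (directly or after a determiner), and only its pronoun reading when it ends the sentence, matching the before and after examples above.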

Final Evaluation of Ambiguity

(As of Apr 27, 2021) Level of ambiguity after disambiguation: ~1.05008077544426494346 **

  • Total number of forms: 136
  • Number of unique forms: 133

** The ambiguity level is unchanged because the lexicon does not yet contain enough context for the rules to disambiguate properly