Difference between revisions of "Tibetan/Transducer"

From LING073

Revision as of 08:30, 6 March 2018

This page describes the Tibetan morphological analyzer I created as part of this class. The code for this project is on GitHub.

Notes

The first form I implemented was the infinitive. In Tibetan, an infinitive is formed by adding པ (pa) or བ (ba) to the verbal root. For example, the verb ལོཀ (lok), meaning "read," becomes ལོཀ་པ (lok-pa), "to read." At first, I was confused about why this verb isn't transliterated as "loka-pa," since the character ཀ is transliterated as "ka" rather than "k." However, I learned that Tibetan words are divided into syllables, which are separated by a tsek, the small dot between ལོཀ and པ in ལོཀ་པ. The tsek serves almost as a space in Tibetan, since spaces are not used to separate words. Right now, the analyzer accepts some incorrect forms without giving an error. From my grammar page:

The suffix «pa» is added when the final letter of the root is any consonant except for «r» or «l». On the other hand, the suffix «wa» is used if the verb root ends in «r», «l» or any vowel.
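The rule quoted above can be sketched in a few lines of Python. This is only an illustration of the rule, not code from the analyzer: the function name `infinitive_suffix` is hypothetical, roots are assumed to be given in lowercase romanized transliteration, and the vowel set `aeiou` is an assumption about that transliteration.

```python
# Assumption: roots are lowercase romanized strings, and these five
# letters are the only vowels in the transliteration.
VOWELS = set("aeiou")


def infinitive_suffix(root: str) -> str:
    """Return the infinitive suffix the quoted rule selects for a root.

    «wa» follows a root ending in «r», «l» or a vowel;
    «pa» follows a root ending in any other consonant.
    """
    if not root:
        raise ValueError("root must not be empty")
    last = root[-1]
    if last in VOWELS or last in ("r", "l"):
        return "wa"
    return "pa"


# "lok" is the root used on this page; the other strings are made-up
# placeholders chosen only for their final letter, not real verbs.
for example in ("lok", "xar", "xal", "xa"):
    print(example, "->", infinitive_suffix(example))
```

Only the final letter of the root matters here, which is why placeholder strings are enough to exercise each branch of the rule.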

So right now, if either "pa" or "ba" ("ba" being the suffix written «wa» in the rule above) is added to the end of a verb root, the analyzer accepts both forms. This is undesirable, as only one form is correct for any given verb. I should be able to fix this problem in the "Rules" section of the twol file.
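To make the difference between the current and the intended behavior concrete, here is a rough Python sketch. It is not twol syntax and not part of the analyzer. The function names are hypothetical, forms are assumed to be romanized as `root-suffix` with a hyphen standing in for the tsek, and the suffix of the rule is written «wa» as on the grammar page.

```python
# Assumption: lowercase romanized input; these letters are the vowels.
VOWELS = set("aeiou")
SUFFIXES = ("pa", "wa")


def expected_suffix(root: str) -> str:
    """Suffix required by the rule: «wa» after r, l or a vowel, else «pa»."""
    last = root[-1]
    return "wa" if last in VOWELS or last in ("r", "l") else "pa"


def accepts_unconstrained(form: str) -> bool:
    """Mimic the current behavior: any root plus either suffix is accepted."""
    root, _, suffix = form.rpartition("-")
    return bool(root) and suffix in SUFFIXES


def accepts_constrained(form: str) -> bool:
    """Mimic the intended behavior: only the suffix the rule selects is accepted."""
    root, _, suffix = form.rpartition("-")
    return bool(root) and suffix == expected_suffix(root)


for form in ("lok-pa", "lok-wa"):
    print(form, accepts_unconstrained(form), accepts_constrained(form))
```

With these assumptions, both `lok-pa` and `lok-wa` pass the unconstrained check, while only `lok-pa` passes the constrained one, which is the effect the twol rules are meant to have.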