Tongan/Transducer
Notes
Non Passing Tests
- Verb tenses
  - Past
  - Present
  - Future
  - Imperative
- Pluralization
  - Plural Noun Markers
  - Irregular Noun Pluralization
- Adjectivizer Suffix
- Articles
These tests do not pass currently because I have not implemented their respective lexicon classes yet.
Evaluation
Initial aq-covtest
Coverage: 12.15%
Top unknown words in the corpus:
Word | Word Count |
---|---|
e | 76 |
he | 70 |
naʻe | 47 |
ʻae | 47 |
pea | 44 |
ʻi | 43 |
Pea | 33 |
ʻOtua | 32 |
p | 30 |
ʻe | 25 |
ki | 22 |
ʻoku | 19 |
hono | 19 |
ʻoe | 17 |
ʻo | 17 |
ngaahi | 17 |
kuo | 16 |
ai | 14 |
meʻa | 14 |
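A frequency list like the one above can be produced by counting every token the transducer fails to analyse. The following is only a minimal sketch of that idea, not how the coverage tool itself works: the function name `top_unknown`, the toy token list, and the `known` set are all hypothetical stand-ins for the real corpus and lexicon.

```python
from collections import Counter

def top_unknown(tokens, known, n=20):
    """Return the n most frequent tokens that are not in the known set."""
    counts = Counter(t for t in tokens if t not in known)
    return counts.most_common(n)

# Hypothetical toy data; the real corpus and lexicon are much larger.
tokens = ["e", "he", "e", "pea", "e", "he", "meʻa"]
known = {"meʻa"}

print(top_unknown(tokens, known, n=3))
# [('e', 3), ('he', 2), ('pea', 1)]
```

Matching here is exact, so capitalised and lower-case forms such as Pea and pea are counted separately, as they are in the table.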
Analysis of certain unknown words
e<adj><def>: e
he<adj><sdef>: he
ngaahi<cl><pl>: ngaahi
pea<n><sg>: pea
Current Evaluation
After I defined the noun pea<n>:pea (its English gloss can be bear, pear, or and), coverage increased from 12.15% to 17.29%.
The transducer currently contains 75 stems.
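To illustrate why adding one frequent word moves coverage so much, here is a small sketch that treats coverage as the share of corpus tokens found in the lexicon. This is a simplifying assumption about what the coverage figure measures, and the token list and lexicon below are hypothetical toy data, not my corpus.

```python
def coverage(tokens, known):
    """Percentage of tokens that appear in the known set (naive coverage)."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in known)
    return 100.0 * hits / len(tokens)

# Hypothetical toy corpus of 8 tokens.
tokens = ["pea", "naʻe", "vai", "pea", "he", "meʻa", "pea", "ki"]

before = {"meʻa"}          # lexicon before adding "pea"
after = before | {"pea"}   # lexicon after adding "pea"

print(coverage(tokens, before))  # 12.5
print(coverage(tokens, after))   # 50.0
```

Because the count is over tokens rather than distinct words, a single high-frequency entry raises the figure far more than a rare one.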
Current top unknown words
Now the top unknown words are as follows:
Word | Word Count |
---|---|
e | 76 |
he | 70 |
naʻe | 47 |
ʻae | 47 |
pea | 44 |
ʻi | 43 |
Pea | 33 |
ʻOtua | 32 |
p | 30 |
ʻe | 25 |
ki | 22 |
ʻoku | 19 |
hono | 19 |
ʻoe | 17 |
ʻo | 17 |
ngaahi | 17 |
kuo | 16 |
ai | 14 |
vai | 13 |
ʻaho | 12 |
YAML Test Files
YAML tests | Tests passing (passing tests/total tests) |
---|---|
ton.yaml | 75/146 |
commonwords.yaml | 1/20 |
Generator Evaluation
Initial evaluation of morphological generation
Tests | tests passing (passing tests/total tests) |
---|---|
ton.yaml | 75/146 (51%) |
Coverage | 17.72% |
Generational Tests | tests passing (passing tests/total tests) |
---|---|
ton.yaml | 75/147 (51%) |
Final evaluation of morphological generation
Notes: Tongan has little affixal morphology. There are a limited number of suffixes and prefixes, and they surface only in certain words. Instead, Tongan relies heavily on "markers", such as tense markers (auxiliary verbs) and plural markers (plural classifiers), that appear before a given noun or verb. Most of the work I did for this morphological transducer therefore involved direct mappings for whole words, which also means there were no twol tests that I could write.

The total number of tests has also changed since the first evaluation of the morphological generator, because a number of my tests were formatted incorrectly: they were entire phrases and sentences, which this software cannot analyze. I removed these tests and added more possessive pronouns, because there were many I had not yet included in the lexicon. I now have 122 of 142 tests (86%) passing.

I began working on a twol rule: when a noun is preceded by a definite article (he or e), stress falls on the ultimate syllable of that noun, that is, on its last letter, which is always a vowel. The issue I found with this rule is that the accent mark (á) is not always written across the orthographies of Tongan. In fact, after cross-referencing my corpus, I could not find a single instance of this rule taking effect. This may be because only semi-definite articles were used, which also take the form he or e. The number of working twol rules is therefore currently 0, although I hope to finish this rule soon and to show that it is needed.
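The intended effect of the accent rule can be sketched outside twol. The Python below is only an illustration under simplifying assumptions: the function name `mark_definitive_accent` is hypothetical, only short final vowels are handled, and the sketch cannot tell definite from semi-definite he or e, which is exactly the ambiguity described above.

```python
# Short vowels and their acute-accented counterparts.
ACUTE = {"a": "á", "e": "é", "i": "í", "o": "ó", "u": "ú"}
# Surface forms of the article; definite and semi-definite look the same.
ARTICLES = {"he", "e"}

def mark_definitive_accent(tokens):
    """Accent the final vowel of any token directly preceded by he or e."""
    out = []
    for i, tok in enumerate(tokens):
        prev = tokens[i - 1].lower() if i > 0 else None
        if prev in ARTICLES and tok and tok[-1] in ACUTE:
            tok = tok[:-1] + ACUTE[tok[-1]]
        out.append(tok)
    return out

print(mark_definitive_accent(["he", "vai"]))  # ['he', 'vaí']
print(mark_definitive_accent(["ki", "vai"]))  # ['ki', 'vai']
```

A real twol rule would need a way to restrict this to the definite reading, for example a tag or trigger symbol supplied by the lexicon, since the surface form of the article alone is not enough.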
Tests | tests passing (passing tests/total tests) |
---|---|
ton.yaml | 122/142 (86%) |
Coverage | 11.75% (lower than the 17.72% measured earlier; I am not yet sure why) |
Generational Tests | tests passing (passing tests/total tests) |
---|---|
ton.yaml | 75/147 (51%) |
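The percentages in these tables are the number of passing tests divided by the total, rounded to a whole percent. A small helper makes them easy to recompute when the counts change; the name `pass_rate` is my own, and the counts are the ones reported in the tables.

```python
def pass_rate(passed, total):
    """Passing tests as a whole-number percentage of all tests."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passed / total)

for passed, total in [(75, 146), (122, 142), (75, 147)]:
    print(f"{passed}/{total} = {pass_rate(passed, total)}%")
# 75/146 = 51%
# 122/142 = 86%
# 75/147 = 51%
```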