Difference between revisions of "User:Jmalin1/Final Project"
Revision as of 21:38, 10 May 2017
Transducer for Innu
For my final project I began building a transducer for Innu (also known as Montagnais). It is available on Swarthmore's GitHub (https://github.swarthmore.edu/jmalin1/ling073-moe).
Evaluation
Coverage was tested over a preliminary corpus (8476 characters without spaces) composed of four myths taken from the Innu-Aimun website. The corpus is available in a repository on Swarthmore's GitHub, which is private due to copyright.
Current corpus coverage is 54.58% (although approximately 22% of analyzed tokens are punctuation).
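Because punctuation inflates the figure above, it can help to report coverage both with and without punctuation tokens. The following is a minimal Python sketch of that calculation, not the script actually used for this project. It assumes the analyzer output has already been parsed into (token, list of analyses) pairs; the function names and the toy data are hypothetical.

```python
import unicodedata


def is_punctuation(token):
    """Return True if every character of the token is Unicode punctuation."""
    return bool(token) and all(
        unicodedata.category(ch).startswith("P") for ch in token
    )


def coverage(analyzed):
    """Compute coverage over all tokens and over non-punctuation tokens.

    `analyzed` is a list of (token, analyses) pairs, where `analyses` is an
    empty list for tokens the transducer could not analyze (assumed format).
    Returns a pair of fractions: (all_tokens, words_only).
    """
    covered = sum(1 for _, analyses in analyzed if analyses)
    words = [(tok, an) for tok, an in analyzed if not is_punctuation(tok)]
    covered_words = sum(1 for _, an in words if an)
    all_tokens = covered / len(analyzed) if analyzed else 0.0
    words_only = covered_words / len(words) if words else 0.0
    return all_tokens, words_only


def estimate_word_coverage(all_tokens, punct_share):
    """Estimate words-only coverage from two summary figures.

    `punct_share` is the fraction of analyzed tokens that are punctuation.
    Assumes every punctuation token in the corpus receives an analysis.
    """
    analyzed_words = all_tokens * (1 - punct_share)
    total_words = 1 - all_tokens * punct_share
    return analyzed_words / total_words


# Toy input with placeholder tokens (not real Innu words).
toy = [
    ("w1", ["w1<n>"]),
    ("w2", []),
    ("w3", []),
    (".", [".<sent>"]),
    (",", [",<cm>"]),
]
all_tokens, words_only = coverage(toy)
```

Under the additional assumption that every punctuation token is analyzed, `estimate_word_coverage(0.5458, 0.22)` suggests that coverage of non-punctuation tokens is roughly 48%; this is an estimate derived from the two figures above, not a separately measured result.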
Next Steps
Double Obviation
Suffixes for obviation may change when an obviative noun is possessed by another obviative noun. The transducer currently implements this, but imprecisely: it overgenerates some forms for possessed obviative nouns. The implementation should be revised once the morphological processes involved are better understood.
More Verb Paradigms
While all major verb paradigms have been implemented in the present indicative, other tenses and moods have not yet been implemented. The future tense poses a particular problem: morphemes generally analyzed as verb prefixes are instead written as a separate word, so the other verb paradigms may need some restructuring to accommodate this.
Disambiguation
The transducer contains several ambiguous stems; for example, many Innu adverbs are also interjections or conjunctions with different meanings. Because word order is variable, precise disambiguation rules are difficult to formulate.
Noun and Verb Formation
As a polysynthetic language, Innu has several productive derivational processes. Currently, these are handled with different stems in the transducer, but future versions of the transducer could perhaps handle them automatically.
Dialectal Variation
Innu has several different dialects. The transducer functions in a modern, standardized orthography but could be modified to analyze text written in variant orthographies as well.
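One possible way to support variant orthographies without changing the transducer itself is to normalize unanalyzed tokens to the standard spelling and retry the analysis. The Python sketch below illustrates that idea only; it is not part of the current transducer. The replacement rules, the toy lexicon, and the function names are all placeholders and do not represent real Innu spelling correspondences.

```python
def normalize(token, rules):
    """Apply ordered (variant, standard) string replacements to a token."""
    for variant, standard in rules:
        token = token.replace(variant, standard)
    return token


def analyze_with_fallback(token, analyze, rules):
    """Analyze the token as written; if that fails, retry after normalizing.

    `analyze` is any callable that returns a list of analyses for a token
    (an empty list when the token is unknown).
    """
    analyses = analyze(token)
    if analyses:
        return analyses
    return analyze(normalize(token, rules))


# Placeholder rules and lexicon, invented purely for illustration.
PLACEHOLDER_RULES = [("zz", "z")]
TOY_LEXICON = {"bazo": ["bazo<n>"]}


def toy_analyze(token):
    """Stand-in for a call to the real transducer."""
    return TOY_LEXICON.get(token, [])


standard = analyze_with_fallback("bazo", toy_analyze, PLACEHOLDER_RULES)
variant = analyze_with_fallback("bazzo", toy_analyze, PLACEHOLDER_RULES)
unknown = analyze_with_fallback("qqq", toy_analyze, PLACEHOLDER_RULES)
```

Trying the original spelling first keeps the behavior on standardized text unchanged, so the normalization rules only affect tokens that would otherwise go unanalyzed.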