User:Jmalin1/Final Project

From LING073
Revision as of 21:33, 10 May 2017 by Jmalin1


Transducer for Innu

For my final project, I began building a transducer for Innu (also known as Montagnais). It is available on Swarthmore's GitHub (https://github.swarthmore.edu/jmalin1/ling073-moe).

Evaluation

Coverage was tested on a preliminary corpus (8,476 characters, excluding spaces) composed of four myths taken from the Innu-Aimun website. The corpus is stored in a repository on Swarthmore's GitHub, which is private for copyright reasons.

Current corpus coverage is 54.58%, although approximately 22% of the analyzed tokens are punctuation.
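
Because punctuation inflates the figure, it can help to estimate coverage over word tokens only. The sketch below is not the evaluation script used for this project; it is a minimal illustration, and the function names are hypothetical. It assumes that every punctuation token in the corpus is analyzed, which the figures above do not confirm. Under that assumption, the reported numbers work out to roughly 48% coverage of non-punctuation tokens.

```python
def coverage(tokens, is_analyzed):
    """Return the fraction of tokens for which is_analyzed(token) is true."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if is_analyzed(t)) / len(tokens)


def coverage_without_punctuation(overall, punct_share_of_analyzed):
    """Estimate coverage over non-punctuation tokens.

    ASSUMPTION (not stated in the report): every punctuation token
    in the corpus receives an analysis.
    """
    # Punctuation tokens as a share of ALL tokens in the corpus.
    punct = overall * punct_share_of_analyzed
    # Remove punctuation from both the analyzed count and the total count.
    return (overall - punct) / (1.0 - punct)


# Figures reported above: 54.58% coverage, about 22% of analyzed tokens
# are punctuation.
estimate = coverage_without_punctuation(0.5458, 0.22)
print(f"{estimate:.1%}")
```

If some punctuation tokens are in fact unanalyzed, the true word-level coverage would be somewhat higher than this estimate.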

Next Steps

Double Obviation

Suffixes for obviation may change when an obviative noun is possessed by another obviative noun. This is currently implemented in the transducer, but the implementation is messy and overgenerates some forms for possessed obviative nouns. It should be revised once the morphological processes involved are better understood.

More Verb Paradigms

While all major verb paradigms have been implemented in the present indicative, other tenses and moods have yet to be implemented. The future tense poses a particular problem: morphemes generally analyzed as verb prefixes are instead written as separate words, so the existing verb paradigms may need some restructuring to accommodate them.

Disambiguation

There are several ambiguous stems in the transducer. For example, many Innu adverbs are also interjections or conjunctions with different meanings. Because Innu word order is variable, disambiguation rules that rely on neighboring words are difficult to write.
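
A first step toward disambiguation could be to list which surface forms receive readings with more than one part of speech. The following is a rough sketch only. It assumes the analyzer's output is in the Apertium stream format (`^surface/reading1/reading2$`, with `*` marking unknown words), which this page does not confirm, and the sample forms `wordA`, `wordB` and `wordC` are placeholders, not real Innu words. It also ignores escaped characters in the stream for simplicity.

```python
import re

# One lexical unit: ^surface/reading1/reading2$  (assumed Apertium stream format)
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")
# The first <tag> after the lemma is taken to be the part of speech.
TAG = re.compile(r"<([^>]+)>")


def ambiguous_by_pos(stream):
    """Map each surface form to its POS tags when it has more than one."""
    result = {}
    for surface, readings in LEXICAL_UNIT.findall(stream):
        pos_tags = set()
        for reading in readings.split("/"):
            if reading.startswith("*"):
                continue  # unknown word: no analysis to compare
            match = TAG.search(reading)
            if match:
                pos_tags.add(match.group(1))
        if len(pos_tags) > 1:
            result[surface] = sorted(pos_tags)
    return result


# Placeholder input; the tags are common Apertium tags (adverb, interjection, noun).
sample = "^wordA/wordA<adv>/wordA<ij>$ ^wordB/wordB<n>$ ^wordC/*wordC$"
ambiguous = ambiguous_by_pos(sample)
print(ambiguous)
```

A list like this would show which part-of-speech pairs are most often confused, and therefore where disambiguation rules would pay off first.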

Noun and Verb Formation

As a polysynthetic language, Innu has several productive derivational processes. Currently, derived forms are handled by listing them as separate stems in the transducer, but future versions could perhaps generate them automatically.