Difference between revisions of "Biak/Final project"
From LING073
Revision as of 11:04, 20 May 2021
Additions
- Inalienable Possession
- Expanded contrastive grammar page to add more tests
- Added inalienable stems to the multilingual dictionary
- Added structural transfer rule
- Alienable Possession
- Expanded the structural transfer rule so that, when the alienable possession form is present, possession is expressed with "'s" or a possessive pronoun, depending on the sentence.
- Interjections
- Added support for standalone filler words as well as filler letters appended to the end of words, such as e and u.
- Added a disambiguation rule to resolve the article/verb ambiguity that arose as a result of the interjection pattern.
- Adjectives - exist only as verbs in Biak
- Used an .lsx file to handle verbs that translate as "be" + adjective
- Nouns with multiword meanings
- Used an .lsx file to separate the adjective and noun parts of the word, as in "old woman"
- Plurals
- In Biak, the number of a noun is indicated by the conjugation of the verb. Used structural transfer to add plurality to the English noun based on the number of the verb.
- Subject pronoun inference
- Standalone verbs in Biak imply the presence of a subject pronoun. Added structural transfer rule that supports this pattern.
- Differentiating subject/object pronouns
- Used verb placement to determine the correct English tagging of pronouns
- Complex articles/demonstratives
- Complex articles in Biak can contain specificity and adverbs. Used structural transfer to handle the different cases and to order the terms correctly.
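The logic behind two of the additions above (plurals and subject pronoun inference) can be sketched in Python. This is only an illustration of the idea, not the project's actual transfer rules, which are not reproduced on this page. The LU class, the tag names (n, v, prn, subj, p1/p2/p3, sg, pl), the placeholder lemma "prpers", and the assumption that the subject noun directly precedes the verb are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

NUMBER_TAGS = {"sg", "pl"}  # illustrative number tags only


@dataclass(frozen=True)
class LU:
    """A lexical unit: a lemma plus a tuple of tags (tag names are hypothetical)."""
    lemma: str
    tags: Tuple[str, ...]

    def has(self, tag: str) -> bool:
        return tag in self.tags


def verb_number(verb: LU) -> str:
    """Number tag carried by the verb; defaults to singular if none is present."""
    return "pl" if verb.has("pl") else "sg"


def verb_person(verb: LU) -> str:
    """Person tag carried by the verb; defaults to third person if none is present."""
    for person in ("p1", "p2", "p3"):
        if verb.has(person):
            return person
    return "p3"


def transfer(units: List[LU]) -> List[LU]:
    """Apply the two toy rules left to right over a sequence of lexical units."""
    out: List[LU] = []
    for unit in units:
        if unit.has("v"):
            number = verb_number(unit)
            prev = out[-1] if out else None
            if prev is not None and prev.has("n"):
                # Plurals: copy the verb's number onto the preceding (subject) noun.
                kept = tuple(t for t in prev.tags if t not in NUMBER_TAGS)
                out[-1] = LU(prev.lemma, kept + (number,))
            elif prev is None or not prev.has("prn"):
                # Subject pronoun inference: a standalone verb gets a subject
                # pronoun that agrees with it in person and number.
                out.append(LU("prpers", ("prn", "subj", verb_person(unit), number)))
        out.append(unit)
    return out


if __name__ == "__main__":
    # English-side placeholder lemmas; no Biak data is assumed here.
    print(transfer([LU("dog", ("n",)), LU("run", ("v", "p3", "pl"))]))
    print(transfer([LU("run", ("v", "p1", "sg"))]))
```

In this sketch a noun followed by a plural verb comes out tagged as plural, and a verb with no preceding noun or pronoun gains a subject pronoun matching its person and number.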
Code
Evaluation
Metric | Value
Size of Parallel Corpus | 13,968 words
Transducer Coverage | 58.5%
Stems in Transducer | 435
WER | 88.86%
PER | 70.99%
Precision | 88.21%
Recall | 92.24%
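For readers unfamiliar with the two error rates in the table, the following Python sketch shows one common way to compute them for a single sentence pair. It assumes whitespace tokenization, WER as word-level edit distance divided by reference length, and PER as the bag-of-words mismatch divided by reference length. The tool used to produce the figures above may differ in its details, and the example sentences are invented for illustration.

```python
from collections import Counter
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # delete a reference word
                           cur[j - 1] + 1,           # insert a hypothesis word
                           prev[j - 1] + (r != h)))  # match or substitute
        prev = cur
    return prev[-1]


def wer(ref: List[str], hyp: List[str]) -> float:
    """Word error rate: edit distance relative to the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def per(ref: List[str], hyp: List[str]) -> float:
    """Position-independent error rate: like WER, but word order is ignored."""
    if not ref:
        raise ValueError("reference must not be empty")
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    # Invented example pair, not taken from the project's corpus.
    reference = "the old woman sees the dogs".split()
    hypothesis = "the woman old sees dogs".split()
    print(f"WER: {wer(reference, hypothesis):.2%}")
    print(f"PER: {per(reference, hypothesis):.2%}")
```

Because PER ignores word order, it is never higher than WER for the same sentence pair, which is consistent with the gap between the two figures reported above.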
Moving Forward
- Anaphora resolution for pronouns
- Extending the transducer's morphology, e.g., full/partial reduplication
- Working with community
- Future direction based on community needs and wants in terms of computational linguistic tools
- Evaluation of current tools, ensuring that our tools are an accurate representation of the language
- Expanding dictionary