Biak/Final project

From LING073
Revision as of 10:04, 20 May 2021 by Mfergus3 (talk | contribs) (Evaluation)

Additions

  • Inalienable Possession
    • Expanded contrastive grammar page to add more tests
    • Added inalienable stems to the multilingual dictionary
    • Added structural transfer rule
  • Alienable Possession
    • Expanded the structural transfer rule so that, when the alienable possession form is present, possession is rendered with "'s" or a possessive pronoun, depending on the sentence.
  • Interjections
    • Added support for standalone filler words as well as filler letters appended to the end of words (e, u, etc.).
    • Added a disambiguation rule to resolve the article/verb ambiguity that arose as a result of the interjection pattern.
  • Adjectives - exist only as verbs in Biak
    • Used an .lsx file to handle verbs that translate as "be" + adjective
  • Nouns with multiword meanings
    • Used an .lsx file to separate the adjective and noun parts of the word, as in the case of "old woman"
  • Plurals
    • In Biak, the number of a noun is indicated by the conjugation of the verb. Used structural transfer to add plurality to the English noun based on the number of the verb.
  • Subject pronoun inference
    • Standalone verbs in Biak imply the presence of a subject pronoun. Added a structural transfer rule that supports this pattern.
  • Differentiating subject/object pronouns
    • Used verb placement to determine the correct English tagging of pronouns (subject vs. object)
  • Complex articles/demonstratives
    • Complex articles in Biak can contain specificity and adverbs. Used structural transfer to handle the different cases and to order the terms correctly.
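
As a rough illustration of the "Plurals" and "Subject pronoun inference" items above, the sketch below mimics the logic in Python rather than in the project's structural transfer rule format. Everything in it is an assumption made for illustration: the Token class, the tag names (n, v, prn, sg, pl, p1, p3), the placeholder lemmas, and the simplification that the subject noun is the noun preceding the verb. None of it is taken from the project's actual rule files or dictionaries.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Token:
    """Hypothetical analysed token; the tag names are illustrative only."""
    lemma: str
    pos: str                       # "n", "v", or "prn"
    number: Optional[str] = None   # "sg", "pl", or None if unmarked
    person: Optional[str] = None   # e.g. "p1", "p3"


def transfer(tokens: List[Token]) -> List[Token]:
    """Copy verb number onto an unmarked subject noun, or insert a
    subject pronoun when the verb has no overt subject."""
    verb_idx = next((i for i, t in enumerate(tokens) if t.pos == "v"), None)
    if verb_idx is None:
        return list(tokens)  # no verb: nothing to infer
    verb = tokens[verb_idx]

    out = []
    for i, tok in enumerate(tokens):
        # Assumption: a noun before the verb is the subject and takes
        # its number from the verb's agreement.
        if i < verb_idx and tok.pos == "n" and tok.number is None:
            tok = replace(tok, number=verb.number)
        out.append(tok)

    has_subject = any(t.pos in ("n", "prn") for t in tokens[:verb_idx])
    if not has_subject:
        # Standalone verb: supply a pronoun with the verb's person/number.
        out.insert(verb_idx, Token("prpers", "prn", verb.number, verb.person))
    return out


if __name__ == "__main__":
    print(transfer([Token("noun_stem", "n"), Token("verb_stem", "v", "pl", "p3")]))
    print(transfer([Token("verb_stem", "v", "sg", "p1")]))
```

In the real system this logic would live in the transfer rules themselves; the Python version is only meant to make the two patterns easier to follow.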

Code

Evaluation

Metric                     Value
Size of Parallel Corpus    13,968 words
Transducer Coverage        58.5%
Stems in Transducer        435
WER                        88.86%
PER                        70.99%
Precision                  88.21%
Recall                     92.24%
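
For readers unfamiliar with the two error rates, the following is a minimal sketch of how word error rate (WER) and position-independent error rate (PER) are commonly defined. It is not the evaluation script used for the figures above. The function names are hypothetical, the example sentence is made up, and the PER formula used here, (max(|ref|, |hyp|) - matching words) / |ref|, is one common definition that may differ in detail from the tool that produced these numbers.

```python
from collections import Counter
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # delete a reference word
                           cur[j - 1] + 1,           # insert a hypothesis word
                           prev[j - 1] + (r != h)))  # match or substitute
        prev = cur
    return prev[-1]


def wer(ref: List[str], hyp: List[str]) -> float:
    """Word error rate: edit distance normalised by reference length."""
    return edit_distance(ref, hyp) / len(ref)


def per(ref: List[str], hyp: List[str]) -> float:
    """Position-independent error rate: compares bags of words, ignoring order."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    reference = "the old woman sees the dogs".split()   # made-up example
    hypothesis = "the woman old sees dogs".split()
    print(f"WER = {wer(reference, hypothesis):.2%}")
    print(f"PER = {per(reference, hypothesis):.2%}")
```

Because PER ignores word order, it is never higher than WER for the same output under these definitions; a gap between the two, as in the table above, suggests that part of the error comes from word order rather than word choice.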

Moving Forward

  • Anaphora resolution for pronouns
  • Extending the transducer's morphology, e.g., full/partial reduplication
  • Working with community
    • Setting future direction based on community needs and wants in terms of computational linguistic tools
    • Evaluating current tools to ensure that they are an accurate representation of the language
  • Expanding dictionary