Difference between revisions of "Biak/Final project"

From LING073
[https://github.com/tmoux/ling073-bhw Biak Transducer]
[https://github.com/tmoux/ling073-bhw-eng Biak-English Machine Translation]
  
 

Latest revision as of 10:07, 20 May 2021

Additions

  • Inalienable Possession
    • Expanded contrastive grammar page to add more tests
    • Added inalienable stems to the multilingual dictionary
    • Added structural transfer rule
  • Alienable Possession
    • Expanded structural transfer rule to render possession with "'s" or a possessive pronoun, depending on the sentence, when the alienable possession form is present.
  • Interjections
    • Added support for standalone filler words as well as filler letters appended to the end of words (e.g., e, u).
    • Added disambiguation rule to resolve the article/verb ambiguity that arose as a result of the interjection pattern.
  • Adjectives - exist only as verbs in Biak
    • Used .lsx file to handle Biak verbs that translate into English as "be" + adjective
  • Nouns with multiword meanings
    • Used .lsx file to differentiate between the adjective and noun part of the word, such as in the case of "old woman"
  • Plurals
    • In Biak, the number of a noun is indicated by the conjugation of the verb. Used structural transfer to add plurality to the English noun based on the number of the verb.
  • Subject pronoun inference
    • Standalone verbs in Biak imply the presence of a subject pronoun. Added structural transfer rule that supports this pattern.
  • Differentiating subject/object pronouns
    • Used verb placement to determine the correct English tagging of pronouns (subject vs. object).
  • Complex articles/demonstratives
    • Complex articles in Biak can encode specificity and adverbs. Used structural transfer to handle the different cases and to order the terms correctly.
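
The plural and subject-pronoun items above can be pictured with a small Python sketch. This is not the project's transfer code (Apertium structural transfer rules are written in XML rule files); it is only a hedged illustration of the two ideas. The `Token` class, the tag names, the placeholder pronoun lemma `prpers`, and the assumption that the subject noun precedes the verb are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Token:
    """Hypothetical English-side token after lexical transfer."""
    lemma: str
    pos: str                      # "n", "v", or "prn" (illustrative tag names)
    person: Optional[str] = None  # "p1", "p2", "p3"
    number: Optional[str] = None  # "sg" or "pl"; None means unmarked


def first_verb(tokens: List[Token]) -> Optional[Token]:
    """Return the first verb in the clause, or None if there is no verb."""
    return next((t for t in tokens if t.pos == "v"), None)


def copy_number_to_subject(tokens: List[Token]) -> List[Token]:
    """Give unmarked nouns before the verb the number carried by the verb.

    Assumption: the subject noun precedes the verb in the clause.
    """
    verb = first_verb(tokens)
    if verb is None or verb.number is None:
        return list(tokens)
    idx = tokens.index(verb)
    out = []
    for i, tok in enumerate(tokens):
        if i < idx and tok.pos == "n" and tok.number is None:
            tok = replace(tok, number=verb.number)
        out.append(tok)
    return out


def infer_subject_pronoun(tokens: List[Token]) -> List[Token]:
    """Insert a pronoun before a verb that has no overt subject.

    The pronoun takes its person and number from the verb's agreement.
    """
    verb = first_verb(tokens)
    if verb is None:
        return list(tokens)
    idx = tokens.index(verb)
    if any(t.pos in ("n", "prn") for t in tokens[:idx]):
        return list(tokens)  # an overt subject is already present
    pronoun = Token("prpers", "prn", verb.person, verb.number)
    return tokens[:idx] + [pronoun] + tokens[idx:]


clause = [Token("dog", "n"), Token("run", "v", "p3", "pl")]
print(copy_number_to_subject(clause))

bare_verb = [Token("run", "v", "p1", "sg")]
print(infer_subject_pronoun(bare_verb))
```

In both functions the verb's agreement features are the only source of information, which mirrors the description above: the English noun or pronoun is filled in from what the Biak verb marks.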

Code

Biak Transducer

Biak-English Machine Translation

Evaluation

Metric                      Value
Size of Parallel Corpus     13,968 words
Transducer Coverage         58.5%
Stems in Transducer         435
WER                         88.86%
PER                         70.99%
Precision                   88.21%
Recall                      92.24%
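
For readers unfamiliar with the two error rates, the sketch below shows one common way to compute them for a single sentence pair. It is a minimal illustration under stated assumptions, not the tool that produced the figures above: WER is taken as word-level edit distance divided by reference length, and PER as the same idea with word order ignored. The function names are made up for this example, and tokenization is plain whitespace splitting.

```python
from collections import Counter
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def per(reference: str, hypothesis: str) -> float:
    """Position-independent error rate: like WER, but word order is ignored."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Multiset intersection counts words shared by both sentences.
    overlap = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - overlap) / len(ref)


reference = "the old woman sees the dog"
hypothesis = "the dog sees the old woman"
print(f"WER = {wer(reference, hypothesis):.2%}")
print(f"PER = {per(reference, hypothesis):.2%}")
```

Because PER ignores word order, it can never be higher than WER for the same sentence pair under these definitions. A large gap between the two, as in the table, suggests that many of the right words are being produced but in the wrong positions.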

Moving Forward

  • Anaphora resolution for pronouns
  • Extending the transducer's morphology (e.g., full and partial reduplication)
  • Working with the community
    • Setting future direction based on what the community needs and wants from computational linguistic tools
    • Evaluating current tools to ensure that they are an accurate representation of the language
  • Expanding the dictionary