Difference between revisions of "Biak/Final project"

From LING073
== Additions ==
* Inalienable Possession
** Expanded the contrastive grammar page with more tests
** Added inalienable stems to the multilingual dictionary
** Added a structural transfer rule
* Alienable Possession
** Expanded the structural transfer rule so that, when the alienable possession form is present, possession is rendered in English with "'s" or a possessive pronoun, depending on the sentence.
* Interjections
** Added support for standalone filler words, as well as filler letters (e, u, etc.) added to the ends of words.
** Added a disambiguation rule to resolve the article/verb ambiguity that arose as a result of the interjection pattern.
* Adjectives (these exist only as verbs in Biak)
** Used an .lsx file to resolve verbs that translate as "be" + adjective
* Nouns with multiword meanings
** Used an .lsx file to separate the adjective and noun parts of the word, as in the case of "old woman"
* Plurals
** In Biak, the number of a noun is marked through the conjugation of the verb. Used structural transfer to add plurality to the English noun based on the number of the verb.
* Subject pronoun inference
** Standalone verbs in Biak imply the presence of a subject pronoun. Added a structural transfer rule that supports this pattern.
* Differentiating subject/object pronouns
** Used verb placement to determine the correct English tagging of pronouns
* Complex articles/demonstratives
** Complex articles in Biak can contain specificity and adverbs. Used structural transfer to handle the different cases and to order the terms correctly.

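The plural handling described above can be illustrated with a small Python sketch. This is not the project's actual transfer rule (that lives in the Apertium transfer files in the repositories linked below); it is a toy model under stated assumptions: the <code>LexicalUnit</code> class and <code>propagate_number</code> function are hypothetical, the tag names <code>n</code>, <code>vblex</code>, <code>sg</code> and <code>pl</code> are used only for illustration, and English placeholder lemmas stand in for real Biak data.

```python
from dataclasses import dataclass, replace

NUMBER_TAGS = ("sg", "pl")


@dataclass(frozen=True)
class LexicalUnit:
    """Hypothetical stand-in for one analysed word: a lemma plus its tags."""
    lemma: str
    tags: tuple


def propagate_number(units):
    """Copy the number tag of the first verb onto nouns that have no number.

    Toy simplification: every unmarked noun is updated. A real rule would
    presumably restrict this to the noun phrase the verb agrees with.
    """
    # Find the first number tag carried by a verb, if any.
    number = next(
        (t for u in units if "vblex" in u.tags for t in u.tags if t in NUMBER_TAGS),
        None,
    )
    if number is None:
        return list(units)

    result = []
    for u in units:
        is_noun = "n" in u.tags
        has_number = any(t in NUMBER_TAGS for t in u.tags)
        if is_noun and not has_number:
            u = replace(u, tags=u.tags + (number,))
        result.append(u)
    return result


if __name__ == "__main__":
    sentence = [
        LexicalUnit("woman", ("n",)),
        LexicalUnit("walk", ("vblex", "pl")),
    ]
    for unit in propagate_number(sentence):
        print(unit.lemma, unit.tags)
```

In this sketch the noun receives <code>pl</code> because the verb carries it, which is the idea behind generating an English plural from the number of the Biak verb.
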
== Code ==

[https://github.com/tmoux/ling073-bhw Biak Transducer]

[https://github.com/tmoux/ling073-bhw-eng Biak-English Machine Translation]

== Evaluation ==
{| class="wikitable"
|+ Evaluation
|-
| Size of Parallel Corpus || 13,968 words
|-
| Transducer Coverage || 58.5%
|-
| Stems in Transducer || 435
|-
| Word Error Rate (WER) || 88.86%
|-
| Position-independent Error Rate (PER) || 70.99%
|-
| Precision || 88.21%
|-
| Recall || 92.24%
|}

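For readers unfamiliar with WER and PER, the following Python sketch shows how the two measures differ. It is not the script used to produce the figures above; the function names are hypothetical, and the PER formula is one common formulation (bag-of-words matching, ignoring word order) that may differ in detail from the tool actually used.

```python
from collections import Counter


def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance over words (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref_words, hyp_words) / len(ref_words)


def per(reference, hypothesis):
    """Position-independent error rate (assumed formulation, order ignored)."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    # Multiset intersection counts words shared regardless of position.
    matches = sum((Counter(ref_words) & Counter(hyp_words)).values())
    return (max(len(ref_words), len(hyp_words)) - matches) / len(ref_words)


if __name__ == "__main__":
    ref = "the old woman walks"
    hyp = "woman the old walks"
    print(wer(ref, hyp))  # 0.5: one word is out of place
    print(per(ref, hyp))  # 0.0: all words are present
```

Because PER ignores word order, it is never higher than WER under this formulation. A gap between the two, as in the table above, is consistent with some of the errors being word-order errors rather than wrong word choices.
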
== Moving Forward ==
* Anaphora resolution for pronouns
* Extending the transducer's morphology, e.g., full and partial reduplication
* Working with the community
** Setting future direction based on the community's needs and wants in terms of computational linguistic tools
** Evaluating current tools, ensuring that our tools are an accurate representation of the language
* Expanding the dictionary

[[Category:sp21_FinalProjects]]

Latest revision as of 10:07, 20 May 2021
