Difference between revisions of "Kaingang and Portuguese/Final Project"


Revision as of 22:45, 14 May 2019

The project: RBMT or NMT?

Background

Kaingang is a language spoken in southern Brazil. It has roughly 20,000 speakers, many of whom are bilingual in Portuguese. Kaingang is one of the best-documented indigenous languages of Brazil; however, for our project we were only able to use a Kaingang-Portuguese dictionary (https://www.sil.org/resources/archives/42876) and a Kaingang New Testament (https://s3.amazonaws.com/scriptureearth/data/kgp/PDF/00-WNTkgp-web.pdf), because the other resources were written in languages we could not read.

Idea

We wanted to find out how well neural machine translation (NMT) could work for a language pair that does not have millions of parallel lines (we have ~10,000). We also wanted to compare it to our rule-based transducer, which we have been extending since the end of the semester.

Motivation

As far as we could find by searching, this is the only machine translation system available between Kaingang and another language. Our motivation was to help make it possible for the Kaingang people not to have to choose between teaching Kaingang and teaching Portuguese to later generations. We hoped that, if one of the translators eventually reached state-of-the-art quality, any need for Portuguese could be met through the translator, which would be an incentive to keep the Kaingang language alive.

Links to repositories

Kaingang transducer

Kaingang-Portuguese translator

Evaluation