Difference between revisions of "Structural transfer"
Revision as of 08:05, 26 April 2021
Background
The basic idea of structural transfer in RBMT
The idea of structural transfer in RBMT is to deal with the order and tag differences encountered in translation between two languages.
How structural transfer works in Apertium
Transfer takes the output of the biltrans mode (bilingual translation), matches sequences of words against patterns you define, performs operations on them, and outputs the result. It allows you to change the order of words, change tags, etc.
Syntactic Structures and Parsing
The way Apertium's recursive structural transfer system works is to parse and combine phrases, and then to output each parsed phrase. Rules specify what words or phrases are parsed together into phrases and how they're output.
Example: English-to-Kyrgyz
In the English-to-Kyrgyz example for "I did not see the houses", in both languages a noun (<n>) is parsed into an NP (noun phrase), and an NP can combine with a determiner (<det>) to form a DP (determiner phrase), although Kyrgyz doesn't have a definite article. These rules can be written as follows:
 NP -> %n { %1 } ;
 DP -> det %NP { 1 _ %2 } |
       det %NP { %2 } ;
The beginning of each rule describes what kind of phrase to build (NP and DP, respectively). The part between the arrow (->) and the {}s is the material to parse, in this case specified as POS tags (<n>, <det>) or other phrases (NP). The part inside the {}s shows the order to output the elements in; in the last rule, the first element is not output. The %s on the input elements show which element to copy features from to the phrase level (different features can come from different elements; see the vP example below), and the %s on the output elements show which element to copy missing features to from the phrase node on output. The _ is simply a blank (roughly a space) in the output.
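The way the {}s reorder elements and the output %s fill in missing features can be pictured with a small Python sketch. This is a toy model for illustration only, not how Apertium implements it: the function name apply_output, the dictionary layout of the elements, and the list-of-tokens form of the pattern are all assumptions made up for this example.

```python
# Toy model of an output pattern such as { 1 _ %2 }; NOT Apertium's own code.
# apply_output and the element layout below are hypothetical.

def apply_output(children, pattern, phrase_features):
    """Render matched elements in the order given by an output pattern.

    children:        list of dicts with "lemma" and "tags" (feature -> value)
    pattern:         list of tokens: "_" is a blank, "N" copies element N as-is,
                     "%N" copies element N and fills in any features it lacks
                     from phrase_features
    phrase_features: features stored on the phrase node
    """
    out = []
    for token in pattern:
        if token == "_":
            out.append(" ")
            continue
        fill = token.startswith("%")
        child = children[int(token.lstrip("%")) - 1]  # patterns count from 1
        tags = dict(child["tags"])
        if fill:
            for name, value in phrase_features.items():
                tags.setdefault(name, value)  # only missing features are added
        out.append(child["lemma"] + "".join(f"<{v}>" for v in tags.values()))
    return "".join(out)


the = {"lemma": "the", "tags": {"pos": "det"}}
house = {"lemma": "house", "tags": {"pos": "n", "number": "pl"}}
dp_features = {"number": "pl", "case": "acc"}  # assumed phrase-level features

print(apply_output([the, house], ["1", "_", "%2"], dp_features))
# the<det> house<n><pl><acc>
print(apply_output([the, house], ["%2"], dp_features))
# house<n><pl><acc>
```

In the second call the determiner is matched but never mentioned in the pattern, so it simply does not appear in the output, which is the behaviour described for the last DP rule.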
To decide between the two DP rules, you would need to specify lemmas, use weighting, or use conditionals (specifying lemmas is discussed below; the others are discussed later).
Similarly to how nominals are dealt with above, verbs (<v>, <vblex>, etc.) and auxiliaries (<vaux>) can combine in various ways into a vP (first-level verb phrase) in both languages, although when English uses "do" auxiliaries, Kyrgyz does not use an auxiliary, and Kyrgyz encodes the equivalent of a "not" adverbial as <neg> on the main verb.
vP -> do@AuxP.$tense.$polarity v.inf.$lemh { %2 } ;
Here .$tense, .$polarity, and .$lemh indicate which element the attribute (feature) of each name is taken from; together they do the job of %. Specifying .inf requires the <v> to also have an <inf> tag, and specifying do@ requires the AuxP to have a lemma of do, which would have been obtained in the parse using a rule like this:
AuxP -> %vaux { %1 } | %vbdo { %1 } ;
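The constraints that do@ and .inf place on an input element can be sketched in Python as follows. This is a simplified model under stated assumptions, not Apertium's matcher: the function matches, the "category" and "tags" fields, and the way the specification string is split are all hypothetical and exist only for this illustration.

```python
# Toy check of one element against a specification such as "do@AuxP.$tense"
# or "v.inf.$lemh"; NOT Apertium's own matcher. All names are hypothetical.

def matches(element, spec):
    """Return True if element satisfies the lemma and tag constraints in spec.

    element: dict with "lemma", "category" (POS tag or phrase label) and
             "tags" (a set of further tags)
    spec:    optional "lemma@", then the category, then dot-separated parts;
             parts starting with "$" name attributes to copy and therefore
             impose no constraint in this model
    """
    lemma, _, rest = spec.rpartition("@")   # lemma is "" when there is no "@"
    category, *parts = rest.split(".")
    required = [p for p in parts if not p.startswith("$")]
    if lemma and element["lemma"] != lemma:
        return False
    if element["category"] != category:
        return False
    return all(tag in element["tags"] for tag in required)


did = {"lemma": "do", "category": "AuxP", "tags": {"past", "neg"}}
see = {"lemma": "see", "category": "v", "tags": {"inf"}}

print(matches(did, "do@AuxP.$tense.$polarity"))  # True
print(matches(see, "v.inf.$lemh"))               # True
```

An AuxP whose lemma is something other than do, or a verb without the inf tag, would fail the check, so the vP rule above would not apply to it.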
The first difference in ordering is that a vP and a DP, while parsed together into a VP (a top-level verb phrase) in both languages, occur in different orders in the two languages. If translating from English to Kyrgyz, a rule would need to parse a sequence of vP DP into VP, but output the components in the reverse order. This rule would look something like this:
VP -> vP DP { 2 _ 1 } ;
In both languages, a <prn> is parsed into a DP, and a DP and a VP combine to form an S (sentence) in the same order:
S -> DP VP { 1 _ 2 } ;
In addition to these rules, each POS and phrase type needs an output pattern. These patterns would look something like this for this English-to-Kyrgyz example:
 S: _ ;
 DP: _.number.possession.case ;
 NP: _.number.possession.case ;
 n: _.number.possession.case ;
 VP: _.polarity.tense.person.number ;
 vP: _.polarity.tense.person.number ;
 v: _.polarity.tense.person.number ;
 AuxP: _.polarity.tense.person.number ;
 det: %;
These output patterns define the order in which attributes (sets of tags, which also need to be defined) are arranged. The _ represents a lemma followed by the main POS. Attributes are defined as follows:
 person = (PD p3) p1 p2 p3 PD ;
 number = (ND sg) sg pl sp ND ;
For the most part these definitions are simple lists, but the pair in ()s at the start defines a filler used in parsing if no other information is available (e.g., ND, for "Number to be Determined") and the default to replace it with in output if no value has been set.
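The filler-and-default behaviour can be illustrated with a short Python sketch. It is only a model of the idea described above, under the assumption that an attribute is looked up by scanning an element's tags; the names ATTRIBUTES, read_attribute, and finalize are invented for this example and are not part of Apertium.

```python
# Toy model of the attribute definitions
#   person = (PD p3) p1 p2 p3 PD ;
#   number = (ND sg) sg pl sp ND ;
# NOT Apertium's own code; all names here are hypothetical.

ATTRIBUTES = {
    "person": {"filler": "PD", "default": "p3", "values": ["p1", "p2", "p3", "PD"]},
    "number": {"filler": "ND", "default": "sg", "values": ["sg", "pl", "sp", "ND"]},
}


def read_attribute(name, tags):
    """While parsing: return the first tag that belongs to the attribute,
    or the filler if the element carries no such tag."""
    spec = ATTRIBUTES[name]
    for tag in tags:
        if tag in spec["values"]:
            return tag
    return spec["filler"]


def finalize(name, value):
    """On output: replace a filler that was never resolved with the default."""
    spec = ATTRIBUTES[name]
    return spec["default"] if value == spec["filler"] else value


print(read_attribute("number", ["n", "pl"]))  # pl
print(read_attribute("number", ["n"]))        # ND
print(finalize("number", "ND"))               # sg
print(finalize("person", "p1"))               # p1
```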
This example is not complete, but it covers a lot of the basic work. Everything else would need more advanced features.
Example: English-to-Spanish
under construction
The structure of a transfer file
under construction
Some things to note
- weighting
under construction
Examples of implemented Apertium transfer systems
Some examples are available:
- eng-spa (in-class): a basic example from class showing how to transfer adjective+noun from English to Spanish ("big houses → casas largas": number and gender agreement and reordering) using chunking (chunker+interchunk).
- eng-kir (apertium)
- kaz-kir (apertium)
- br-fr
under construction
Writing rules
Documentation is available...
under construction
Evaluating
Scrape a mini test corpus
- First make sure you have scrapeTransferTests. Test that running scrapeTransferTests gives you information on using the tool. If not, clone the tools repo (or git pull to update it, if you already have it cloned from other assignments) and run sudo make. Test again.
- Scrape the transferTests from your contrastive grammar page into a small parallel corpus. E.g., scrapeTransferTests -p abc-xyz "Language1_and_Language2/Contrastive_Grammar" will result in an abc.tests.txt and an xyz.tests.txt file that contain the respective sides of any transferTests on your contrastive grammar page specified as being for abc-to-xyz translation.
- Add these two files to your bilingual corpus repository and add mention of their origin (the wiki page) to the MANIFEST file.
WER and PER
WER (word error rate) is a measure of how different two texts are. You will want to know how different the translation your translation pair produces (the "test translation") is from the known good translation of the phrases in your parallel corpus (the "reference translation").
PER (position-independent error rate) is the same measurement, just not sensitive to position in a phrase. I.e., a correct translation of every word but in an entirely wrong word order will give you high (bad) WER but low (good) PER.
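To make the difference concrete, here is a minimal Python sketch of both measures. It uses the common textbook definitions (word-level edit distance divided by reference length for WER, and a bag-of-words comparison for PER); the exact computation in apertium-eval-translator may differ in details, and the function names wer and per are just names chosen for this example.

```python
# Minimal sketch of WER and PER using common textbook definitions.
# The numbers reported by apertium-eval-translator may be computed differently.
from collections import Counter


def wer(reference, test):
    """Word-level edit distance divided by the reference length.
    Assumes the reference contains at least one word."""
    ref, hyp = reference.split(), test.split()
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a test word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1] / len(ref)


def per(reference, test):
    """Like WER, but word order is ignored: words are compared as bags.
    Assumes the reference contains at least one word."""
    ref, hyp = reference.split(), test.split()
    matched = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matched) / len(ref)


reference = "the cat sat"
print(per(reference, "sat cat the"))  # 0.0: every word is right
print(wer(reference, "sat cat the"))  # about 0.67: but the order is wrong
print(wer(reference, "the dog sat"), per(reference, "the dog sat"))
# one wrong word gives the same score, about 0.33, for both measures
```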
To test WER and PER:
- First make sure you have apertium-eval-translator. Test that running apertium-eval-translator gives you information on using the tool. If not, clone the tools repo (or git pull to update it, if you already have it cloned from other assignments) and run make.
- You need two files: one test translation, and one reference translation. The reference translation is the parallel text in your corpus, e.g. abc.tests.txt. To get a test translation, run the source text through apertium and direct the output into a new file, e.g. cat xyz.tests.txt | apertium -d . xyz-abc > xyz-abc.tests.txt. You should add the [final] test translation to your repository.
- The following command should then give you WER and PER measures and some other useful numbers:
 apertium-eval-translator -r abc.tests.txt -t xyz-abc.tests.txt
The assignment
This assignment is due early in week 13 (this semester, noon on Monday, May 3, 2021).
Getting set up
- Add a page to the wiki called Language1_and_Language2/Structural_transfer, linking to it from the main page on the language pair.
  - Put the page in the category Category:Sp21_StructuralTransfer and the categories for the two languages.
- Perform WER, PER, and coverage tests on your short sentences corpus, and add the results to a pre-evaluation section.
Adding stems
- Add all the words needed for the transfer tests (from the last assignment) to the bilingual dictionary.
- Make sure both analysers can analyse all sentences correctly, which includes adding the words to the relevant monolingual dictionaries as necessary.
Write structural transfer rules
- Implement at least one item from your contrastive grammar.
- Each person in each group should implement at least one item for the direction that translates into the language that they have been primarily working with. The same item does not need to be used for each direction.
- If the contrastive grammar item only involves relabelling or reordering tags within the same form, then please do at least two items.
Wrapping up
- Add to your structural transfer wiki page:
- Add at least one example sentence for each item you implement. Show the outputs of the following modes for your translation system: tagger, biltrans, transfer, and the pair itself (abc-xyz).
- Perform WER, PER, and coverage tests again, and add the results to a post-evaluation section on the wiki page.