Difference between revisions of "Structural transfer"

From LING073
(Example: English-to-Kyrgyz)
(Example: Kyrgyz-to-English: working rule)
(29 intermediate revisions by 2 users not shown)
Line 3: Line 3:
  
 
=== The basic idea of structural transfer in RBMT ===
 
=== The basic idea of structural transfer in RBMT ===
The idea of structural transfer in RBMT is to deal with the order and tag differences encountered in translation between two languages
+
The idea of structural transfer in RBMT is to deal with the order and tag differences encountered in translation between two languages.
  
 
[[File:Transfer - basic idea.png|thumb|The arrows between the two tagged levels represent where structural transfer is needed.  Colour coding shows [rough] correspondences.]]
 
[[File:Transfer - basic idea.png|thumb|The arrows between the two tagged levels represent where structural transfer is needed.  Colour coding shows [rough] correspondences.]]
  
 
=== How structural transfer works in Apertium ===
 
=== How structural transfer works in Apertium ===
Transfer takes the output of the <code>biltrans</code> mode (bilingual translation), matches series of words based on patterns you define, and performs operations on and output those things.  It allows you to change the order of words, change tags, etc.
+
Transfer takes the output of the <code>lex</code> mode ([[lexical selection]]), matches series of words based on patterns you define, and performs operations on and outputs those things.  It allows you to change the order of words, change tags, add and remove words, etc.
  
 
=== Syntactic Structures and Parsing ===
 
=== Syntactic Structures and Parsing ===
The way Apertium's recursive structural transfer system works is to parse and combine phrases, and then to output each parsed phrase.  Rules specify what words or phrases are parsed together into phrases and how they're output.
+
The way [[:apertium:Apertium-recursive|Apertium's recursive structural transfer system]] works is to parse and combine phrases, and then to output each parsed phrase.  Rules specify what words or phrases are parsed together into phrases and how they're output.
 +
 
 +
In other words, at a basic level the recursive structural transfer system first turns analyses from the source language into a tree according to parse rules, and then outputs that tree according to both output rules and patterns for each node.  You can think of it as building a tree, going up from the words, and then working its way down from the top to output forms again.
  
 
==== Example: English-to-Kyrgyz ====
 
==== Example: English-to-Kyrgyz ====
  
[[File:Tree alignment eng-kir PP.png|thumb|The mapping between phrase-structure trees of the Kyrgyz and English sentence above ("I see the houses")]]
+
[[File:Tree alignment eng-kir PP.png|thumb|The mapping between phrase-structure trees of the Kyrgyz and English sentence above ("I did not see the houses")]]
  
In the English-to-Kyrgyz example for "I did not see the houses", in both languages a noun ({{tag|n}}) is parsed into an <code>NP</code> (noun phrase), and an <code>NP</code> can combine with a determiner ({{tag|det}}) to form a <code>DP</code> (determiner phrase), although Kyrgyz doesn't have a definite article.  These rules can be written as follows:
+
In the English-to-Kyrgyz example for "I did not see the houses", in both languages a noun ({{tag|n}}) is parsed into an <code>NP</code> (noun phrase), and an <code>NP</code> can combine with a determiner ({{tag|det}}) to form a <code>DP</code> (determiner phrase), although Kyrgyz doesn't have a definite article.  They can also both form a <code>DP</code> from just a pronoun.  These rules can be written as follows:
  
 
  NP -> %n { %1 } ;
 
  NP -> %n { %1 } ;
  DP -> det %NP { 1 _ %2 } |
+
  DP -> det %NP { %2 } |
       det %NP { %2 } ;
+
       det %NP { 1 _ %2 } |
 +
      %prn { %1 } ;
  
The beginning of each rule describes what kind of phrase to build (<code>NP</code> and <code>DP</code>, respectively).  The part after the arrow (<code>-&gt;</code>) and the <code>{}</code>s is the material to parse, in this case specified as POS tags ({{tag|n}}, {{tag|det}}) or other phrases (<code>NP</code>).  The part inside the <code>{}</code>s shows the order to output the elements in; in the last rule, the first element is not output.  The <code>%</code>s on the input elements shows what element to copy features from to the phrase level (different features can come from different elements; see <code>VP</code> example below), and the <code>%</code>s on the output elements shows which element to copy missing features to from the phrase node on output.  The <code>_</code> is simply a blank (roughly a space) in the output.
+
The beginning of each rule describes what kind of phrase to build (<code>NP</code> and <code>DP</code>, respectively).  The part between the arrow (<code>-&gt;</code>) and the <code>{}</code>s is the material to parse, in this case specified as POS tags ({{tag|n}}, {{tag|det}}) or other phrases (<code>NP</code>).  The part inside the <code>{}</code>s shows the order to output the elements in; in the first <code>DP</code> rule, the first element (the determiner) is not output.  The <code>%</code>s on the input elements show which element to copy features from to the phrase level, and the <code>%</code>s on the output elements show which element to copy missing features to from the phrase node on output.  Features are specified on phrases as tags (see information on attribute lists and output patterns below), so you can think of it as tag copying; also, different features can come from different elements (see <code>VP</code> example below).  The <code>_</code> is simply a blank (roughly a space) in the output.
  
To decide between the two DP rules, either specifying lemmas, weighting, or using conditionals would need to be used (specifying lemmas is discussed below, the rest are discussed later).
+
To decide between the first two DP rules, either specifying lemmas, weighting, or using conditionals would need to be used (specifying lemmas is discussed below, the rest are discussed later).  By default, the first one will be applied, which in this case is the one we want.
  
 
Similarly to how nominals are dealt with above, verbs ({{tag|v}}, {{tag|vblex}}, etc.) and auxiliaries ({{tag|vaux}}) can combine in various ways to combine into a <code>vP</code> (first-level verb phrase) in both languages, although when English uses "do" auxiliaries, Kyrgyz does not use an auxiliary, and Kyrgyz encodes the equivalent of a "not" adverbial as {{tag|neg}} on the main verb.
 
Similarly to how nominals are dealt with above, verbs ({{tag|v}}, {{tag|vblex}}, etc.) and auxiliaries ({{tag|vaux}}) can combine in various ways to combine into a <code>vP</code> (first-level verb phrase) in both languages, although when English uses "do" auxiliaries, Kyrgyz does not use an auxiliary, and Kyrgyz encodes the equivalent of a "not" adverbial as {{tag|neg}} on the main verb.
  
  vP -> do@AuxP.$tense.$polarity v.inf.$lemh { %2 } ;
+
  vP -> do@AuxP.$tense.$polarity v.*.inf.$lemh.$transitivity { %2 } ;
  
Here the <code>.$tense</code>, <code>.$polarity</code>, and <code>.$lemh</code> tell which element to get attributes (features) of those names from, acting together as <code>%</code>.  Specifying <code>.inf</code> requires the {{tag|v}} to have also an {{tag|inf}} tag, and specifying <code>do@</code> requires the <code>AuxP</code> to have a lemma of <code>do</code>, which would have been obtained in the parse using a rule like this:
+
Here the <code>.$tense</code>, <code>.$polarity</code>, <code>.$lemh</code>, and <code>.$transitivity</code> specifications tell which element to get each of those named attributes (features) from, acting together as <code>%</code> (which will get anything not specified with a <code>$</code> attribute).  Specifying <code>.*.inf</code> requires the {{tag|v}} to have also an {{tag|inf}} tag after any other tags, and specifying <code>do@</code> requires the <code>AuxP</code> to have a lemma of <code>do</code>, which would have been obtained in the parse using a rule like this:
  
  AuxP -> %vaux { %1 } |
+
  AuxP -> %vbdo.$lemh/sl not@adv [$polarity=neg] { } |  
         %vbdo { %1 } ;
+
         %vbdo.$lemh/sl { } ;
  
The first difference in ordering is that a <code>vP</code> and a <code>DP</code>, while both being parsed into a <code>VP</code> (a top-level verb phrase), occur in different orders.  If translating from English to Kyrgyz, a rule would need to parse a sequence of <code>vP DP</code> into <code>VP</code>, but output the components in the reverse order. This rule would look something like this:
+
Here the <code>.$lemh/sl</code> part ensures the <code>AuxP</code> gets its lemma from the source language (SL) lemma.  This <code>AuxP</code> rule also matches sequences of a do auxiliary and <code>not{{tag|adv}}</code> and sets the polarity of the <code>AuxP</code> to <code>neg</code> upon such a match.
  
VP -> %vP DP { %2 _ 1 } ;
+
The first difference in ordering is that a <code>vP</code> and a <code>DP</code>, while both being parsed into a <code>VP</code> (a top-level verb phrase), occur in different orders.  If translating from English to Kyrgyz, a rule would need to parse a sequence of <code>vP DP</code> into <code>VP</code>, but output the components in the reverse order, and set the case of the <code>DP</code> to {{tag|acc}} (how direct objects are marked in Kyrgyz).  This rule would look something like this:
  
In both languages, a {{tag|prn}} is parsed into a <code>DP</code>, and a <code>DP</code> and a <code>VP</code> combine to form an <code>S</code> (sentence) in the same order:
+
VP -> %vP DP { 2[case=acc] _ %1 } ;
  
  S -> DP VP { 1 _ 2 } ;
+
In both languages, a <code>DP</code> and a <code>VP</code> combine to form an <code>S</code> (sentence) in the same order:
 +
 
 +
  S -> DP.$person.$number VP { 1 _ %2 } ;
 +
 
 +
This rule gets the <code>person</code> and <code>number</code> attributes from the <code>DP</code>, and makes sure the verb (<code>2</code>) gets those attributes on output (using <code>%</code>).
  
 
In addition to these rules, each POS and phrase type needs an '''output pattern'''.  These patterns would look something like this for this English-to-Kyrgyz example:
 
In addition to these rules, each POS and phrase type needs an '''output pattern'''.  These patterns would look something like this for this English-to-Kyrgyz example:
  
  S: _ ;
+
  S: _.person.number ;
  DP: _.number.possession.case ;
+
  DP: _.person.number.possession.case ;
 
  NP: _.number.possession.case ;
 
  NP: _.number.possession.case ;
 
  n: _.number.possession.case ;
 
  n: _.number.possession.case ;
  VP: _.polarity.tense.person.number ;
+
prn: % ;
  vP: _.polarity.tense.person.number ;
+
  VP: _.transitivity.polarity.tense.person.number ;
  v: _.polarity.tense.person.number ;
+
  vP: _.transitivity.polarity.tense.person.number ;
 +
  v: _.transitivity.polarity.tense.person.number ;
 
  vaux: _.polarity.tense.person.number ;
 
  vaux: _.polarity.tense.person.number ;
 
  vbdo: _.polarity.tense.person.number ;
 
  vbdo: _.polarity.tense.person.number ;
Line 62: Line 70:
 
  person = (PD p3) p1 p2 p3 PD ;
 
  person = (PD p3) p1 p2 p3 PD ;
 
  number = (ND sg) sg pl sp ND ;
 
  number = (ND sg) sg pl sp ND ;
 +
polarity = (PolD "") neg PolD ;
 +
 +
For the most part these definitions are simple lists, but the first element in <code>()</code>s defines a filler in parsing if no other information is available (e.g., <code>ND</code>, for "Number to be Determined") and the default to replace it with in output if no value has been set (<code>""</code> for empty).
 +
 +
'''This example''' is essentially complete for this sentence, and can be [https://github.swarthmore.edu/Ling073-sp21/ling073-kir-eng/blob/master/apertium-kir-eng.eng-kir.rtx viewed in its entirety or tested in this repository].  To deal with other types of things related to this example or otherwise needed between these languages, additional patterns, modifications to the existing patterns, or advanced features would need to be used.
 +
 +
To see what the transfer stage is doing, you can do the following:
  
For the most part these definitions are simple lists, but the first element in <code>()</code>s defines a filler in parsing if no other information is available (e.g., <code>ND</code>, for "Number to be Determined") and the default to replace it with in output if no value has been set.
+
$ echo "I did not see the houses." | apertium -d . eng-kir-transfer
  
This example is not complete, but it does a lot of the basic work.  Everything else would need to use advanced features.
+
The last line of the output is what's being sent to the generator; in this case the following:
 +
 
 +
^Мен<prn><pers><p1><sg><nom>$ ^үй<n><pl><acc>$ ^көр<v><tv><neg><past><p1><sg>$^.<sent>$^.<sent>$
 +
 
 +
Which is generated correctly as
 +
 
 +
Мен үйлөрдү көргөн эмесмин.
 +
 
 +
To see the parse tree (before things are adjusted for output), you can get the output of lexical selection and feed it into <code>rtx-proc -T</code>:
 +
 
 +
$ echo "I did not see the houses" | apertium -d . eng-kir-lex | rtx-proc -T eng-kir.rtx.bin
 +
 
 +
In this case, the output looks like the following:
 +
<pre>
 +
^Default<S><p1><sg>{
 +
        ^Мен<DP><p1><sg><nom>{
 +
                ^Prpers<prn><subj><p1><mf><sg>/Мен<prn><pers><p1><sg><nom>$
 +
        }$
 +
        ^көр<VP><tv><neg><past><PD><ND>{
 +
                ^көр<vP><tv><neg><past><PD><ND>{
 +
                        ^do<AuxP><neg><past><PD><ND>{
 +
                                ^do<vbdo><past>/кыл<v><tv><past>$
 +
                                ^not<adv>/$
 +
                        }$
 +
                        ^see<v><tv><inf>/көр<v><tv><inf>$
 +
                }$
 +
                ^үй<DP><PD><pl>{
 +
                        ^the<det><def><sp>/$
 +
                        ^үй<NP><pl>{
 +
                                ^house<n><pl>/үй<n><pl>$
 +
                        }$
 +
                }$
 +
        }$
 +
}$
 +
^.<sent>/.<sent>$
 +
</pre>
 +
 
 +
==== Example: Kyrgyz-to-English ====
 +
 
 +
In the reverse of this example, we would want to add "did not" to the English output.
 +
 
 +
These words need to be added manually based on the polarity (negative) and tense (past) attributes of the verb.  They need to be added at the <code>vP</code> level, since that's where they adjoin the <code>v</code> in the desired English version of the tree.
 +
 
 +
<pre>
 +
vP -> %v.*.neg.past.* { do@vbdo.past _ not@adv _ %1[tense=inf] } ;
 +
</pre>
  
 
==== Example: English-to-Spanish ====
 
==== Example: English-to-Spanish ====
  
[[File:Tree alignment eng-spa PP.png|thumb|The mapping between phrase-structure trees of "in the big beautiful houses" (English) and "en las casas largas y bonitas" (Spanish)]]
+
[[File:Tree alignment eng-spa PP.png|thumb|The mapping between phrase-structure trees of "in the beautiful spacious houses" (English) and "en las casas bonitas y amplias" (Spanish)]]
 
 
{{comment|under construction}}
 
  
 +
The English-to-Spanish phrase pair "in the beautiful spacious houses" = "en las casas bonitas y amplias" is shown in the image.
  
==== The structure of a transfer file ====
+
This phrase is a good example to walk through together in class or on your own.  The following will need to be accounted for:
{{comment|under construction}}
+
* The order of <code>AdjP</code> and <code>NP</code> within <code>DP</code>,
 +
* The number and gender agreement on <code>det</code> and <code>adj</code>s,
 +
* The addition of "y" between two adjectives in Spanish as compared to English.
  
 
==== Some things to note ====
 
==== Some things to note ====
 +
Some advanced features include
 
* weighting
 
* weighting
{{comment|under construction}}
+
* conditionals
 +
* macros
 +
 
 +
These let you do some useful things that aren't possible otherwise.  See the documentation or ask your prof or TA about it :)
  
 
==== Examples of implemented Apertium transfer systems ====
 
==== Examples of implemented Apertium transfer systems ====
 
Some examples are available:
 
Some examples are available:
* [https://github.swarthmore.edu/Ling073-sp21/apertium-eng-spa eng-spa] (in-class): a basic example from class showing how to transfer adjective+noun from English to Spanish ("big houses → casas largas": number and gender agreement and reordering) using chunking (chunker+interchunk).
+
* [https://github.swarthmore.edu/Ling073-sp21/ling073-eng-spa eng-spa] (in-class): a basic example from class showing how to transfer adjective+noun (etc.) from English to Spanish ("big houses → casas largas": number and gender agreement and reordering).
* [https://github.com/apertium/apertium-eng-kir/blob/master/apertium-eng-kir.kir-eng.rtx eng-kir] (apertium)
+
* [https://github.com/apertium/apertium-recursive/blob/master/eng-spa.rtx eng-spa]: a more extensive English-to-Spanish example.
* [https://github.com/apertium/apertium-kaz-kir/blob/master/apertium-kaz-kir.kaz-kir.rtx kaz-kir] (apertium)
+
* [https://github.com/apertium/apertium-eng-kir/blob/master/apertium-eng-kir.kir-eng.rtx eng-kir] (apertium): English-to-Kyrgyz, lots of conditionals
* br-fr
+
* [https://github.com/apertium/apertium-kaz-kir/blob/master/apertium-kaz-kir.kaz-kir.rtx kaz-kir] (apertium): Kazakh-to-Kyrgyz, lots of macros
{{comment|under construction}}
+
* [https://github.com/apertium/apertium-uzb-kaa/blob/master/apertium-uzb-kaa.uzb-kaa.rtx uzb-kaa] (apertium): Uzbek-to-Qaraqalpaq, lots of macros
 +
* [https://github.com/apertium/apertium-br-fr/blob/rtx/apertium-br-fr.br-fr.rtx br-fr] (apertium): Breton-to-French, uses lemma lists and weighting
  
 
==== Writing rules ====
 
==== Writing rules ====
Documentation is available...
+
Fairly extensive documentation is available on the Apertium wiki:
 
+
* [[:apertium:Apertium-recursive/Formalism|The formalism]]
{{comment|under construction}}
+
* [[:apertium:Apertium-recursive/Cookbook|Something of a HOWTO]]
  
 
== Evaluating ==
 
== Evaluating ==
 
==== Scrape a mini test corpus ====
 
==== Scrape a mini test corpus ====
# First make sure you have <code>scrapeTransferTests</code>.  Test that running <code>scrapeTransferTests</code> gives you information on using the tool. If not, clone the [[Misc tools|tools repo]] (or <code>git pull</code> to update it, if you already have it cloned from other assignments) and run <code>sudo make</code>.  Test again.
+
# First make sure you have <code>scrapeTransferTests</code>.  It should already be on the lab computers, but you can test that running <code>scrapeTransferTests</code> gives you information on using the tool.
 +
#* If you don't have it, clone the [[Misc tools|tools repo]] (or <code>git pull</code> to update it, if you already have it cloned from other assignments) and run <code>make</code> or <code>sudo make</code>.  Test again.
 
# Scrape the transferTests from your contrastive grammar page into a small parallel corpus.  E.g., <code>scrapeTransferTests -p abc-xyz "Language1_and_Language2/Contrastive_Grammar"</code> will result in an <code>abc.tests.txt</code> and <code>xyz.tests.txt</code> file that contain the respective sides of any transferTests on your contrastive grammar page specified as being for abc-to-xyz translation.
 
# Scrape the transferTests from your contrastive grammar page into a small parallel corpus.  E.g., <code>scrapeTransferTests -p abc-xyz "Language1_and_Language2/Contrastive_Grammar"</code> will result in an <code>abc.tests.txt</code> and <code>xyz.tests.txt</code> file that contain the respective sides of any transferTests on your contrastive grammar page specified as being for abc-to-xyz translation.
 
# '''Add these two files to your bilingual corpus repository''' and add mention of their origin (the wiki page) to the <code>MANIFEST</code> file.
 
# '''Add these two files to your bilingual corpus repository''' and add mention of their origin (the wiki page) to the <code>MANIFEST</code> file.
Line 106: Line 173:
  
 
To test WER and PER:
 
To test WER and PER:
# First make sure you have <code>apertium-eval-translator</code>.  Test that running <code>apertium-eval-translator</code> gives you information on using the tool. If not, clone the [[Misc tools|tools repo]] (or <code>git pull</code> to update it, if you already have it cloned from [[Morphological analyser/Setup|other]] [[MorphTests2yaml|assignments]]) and run <code>make</code>.
+
# First make sure you have <code>apertium-eval-translator-line</code>.  It should already be on the lab computers, but you can test that running <code>apertium-eval-translator-line</code> gives you information on using the tool.
 +
#* If you don't have it, clone the [[Misc tools|tools repo]] (or <code>git pull</code> to update it, if you already have it cloned from [[Morphological analyser/Setup|other]] [[MorphTests2yaml|assignments]]) and run <code>make</code>.
 
# You need two files: one '''test translation''', and one '''reference translation'''.  The reference translation is the parallel text in your corpus, e.g. <code>abc.tests.txt</code>.  To get a test translation, run the source text through apertium and direct the output into a new file, e.g. <code>cat xyz.tests.txt | apertium -d . xyz-abc > xyz-abc.tests.txt</code>.  You should '''add the [final] test translation to your repository'''.
 
# You need two files: one '''test translation''', and one '''reference translation'''.  The reference translation is the parallel text in your corpus, e.g. <code>abc.tests.txt</code>.  To get a test translation, run the source text through apertium and direct the output into a new file, e.g. <code>cat xyz.tests.txt | apertium -d . xyz-abc > xyz-abc.tests.txt</code>.  You should '''add the [final] test translation to your repository'''.
 
# The following command should then give you WER and PER measures and some other useful numbers:
 
# The following command should then give you WER and PER measures and some other useful numbers:
Line 115: Line 183:
  
 
=== Getting set up ===
 
=== Getting set up ===
# Add a page to the wiki called <code>Language1_and_Language2/Structural_transfer</code>, linking to it from the main page on the language pair.
+
'''Add a page to the wiki''' called <code>Language1_and_Language2/Structural_transfer</code>, linking to it from the main page on the language pair.
#* Put the page in the category [[:Category:Sp21_StructuralTransfer]] and the categories for the two languages.
+
* Put the page in the category [[:Category:Sp21_StructuralTransfer]] and the categories for the two languages.
#* Perform WER, PER, and coverage tests on your short sentences corpus, and add this in to a <code>pre-evaluation</code> section.
+
* '''Before you start development''' of structural transfer, perform WER, PER, and coverage tests on your short sentences corpus, and add this in to a <code>pre-evaluation</code> section.
  
 
=== Adding stems ===
 
=== Adding stems ===
# Add all the words for the transfer tests (from [[contrastive grammar|the last assignment]]) to analyse to bilingual dictionary.
+
'''Add all the words''' needed for the transfer tests (from [[contrastive grammar|the last assignment]]) to analyse to the bilingual dictionary.
#* And make sure both analysers can analyse all sentences correctly, which includes adding the words to the relevant monolingual dictionaries as necessary.
+
* And make sure both analysers can analyse all sentences correctly, which includes adding the words to the relevant monolingual dictionaries as necessary.
  
 
=== Write structural transfer rules ===
 
=== Write structural transfer rules ===
# Implement at least one item from your [[contrastive grammar]].
+
'''Implement at least one item''' from your [[contrastive grammar]].
#* Each person in each group should implement at least one item for the direction that translates into the language that they have been primarily working with.  The same item does not need to be used for each direction.
+
* (If the group is working on two translation directions, then each person in each group should implement at least one item for the direction that translates into the language that they have been primarily working with.  The same item does not need to be used for each direction.)
#* If the contrastive grammar item only involves relabelling or reordering tags within the same form, then please do at least two items.
+
<!-- * If the contrastive grammar item only involves relabelling or reordering tags within the same form, then please do at least two items. -->
  
 
=== Wrapping up ===
 
=== Wrapping up ===
# Add to your structural transfer wiki page:
+
'''Add to your structural transfer wiki page''':
#* Add at least one example sentence for each item you implement.  Show the outputs of the following modes for your translation system: tagger, biltrans, transfer, and the pair itself (abc-xyz).
+
* Add at least one example sentence for each item you implement.  Show the outputs of the following modes for your translation system: tagger, lex, transfer, and the pair itself (abc-xyz).
#* Perform WER, PER, and coverage tests again, and add into a post-evaluation section on the wiki page.
+
* Perform WER, PER, and coverage tests again, and add into a post-evaluation section on the wiki page.
  
  
 
[[Category:Assignments]]
 
[[Category:Assignments]]
 
[[Category:Structural transfer]]
 
[[Category:Structural transfer]]

Revision as of 14:18, 10 May 2021

Background

The basic idea of structural transfer in RBMT

The idea of structural transfer in RBMT is to deal with the order and tag differences encountered in translation between two languages.

The arrows between the two tagged levels represent where structural transfer is needed. Colour coding shows [rough] correspondences.

How structural transfer works in Apertium

Transfer takes the output of the lex mode (lexical selection), matches series of words based on patterns you define, and performs operations on and outputs those things. It allows you to change the order of words, change tags, add and remove words, etc.

Syntactic Structures and Parsing

The way Apertium's recursive structural transfer system works is to parse and combine phrases, and then to output each parsed phrase. Rules specify what words or phrases are parsed together into phrases and how they're output.

In other words, at a basic level the recursive structural transfer system first turns analyses from the source language into a tree according to parse rules, and then outputs that tree according to both output rules and patterns for each node. You can think of it as building a tree, going up from the words, and then working its way down from the top to output forms again.
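The "build the tree going up, then output going down" idea can be illustrated with a toy sketch in Python. This is not how rtx-proc is implemented: the rule table, the greedy reduction strategy, and the names Node, RULES, parse and output are all assumptions made up for illustration, and the sentence is simplified to "I see the houses" (no auxiliary or negation).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                  # POS tag or phrase name, e.g. "n" or "NP"
    word: str = ""              # surface word, only set on leaves
    children: list = field(default_factory=list)


# Hypothetical toy rules: child labels -> (parent label, output order).
# The output order lists child indices (0-based); children left out are dropped.
RULES = {
    ("n",): ("NP", [0]),
    ("det", "NP"): ("DP", [1]),      # the determiner is not output
    ("prn",): ("DP", [0]),
    ("v",): ("vP", [0]),
    ("vP", "DP"): ("VP", [1, 0]),    # object is output before the verb
    ("DP", "VP"): ("S", [0, 1]),
}


def parse(leaves):
    """Going up: keep reducing the leftmost matching span (pairs before singles)."""
    nodes = list(leaves)
    changed = True
    while changed:
        changed = False
        for size in (2, 1):
            for i in range(len(nodes) - size + 1):
                key = tuple(n.label for n in nodes[i:i + size])
                if key in RULES:
                    parent, _ = RULES[key]
                    nodes[i:i + size] = [Node(parent, children=nodes[i:i + size])]
                    changed = True
                    break
            if changed:
                break
    return nodes


def output(node):
    """Going down: emit each phrase's children in the order its rule specifies."""
    if not node.children:
        return [node.word]
    _, order = RULES[tuple(child.label for child in node.children)]
    words = []
    for index in order:
        words.extend(output(node.children[index]))
    return words


leaves = [Node("prn", "I"), Node("v", "see"), Node("det", "the"), Node("n", "houses")]
roots = parse(leaves)
print(roots[0].label, output(roots[0]))
```

The words come out in English here only because the sketch skips lexical transfer; the point is the order (subject, object, verb) and the dropped determiner.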

Example: English-to-Kyrgyz

The mapping between phrase-structure trees of the Kyrgyz and English sentence above ("I did not see the houses")

In the English-to-Kyrgyz example for "I did not see the houses", in both languages a noun (<n>) is parsed into an NP (noun phrase), and an NP can combine with a determiner (<det>) to form a DP (determiner phrase), although Kyrgyz doesn't have a definite article. They can also both form a DP from just a pronoun. These rules can be written as follows:

NP -> %n { %1 } ;
DP -> det %NP { %2 } |
      det %NP { 1 _ %2 } |
      %prn { %1 } ;

The beginning of each rule describes what kind of phrase to build (NP and DP, respectively). The part between the arrow (->) and the {}s is the material to parse, in this case specified as POS tags (<n>, <det>) or other phrases (NP). The part inside the {}s shows the order to output the elements in; in the first DP rule, the first element (the determiner) is not output. The %s on the input elements show which element to copy features from to the phrase level, and the %s on the output elements show which element to copy missing features to from the phrase node on output. Features are specified on phrases as tags (see information on attribute lists and output patterns below), so you can think of it as tag copying; also, different features can come from different elements (see VP example below). The _ is simply a blank (roughly a space) in the output.

To decide between the first two DP rules, you would need to specify lemmas, use weighting, or use conditionals (specifying lemmas is discussed below, the rest are discussed later). By default, the first one will be applied, which in this case is the one we want.

Similarly to how nominals are dealt with above, verbs (<v>, <vblex>, etc.) and auxiliaries (<vaux>) can combine in various ways to form a vP (first-level verb phrase) in both languages, although where English uses a "do" auxiliary, Kyrgyz does not use an auxiliary, and Kyrgyz encodes the equivalent of a "not" adverbial as <neg> on the main verb.

vP -> do@AuxP.$tense.$polarity v.*.inf.$lemh.$transitivity { %2 } ;

Here the .$tense, .$polarity, .$lemh, and .$transitivity specifications tell which element to get each of those named attributes (features) from, acting together as % (which will get anything not specified with a $ attribute). Specifying .*.inf requires the <v> to also have an <inf> tag after any other tags, and specifying do@ requires the AuxP to have a lemma of do, which would have been obtained in the parse using a rule like this:

AuxP -> %vbdo.$lemh/sl not@adv [$polarity=neg] { } | 
        %vbdo.$lemh/sl { } ;

Here the .$lemh/sl part ensures the AuxP gets its lemma from the source language (SL) lemma. This AuxP rule also matches sequences of a do auxiliary and not<adv> and sets the polarity of the AuxP to neg upon such a match.

The first difference in ordering is that a vP and a DP, while both being parsed into a VP (a top-level verb phrase), occur in different orders. If translating from English to Kyrgyz, a rule would need to parse a sequence of vP DP into VP, but output the components in the reverse order, and set the case of the DP to <acc> (how direct objects are marked in Kyrgyz). This rule would look something like this:

VP -> %vP DP { 2[case=acc] _ %1 } ;

In both languages, a DP and a VP combine to form an S (sentence) in the same order:

S -> DP.$person.$number VP { 1 _ %2 } ;

This rule gets the person and number attributes from the DP, and makes sure the verb (2) gets those attributes on output (using %).
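The effect of pulling person and number up from the DP and pushing them down onto the VP can be sketched in Python as plain dictionary copying. This is only an illustration under assumptions: the function name fill_missing and the dictionary representation are hypothetical, and the placeholder names PD and ND are taken from the attribute definitions on this page.

```python
# Placeholder values that mean "not determined yet" for each attribute
# (names taken from the attribute definitions on this page).
PLACEHOLDERS = {"person": "PD", "number": "ND"}


def fill_missing(child, parent):
    """Return a copy of child where absent or placeholder features come from parent."""
    filled = dict(child)
    for name, value in parent.items():
        current = filled.get(name)
        if current is None or current == PLACEHOLDERS.get(name):
            filled[name] = value
    return filled


# Features on the two daughters of S after parsing "I did not see the houses".
dp = {"person": "p1", "number": "sg", "case": "nom"}
vp = {"transitivity": "tv", "polarity": "neg", "tense": "past",
      "person": "PD", "number": "ND"}

# "DP.$person.$number": the S node takes these two attributes from the DP.
s = {name: dp[name] for name in ("person", "number")}

# "%2": on output, the VP receives whatever it is still missing from S.
print(fill_missing(vp, s))
```

Features the VP already has (tense, polarity, transitivity) are left alone; only the two placeholders are overwritten.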

In addition to these rules, each POS and phrase type needs an output pattern. These patterns would look something like this for this English-to-Kyrgyz example:

S: _.person.number ;
DP: _.person.number.possession.case ;
NP: _.number.possession.case ;
n: _.number.possession.case ;
prn: % ;
VP: _.transitivity.polarity.tense.person.number ;
vP: _.transitivity.polarity.tense.person.number ;
v: _.transitivity.polarity.tense.person.number ;
vaux: _.polarity.tense.person.number ;
vbdo: _.polarity.tense.person.number ;
AuxP: _.polarity.tense.person.number ;
det: %;

These output patterns define the order in which to arrange attributes (sets of tags, which also need to be defined). The _ represents a lemma followed by the main POS. Attributes are defined as follows:

person = (PD p3) p1 p2 p3 PD ;
number = (ND sg) sg pl sp ND ;
polarity = (PolD "") neg PolD ;

For the most part these definitions are simple lists, but the first element in ()s defines a filler in parsing if no other information is available (e.g., ND, for "Number to be Determined") and the default to replace it with in output if no value has been set ("" for empty).
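As a rough illustration of how an output pattern and the attribute defaults interact, the following Python sketch renders one lexical unit. It is a sketch under assumptions, not rtx-proc's actual behaviour: the function render and the DEFAULTS table are hypothetical, and attributes with no value are simply skipped here.

```python
# Defaults from the attribute definitions: placeholder -> value used on output.
DEFAULTS = {"PD": "p3", "ND": "sg", "PolD": ""}


def render(lemma, pos, features, pattern):
    """Arrange tags in the order given by pattern; "_" stands for lemma plus main POS."""
    tags = []
    for part in pattern.split("."):
        if part == "_":
            tags.append(pos)
            continue
        value = features.get(part, "")
        value = DEFAULTS.get(value, value)   # replace a leftover placeholder
        if value:                            # an empty value produces no tag
            tags.append(value)
    return "^" + lemma + "".join("<" + tag + ">" for tag in tags) + "$"


pattern = "_.transitivity.polarity.tense.person.number"

# All attributes set: matches the verb sent to the generator in this example.
print(render("көр", "v", {"transitivity": "tv", "polarity": "neg",
                          "tense": "past", "person": "p1", "number": "sg"}, pattern))

# Placeholders left over: they fall back to p3, sg, and no polarity tag.
print(render("көр", "v", {"transitivity": "tv", "polarity": "PolD",
                          "tense": "past", "person": "PD", "number": "ND"}, pattern))
```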

This example is essentially complete for this sentence, and can be viewed in its entirety or tested in this repository. To handle other constructions, whether related to this example or otherwise needed between these languages, you would need additional patterns, modifications to the existing patterns, or advanced features.

To see what the transfer stage is doing, you can do the following:

$ echo "I did not see the houses." | apertium -d . eng-kir-transfer

The last line of the output is what's being sent to the generator; in this case the following:

^Мен<prn><pers><p1><sg><nom>$ ^үй<n><pl><acc>$ ^көр<v><tv><neg><past><p1><sg>$^.<sent>$^.<sent>$

Which is generated correctly as

Мен үйлөрдү көргөн эмесмин.
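If you want to inspect the transfer output programmatically, a line like the one above can be split into lemmas and tags. The following is a minimal Python sketch, assuming only the simple ^lemma<tag>...$ form shown here; it does not handle the slashes, braces, or escaped characters that appear in other stages of the pipeline, and read_units is a hypothetical helper name.

```python
import re

# One lexical unit in the transfer output: ^lemma<tag1><tag2>...$
LEXICAL_UNIT = re.compile(r"\^([^<$]*)((?:<[^>]+>)*)\$")


def read_units(line):
    """Return (lemma, [tags]) for each lexical unit on a line of output."""
    units = []
    for lemma, tag_string in LEXICAL_UNIT.findall(line):
        units.append((lemma, re.findall(r"<([^>]+)>", tag_string)))
    return units


line = ("^Мен<prn><pers><p1><sg><nom>$ ^үй<n><pl><acc>$ "
        "^көр<v><tv><neg><past><p1><sg>$^.<sent>$^.<sent>$")
for lemma, tags in read_units(line):
    print(lemma, tags)
```

This makes it easy to check, for example, that the object noun carries acc and that the verb has picked up p1 and sg from the subject.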

To see the parse tree (before things are adjusted for output), you can get the output of lexical selection and feed it into rtx-proc -T:

$ echo "I did not see the houses" | apertium -d . eng-kir-lex | rtx-proc -T eng-kir.rtx.bin 

In this case, the output looks like the following:

^Default<S><p1><sg>{
        ^Мен<DP><p1><sg><nom>{
                ^Prpers<prn><subj><p1><mf><sg>/Мен<prn><pers><p1><sg><nom>$
        }$
        ^көр<VP><tv><neg><past><PD><ND>{
                ^көр<vP><tv><neg><past><PD><ND>{
                        ^do<AuxP><neg><past><PD><ND>{
                                ^do<vbdo><past>/кыл<v><tv><past>$
                                ^not<adv>/$
                        }$
                        ^see<v><tv><inf>/көр<v><tv><inf>$
                }$
                ^үй<DP><PD><pl>{
                        ^the<det><def><sp>/$
                        ^үй<NP><pl>{
                                ^house<n><pl>/үй<n><pl>$
                        }$
                }$
        }$
}$
^.<sent>/.<sent>$

Example: Kyrgyz-to-English

In the reverse of this example, we would want to add "did not" to the English output.

These words need to be added manually based on the polarity (negative) and tense (past) attributes of the verb. They need to be added at the vP level, since that's where they adjoin the v in the desired English version of the tree.

vP -> %v.*.neg.past.* { do@vbdo.past _ not@adv _ %1[tense=inf] } ;

Example: English-to-Spanish

The mapping between phrase-structure trees of "in the beautiful spacious houses" (English) and "en las casas bonitas y amplias" (Spanish)

The English-to-Spanish phrase pair "in the beautiful spacious houses" = "en las casas bonitas y amplias" is shown in the image.

This phrase is a good example to walk through together in class or on your own. The following will need to be accounted for:

  • The order of AdjP and NP within DP,
  • The number and gender agreement on det and adjs,
  • The addition of "y" between two adjectives in Spanish as compared to English.

Some things to note

Some advanced features include

  • weighting
  • conditionals
  • macros

These let you do some useful things that aren't possible otherwise. See the documentation or ask your prof or TA about it :)

Examples of implemented Apertium transfer systems

Some examples are available:

  • eng-spa (in-class): a basic example from class showing how to transfer adjective+noun (etc.) from English to Spanish ("big houses → casas largas": number and gender agreement and reordering).
  • eng-spa: a more extensive English-to-Spanish example.
  • eng-kir (apertium): English-to-Kyrgyz, lots of conditionals
  • kaz-kir (apertium): Kazakh-to-Kyrgyz, lots of macros
  • uzb-kaa (apertium): Uzbek-to-Qaraqalpaq, lots of macros
  • br-fr (apertium): Breton-to-French, uses lemma lists and weighting

Writing rules

Fairly extensive documentation is available on the Apertium wiki:

Evaluating

Scrape a mini test corpus

  1. First make sure you have scrapeTransferTests. It should already be on the lab computers; to check, run scrapeTransferTests and confirm that it gives you information on using the tool.
    • If you don't have it, clone the tools repo (or git pull to update it, if you already have it cloned from other assignments) and run make or sudo make. Test again.
  2. Scrape the transferTests from your contrastive grammar page into a small parallel corpus. E.g., scrapeTransferTests -p abc-xyz "Language1_and_Language2/Contrastive_Grammar" will result in an abc.tests.txt and xyz.tests.txt file that contain the respective sides of any transferTests on your contrastive grammar page specified as being for abc-to-xyz translation.
  3. Add these two files to your bilingual corpus repository and add mention of their origin (the wiki page) to the MANIFEST file.

WER and PER

WER, or word error rate, is a measure of how different two texts are. You will want to know how different the translation that your translation pair produces (the "test translation") is from the known good translation of the phrases in your parallel corpus (the "reference translation").

PER (position-independent error rate) is the same measurement, just not sensitive to position in a phrase. I.e., a correct translation of every word but in an entirely wrong word order will give you high (bad) WER but low (good) PER.
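To make the difference concrete, here is a minimal Python sketch of both measures. It assumes one common pair of definitions (word-level edit distance divided by reference length for WER, and a bag-of-words comparison for PER); apertium-eval-translator may differ in details such as tokenisation, so treat the numbers as illustrative. It also assumes the reference is not empty.

```python
from collections import Counter


def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # word missing from test
                            curr[j - 1] + 1,      # extra word in test
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, test):
    """Word error rate: edits needed, relative to the reference length."""
    ref, hyp = reference.split(), test.split()
    return word_edit_distance(ref, hyp) / len(ref)


def per(reference, test):
    """Position-independent error rate: compares the words as unordered bags."""
    ref, hyp = reference.split(), test.split()
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


reference = "I did not see the houses"
test = "the houses I did not see"
print(wer(reference, test), per(reference, test))
```

Here every word is right but two of them are in the wrong place, so WER is 4/6 (about 0.67) while PER is 0.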

To test WER and PER:

  1. First make sure you have apertium-eval-translator-line. It should already be on the lab computers; to check, run apertium-eval-translator-line and confirm that it gives you information on using the tool.
    • If you don't have it, clone the tools repo (or git pull to update it, if you already have it cloned from other assignments) and run make.
  2. You need two files: one test translation, and one reference translation. The reference translation is the parallel text in your corpus, e.g. abc.tests.txt. To get a test translation, run the source text through apertium and direct the output into a new file, e.g. cat xyz.tests.txt | apertium -d . xyz-abc > xyz-abc.tests.txt. You should add the [final] test translation to your repository.
  3. The following command should then give you WER and PER measures and some other useful numbers:
    • apertium-eval-translator -r abc.tests.txt -t xyz-abc.tests.txt

The assignment

This assignment is due early in week 13 (this semester, noon on Monday, May 3, 2021).

Getting set up

Add a page to the wiki called Language1_and_Language2/Structural_transfer, linking to it from the main page on the language pair.

  • Put the page in the category Category:Sp21_StructuralTransfer and the categories for the two languages.
  • Before you start development of structural transfer, perform WER, PER, and coverage tests on your short sentences corpus, and add the results to a pre-evaluation section.

Adding stems

Add all the words needed for the transfer tests (from the last assignment) to the bilingual dictionary.

  • Also make sure both analysers can analyse all of the sentences correctly, adding the words to the relevant monolingual dictionaries as necessary.

Write structural transfer rules

Implement at least one item from your contrastive grammar.

  • (If the group is working on two translation directions, then each person in each group should implement at least one item for the direction that translates into the language that they have been primarily working with. The same item does not need to be used for each direction.)

Wrapping up

Add the following to your structural transfer wiki page:

  • At least one example sentence for each item you implement, showing the outputs of the following modes for your translation system: tagger, lex, transfer, and the pair itself (abc-xyz).
  • The results of a second round of WER, PER, and coverage tests, in a post-evaluation section.