Difference between revisions of "Fijian and English/Structural transfer"

From LING073

Revision as of 13:14, 13 April 2018

Initial Evaluation

  • eng → fij: WER 75.81%, PER 70.97%
  • fij → eng: WER 82.14%, PER 76.79%

eng → fij

I am working on the first contrastive grammar point: word order within NPs. In an English NP, the demonstrative and adjectives precede the head noun, whereas in Fijian all modifiers follow it. In addition, the Fijian article in front of the head noun is obligatory even when a demonstrative is present, but its English counterpart, the determiner the, cannot occur in an NP that has a demonstrative. In terms of tags, Fijian has no number tags on either nouns or demonstratives.

  • Example phrase: (eng) this big village → (fij) a ’oro levu yai
  • Current translation to Fijian: #yai levu #’oro
  • Taggers: ^this/this<det><dem><sg>$ ^big/big<adj><sint>$ ^village/village<n><sg>$^./.<sent>$
  • Biltrans: ^this<det><dem><sg>/yai<det><dem><sg>$ ^big<adj><sint>/levu<adj>$ ^village<n><sg>/’oro<n><sg>$^.<sent>/.<sent>$

Step 1: remove number tags and add a<art> in the nom chunk:

  • Chunker: ^dem<det><dem>{^yai<det><dem>$}$ ^adj<adj><sint>{^levu<adj>$}$ ^nom<n>{^a<art>$ ^’oro<n>$}$^sent<SENT>{^.<sent>$}$

Step 2: change word order:

  • Interchunk: ^nom<n>{^a<art>$ ^’oro<n>$}$ ^adj<adj><sint>{^levu<adj>$}$ ^dem<det><dem>{^yai<det><dem>$}$^sent<SENT>{^.<sent>$}$
  • Postchunk: ^a<art>$ ^’oro<n>$ ^levu<adj>$ ^yai<det><dem>$^.<sent>$
  • Translation: a ’oro levu yai
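
Step 2 can be written as an interchunk rule that outputs the three chunks in reverse order. The sketch below is an illustration, not necessarily the rule actually used. It assumes that the categories dem, adj and nom are declared in the section-def-cats of the eng → fij interchunk (.t2x) file and match the chunks produced in Step 1.

```xml
<!-- Sketch only: dem, adj and nom are assumed chunk categories. -->
<rule comment="REGLA: dem adj nom -> nom adj dem">
  <pattern>
    <pattern-item n="dem"/>
    <pattern-item n="adj"/>
    <pattern-item n="nom"/>
  </pattern>
  <action>
    <out>
      <chunk>  <!-- noun chunk (position 3) goes first -->
        <clip pos="3" part="lem"/>
        <clip pos="3" part="tags"/>
        <clip pos="3" part="chcontent"/>
      </chunk>
      <b/>
      <chunk>  <!-- adjective chunk stays in the middle -->
        <clip pos="2" part="lem"/>
        <clip pos="2" part="tags"/>
        <clip pos="2" part="chcontent"/>
      </chunk>
      <b/>
      <chunk>  <!-- demonstrative chunk (position 1) goes last -->
        <clip pos="1" part="lem"/>
        <clip pos="1" part="tags"/>
        <clip pos="1" part="chcontent"/>
      </chunk>
    </out>
  </action>
</rule>
```

This works in this direction because the chunker emits one chunk per word group, so the interchunk stage has three separate chunks to move.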

fij → eng

Same grammar point as above.

The relationship between the Fijian article a and the English definite determiner the was complicated to implement. For now, the bilingual dictionary always translates the article a to nothing. This works here and for most complex NPs, such as those with demonstratives and possessives, but not for NPs without any <det>, such as a ’oro levu (ART village big) or a ’oro (ART village). For these simple NPs, I will need to write separate rules for the two patterns in the fij-eng.t1x file.
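
One of the two simple-NP rules might look roughly like the sketch below, for the pattern ART N ADJ. It rests on several assumptions that have not been tested: that a should come out as the in these NPs, that the English analysis of the is the<det><def><sp>, and that the categories article, noun_reg and adj are the ones already used in fij-eng.t1x. Number tags are left out.

```xml
<!-- Hypothetical rule: a ’oro levu -> the big village -->
<rule comment="REGLA: art n adj -> the adj n">
  <pattern>
    <pattern-item n="article"/>
    <pattern-item n="noun_reg"/>
    <pattern-item n="adj"/>
  </pattern>
  <action>
    <out>
      <chunk name="nom" case="caseFirstWord">
        <tags>
          <tag><lit-tag v="SN"/></tag>
        </tags>
        <lu>
          <!-- the article is empty in the bidix, so "the" is inserted literally -->
          <lit v="the"/>
          <lit-tag v="det.def.sp"/>  <!-- assumed tags for English "the" -->
        </lu>
        <b/>
        <lu><clip pos="3" side="tl" part="whole"/></lu>  <!-- adjective -->
        <b/>
        <lu><clip pos="2" side="tl" part="whole"/></lu>  <!-- noun -->
      </chunk>
    </out>
  </action>
</rule>
```

Because transfer applies the longest matching pattern, the four-item rule for NPs with demonstratives should still take priority over this three-item one when a demonstrative follows.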

  • Example phrase: (fij) a ’oro levu yai → (eng) this big village
  • Current translation from Fijian to English: #village big #this
  • Taggers:
    - Fijian: ^a/a<art>$ ^’oro/’oro<n>$ ^levu/levu<adj>$ ^yai/yai<det><dem>$^./.<sent>$
    - English (desired): ^this/this<det><dem><sg>$ ^big/big<adj><sint>$ ^village/village<n><sg>$^./.<sent>$
  • Biltrans: ^a<art>/$ ^’oro<n>/village<n>$ ^levu<adj>/big<adj><sint>$ ^yai<det><dem>/this<det><dem>$^.<sent>/.<sent>$

Step 1: Fijian nouns and demonstratives do not distinguish number, so I want to add a number tag to the English noun and link it to the demonstrative this. The code in the fij-eng.t1x file:

<rule comment="REGLA: NPs with demonstratives">
      <pattern>
        <pattern-item n="article"/>
        <pattern-item n="noun_reg"/>
        <pattern-item n="adj"/>
        <pattern-item n="dem"/>
      </pattern>
      <action>
        <let>
          <var n="number"/>
          <clip pos="2" side="tl" part="a_num"/>
        </let>
        <out>
          <chunk name="nom" case="caseFirstWord">
            <tags>
              <tag><lit-tag v="SN"/></tag>
              <tag><var n="number"/></tag>
            </tags>
            <lu>
              <clip pos="2" side="tl" part="lem"/>
              <clip pos="2" side="tl" part="a_pos"/>
              <clip pos="2" side="tl" part="a_num"/>
            </lu>
            <b/>
            <lu>
              <clip pos="3" side="tl" part="whole"/>
            </lu>
            <b/>
            <lu>
              <clip pos="4" side="tl" part="lem"/>
              <clip pos="4" side="tl" part="a_pos"/>
              <clip pos="4" side="tl" part="subcategories"/>
              <clip pos="2" side="tl" part="a_num"/>
            </lu>
          </chunk>
        </out>
      </action>
</rule>

However, the code above does not work. The chunker output still has no number tags:

  • Chunker
$ echo "a ’oro levu yai" | apertium -d . fij-eng-chunker

apertium-transfer: Rule 2 a<art>/ ’oro<n>/village<n> levu<adj>/big<adj><sint> yai<det><dem>/this<det><dem>

apertium-transfer: Rule 1 .<sent>/.<sent>
^nom<SN>{^village<n>$ ^big<adj><sint>$ ^this<det><dem>$}$^sent<SENT>{^.<sent>$}$
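
A likely explanation for the missing tag: the bilingual dictionary gives village<n> with no number, so the target-side a_num clip of the noun is empty and there is nothing to copy into the variable or the output. A possible workaround is to assign a literal tag instead. The fragment below is a sketch of the action part only. It assumes that the variable number is declared in section-def-vars and that singular is an acceptable default for this NP.

```xml
<!-- Sketch only: "number" must be declared in <section-def-vars>;
     defaulting to <sg> is an assumption. -->
<action>
  <let>
    <var n="number"/>
    <lit-tag v="sg"/>  <!-- literal tag instead of the empty a_num clip -->
  </let>
  <out>
    <chunk name="nom" case="caseFirstWord">
      <tags>
        <tag><lit-tag v="SN"/></tag>
        <tag><var n="number"/></tag>
      </tags>
      <lu>
        <clip pos="2" side="tl" part="lem"/>
        <clip pos="2" side="tl" part="a_pos"/>
        <var n="number"/>  <!-- number tag taken from the variable -->
      </lu>
    </chunk>
  </out>
</action>
```

The same `<var n="number"/>` could replace the a_num clip in the `<lu>` of the demonstrative, so that both words carry the same number.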

My rule for changing the word order does not work either:

<rule comment="REGLA: n adj dem -> dem adj n">
      <pattern>
        <pattern-item n="article"/>
        <pattern-item n="noun_reg"/>
        <pattern-item n="adj"/>
        <pattern-item n="dem"/>
      </pattern>
      <action>
        <let>
          <clip pos="4" part="a_num"/>
          <clip pos="2" part="a_num"/>
        </let>
        <out>
          <chunk>
            <clip pos="4" part="lem"/>
            <clip pos="4" part="tags"/>
            <clip pos="4" part="chcontent"/>
          </chunk>
          <b/>
          <chunk>
            <clip pos="3" part="lem"/>
            <clip pos="3" part="tags"/>
            <clip pos="3" part="chcontent"/>
          </chunk>
          <b/>
          <chunk>
            <clip pos="2" part="lem"/>
            <clip pos="2" part="tags"/>
            <clip pos="2" part="chcontent"/>
          </chunk>
        </out>
      </action>
</rule>
  • Interchunk: ^nom<SN>{^village<n>$ ^big<adj><sint>$ ^this<det><dem>$}$^sent<SENT>{^.<sent>$}$ (the word order is unchanged)
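
A likely reason: by the interchunk stage the whole NP is already a single chunk, nom<SN>{...}, so a pattern of four word-level items has nothing to match. Interchunk rules can only move whole chunks, not the words inside one. Since the chunker rule builds that chunk, one option is to emit the `<lu>` elements in English order there. Below is a minimal sketch of the `<out>` part of the fij-eng.t1x rule. It keeps the original pattern positions (1 article, 2 noun, 3 adjective, 4 demonstrative) and leaves number handling out.

```xml
<!-- Sketch only: reorders the words inside the chunk at the chunker stage. -->
<out>
  <chunk name="nom" case="caseFirstWord">
    <tags>
      <tag><lit-tag v="SN"/></tag>
    </tags>
    <lu><clip pos="4" side="tl" part="whole"/></lu>  <!-- demonstrative -->
    <b/>
    <lu><clip pos="3" side="tl" part="whole"/></lu>  <!-- adjective -->
    <b/>
    <lu><clip pos="2" side="tl" part="whole"/></lu>  <!-- noun -->
  </chunk>
</out>
```

With this approach no separate interchunk rule is needed for the NP, because the words already leave the chunker in demonstrative, adjective, noun order.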

Post-evaluation