Difference between revisions of "Apertium-init"
Revision as of 13:57, 19 March 2019
You can use the apertium-init tool to create ("bootstrap") a directory for quick development of a transducer or translation pair.
Installing apertium-init
If needed, download apertium-init and install it:
- cd ~/ling073; git clone https://github.com/apertium/apertium-init; cd apertium-init; PREFIX=$HOME make install
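After installing, it can help to confirm that your shell will actually find the tool. The following is a minimal sketch, assuming the install step places apertium-init in $HOME/bin (an assumption about the install layout, not something this page states); path_has_dir is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if directory $2 is one of the entries in the
# colon-separated PATH-style string $1.
path_has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: PREFIX=$HOME installs the script into $HOME/bin.
if path_has_dir "$PATH" "$HOME/bin"; then
    echo "$HOME/bin is on your PATH"
else
    echo "$HOME/bin is not on your PATH; add it before running apertium-init"
fi
```

If the directory is missing from your PATH, the shell will report that apertium-init cannot be found even though the install succeeded.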
Creating a language module
Create an hfst-based apertium language module (in your ~/ling073 directory), replacing xyz with the ISO code of your language in all occurrences:
- apertium-init -a hfst xyz
Rename the module to ling073-xyz if you want (so that it matches what will be on github later, and any further instructions):
- mv apertium-xyz ling073-xyz
Notes:
- If you get an error about SVN and the directory not being a working copy, then you have an old version of apertium-init.
- If something goes wrong (e.g., you make a typo), delete any directories/files that were created and try the step again.
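Before retrying a failed step, you may want to see which directories were left behind. This is a rough sketch only: leftover_module_dirs is a hypothetical helper, and it assumes the only directories involved are the default apertium-xyz and the renamed ling073-xyz.

```shell
# Hypothetical helper: list leftover module directories for language code $2
# inside parent directory $1, under either the default or the renamed name.
leftover_module_dirs() {
    for name in "apertium-$2" "ling073-$2"; do
        if [ -d "$1/$name" ]; then
            echo "$1/$name"
        fi
    done
}

# Example: show what would need deleting before retrying with code "xyz".
leftover_module_dirs "$HOME/ling073" xyz
```

The helper only lists directories; deleting them is left to you so that nothing is removed by accident.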
For the first day assignment, skip down to Pushing to github.
Bootstrapping a translation pair
To bootstrap a translation pair whose primary function is to translate from language xyz to language abc, do the following:
- Make sure you have a copy of both of the language modules you'll need (one for each language). Apertium's github repository has transducers for a lot of languages.
  - I recommend that you fork the transducer on github (i.e., copy the project to your own github account) so that you can make changes to it easily as needed (and potentially submit those changes back to Apertium at some point). For this you'll need a github.com account, you'll need to set up an ssh key for it, and you'll need to make sure both members of your group have write access to it.
- Check which formalism each transducer is written in.
  - The transducer you've written so far in this class should be written using HFST (using lexc and twol). If the module you cloned for the other language has a file like apertium-abc.abc.dix, then its transducer is written using lttoolbox.
- The following command will initialise the directory for the translation pair:
  - apertium-init -a1 hfst -a2 hfst xyz-abc
  - The -a1 and -a2 arguments tell apertium-init what formalism your transducers are written in. You may need to say "lttoolbox" instead of "hfst" for one or both of those options.
- Rename the directory to ling073-xyz-abc.
- Create a ling073-xyz-abc repository in the semester's github group (Ling073-sp19), set a remote origin in your repo, and push (for the last two, see Pushing to github below). Make sure all members of your group have access to the repository.
- Initialise the compiler (needed once for each copy of the new directory) with the following command:
  - ./autogen.sh --with-lang1=/path/to/ling073-xyz --with-lang2=/path/to/apertium-abc
  - You'll need to substitute /path/to/ling073-xyz and /path/to/apertium-abc with the paths to the source-language transducer and the target-language transducer, respectively.
- Compile with make as always.
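The formalism check described above can be sketched as a small script. This is only an illustration under stated assumptions: detect_formalism and pair_init_command are hypothetical helper names, and the check assumes that an lttoolbox module contains a file ending in .CODE.dix and an HFST module a file ending in .CODE.lexc (e.g. apertium-xyz.xyz.lexc). The script prints the apertium-init command rather than running it.

```shell
# Hypothetical helper: guess the formalism of the module in directory $1 for
# language code $2. Assumption: an lttoolbox module has a *.CODE.dix file and
# an HFST module has a *.CODE.lexc file.
detect_formalism() {
    if ls "$1"/*."$2".dix >/dev/null 2>&1; then
        echo lttoolbox
    elif ls "$1"/*."$2".lexc >/dev/null 2>&1; then
        echo hfst
    else
        echo unknown
        return 1
    fi
}

# Print (do not run) the apertium-init command for the pair $2-$4, given the
# module directories $1 (source language) and $3 (target language).
pair_init_command() {
    a1=$(detect_formalism "$1" "$2") || return 1
    a2=$(detect_formalism "$3" "$4") || return 1
    echo "apertium-init -a1 $a1 -a2 $a2 $2-$4"
}
```

For example, pair_init_command ~/ling073/ling073-xyz xyz ~/ling073/apertium-abc abc would print the command to use for that pair, or print nothing and fail if either formalism cannot be guessed. Check the printed command against the module contents before running it.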
Pushing to github
If this is a language module or pair you would like to push to github, do the following, ideally before modifying any files or compiling:
- Create an empty (no files) repository named ling073-xyz on github.
- Make sure the local repository really was created correctly by running git log in the bootstrapped directory. You should see a single commit named "initial commit".
- Set the github repository you created as the remote origin (replacing "username" and "xyz" as appropriate):
  - git remote add origin git@github.swarthmore.edu:username/ling073-xyz.git
- Push the bootstrapped module to origin:
  - git push --set-upstream origin master
- After this you should be able to see the same files from the github web interface and in the directory. You should also be able to commit, push, pull, etc. all normally.
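To avoid typos when substituting the username and repository name, the two git commands can be generated by a short script. This is a minimal sketch: remote_url and push_commands are hypothetical helper names, and the script only prints the commands so you can review them before running them yourself.

```shell
# Hypothetical helper: build the ssh remote URL for user/group $1 and
# repository $2 on the github host used in the steps above.
remote_url() {
    echo "git@github.swarthmore.edu:$1/$2.git"
}

# Print (do not run) the two git commands from the steps above.
push_commands() {
    echo "git remote add origin $(remote_url "$1" "$2")"
    echo "git push --set-upstream origin master"
}

push_commands username ling073-xyz
```

For a translation pair, pass the pair's repository name (e.g. ling073-xyz-abc) as the second argument.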