Difference between revisions of "Morphological analyser"
Revision as of 01:17, 9 February 2017
Morphological transducers
A morphological transducer is just a directed graph. It consists of nodes (numbered below) and arcs (with labels), with a starting node (0 below) and an ending node (16 below).
[Image: Simple transducer.png]
You follow the arcs whose labels match your input. The only acceptable paths are ones that start at the starting node and end at the ending node. You may match your input to either side of an arc's label (the two sides are separated by : above), and the other side is returned as output.
In the transducer above, the left side is the form and the right side is the analysis. If you match your input to the left side (the form), then your output will be the right side (the analysis)—this is morphological analysis. Likewise, if you follow the transducer by matching your input to the right side (the analysis) and output the left side (the form), then you are performing morphological generation.
An example of a complete path is w:w o:o l:l v:f e:<n> s:<pl>. The left/form side of this spells wolves and the right/analysis side spells wolf<n><pl>. Mapping between one and the other is as simple as taking one as input and following the path: by outputting the other side of each arc, you will get the other as output!
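To make the idea of following arcs concrete, here is a minimal Python sketch of a toy transducer. It is not the transducer in the figure: the node numbers, the singular path, the <sg> tag, and the use of an empty string for an arc side that consumes no symbol are all assumptions made for illustration. Only the wolves path is taken from the example above. The function names (tokenize, lookup, analyse, generate) are likewise hypothetical.

```python
import re

# Hypothetical toy transducer covering only "wolf" and "wolves".
# Each arc is (source_node, form_symbol, analysis_symbol, target_node).
# An empty string means "no symbol on this side" (an assumption of this sketch).
ARCS = [
    (0, "w", "w", 1),
    (1, "o", "o", 2),
    (2, "l", "l", 3),
    (3, "f", "f", 4),      # singular branch (assumed, not from the figure)
    (4, "", "<n>", 5),
    (5, "", "<sg>", 6),
    (3, "v", "f", 7),      # plural branch: v:f e:<n> s:<pl>
    (7, "e", "<n>", 8),
    (8, "s", "<pl>", 6),
]
START, FINAL = 0, 6


def tokenize(text):
    """Split a string into symbols, keeping tags such as <n> as one symbol."""
    return re.findall(r"<[^>]+>|.", text)


def lookup(text, match_side):
    """Follow arcs by matching one side of each label and collecting the other.

    match_side=0 matches the form side (analysis);
    match_side=1 matches the analysis side (generation).
    """
    symbols = tokenize(text)
    results = []
    stack = [(START, 0, "")]  # (current node, input position, output so far)
    while stack:
        node, pos, out = stack.pop()
        # A path is acceptable only if it ends at FINAL with all input consumed.
        if node == FINAL and pos == len(symbols):
            results.append(out)
        for src, form, analysis, dst in ARCS:
            if src != node:
                continue
            label = (form, analysis)
            wanted, emitted = label[match_side], label[1 - match_side]
            if wanted == "":
                # Nothing to match on this side: move on without consuming input.
                stack.append((dst, pos, out + emitted))
            elif pos < len(symbols) and symbols[pos] == wanted:
                stack.append((dst, pos + 1, out + emitted))
    return sorted(results)


def analyse(form):
    return lookup(form, 0)


def generate(analysis):
    return lookup(analysis, 1)


print(analyse("wolves"))        # ['wolf<n><pl>']
print(generate("wolf<n><pl>"))  # ['wolves']
print(analyse("wolfs"))         # [] (no acceptable path)
```

The same arc table serves both directions; the only thing that changes between analysis and generation is which side of each label is matched against the input.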
Question: What are all the possible paths provided by this transducer?
The formalism we use (lexc)
In-class exercise
Evaluation
The assignment
This assignment will be due on Thursday of the 5th week of class before class starts (this semester: 11:20am on Thursday, February 16th, 2017).
This assignment is to develop a morphological analyser that implements a good deal of the basic morphology of your language.
- Bootstrap a transducer for your language.
- Initialise the module (./autogen.sh), and compile it (make).
  - If this is successful, you should have several "modes" available; run apertium -d . -l to see.
  - One mode should be an xyz-morph mode; this is your analyser. Check it by running echo "houses" | apertium -d . xyz-morph, which should give you a morphological analysis of the word "houses".
- Add all of the tags you came up with during the Grammar documentation assignment to the Multichar_Symbols section of the apertium-xyz.xyz.lexc file. Provide a symbol, and a brief comment explaining what the symbol means.
- Add all of the stems from your Grammar documentation assignment, and categorise them correctly so that all of your tests pass.
- Housekeeping:
  - Add yourself to the AUTHORS file.
  - Make sure the COPYING file contains an open-source license of your liking (default should be GPL3).
  - Add a link to the transducer to the list of resources you developed for your language on the language's page on this wiki.