Morphological disambiguator
From LING073
Why we need disambiguation
Imagine you have word forms that each have more than one possible analysis (set of tags), e.g.
^houses/house<n><pl>/house<v><tv><p3><sg>$ ^this/this<det><dem><sg>/this<pron><dem><sg>$
Without a disambiguator, there is no principled way to choose between these analyses, so one of them is effectively picked arbitrarily. Now imagine you have a sentence where the wrong analysis is chosen:
^The/The<det><def>$ ^motel/motel<n><sg>$ ^houses/house<n><pl>$ ^this/this<pron><dem><sg>$ ^dog/dog<n><sg>$^./.<sent>$
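In this sentence, "houses" should be analysed as a verb and "this" as a determiner, not as a plural noun and a pronoun. The goal of a disambiguator is to choose the correct analysis based on the surrounding words.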
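To make the notation in the examples concrete, here is a minimal Python sketch that splits text in this format into surface forms and their candidate analyses, and reports which forms are ambiguous. The function name `parse_stream` is hypothetical, and the sketch assumes no escaped `/`, `^`, or `$` characters inside a unit; it is not part of any Apertium tool.

```python
import re

# A lexical unit in the examples has the shape ^surface/analysis1/analysis2$.
LEXICAL_UNIT = re.compile(r"\^(.*?)\$")

def parse_stream(text):
    """Return a list of (surface_form, [analyses]) pairs.

    Simplification: assumes no escaped '/', '^' or '$' inside a unit.
    """
    units = []
    for body in LEXICAL_UNIT.findall(text):
        surface, *analyses = body.split("/")
        units.append((surface, analyses))
    return units

stream = (
    "^houses/house<n><pl>/house<v><tv><p3><sg>$ "
    "^this/this<det><dem><sg>/this<pron><dem><sg>$"
)
units = parse_stream(stream)

for surface, analyses in units:
    status = "ambiguous" if len(analyses) > 1 else "unambiguous"
    print(surface, status, analyses)
```

Any form with more than one analysis after the surface form is a candidate for a disambiguation rule.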
Using Constraint Grammar to disambiguate
Constraint Grammar (CG) is a formalism for writing context-sensitive rules that select or remove analyses from the list of possible analyses of a word.
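The idea of selecting or removing analyses based on context can be illustrated with a toy Python model. This is only a sketch of the concept, not CG syntax and not how a CG implementation works internally; the helpers `select` and `remove`, their parameters, and the two rules applied at the end are assumptions made for illustration.

```python
# Each token is (surface form, list of readings); a reading is a lemma plus tags.
sentence = [
    ("The", ["The<det><def>"]),
    ("motel", ["motel<n><sg>"]),
    ("houses", ["house<n><pl>", "house<v><tv><p3><sg>"]),
    ("this", ["this<det><dem><sg>", "this<pron><dem><sg>"]),
    ("dog", ["dog<n><sg>"]),
]

def _context_matches(sentence, index, offset, context_tag):
    """True if the token at index+offset exists and has a reading with context_tag."""
    j = index + offset
    if not 0 <= j < len(sentence):
        return False
    return any(f"<{context_tag}>" in r for r in sentence[j][1])

def select(sentence, target_tag, offset, context_tag):
    """Keep only readings with target_tag wherever the context matches."""
    out = []
    for i, (surface, readings) in enumerate(sentence):
        wanted = [r for r in readings if f"<{target_tag}>" in r]
        if wanted and _context_matches(sentence, i, offset, context_tag):
            readings = wanted
        out.append((surface, readings))
    return out

def remove(sentence, target_tag, offset, context_tag):
    """Drop readings with target_tag wherever the context matches."""
    out = []
    for i, (surface, readings) in enumerate(sentence):
        kept = [r for r in readings if f"<{target_tag}>" not in r]
        # Never delete the last remaining reading of a token.
        if kept and _context_matches(sentence, i, offset, context_tag):
            readings = kept
        out.append((surface, readings))
    return out

# Hypothetical rule 1: select the determiner reading if the next word can be a noun.
step1 = select(sentence, "det", 1, "n")
# Hypothetical rule 2: select the verb reading if the next word can be a determiner.
disambiguated = select(step1, "v", 1, "det")

for surface, readings in disambiguated:
    print(surface, readings)
```

The same effect on "this" could be reached with `remove(sentence, "pron", 1, "n")`. Whether to select the reading you want or remove the one you do not want is a design choice: removing is more conservative when a word has more than two readings.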
The structure of a CG file
Rules
Example above
Useful commands
Getting output before disambiguation
Getting output after disambiguation
Calculating ambiguity
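One simple way to quantify ambiguity is the average number of analyses per form. The Python sketch below computes that figure from text in the format used in the examples on this page. It assumes this average is the intended measure, the function name `average_ambiguity` is hypothetical, and the `before` string is constructed here by combining the example analyses rather than taken from real analyser output.

```python
import re

def average_ambiguity(stream):
    """Mean number of analyses per lexical unit (^surface/analysis/...$)."""
    counts = []
    for body in re.findall(r"\^(.*?)\$", stream):
        analyses = body.split("/")[1:]  # the first field is the surface form
        counts.append(len(analyses))
    return sum(counts) / len(counts) if counts else 0.0

# Constructed example: every form with all of its candidate analyses.
before = (
    "^The/The<det><def>$ ^motel/motel<n><sg>$ "
    "^houses/house<n><pl>/house<v><tv><p3><sg>$ "
    "^this/this<det><dem><sg>/this<pron><dem><sg>$ "
    "^dog/dog<n><sg>$^./.<sent>$"
)
# The same sentence with exactly one analysis left per form.
after = (
    "^The/The<det><def>$ ^motel/motel<n><sg>$ ^houses/house<n><pl>$ "
    "^this/this<pron><dem><sg>$ ^dog/dog<n><sg>$^./.<sent>$"
)

print("before:", average_ambiguity(before))
print("after: ", average_ambiguity(after))
```

A value of 1.0 means every form has exactly one analysis left. Note that this says nothing about whether the remaining analysis is the correct one: the `after` string here is the wrongly disambiguated sentence from the first section.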
Seeing which rules are doing what
The assignment
- Add words
- Measure ambiguity
- Identify an ambiguous form
- Write a rule
- Measure ambiguity again
More resources
- Constraint Grammar on the Apertium wiki
- Apertium and Constraint Grammar on the Apertium wiki