Morphological disambiguator

From LING073
Revision as of 02:50, 23 February 2017 by Jwashin1 (talk | contribs) (Why we need disambiguation)


Why we need disambiguation

Imagine you have a word form with two possible analyses, each with its own set of tags, e.g.

^houses/house<n><pl>/house<v><tv><p3><sg>$
^this/this<det><dem><sg>/this<pron><dem><sg>$

With no disambiguation in place, one of these analyses is in effect chosen arbitrarily. Now imagine you have a sentence where the wrong analysis is chosen (in this case, twice: "houses" should be a verb here, and "this" a determiner):

^The/The<det><def>$ ^motel/motel<n><sg>$ ^houses/house<n><pl>$ ^this/this<pron><dem><sg>$ ^dog/dog<n><sg>$^./.<sent>$

The goal of a disambiguator is to choose the correct analysis based on the surrounding words.
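
To make the notation above concrete, here is a minimal Python sketch that splits text in this format into surface forms and their lists of analyses, and then picks out the ambiguous ones. The function names parse_stream and ambiguous_units are made up for this illustration and are not part of any existing tool. The sketch also assumes that no escaped special characters appear in the input, which real data may contain.

```python
import re

# One lexical unit looks like ^surface/analysis1/analysis2/...$
# Simplifying assumption: no escaped ^, / or $ inside a unit.
LEXICAL_UNIT = re.compile(r"\^([^/$^]*)((?:/[^/$^]*)*)\$")


def parse_stream(text):
    """Return a list of (surface_form, [analyses]) pairs."""
    units = []
    for match in LEXICAL_UNIT.finditer(text):
        surface = match.group(1)
        # group(2) looks like "/house<n><pl>/house<v>..."; the split yields
        # an empty first item before the leading slash, which is dropped.
        analyses = match.group(2).split("/")[1:]
        units.append((surface, analyses))
    return units


def ambiguous_units(units):
    """Keep only the units that have more than one analysis."""
    return [(surface, analyses) for surface, analyses in units
            if len(analyses) > 1]


sample = (
    "^houses/house<n><pl>/house<v><tv><p3><sg>$ "
    "^this/this<det><dem><sg>/this<pron><dem><sg>$"
)
for surface, analyses in parse_stream(sample):
    print(surface, len(analyses), analyses)
```

Both sample words come out with two analyses each, which is exactly the situation a disambiguator has to resolve.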

Using Constraint Grammar to disambiguate

Constraint Grammar (CG) is a formalism for writing context-sensitive rules that select or remove analyses from each word's list of possible analyses.
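
The idea of selecting and removing analyses based on context can be sketched in Python. This is not CG syntax and not how a CG implementation works internally. It only imitates the effect of two rules on a sentence held as a list of (surface form, analyses) pairs. The helper names and both rules are invented for this illustration, and the rules are far too crude for real English.

```python
def has_tag(analysis, tag):
    """True if an analysis string such as 'house<n><pl>' carries <tag>."""
    return "<" + tag + ">" in analysis


def select(analyses, tag):
    """SELECT-style step: keep only the analyses carrying `tag`."""
    kept = [a for a in analyses if has_tag(a, tag)]
    return kept or analyses  # never leave a word with no analysis


def remove(analyses, tag):
    """REMOVE-style step: discard the analyses carrying `tag`."""
    kept = [a for a in analyses if not has_tag(a, tag)]
    return kept or analyses  # never leave a word with no analysis


def disambiguate(sentence):
    """Apply two illustrative (hypothetical) rules, left to right."""
    result = [(surface, list(analyses)) for surface, analyses in sentence]
    for i, (surface, analyses) in enumerate(result):
        prev = result[i - 1][1] if i > 0 else []
        nxt = result[i + 1][1] if i + 1 < len(result) else []
        # Rule 1: select the determiner reading if the next word
        # can only be a noun.
        if nxt and all(has_tag(a, "n") for a in nxt):
            analyses = select(analyses, "det")
        # Rule 2: remove the noun reading if the previous word
        # can only be a noun.
        if prev and all(has_tag(a, "n") for a in prev):
            analyses = remove(analyses, "n")
        result[i] = (surface, analyses)
    return result


sentence = [
    ("The", ["The<det><def>"]),
    ("motel", ["motel<n><sg>"]),
    ("houses", ["house<n><pl>", "house<v><tv><p3><sg>"]),
    ("this", ["this<det><dem><sg>", "this<pron><dem><sg>"]),
    ("dog", ["dog<n><sg>"]),
    (".", [".<sent>"]),
]
for surface, analyses in disambiguate(sentence):
    print(surface, analyses)
```

On the example sentence, the second rule leaves "houses" with only its verb reading and the first rule leaves "this" with only its determiner reading. Both helpers refuse to delete a word's last remaining analysis, so every word keeps at least one reading.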

The structure of a CG file

Rules

Example above

Useful commands

Getting output before disambiguation

Getting output after disambiguation

Calculating ambiguity

Seeing which rules are doing what

The assignment

  1. Add words
  2. Measure ambiguity
  3. Identify ambiguous form
  4. Write rule
  5. Measure ambiguity

More resources