Dependency syntax

From LING073
Revision as of 02:53, 17 April 2017 by Jwashin1 (talk | contribs) (Annotation of a single sentence)

Dependency syntax is a framework for syntax that treats the relations between words as "dependencies". It is convenient for many computational linguistics applications because each token has a "head" (the word it depends on) and a "relation" to that head. E.g., in "a house", the determiner/article depends on the noun (the noun is the head of the determiner), and the relation is det (the exact label depends on the annotation standard used).
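
As a rough illustration of the head/relation idea, here is a minimal Python sketch of "a house". The Token class and the head_form helper are made up for this example and are not part of any annotation tool; treating "house" as the root is an assumption that only holds because the phrase stands alone here.

```python
from dataclasses import dataclass


@dataclass
class Token:
    index: int      # 1-based position in the sentence
    form: str       # the word as written
    head: int       # index of the head token; 0 marks the root
    relation: str   # label of the dependency on the head


# "a" depends on "house" with the relation det.
# "house" is marked as root only because the phrase is taken in isolation.
phrase = [
    Token(1, "a", 2, "det"),
    Token(2, "house", 0, "root"),
]


def head_form(token, sentence):
    """Return the form of the token's head, or "ROOT" for the root."""
    return "ROOT" if token.head == 0 else sentence[token.head - 1].form


for tok in phrase:
    print(f"{tok.form} --{tok.relation}--> {head_form(tok, phrase)}")
```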

Universal Dependencies is a project aimed at designing a standardised set of dependency syntax annotation principles for use across languages, with community-contributed language-specific annotation guidelines.


Guidance

Annotation of a single sentence

You start with a sentence:

Everyone in this class works on a different language.

You run it through your disambiguator and manually disambiguate as needed, so that each word has only one analysis (in CG format):

"<Everyone>"
	"everyone" prn ind mf sg
"<in>"
	"in" pr
"<this>"
	"this" det dem sg
"<class>"
	"class" n sg
"<works>"
	"work" vblex pres p3 sg
"<on>"
	"on" adv
"<a>"
	"a" det ind sg
"<different>"
	"different" adj
"<language>"
	"language" n sg
"<.>"
	"." sent

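Then you consider each word, which other word it depends on, and how. E.g., "Everyone" has an nsubj dependency on "works" (i.e., it is the nominal subject of "works"). You then encode these relationships in the sentence. Each analysis line gets a relation tag (e.g., @nsubj) and a pair of the form #N->M, where N is the position of the token in the sentence and M is the position of its head; the root of the sentence has head 0: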

"<Everyone>"
	"everyone" prn ind mf sg @nsubj #1->5
"<in>"
	"in" pr @case #2->4
"<this>"
	"this" det dem sg @det #3->4
"<class>"
	"class" n sg @nmod #4->1
"<works>"
	"work" vblex pres p3 sg @root #5->0
"<on>"
	"on" adv @compound:prt #6->5
"<a>"
	"a" det ind sg @det #7->9
"<different>"
	"different" adj @amod #8->9
"<language>"
	"language" n sg @obj #9->5
"<.>"
	"." sent @punct #10->5

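An annotated sentence like the one above can be read mechanically. The following is a minimal Python sketch, assuming each word has exactly one analysis line that ends in an @relation tag and a #N->M pair, as in the example. The names parse_cg, FORM_RE and ANALYSIS_RE and the dictionary keys are invented for illustration; this is not the API of any CG tool.

```python
import re

# Surface-form line, e.g. "<Everyone>"
FORM_RE = re.compile(r'^"<(.*)>"$')
# Analysis line, e.g.:    "everyone" prn ind mf sg @nsubj #1->5
ANALYSIS_RE = re.compile(r'^\s+"(.*)"\s+(.*?)\s*@(\S+)\s+#(\d+)->(\d+)\s*$')

SAMPLE = '''\
"<Everyone>"
    "everyone" prn ind mf sg @nsubj #1->5
"<in>"
    "in" pr @case #2->4
"<this>"
    "this" det dem sg @det #3->4
"<class>"
    "class" n sg @nmod #4->1
"<works>"
    "work" vblex pres p3 sg @root #5->0
"<on>"
    "on" adv @compound:prt #6->5
"<a>"
    "a" det ind sg @det #7->9
"<different>"
    "different" adj @amod #8->9
"<language>"
    "language" n sg @obj #9->5
"<.>"
    "." sent @punct #10->5
'''


def parse_cg(text):
    """Return one dict per token from dependency-annotated CG text."""
    tokens = []
    form = None
    for line in text.splitlines():
        m = FORM_RE.match(line)
        if m:
            form = m.group(1)
            continue
        m = ANALYSIS_RE.match(line)
        if m and form is not None:
            lemma, tags, relation, index, head = m.groups()
            tokens.append({
                "index": int(index),
                "form": form,
                "lemma": lemma,
                "tags": tags.split(),
                "relation": relation,
                "head": int(head),
            })
            form = None  # only one analysis per word is expected
    return tokens


tokens = parse_cg(SAMPLE)

# Sanity checks: exactly one root, and every head points at a real token.
roots = [t for t in tokens if t["head"] == 0]
assert len(roots) == 1
assert all(0 <= t["head"] <= len(tokens) for t in tokens)

for t in tokens:
    head = "ROOT" if t["head"] == 0 else tokens[t["head"] - 1]["form"]
    print(f'{t["form"]} --{t["relation"]}--> {head}')
```

Printing each token next to the form of its head is a quick way to spot a wrong head number while annotating by hand.
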
Evaluation

Two metrics are commonly used to evaluate a dependency parser:

  • Labeled Attachment Score (LAS): the percentage of tokens to which the parser assigned both the correct head and the correct relation.
  • Unlabeled Attachment Score (UAS): the percentage of tokens to which the parser assigned the correct head, regardless of the relation.
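
Both scores can be computed by comparing the parser's output with a gold-standard annotation, token by token. The sketch below is a minimal Python version, assuming the two analyses are already aligned lists of (head, relation) pairs. The function name attachment_scores is invented for this example, the gold list is the annotation of the example sentence on this page, and the "predicted" list is a made-up parser output, not the result of running any real parser.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) as percentages for aligned (head, relation) lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    total = len(gold)
    # UAS counts tokens whose head is right, whatever the relation.
    correct_heads = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAS counts tokens whose head and relation are both right.
    correct_both = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * correct_heads / total, 100.0 * correct_both / total


# Gold annotation of "Everyone in this class works on a different language."
gold = [
    (5, "nsubj"), (4, "case"), (4, "det"), (1, "nmod"), (0, "root"),
    (5, "compound:prt"), (9, "det"), (9, "amod"), (5, "obj"), (5, "punct"),
]

# Hypothetical parser output: token 6 has a wrong head and relation,
# token 9 has the right head but a wrong relation.
predicted = list(gold)
predicted[5] = (9, "case")
predicted[8] = (5, "obl")

uas, las = attachment_scores(gold, predicted)
print(f"UAS: {uas:.1f}%  LAS: {las:.1f}%")
```

Because LAS requires the head to be correct as well as the relation, it can never be higher than UAS for the same output.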