Uzbek/Universal Dependencies

From LING073
Revision as of 19:36, 2 May 2018 by Cdalton2 (talk | contribs)



Model | uzb.annotated.ud.conllu | uzb.annotated2.ud.conllu
uzb.withmorph.udpipe | UAS: 82.07%, LAS: 79.68% | UAS: 42.98%, LAS: 35.09%
uzb.nomorph.udpipe | UAS: 72.11%, LAS: 69.72% | UAS: 43.86%, LAS: 35.09%
Corpus size | forms: 251, sentences: 30 | forms: 114, sentences: 9
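For reference, UAS (unlabeled attachment score) is the share of tokens whose predicted head is correct, and LAS (labeled attachment score) additionally requires the correct relation label. The following is a minimal sketch of that calculation in Python. The function name and the toy predictions are invented for illustration; they are not output from the UDPipe models above, and UDPipe's own evaluation may differ in detail.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) as percentages.

    gold and predicted are equal-length lists of (head, deprel) pairs,
    one pair per token, in the same token order.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of tokens")
    if not gold:
        raise ValueError("no tokens to score")
    # A token counts for UAS when its head matches,
    # and for LAS when both head and relation label match.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    label_hits = sum(g == p for g, p in zip(gold, predicted))
    total = len(gold)
    return 100.0 * head_hits / total, 100.0 * label_hits / total


# Gold analysis of "Azalda Xudo osmon bilan yerni yaratdi." from this page,
# compared against a made-up prediction with one wrong label and one wrong head.
gold = [(6, "obl"), (6, "nsubj"), (6, "obj"), (5, "cc"),
        (3, "conj"), (0, "root"), (6, "punct")]
pred = [(6, "obl"), (6, "nsubj"), (6, "nsubj"), (5, "cc"),
        (6, "conj"), (0, "root"), (6, "punct")]

uas, las = attachment_scores(gold, pred)
print(f"UAS: {uas:.2f}%, LAS: {las:.2f}%")  # UAS: 85.71%, LAS: 71.43%
```

Here six of seven heads are correct (UAS) but only five of seven tokens have both the correct head and the correct label (LAS), which is why LAS can never exceed UAS.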

Dependency relations


@nsubj (nominal subject) denotes a semantically agent-like and syntactically subject-like nominal. Note that it cannot be used for pleonastic subjects.

In Uzbek, @nsubj will usually be assigned to a noun in the nominative case. Generally speaking, a noun in the nominative case is the subject of the sentence and the doer of the action. An example of @nsubj can be seen in the following sentence from our corpus.


"azal" n loc @obl #1->6


"xudo" n nom @nsubj #2->6


"osmon" n nom @obj #3->6


"bilan" cnjcoo @cc #4->5


"yer" n acc @conj #5->3


"yarat" v tv past p3 sg @root #6->0


"." sent @punct #7->6

Azalda Xudo osmon bilan yerni yaratdi.

In.the.beginning God heaven and earth created.

"In the beginning, God created the heavens and the earth."

Note that not all nouns in the nominative case are @nsubj. In this sentence, osmon "heaven" is in the nominative case even though it is an object and a patient. One possible explanation is that the accusative ending on yerni "earth" applies to both conjuncts.
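The annotation lines on this page have a regular shape: a quoted lemma, morphological tags, an @-prefixed relation, and #index->head. The sketch below reads such lines in Python and lists the nominative readings with their relations. The names Reading and parse_reading are hypothetical helpers written for this page, not part of any existing tool.

```python
import re
from typing import NamedTuple, Tuple


class Reading(NamedTuple):
    lemma: str
    tags: Tuple[str, ...]
    deprel: str
    index: int
    head: int


# Optional "<wordform>", then "lemma", tags, @relation, #index->head.
READING = re.compile(
    r'^(?:"<[^"]*>"\s+)?'          # optional surface form, ignored here
    r'"(?P<lemma>[^"]*)"\s+'
    r'(?P<tags>.*?)\s*'
    r'@(?P<deprel>\S+)\s+'
    r'#(?P<index>\d+)->(?P<head>\d+)\s*$'
)


def parse_reading(line: str) -> Reading:
    """Parse one annotation line into its parts."""
    match = READING.match(line.strip())
    if match is None:
        raise ValueError(f"not an annotation line: {line!r}")
    return Reading(
        lemma=match["lemma"],
        tags=tuple(match["tags"].split()),
        deprel=match["deprel"],
        index=int(match["index"]),
        head=int(match["head"]),
    )


lines = [
    '"azal" n loc @obl #1->6',
    '"xudo" n nom @nsubj #2->6',
    '"osmon" n nom @obj #3->6',
    '"bilan" cnjcoo @cc #4->5',
    '"yer" n acc @conj #5->3',
    '"yarat" v tv past p3 sg @root #6->0',
    '"." sent @punct #7->6',
]
readings = [parse_reading(line) for line in lines]

# Nominative readings are not all subjects.
nominative = [(r.lemma, r.deprel) for r in readings if "nom" in r.tags]
print(nominative)  # [('xudo', 'nsubj'), ('osmon', 'obj')]
```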

Sometimes @nsubj can be assigned to things that are not prototypically nouns. This occurs in the following sentence.


"kech" adv subst nom @nsubj #1->2


"bo'l" v tv past p3 sg @root #2->0


"," cm @cc #3->5


"erta" adv subst nom @nsubj #4->5


"bo'l" v tv past p3 sg @conj #5->2


"-" guio @cc #6->7


"bu" prn dem nom @conj #7->2


"bir" num ord @nmod #8->9


"kun" n nom @nsubj #9->7


"e" cop past p3 sg @cop #10->7


"." sent @punct #11->2

Kech bo'ldi, erta bo'ldi - bu birinchi kun edi.

Evening was, morning was - this first day was.

"And the evening and the morning were the first day."

According to both the transducer and the CTILD dictionary, kech "evening" and erta "morning" are actually adverbs, but in this sentence, they fill the role of nominal subject.


@obj is usually applied to a patient-like constituent. In Uzbek, the constituent designated @obj will often be a noun in the accusative case. This is seen in the sentence below from our corpus.


"ushbu" prn dem nom @det #1->2


"gumbaz" n acc @obj #2->7


"esa" cnjcoo @advmod #3->7


"xudo" n nom @nsubj #4->7


"osmon" n nom @xcomp #5->7


"deb" cnjsub @advmod #6->7


"ata" v tv past p3 sg @root #7->0


"." sent @punct #8->7

Ushbu gumbazni esa Xudo osmon deb atadi.

This dome whereas God heaven that called.

"God called the expanse heaven."

However, the @obj can sometimes be in another case. In the sentence below, the @obj suvlarning "waters" is in the genitive case.


"xudo" n nom @nsubj #1->5


"quruqlik" n acc @obj #2->5


"yer" n nom @xcomp #3->5


"deb" cnjsub @advmod #4->5


"ata" v tv past p3 sg @root #5->0


"," cm @cc #6->12


"suv" n pl gen @obj #7->12


"jami" adj subst acc @nmod #8->7


"esa" cnjcoo @advmod #9->12


"dengiz" n pl nom @xcomp #10->12


"deb" cnjsub @advmod #11->12


"ata" v tv past p3 sg @conj #12->5


"." sent @punct #13->5

Xudo quruqlikni yer deb atadi, suvlarning jamini esa dengizlar deb atadi.

God land earth that called, waters gathering whereas sea that called.

"God called the dry land earth, and the gathering of the waters He called seas."
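To see how often @obj departs from the accusative, the readings can be grouped by case tag. This is a small sketch over a handful of readings copied from the examples on this page; the set of case tags is limited to those that appear here, and the helper case_of is a hypothetical name.

```python
from collections import Counter

# Case tags that appear in the annotations on this page.
CASE_TAGS = {"nom", "acc", "gen", "loc", "abl"}

# (lemma, tags, relation) triples copied from the example sentences.
readings = [
    ("osmon", ["n", "nom"], "obj"),
    ("gumbaz", ["n", "acc"], "obj"),
    ("xudo", ["n", "nom"], "nsubj"),
    ("quruqlik", ["n", "acc"], "obj"),
    ("suv", ["n", "pl", "gen"], "obj"),
    ("dengiz", ["n", "pl", "nom"], "xcomp"),
]


def case_of(tags):
    """Return the first case tag found, or None if the reading has no case."""
    return next((tag for tag in tags if tag in CASE_TAGS), None)


# Count the case of every reading labelled @obj.
obj_cases = Counter(case_of(tags) for _, tags, rel in readings if rel == "obj")
print(obj_cases)
```

On these six readings the count is two accusative objects, one nominative and one genitive.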


@parataxis is the relation between the main predicate of a sentence and other elements "placed side by side without any explicit coordination, subordination, or argument relation with the head word." It is used in a variety of situations where the relationship between the main clause and another clause is not explicit, especially direct or reported speech (which can often be identified by punctuation such as ";" or ":"). @parataxis contrasts with @ccomp, which is used when the reported speech is clearly a subordinate clause of the main verb. We used @parataxis to identify reported speech in our corpus.


"xudo" n nom @nsubj #1->8


":" sent @punct #2->8


"«" lquot @punct #3->5


"yorug'lik" n nom @nsubj #4->5


"bo'l" v tv perm @parataxis #5->8


"»" rquot @punct #6->5


"," cm @punct #7->8


"<dedi>" "de" v tv past p3 sg @root #8->0


"<va>" "va" cnjcoo @cc #9->12


"<yorug‘lik>" "yorug'lik" n nom @nsubj #10->12


"<paydo>" "paydo" adj @nmod #11->10


"<bo‘ldi>" "bo'l" v tv past p3 sg @conj #12->8


"<.>" "." sent @punct #13->8

@parataxis is especially helpful in these examples because the syntax of reported speech in Uzbek is even less coordinated than in English: the sentence begins with the speaker, launches into the reported speech, and reaches the verb of speaking only after the quotation.
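One way to recover the quoted material from such an analysis is to collect everything that depends on the @parataxis token. The sketch below does this in Python for the sentence Xudo: «Yorug'lik bo'lsin», dedi va yorug'lik paydo bo'ldi, using the heads and relations given on this page. The function subtree is a hypothetical helper, and treating the subtree as the reported speech is an assumption that holds for this example rather than a general rule.

```python
def subtree(heads, root):
    """Return the sorted indices of root and everything that depends on it.

    heads maps each token index to the index of its head (0 for the root).
    """
    children = {}
    for index, head in heads.items():
        children.setdefault(head, []).append(index)
    found, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in found:      # guard against malformed (cyclic) input
            continue
        found.add(node)
        stack.extend(children.get(node, []))
    return sorted(found)


# Heads and relations copied from the annotation of the sentence.
heads = {1: 8, 2: 8, 3: 5, 4: 5, 5: 8, 6: 5, 7: 8,
         8: 0, 9: 12, 10: 12, 11: 10, 12: 8, 13: 8}
deprels = {1: "nsubj", 2: "punct", 3: "punct", 4: "nsubj", 5: "parataxis",
           6: "punct", 7: "punct", 8: "root", 9: "cc", 10: "nsubj",
           11: "nmod", 12: "conj", 13: "punct"}

for index, rel in deprels.items():
    if rel == "parataxis":
        print(index, "->", heads[index], subtree(heads, index))
# 5 -> 8 [3, 4, 5, 6]
```

Tokens 3 to 6 are the opening quotation mark, yorug'lik, bo'l and the closing quotation mark, so the subtree matches the quoted span while the speaker (token 1) and the verb of speaking (token 8) stay outside it.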

"<Yana>" "yana" adv @advmod #1->17


"<Xudo>" "xudo" n nom @nsubj #2->17


"<:>" ":" sent @punct #3->17


"<«>" "«" lquot @punct #4->13


"<Suvliklar>" "suv" n loc @nmod #5->11


"<o‘rtasida>" "o'rta" n px3sp loc @case #6->5


"<ularni>" "ular" prn pers p3 pl acc @obj #7->9


"<bir-biridan>" "bir-bir" prn ind px3sp abl @nmod #8->7


"<ajratib>" "ajra" v iv past p3 sg @acl #9->11


"<turadigan>" "tur" v iv past p3 sg @conj #10->9


"<gumbaz>" "gumbaz" n nom @nsubj #11->13


"<paydo>" "paydo" adj @nmod #12->11


"<bo‘lsin>" "bo'l" v tv perm @parataxis #13->17


"<»>" "»" rquot @punct #14->13


"<,>" "," cm @punct #15->17


"-" guio @punct #16->17


"<dedi>" "de" v tv past p3 sg @root #17->0


"<.>" "." sent @punct #18->17

We also have examples where an @acl (clausal modifier of a noun) is embedded within @parataxis, as with ajratib in the sentence above, which brings us to our next point.


We used @acl


@cc links a coordinating conjunction to the conjunct that comes after it. In Uzbek, one common coordinating conjunction is va, which appears in the sentence below.


"xudo" n nom @nsubj #1->8


":" sent @punct #2->8


"«" lquot @punct #3->5


"yorug'lik" n nom @nsubj #4->5


"bo'l" v tv perm @parataxis #5->8


"»" rquot @punct #6->5


"," cm @punct #7->8


"de" v tv past p3 sg @root #8->0


"va" cnjcoo @cc #9->12


"yorug'lik" n nom @nsubj #10->12


"paydo" adj @nmod #11->10


"bo'l" v tv past p3 sg @conj #12->8


"." sent @punct #13->8

Xudo: "Yorug'lik bo'lsin", dedi va yorug'lik paydo bo'ldi.

God: "Light be", said and light appear did.

"God said, 'Let there be light,' and there was light."

This va isn't in the CTILD dictionary, but based on this sentence, it seems to mean something like "and." Another common @cc is esa "whereas":


"xudo" n gen @nmod #1->2


"ruh" n px3sp nom @root #2->0


"esa" cnjcoo @cc #3->2


"suv" n nom @nmod #4->6


"yuz" num subst px3sp loc @case #5->4


"parvoz" n nom @nsubj #6->2


"etar" v tv past p3 sg @aux #7->8


"e" cop past p3 sg @cop #8->2


"." sent @punct #9->2

Xudoning Ruhi esa suv yuzida parvoz etar edi.

God's Spirit whereas water face flight did was.

"And the spirit of God was hovering over the surface of the waters."
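Since @cc is described above as linking a conjunction to the conjunct that comes after it, a simple consistency check is to confirm that each @cc token's head has a higher index than the token itself. The sketch below runs that check in Python over the @cc tokens from the examples on this page; head_follows is a hypothetical helper name.

```python
# (lemma, index, head) for the tokens labelled @cc in the examples on this page.
cc_tokens = [
    ("bilan", 4, 5),
    (",", 3, 5),
    ("-", 6, 7),
    ("va", 9, 12),
    ("esa", 3, 2),
]


def head_follows(index, head):
    """True when the head comes later in the sentence than the token itself."""
    return head > index


# Collect the @cc tokens whose head precedes them.
flagged = [lemma for lemma, index, head in cc_tokens if not head_follows(index, head)]
print(flagged)  # ['esa']
```

The check passes for bilan, va and the punctuation tokens labelled @cc, and flags esa in the last example, which attaches to the preceding word ruh.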