Grammar documentation/Examples

From LING073

English plurals

this counts as two grammar points: one main point, and a number of nuances that together count again

In English, plural morphology on nouns denotes that there is more than one of something.

There are only two numbers in English, singular and plural. The tag for a noun should be <n>, and singular and plural should be <sg> and <pl>, respectively.

Regular plurals

Regular plurals are formed with the addition of «s» or «es».

  • Nouns ending in «s», «sh», «ch», «x», or «z» take the suffix «es»:
    • box<n><pl> ↔ boxes
    • match<n><pl> ↔ matches
    • kiss<n><pl> ↔ kisses
  • Otherwise the suffix is «s»:
    • snake<n><pl> ↔ snakes
    • window<n><pl> ↔ windows
    • boy<n><pl> ↔ boys
  • Nouns that end in «o» are unpredictable: some take «s» and some take «es»:
    • potato<n><pl> ↔ potatoes
    • piano<n><pl> ↔ pianos
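
The suffix choice in this list can be sketched in Python. This is only an illustration under stated assumptions, not an actual transducer: the names `ES_ENDINGS`, `O_NOUNS_TAKING_ES`, and `regular_plural` are hypothetical, the «sh» ending (as in dish → dishes) is included alongside the other sibilant endings, and nouns in «o» that take «es» are simply listed one by one because they are unpredictable.

```python
# Endings that take «es»; «sh» is included alongside «s», «ch», «x», «z».
ES_ENDINGS = ("s", "sh", "ch", "x", "z")

# Hypothetical lexicon of «o»-final nouns that take «es». These have to be
# listed individually because the choice is not predictable.
O_NOUNS_TAKING_ES = {"potato"}


def regular_plural(noun: str) -> str:
    """Return the plural of a noun using only the regular suffix rules."""
    if noun in O_NOUNS_TAKING_ES or noun.endswith(ES_ENDINGS):
        return noun + "es"
    return noun + "s"


for lemma in ["box", "match", "kiss", "snake", "window", "boy", "potato", "piano"]:
    print(f"{lemma}<n><pl> ↔ {regular_plural(lemma)}")
```

The sketch deliberately ignores stem alternations and irregular plurals, so it would produce wrong forms for nouns such as «baby» or «tooth».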

Regular suffix with stem alternations

  • If a noun ends in a «y» that's preceded by a consonant, the «y» turns to «i» and the suffix is «es»:
    • baby<n><pl> ↔ babies
  • In some (but not all) nouns that end in «f» or «fe», the «f»/«fe» is replaced by «ve» before the suffix «s» is added:
    • leaf<n><pl> ↔ leaves
    • life<n><pl> ↔ lives
  • Some stems that end in «ex» or «ix» replace that ending with «ic» before adding «es»:
    • matrix<n><pl> ↔ matrices

No suffix

  • Some plural forms are denoted only by a stem alternation; these are not predictable:
    • tooth<n><pl> ↔ teeth
    • mouse<n><pl> ↔ mice
    • man<n><pl> ↔ men
    • crisis<n><pl> ↔ crises
    • diagnosis<n><pl> ↔ diagnoses
  • Some nouns have identical singular and plural forms:
    • deer<n><pl> ↔ deer
    • fish<n><pl> ↔ fish
    • moose<n><pl> ↔ moose

Other irregular plurals

  • There are a number of other unpredictable patterns:
    • addendum<n><pl> ↔ addenda
    • corpus<n><pl> ↔ corpora
    • alumnus<n><pl> ↔ alumni
    • child<n><pl> ↔ children


Spanish present tense

this only covers the regular forms, so counts as one grammar point; if it covered a bunch of irregular verbs, or a different tense for the regular verbs, a second grammar point would be present

Verbs in Spanish should be tagged <v>. They can be subcategorised into transitive and intransitive verbs with additional tags <tv> and <iv>, respectively (this is important for morphology to some extent, but especially for translation). One of the tenses is the present tense: <pres>. The person and number tags needed will be <p1>, <p2>, <p3> and <sg> and <pl>.

The regular present tense in Spanish is formed by adding a set of endings to the verb stem. The set of endings used depends on the "theme vowel" of the verb: either «a», «e», or «i».

«hablar» (speak) is an «a»-vowel verb, with the stem «habl»-:

  • hablar<v><tv><pres><p1><sg> ↔ hablo
  • hablar<v><tv><pres><p2><sg> ↔ hablas
  • hablar<v><tv><pres><p3><sg> ↔ habla
  • hablar<v><tv><pres><p1><pl> ↔ hablamos
  • hablar<v><tv><pres><p2><pl> ↔ habláis
  • hablar<v><tv><pres><p3><pl> ↔ hablan

«comer» (eat) is an «e»-vowel verb, with the stem «com»-:

  • comer<v><tv><pres><p1><sg> ↔ como
  • comer<v><tv><pres><p2><sg> ↔ comes
  • comer<v><tv><pres><p3><sg> ↔ come
  • comer<v><tv><pres><p1><pl> ↔ comemos
  • comer<v><tv><pres><p2><pl> ↔ coméis
  • comer<v><tv><pres><p3><pl> ↔ comen

«escribir» (write) is an «i»-vowel verb, with the stem «escrib»-:

  • escribir<v><tv><pres><p1><sg> ↔ escribo
  • escribir<v><tv><pres><p2><sg> ↔ escribes
  • escribir<v><tv><pres><p3><sg> ↔ escribe
  • escribir<v><tv><pres><p1><pl> ↔ escribimos
  • escribir<v><tv><pres><p2><pl> ↔ escribís
  • escribir<v><tv><pres><p3><pl> ↔ escriben
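
The three paradigms above can be condensed into a small table of endings keyed by theme vowel. The following Python sketch is an illustration only; the names `PRESENT_ENDINGS`, `SLOTS`, and `conjugate_present` are hypothetical, and it assumes the input is a regular verb given as an infinitive ending in «ar», «er», or «ir».

```python
# Present-tense endings per theme vowel, taken from the paradigms above,
# in the order p1.sg, p2.sg, p3.sg, p1.pl, p2.pl, p3.pl.
PRESENT_ENDINGS = {
    "a": ["o", "as", "a", "amos", "áis", "an"],
    "e": ["o", "es", "e", "emos", "éis", "en"],
    "i": ["o", "es", "e", "imos", "ís", "en"],
}

SLOTS = [("p1", "sg"), ("p2", "sg"), ("p3", "sg"),
         ("p1", "pl"), ("p2", "pl"), ("p3", "pl")]


def conjugate_present(infinitive: str, person: str, number: str) -> str:
    """Generate a regular present-tense form from an infinitive."""
    stem, theme = infinitive[:-2], infinitive[-2]
    if not infinitive.endswith("r") or theme not in PRESENT_ENDINGS:
        raise ValueError(f"not a regular -ar/-er/-ir infinitive: {infinitive}")
    return stem + PRESENT_ENDINGS[theme][SLOTS.index((person, number))]


for person, number in SLOTS:
    form = conjugate_present("hablar", person, number)
    print(f"hablar<v><tv><pres><{person}><{number}> ↔ {form}")
```

Irregular and stem-changing verbs would need their own entries; nothing in this sketch handles them.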

Kyrgyz locative case

Kyrgyz nouns (<n>) can be followed by a locative case suffix (<loc>). Locative roughly expresses the same ideas as English "in/at/on".

Plural morphology and possessive morphology may intervene between the noun stem and the locative case suffix. The suffix has eight forms; which one is used is entirely predictable based on the last consonant (if present) and vowel of the material before it, regardless of whether that material is part of the noun stem or other morphology.

The first letter of the suffix is «д» after any voiced sound (vowels, sonorants, and «з») and «т» after a voiceless sound (the remaining consonants). The second letter of the suffix is a vowel: «а», «е», «о», or «ө». «а» occurs if the previous vowel is «а», «ы», «я», «у», or «ю». «е» occurs if the previous vowel is «е», «э», or «и». «о» occurs if the previous vowel is «о» or «ё». «ө» occurs if the previous vowel is «ө» or «ү».

Here is an example of a noun that takes each form of the locative suffix:

  • алма<n><loc> ↔ алмада
  • кол<n><loc> ↔ колдо
  • көз<n><loc> ↔ көздө
  • бел<n><loc> ↔ белде
  • баш<n><loc> ↔ башта
  • чок<n><loc> ↔ чокто
  • күч<n><loc> ↔ күчтө
  • иш<n><loc> ↔ иште
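
The selection of the locative form can be sketched as a two-step lookup: the final sound picks the consonant, and the last vowel picks the suffix vowel. This Python sketch follows the description above and is not a real morphological transducer. The function name `locative` and the constants are hypothetical, the set of sonorant letters is an assumption (the description does not list them), and the input is assumed to be lowercase.

```python
# Suffix vowel chosen by the last vowel of the preceding material.
SUFFIX_VOWEL = {
    "а": "а", "ы": "а", "я": "а", "у": "а", "ю": "а",
    "е": "е", "э": "е", "и": "е",
    "о": "о", "ё": "о",
    "ө": "ө", "ү": "ө",
}

# Assumed inventory of sonorant letters; not given in the description.
SONORANTS = set("лмнңрй")

# Sounds after which the suffix starts with «д»: vowels, sonorants, and «з».
VOICED = set(SUFFIX_VOWEL) | SONORANTS | {"з"}


def locative(stem: str) -> str:
    """Attach the locative suffix to a lowercase stem."""
    consonant = "д" if stem[-1] in VOICED else "т"
    # Scan backwards for the last vowel to decide the suffix vowel.
    for letter in reversed(stem):
        if letter in SUFFIX_VOWEL:
            return stem + consonant + SUFFIX_VOWEL[letter]
    raise ValueError(f"no vowel found in {stem!r}")


for noun in ["алма", "кол", "көз", "бел", "баш", "чок", "күч", "иш"]:
    print(f"{noun}<n><loc> ↔ {locative(noun)}")
```

Because the function only looks at the end of its input, it would apply equally to a stem that already carries plural or possessive morphology.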

Kyrgyz case suffixes

In Kyrgyz, case suffixes can follow a noun stem (<n>) directly, or number and possession morphology may intervene. This section focuses on the system of case suffixes, not on the conditioning environments for the forms they take.

The main case suffixes used in Kyrgyz include the following:

case name | ~meaning | tag | possible forms | алма "apple" | гүл "flower"
nominative | subject | <nom> | (none) | алма<n><nom> ↔ алма | гүл<n><nom> ↔ гүл
accusative | definite direct object | <acc> | ны, ни, ну, нү, ды, ди, ду, дү, ты, ти, ту, тү | алма<n><acc> ↔ алманы | гүл<n><acc> ↔ гүлдү
genitive | possessor | <gen> | нын, нин, нун, нүн, дын, дин, дун, дүн, тын, тин, тун, түн | алма<n><gen> ↔ алманын | гүл<n><gen> ↔ гүлдүн
dative | "to" | <dat> | га, го, ге, гө, ка, ко, ке, кө | алма<n><dat> ↔ алмага | гүл<n><dat> ↔ гүлгө
locative | "at, in, on" | <loc> | да, до, де, дө, та, то, те, тө | алма<n><loc> ↔ алмада | гүл<n><loc> ↔ гүлдө
ablative | "from" | <abl> | дан, дон, ден, дөн, тан, тон, тен, төн | алма<n><abl> ↔ алмадан | гүл<n><abl> ↔ гүлдөн

Malay adjective reduplication

In Malay, reduplication of an adjective (<adj>) can express either adverbialisation (<advl>) or plurality of a corresponding noun (<pl>).

  • keras<adj><advl> ↔ keras-keras ("loud" → "loudly")
  • besar<adj><pl> ↔ besar-besar ("big", referring to a plural noun)
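
One consequence of this pattern is that a single surface form maps to two analyses. The Python sketch below illustrates that ambiguity; the names `reduplicate` and `analyse` are hypothetical, and the sketch assumes that every adjective allows both readings, which the two examples here do not establish.

```python
def reduplicate(adjective: str) -> str:
    """Generate the reduplicated surface form of an adjective."""
    return f"{adjective}-{adjective}"


def analyse(surface: str) -> list[str]:
    """Return candidate analyses for a reduplicated adjective.

    Assumption: both the adverbial and the plural reading are offered
    for every adjective, so this may overgenerate.
    """
    first, hyphen, second = surface.partition("-")
    if hyphen and first == second:
        return [f"{first}<adj><advl>", f"{first}<adj><pl>"]
    return []


print(reduplicate("keras"))
print(analyse("besar-besar"))
```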

for this grammar point, I'd want to see at least a couple more examples (e.g., can "keras-keras" mean "loud (plural)"?), and potentially a sentence or two demonstrating the use

Mandarin personal pronouns

In Mandarin, personal pronouns are distinguished by person (1st, 2nd, 3rd), number (singular, plural), gender in third person (masculine, feminine, neuter), formality in second person (informal, formal), and inclusivity/exclusivity in first person plural (inclusive, exclusive).

Person | Singular | Plural
1st | <prn><pers><p1><sg> ↔ 我 | exclusive: 我们<prn><pers><p1><pl><excl> ↔ 我们; 我们<prn><pers><p1><pl><excl> ← 我們
 | | inclusive: 咱们<prn><pers><p1><pl><incl> ↔ 咱们; 咱们<prn><pers><p1><pl><incl> ← 咱們
2nd, informal | <prn><pers><p2><sg> ↔ 你 | 你们<prn><pers><p2><pl> ↔ 你们; 你们<prn><pers><p2><pl> ← 你們
2nd, formal | <prn><pers><p2><sp><frm> ↔ 您 |
3rd, masculine | <prn><pers><p3><sg><m> ↔ 他 | 他们<prn><pers><p3><pl><m> ↔ 他们; 他们<prn><pers><p3><pl><m> ← 他們
3rd, feminine | <prn><pers><p3><sg><f> ↔ 她 | 她们<prn><pers><p3><pl><f> ↔ 她们; 她们<prn><pers><p3><pl><f> ← 她們
3rd, neuter | <prn><pers><p3><sg><nt> ↔ 它 | 它们<prn><pers><p3><pl><nt> ↔ 它们; 它们<prn><pers><p3><pl><nt> ← 它們
  • Note: while 3rd person pronouns are orthographically distinguished by gender, the pronunciation of the pronouns is the same regardless of gender.

This is one approach to tagging these pronouns, where everything is a category tag. Another approach would treat the plural suffix (consistent throughout) as a grammatical tag, which would mean the lemma for the plural pronouns would change to just the singular element (first character of each). Similarly, gender could be treated as a grammatical tag, and a single lemma could be chosen for the third person pronouns. Taking this to the extreme, person could be a grammatical tag, and then all pronouns would have the same lemma—although deciding what that is might be tricky. In the Apertium English transducer, prpers is used in this way.

English demonstratives

English has four demonstratives, differentiated by proximity (proximal, distal) and number (singular, plural). They may function as determiners and as pronouns.

  • this<det><dem><sg> ↔ this, this<prn><dem><sg> ↔ this
  • that<det><dem><sg> ↔ that, that<prn><dem><sg> ↔ that
  • this<det><dem><pl> ↔ these, this<prn><dem><pl> ↔ these
  • that<det><dem><pl> ↔ those, that<prn><dem><pl> ↔ those
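
The four forms and their two functions can be written as a small lookup. The Python sketch below is an illustration only; the names `DEMONSTRATIVES`, `generate`, and `analyse` are hypothetical. It shows that each surface form receives two analyses, one as a determiner and one as a pronoun.

```python
# Surface forms keyed by (lemma, number); proximity is carried by the lemma.
DEMONSTRATIVES = {
    ("this", "sg"): "this",
    ("that", "sg"): "that",
    ("this", "pl"): "these",
    ("that", "pl"): "those",
}

PARTS_OF_SPEECH = ("det", "prn")


def generate(lemma: str, pos: str, number: str) -> str:
    """Generate the surface form for an analysis such as this<det><dem><pl>."""
    if pos not in PARTS_OF_SPEECH:
        raise ValueError(f"unexpected part of speech: {pos}")
    return DEMONSTRATIVES[(lemma, number)]


def analyse(surface: str) -> list[str]:
    """Return every analysis whose surface form matches."""
    return [
        f"{lemma}<{pos}><dem><{number}>"
        for (lemma, number), form in DEMONSTRATIVES.items()
        if form == surface
        for pos in PARTS_OF_SPEECH
    ]


print(generate("that", "det", "pl"))
print(analyse("these"))
```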

Note that proximity isn't indicated in the tags—that's because it can be distinguished based on the lemma. However, it would be perfectly fine to have tags for that, just like pronouns have tags for person despite also being able to be distinguished by the lemma. At some level it comes down to what's useful at the level of the syntax—in many languages person is useful to know because it needs to agree (e.g., with finite verb forms or possessed noun forms), but proximity often isn't helpful in this way.

Russian spellrelax

In Russian-language pædagogical materials, accent marks are sometimes written to mark stress; they may also be used to differentiate words which are otherwise identical in spelling. Stress marks should be ignored for the purposes of analysis. Some examples include:

  • молоко<n><nt><nom> ↔ молоко́
  • автобус<n><mi><nom> ↔ авто́бус
  • замок<n><mi><nom> ↔ за́мок

Furthermore, the letter «ё» is almost always written simply as «е», even in normative formal texts, even though the distinction is part of standard Russian orthography:

  • шофёр<n><ma><nom> ↔ шофер
  • жёлтый<adj><m><nom> ↔ желтый
  • счёт<n><mi><nom> ↔ счет
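
Both relaxations can be sketched as a normalisation step applied before lookup. The Python sketch below is an illustration under stated assumptions, not how any particular analyser implements it: the names `relax` and `matches_lemma` are hypothetical, stress is assumed to be written with the combining acute accent (U+0301) after the vowel, and «ё» is assumed to be a single precomposed character.

```python
COMBINING_ACUTE = "\u0301"  # assumed encoding of the stress mark


def relax(text: str) -> str:
    """Remove stress marks and fold «ё» to «е»."""
    without_stress = text.replace(COMBINING_ACUTE, "")
    return without_stress.replace("ё", "е").replace("Ё", "Е")


def matches_lemma(surface: str, lemma: str) -> bool:
    """Check whether a surface form matches a lemma once both are relaxed."""
    return relax(surface) == relax(lemma)


print(relax("молоко\u0301"))
print(matches_lemma("шофер", "шофёр"))
```

Folding in this way loses the distinction between words that differ only in stress or only in «ё» versus «е», which is the intended effect for analysis.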