Grammar documentation

From LING073

Notes

What we're concerned with

  • Any alternation of forms of a given word based on their syntactic or phonological environment
    • Phonological and [functional] morphological alternations
  • Any categorisation schema relevant to the lexicon
    • noun/verb classes, pronoun features, etc.
    • can have bearing on what morphology is taken, or what syntactic arguments are allowed

To consider

  • Are there any irregular forms in the language? What about the pronouns: are their alternations identical to those of nouns?
  • Is there any sort of agreement morphology for person/number/etc on verbs or nouns?
  • How are tense/mood/aspect/evidentiality marked in the language? Do the verbs change form? Are there auxiliary verbs or other particle-like words that might be analysed as part of the morphology instead of as separate words? Do transitive and intransitive verbs take different morphology, or different syntactic arguments?
  • Do nouns change form for different numbers (e.g., singular/plural)? Do they change form based on how they are used in the sentence? Are they lexically specified for class (e.g., masculine/feminine, or a larger set of classes)? Do all nouns take the same set of forms?
  • Do adjectives behave like nouns in terms of number/case/etc.? How are comparatives formed?
  • What properties of personal pronouns are distinguished? Most languages have at least 3 person distinctions (1st, 2nd, 3rd) and many have number distinctions as well (singular, plural). Are other things distinguished, like an additional person or number, or the relative social status of speaker and hearer (for all pronouns or just 2nd person?)?
  • What about demonstrative and interrogative pronouns?
  • Are there any productive phonological alternations in the language?
  • Make sure you state what each piece of morphology is used for.

Examples

  • You could document something like the plural pattern(s) of English. Give the regular form, the predictable alternations, and a list of irregular forms.
  • You could document something like a single tense conjugation of Spanish. Mention that the theme vowel determines what the set of endings is, and list the endings for each person/number.
  • spellrelax. If there is a list of common spelling alternatives that you want to interpret as a given standard, listing those (with some explanation) can count as one grammar point. For example, if certain accent marks are considered proper but most people don't use them, then you'll want to interpret characters without these accent marks as equivalent to the accented ones.
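The English plural example above can be sketched in code to show how one grammar point breaks down into a regular form, predictable alternations, and a list of irregular forms. This is only a minimal illustration: the function names, the tiny irregular list, and the simplified spelling rules are assumptions made for this sketch, and the tags <n> and <pl> are used here on the assumption that they match Apertium conventions.

```python
from __future__ import annotations

# Irregular plurals: a small, illustrative (not exhaustive) list.
IRREGULAR_PLURALS = {
    "child": "children",
    "foot": "feet",
    "mouse": "mice",
    "sheep": "sheep",
}

SIBILANT_ENDINGS = ("s", "x", "z", "ch", "sh")
VOWELS = "aeiou"


def pluralise(lemma: str) -> str:
    """Return the plural of an English noun lemma (simplified rules)."""
    # 1. Irregular forms are listed, not derived.
    if lemma in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[lemma]
    # 2. Predictable alternation: -es after sibilant-final spellings.
    if lemma.endswith(SIBILANT_ENDINGS):
        return lemma + "es"
    # 3. Predictable alternation: consonant + y becomes -ies.
    if len(lemma) >= 2 and lemma.endswith("y") and lemma[-2] not in VOWELS:
        return lemma[:-1] + "ies"
    # 4. Regular form: -s.
    return lemma + "s"


def morph_test_pair(lemma: str) -> tuple[str, str]:
    """Build an (analysis, form) pair of the kind a morphTest records."""
    return (f"{lemma}<n><pl>", pluralise(lemma))
```

For instance, morph_test_pair("city") gives the analysis city<n><pl> paired with the form cities. The rules are deliberately incomplete (words like "quiz" or "potato" are not handled), which is the kind of gap a written grammar point should mention.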

The assignment

This assignment is due before lab on Thursday of the 4th week of class (this semester, 11:20 on Thursday, February 9th, 2017).

  1. Determine what the main parts of speech are in your language. There are going to be some open classes, like nouns, verbs, and adjectives, and some [relatively more] closed classes like prepositions or pronouns. Create one section on a Language/Grammar page on the wiki outlining the main parts of speech and any subcategories, providing computational POS symbols (or tags) for each one that are compatible with the ones used by the Apertium project. Give an example or two of each class and subclass using the {{morphTest}} template as it's used on the examples page.
  2. Find any set of alternations (as described above) in your language. For each one, write one new section describing this grammar point. Provide some examples. You'll have to make preliminary decisions about what the base form is, what tags you should be using, etc. Some examples are available. You should have at least ten grammar points in all.
    • If you're working on a polysynthetic language, you may have a lot of options to sort out in order to choose discrete grammar points, and if you're working on a more isolating language, there may not be a lot of morphology points to choose from. If you need easier grammar points or just more grammar points, then feel free to create sections for some of the examples listed above that aren't on the examples page. Include these in an "Other" section, since they won't be relevant for your language. If you're working on a polysynthetic language, though, please limit yourself to two easier grammar points from other languages.
    • As mentioned above, you can list spellrelax mappings and count that as one grammar point.
    • If you identify a dominant pattern (like x when A and y when B), and are also able to document a number of exceptions, the exceptions can count as a second grammar point. Even if there are four dominant patterns, though, if it's the same process it can only count as two grammar points.
    • Each grammar point should have at least three examples using the {{morphTest}} template.
  3. Add the page to the category Grammar documentation and also a category for your language. Add a link to this new page to the main language page, under the section for resources developed in this class.
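The spellrelax grammar point mentioned in the list above can be illustrated with a short sketch that treats an unaccented spelling as a possible stand-in for an accented standard form. This is not Apertium's spellrelax file format. It only shows the idea, the function names are made up for this example, and the word list is hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks, so an accented letter becomes its base letter."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def build_relaxed_index(standard_forms):
    """Map each accent-free spelling to the standard forms it may stand for."""
    index = {}
    for form in standard_forms:
        index.setdefault(strip_accents(form), set()).add(form)
    return index


def interpret(written, index):
    """Return the standard forms that a (possibly unaccented) spelling could represent."""
    return sorted(index.get(strip_accents(written), set()))


# Hypothetical list of standard spellings, for illustration only.
index = build_relaxed_index(["café", "papa", "papá"])
print(interpret("cafe", index))  # one candidate: the accented standard form
print(interpret("papa", index))  # two candidates: the spelling is ambiguous
```

Note that relaxing spelling this way can introduce ambiguity, as the second call shows. A written grammar point should say which alternatives are accepted and why.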

Sanity checks

  • There should be at least 50 morphTests.
    • You can have examples of each part of speech tag in the initial section.
  • Each morphTest should have an analysis on the left and a form on the right.
    • The analysis should have a stem (or "lemma"), a main categorisation tag (e.g., <v>), any sub-categorisation tags (e.g., <iv>), and any morphology tags (e.g., <past>).
    • The morphological forms should be proper orthographic forms of the language (i.e., native orthography, not grammar book orthography). There should be no dashes in the forms, no extraneous quotation marks, no English glosses, etc. inside the morphTest template. You can have these things in notes outside the template.
  • Make sure you use the same tag consistently throughout the page: e.g., you don't want <v>, <vb>, and <vblex> all used for verbs. Choose one and stick with it.
  • There should be minimal use of non-productive morphology, such as derivations. An example of this might be infinite<adj><→n> ↔ infinity, since this same process can't be applied freely to any adjective. In some languages, derivation of this sort is entirely or almost entirely productive, in which case this is fine.
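Some of the sanity checks above can be partly automated. The sketch below assumes you have already collected your tests as (analysis, form) pairs. It does not read the wiki page or parse the {{morphTest}} template itself, and all names in it are hypothetical. It checks the shape of each analysis (a lemma followed by one or more tags), flags dashes and double quotation marks in forms, and counts main category tags so that inconsistent choices such as <v> next to <vblex> are easy to spot by eye.

```python
import re
from collections import Counter

# A lemma (no angle brackets) followed by one or more <tag> groups.
ANALYSIS_RE = re.compile(r"(?P<lemma>[^<>]+)(?P<tags>(?:<[^<>]+>)+)")
TAG_RE = re.compile(r"<([^<>]+)>")
FORBIDDEN_IN_FORM = set('-"“”')  # dashes and double quotation marks
MIN_TESTS = 50


def parse_analysis(analysis):
    """Split 'lemma<tag1><tag2>' into (lemma, [tag1, tag2]); None if malformed."""
    match = ANALYSIS_RE.fullmatch(analysis)
    if match is None:
        return None
    return match.group("lemma"), TAG_RE.findall(match.group("tags"))


def check_tests(tests):
    """Return a list of human-readable problems for (analysis, form) pairs."""
    problems = []
    if len(tests) < MIN_TESTS:
        problems.append(f"only {len(tests)} tests; at least {MIN_TESTS} expected")
    for analysis, form in tests:
        if parse_analysis(analysis) is None:
            problems.append(f"malformed analysis: {analysis!r}")
        if not form or any(ch in FORBIDDEN_IN_FORM for ch in form):
            problems.append(f"suspicious form: {form!r}")
    return problems


def main_tag_counts(tests):
    """Count the first (main category) tag of each well-formed analysis."""
    counts = Counter()
    for analysis, _form in tests:
        parsed = parse_analysis(analysis)
        if parsed is not None:
            counts[parsed[1][0]] += 1
    return counts
```

A script like this cannot tell whether a form is in native orthography, whether it contains an English gloss, or whether a derivation is productive, so those checks still need to be done by hand.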