Difference between revisions of "Urum/Transducer"

From LING073
(Evaluation)
(Notes)
Line 92: Line 92:
  
 
'''What tests still don't work and why'''
 
* '''Accusative Case''': we still have no idea why these are not working correctly. To look for the cause, we checked each individual generation by grepping the hfst-expand output for the correct form, and found it in every single instance. Why apertium-regtest does not register these as "gold" is still unclear. Perhaps there is an issue with the Unicode of the tests on the wiki (will check later).
 
 
* '''Locative Case''': some additional steps are needed to account for the dropping of [м] from possession. This will need to be ironed out later.
 
* '''Dative Case ''', Same as above.
+
* '''Possessiveness''': similar to the issue in the Locative Case. These forms are generated correctly and match the gold when checked by grepping, but for some reason the tests do not register them. This does not make sense, since grepping should provide an accurate snapshot of what is actually being generated.
* '''Possessiveness''': similar to the issue in the Accusative Case. These forms are generated correctly and match the gold when checked by grepping, but for some reason the tests do not register them. This does not make sense, since grepping should provide an accurate snapshot of what is actually being generated.
 
* '''Plurality ''', Same as above.
 
* '''Verbs Present Tense ''', Same as above.
 
  
 
'''To be revised and other comments (might resolve the issues mentioned above!)'''
 

Revision as of 00:26, 17 March 2023

This is the page for our Urum transducer.

Evaluation

Total number of stems in the transducer

  • Lexicons: 17
  • Lexicon entries: 107
  • Patterns: 7
  • Pattern entries: 14
  • Counts for individual lexicons:
    • Punctuation: 22
    • V-Root: 10
    • TransitivityTag: 2
    • VPres: 6
    • VPresNeg: 6
    • VPast: 6
    • N-Root: 27
    • Plurality: 1
    • Possession: 6
    • Cases: 6
    • Numeral: 10
    • All anonymous lexicons: 5

Current coverage over your combined corpus

  • coverage: 333 / 1178 (~0.28268251273344651952)
  • remaining unknown forms: 845
  • ambiguity: 1341 / 1178 (~1.13837011884550084890)
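
As a quick sanity check on the figures above, the ratios can be recomputed from the raw counts. This is only a sketch: it assumes that 333 is the number of corpus tokens that received an analysis, 1178 the total number of tokens, and 1341 the total number of analyses returned. The variable names are ours, not part of any tool.

```python
# Counts copied from the coverage report above.
# Assumption: 1341 is the total number of analyses over all 1178 tokens.
known_tokens = 333
total_tokens = 1178
total_analyses = 1341

coverage = known_tokens / total_tokens       # share of tokens with an analysis
unknown_tokens = total_tokens - known_tokens # tokens still without an analysis
ambiguity = total_analyses / total_tokens    # average analyses per token

print(f"coverage:  {coverage:.4f}")
print(f"unknown:   {unknown_tokens}")
print(f"ambiguity: {ambiguity:.4f}")
```

The unknown count derived this way agrees with the 845 remaining unknown forms reported above.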

The current list of top unknown words returned by aq-covtest

TOP UNKNOWN WORDS:

    13 *
    11 ^д/*д$
    11 ^ве/*ве$
    10 ^дэ/*дэ$
    10 ^да/*да$
     8 ^эй/*эй$
     7 ^бен/*бен$
     6 ^олсун/*олсун$
     6 ^махлуклар/*махлуклар$
     6 ^йар/*йар$
     6 ^и/*и$
     6 ^бир/*бир$
     5 ^эм/*эм$
     5 ^т`и/*т`и$
     5 ^илен/*илен$
     5 ^доғду/*доғду$
     5 ^бу/*бу$
     5 ^Христос/*Христос$
     4 ^эт/*эт$
     4 ^эди/*эди$

Number of tests that pass from regtests in test/

Corpus 1 of 10: Plurality-morph

 8/8 (100.0%) tests pass (8/8 (100.0%) match gold)

Corpus 2 of 10: Verbs Present Tense-morph

 10/11 (90.91%) tests pass (10/10 (100.0%) match gold)

Corpus 3 of 10: Possession-morph

 5/7 (71.43%) tests pass (5/5 (100.0%) match gold)

Corpus 4 of 10: Accusative Case-morph

 3/3 (100.0%) tests pass (3/3 (100.0%) match gold)

Corpus 5 of 10: Dative Case-morph

 4/4 (100.0%) tests pass (4/4 (100.0%) match gold)

Corpus 6 of 10: Locative Case-morph

 4/6 (66.67%) tests pass (2/4 (50.0%) match gold)

Corpus 7 of 10: Genitive Case-morph

 3/3 (100.0%) tests pass (3/3 (100.0%) match gold)

Corpus 8 of 10: Ablative Case-morph

 4/4 (100.0%) tests pass (4/4 (100.0%) match gold)

Corpus 9 of 10: Past Tense-morph

 6/6 (100.0%) tests pass (6/6 (100.0%) match gold)

Corpus 10 of 10: Verb Negation-morph

 4/4 (100.0%) tests pass (4/4 (100.0%) match gold)


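Not all tests pass yet: Verbs Present Tense (10/11), Possession (5/7) and Locative Case (4/6) still have failing tests.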
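
The per-corpus result lines above can be tallied with a short script to see at a glance which corpora still have failures. This is a rough sketch: the regular expression is written against the lines as pasted on this page, not against any documented apertium-regtest output format, and the names `RESULTS`, `parse` and `tally` are hypothetical.

```python
import re

# Result lines copied from the regtest summary on this page.
RESULTS = {
    "Plurality": "8/8 (100.0%) tests pass (8/8 (100.0%) match gold)",
    "Verbs Present Tense": "10/11 (90.91%) tests pass (10/10 (100.0%) match gold)",
    "Possession": "5/7 (71.43%) tests pass (5/5 (100.0%) match gold)",
    "Accusative Case": "3/3 (100.0%) tests pass (3/3 (100.0%) match gold)",
    "Dative Case": "4/4 (100.0%) tests pass (4/4 (100.0%) match gold)",
    "Locative Case": "4/6 (66.67%) tests pass (2/4 (50.0%) match gold)",
    "Genitive Case": "3/3 (100.0%) tests pass (3/3 (100.0%) match gold)",
    "Ablative Case": "4/4 (100.0%) tests pass (4/4 (100.0%) match gold)",
    "Past Tense": "6/6 (100.0%) tests pass (6/6 (100.0%) match gold)",
    "Verb Negation": "4/4 (100.0%) tests pass (4/4 (100.0%) match gold)",
}

# Assumed line shape: "P/T (x%) tests pass (G/GT (y%) match gold)".
LINE_RE = re.compile(
    r"(\d+)/(\d+) \([\d.]+%\) tests pass "
    r"\((\d+)/(\d+) \([\d.]+%\) match gold\)"
)


def parse(line):
    """Return (passed, total, gold, gold_total) for one result line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised result line: {line!r}")
    passed, total, gold, gold_total = map(int, match.groups())
    return passed, total, gold, gold_total


def tally(results):
    """Sum passes and totals, and collect corpora with failing tests."""
    passed_sum = total_sum = 0
    failing = {}
    for name, line in results.items():
        passed, total, _, _ = parse(line)
        passed_sum += passed
        total_sum += total
        if passed < total:
            failing[name] = (passed, total)
    return passed_sum, total_sum, failing


passed_sum, total_sum, failing = tally(RESULTS)
print(f"{passed_sum}/{total_sum} tests pass")
for name, (passed, total) in failing.items():
    print(f"  still failing: {name} ({passed}/{total})")
```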

Notes

What tests still don't work and why

  • Locative Case: some additional steps are needed to account for the dropping of [м] from possession. This will need to be ironed out later.
  • Possessiveness: similar to the issue in the Locative Case. These forms are generated correctly and match the gold when checked by grepping, but for some reason the tests do not register them. This does not make sense, since grepping should provide an accurate snapshot of what is actually being generated.
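
One possible cause of forms that look right but do not register (an assumption on our part, not a confirmed diagnosis) is a Unicode mismatch between the generated form and the gold form, for example a Latin letter that looks like a Cyrillic one, or a precomposed versus decomposed character. The sketch below compares two strings code point by code point. The function names and the sample strings are hypothetical illustrations, not actual test data.

```python
import unicodedata


def codepoint(char):
    """Describe one character as 'U+XXXX NAME'."""
    return f"U+{ord(char):04X} {unicodedata.name(char, 'UNKNOWN')}"


def compare_forms(generated, gold):
    """Explain how two visually similar strings differ, if at all."""
    if generated == gold:
        return "identical"
    # Same text once both sides are normalized to composed form (NFC).
    if unicodedata.normalize("NFC", generated) == unicodedata.normalize("NFC", gold):
        return "differ only in Unicode normalization"
    # Otherwise report the first position where the code points differ.
    for index, (g, h) in enumerate(zip(generated, gold)):
        if g != h:
            return f"position {index}: {codepoint(g)} vs {codepoint(h)}"
    return "one form is a prefix of the other"


# Hypothetical examples: the second string of each pair is written with
# escapes so that the hidden difference is visible in the source.
print(compare_forms("бен", "б\u0065н"))      # Cyrillic е vs Latin e
print(compare_forms("йар", "и\u0306ар"))     # precomposed й vs и + combining breve
```

Running a check like this on a form copied from the test file and on the same form copied from the generator output would show whether the two are byte-for-byte the same text.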

To be revised and other comments (might resolve the issues mentioned above!)

  • Verb negation <neg> should be just a -м{A} suffix that comes before the verb ending. I have added notes on this to lexd and am waiting for approval from Sasha/JNW. After that, lexd will have to be edited accordingly, and the order of tags in Urum/Grammar changed (from pres-neg-person to pres-person-neg).
  • Something may not be working because we redefined {A} in twol in the process. It used to be {A}:0 by default; now it is the "first type of vowel harmony".
  • Alina is currently reviewing and editing Urum/Grammar. There are some issues with Locative Case + Possession that I would like to ask JNW about.