Difference between revisions of "User:Qfeng1/Final project"
From LING073
(Created page with "Our project is to expand what we have accomplished in class, the monolingual transducer for the Chechen language. We are aiming at a goal of over 85% coverage rate over the l...") |
Line 8: | Line 8: | ||
==Evaluation== | ==Evaluation== | ||
− | {| | + | {| class="wikitable" |
|# of stems in transducer | |# of stems in transducer | ||
|# of disambiguation rules | |# of disambiguation rules |
Revision as of 22:30, 13 May 2019
Our project expands on what we accomplished in class: the monolingual transducer for the Chechen language. Our goal is a coverage rate of over 85% on the large corpus we extracted from Chechen-language Wikipedia pages.
Major Steps
- To expand the morphology (in the lexc and twol files)
- To generate a frequency-ranked list of the unknown words in the corpus and add the missing stems to the lexc file
- To measure the level of ambiguity in the large corpus and write more disambiguation rules to increase the accuracy of the tagger.
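The measurements behind these steps can be sketched in a few lines of Python. This is only an illustration, not the project's actual tooling: it assumes the analysed corpus is in Apertium stream format (`^surface/analysis1/analysis2$`, with unknown words marked by a leading `*`), the function name `corpus_stats` is hypothetical, and the sample tokens are made-up placeholders rather than real Chechen analyses.

```python
import re
from collections import Counter

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def corpus_stats(analysed_text, top_n=10):
    """Return (coverage, mean analyses per known token, top unknown words).

    Hypothetical helper: coverage is the share of tokens that received at
    least one analysis; ambiguity is the average number of analyses per
    analysed token.
    """
    total = 0
    known = 0
    analyses_of_known = 0
    unknown = Counter()
    for surface, readings in LEXICAL_UNIT.findall(analysed_text):
        total += 1
        if readings.startswith("*"):
            # The analyser marks unknown words as ^form/*form$
            unknown[surface.lower()] += 1
        else:
            known += 1
            analyses_of_known += len(readings.split("/"))
    coverage = known / total if total else 0.0
    ambiguity = analyses_of_known / known if known else 0.0
    return coverage, ambiguity, unknown.most_common(top_n)


# Placeholder input (not real Chechen): two known tokens, one of them
# ambiguous, and one unknown token that occurs twice.
sample = "^foo/foo<n><sg>$ ^bar/*bar$ ^baz/baz<n>/baz<v>$ ^bar/*bar$"
coverage, ambiguity, top_unknown = corpus_stats(sample)
print(f"coverage: {coverage:.1%}, analyses per known token: {ambiguity:.2f}")
print("most frequent unknown words:", top_unknown)
```

The unknown-word list is sorted by frequency so that the stems added to the lexc file first are the ones that raise coverage the most.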
Evaluation
# of stems in transducer | # of disambiguation rules | # of words in Wiki corpus | Coverage Rate | Precision | Recall |
Bread | Pie |