Difference between revisions of "User:Qfeng1/Final project"

From LING073
 

Revision as of 22:30, 13 May 2019

Our project expands the monolingual transducer for the Chechen language that we built in class. Our goal is a coverage rate of over 85% on the large corpus we extracted from Chechen-language Wikipedia pages.

Major Steps

  • To expand the morphology (in the lexc and twol files)
  • To generate a list of the most frequent unknown words in the corpus and add the corresponding stems to the lexc file
  • To measure the level of ambiguity in the large corpus and write more disambiguation rules to increase the accuracy of the tagger
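
The coverage, unknown-word, and ambiguity measurements in the steps above could be sketched roughly as follows. This is only an illustration, not our actual script: it assumes the corpus has already been run through the analyser and that the output is in Apertium stream format, where each token looks like ^surface/analysis1/analysis2$ and an unknown word is marked with an asterisk, as in ^surface/*surface$. The function name coverage_report and the sample string are hypothetical.

```python
import re
from collections import Counter

# Assumption: analyser output is in Apertium stream format.
# Known token:   ^surface/analysis1/analysis2$
# Unknown token: ^surface/*surface$
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage_report(analysed_text, top_n=10):
    """Summarise coverage, ambiguity, and the most frequent unknown words."""
    total = 0
    ambiguous = 0
    unknown = Counter()
    for surface, analyses in LEXICAL_UNIT.findall(analysed_text):
        total += 1
        readings = analyses.split("/")
        if readings[0].startswith("*"):
            # The analyser had no analysis for this form.
            unknown[surface] += 1
        elif len(readings) > 1:
            # More than one reading: a candidate for a disambiguation rule.
            ambiguous += 1
    known = total - sum(unknown.values())
    return {
        "tokens": total,
        # Share of tokens that received at least one analysis.
        "coverage": known / total if total else 0.0,
        # Share of known tokens that have more than one reading.
        "ambiguity": ambiguous / known if known else 0.0,
        "top_unknown": unknown.most_common(top_n),
    }


if __name__ == "__main__":
    # Hypothetical sample with placeholder forms and tags, not real corpus data.
    sample = "^a/a<n>$ ^b/*b$ ^c/c<n>/c<v>$ ^b/*b$"
    print(coverage_report(sample))
```

The top_unknown list is what would feed the second step: the most frequent unanalysed forms are the ones whose stems are worth adding to the lexc file first.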

Evaluation

# of stems in transducer | # of disambiguation rules | # of words in Wiki corpus | Coverage Rate | Precision | Recall
Bread Pie
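
For the Precision and Recall columns, one possible way to compute the figures is sketched below. It assumes a hand-annotated gold sample aligned token by token with the tagger output, where each token carries a set of analyses; this setup and the name precision_recall are assumptions for illustration, not a description of an evaluation we have already run.

```python
def precision_recall(predicted, gold):
    """Compare tagger output with a gold standard, token by token.

    Both arguments are equal-length lists; each item is the set of
    analyses for one token (assumed alignment, hypothetical format).
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must be aligned token by token")
    true_pos = false_pos = false_neg = 0
    for pred, ref in zip(predicted, gold):
        pred, ref = set(pred), set(ref)
        true_pos += len(pred & ref)   # analyses the tagger kept correctly
        false_pos += len(pred - ref)  # analyses kept but not in the gold set
        false_neg += len(ref - pred)  # gold analyses the tagger dropped
    kept = true_pos + false_pos
    expected = true_pos + false_neg
    precision = true_pos / kept if kept else 0.0
    recall = true_pos / expected if expected else 0.0
    return precision, recall


if __name__ == "__main__":
    # Placeholder analyses, not real Chechen data.
    predicted = [{"a<n>"}, {"b<v>", "b<n>"}]
    gold = [{"a<n>"}, {"b<n>"}]
    print(precision_recall(predicted, gold))
```

Under this definition, leaving an ambiguous token undisambiguated lowers precision but not recall, while a rule that removes the correct reading lowers recall.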

Further Improvement