Khasi/Final Project
From LING073
Pre-Final Project
Number of tokenised words in the corpus: 57847
Coverage: 57.26%
Top unknown words in the corpus:
206 kam
200 kum
179 Bah
172 kiwei
171 bynta
170 baroh
166 lah
159 pat
147 mynta
130 noh
125 paidbah
124 ne
122 ïoh
119 por
118 wan
118 Shillong
117 namar
117 Khasi
112 katei
111 Jylla
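For reference, figures like these can be recomputed from analyser output. Coverage is the share of tokens that receive at least one analysis, so 57.26% of 57847 tokens corresponds to roughly 33,100 known tokens. The following is a minimal sketch, not the script used for this page. It assumes the analysed corpus is in Apertium stream format (`^surface/analysis$`, with unknown words marked by a leading `*`). The function name `coverage_report` and the sample tag `<det>` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (assumed input format; unknown words are assumed to look like ^kam/*kam$).
LEXICAL_UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage_report(analysed_text, top_n=20):
    """Return (total tokens, coverage in percent, most frequent unknown words)."""
    total = 0
    unknown = Counter()
    for surface, analyses in LEXICAL_UNIT.findall(analysed_text):
        total += 1
        # A leading '*' in the analysis field marks a word the analyser does not know.
        if analyses.startswith("*"):
            unknown[surface] += 1
    known = total - sum(unknown.values())
    coverage = 100.0 * known / total if total else 0.0
    return total, coverage, unknown.most_common(top_n)


if __name__ == "__main__":
    # Hypothetical sample: one analysed token and three unknown tokens.
    sample = "^ka/ka<det>$ ^kam/*kam$ ^kum/*kum$ ^kam/*kam$"
    total, coverage, top = coverage_report(sample)
    print(f"Number of tokenised words in the corpus: {total}")
    print(f"Coverage: {coverage:.2f}%")
    print("Top unknown words in the corpus:")
    for word, count in top:
        print(f"{count} {word}")
```

The frequency list is the practical part of the report: adding the most frequent unknown words to the lexicon first raises coverage fastest.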