Hokkien/Transducer
From LING073
Links to Code
Evaluation
March, 2022
Corpus Coverage
As of this evaluation, coverage over the Bible corpus is about 33% (2075 of 6243 forms analysed).
TOP UNKNOWN WORDS:
268 ^ê/*ê$
85 ^kóng/*kóng$
78 ^lâng/*lâng$
71 ^kā/*kā$
64 ^so͘/*so͘$
64 ^Iâ/*Iâ$
58 ^sī/*sī$
54 ^ōe/*ōe$
49 ^tio̍h/*tio̍h$
49 ^in/*in$
48 ^I/*I$
47 ^chi̍t/*chi̍t$
44 ^lâi/*lâi$
44 ^khì/*khì$
42 ^tī/*tī$
41 ^m̄/*m̄$
40 ^chiū/*chiū$
39 ^ū/*ū$
36 ^teh/*teh$
36 ^lín/*lín$
coverage: 2075/6243 (~0.33237225692775909018)
remaining unknown forms: 4168
nan.corpus Wed 16 Mar 2022 10:10:16 PM EDT 2075/6243 ~0.33237225692775909018
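The summary figures follow directly from the two counts in the report. The sketch below reproduces that arithmetic; the function name `coverage` is illustrative and is not taken from the project's own scripts.

```python
def coverage(known, total):
    """Fraction of corpus forms that received an analysis."""
    if total <= 0:
        raise ValueError("total must be positive")
    return known / total

# Counts taken from the March 2022 report.
known, total = 2075, 6243

print(f"coverage: {known}/{total} (~{coverage(known, total):.4f})")
print(f"remaining unknown forms: {total - known}")
# prints:
# coverage: 2075/6243 (~0.3324)
# remaining unknown forms: 4168
```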
May 24th, 2022
Corpus Coverage
As of this evaluation, coverage over the Bible corpus is about 48.8% (2854 of 5849 forms analysed).
TOP UNKNOWN WORDS:
64 ^Iâ-so͘/*Iâ-so͘$
57 ^sī/*sī$
49 ^in/*in$
48 ^tio̍h/*tio̍h$
47 ^chi̍t/*chi̍t$
44 ^khì/*khì$
43 ^lâi/*lâi$
42 ^tī/*tī$
41 ^m̄/*m̄$
40 ^chiū/*chiū$
36 ^lín/*lín$
34 ^sî/*sî$
32 ^ū/*ū$
32 ^hō͘/*hō͘$
29 ^tùi/*tùi$
29 ^beh/*beh$
28 ^tè/*tè$
28 ^In/*In$
27 ^khòaⁿ/*khòaⁿ$
27 ^kap/*kap$
coverage: 2854 / 5849 (~0.48794665754829885451)
remaining unknown forms: 2995
nan.corpus 24 Jun 2022 03:12:03 PM EDT 2854/5849 ~0.4879466575482
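To compare two evaluations, the unknown-word entries can be parsed into counts and checked for forms that no longer appear. This is a minimal sketch: it assumes each entry has the shape `COUNT ^form/*form$` seen in the lists above, and the helper names `parse_unknown` and `dropped_from_list` are hypothetical. A form missing from the later top-20 list has not necessarily gained an analysis; it may only have fallen below the cutoff.

```python
import re

# One entry looks like "268 ^ê/*ê$": a count, the surface form,
# then the same form prefixed with "*" to mark it as unknown.
ENTRY_RE = re.compile(r"(\d+)\s+\^([^/$]+)/\*[^$]+\$")

def parse_unknown(text):
    """Return {surface form: count} for every entry found in text."""
    return {m.group(2): int(m.group(1)) for m in ENTRY_RE.finditer(text)}

def dropped_from_list(earlier, later):
    """Forms present in the earlier list but absent from the later one,
    ordered by descending count in the earlier list."""
    gone = [form for form in earlier if form not in later]
    return sorted(gone, key=lambda form: -earlier[form])

# A few entries copied from the two reports above.
march = parse_unknown("268 ^ê/*ê$ 85 ^kóng/*kóng$ 58 ^sī/*sī$ 49 ^in/*in$")
may = parse_unknown("57 ^sī/*sī$ 49 ^in/*in$ 34 ^sî/*sî$")

print(dropped_from_list(march, may))
```

Run over the full lists, the same comparison gives a quick view of which high-frequency forms left the list between the two evaluations.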