Hokkien/Transducer

From LING073

Links to Code

Link to GitHub repository

Evaluation

March, 2022

Corpus Coverage

Current coverage over the Bible corpus is about 33% (2075 of 6243 forms analysed).

TOP UNKNOWN WORDS:

  268 ^ê/*ê$
  85 ^kóng/*kóng$
  78 ^lâng/*lâng$
  71 ^kā/*kā$
  64 ^so͘/*so͘$
  64 ^Iâ/*Iâ$
  58 ^sī/*sī$
  54 ^ōe/*ōe$
  49 ^tio̍h/*tio̍h$
  49 ^in/*in$
  48 ^I/*I$
  47 ^chi̍t/*chi̍t$
  44 ^lâi/*lâi$
  44 ^khì/*khì$
  42 ^tī/*tī$
  41 ^m̄/*m̄$
  40 ^chiū/*chiū$
  39 ^ū/*ū$
  36 ^teh/*teh$
  36 ^lín/*lín$

  coverage: 2075/6243 (~0.33237225692775909018)
  remaining unknown forms: 4168
  nan.corpus  Wed 16 Mar 2022 10:10:16 PM EDT  2075/6243  ~0.33237225692775909018
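The figures above can be reproduced from analyser output in Apertium stream format, where each lexical unit is written `^surface/analysis$` and an analysis starting with `*` marks an unknown form. The following is a minimal sketch of that counting, not the script that produced the numbers on this page: the function name `coverage`, the sample string, and the `<n>` tag in it are all assumptions made for illustration, and escaped characters in the stream are not handled.

```python
import re
from collections import Counter

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (simplified: escaped '/', '^' and '$' inside a unit are not handled)
UNIT = re.compile(r"\^([^/$]*)/([^$]*)\$")


def coverage(analysed, top_n=20):
    """Return (known, total, most common unknown forms) for analysed text."""
    known = 0
    total = 0
    unknown = Counter()
    for surface, analyses in UNIT.findall(analysed):
        total += 1
        # A leading '*' means the transducer returned no analysis for this form.
        if analyses.startswith("*"):
            unknown[surface] += 1
        else:
            known += 1
    return known, total, unknown.most_common(top_n)


# Hypothetical sample; the <n> tag is illustrative only.
sample = "^kóng/*kóng$ ^lâng/lâng<n>$ ^kóng/*kóng$ ^sī/*sī$"
known, total, top = coverage(sample)

print(f"coverage: {known}/{total} (~{known / total:.4f})")
print(f"remaining unknown forms: {total - known}")
for form, count in top:
    print(f"{count:5d} ^{form}/*{form}$")
```

Coverage here is simply known forms divided by total forms, so 2075/6243 gives the 0.332 reported above.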

May 24th, 2022

Corpus Coverage

Current coverage over the Bible corpus is about 49% (2854 of 5849 forms analysed, roughly 48.8%).


TOP UNKNOWN WORDS:

  64 ^Iâ-so͘/*Iâ-so͘$
  57 ^sī/*sī$
  49 ^in/*in$
  48 ^tio̍h/*tio̍h$
  47 ^chi̍t/*chi̍t$
  44 ^khì/*khì$
  43 ^lâi/*lâi$
  42 ^tī/*tī$
  41 ^m̄/*m̄$
  40 ^chiū/*chiū$
  36 ^lín/*lín$
  34 ^sî/*sî$
  32 ^ū/*ū$
  32 ^hō͘/*hō͘$
  29 ^tùi/*tùi$
  29 ^beh/*beh$
  28 ^tè/*tè$
  28 ^In/*In$
  27 ^khòaⁿ/*khòaⁿ$
  27 ^kap/*kap$

  coverage: 2854 / 5849 (~0.48794665754829885451)
  remaining unknown forms: 2995
  nan.corpus  24 Jun 2022 03:12:03 PM EDT  2854/5849  ~0.4879466575482
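To compare the two evaluations, the change is best stated in percentage points and in the number of unknown forms. The sketch below uses only the counts reported on this page; the labels `run 1` and `run 2` and the helper `percent` are assumptions for illustration. Note that the totals differ between runs (6243 versus 5849), so the two ratios are not computed over the same number of forms; the page does not say why.

```python
# (known, total) counts as reported in the two coverage lines on this page
runs = {
    "run 1": (2075, 6243),
    "run 2": (2854, 5849),
}


def percent(known, total):
    """Coverage as a percentage of all forms."""
    return 100.0 * known / total


for label, (known, total) in runs.items():
    print(f"{label}: {percent(known, total):.1f}% covered, {total - known} unknown forms")

# Change between runs, in percentage points and in unknown forms
gain = percent(*runs["run 2"]) - percent(*runs["run 1"])
fewer_unknown = (6243 - 2075) - (5849 - 2854)
print(f"gain: {gain:.1f} percentage points, {fewer_unknown} fewer unknown forms")
```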

Notes