Hokkien and English

From LING073
Revision as of 22:30, 1 June 2022 by Pzhang2 (Resources Developed for Hokkien)

10 Sentences
  • (nan) I chit-má bô teh tha̍k Gô-bûn--ah. → (eng) He is no longer studying Russian.
  • output of lexical transfer: ^I<prn>/He<prn>$ ^*chit/*chit$-^*má/*má$ ^bô<adv>/not<adv>$ ^teh<v>/is<vblex>$ ^*tha/*tha$̍^*k/*k$ ^Gô-bûn<n>/Russian language<n>$--^*ah/*ah$^.<sent>/.<sent>$^.<sent>/.<sent>$
  • output of biltrans: #He #is #not #Russian language
  • (nan) Góa kin-á-jı̍t ê kong-khò í-keng ôan-sêng--ah. → (eng) I've already finished today's homework.
  • output of lexical transfer: ^Góa<prn>/I<prn>$ ^*kin/*kin$-^*á/*á$-^*jı/*jı$̍^*t/*t$ ^*ê/*ê$ ^kong-khò<n>/homework<n>$ ^í-keng<adv>/already<adv>$ ^ôan-sêng<v>/finish<vblex>$--^*ah/*ah$^.<sent>/.<sent>$^.<sent>/.<sent>$
  • output of biltrans: #I #homework #already #finish.


  • (nan) Hit chhut tiān-iá góa jú-lâi-jú siu-beh khòa. → (eng) I would like to see that movie more and more.
  • output of lexical transfer: ^*Hit/*Hit$ ^*chhut/*chhut$ ^tiān-iá<n>/movie<n>$ ^góa<prn>/I<prn>$ ^*jú/*jú$-^*lâi/*lâi$-^*jú/*jú$ ^siu-beh<v>/want<vblex>$ ^*khòa/*khòa$^.<sent>/.<sent>$^.<sent>/.<sent>$
  • output of biltrans: #movie #I #want
  • (nan) Góa jú-lâi-jú kah-ì lí ê pêng-iú. → (eng) I like your friend more and more.
  • output of lexical transfer: ^Góa<prn>/I<prn>$ ^*jú/*jú$-^*lâi/*lâi$-^*jú/*jú$ ^kah-ì<v>/like<vblex>$ ^lí<prn>/you<prn>$ ^*ê/*ê$ ^pêng-iú<n>/friend<n>$^.<sent>/.<sent>$^.<sent>/.<sent>$
  • output of biltrans: #I #like #you #friend
  • (nan) Àm-tǹg chia̍h liáu, góa ū teh lim tê. → (eng) I drink tea after dinner.
  • output of lexical transfer: ^Àm-tǹg<n>/dinner<n>$ ^*chia/*chia$̍^*h/*h$ ^liáu<adv>/after<adv>$^,<cm>/,<cm>$ ^góa<prn>/I<prn>$ ^*ū/*ū$ ^teh<v>/is<vblex>$ ^lim<v>/drink<vblex>$ ^*tê/*tê$^.<sent>/.<sent>$^.<sent>/.<sent>$
  • output of biltrans: #dinner #after #I #is #drink
  • (nan) Âng-e-á ū teh kóng ōe! → (eng) The baby is talking!
  • output of lexical transfer: ^Âng-e-á<n>/baby<n>$ ^*ū/*ū$ ^teh<v>/is<vblex>$ ^kóng<v>/talk<vblex>$ ^ōe<det>/are<det>$^!<sent>/!<sent>$^.<sent>/.<sent>$
  • output of biltrans: #baby #is #talk #are
  • (nan) Chúi-chi-teng àn chhù-téng tiàu--leh. → (eng) Crystal lights are hanging from the ceiling.
  • output of lexical transfer: ^*Chúi/*Chúi$-^*chi/*chi$-^teng<n>/lights<n>$ ^àn<pr>/from<pr>$ ^chhù-téng<n>/ceiling<n>$ ^tiàu<v>/hang<vblex>$--^leh<det>/are<det>$^.<sent>/.<sent>$^.<sent>/.<sent>$
  • output of biltrans: #lights #from #ceiling #hang #are
  • (nan) I tī chhia thâu-chêng khiā-leh. → (eng) She is standing in front of the car.
  • output of lexical transfer: ^I<prn>/He<prn>$ ^*tī/*tī$ ^chhia<n>/car<n>$ ^thâu<pr>/in<pr>$-^chêng<n>/front<n>$ ^khiā<v>/stand<vblex>$-^leh<det>/are<det>$^.<sent>/.<sent>$^.<sent>/.<sent>$
  • output of biltrans: #He #car #in #front #stand #are


  • (nan) Ha̍k-seng chheh the̍h--leh. → (eng) The students are holding their books.
  • output of lexical transfer: ^*Ha/*Ha$̍^*k/*k$-^*seng/*seng$ ^chheh<n>/books<n>$ ^*the/*the$̍^*h/*h$--^leh<det>/are<det>$^.<sent>/.<sent>$^.<sent>/.<sent>$
  • output of biltrans: #books #the #are
  • (nan) Mn̂g lóng koai--leh. → (eng) All the doors are closed.
  • output of lexical transfer: ^*Mn/*Mn$̂^*g/*g$ ^lóng<adv>/all<adv>$ ^koai<v>/close<vblex>$--^leh<det>/are<det>$^.<sent>/.<sent>$^.<sent>/.<sent>$
  • output of biltrans: #all #close #are
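
The lexical transfer outputs above all use the same `^source/target$` notation, with a leading `*` on forms the dictionaries did not recognise. As a rough sketch of how those lines could be inspected programmatically, the snippet below splits one line into units and flags the unknown ones. The names `UNIT` and `parse_units` are hypothetical helpers, not part of any Apertium tool, and the pattern is simplified: it assumes one target per unit and no escaped `^`, `/`, or `$` characters.

```python
import re

# One lexical unit of the form ^source/target$, as in the outputs listed above.
# Simplifying assumption: no escaped '^', '/' or '$' inside a unit, and exactly
# one target per unit.
UNIT = re.compile(r"\^([^/^$]*)/([^/^$]*)\$")


def parse_units(line):
    """Return (source, target, known) triples for each ^source/target$ unit."""
    units = []
    for source, target in UNIT.findall(line):
        # A leading '*' marks a form that was not found in the dictionary.
        known = not source.startswith("*")
        units.append((source, target, known))
    return units


# Fragment taken from the second example sentence above.
sample = "^Góa<prn>/I<prn>$ ^*ê/*ê$ ^kong-khò<n>/homework<n>$"
for source, target, known in parse_units(sample):
    print(source, "->", target, "" if known else "(unknown)")
```

Run over every line, a listing like this makes it easier to see which source words (for example `ê`) still need dictionary entries.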

Evaluation

  • Coverage on Monolingual Transducer : 697 / 4756 (~14.7%)
    • TOP UNKNOWN WORDS:
   254 ^ê/*ê$
    85 ^kóng/*kóng$
    78 ^lâng/*lâng$
    71 ^kā/*kā$
    64 ^so͘/*so͘$
    64 ^Iâ/*Iâ$
    58 ^sī/*sī$
    54 ^ōe/*ōe$
    49 ^tio̍h/*tio̍h$
    49 ^in/*in$
    47 ^chi̍t/*chi̍t$
    44 ^lâi/*lâi$
    44 ^khì/*khì$
    42 ^tī/*tī$
    41 ^m̄/*m̄$
    40 ^chiū/*chiū$
    39 ^ū/*ū$
    36 ^sî/*sî$
    36 ^lín/*lín$
    35 ^góa/*góa$

coverage: 697 / 4756 (~0.14655172413793103448) remaining unknown forms: 4059
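
The coverage figure is the number of known forms divided by the total number of forms, and the unknown-word list is a frequency count of the forms whose analysis starts with `*`. The following is a minimal sketch of that calculation, assuming the analysed text is available as a string of `^surface/analysis$` tokens. The function `coverage`, the pattern `TOKEN`, and the four-token `sample_stream` are illustrative assumptions, not the script that produced the numbers on this page.

```python
import re
from collections import Counter

# One analysed token: ^surface/analysis$. An analysis starting with '*' means
# the transducer had no entry for the surface form. The analysis part may
# itself contain '/' when a form has several readings.
TOKEN = re.compile(r"\^([^/^$]*)/([^^$]*)\$")


def coverage(stream):
    """Return (known, total, Counter of unknown surface forms)."""
    known = 0
    total = 0
    unknown = Counter()
    for surface, analysis in TOKEN.findall(stream):
        total += 1
        if analysis.startswith("*"):
            unknown[surface] += 1
        else:
            known += 1
    return known, total, unknown


# Hypothetical four-token input, for illustration only.
sample_stream = "^Góa/Góa<prn>$ ^ê/*ê$ ^kong-khò/kong-khò<n>$ ^ê/*ê$"
known, total, unknown = coverage(sample_stream)
print(f"coverage: {known} / {total} ({known / total:.1%})")
for form, count in unknown.most_common(20):
    print(f"{count:6d} ^{form}/*{form}$")

# Arithmetic check of the counts reported on this page.
print(f"monolingual: {697 / 4756:.1%}, remaining unknown forms: {4756 - 697}")
print(f"bilingual:   {428 / 2717:.1%}")
```

Counting surface forms rather than tokens with analyses keeps the unknown list in the same shape as the one above, so entries can be compared directly.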


  • Coverage on Bilingual Transducer : 428 / 2717 (~15.8%)
    • TOP UNKNOWN WORDS:
   156 ^ê/*ê$
    40 ^lâng/*lâng$
    40 ^kóng/*kóng$
    36 ^sī/*sī$
    36 ^ōe/*ōe$
    32 ^tio̍h/*tio̍h$
    29 ^kā/*kā$
    26 ^m̄/*m̄$
    25 ^ū/*ū$
    25 ^so͘/*so͘$
    25 ^in/*in$
    25 ^hō͘/*hō͘$
    25 ^Iâ/*Iâ$
    24 ^tī/*tī$
    24 ^khì/*khì$
    23 ^lâi/*lâi$
    22 ^tùi/*tùi$
    22 ^sî/*sî$
    22 ^chi̍t/*chi̍t$
    20 ^tè/*tè$

coverage: 428 / 2717 (~0.15752668384247331616)