Difference between revisions of "Neo-Aramaic/Transducer"

From LING073
Line 51: Line 51:
 
== Notes ==
 
  
Sadly, only 44 of our 92 tests pass at the moment. We changed many of our lemmas in the .yaml file to match what our transducer produces. We still need to change a few of the lemmas and remove a lot of tags that we don't currently have implemented in our transducer. Also, there are a few roots that we don't have in our transducer yet. As we do these things, we expect more tests to start passing. At some point, we also hope to be able to modify the transducer to re-implement some of the tags we removed and to use the canonical roots of words as the lemmas, a process that will require introducing more archiphonemes.
+
53 of 83 tests pass now. Yay!
  
 
[[Category:Sp18_Transducers]]
 

Revision as of 12:58, 15 March 2018

The code for the transducer can be found in this GitHub repository.

Evaluation

Number of tokenised words in the corpus: 2556

Coverage: 9.66%

Top unknown words in the corpus:

114 ܐ

86 ܕ

66 ܠܹܗ

62 ܡ

54 ܒ

46 ܢ

44 ܘ

44 ܠ

39 ܡܘܼܠܸܕ

36 ܝ

31 ܪ

30 ܡܢ

23 ܵܐ

21 ܗ

20 ܹܐ

18 ܠܐ

17 ܫ

14 ܵ

13 ܚ

12 ܡܕܝܢܬܐ
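
The coverage figure and the list of top unknown words above could be produced along the following lines. This is only a minimal sketch, not the script actually used for this page: the function name `coverage_report` and the `is_known` callback are hypothetical, and it assumes the corpus has already been tokenised and that some other step can tell us whether the transducer returns an analysis for a given token.

```python
from collections import Counter


def coverage_report(tokens, is_known, top_n=20):
    """Return (coverage_percent, top_unknown) for a list of surface tokens.

    tokens   -- list of tokenised words from the corpus
    is_known -- hypothetical callback: True if the transducer analyses the token
    top_n    -- how many of the most frequent unknown tokens to report
    """
    # Count every token the transducer cannot analyse.
    unknown = Counter(tok for tok in tokens if not is_known(tok))
    total = len(tokens)
    n_unknown = sum(unknown.values())
    # Coverage is the share of tokens that received at least one analysis.
    percent = 100.0 * (total - n_unknown) / total if total else 0.0
    return percent, unknown.most_common(top_n)


if __name__ == "__main__":
    # Toy data for illustration only; these are not tokens from the corpus.
    toy_tokens = ["a", "b", "a", "c"]
    pct, top = coverage_report(toy_tokens, lambda tok: tok == "c")
    print(f"Coverage: {pct:.2f}%")
    for form, count in top:
        print(count, form)
```

With real data, `tokens` would hold the 2556 tokenised words and `is_known` would wrap a lookup against the compiled transducer. The frequency-sorted output has the same shape as the list above.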

Notes

53 of 83 tests pass now. Yay!