JHU/APL Experiments in Tokenization and Non-Word Translation
CLEF (Working Notes), 2003.
In the past we have conducted experiments that investigate the benefits and peculiarities attendant to alternative methods for tokenization, particularly overlapping character n-grams. This year we continued this line of work and report new findings reaffirming that the judicious use of n-grams can lead to performance surpassing that of w...
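As a rough illustration of what overlapping character n-gram tokenization means, the sketch below slides a window of length n across the text one character at a time. It is not taken from the paper: the function name `char_ngrams`, the default n=4, the lowercasing, the whitespace collapsing, and the single-space padding at each end are all assumptions made for this example.

```python
def char_ngrams(text, n=4):
    """Return the overlapping character n-grams of `text`, in order.

    Illustrative sketch only. The normalization below (lowercasing,
    collapsing whitespace, padding with one space at each end) is an
    assumption for this example, not a description of the authors' system.
    """
    words = text.lower().split()
    if not words:
        return []
    # Keep single spaces between words so n-grams can span word boundaries.
    normalized = " " + " ".join(words) + " "
    if len(normalized) < n:
        # Text shorter than the window: return it as a single token.
        return [normalized]
    # Slide a window of length n forward one character at a time.
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


if __name__ == "__main__":
    for gram in char_ngrams("Juggling"):
        print(repr(gram))
```

Because adjacent n-grams share n-1 characters, a single word contributes several index terms, and morphological variants of a word share many of them without any stemming step.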