FMPP is a general-purpose text file preprocessor that uses FreeMarker templates. It is designed primarily as an HTML preprocessor for generating complete (static) homepages: directory structures that contain HTML files, image files, etc.
Ucto - A rule-based tokeniser
Centre for Language and Speech Technology, Radboud University Nijmegen
Induction of Linguistic Knowledge Research Group, Tilburg University
Website: :
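Ucto tokenises text files: it separates words from punctuation and splits sentences. This is one of the first tasks in almost any natural language processing application. Ucto offers several other basic preprocessing steps, such as changing case, all of which you can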