
Rule-based tokenizer V1

Uses spaCy language rules as a base and extends them with custom regex patterns and additional infix rules to support specific tokenization cases.
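
As a rough illustration of the approach described above, the sketch below starts from spaCy's default English infix rules and appends one extra regex. It is not the code from this merge request: the function name `build_nlp`, the choice of a blank English pipeline, and the underscore rule are all assumptions made for the example.

```python
import spacy
from spacy.util import compile_infix_regex

# Hypothetical extra infix rule (an assumption for illustration):
# split on an underscore that sits between two ASCII letters.
CUSTOM_INFIXES = [r"(?<=[a-zA-Z])_(?=[a-zA-Z])"]


def build_nlp():
    """Return a blank English pipeline whose tokenizer uses extended infix rules."""
    nlp = spacy.blank("en")
    # Keep spaCy's language defaults as the base and append the custom patterns.
    infixes = list(nlp.Defaults.infixes) + CUSTOM_INFIXES
    infix_re = compile_infix_regex(infixes)
    nlp.tokenizer.infix_finditer = infix_re.finditer
    return nlp


nlp = build_nlp()
tokens = [t.text for t in nlp("foo_bar")]
print(tokens)
```

Appending to `nlp.Defaults.infixes` rather than replacing it keeps the stock language behaviour intact, so only the cases covered by the added patterns should tokenize differently.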
