#portuguese
Toten: Ontological Tokenization for Technical Portuguese
Toten is a knowledge-based tokenization framework designed to accurately parse physical quantities and technical notation in Brazilian Portuguese, addressing common failures in standard NLP tokenizers.
arXiv cs.AI
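To make the failure mode in the summary concrete, here is a minimal sketch, not Toten's actual method: it contrasts a naive regex tokenizer with one that keeps a number and its unit together. The example sentence, the `QUANTITY` pattern, and both function names are hypothetical and chosen only for illustration; the summary above does not describe how Toten itself is implemented.

```python
import re

# Hypothetical sentence in technical Brazilian Portuguese (decimal comma, compound unit).
TEXT = "A bomba opera a 3,5 kgf/cm² e 60 Hz."


def naive_tokenize(text):
    """Split into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


# Illustrative pattern (an assumption, not taken from the paper):
# integer or decimal-comma number, optional space, a unit, an optional
# "/unit" part, and an optional superscript exponent.
QUANTITY = re.compile(r"\d+(?:,\d+)?\s?[A-Za-zµ°Ω]+(?:/[A-Za-z]+)?[²³]?")


def quantity_aware_tokenize(text):
    """Keep each physical quantity as one token; tokenize the rest naively."""
    tokens = []
    pos = 0
    for match in QUANTITY.finditer(text):
        tokens.extend(naive_tokenize(text[pos:match.start()]))
        tokens.append(match.group())
        pos = match.end()
    tokens.extend(naive_tokenize(text[pos:]))
    return tokens


if __name__ == "__main__":
    print(naive_tokenize(TEXT))
    print(quantity_aware_tokenize(TEXT))
```

The naive tokenizer splits "3,5" at the decimal comma and separates the number from its unit, while the quantity-aware version returns "3,5 kgf/cm²" and "60 Hz" as single tokens. A knowledge-based approach would presumably replace the hand-written pattern with a curated inventory of units and notation.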