#knowledge-distillation
NVIDIA's X-Token: Solving Cross-Tokenizer Knowledge Distillation
X-Token is a projection-based method for cross-tokenizer knowledge distillation. It eliminates the harmful partitioning found in previous state-of-the-art methods and outperforms GOLD by 3.82 points on Llama-3.2-1B.
MarkTechPost