Hasty Briefsbeta

Bilingual

Dynamic Chunking for End-to-End Hierarchical Sequence Modeling

10 months ago
  • #natural language processing
  • #end-to-end models
  • #machine learning
  • Introduces dynamic chunking mechanism for end-to-end hierarchical sequence modeling.
  • Replaces tokenization-LM-detokenization pipeline with a single H-Net model.
  • H-Net outperforms Transformer models at byte level and scales better with data.
  • Shows increased character-level robustness and learns meaningful chunking strategies without supervision.
  • Demonstrates significant improvements in languages and modalities with weaker tokenization heuristics.