Model2Vec: Distill a Small Fast Model from any Sentence Transformer

Distill small, fast static models from any Sentence Transformer without needing a dataset.

October 14, 2024 · Thomas van Dongen
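
A minimal sketch of what that distillation looks like in practice, assuming the `distill` entry point of the `model2vec` package; parameter names such as `pca_dims` are recalled from the project's documentation and should be verified against its README:

```python
# Sketch: distill a Sentence Transformer into a small static model,
# assuming the model2vec package's `distill` function.
from model2vec.distill import distill

# No training dataset is needed: the teacher model embeds the tokenizer
# vocabulary once, and the results are stored as a static lookup table.
m2v_model = distill(model_name="BAAI/bge-base-en-v1.5", pca_dims=256)
m2v_model.save_pretrained("m2v_model")

# Encoding is now a fast embedding lookup rather than a forward pass.
embeddings = m2v_model.encode(["It's dangerous to go alone!"])
```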

Demystifying Efficient Self-Attention

A practical overview of efficient attention mechanisms that tackle the quadratic scaling of self-attention with sequence length.

November 7, 2022 · Thomas van Dongen

Overcoming Input Length Constraints of Transformers

Using extractive summarization to train Transformers on long documents efficiently.

December 14, 2021 · Thomas van Dongen