Paper • 2004.05150 • Published • 4
Introduction
Allenai's Longformer Encoder-Decoder (LED).
As described in Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters, Arman Cohan, led-base-16384 was initialized from bart-base since both models share the exact same architecture. To be able to process 16K tokens, bart-base's position embedding matrix was simply copied 16 times.
This model is especially interesting for long-range summarization and question answering.
Fine-tuning for down-stream task
This notebook shows how led-base-16384 can effectively be fine-tuned on a downstream task.
- Downloads last month
- 18,127
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support
