Transformers list and code
A Length-Extrapolatable Transformer
https://arxiv.org/abs/2212.10554
https://github.com/sunyt32/torchscale
Longformer: The Long-Document Transformer
https://github.com/allenai/longformer
https://arxiv.org/pdf/2004.05150.pdf
A Length-Extrapolatable Transformer
https://arxiv.org/pdf/2212.10554.pdf
https://github.com/microsoft/torchscale
Efficient Content-Based Sparse Attention with Routing Transformers
https://arxiv.org/pdf/2003.05997.pdf
https://github.com/google-research/google-research/tree/master/routing_transformer