Transformers list and codes

A Length-Extrapolatable Transformer
Paper: https://arxiv.org/abs/2212.10554
Code: https://github.com/microsoft/torchscale (also https://github.com/sunyt32/torchscale)

Longformer: The Long-Document Transformer
Paper: https://arxiv.org/pdf/2004.05150.pdf
Code: https://github.com/allenai/longformer

Efficient Content-Based Sparse Attention with Routing Transformers
Paper: https://arxiv.org/pdf/2003.05997.pdf
Code: https://github.com/google-research/google-research/tree/master/routing_transformer
