attention-matrix
Here are 3 public repositories matching this topic...
[CVPR 2021] Official PyTorch implementation of "Transformer Interpretability Beyond Attention Visualization", a novel method for visualizing classifications made by Transformer-based networks.
Updated Jan 24, 2024 - Jupyter Notebook
TensorFlow implementation of Graphical Attention Recurrent Neural Networks based on work by Cirstea et al., 2019.
Updated Jan 2, 2020 - Python
Attention Saver extracts entire attention matrices or row-wise statistics (e.g. entropy) from any layer of a HuggingFace causal LLM. It works with ultra-long contexts under flash-attention without running out of GPU memory.
Updated Oct 6, 2025 - Python
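To make "row-wise statistics (e.g. entropy)" concrete, here is a minimal pure-Python sketch of the entropy of each row of a causal attention matrix. It is not Attention Saver's API or implementation; the function names `softmax` and `causal_attention_row_entropies` and the toy `scores` input are hypothetical, chosen only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of raw scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention_row_entropies(scores):
    """Return the Shannon entropy (in nats) of each attention row.

    `scores` is a square list of lists of raw (pre-softmax) attention
    scores. Under a causal mask, query i attends only to keys 0..i.
    Hypothetical helper for illustration, not a library function.
    """
    entropies = []
    for i, row in enumerate(scores):
        probs = softmax(row[: i + 1])  # apply the causal mask by slicing
        entropies.append(-sum(p * math.log(p) for p in probs if p > 0.0))
    return entropies


# Toy input: all-equal scores give uniform attention over the visible keys.
scores = [[0.0, 0.0, 0.0] for _ in range(3)]
entropies = causal_attention_row_entropies(scores)
```

With uniform attention, row i has entropy log(i + 1), so the first row is 0 and later rows grow with the number of visible keys. Because each row is reduced to a single number as soon as it is computed, this sketch never needs to keep the full matrix of probabilities around, which is the general appeal of row-wise statistics for long contexts.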