Opinion · Jun 17
Attention Sinks in Diffusion Transformers: A Causal Analysis
Researchers have conducted a causal analysis of the role attention sinks play in diffusion transformer models. While these high-attention tokens are well understood in language models, this study asks whether they serve a similarly critical function in image generation systems.
Covered by 1 source
- arXiv CS.AI: Fangzheng Wu, Brian Summa, Jun 17