- tags: Transformers, Computer vision, ViT
- paper: (Hatamizadeh et al. 2022)
Architecture
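Global Context ViT (GC ViT) is a hierarchical variant of ViT that alternates two kinds of self-attention. Local attention is computed among the tokens inside each window. Global attention uses query tokens derived from the whole feature map and shared across all windows, which attend to each window's local keys and values.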
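To make the local/global split concrete, here is a minimal NumPy sketch, not the paper's implementation. It assumes a single attention head, no positional bias, and no residual or MLP layers. The function names (`window_partition`, `global_query`, and so on) are hypothetical, and average pooling stands in for the paper's learned query generator.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def window_partition(x, w):
    """Split an (H, W, C) feature map into (num_windows, w*w, C) token groups."""
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)

def attention(q, k, v):
    """Scaled dot-product attention over the last two axes."""
    scale = q.shape[-1] ** -0.5
    return softmax(q @ k.swapaxes(-1, -2) * scale) @ v

def local_attention(x, w, Wq, Wk, Wv):
    """Queries, keys and values all come from the same window."""
    win = window_partition(x, w)
    return attention(win @ Wq, win @ Wk, win @ Wv)

def global_query(x, w, Wq):
    """Assumption: average-pool the whole map down to w x w tokens, then project."""
    H, W, C = x.shape
    pooled = x.reshape(w, H // w, w, W // w, C).mean(axis=(1, 3))
    return pooled.reshape(w * w, C) @ Wq

def global_attention(x, w, q_global, Wk, Wv):
    """One shared set of global queries attends to every window's keys and values."""
    win = window_partition(x, w)
    return attention(q_global[None], win @ Wk, win @ Wv)

# Toy example: 8x8 feature map, 4 channels, 4x4 windows (4 windows of 16 tokens).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))

local_out = local_attention(x, 4, Wq, Wk, Wv)
q_g = global_query(x, 4, Wq)
global_out = global_attention(x, 4, q_g, Wk, Wv)
print(local_out.shape, global_out.shape)
```

The only difference between the two paths is where the queries come from: per-window tokens in the local case, a single pooled summary of the full map in the global case. That is how each window can see image-wide context without computing attention over all token pairs.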
Parameter count
90M (the GC ViT-B variant; the paper also defines smaller and larger variants)
Bibliography
- Ali Hatamizadeh, Hongxu Yin, Jan Kautz, Pavlo Molchanov. 2022. "Global Context Vision Transformers". arXiv.