Conclusion:
Visual Transformers are a testament to the versatility of the Transformer architecture, showing that it is effective well beyond textual data. As the computer vision community continues to explore this direction, we can expect further advances, and perhaps a new state of the art built on models that combine the strengths of both CNNs and Transformers. Computer vision is evolving, and Visual Transformers are at the forefront of that change.
--
Sundar Balamurugan