DARIO: Differentiable vision transformer pruning with low-cost proxies
Item links
URI: http://hdl.handle.net/10818/63362
Link: https://www.scopus.com/inward/ ...
DOI: 10.1109/JSTSP.2024.3501685
Date
2024
Abstract
Transformer models have gained popularity for their exceptional performance. However, these models still suffer from high inference latency. To improve their computational efficiency, we propose a novel differentiable pruning method called DARIO (DifferentiAble vision transformer pRunIng with low-cost prOxies). Our approach optimizes a set of gating parameters using differentiable, data-agnostic, scale-invariant, and low-cost performance proxies. Because DARIO is data-agnostic, it does not require any classification heads during pruning. We evaluated DARIO on two popular state-of-the-art pre-trained ViT models, covering both a large (MAE-ViT) and a small (MobileViT) architecture. Extensive experiments across 40 diverse datasets demonstrate the effectiveness and efficiency of DARIO: it not only significantly improves inference efficiency on modern hardware but also excels at preserving accuracy. Notably, DARIO even achieves an increase in accuracy on MobileViT, despite fine-tuning only the last block and the classification head.
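The abstract only outlines the mechanism, so the following PyTorch snippet is a minimal, hypothetical sketch of what gate-based differentiable pruning driven by a data-agnostic proxy can look like. The gate placement (MLP hidden channels), the SynFlow-style all-ones proxy, and all hyper-parameters are illustrative assumptions, not DARIO's actual proxies or training recipe.

# Hypothetical sketch: differentiable channel gates optimized with a
# data-agnostic proxy (SynFlow-style score on an all-ones input).
import torch
import torch.nn as nn

class GatedMLPBlock(nn.Module):
    """Transformer MLP block whose hidden channels are scaled by learnable gates."""
    def __init__(self, dim=64, hidden=256):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)
        self.fc2 = nn.Linear(hidden, dim)
        # One gate logit per hidden channel; sigmoid keeps gates in (0, 1).
        self.gate_logits = nn.Parameter(torch.zeros(hidden))

    def forward(self, x):
        g = torch.sigmoid(self.gate_logits)      # soft, differentiable gates
        return self.fc2(torch.relu(self.fc1(x)) * g)

def proxy_score(model, dim=64):
    """Data-agnostic stand-in proxy: output magnitude on an all-ones input.
    Differentiable w.r.t. the gate logits, so it can drive gate optimization."""
    ones = torch.ones(1, 16, dim)                # no real data or labels needed
    return model(ones).abs().sum()

block = GatedMLPBlock()
# Only the gate logits are optimized; pre-trained weights stay frozen.
for name, p in block.named_parameters():
    p.requires_grad_(name == "gate_logits")
opt = torch.optim.Adam([block.gate_logits], lr=0.1)

sparsity_weight = 0.01
for step in range(100):
    opt.zero_grad()
    gates = torch.sigmoid(block.gate_logits)
    # Maximize the proxy (log-scaled to balance terms) while pushing gates
    # toward zero so that unimportant channels can be pruned afterwards.
    loss = -torch.log(proxy_score(block) + 1e-8) + sparsity_weight * gates.sum()
    loss.backward()
    opt.step()

# Channels whose gates collapse below a threshold would be removed after training.
keep = torch.sigmoid(block.gate_logits) > 0.5
print(f"kept {int(keep.sum())} / {keep.numel()} hidden channels")

In this sketch, only the gating parameters receive gradients, mirroring the abstract's point that pruning proceeds without task data or classification heads; a real pipeline would then physically remove the gated channels and fine-tune.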
Location
IEEE Journal on Selected Topics in Signal Processing
Collections this item belongs to
- pruebas_A1 [130]