DARIO: Differentiable vision transformer pruning with low-cost proxies
Item links
URI: http://hdl.handle.net/10818/63362
Link: https://www.scopus.com/inward/ ...
DOI: 10.1109/JSTSP.2024.3501685
Date
2024
Abstract
Transformer models have gained popularity for their exceptional performance. However, these models still suffer from high inference latency. To improve their computational efficiency, we propose a novel differentiable pruning method called DARIO (DifferentiAble vision transformer pRunIng with low-cost prOxies). Our approach optimizes a set of gating parameters using differentiable, data-agnostic, scale-invariant, and low-cost performance proxies. Because DARIO is data-agnostic, it does not require any classification heads during pruning. We evaluated DARIO on two popular state-of-the-art pre-trained ViT models, covering both a large (MAE-ViT) and a small (MobileViT) architecture. Extensive experiments across 40 diverse datasets demonstrate the effectiveness and efficiency of DARIO: it not only significantly improves inference efficiency on modern hardware but also excels at preserving accuracy. Notably, DARIO even achieves an increase in accuracy on MobileViT, despite fine-tuning only the last block and the classification head.
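The abstract only outlines the mechanism, so the following PyTorch snippet is a minimal, hypothetical sketch of what gate-based differentiable pruning driven by a data-agnostic proxy can look like. The gate placement (MLP hidden channels), the SynFlow-style all-ones proxy, and all hyper-parameters are illustrative assumptions, not DARIO's actual proxies or training recipe.

# Hypothetical sketch: differentiable channel gates optimized with a
# data-agnostic proxy (SynFlow-style score on an all-ones input).
import torch
import torch.nn as nn

class GatedMLPBlock(nn.Module):
    """Transformer MLP block whose hidden channels are scaled by learnable gates."""
    def __init__(self, dim=64, hidden=256):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)
        self.fc2 = nn.Linear(hidden, dim)
        # One gate logit per hidden channel; sigmoid keeps gates in (0, 1).
        self.gate_logits = nn.Parameter(torch.zeros(hidden))

    def forward(self, x):
        g = torch.sigmoid(self.gate_logits)      # soft, differentiable gates
        return self.fc2(torch.relu(self.fc1(x)) * g)

def proxy_score(model, dim=64):
    """Data-agnostic stand-in proxy: output magnitude on an all-ones input.
    Differentiable w.r.t. the gate logits, so it can drive gate optimization."""
    ones = torch.ones(1, 16, dim)                # no real data or labels needed
    return model(ones).abs().sum()

block = GatedMLPBlock()
# Only the gate logits are optimized; pre-trained weights stay frozen.
for name, p in block.named_parameters():
    p.requires_grad_(name == "gate_logits")
opt = torch.optim.Adam([block.gate_logits], lr=0.1)

sparsity_weight = 0.01
for step in range(100):
    opt.zero_grad()
    gates = torch.sigmoid(block.gate_logits)
    # Maximize the proxy (log-scaled to balance terms) while pushing gates
    # toward zero so that unimportant channels can be pruned afterwards.
    loss = -torch.log(proxy_score(block) + 1e-8) + sparsity_weight * gates.sum()
    loss.backward()
    opt.step()

# Channels whose gates collapse below a threshold would be removed after training.
keep = torch.sigmoid(block.gate_logits) > 0.5
print(f"kept {int(keep.sum())} / {keep.numel()} hidden channels")

In this sketch, only the gating parameters receive gradients, mirroring the abstract's point that pruning proceeds without task data or classification heads; a real pipeline would then physically remove the gated channels and fine-tune.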
Location
IEEE Journal on Selected Topics in Signal Processing
Collections this item belongs to
- pruebas_A1 [130]