DARIO: Differentiable vision transformer pruning with low-cost proxies
Item Links
URI: http://hdl.handle.net/10818/63362
Visit link: https://www.scopus.com/inward/ ...
DOI: 10.1109/JSTSP.2024.3501685
Bibliographic cataloging
Date
2024

Abstract
Transformer models have gained popularity for their exceptional performance. However, these models still suffer from high inference latency. To improve their computational efficiency, we propose a novel differentiable pruning method called DARIO (DifferentiAble vision transformer pRunIng with low-cost prOxies). Our approach optimizes a set of gating parameters using differentiable, data-agnostic, scale-invariant, and low-cost performance proxies. Because DARIO is data-agnostic, it requires no classification heads during pruning. We evaluated DARIO on two popular state-of-the-art pre-trained ViT models, covering both large (MAE-ViT) and small (MobileViT) sizes. Extensive experiments across 40 diverse datasets demonstrate the effectiveness and efficiency of DARIO: it not only significantly improves inference efficiency on modern hardware but also excels at preserving accuracy. Notably, DARIO even achieves an accuracy increase on MobileViT, despite fine-tuning only the last block and the classification head.
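The gating mechanism the abstract describes can be illustrated with a minimal sketch: per-head gates parameterized by logits and trained by gradient descent against a surrogate objective, then hardened into a prune/keep decision. The importance scores and the quadratic surrogate below are hypothetical stand-ins, not the paper's actual low-cost proxies.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical per-head importance scores: stand-ins for the data-agnostic,
# scale-invariant proxies the paper uses (the real proxies are not shown here).
importance = [0.9, 0.05, 0.7, 0.02, 0.4]

theta = [0.0] * len(importance)  # one differentiable gate logit per attention head
lam, lr = 0.3, 0.5               # sparsity pressure, learning rate

for _ in range(500):
    for i, imp in enumerate(importance):
        g = sigmoid(theta[i])
        # Surrogate objective per head: imp * (1 - g)^2 + lam * g
        # (keep important heads open, pay a cost for every open gate)
        grad_g = -2.0 * imp * (1.0 - g) + lam
        theta[i] -= lr * grad_g * g * (1.0 - g)  # chain rule through the sigmoid

keep = [sigmoid(t) > 0.5 for t in theta]  # harden soft gates into prune/keep
print(keep)  # low-importance heads end up gated off
```

With these toy numbers, the three higher-importance heads survive and the two near-zero ones are pruned; the point is only that the gates remain differentiable throughout training and are discretized once at the end.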
Location
IEEE Journal on Selected Topics in Signal Processing
Collections to which it belongs
- pruebas_A1 [130]