Convexity Based Pruning of Speech Representation Models

Teresa Dorszewski*, Lenka Tetkova, Lars Kai Hansen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

Speech representation models based on the transformer architecture and trained by self-supervised learning have shown great promise for solving tasks such as speech and speaker recognition, keyword spotting, emotion detection, and more. Typically, it is found that larger models lead to better performance. However, the significant computational effort involved in such large transformer systems is a challenge for embedded and real-world applications. Recent work has shown that there is significant redundancy in transformer models for NLP and that massive layer pruning is feasible (Sajjad et al., 2023). Here, we investigate layer pruning in audio models. We base the pruning decision on a convexity criterion. Convexity of classification regions has recently been proposed as an indicator of subsequent fine-tuning performance in a range of application domains, including NLP and audio. In empirical investigations, we find a massive reduction in the computational effort with no loss of performance, or even improvements in certain cases.
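The abstract describes pruning a transformer's upper layers once a layer-wise quality score (here, convexity of classification regions) plateaus. A minimal sketch of that idea is shown below; the selection rule, tolerance, and toy scores are hypothetical illustrations, not the paper's actual criterion or values.

```python
import numpy as np

def select_prune_depth(convexity_scores, tolerance=0.01):
    """Pick the earliest layer whose score is within `tolerance` of the
    best layer-wise score (hypothetical plateau-detection rule)."""
    scores = np.asarray(convexity_scores, dtype=float)
    best = scores.max()
    for i, s in enumerate(scores):
        if s >= best - tolerance:
            return i + 1  # keep layers 1..i+1, prune the rest
    return len(scores)

def prune_layers(layers, keep):
    """Layer pruning: drop every transformer layer after index `keep`."""
    return layers[:keep]

# Toy example: per-layer convexity that plateaus after layer 4 of 8.
scores = [0.40, 0.55, 0.70, 0.80, 0.81, 0.81, 0.80, 0.79]
keep = select_prune_depth(scores, tolerance=0.02)
layers = [f"layer_{i}" for i in range(1, 9)]
pruned = prune_layers(layers, keep)
print(keep, len(pruned))  # 4 4
```

In a real speech model (e.g. a wav2vec2-style encoder), `layers` would be the encoder's stack of transformer blocks, and truncating it removes the forward-pass cost of all pruned layers.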
Original language: English
Title of host publication: Proceedings of the 2024 IEEE 34th International Workshop on Machine Learning for Signal Processing (MLSP)
Number of pages: 6
Publisher: IEEE
Publication date: 2024
ISBN (Print): 979-8-3503-7225-0
ISBN (Electronic): 979-8-3503-7226-7
DOIs
Publication status: Published - 2024
Event: 2024 IEEE 34th International Workshop on Machine Learning for Signal Processing - London, United Kingdom
Duration: 22 Sept 2024 - 25 Sept 2024

Workshop

Workshop: 2024 IEEE 34th International Workshop on Machine Learning for Signal Processing
Country/Territory: United Kingdom
City: London
Period: 22/09/2024 - 25/09/2024
Series: IEEE International Workshop on Machine Learning for Signal Processing
ISSN: 2161-0371

Keywords

  • Convexity
  • Network pruning
  • Self-supervised learning
  • Speech representation learning
