Abstract
A vastly under-explored area in speech anonymization involves characterizing how different speakers perform in voice privacy tasks. In this paper, we present a deeper analysis by creating and analyzing groups of challenging speakers categorized based on their performance in two related facets of voice anonymization evaluation: (1) speaker similarity using automatic speaker verification (ASV) and (2) human perception using a large-scale A/B listening test. We group speakers into four categories (sheep, goats, lambs, and wolves) based on their anonymization properties. We present an extension of voice anonymization evaluation by identifying speakers who are easy to imitate or difficult to recognize. This knowledge is important for trustworthy anonymization evaluation, and it has the potential to influence how evaluation datasets are created from a pool of speakers. We provide further insights on speaker influence on anonymized speech between human perception and automatic speaker similarity scoring.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Publisher | IEEE |
Publication date | 2024 |
Pages | 12491-12495 |
ISBN (Print) | 979-8-3503-4486-8 |
ISBN (Electronic) | 979-8-3503-4485-1 |
Publication status | Published - 2024 |
Event | 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) - Seoul, Korea, Republic of. Duration: 14 Apr 2024 → 19 Apr 2024 |
Conference
Conference | 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
---|---|
Country/Territory | Korea, Republic of |
City | Seoul |
Period | 14/04/2024 → 19/04/2024 |
Keywords
- Anonymization perception
- Speaker characterization
- Voice anonymization