Glimpse formation based on local feature contrast and spectro-temporal context

Tobias May, Sarinah Sutojo*, Steven van de Par

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review



The blind segregation of acoustic sources from a mixture of different sounds remains one of the main challenges in the computer-based analysis of audio signals. One approach to this segregation is to divide the audio input into spectro-temporal segments, each of which is assumed to be dominated by a single source. These segments are also referred to as glimpses of the locally dominant source and can be used to reconstruct or analyze the corresponding source signal. This contribution is concerned with the source-independent segmentation of acoustic scenes by extracting glimpses based on locally observable feature contrasts between neighboring time-frequency units. The goal of this data-driven approach is to avoid source-specific assumptions and to achieve greater robustness to unknown acoustic scenes than class-based systems. The presented algorithm combines different acoustic features to derive a map of feature contrasts that indicates onsets and offsets of acoustic sources. Areas enclosed by high contrasts are assumed to exhibit consistent features and thus to originate from the same source. Such regions are then converted into spectro-temporal glimpses by applying two different image segmentation methods (graph-based superpixels and region growing).
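The idea of grouping time-frequency units that are not separated by a high feature contrast can be illustrated with a small sketch. This is not the authors' implementation: the Euclidean distance between neighboring feature vectors, the fixed `threshold`, the 4-neighborhood, and the function names `neighbor_contrast` and `grow_glimpses` are all assumptions chosen for illustration, and only a simple region-growing variant is shown (the graph-based superpixel method is not covered).

```python
from collections import deque

import numpy as np


def neighbor_contrast(features):
    """Contrast between adjacent time-frequency units.

    features has shape (n_freq, n_time, n_feat). The contrast is taken to be
    the Euclidean distance between neighboring feature vectors (an assumption).
    """
    d_freq = np.linalg.norm(np.diff(features, axis=0), axis=-1)  # (n_freq-1, n_time)
    d_time = np.linalg.norm(np.diff(features, axis=1), axis=-1)  # (n_freq, n_time-1)
    return d_freq, d_time


def grow_glimpses(features, threshold):
    """Label regions by growing across neighbor pairs with contrast < threshold."""
    d_freq, d_time = neighbor_contrast(features)
    n_freq, n_time = features.shape[:2]
    labels = np.zeros((n_freq, n_time), dtype=int)  # 0 means "not yet labeled"
    next_label = 0
    for f0 in range(n_freq):
        for t0 in range(n_time):
            if labels[f0, t0]:
                continue
            # Start a new region from the first unlabeled unit in scan order.
            next_label += 1
            labels[f0, t0] = next_label
            queue = deque([(f0, t0)])
            while queue:
                f, t = queue.popleft()
                # Each candidate: neighbor index plus contrast on the shared edge.
                candidates = []
                if f > 0:
                    candidates.append((f - 1, t, d_freq[f - 1, t]))
                if f < n_freq - 1:
                    candidates.append((f + 1, t, d_freq[f, t]))
                if t > 0:
                    candidates.append((f, t - 1, d_time[f, t - 1]))
                if t < n_time - 1:
                    candidates.append((f, t + 1, d_time[f, t]))
                for nf, nt, contrast in candidates:
                    if labels[nf, nt] == 0 and contrast < threshold:
                        labels[nf, nt] = next_label
                        queue.append((nf, nt))
    return labels


# Toy scene: the feature vectors change abruptly at time index 3.
demo_features = np.zeros((4, 6, 2))
demo_features[:, 3:, :] = 1.0
demo_labels = grow_glimpses(demo_features, threshold=0.5)
```

In this toy scene the only high contrast lies between time indices 2 and 3, so the sketch returns two regions, one on each side of that boundary. Real features would be noisy, so a fixed threshold is unlikely to be sufficient in practice.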
Original language: English
Title of host publication: Proceedings of Forum Acusticum 2020
Publisher: European Acoustics Association
Publication date: 2020
Publication status: Published - 2020
Event: Forum Acusticum 2020 - Virtual event
Duration: 7 Dec 2020 to 11 Dec 2020


Conference: Forum Acusticum 2020
Location: Virtual event
