In the present study, speech intelligibility (SI) was evaluated in realistic yet controlled conditions. “Critical sound scenarios” were defined as acoustic scenes that hearing-aid users rated as important, difficult, and common in an ecological momentary assessment. These sound scenarios were recorded in the real world using a spherical microphone array and reproduced inside a loudspeaker-based virtual sound environment (VSE) using Ambisonics. Speech reception thresholds (SRTs) were measured for normal-hearing (NH) and hearing-impaired (HI) listeners, using sentences from the Danish Hearing in Noise Test that were spatially embedded in the acoustic background of an office meeting sound scenario. In addition, speech recognition scores (SRSs) were obtained at a fixed signal-to-noise ratio (SNR) of −2.5 dB, corresponding to the median conversational SNR in the office meeting. SRTs measured in the realistic VSE-reproduced background were significantly higher for both NH and HI listeners than those obtained with artificial noise presented over headphones, presumably due to an increased amount of modulation masking and the greater cognitive effort required to separate the target speech from the intelligible interferers in the realistic background. SRSs obtained at the fixed SNR in the realistic background could be used to relate the listeners' SI to the potential challenges they experience in the real world.