Communication requires cognitive processes which are not captured by traditional speech understanding tests. Under challenging listening situations, more working memory resources are needed to process speech, leaving fewer resources available for storage. The aim of the current study was to investigate the effect of task difficulty predictability, that is, knowing versus not knowing task difficulty in advance, and the effect of noise reduction on working memory resource allocation to processing and storage of speech heard in background noise. For this purpose, an "offline" behavioral measure, the Sentence-Final Word Identification and Recall (SWIR) test, and an "online" physiological measure, pupillometry, were combined. Moreover, the outcomes of the two measures were compared to investigate whether they reflect the same processes related to resource allocation. Twenty-four experienced hearing aid users with moderate to moderately severe hearing loss participated in this study. The SWIR test and pupillometry were measured simultaneously with noise reduction in the test hearing aids activated and deactivated in a background noise composed of four-talker babble. The task of the SWIR test is to listen to lists of sentences, repeat the last word immediately after each sentence and recall the repeated words when the list is finished. The sentence baseline dilation, which is defined as the mean pupil dilation before each sentence, and task-evoked peak pupil dilation (PPD) were analyzed over the course of the lists. The task difficulty predictability was manipulated by including lists of three, five, and seven sentences. The test was conducted over two sessions, one during which the participants were informed about list length before each list (predictable task difficulty) and one during which they were not (unpredictable task difficulty). The sentence baseline dilation was higher when task difficulty was unpredictable compared to predictable, except at the start of the list, where there was no difference. The PPD tended to be higher at the beginning of the list, this pattern being more prominent when task difficulty was unpredictable. Recall performance was better and sentence baseline dilation was higher when noise reduction was on, especially toward the end of longer lists. There was no effect of noise reduction on PPD. Task difficulty predictability did not have an effect on resource allocation, since recall performance was similar independently of whether task difficulty was predictable or unpredictable. The higher sentence baseline dilation when task difficulty was unpredictable likely reflected a difference in the recall strategy or higher degree of task engagement/alertness or arousal. Hence, pupillometry captured processes which the SWIR test does not capture. Noise reduction frees up resources to be used for storage of speech, which was reflected in the better recall performance and larger sentence baseline dilation toward the end of the list when noise reduction was on. Thus, both measures captured different temporal aspects of the same processes related to resource allocation with noise reduction on and off.