Urban development models typically provide simulated building areas in aggregated form. When such outputs are used to parametrize pluvial flood risk simulations in an urban setting, imperviousness and flood exposure must be characterized from the aggregated data. We develop data-driven approaches for establishing this link, focusing on the data resolutions and spatial scales that should be considered. We use regression models that link aggregated building areas to total imperviousness, and models that link aggregated building areas and simulated flood areas to flood damage. The data resolution used for training the regression models is shown to have a strong impact on identifiability: overly fine resolutions prevent identification of the link between building areas and hydrology, while overly coarse resolutions lead to uncertain parameter estimates. In our case study, the optimal data resolution for modeling imperviousness was 400 m, while aggregation of the data to a resolution of at least 1000 m was required for modeling flood damage. In addition, regression models for flood damage are more robust when building data are considered at a coarser resolution of 200 m than at finer resolutions. The results suggest that aggregated building data can be used to derive realistic estimates of flood risk in screening simulations.
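The dependence of a regression fit on data resolution can be illustrated with a minimal sketch. It is not the implementation used in the study: the 50 m base cell size, the synthetic grids, the simple linear model form, and the helper names `aggregate_grid` and `fit_linear` are all assumptions made for illustration. The sketch aggregates a fine raster of building area and impervious area to coarser cells by block summation and refits an ordinary least squares line at each resolution.

```python
import numpy as np


def aggregate_grid(values, factor):
    """Sum a 2D grid over non-overlapping factor x factor blocks."""
    rows, cols = values.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    blocks = values.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.sum(axis=(1, 3))


def fit_linear(building_area, impervious_area):
    """Least squares fit of impervious_area = slope * building_area + intercept."""
    slope, intercept = np.polyfit(
        np.ravel(building_area), np.ravel(impervious_area), deg=1
    )
    return slope, intercept


if __name__ == "__main__":
    # Synthetic, purely illustrative data (not from the case study).
    rng = np.random.default_rng(0)
    base_cell_m = 50  # assumed base resolution in metres
    building = rng.uniform(0.0, 1000.0, size=(40, 40))  # m^2 per cell
    noise = rng.normal(0.0, 200.0, size=building.shape)
    impervious = np.clip(1.5 * building + noise, 0.0, None)

    # Refit the same model after aggregating to coarser resolutions.
    for factor in (1, 2, 4, 8):
        b = aggregate_grid(building, factor)
        i = aggregate_grid(impervious, factor)
        slope, intercept = fit_linear(b, i)
        print(f"{base_cell_m * factor:4d} m: n={b.size:4d}, "
              f"slope={slope:.3f}, intercept={intercept:.1f}")
```

Coarser aggregation leaves fewer cells for fitting, which is one way in which parameter estimates can become more uncertain at coarse resolutions. The sketch does not reproduce the loss of identifiability reported at fine resolutions, which depends on properties of real data that the synthetic grids do not capture.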