Information and communication technologies combined with in-situ sensors are increasingly being used in the management of urban drainage systems. The large amount of data collected in these systems can be used to train a data-driven soft sensor, which can supplement the physical sensor. Artificial Neural Networks have long been used for time series forecasting given their ability to recognize patterns in the data. Long Short-Term Memory (LSTM) neural networks are equipped with memory gates that help them learn time dependencies in a data series, and they have been shown to outperform other types of networks in predicting water levels in urban drainage systems. When used for soft sensing, neural networks typically receive antecedent observations as input, as these are good predictors of the current value. However, the antecedent observations may be missing due to transmission errors, or may be deemed anomalous due to errors that are not easily explained. This study quantifies and compares the predictive accuracy of LSTM networks in scenarios of limited or missing antecedent observations. We applied these scenarios to an 11-month observation series from a combined sewer overflow chamber in Copenhagen, Denmark. We observed that i) LSTM predictions generally displayed large variability across training runs, which may be reduced by improving the selection of hyperparameters (non-trainable parameters); ii) when the most recent observations were known, adding information on earlier observations did not improve the prediction accuracy; iii) when gaps were introduced in the antecedent water depth observations, LSTM networks were capable of compensating for the missing information with the other available input features (time of day and rainfall intensity); iv) LSTM networks trained without antecedent water depth observations yielded larger prediction errors, but these errors were still comparable with those of the other scenarios, and the networks captured both dry and wet weather behaviors.
Therefore, we concluded that LSTM neural networks may be trained to act as soft sensors in urban drainage systems even when observations from the physical sensors are missing.
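
The input scenarios summarized above (full, gapped, or absent antecedent water depth, alongside time of day and rainfall intensity) can be illustrated with a minimal sketch of how such input windows might be assembled. This is not the study's implementation: the function `build_windows`, its parameters `window`, `gap`, and `fill`, the availability-mask channel, and the sine/cosine encoding of the time of day are all assumptions made for illustration only.

```python
import numpy as np

def build_windows(depth, rain, hour, window, gap=0, fill=0.0):
    """Assemble LSTM-style input windows and next-step water depth targets.

    All names here are hypothetical. `gap` is the number of most recent
    antecedent depth readings hidden in every window; gap == window
    corresponds to a network that never sees antecedent depth.
    Returns X with shape (samples, window, 5) and y with shape (samples,).
    """
    depth = np.asarray(depth, dtype=float)
    rain = np.asarray(rain, dtype=float)
    hour = np.asarray(hour, dtype=float)
    if not 0 <= gap <= window:
        raise ValueError("gap must lie between 0 and window")

    # Assumed cyclic encoding so that 23:00 and 00:00 are close together.
    angle = 2.0 * np.pi * hour / 24.0
    tod_sin, tod_cos = np.sin(angle), np.cos(angle)

    X, y = [], []
    for t in range(window, len(depth)):
        past = slice(t - window, t)
        d = depth[past].copy()
        available = np.ones(window)
        if gap > 0:
            # Hide the most recent `gap` depth readings and flag them as missing.
            d[window - gap:] = fill
            available[window - gap:] = 0.0
        features = np.stack(
            [d, available, rain[past], tod_sin[past], tod_cos[past]], axis=1
        )
        X.append(features)
        y.append(depth[t])  # target: the current water depth
    return np.array(X), np.array(y)
```

Calling `build_windows(depth, rain, hour, window=4, gap=2)` would hide the two latest depth readings in each window, while `gap=4` would remove antecedent depth entirely, leaving only rainfall and time of day as informative inputs.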