Deep learning techniques have been widely applied to traffic flow prediction, taking into account underlying routine patterns and multiple context factors (e.g., time and weather). However, the complex spatio-temporal dependencies between inherent traffic patterns and multiple disturbances have not been fully addressed. In this paper, we propose a two-phase end-to-end deep learning framework, namely DeepSTD, that uncovers spatio-temporal disturbances (STD) to predict citywide traffic flow. In the STD Modeling phase, we propose an STD modeling method that captures both the different regional disturbances caused by various region functions and the spatio-temporal propagating effects. In the Prediction phase, we eliminate the STD from the historical traffic flow to enhance the learning of inherent traffic patterns, and incorporate the STD at the prediction time interval to account for future disturbances. Experimental results on two real-world datasets demonstrate that DeepSTD outperforms state-of-the-art methods.