Given a scene, what is going to move, and in what direction will it move?Such a question could be considered a non-semantic form of action prediction.In this work, we present a convolutional neural network (CNN) based approachfor motion prediction. Given a static image, this CNN predicts the futuremotion of each and every pixel in the image in terms of optical flow. Our CNNmodel leverages the data in tens of thousands of realistic videos to train ourmodel. Our method relies on absolutely no human labeling and is able to predictmotion based on the context of the scene. Because our CNN model makes noassumptions about the underlying scene, it can predict future optical flow on adiverse set of scenarios. We outperform all previous approaches by largemargins.
translated by 谷歌翻译