This paper presents the first study on forecasting human dynamics from static images. The problem is to input a single RGB image and generate a sequence of upcoming human body poses in 3D. To address the problem, we propose the 3D Pose Forecasting Network (3D-PFNet). Our 3D-PFNet integrates recent advances in single-image human pose estimation and sequence prediction, and converts the 2D predictions into 3D space. We train our 3D-PFNet using a three-step training strategy to leverage diverse sources of training data, including image- and video-based human pose datasets and 3D motion capture (MoCap) data. We demonstrate competitive performance of our 3D-PFNet on 2D pose forecasting and 3D pose recovery through quantitative and qualitative results.