Figure 1: Given challenging in-the-wild videos, a recent state-of-the-art video-pose-estimation approach [31] (top), fails to produce accurate 3D body poses. To address this, we exploit a large-scale motion-capture dataset to train a motion discriminator using an adversarial approach. Our model (VIBE) (bottom) is able to produce realistic and accurate pose and shape, outperforming previous work on standard benchmarks.
translated by 谷歌翻译