Pedestrians follow different trajectories to avoid obstacles and accommodate fellow pedestrians. Any autonomous vehicle navigating such a scene should be able to foresee the future positions of pedestrians and adjust its path accordingly to avoid collisions. This problem of trajectory prediction can be viewed as a sequence generation task, where we are interested in predicting the future trajectory of people based on their past positions. Following the recent success of Recurrent Neural Network (RNN) models for sequence prediction tasks, we propose an LSTM model which can learn general human movement and predict future trajectories. This is in contrast to traditional approaches which use hand-crafted functions such as Social forces. We demonstrate the performance of our method on several public datasets. Our model outperforms state-of-the-art methods on some of these datasets. We also analyze the trajectories predicted by our model to demonstrate the motion behaviour it has learned.
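To navigate safely and efficiently in complex urban traffic, autonomous vehicles must make reliable predictions about the surrounding traffic agents (vehicles, bicycles, pedestrians, etc.). A challenging and critical task is to explore the movement patterns of different traffic agents and accurately predict their future trajectories, so that the autonomous vehicle can make reasonable navigation decisions. To solve this problem, we propose TrafficPredict, a real-time traffic prediction algorithm based on long short-term memory (LSTM). Our approach uses an instance layer to learn the movements and interactions of instances, and a category layer to learn the similarities of instances belonging to the same type in order to refine the prediction. To evaluate its performance, we collected trajectory datasets in a large city under varying conditions and traffic densities. The dataset includes many challenging scenarios in which vehicles, bicycles, and pedestrians move among one another. We evaluate the performance of TrafficPredict on our new dataset and, by comparing it with prior prediction methods, highlight its higher accuracy for trajectory prediction.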
Understanding human motion behavior is critical for autonomous moving platforms (like self-driving cars and social robots) if they are to navigate human-centric environments. This is challenging because human motion is inherently multimodal: given a history of human motion paths, there are many socially plausible ways that people could move in the future. We tackle this problem by combining tools from sequence prediction and generative adversarial networks: a recurrent sequence-to-sequence model observes motion histories and predicts future behavior, using a novel pooling mechanism to aggregate information across people. We predict socially plausible futures by training adversarially against a recurrent discriminator, and encourage diverse predictions with a novel variety loss. Through experiments on several datasets we demonstrate that our approach outperforms prior work in terms of accuracy, variety, collision avoidance, and computational complexity.
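The abstract above names a variety loss without defining it. As a minimal sketch, assuming a best-of-k form in which k futures are sampled and only the one closest to the ground truth is penalised, the idea could look as follows. The function names `l2_distance` and `variety_loss` and the toy trajectories are illustrative assumptions, not taken from the paper.

```python
import math


def l2_distance(pred, truth):
    """Sum over time steps of the Euclidean distance between 2D points."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth))


def variety_loss(samples, truth):
    """Assumed best-of-k form: only the closest sampled future is penalised."""
    return min(l2_distance(sample, truth) for sample in samples)


# Toy data (hypothetical): one ground-truth future and k = 3 sampled futures.
truth = [(1.0, 0.0), (2.0, 0.0)]
samples = [
    [(1.0, 1.0), (2.0, 2.0)],  # drifts away from the ground truth
    [(1.0, 0.0), (2.0, 0.0)],  # matches the ground truth
    [(0.0, 0.0), (0.0, 0.0)],  # stands still
]
loss = variety_loss(samples, truth)
```

Because the minimum ignores the other samples, they are free to cover alternative plausible futures, which is the sense in which such a loss could encourage diversity.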
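This paper addresses the problem of path prediction for multiple interacting agents in a scene, which is a crucial step for many autonomous platforms such as self-driving cars and social robots. We present SoPhie, an interpretable framework based on Generative Adversarial Networks (GANs) that uses images of the scene to leverage two sources of information: the path history of all agents and the scene context. To predict an agent's future path, both physical and social information must be leveraged. Previous work has not succeeded in jointly modeling physical and social interactions. Our approach combines a social attention mechanism with a physical attention mechanism; the latter helps the model learn where to look in a large scene and extract the most salient parts of the image relevant to the path. The social attention component, in turn, aggregates information across the different interactions and extracts the most important trajectory information from the surrounding neighbors. SoPhie also takes advantage of a GAN to generate more realistic samples and to capture the uncertainty of future paths by modeling their distribution. Together, these mechanisms enable our approach to predict socially and physically plausible paths for the agents and to achieve state-of-the-art performance on several different trajectory forecasting benchmarks.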
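Spatiotemporal graphs (STGs) are a powerful tool for modeling multi-agent interaction scenarios, and are commonly used in human trajectory prediction and in proactive planning and decision making for safe human-robot interaction. However, many current STG-backed methods rely on a static graph assumption, i.e., that the underlying graph structure maintains the same nodes and edges throughout the scenario. This assumption is frequently broken in real-world applications, especially in highly dynamic problems such as human trajectory prediction in crowds. To remove the reliance on this assumption, we present a methodology for modeling and predicting agent behavior in scenarios that are both highly dynamic and multimodal (i.e., where the scene's graph structure is time-varying and each agent has many possible, highly distinct futures). Our approach to modeling dynamic STGs augments prior multimodal, multi-agent modeling methods with a gating function on the edge models that smoothly adds and removes the influence of an edge on a node. We demonstrate the performance of our approach on the ETH multi-human trajectory dataset and on NBA basketball player trajectories. Both are highly dynamic, multimodal, multi-agent interaction scenarios that serve as proxies for many robotic applications.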
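Predicting the trajectories of pedestrians is essential for autonomous robots that share the same environment with humans. To interact with humans effectively and safely, trajectory prediction needs to be both precise and computationally efficient. In this work, we propose a convolutional neural network (CNN) based human trajectory prediction approach. Unlike recent LSTM-based models, our model supports increased parallelism and effective temporal representation. The proposed compact CNN model is faster than current approaches yet still yields competitive results.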
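For mobile robots navigating on sidewalks, it is essential to be able to cross street intersections. Most existing approaches rely on recognizing the traffic light signal to make an informed crossing decision. Although these approaches have been crucial enablers for urban navigation, robots that employ them remain limited to streets with signalized intersections. In this paper, we address this challenge and propose a multimodal convolutional neural network framework to predict whether a street intersection is safe to cross. Our architecture consists of two subnetworks: an interaction-aware trajectory estimation stream, IA-TCNN, which predicts the future states of all observed traffic participants in the scene, and a traffic light recognition stream, AtteNet. IA-TCNN uses dilated causal convolutions to model the behavior of the observable dynamic agents in the scene without explicitly assigning priorities to the interactions among them. AtteNet uses Squeeze-Excitation blocks to learn a content-aware mechanism for selecting the relevant features from the data, thereby improving noise robustness. The representations learned by the traffic light recognition stream are fused with the trajectories estimated by the motion prediction stream to learn the crossing decision. Furthermore, we extend our previously introduced Freiburg Street Crossing dataset with sequences captured at different types of intersections that show complex interactions among traffic participants. Extensive experimental evaluations on public benchmark datasets and on our proposed dataset demonstrate that our network achieves state-of-the-art performance on each of the subtasks as well as on crossing safety prediction.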
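The abstract above relies on dilated causal convolutions. As a minimal one-dimensional sketch of that general operation (not the IA-TCNN architecture itself), the output at time t may depend only on inputs at t, t - d, t - 2d, and so on, where d is the dilation. The function name `dilated_causal_conv1d`, the zero padding before the first sample, and the toy signal are assumptions made for illustration.

```python
def dilated_causal_conv1d(signal, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * signal[t - k * dilation].

    Samples before t = 0 are treated as zero, so no output ever
    depends on a future input (the causal property).
    """
    output = []
    for t in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            index = t - k * dilation
            if index >= 0:  # skip taps that would fall before the sequence start
                total += weight * signal[index]
        output.append(total)
    return output


# Toy input (hypothetical): a two-tap kernel with dilation 2 looks at t and t - 2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = dilated_causal_conv1d(xs, kernel=[1.0, 1.0], dilation=2)
```

Stacking such layers with growing dilation widens the receptive field over the trajectory history quickly, while every time step can still be computed in parallel.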
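We develop a human movement trajectory prediction system that incorporates scene information (Scene-LSTM) as well as human movement trajectories (Pedestrian movement LSTM) in the prediction process within static crowded scenes. We superimpose a two-level grid structure (the scene is divided into grid cells, each modeled by a Scene-LSTM, which are further divided into smaller sub-grids for finer spatial granularity) and explore common human trajectories occurring in a grid cell (e.g., making a right turn or turning from an alley onto a sidewalk, or standing still at a bus or train stop). Two coupled LSTM networks, the Pedestrian movement LSTMs (one per target) and the corresponding Scene-LSTMs (one per grid cell), are trained simultaneously to predict the next movement. We show that such common path information greatly influences the prediction of future movement. We further design a scene data filter that retains important nonlinear movement information. The scene data filter allows us to select, from a grid cell's memory, the parts of the information that are relevant to the target's state. We evaluate and compare two versions of our method against a linear method and several existing LSTM-based methods on five crowded video sequences from the UCY [1] and ETH [2] datasets. The results show that our method reduces the location displacement error compared to related methods, and in particular by about 80% compared to social interaction methods.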
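The two-level grid described above can be illustrated with a small indexing sketch: a position is first mapped to a coarse grid cell and then to a sub-grid inside that cell. The function name `grid_indices`, the 4 x 4 cell layout, the 2 x 2 sub-grid layout, and the scene size are assumptions for illustration; the abstract does not state the actual grid resolution.

```python
def grid_indices(x, y, width, height, cells=4, subcells=2):
    """Map a point to ((cell_row, cell_col), (sub_row, sub_col)).

    The scene spans [0, width] x [0, height]. Points on the far border
    are clamped into the last cell so the indices always stay in range.
    """
    cell_w, cell_h = width / cells, height / cells
    col = min(int(x // cell_w), cells - 1)
    row = min(int(y // cell_h), cells - 1)

    # Second level: locate the point within its coarse cell.
    sub_w, sub_h = cell_w / subcells, cell_h / subcells
    sub_col = min(int((x - col * cell_w) // sub_w), subcells - 1)
    sub_row = min(int((y - row * cell_h) // sub_h), subcells - 1)
    return (row, col), (sub_row, sub_col)


# Hypothetical 400 x 400 scene: each cell is 100 x 100, each sub-grid 50 x 50.
cell, sub = grid_indices(130, 70, width=400, height=400)
```

In a setup like the one described, the coarse index would select which per-cell model to consult, and the sub-grid index would give the finer spatial position within that cell.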
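We present a new algorithm for predicting the near-term trajectories of road agents in dense traffic videos. Our approach is designed for heterogeneous traffic, where the road agents may correspond to buses, cars, scooters, bicycles, or pedestrians. We model the interactions between different road agents using a novel LSTM-CNN hybrid network for trajectory prediction. In particular, we take into account heterogeneous interactions, which implicitly account for the varying shapes, dynamics, and behaviors of different road agents. In addition, we model horizon-based interactions, which are used to implicitly model the driving behavior of each road agent. We evaluate the performance of our prediction algorithm, TraPHic, on standard datasets and also introduce a new dense, heterogeneous traffic dataset consisting of urban Asian videos and the corresponding agent trajectories. We outperform state-of-the-art methods on dense traffic datasets by 30%.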
We present a novel real-time algorithm to predict the path of pedestrians in cluttered environments. Our approach makes no assumption about pedestrian motion or crowd density, and is useful for short-term as well as long-term prediction. We interactively learn the characteristics of pedestrian motion and movement patterns from 2D trajectories using Bayesian inference. These include local movement patterns corresponding to the current and preferred velocities and global characteristics such as entry points and movement features. Our approach involves no precomputation and we demonstrate the real-time performance of our prediction algorithm on sparse and noisy trajectory data extracted from dense indoor and outdoor crowd videos. The combination of local and global movement patterns can improve the accuracy of long-term prediction by 12-18% over prior methods in high-density videos.
Mobile robots are increasingly populating our human environments. To interact with humans in a socially compliant way, these robots need to understand and comply with mutually accepted rules. In this paper, we present a novel approach to model the cooperative navigation behavior of humans. We model their behavior in terms of a mixture distribution that captures both the discrete navigation decisions, such as going left or going right, as well as the natural variance of human trajectories. Our approach learns the model parameters of this distribution that match, in expectation, the observed behavior in terms of user-defined features. To compute the feature expectations over the resulting high-dimensional continuous distributions, we use Hamiltonian Markov chain Monte Carlo sampling. Furthermore, we rely on a Voronoi graph of the environment to efficiently explore the space of trajectories from the robot's current position to its target position. Using the proposed model, our method is able to imitate the behavior of pedestrians or, alternatively, to replicate a specific behavior that was taught by tele-operation in the target environment of the robot. We implemented our approach on a real mobile robot and demonstrate that it is able to successfully navigate in an office environment in the presence of humans. An extensive set of experiments suggests that our technique outperforms state-of-the-art methods to model the behavior of pedestrians, which makes it also applicable to fields such as behavioral science or computer graphics.
Humans navigate crowded spaces such as a university campus by following common sense rules based on social etiquette. In this paper, we argue that in order to enable the design of new target tracking or trajectory forecasting methods that can take full advantage of these rules, we need to have access to better data in the first place. To that end, we contribute a new large-scale dataset that collects videos of various types of targets (not just pedestrians, but also bikers, skateboarders, cars, buses, golf carts) that navigate in a real world outdoor environment such as a university campus. Moreover, we introduce a new characterization that describes the "social sensitivity" at which two targets interact. We use this characterization to define "navigation styles" and improve both forecasting models and state-of-the-art multi-target tracking, whereby the learnt forecasting models help the data association step.
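We present an interpretable framework for path prediction that leverages the dependencies between agents' behaviors and their spatial navigation environment. We exploit two sources of information: the past motion trajectory of the agent of interest and a wide top-view image of the navigation scene. We propose a Clairvoyant Attentive Recurrent Network (CAR-Net) that learns where to look in a large image of the scene when solving the path prediction task. Our method can attend to any area, or combination of areas, within the raw image (e.g., road intersections) when predicting the trajectory of the agent. This allows us to visualize the fine-grained semantic elements of navigation scenes that influence the prediction of trajectories. To study the impact of space on agents' trajectories, we build a new dataset made of top-view images of hundreds of scenes (Formula One racing tracks) in which agents' behaviors are heavily influenced by known areas in the images (e.g., upcoming turns). CAR-Net successfully attends to these salient regions. Additionally, CAR-Net reaches state-of-the-art accuracy on the standard trajectory forecasting benchmark, the Stanford Drone Dataset (SDD). Finally, we show CAR-Net's ability to generalize to unseen scenes.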
Deep Recurrent Neural Network architectures, though remarkably capable at modeling sequences, lack an intuitive high-level spatio-temporal structure. Yet many problems in computer vision inherently have an underlying high-level structure and can benefit from it. Spatio-temporal graphs are a popular tool for imposing such high-level intuitions in the formulation of real world problems. In this paper, we propose an approach for combining the power of high-level spatio-temporal graphs with the sequence learning success of Recurrent Neural Networks (RNNs). We develop a scalable method for casting an arbitrary spatio-temporal graph as a rich RNN mixture that is feedforward, fully differentiable, and jointly trainable. The proposed method is generic and principled, as it can be used to transform any spatio-temporal graph by following a set of well-defined steps. Evaluations of the proposed approach on a diverse set of problems, ranging from modeling human motion to object interactions, show improvement over the state of the art by a large margin. We expect this method to empower new approaches to problem formulation through high-level spatio-temporal graphs and Recurrent Neural Networks.
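We introduce DESIRE, a Deep Stochastic IOC RNN Encoder-decoder framework, for the task of future prediction of multiple interacting agents in dynamic scenes. DESIRE effectively predicts the future locations of multiple objects by 1) accounting for the multimodal nature of future prediction (i.e., given the same context, the future may vary), 2) foreseeing potential future outcomes and making a strategic prediction based on them, and 3) reasoning not only from the past motion history but also from the scene context and the interactions among the agents. DESIRE achieves these in a single end-to-end trainable neural network model, while being computationally efficient. The model first obtains a diverse set of hypothetical future prediction samples using a conditional variational autoencoder; these are then ranked and refined by the subsequent RNN scoring-regression module. Samples are scored by accounting for accumulated future rewards, which enables better long-term strategic decisions, similar to IOC frameworks. An RNN scene context fusion module jointly captures past motion histories, the semantic scene context, and the interactions among multiple agents. A feedback mechanism iterates over the ranking and refinement to further improve prediction accuracy. We evaluate our model on two publicly available datasets: KITTI and the Stanford Drone Dataset. Our experiments show that the proposed model significantly improves prediction accuracy compared to other baseline methods.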
In this paper, we study the safe navigation of a mobile robot through crowds of dynamic agents with uncertain trajectories. Existing algorithms suffer from the "freezing robot" problem: once the environment surpasses a certain level of complexity, the planner decides that all forward paths are unsafe, and the robot freezes in place (or performs unnecessary maneuvers) to avoid collisions. Since a feasible path typically exists, this behavior is suboptimal. Existing approaches have focused on reducing the predictive uncertainty for individual agents by employing more informed models or heuristically limiting the predictive covariance to prevent this overcautious behavior. In this work, we demonstrate that both the individual prediction and the predictive uncertainty have little to do with the frozen robot problem. Our key insight is that dynamic agents solve the frozen robot problem by engaging in "joint collision avoidance": They cooperatively make room to create feasible trajectories. We develop IGP, a nonparametric statistical model based on dependent output Gaussian processes that can estimate crowd interaction from data. Our model naturally captures the non-Markov nature of agent trajectories, as well as their goal-driven navigation. We then show how planning in this model can be efficiently implemented using particle based inference. Lastly, we evaluate our model on a dataset of pedestrians entering and leaving a building, first comparing the model with actual pedestrians, and find that the algorithm either outperforms human pedestrians or performs very similarly to the pedestrians. We also present an experiment where a covariance reduction method results in highly overcautious behavior, while our model performs desirably.
Mobile robots that operate in a shared environment with humans need the ability to predict the movements of people to better plan their navigation actions. In this paper, we present a novel approach to predict the movements of pedestrians. Our method reasons about entire trajectories that arise from interactions between people in navigation tasks. It applies a maximum entropy learning method based on features that capture relevant aspects of the trajectories to determine the probability distribution that underlies human navigation behavior. Hence, our approach can be used by mobile robots to predict forthcoming interactions with pedestrians and thus react in a socially compliant way. In extensive experiments, we evaluate the capability and accuracy of our approach and demonstrate that our algorithm outperforms the popular social forces method, a state-of-the-art approach. Furthermore, we show how our algorithm can be used for autonomous robot navigation using a real robot.
We develop predictive models of pedestrian dynamics by encoding the coupled nature of multi-pedestrian interaction using game theory and deep learning-based visual analysis to estimate person-specific behavior parameters. We focus on predictive models since they are important for developing interactive autonomous systems (e.g., autonomous cars, home robots, smart homes) that can understand different human behavior and pre-emptively respond to future human actions. Building predictive models for multi-pedestrian interactions, however, is very challenging for two reasons: (1) the dynamics of interaction are complex interdependent processes, where the decision of one person can affect others; and (2) dynamics are variable, where each person may behave differently (e.g., an older person may walk slowly while a younger person may walk faster). We address these challenges by utilizing concepts from game theory to model the intertwined decision making process of multiple pedestrians and by using visual classifiers to learn a mapping from pedestrian appearance to behavior parameters. We evaluate our proposed model on several public multiple pedestrian interaction video datasets. Results show that our strategic planning model predicts and explains human interactions 25% better when compared to a state-of-the-art activity forecasting method.
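In this work, we explore the correlation between people's trajectories and their head orientations. We argue that trajectory and head pose forecasting can be modeled as a joint problem. Recent approaches to trajectory forecasting leverage short-term trajectories (also known as tracklets) of pedestrians to predict their future paths. In addition, sociological cues, such as the expected destination or pedestrian interactions, are often combined with tracklets. In this paper, we propose MiXing-LSTM (MX-LSTM) to capture the interplay between positions and head orientations (vislets), thanks to a joint unconstrained optimization of full covariance matrices during LSTM backpropagation. We additionally use head orientation as a proxy for visual attention when modeling social interactions. MX-LSTM predicts future pedestrian locations and head poses, extending the standard capabilities of current approaches to long-term trajectory forecasting. Compared to the state of the art, our approach shows better performance on an extensive set of public benchmarks. MX-LSTM is particularly effective when people move slowly, i.e., in the most challenging scenario for all other models. The proposed approach also allows for accurate predictions over a longer time horizon.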
We present a real-time algorithm to automatically classify the dynamic behavior or personality of a pedestrian based on his or her movements in a crowd video. Our classification criterion is based on Personality Trait Theory. We present a statistical scheme that dynamically learns the behavior of every pedestrian in a scene and computes that pedestrian's motion model. This model is combined with global crowd characteristics to compute the movement patterns and motion dynamics, which can also be used to predict the crowd movement and behavior. We highlight its performance in identifying the personalities of different pedestrians in low- and high-density crowd videos. We also evaluate the accuracy by comparing the results with a user study.