估计越野环境中的地形横穿性需要关于机器人和这些地形之间复杂相互作用动态的推理。但是,建立准确的物理模型,或创建有益的标签来以有监督的方式学习模型是有挑战性的。我们提出了一种方法,该方法通过将外部感受性的环境信息与本体感受性的地形相互作用反馈相结合,以自我监督的方式将遍历性成本映像结合在一起。此外,我们提出了一种将机器人速度纳入Costmap预测管道中的新型方法。我们在具有挑战性的越野地形上,在多个大型,自动的全地形车辆(ATV)上验证了我们的方法,并在单独的大型地面机器人上易于集成。我们的短尺寸导航结果表明,使用我们学到的Costmaps可以使整体航行更顺畅,并为机器人提供了对机器人与不同地形类型(例如草和砾石)之间相互作用的更细粒度的了解。我们的大规模导航试验表明,与基于占用率的导航基线相比,我们可以将干预措施的数量减少多达57%,这是在挑战400 m至3150 m不等的越野课程中。
translated by 谷歌翻译
Over the past decade, neural networks have been successful at making predictions from biological sequences, especially in the context of regulatory genomics. As in other fields of deep learning, tools have been devised to extract features such as sequence motifs that can explain the predictions made by a trained network. Here we intend to go beyond explainable machine learning and introduce SEISM, a selective inference procedure to test the association between these extracted features and the predicted phenotype. In particular, we discuss how training a one-layer convolutional network is formally equivalent to selecting motifs maximizing some association score. We adapt existing sampling-based selective inference procedures by quantizing this selection over an infinite set to a large but finite grid. Finally, we show that sampling under a specific choice of parameters is sufficient to characterize the composite null hypothesis typically used for selective inference-a result that goes well beyond our particular framework. We illustrate the behavior of our method in terms of calibration, power and speed and discuss its power/speed trade-off with a simpler data-split strategy. SEISM paves the way to an easier analysis of neural networks used in regulatory genomics, and to more powerful methods for genome wide association studies (GWAS).
translated by 谷歌翻译
Vision Transformers (ViTs) have become a dominant paradigm for visual representation learning with self-attention operators. Although these operators provide flexibility to the model with their adjustable attention kernels, they suffer from inherent limitations: (1) the attention kernel is not discriminative enough, resulting in high redundancy of the ViT layers, and (2) the complexity in computation and memory is quadratic in the sequence length. In this paper, we propose a novel attention operator, called lightweight structure-aware attention (LiSA), which has a better representation power with log-linear complexity. Our operator learns structural patterns by using a set of relative position embeddings (RPEs). To achieve log-linear complexity, the RPEs are approximated with fast Fourier transforms. Our experiments and ablation studies demonstrate that ViTs based on the proposed operator outperform self-attention and other existing operators, achieving state-of-the-art results on ImageNet, and competitive results on other visual understanding benchmarks such as COCO and Something-Something-V2. The source code of our approach will be released online.
translated by 谷歌翻译
Streets networks provide an invaluable source of information about the different temporal and spatial patterns emerging in our cities. These streets are often represented as graphs where intersections are modelled as nodes and streets as links between them. Previous work has shown that raster representations of the original data can be created through a learning algorithm on low-dimensional representations of the street networks. In contrast, models that capture high-level urban network metrics can be trained through convolutional neural networks. However, the detailed topological data is lost through the rasterisation of the street network. The models cannot recover this information from the image alone, failing to capture complex street network features. This paper proposes a model capable of inferring good representations directly from the street network. Specifically, we use a variational autoencoder with graph convolutional layers and a decoder that outputs a probabilistic fully-connected graph to learn latent representations that encode both local network structure and the spatial distribution of nodes. We train the model on thousands of street network segments and use the learnt representations to generate synthetic street configurations. Finally, we proposed a possible application to classify the urban morphology of different network segments by investigating their common characteristics in the learnt space.
translated by 谷歌翻译
部署AI驱动的系统需要支持有效人类互动的值得信赖的模型,超出了原始预测准确性。概念瓶颈模型通过在类似人类的概念的中间级别调节分类任务来促进可信度。这使得人类干预措施可以纠正错误预测的概念以改善模型的性能。但是,现有的概念瓶颈模型无法在高任务准确性,基于概念的强大解释和对概念的有效干预措施之间找到最佳的妥协,尤其是在稀缺完整和准确的概念主管的现实情况下。为了解决这个问题,我们提出了概念嵌入模型,这是一种新型的概念瓶颈模型,它通过学习可解释的高维概念表示形式而超出了当前的准确性-VS解关性权衡。我们的实验表明,嵌入模型(1)达到更好或竞争性的任务准确性W.R.T. W.R.T.没有概念的标准神经模型,(2)提供概念表示,以捕获有意义的语义,包括其地面真相标签,(3)支持测试时间概念干预措施,其在测试准确性中的影响超过了标准概念瓶颈模型,以及(4)规模对于稀缺的完整概念监督的现实条件。
translated by 谷歌翻译
我们将图形神经网络训练来自小工具N体模拟的光晕目录的神经网络,以执行宇宙学参数的无现场级别可能的推断。目录包含$ \ Lessim $ 5,000 HAROS带质量$ \ gtrsim 10^{10} 〜h^{ - 1} m_ \ odot $,定期卷为$(25〜H^{ - 1} {\ rm mpc}){\ rm mpc}) ^3 $;目录中的每个光环都具有多种特性,例如位置,质量,速度,浓度和最大圆速度。我们的模型构建为置换,翻译和旋转的不变性,不施加最低限度的规模来提取信息,并能够以平均值来推断$ \ omega _ {\ rm m} $和$ \ sigma_8 $的值$ \ sim6 \%$的相对误差分别使用位置加上速度和位置加上质量。更重要的是,我们发现我们的模型非常强大:他们可以推断出使用数千个N-n-Body模拟的Halo目录进行测试时,使用五个不同的N-进行测试时,在使用Halo目录进行测试时,$ \ omega _ {\ rm m} $和$ \ sigma_8 $身体代码:算盘,Cubep $^3 $ M,Enzo,PKDGrav3和Ramses。令人惊讶的是,经过培训的模型推断$ \ omega _ {\ rm m} $在对数千个最先进的骆驼水力动力模拟进行测试时也可以使用,该模拟使用四个不同的代码和子网格物理实现。使用诸如浓度和最大循环速度之类的光环特性允许我们的模型提取更多信息,而牺牲了模型的鲁棒性。这可能会发生,因为不同的N体代码不会在与这些参数相对应的相关尺度上收敛。
translated by 谷歌翻译
现有的视频理解数据集主要集中在人类的互动上,几乎没有关注“在野外”设置,在户外录制了视频。我们提出了Wildqa,这是一个视频理解外部设置中录制的视频的数据集。除了视频问答(视频质量质量检查)外,我们还介绍了确定给定问答(视频证据选择)视觉支持的新任务。通过使用各种基线模型的评估,我们表明Wildqa对愿景和语言研究社区构成了新的挑战。该数据集可在https://lit.eecs.umich.edu/wildqa/上找到。
translated by 谷歌翻译
许多微体系式优化为深度神经网络解锁了巨大的处理能力,从而促进了AI革命。随着这种优化的精疲力尽,现代AI的增长现在是通过培训系统的性能,尤其是其数据流动的。我们没有专注于单个加速器,而是研究了全系统规模的大规模培训的数据移动特征。基于我们的工作量分析,我们设计了HammingMesh,这是一种新颖的网络拓扑,以低成本提供高的带宽,并具有很高的工作计划灵活性。具体而言,HammingMesh可以支持具有两个并行性的两个维度的深度学习培训工作的完整带宽和隔离。此外,它还为通用流量的高全球带宽提供支持。因此,HammingMesh将为未来的大规模深度学习系统供电,并具有极端的带宽要求。
translated by 谷歌翻译
我们借鉴物理界的最新进步,提出了一种新的方法,以发现强化学习中物理系统的非线性动力学(RL)。我们确定该方法能够使用较少的轨迹(仅$ \ leq 30 $时间步骤)发现基础动力学,而不是最先进的模型学习算法。此外,该技术学习了一个足够准确的模型,可以诱导近乎最佳的策略,而轨迹明显少于无模型算法所要求的轨迹。它带来了基于模型的RL的好处,而无需提前开发模型,即具有基于物理动力的系统。为了确定该算法的有效性和适用性,我们对四个经典控制任务进行实验。我们发现,对基础系统的发现动力进行培训的最佳政策可以很好地概括。此外,当部署在实际物理系统上时,学到的策略表现良好,从而将模型桥接到实际系统差距中。我们将我们的方法与最新的基于模型和无模型的方法进行了比较,并表明我们的方法需要在真实的物理系统上比较其他方法所采样的轨迹更少。此外,我们探索了近似动力学模型,发现它们也可以表现良好。
translated by 谷歌翻译
最近的研究表明,看似公平的机器学习模型在为对人们的生活或福祉产生影响的决策提供信息(例如,涉及教育,就业和贷款的申请)可能会在长期内无意中增加社会不平等。这是因为先前的公平意识算法仅考虑静态公平限制,例如机会均等或人口统计奇偶。但是,强制执行这种类型的限制可能会导致模型对处境不利的个人和社区产生负面影响。我们介绍ELF(执行长期公平性),这是第一个分类算法,可提供高信任公平保证,以长期或延迟影响。我们证明,ELF返回不公平解决方案的概率小于用户指定的公差,并且(在轻度假设下),如果有足够的培训数据,ELF能够找到并返回公平的解决方案,如果存在一个公平的解决方案。我们通过实验表明,我们的算法可以成功缓解长期不公平。
translated by 谷歌翻译