We present a method for controlling a swarm using its spectral decomposition -- that is, by describing the set of trajectories of a swarm in terms of a spatial distribution over the operational domain -- guaranteeing scale invariance with respect to the number of agents, both for computation and for the operator tasked with controlling the swarm. We implement this using ergodic control, decentralized across the network. In the DARPA OFFSET program field setting, we test this interface design for the operator using the STOMP interface -- the same interface used by Raytheon BBN throughout the OFFSET program. In these tests, we demonstrate that our approach is scale-invariant -- the user specification does not depend on the number of agents; persistent -- the specification remains active until the user issues a new command; and real-time -- the user can interact with and interrupt the swarm at any time. Moreover, we show that the spectral/ergodic specification of swarm behavior degrades gracefully as the number of agents decreases, enabling the operator to maintain the same approach as agents are disabled or added to the network. We demonstrate the scale invariance and dynamic response of our system in a field-relevant simulator on a variety of tactical scenarios with up to 50 agents, and we demonstrate the dynamic response of our system in the field with a smaller team of agents. Lastly, we make the code for our system publicly available.
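A minimal sketch of the spectral specification behind this kind of ergodic control, assuming a rectangular domain and a cosine basis; the grid sizes, domain bounds, and function names below are illustrative choices, not taken from the released code:

```python
import numpy as np

# A target spatial distribution and the agents' trajectories are both reduced to
# cosine-basis coefficients; the ergodic metric measures their mismatch.

L = np.array([1.0, 1.0])          # rectangular operational domain [0, L1] x [0, L2]
K = 10                            # number of basis functions per dimension
ks = np.array([(k1, k2) for k1 in range(K) for k2 in range(K)])

def basis(x, k):
    """Separable cosine basis F_k(x) on the domain (normalization omitted)."""
    return np.prod(np.cos(np.pi * k * x / L), axis=-1)

def distribution_coefficients(samples):
    """phi_k: spectral coefficients of a target distribution given as samples."""
    return np.array([basis(samples, k).mean() for k in ks])

def trajectory_coefficients(trajectories):
    """c_k: time-averaged coefficients over all agents' trajectories.
    `trajectories` has shape (num_agents, num_timesteps, 2); the result does not
    depend on num_agents, which is the scale-invariance property."""
    pts = trajectories.reshape(-1, 2)
    return np.array([basis(pts, k).mean() for k in ks])

def ergodic_metric(c, phi):
    """Sobolev-weighted distance between trajectory and target coefficients."""
    weights = (1.0 + np.sum(ks**2, axis=1)) ** (-1.5)   # s = (d+1)/2 with d = 2
    return np.sum(weights * (c - phi) ** 2)

# Example: a 50-agent swarm is evaluated against the same phi_k a 5-agent swarm would be.
target = np.random.rand(5000, 2) * L
phi = distribution_coefficients(target)
swarm = np.random.rand(50, 200, 2) * L
print(ergodic_metric(trajectory_coefficients(swarm), phi))
```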
Novel view synthesis and 3D modeling using implicit neural field representations have been shown to be highly effective for calibrated multi-view cameras. Such representations are known to benefit from additional geometric and semantic supervision. Most existing methods that exploit additional supervision require dense pixel-wise labels or localized scene priors; they cannot benefit from high-level, vague scene priors given only as scene descriptions. In this work, we aim to leverage the geometric prior of Manhattan scenes to improve implicit neural radiance field representations. More precisely, we assume only that the scene under investigation is a Manhattan scene -- with no additional information whatsoever -- and that its Manhattan coordinate frame is unknown. This high-level prior is then used to self-supervise the surface normals derived explicitly from the implicit neural fields. Our modeling allows us to group the derived normals and then exploit their orthogonality constraints for self-supervision. Exhaustive experiments on datasets of diverse indoor scenes demonstrate the significant benefit of the proposed method over established baselines.
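A minimal sketch of how such Manhattan self-supervision on normals could look, assuming the unknown frame is represented by a learnable rotation and each normal is softly assigned to its closest axis; this is an illustrative loss, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

class ManhattanNormalLoss(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Learnable parameterization of the unknown Manhattan coordinate frame.
        self.frame = torch.nn.Parameter(torch.eye(3) + 0.01 * torch.randn(3, 3))

    def forward(self, normals):
        # normals: (N, 3) unit surface normals rendered from the neural field.
        Q, _ = torch.linalg.qr(self.frame)         # three orthonormal axes, (3, 3)
        cos = normals @ Q                          # (N, 3) cosine to each axis
        # Soft grouping: each normal should be parallel (or anti-parallel)
        # to one of the three mutually orthogonal axes.
        alignment = cos.abs().max(dim=1).values    # (N,)
        return (1.0 - alignment).mean()

# Usage: add this term to the radiance-field reconstruction loss.
loss_fn = ManhattanNormalLoss()
normals = F.normalize(torch.randn(1024, 3), dim=-1)   # placeholder normals
loss = loss_fn(normals)
loss.backward()
```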
Classification of cancer cellularity within tissue samples is currently a manual process performed by pathologists, and correctly determining cancer cellularity can be time-intensive. Deep learning (DL) techniques have become increasingly popular for this purpose due to the accuracy and performance they exhibit, which can be comparable to that of pathologists. This work investigates the capabilities of two DL approaches to assess cancer cellularity in whole slide images (WSI) from the SPIE-AAPM-NCI BreastPathQ challenge dataset. The effects of training on data augmented via rotations, and of combining multiple architectures into a single network, were analyzed using a modified Kendall Tau-b prediction probability metric known as the average prediction probability PK. A deep, transfer-learned convolutional neural network (CNN), InceptionV3, was used as a baseline, achieving an average PK value of 0.884, an improvement over the average PK value of 0.83 achieved by pathologists. The network was then trained on additional training datasets rotated between 1 and 360 degrees, which yielded a peak increase in PK of up to 4.2%. An additional parallel architecture combining the InceptionV3 network with VGG16, a shallow, transfer-learned CNN, was also evaluated. This parallel architecture achieved a baseline average PK value of 0.907, a statistically significant improvement over either architecture's performance separately (p<0.0001 by unpaired t-test).
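A minimal sketch of the parallel transfer-learning architecture described above: two ImageNet-pretrained backbones run side by side on the same patch, their pooled features are concatenated, and a small head regresses the cellularity score. The input size, head width, and freezing policy are assumptions, and per-backbone preprocessing is omitted:

```python
from tensorflow.keras import Input, Model, layers
from tensorflow.keras.applications import InceptionV3, VGG16

inp = Input(shape=(299, 299, 3))

inception = InceptionV3(include_top=False, weights="imagenet", pooling="avg")
vgg = VGG16(include_top=False, weights="imagenet", pooling="avg")
inception.trainable = False   # transfer learning: keep pretrained features fixed
vgg.trainable = False

features = layers.Concatenate()([inception(inp), vgg(inp)])
x = layers.Dense(256, activation="relu")(features)
x = layers.Dropout(0.5)(x)
out = layers.Dense(1, activation="sigmoid")(x)   # cellularity score in [0, 1]

model = Model(inp, out)
model.compile(optimizer="adam", loss="mse")
model.summary()
```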
The demand for unmanned aerial vehicles (UAVs, or drones) continues to grow across diverse applications such as package delivery, traffic monitoring, search and rescue operations, and military combat engagements. In all of these applications, drones are used to navigate the environment autonomously -- without human interaction -- to perform specific tasks and to avoid obstacles. Autonomous drone navigation is commonly accomplished using reinforcement learning (RL), in which an agent acts as an expert in the domain, navigating the environment while avoiding obstacles. Understanding the navigation environment and the limitations of the algorithms plays a crucial role in selecting an appropriate RL algorithm that can solve the navigation problem effectively. Therefore, this study first identifies UAV navigation tasks and discusses navigation frameworks and simulation software. Next, RL algorithms are classified and discussed according to the environment, algorithm characteristics, capabilities, and applications to different UAV navigation problems, which will help practitioners and researchers select the appropriate RL algorithm for their UAV navigation use case. Moreover, the identified gaps and opportunities will drive further UAV navigation research.
Deep learning-based channel code design has recently attracted interest as an alternative to conventional coding algorithms, particularly for channels in which existing codes do not provide effective solutions. Communication over feedback channels is one such problem, for which promising results have recently been obtained by employing various deep learning architectures. In this paper, we introduce a novel learning-aided code design for feedback channels, called generalized block attention feedback (GBAF) codes, which i) employs a modular architecture that can be implemented using different neural network architectures; ii) provides order-of-magnitude improvements in the probability of error compared to existing designs; and iii) can transmit at desired code rates.
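A minimal sketch of the interaction pattern behind learned feedback codes of this kind: in each round, a neural encoder maps the message bits plus everything sent and fed back so far to the next block of channel symbols. The network size, number of rounds, and noiseless-feedback assumption are illustrative choices only, and the simple encoder below stands in for whatever modular architecture is used:

```python
import torch
import torch.nn as nn

K, ROUNDS, SNR_DB = 50, 9, 0.0              # 50 information bits, 9 interaction rounds
MAX_HIST = 2 * K * ROUNDS                   # sent symbols + feedback, zero-padded

def awgn(x, snr_db):
    sigma = 10 ** (-snr_db / 20)            # unit-power signal assumed
    return x + sigma * torch.randn_like(x)

class FeedbackEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(K + MAX_HIST, 256), nn.ReLU(), nn.Linear(256, K))

    def forward(self, bits, history):
        pad = torch.zeros(bits.shape[0], MAX_HIST - history.shape[1])
        return torch.tanh(self.net(torch.cat([bits, history, pad], dim=-1)))

encoder = FeedbackEncoder()
bits = torch.randint(0, 2, (32, K)).float()         # a batch of messages
history = torch.zeros(32, 0)
received = []

for _ in range(ROUNDS):
    parity = encoder(bits, history)                 # next block of symbols
    y = awgn(parity, SNR_DB)                        # forward channel
    feedback = y                                    # noiseless feedback case
    received.append(y)
    history = torch.cat([history, parity, feedback], dim=-1)

# A decoder (omitted here) would map torch.cat(received, dim=-1) back to the K bits.
```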
Ultra-reliable short-packet communication is a major challenge in future wireless networks with critical applications. To achieve ultra-reliable communications beyond 99.999%, this paper envisions a new interaction-based communication paradigm that exploits feedback from the receiver. We present AttentionCode, a new class of feedback codes leveraging deep learning (DL) technologies. The underpinnings of AttentionCode are three architectural innovations: AttentionNet, input restructuring, and adaptation to fading channels, accompanied by several training methods, including large-batch training, distributed learning, look-ahead optimizer, training-test signal-to-noise ratio (SNR) mismatch, and curriculum learning. The training methods can potentially be generalized to other wireless communication applications with machine learning. Numerical experiments verify that AttentionCode establishes a new state of the art among all DL-based feedback codes in both additive white Gaussian noise (AWGN) channels and fading channels. In AWGN channels with noiseless feedback, for example, AttentionCode achieves a block error rate (BLER) of $10^{-7}$ when the forward channel SNR is 0 dB for a block size of 50 bits, demonstrating the potential of AttentionCode to provide ultra-reliable short-packet communications.
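A minimal sketch of two of the training heuristics mentioned above, written as an SNR schedule: curriculum learning that anneals from an easy (high) training SNR toward the operating point, combined with a deliberate training/test SNR mismatch. All numbers are illustrative, and the training step is a hypothetical placeholder:

```python
def snr_schedule(num_epochs, start_snr_db=3.0, target_snr_db=0.0, mismatch_db=1.0):
    """Yield the forward-channel SNR (dB) to train on at each epoch."""
    final_train_snr = target_snr_db + mismatch_db     # train slightly above the test SNR
    for epoch in range(num_epochs):
        frac = min(1.0, epoch / max(1, num_epochs // 2))   # anneal over the first half
        yield (1 - frac) * start_snr_db + frac * final_train_snr

for epoch, snr_db in enumerate(snr_schedule(10)):
    noise_std = 10 ** (-snr_db / 20)
    # train_one_epoch(model, noise_std)  # hypothetical training step
    print(f"epoch {epoch}: train SNR {snr_db:.2f} dB, noise std {noise_std:.3f}")
```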
We present FREDo, a few-shot document-level relation extraction (FSDLRE) benchmark. In contrast to existing benchmarks, which are built on sentence-level relation extraction corpora, we argue that document-level corpora provide more realism, particularly with respect to none-of-the-above (NOTA) distributions. We therefore propose a set of FSDLRE tasks and construct a benchmark based on two existing supervised learning datasets, DocRED and SciERC. We adapt the state-of-the-art sentence-level method MNAV to the document level and further develop it to improve domain adaptation. We find that FSDLRE is a challenging setting with interesting new characteristics, such as the ability to sample NOTA instances from the support set. Data, code, and trained models are available online (https://github.com/nicpopovic/fredo).
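A minimal sketch of few-shot episode construction at the document level, illustrating the point about NOTA: because most entity pairs in a document express none of the sampled relation types, NOTA instances can be drawn from the support documents themselves. The data structures and field names are hypothetical, not those of the FREDo release:

```python
import random

def build_episode(documents, relation_types, n_way=3, k_shot=1):
    """Sample an N-way K-shot support set plus NOTA instances for one episode."""
    targets = set(random.sample(relation_types, n_way))
    support, nota = [], []

    for doc in random.sample(documents, n_way * k_shot):
        annotated = {(h, t): r for (h, t, r) in doc["triples"]}
        for (h, t) in doc["entity_pairs"]:              # all candidate entity pairs
            rel = annotated.get((h, t))
            if rel in targets:
                support.append((doc["id"], h, t, rel))
            else:
                nota.append((doc["id"], h, t, "NOTA"))  # sampled from the support docs

    # Keep the NOTA set comparable in size to the positive support instances.
    nota = random.sample(nota, min(len(support), len(nota)))
    return support, nota
```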
In most cases, conditional image generation can be thought of as an inversion of the image understanding process. Since generic image understanding involves solving multiple tasks, it is natural to aim to generate images via multi-conditioning. However, multi-conditional image generation is a very challenging problem due to the heterogeneity and (in practice) the sparsity of the available conditioning labels. In this work, we propose a novel neural architecture to address the problem of heterogeneity and sparsity of spatial multi-conditional labels. Our choice of spatial conditions, such as semantics and depth, is driven by the promise they hold for better control of the image generation process. The proposed method operates pixel-wise using a transformer-like architecture that receives the available labels as input tokens and merges them in a learned homogeneous space of labels. The merged labels are then used for image generation via conditional generative adversarial training. In this process, the sparsity of the labels is handled by simply dropping the input tokens corresponding to the missing labels at the desired locations, thanks to the proposed pixel-wise operating architecture. Experiments on three benchmark datasets demonstrate the clear superiority of our method over the state of the art and the compared baselines. The source code will be made publicly available.
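A minimal sketch of the pixel-wise label-merging idea: at every spatial location the available condition labels are treated as input tokens, missing labels are simply dropped from the token set, and attention merges the remaining tokens into one homogeneous label vector that conditions the generator. The dimensions and the pooling token are illustrative choices, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class PixelLabelMerger(nn.Module):
    def __init__(self, dim=64, num_label_types=2):
        super().__init__()
        self.embed = nn.ModuleList([nn.LazyLinear(dim) for _ in range(num_label_types)])
        self.merge_token = nn.Parameter(torch.randn(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, labels):
        # labels: list (one entry per label type) of per-pixel tensors of shape
        # (P, C_i), or None when that label is missing at these pixels.
        tokens = [emb(lab).unsqueeze(1)                  # (P, 1, dim)
                  for emb, lab in zip(self.embed, labels) if lab is not None]
        tokens = torch.cat(tokens, dim=1)                # (P, num_available, dim)
        q = self.merge_token.expand(tokens.shape[0], -1, -1)
        merged, _ = self.attn(q, tokens, tokens)         # (P, 1, dim)
        return merged.squeeze(1)                         # homogeneous label per pixel

# Example: 1,024 pixels with a 20-class semantic label available and depth missing.
merger = PixelLabelMerger()
semantic = torch.randn(1024, 20)
merged = merger([semantic, None])
print(merged.shape)   # torch.Size([1024, 64])
```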
The Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project was developed to combine cosmology with astrophysics through thousands of cosmological hydrodynamic simulations and machine learning. CAMELS contains 4,233 cosmological simulations, 2,049 N-body and 2,184 state-of-the-art hydrodynamic simulations, sampling a vast volume in parameter space. In this paper, we present the CAMELS public data release, describing the characteristics of the CAMELS simulations and the various data products generated from them, including halo, subhalo, galaxy, and void catalogs, power spectra, bispectra, Lyman-$\alpha$ spectra, probability distribution functions, halo radial profiles, and X-ray photon lists. We also release catalogs of billions of galaxies from CAMELS-SAM: a large suite of N-body simulations combined with the Santa Cruz semi-analytic model. We release all the data, comprising more than 350 terabytes and containing 143,922 snapshots, millions of halos and galaxies, and summary statistics. We provide further technical details on how to access, download, read, and process the data at https://camels.readthedocs.io.
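A minimal sketch of reading one hydrodynamic snapshot with h5py, assuming the Gadget-style HDF5 layout used by such simulations (PartType0 = gas). The file path is a placeholder; see https://camels.readthedocs.io for the actual download locations, directory structure, and field documentation:

```python
import h5py
import numpy as np

snapshot_path = "snap_033.hdf5"   # hypothetical local copy of one snapshot

with h5py.File(snapshot_path, "r") as f:
    header = dict(f["Header"].attrs)
    boxsize = header["BoxSize"]                       # comoving box size
    redshift = header["Redshift"]

    gas_pos = f["PartType0/Coordinates"][:]           # (N_gas, 3)
    gas_mass = f["PartType0/Masses"][:]               # (N_gas,)

print(f"z = {redshift:.2f}, box = {boxsize:.1f}")
print(f"gas particles: {len(gas_mass):,}, total gas mass: {np.sum(gas_mass):.3e}")
```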
We introduce fully stochastic layers into vision transformers without causing any severe degradation in performance. The additional randomness improves the robustness of visual features and strengthens privacy. In the process, linear layers with fully random parameters are used during both training and inference to transform the feature activations of each multi-layer perceptron. Such random linear operations preserve the topological structure formed by the set of tokens passing through the shared multi-layer perceptron. This encourages the network to learn to rely on the topological structure of the tokens for the recognition task rather than on their values, which in turn provides the desired robustness and privacy of the visual features. In this paper, we use our features for three different applications, namely adversarial robustness, network calibration, and feature privacy, and our features offer exciting results for these tasks. Furthermore, we present experimental setups for federated and transfer learning, in which vision transformers with random layers again show good performance. Our source code will be made publicly available.
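A minimal sketch of a vision-transformer MLP block in which an extra linear layer with fully random, non-trainable parameters transforms the token features during both training and inference. The placement and dimensions are illustrative; the point is only that the random map is shared across tokens, so the relative structure among tokens is what the rest of the network must rely on:

```python
import torch
import torch.nn as nn

class RandomizedMLPBlock(nn.Module):
    def __init__(self, dim=384, hidden=1536):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)
        self.act = nn.GELU()
        self.fc2 = nn.Linear(hidden, dim)
        # Fully random linear operation, frozen and applied at train and test time.
        rand = torch.randn(dim, dim) / dim ** 0.5
        self.register_buffer("random_proj", rand)

    def forward(self, tokens):                 # tokens: (batch, num_tokens, dim)
        x = self.fc2(self.act(self.fc1(tokens)))
        return x @ self.random_proj            # same random map for every token

block = RandomizedMLPBlock()
tokens = torch.randn(8, 197, 384)              # e.g. a ViT-S/16 token sequence
print(block(tokens).shape)                     # torch.Size([8, 197, 384])
```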