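A COVID-19 vaccine is our best bet for mitigating the ongoing onslaught of the pandemic. However, the vaccine is also expected to be a limited resource. An optimal allocation strategy, especially in countries with access inequities and temporally separated hotspots, might be an effective way of halting the spread of the disease. We approach this problem by proposing a novel pipeline, VacSIM, that dovetails deep reinforcement learning models into a contextual bandits approach for optimizing the distribution of COVID-19 vaccine. Whereas the reinforcement learning models suggest better actions and rewards, contextual bandits allow the online modifications that may need to be made on a day-to-day basis in real-world scenarios. We evaluate this framework against a naive approach of allocating vaccine in proportion to the incidence of COVID-19 cases in five different states of India (Assam, Delhi, Jharkhand, Maharashtra and Nagaland), and demonstrate up to 9039 potential infections prevented and a significant increase in the efficacy of limiting the spread over a period of 45 days through the VacSIM approach. Our models and platform are extensible to all states of India and potentially across the globe. We also propose novel evaluation strategies, including standard compartmental model-based projections and a causality-preserving evaluation of our model. Since all models carry assumptions that may need to be tested in various contexts, we open-source our model VacSIM and contribute a new reinforcement learning environment compatible with OpenAI Gym to make it extensible for real-world applications across the globe (http://vacsim.tavlab.iiitd.edu.in:8000/).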
Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real-world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in the presence of task-irrelevant information. We propose architectures, which we call RepDIB, that use variational and discrete information bottlenecks to learn structured factorized representations. Exploiting the expressiveness brought by factorized representations, we introduce a simple yet effective bottleneck that can be integrated with any existing self-supervised objective for RL. We demonstrate this across several online and offline RL benchmarks, along with a real robot arm task, where we find that compressed representations with RepDIB can lead to strong performance improvements, as the learned bottlenecks help predict only the relevant state while ignoring irrelevant information.
As information extraction (IE) systems have grown more capable at whole-document extraction, the classic task of \emph{template filling} has seen renewed interest as a benchmark for evaluating them. In this position paper, we call into question the suitability of template filling for this purpose. We argue that the task demands definitive answers to thorny questions of \emph{event individuation} -- the problem of distinguishing distinct events -- about which even human experts disagree. We show through annotation studies and error analysis that this raises concerns about the usefulness of template filling evaluation metrics, the quality of datasets for the task, and the ability of models to learn it. Finally, we consider possible solutions.
We address the problem of few-shot classification, where the goal is to learn a classifier from a limited set of samples. While data-driven learning has been shown to be effective in various applications, learning from less data remains challenging. To address this challenge, existing approaches consider various data augmentation techniques for increasing the number of training samples. Pseudo-labeling is commonly used in a few-shot setup, where approximate labels are estimated for a large set of unlabeled images. We propose DiffAlign, which focuses on generating images from class labels. Specifically, we leverage the recent success of generative models (e.g., DALL-E and diffusion models) that can generate realistic images from text. However, naive learning on synthetic images is not adequate due to the domain gap between real and synthetic images. Thus, we employ a maximum mean discrepancy (MMD) loss to align the synthetic images to the real images, minimizing the domain gap. We evaluate our method on the standard few-shot classification benchmarks CIFAR-FS, FC100, miniImageNet, and tieredImageNet, and on a cross-domain few-shot classification benchmark, miniImageNet to CUB. The proposed approach significantly outperforms the state-of-the-art in both 5-shot and 1-shot setups on these benchmarks. Our approach is also shown to be effective in the zero-shot classification setup.
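As a rough illustration of the MMD alignment term mentioned in this abstract, the following is a minimal sketch of a squared-MMD estimate between two sets of feature vectors. The abstract does not say which kernel or feature extractor is used, so the RBF kernel, the `bandwidth` parameter, and the function names `rbf_kernel` and `mmd_squared` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """RBF kernel matrix between rows of a and rows of b (kernel choice is an assumption)."""
    # Pairwise squared Euclidean distances, clipped at 0 to absorb rounding error.
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(real_feats, synth_feats, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two feature sets.

    real_feats:  array of shape (n_real, dim), hypothetical features of real images
    synth_feats: array of shape (n_synth, dim), hypothetical features of synthetic images
    """
    k_rr = rbf_kernel(real_feats, real_feats, bandwidth)
    k_ss = rbf_kernel(synth_feats, synth_feats, bandwidth)
    k_rs = rbf_kernel(real_feats, synth_feats, bandwidth)
    return k_rr.mean() + k_ss.mean() - 2.0 * k_rs.mean()
```

The estimate is zero when the two sets coincide and grows as the two feature distributions move apart, which is why minimizing it can serve as a domain-alignment loss. In a training loop it would typically be computed on network features with an autodiff framework rather than NumPy.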
Soft actuators have attracted a great deal of interest in the context of rehabilitative and assistive robots because they increase safety and lower costs compared to rigid-body robotic systems. During actuation, soft actuators experience high levels of deformation, which can cause microscale fractures in their elastomeric structure; these fatigue the system over time and eventually lead to macroscale damage and failure. This paper reports finite element modeling (FEM) of pneu-nets at high angles, along with repetitive experimentation at high deformation rates, to study the effect and behavior of fatigue in soft robotic actuators, which causes deviation from the ideal behavior. Comparing the FEM model with experimental data, we show that FEM can model the performance of the actuator before fatigue up to a bending angle of 167 degrees with ~96% accuracy. We also show that the accuracy of the FEM model drops to 80% due to fatigue after repetitive high-angle bending. The results of this paper highlight the emergence of fatigue over cyclic activation of the system and the resulting deviation from the computational FEM model. Such behavior can be taken into account in future controllers so that they adapt to the time-variable and non-autonomous response dynamics of soft robots.
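We present a novel method for placing a 3D human animation into a 3D scene while maintaining any human-scene interactions in the animation. We use the notion of computing the most important meshes in the animation for the interaction with the scene, which we call "keyframes." These keyframes allow us to better optimize the placement of the animation in the scene, so that interactions in the animation (standing, lying, sitting, etc.) match the affordances of the scene (e.g., standing on the floor or lying in a bed). We compare our method, which we call PAAK, with prior approaches, including POSA, PROX ground truth, and a motion synthesis method, and highlight the benefits of our method with a perceptual study. Human raters preferred our PAAK method over the PROX ground truth data 64.6\% of the time. Additionally, in direct comparisons, raters preferred PAAK over competing methods, including 61.5\% of the time over POSA.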
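We study learning in periodic Markov decision processes (MDPs), a special type of non-stationary MDP in which both the state transition probabilities and the reward functions vary periodically, under the average reward maximization setting. We formulate the problem as a stationary MDP by augmenting the state space with the period index, and propose the periodic upper confidence bound reinforcement learning-2 (PUCRL2) algorithm. We show that the regret of PUCRL2 scales with the period and sublinearly with the horizon length. Numerical results demonstrate the efficacy of PUCRL2.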
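The state-augmentation step in this abstract, which turns a periodic MDP into a stationary one by attaching the period index to the state, can be sketched as follows. This is only an illustration for known tabular dynamics and not the PUCRL2 learning algorithm itself. The function name `augment_periodic_mdp`, the array layouts, and the indexing convention (augmented state = phase * n_states + state) are assumptions.

```python
import numpy as np


def augment_periodic_mdp(transitions, rewards):
    """Build a stationary MDP over (phase, state) pairs from periodic dynamics.

    transitions: shape (period, n_states, n_actions, n_states); transitions[k]
                 is the transition kernel used at phase k (assumed layout).
    rewards:     shape (period, n_states, n_actions); rewards[k] is the reward
                 function at phase k (assumed layout).
    Returns (aug_p, aug_r) with shapes (period * n_states, n_actions,
    period * n_states) and (period * n_states, n_actions).
    """
    period, n_states, n_actions, _ = transitions.shape
    n_aug = period * n_states
    aug_p = np.zeros((n_aug, n_actions, n_aug))
    aug_r = np.zeros((n_aug, n_actions))
    for k in range(period):
        nxt = (k + 1) % period  # the phase advances deterministically and wraps
        rows = slice(k * n_states, (k + 1) * n_states)
        cols = slice(nxt * n_states, (nxt + 1) * n_states)
        # From phase k, all probability mass goes to states tagged with phase k + 1.
        aug_p[rows, :, cols] = transitions[k]
        aug_r[rows, :] = rewards[k]
    return aug_p, aug_r
```

The augmented kernel no longer depends on time, so standard tools for stationary average-reward MDPs apply to it, at the cost of a state space that is larger by a factor equal to the period.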
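A person walking along a city street who tries to model all aspects of the world would quickly be overwhelmed by a multitude of shops, cars, and people, each following its own complex and inscrutable dynamics. Yet exploration and navigation in such an environment is an everyday task that requires no great expenditure of mental resources. Is it possible to turn this fire hose of sensory information into a minimal latent state that is necessary and sufficient for an agent to act successfully in the world? We formulate this question concretely and propose the Agent-Controllable State Discovery algorithm (AC-State), which has theoretical guarantees and is shown in practice to discover the \textit{minimal controllable latent state}, which contains all of the information necessary for controlling the agent while fully discarding all irrelevant information. The algorithm consists of a multi-step inverse model (predicting actions from distant observations) with an information bottleneck. AC-State enables localization, exploration, and navigation without rewards or demonstrations. We demonstrate the discovery of the controllable latent state in three domains: localizing a robot arm amid distractions (e.g., changing lighting conditions and background), exploring a maze alongside other agents, and navigating in the Matterport house simulator.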
Computational imaging has been revolutionized by compressed sensing (CS) algorithms, which offer guaranteed uniqueness, convergence, and stability properties. In recent years, model-based deep learning methods that combine imaging physics with learned regularization priors have been emerging as more powerful alternatives for image recovery. The main focus of this paper is to introduce a memory-efficient model-based algorithm with theoretical guarantees similar to those of CS methods. The proposed iterative algorithm alternates between a gradient descent step involving the score function and a conjugate gradient algorithm that encourages data consistency. The score function is modeled as a monotone convolutional neural network. Our analysis shows that the monotone constraint is necessary and sufficient to enforce the uniqueness of the fixed point in arbitrary inverse problems. In addition, it guarantees convergence to a fixed point that is robust to input perturbations. Current algorithms including RED and MoDL are special cases of the proposed algorithm; the proposed theoretical tools enable the optimization of the framework for the deep equilibrium setting. The proposed deep equilibrium formulation is significantly more memory-efficient than unrolled methods, which allows us to apply it to 3D or 2D+time problems that current unrolled algorithms cannot handle.
Exposure to ideas in domains outside a scientist's own may benefit her in reformulating existing research problems in novel ways and discovering new application domains for existing solution ideas. While improved performance in scholarly search engines can help scientists efficiently identify relevant advances in domains they may already be familiar with, it may fall short of helping them explore diverse ideas \textit{outside} such domains. In this paper we explore the design of systems aimed at augmenting the end-user ability in cross-domain exploration with flexible query specification. To this end, we develop an exploratory search system in which end-users can select a portion of text core to their interest from a paper abstract and retrieve papers that have a high similarity to the user-selected core aspect but differ in terms of domains. Furthermore, end-users can `zoom in' to specific domain clusters to retrieve more papers from them and understand nuanced differences within the clusters. Our case studies with scientists uncover opportunities and design implications for systems aimed at facilitating cross-domain exploration and inspiration.