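Understanding the emergent behavior of reinforcement learning (RL) agents can be difficult, since such agents are often trained in complex environments using highly complex decision-making procedures. This has given rise to a variety of approaches to explainability in RL that aim to reconcile discrepancies between an agent's behavior and the behavior an observer anticipates. Recent approaches rely on domain knowledge, which may not always be available, on an analysis of the agent's policy, or on an analysis of specific elements of the underlying environment, typically modeled as a Markov Decision Process (MDP). Our key claim is that even if the underlying MDP is not fully known (e.g., the transition probabilities have not been accurately learned) or is not maintained by the agent (i.e., when model-free methods are used), it can nevertheless be exploited to generate explanations automatically. To this end, we suggest using formal MDP abstractions and transforms, previously used in the literature to expedite the search for optimal policies, to produce explanations automatically. Since such transforms are typically based on a symbolic representation of the environment, they may represent meaningful explanations for gaps between anticipated and actual agent behavior. We formally define this problem, suggest a class of transforms that can be used to explain emergent behaviors, and propose methods for searching efficiently for an explanation. We demonstrate the approach on a set of standard benchmarks.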
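Recent advances in multi-agent reinforcement learning (MARL) provide a variety of tools that support agents' ability to adapt to unexpected changes in their environment and to operate successfully given the environment's dynamic nature (which may be intensified by the presence of other agents). In this work, we highlight the relationship between a group's ability to collaborate effectively and the group's resilience, which we measure as the group's ability to adapt to perturbations in the environment. To promote resilience, we suggest facilitating collaboration via a novel confusion-based communication protocol, under which agents communicate observations that are misaligned with their previous experience. Decisions about the width and frequency of the information shared are learned autonomously by the agents, which are incentivized to reduce confusion. We present an empirical evaluation of our approach in a variety of MARL settings.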
Modern deep neural networks tend to be evaluated on static test sets. One shortcoming of this is the fact that these deep neural networks cannot be easily evaluated for robustness issues with respect to specific scene variations. For example, it is hard to study the robustness of these networks to variations of object scale, object pose, scene lighting and 3D occlusions. The main reason is that collecting real datasets with fine-grained naturalistic variations of sufficient scale can be extremely time-consuming and expensive. In this work, we present Counterfactual Simulation Testing, a counterfactual framework that allows us to study the robustness of neural networks with respect to some of these naturalistic variations by building realistic synthetic scenes that allow us to ask counterfactual questions to the models, ultimately providing answers to questions such as "Would your classification still be correct if the object were viewed from the top?" or "Would your classification still be correct if the object were partially occluded by another object?". Our method allows for a fair comparison of the robustness of recently released, state-of-the-art Convolutional Neural Networks and Vision Transformers, with respect to these naturalistic variations. We find evidence that ConvNext is more robust to pose and scale variations than Swin, that ConvNext generalizes better to our simulated domain and that Swin handles partial occlusion better than ConvNext. We also find that robustness for all networks improves with network scale and with data scale and variety. We release the Naturalistic Variation Object Dataset (NVD), a large simulated dataset of 272k images of everyday objects with naturalistic variations such as object pose, scale, viewpoint, lighting and occlusions. Project page: https://counterfactualsimulation.github.io
Understanding of the pathophysiology of obstructive lung disease (OLD) is limited by available methods to examine the relationship between multi-omic molecular phenomena and clinical outcomes. Integrative factorization methods for multi-omic data can reveal latent patterns of variation describing important biological signal. However, most methods do not provide a framework for inference on the estimated factorization, do not simultaneously predict important disease phenotypes or clinical outcomes, and do not accommodate multiple imputation. To address these gaps, we propose Bayesian Simultaneous Factorization (BSF). We use conjugate normal priors and show that the posterior mode of this model can be estimated by solving a structured nuclear norm-penalized objective that also achieves rank selection and motivates the choice of hyperparameters. We then extend BSF to simultaneously predict a continuous or binary response, termed Bayesian Simultaneous Factorization and Prediction (BSFP). BSF and BSFP accommodate concurrent imputation and full posterior inference for missing data, including "blockwise" missingness, and BSFP offers prediction of unobserved outcomes. We show via simulation that BSFP is competitive in recovering latent variation structure, and we demonstrate the importance of propagating uncertainty from the estimated factorization to prediction. We also study the imputation performance of BSF via simulation under missing-at-random and missing-not-at-random assumptions. Lastly, we use BSFP to predict lung function based on the bronchoalveolar lavage metabolome and proteome from a study of HIV-associated OLD. Our analysis reveals a distinct cluster of patients with OLD driven by shared metabolomic and proteomic expression patterns, as well as multi-omic patterns related to lung function decline. Software is freely available at https://github.com/sarahsamorodnitsky/BSFP.
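To illustrate why a nuclear norm penalty can perform rank selection, the generic single-matrix case is worked below. This is a standard textbook result and an assumption for illustration only: the symbols $Y$, $X$ and $\lambda$ are not taken from the abstract, and the structured multi-omic objective of BSF itself is not reproduced here.

```latex
% Generic nuclear-norm-penalized approximation of a data matrix Y
% (illustrative; not the structured BSF objective).
\hat{X}
  = \arg\min_{X} \; \tfrac{1}{2}\,\lVert Y - X \rVert_F^2 + \lambda \,\lVert X \rVert_*
  = U \,\mathrm{diag}\!\big(\max(\sigma_i - \lambda,\, 0)\big)\, V^\top,
\qquad \text{where } Y = U \,\mathrm{diag}(\sigma_i)\, V^\top \text{ is the SVD of } Y .
```

Every singular value $\sigma_i \le \lambda$ is set to zero, so the rank of $\hat{X}$ is the number of singular values of $Y$ that exceed $\lambda$. In this simplified setting, choosing the penalty weight is therefore equivalent to choosing the rank.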
Automatic sign language processing is gaining popularity in Natural Language Processing (NLP) research (Yin et al., 2021). In machine translation (MT) in particular, sign language translation based on glosses is a prominent approach. In this paper, we review recent works on neural gloss translation. We find that limitations of glosses in general and limitations of specific datasets are not discussed in a transparent manner and that there is no common standard for evaluation. To address these issues, we put forward concrete recommendations for future research on gloss translation. Our suggestions advocate awareness of the inherent limitations of gloss-based approaches, realistic datasets, stronger baselines and convincing evaluation.
Image quality assessment (IQA) forms a natural and often straightforward undertaking for humans, yet effective automation of the task remains highly challenging. Recent metrics from the deep learning community commonly compare image pairs during training to improve upon traditional metrics such as PSNR or SSIM. However, current comparisons ignore the fact that image content affects quality assessment as comparisons only occur between images of similar content. This restricts the diversity and number of image pairs that the model is exposed to during training. In this paper, we strive to enrich these comparisons with content diversity. Firstly, we relax comparison constraints, and compare pairs of images with differing content. This increases the variety of available comparisons. Secondly, we introduce listwise comparisons to provide a holistic view to the model. By including differentiable regularizers, derived from correlation coefficients, models can better adjust predicted scores relative to one another. Evaluation on multiple benchmarks, covering a wide range of distortions and image content, shows the effectiveness of our learning scheme for training image quality assessment models.
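As a rough illustration of a regularizer derived from a correlation coefficient, the sketch below computes one minus the Pearson correlation between a list of predicted scores and a list of target scores. The function names `pearson` and `correlation_regularizer` are hypothetical, the choice of Pearson correlation is an assumption, and plain Python is used only for readability; in training, the same formula would be written in an automatic differentiation framework so that gradients can flow through it.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def correlation_regularizer(predicted, target):
    """Hypothetical listwise penalty (an assumption, not the paper's exact loss).

    Returns a value in [0, 2]: 0 when the predicted scores are perfectly
    linearly correlated with the targets, 2 when perfectly anti-correlated.
    """
    return 1.0 - pearson(predicted, target)
```

Because the penalty depends on the whole list of scores at once, it rewards getting the relative placement of all images right rather than only the ordering within a single pair.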
We apply computer vision pose estimation techniques developed expressly for the data-scarce infant domain to the study of torticollis, a common condition in infants for which early identification and treatment is critical. Specifically, we use a combination of facial landmark and body joint estimation techniques designed for infants to estimate a range of geometric measures pertaining to face and upper body symmetry, drawn from an array of sources in the physical therapy and ophthalmology research literature on torticollis. We gauge performance with a range of metrics and show that the estimates of most of these geometric measures are successful, yielding strong to very strong Spearman's $\rho$ correlation with ground truth values. Furthermore, we show that these estimates, derived from pose estimation neural networks designed for the infant domain, cleanly outperform estimates derived from more widely known networks designed for the adult domain.
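For readers unfamiliar with the metric, Spearman's $\rho$ is the Pearson correlation computed on ranks rather than raw values. A minimal sketch follows, assuming tied values receive their average rank; the function names `average_ranks` and `spearman_rho` and the example inputs are hypothetical and do not come from the study.

```python
import math


def average_ranks(values):
    """1-based ranks of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_rho(estimates, ground_truth):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    if len(estimates) != len(ground_truth) or len(estimates) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx = average_ranks(estimates)
    ry = average_ranks(ground_truth)
    n = len(rx)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / math.sqrt(var_x * var_y)
```

Since only ranks matter, an estimated measure that is monotonically related to the ground truth scores $\rho = 1$ even if the relationship is not linear.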
Eye movements are known to reflect cognitive processes in reading, and psychological reading research has shown that eye gaze patterns differ between readers with and without dyslexia. In recent years, researchers have attempted to classify readers with dyslexia based on their eye movements using Support Vector Machines (SVMs). However, these approaches (i) are based on highly aggregated features averaged over all words read by a participant, thus disregarding the sequential nature of the eye movements, and (ii) do not consider the linguistic stimulus and its interaction with the reader's eye movements. In the present work, we propose two simple sequence models that process eye movements on the entire stimulus without the need to aggregate features across the sentence. Additionally, we incorporate the linguistic stimulus into the model in two ways -- contextualized word embeddings and manually extracted linguistic features. The models are evaluated on a Mandarin Chinese dataset containing eye movements from children with and without dyslexia. Our results show that (i) even for a logographic script such as Chinese, sequence models are able to classify dyslexia on eye gaze sequences, reaching state-of-the-art performance, and (ii) incorporating the linguistic stimulus does not help to improve classification performance.
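A growing number of people now rely on online platforms to meet their health information needs. Identifying inconsistent or conflicting textual health information has therefore become a critical task. Health advice data poses a unique challenge: information that is accurate in the context of one diagnosis may conflict in the context of another. For example, people with both diabetes and hypertension often receive conflicting dietary advice. This motivates the need for technologies that can provide contextualized, user-specific health advice. A crucial step toward contextualized advice is the ability to compare health advice statements and detect whether and how they conflict. This is the task of health conflict detection (HCD). Given two pieces of health advice, the goal of HCD is to detect and categorize the type of conflict. The task is challenging because (i) automatically identifying and categorizing conflicts requires a deeper understanding of the semantics of the text, and (ii) the amount of available data is very limited. In this study, we are the first to explore HCD in the context of pre-trained language models. We find that DeBERTa-v3 achieves a mean F1 score of 0.68 across all experiments. We also investigate the challenges posed by different conflict types and how synthetic data improves a model's understanding of conflict-specific semantics. Finally, we highlight the difficulty of collecting real health conflicts and propose a human-in-the-loop synthetic data augmentation approach to expand existing HCD datasets. Our HCD training dataset is more than twice as large as the existing HCD dataset and is publicly available on GitHub.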
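We consider the multi-user detection (MUD) problem in uplink grant-free non-orthogonal multiple access (NOMA), where the access point must determine the total number and correct identities of the active Internet of Things (IoT) devices, as well as the data they transmit. We assume that IoT devices use complex spreading sequences and transmit information in a random-access manner following the burst-sparsity model, in which some IoT devices transmit their data in multiple adjacent time slots with high probability, while others transmit only once during a frame. Exploiting the temporal correlation, we propose an attention-based bidirectional long short-term memory (BiLSTM) network to solve the MUD problem. The BiLSTM network builds a pattern of the device activation history using forward and reverse pass LSTMs, while the attention mechanism provides essential context for the device activation points. In this way, a hierarchical pathway is followed to detect active devices in a grant-free scenario. Then, by exploiting the complex spreading sequences, blind data detection is performed for the estimated active devices. The proposed framework does not require prior knowledge of the device sparsity level or of the channels in order to perform MUD. The results show that the proposed network performs better than existing benchmark schemes.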