尽管Shapley值为DNN模型预测提供了有效的解释,但该计算依赖于所有可能的输入特征联盟的枚举,这导致了指数增长的复杂性。为了解决这个问题,我们提出了一种新颖的方法剪切,以显着加速DNN模型的Shapley解释,其中计算中只有几个输入特征的联盟。特征联盟的选择遵循我们提出的Shapley链规则,以最大程度地减少地面shapley值的绝对误差,从而使计算既有效又准确。为了证明有效性,我们全面评估了跨多个指标的剪切,包括地面真相shapley价值的绝对误差,解释的忠诚和跑步速度。实验结果表明,剪切始终优于不同评估指标的最先进的基线方法,这证明了其在计算资源受到限制的现实应用程序中的潜力。
translated by 谷歌翻译
Generating realistic 3D worlds occupied by moving humans has many applications in games, architecture, and synthetic data creation. But generating such scenes is expensive and labor intensive. Recent work generates human poses and motions given a 3D scene. Here, we take the opposite approach and generate 3D indoor scenes given 3D human motion. Such motions can come from archival motion capture or from IMU sensors worn on the body, effectively turning human movement in a "scanner" of the 3D world. Intuitively, human movement indicates the free-space in a room and human contact indicates surfaces or objects that support activities such as sitting, lying or touching. We propose MIME (Mining Interaction and Movement to infer 3D Environments), which is a generative model of indoor scenes that produces furniture layouts that are consistent with the human movement. MIME uses an auto-regressive transformer architecture that takes the already generated objects in the scene as well as the human motion as input, and outputs the next plausible object. To train MIME, we build a dataset by populating the 3D FRONT scene dataset with 3D humans. Our experiments show that MIME produces more diverse and plausible 3D scenes than a recent generative scene method that does not know about human movement. Code and data will be available for research at https://mime.is.tue.mpg.de.
translated by 谷歌翻译
The Age-of-Information (AoI) metric has been widely studied in the theoretical communication networks and queuing systems literature. However, experimental evaluation of its applicability to complex real-world time-sensitive systems is largely lacking. In this work, we develop, implement, and evaluate an AoI-based application layer middleware that enables the customization of WiFi networks to the needs of time-sensitive applications. By controlling the storage and flow of information in the underlying WiFi network, our middleware can: (i) prevent packet collisions; (ii) discard stale packets that are no longer useful; and (iii) dynamically prioritize the transmission of the most relevant information. To demonstrate the benefits of our middleware, we implement a mobility tracking application using a swarm of UAVs communicating with a central controller via WiFi. Our experimental results show that, when compared to WiFi-UDP/WiFi-TCP, the middleware can improve information freshness by a factor of 109x/48x and tracking accuracy by a factor of 4x/6x, respectively. Most importantly, our results also show that the performance gains of our approach increase as the system scales and/or the traffic load increases.
translated by 谷歌翻译
Consider a scenario in one-shot query-guided object localization where neither an image of the object nor the object category name is available as a query. In such a scenario, a hand-drawn sketch of the object could be a choice for a query. However, hand-drawn crude sketches alone, when used as queries, might be ambiguous for object localization, e.g., a sketch of a laptop could be confused for a sofa. On the other hand, a linguistic definition of the category, e.g., a small portable computer small enough to use in your lap" along with the sketch query, gives better visual and semantic cues for object localization. In this work, we present a multimodal query-guided object localization approach under the challenging open-set setting. In particular, we use queries from two modalities, namely, hand-drawn sketch and description of the object (also known as gloss), to perform object localization. Multimodal query-guided object localization is a challenging task, especially when a large domain gap exists between the queries and the natural images, as well as due to the challenge of combining the complementary and minimal information present across the queries. For example, hand-drawn crude sketches contain abstract shape information of an object, while the text descriptions often capture partial semantic information about a given object category. To address the aforementioned challenges, we present a novel cross-modal attention scheme that guides the region proposal network to generate object proposals relevant to the input queries and a novel orthogonal projection-based proposal scoring technique that scores each proposal with respect to the queries, thereby yielding the final localization results. ...
translated by 谷歌翻译
This paper examines the encoding of analogy in large-scale pretrained language models, such as BERT and GPT-2. Existing analogy datasets typically focus on a limited set of analogical relations, with a high similarity of the two domains between which the analogy holds. As a more realistic setup, we introduce the Scientific and Creative Analogy dataset (SCAN), a novel analogy dataset containing systematic mappings of multiple attributes and relational structures across dissimilar domains. Using this dataset, we test the analogical reasoning capabilities of several widely-used pretrained language models (LMs). We find that state-of-the-art LMs achieve low performance on these complex analogy tasks, highlighting the challenges still posed by analogy understanding.
translated by 谷歌翻译
Although prediction models for delirium, a commonly occurring condition during general hospitalization or post-surgery, have not gained huge popularity, their algorithmic bias evaluation is crucial due to the existing association between social determinants of health and delirium risk. In this context, using MIMIC-III and another academic hospital dataset, we present some initial experimental evidence showing how sociodemographic features such as sex and race can impact the model performance across subgroups. With this work, our intent is to initiate a discussion about the intersectionality effects of old age, race and socioeconomic factors on the early-stage detection and prevention of delirium using ML.
translated by 谷歌翻译
This paper presents a framework for jointly grounding objects that follow certain semantic relationship constraints given in a scene graph. A typical natural scene contains several objects, often exhibiting visual relationships of varied complexities between them. These inter-object relationships provide strong contextual cues toward improving grounding performance compared to a traditional object query-only-based localization task. A scene graph is an efficient and structured way to represent all the objects and their semantic relationships in the image. In an attempt towards bridging these two modalities representing scenes and utilizing contextual information for improving object localization, we rigorously study the problem of grounding scene graphs on natural images. To this end, we propose a novel graph neural network-based approach referred to as Visio-Lingual Message PAssing Graph Neural Network (VL-MPAG Net). In VL-MPAG Net, we first construct a directed graph with object proposals as nodes and an edge between a pair of nodes representing a plausible relation between them. Then a three-step inter-graph and intra-graph message passing is performed to learn the context-dependent representation of the proposals and query objects. These object representations are used to score the proposals to generate object localization. The proposed method significantly outperforms the baselines on four public datasets.
translated by 谷歌翻译
We consider stochastic gradient descents on the space of large symmetric matrices of suitable functions that are invariant under permuting the rows and columns using the same permutation. We establish deterministic limits of these random curves as the dimensions of the matrices go to infinity while the entries remain bounded. Under a "small noise" assumption the limit is shown to be the gradient flow of functions on graphons whose existence was established in arXiv:2111.09459. We also consider limits of stochastic gradient descents with added properly scaled reflected Brownian noise. The limiting curve of graphons is characterized by a family of stochastic differential equations with reflections and can be thought of as an extension of the classical McKean-Vlasov limit for interacting diffusions. The proofs introduce a family of infinite-dimensional exchangeable arrays of reflected diffusions and a novel notion of propagation of chaos for large matrices of interacting diffusions.
translated by 谷歌翻译
紧固件在确保机械的各个部位方面起着至关重要的作用。紧固件表面的凹痕,裂缝和划痕等变形是由材料特性和生产过程中设备的错误处理引起的。结果,需要质量控制以确保安全可靠的操作。现有的缺陷检查方法依赖于手动检查,该检查消耗了大量时间,金钱和其他资源;同样,由于人为错误,无法保证准确性。自动缺陷检测系统已证明对缺陷分析的手动检查技术有影响。但是,诸如卷积神经网络(CNN)和基于深度学习的方法之类的计算技术是进化方法。通过仔细选择设计参数值,可以实现CNN的全部电势。使用基于Taguchi的实验和分析设计,已经尝试在本研究中开发强大的自动系统。用于训练系统的数据集是为具有两个标记类别的M14尺寸螺母手动创建的:有缺陷且无缺陷。数据集中共有264张图像。所提出的顺序CNN的验证精度为96.3%,在0.001学习率下的验证损失为0.277。
translated by 谷歌翻译
基于分解的模型(FMS),例如Distmult,在知识图完成(KGC)任务中享有持久的成功,通常优于图形神经网络(GNNS)。但是,与GNN不同,FMS难以合并节点特征并概括在归纳环境中看不见的节点。我们的工作通过提出重构GNN来弥合FMS和GNN之间的差距。这种新的体系结构借鉴了两种建模范式,以前在很大程度上被认为是不结合的。具体地说,使用消息通讯的形式主义,我们通过将梯度下降程序重新定义为消息传播操作来展示如何将FMS施加为GNN,这构成了我们重构GNN的基础。在众多成熟的KGC基准测试中,我们的重构GNN可以实现与FMS相当的转导性能以及最先进的归纳性能,同时使用较少的参数阶数。
translated by 谷歌翻译