文本生成的广泛使用的评估指标要么与更长的文本效果不错,要么无法评估文本质量的所有方面。在本文中,我们引入了一个名为SMART的新指标,以减轻此类限制。具体而言,我们将句子视为匹配的基本单位,而不是代币,并使用句子匹配函数来匹配匹配候选和参考句子。还将候选句子与源文件中的句子进行了比较,以允许接地(例如,事实)评估。我们的结果表明,我们提出的指标与基于模型的匹配函数的系统级相关性优于萨姆瓦尔摘要元评估数据集上的所有竞争指标指标。后者不使用任何神经模型,这在模型开发阶段很有用,在这些阶段,资源可以受到限制且需要快速评估。最后,我们还进行了广泛的分析,表明我们提出的指标与较长的摘要很好地运行,并且对特定模型的偏见较小。
translated by 谷歌翻译
传达相关和忠实信息的能力对于有条件生成的许多任务至关重要,但对于神经SEQ-seq seq模型仍然难以捉摸,这些模型的输出通常显示出幻觉,并且无法正确涵盖重要细节。在这项工作中,我们主张规划作为有用的中间表示,以使有条件的一代减少不透明和扎根。我们的作品提出了将文本计划作为一系列提问(QA)对的新概念化。我们用QA蓝图作为内容选择(即〜说什么)和计划(即〜按什么顺序)来增强现有数据集(例如,用于摘要)。我们通过利用最先进的问题生成技术并将输入输出对自动获取蓝图,并将其转换为输入 - 蓝图输出输出元组。我们开发了基于变压器的模型,每个模型都在它们如何将蓝图合并到生成的输出中(例如,作为全局计划或迭代)。跨指标和数据集的评估表明,蓝图模型比不采取计划并允许对生成输出进行更严格控制的替代方案更为事实。
translated by 谷歌翻译
常识性推理系统应该能够推广到各种推理案例。但是,大多数最先进的方法都取决于昂贵的数据注释,并且在不学习如何执行一般语义推理的情况下过度适合特定基准。为了克服这些缺点,零射击质量检查系统通过将常识性知识图(kg)转换为合成质量质量质量质量验证样本进行模型训练,已将有望作为强大的学习方案显示出来。考虑到不断增加的不同常识性KG类型,本文旨在将零拍传输的学习方案扩展到多种源设置,在这种设置中,可以协同使用不同的KGS。为了实现这一目标,我们建议通过将知识聚合的模块化变体作为一个新的零摄影常识性推理框架来减轻不同知识源之间的干扰丧失。五个常识性推理基准的结果证明了我们框架的功效,从而改善了多个公斤的性能。
translated by 谷歌翻译
对颅面畸形的评估需要稀疏可用的患者数据。统计形状模型提供了现实和合成数据,从而实现了公共数据集上现有方法的比较。我们建立了第一个公开可获得的颅骨肌肤肤化患者的统计3D头号,并将重点关注比1.5年更年轻的婴儿。对于通信建立,我们测试和评估四种模板变形方法。我们进一步提出了一种基于模型的基于模型的基于模型的分类方法,用于摄影测图表面扫描。据我们所知,我们的研究使用最大的Craniosynosisosis患者数据集,以迄今为止的粗糙化和统计形状建模的分类研究。我们展示了我们的形状模型与人头的其他统计形状模型类似。特异性抗皱性病理学在该模型的第一个特征模具中表示。关于Craniosynostis的自动分类,我们的分类方法能够提供97.3%的精度,与使用两种计算机断层扫描扫描和立体测量法进行的其他最先进的方法相当。我们公开的颅骨弯曲特异性统计形状模型能够评估粗糙化和合成数据的颅骨。我们进一步提出了一种基于最先进的形状模型的分类方法,用于无放射诊断性的颅骨。
translated by 谷歌翻译
The 3D-aware image synthesis focuses on conserving spatial consistency besides generating high-resolution images with fine details. Recently, Neural Radiance Field (NeRF) has been introduced for synthesizing novel views with low computational cost and superior performance. While several works investigate a generative NeRF and show remarkable achievement, they cannot handle conditional and continuous feature manipulation in the generation procedure. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which can synthesize conditionally manipulated photorealistic 3D-consistent images by projecting conditional features to the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated with three image datasets, AFHQ, CelebA, and Cars. As a result, our model shows strong 3D-consistency with fine details and smooth interpolation in conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF exhibits a Fr\'echet Inception Distance (FID) of 7.64 in 3D-aware face image synthesis with a $\text{128}^{2}$ resolution. Additionally, we provide FIDs of generated 3D-aware images of each class of the datasets as it is possible to synthesize class-conditional images with $\text{C}^{3}$G-NeRF.
translated by 谷歌翻译
In both terrestrial and marine ecology, physical tagging is a frequently used method to study population dynamics and behavior. However, such tagging techniques are increasingly being replaced by individual re-identification using image analysis. This paper introduces a contrastive learning-based model for identifying individuals. The model uses the first parts of the Inception v3 network, supported by a projection head, and we use contrastive learning to find similar or dissimilar image pairs from a collection of uniform photographs. We apply this technique for corkwing wrasse, Symphodus melops, an ecologically and commercially important fish species. Photos are taken during repeated catches of the same individuals from a wild population, where the intervals between individual sightings might range from a few days to several years. Our model achieves a one-shot accuracy of 0.35, a 5-shot accuracy of 0.56, and a 100-shot accuracy of 0.88, on our dataset.
translated by 谷歌翻译
Feature selection helps reduce data acquisition costs in ML, but the standard approach is to train models with static feature subsets. Here, we consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information. DFS is often addressed with reinforcement learning (RL), but we explore a simpler approach of greedily selecting features based on their conditional mutual information. This method is theoretically appealing but requires oracle access to the data distribution, so we develop a learning approach based on amortized optimization. The proposed method is shown to recover the greedy policy when trained to optimality and outperforms numerous existing feature selection methods in our experiments, thus validating it as a simple but powerful approach for this problem.
translated by 谷歌翻译
The purpose of this work was to tackle practical issues which arise when using a tendon-driven robotic manipulator with a long, passive, flexible proximal section in medical applications. A separable robot which overcomes difficulties in actuation and sterilization is introduced, in which the body containing the electronics is reusable and the remainder is disposable. A control input which resolves the redundancy in the kinematics and a physical interpretation of this redundancy are provided. The effect of a static change in the proximal section angle on bending angle error was explored under four testing conditions for a sinusoidal input. Bending angle error increased for increasing proximal section angle for all testing conditions with an average error reduction of 41.48% for retension, 4.28% for hysteresis, and 52.35% for re-tension + hysteresis compensation relative to the baseline case. Two major sources of error in tracking the bending angle were identified: time delay from hysteresis and DC offset from the proximal section angle. Examination of these error sources revealed that the simple hysteresis compensation was most effective for removing time delay and re-tension compensation for removing DC offset, which was the primary source of increasing error. The re-tension compensation was also tested for dynamic changes in the proximal section and reduced error in the final configuration of the tip by 89.14% relative to the baseline case.
translated by 谷歌翻译
According to the rapid development of drone technologies, drones are widely used in many applications including military domains. In this paper, a novel situation-aware DRL- based autonomous nonlinear drone mobility control algorithm in cyber-physical loitering munition applications. On the battlefield, the design of DRL-based autonomous control algorithm is not straightforward because real-world data gathering is generally not available. Therefore, the approach in this paper is that cyber-physical virtual environment is constructed with Unity environment. Based on the virtual cyber-physical battlefield scenarios, a DRL-based automated nonlinear drone mobility control algorithm can be designed, evaluated, and visualized. Moreover, many obstacles exist which is harmful for linear trajectory control in real-world battlefield scenarios. Thus, our proposed autonomous nonlinear drone mobility control algorithm utilizes situation-aware components those are implemented with a Raycast function in Unity virtual scenarios. Based on the gathered situation-aware information, the drone can autonomously and nonlinearly adjust its trajectory during flight. Therefore, this approach is obviously beneficial for avoiding obstacles in obstacle-deployed battlefields. Our visualization-based performance evaluation shows that the proposed algorithm is superior from the other linear mobility control algorithms.
translated by 谷歌翻译
In robotics and computer vision communities, extensive studies have been widely conducted regarding surveillance tasks, including human detection, tracking, and motion recognition with a camera. Additionally, deep learning algorithms are widely utilized in the aforementioned tasks as in other computer vision tasks. Existing public datasets are insufficient to develop learning-based methods that handle various surveillance for outdoor and extreme situations such as harsh weather and low illuminance conditions. Therefore, we introduce a new large-scale outdoor surveillance dataset named eXtremely large-scale Multi-modAl Sensor dataset (X-MAS) containing more than 500,000 image pairs and the first-person view data annotated by well-trained annotators. Moreover, a single pair contains multi-modal data (e.g. an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). This is the first large-scale first-person view outdoor multi-modal dataset focusing on surveillance tasks to the best of our knowledge. We present an overview of the proposed dataset with statistics and present methods of exploiting our dataset with deep learning-based algorithms. The latest information on the dataset and our study are available at https://github.com/lge-robot-navi, and the dataset will be available for download through a server.
translated by 谷歌翻译