We introduce an approach to the answer-aware question generation problem. Rather than relying only on the capability of strong pre-trained language models, we observe that the information needed to link an answer to its question can be found in a few relevant sentences of the context. Based on this observation, we design a model with two modules: a selector and a generator. The selector forces the model to focus more on the sentences relevant to an answer, providing implicit local information. The generator produces questions by implicitly combining this local information from the selector with global information from the whole context, as encoded by the encoder. The two modules are trained jointly to take advantage of the latent interactions between them. Experimental results on two benchmark datasets show that our model outperforms strong pre-trained models on the question generation task. The code is available (shorturl.at/lV567).
Air pollution is an emerging problem that needs to be solved in both developed and developing countries. In Vietnam, air pollution is a concern in big cities such as Hanoi and Ho Chi Minh City, where it comes mostly from vehicles such as cars and motorbikes. To tackle the problem, this paper develops a solution that estimates emitted PM2.5 pollutants by counting the number of vehicles in traffic. We first surveyed recent object detection models and developed our own traffic surveillance system. The observed traffic density showed a trend similar to the measured PM2.5, with a certain time lag, suggesting a relation between traffic density and PM2.5. We then express this relationship as a mathematical model that estimates the PM2.5 value from the observed traffic density. The estimates showed a strong correlation with the measured PM2.5 values in the urban setting.
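The time lag between traffic density and PM2.5 mentioned above can be illustrated with a simple lagged-correlation search. This is only a sketch under stated assumptions: the abstract does not describe how the lag was determined, and the function names `pearson` and `best_lag`, the use of Pearson correlation, and the search over non-negative integer lags are all hypothetical choices made here for illustration.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_lag(traffic, pm25, max_lag):
    """Return (lag, r) such that pm25[t + lag] correlates best with traffic[t].

    Assumption: both series are sampled at the same fixed interval, and
    PM2.5 follows traffic (lag >= 0), as the abstract suggests.
    """
    best = (0, float("-inf"))
    for lag in range(max_lag + 1):
        shifted_pm25 = pm25[lag:]
        n = min(len(traffic), len(shifted_pm25))
        if n < 3:  # too few overlapping samples to correlate meaningfully
            continue
        r = pearson(traffic[:n], shifted_pm25[:n])
        if r > best[1]:
            best = (lag, r)
    return best
```

Given two aligned series, `best_lag(traffic, pm25, max_lag)` returns the shift at which the PM2.5 curve best follows the traffic curve; a regression fitted at that shift would be one possible form of the mathematical model the abstract refers to.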
Pareto Front Learning (PFL) was recently introduced as an effective approach for obtaining a mapping function from a given trade-off vector to a solution on the Pareto front, which solves the multi-objective optimization (MOO) problem. Because of the inherent trade-off between conflicting objectives, PFL offers a flexible approach in the many scenarios where decision makers cannot specify a preference for one Pareto solution over another and must switch between them depending on the situation. However, existing PFL methods ignore the relationship between solutions during the optimization process, which degrades the quality of the obtained front. To overcome this issue, we propose a novel PFL framework, namely \ourmodel, which employs a hypernetwork to generate multiple solutions from a set of diverse trade-off preferences and enhances the quality of the Pareto front by maximizing the Hypervolume indicator defined by these solutions. Experimental results on several MOO machine learning tasks show that the proposed framework significantly outperforms the baselines in producing the trade-off Pareto front.
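To make the Hypervolume indicator mentioned above concrete, the following minimal sketch computes it for two objectives that are both minimized. It is not the framework's implementation: the function name `hypervolume_2d`, the restriction to two objectives, and the choice of reference point are assumptions made here for illustration.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference`.

    Assumption: both objectives are minimized, so a larger area means a
    better approximation of the Pareto front.
    """
    ref_x, ref_y = reference
    # Keep only points that strictly improve on the reference in both objectives.
    candidates = sorted(p for p in points if p[0] < ref_x and p[1] < ref_y)
    area = 0.0
    best_y = ref_y  # lowest second-objective value seen so far
    for x, y in candidates:
        if y < best_y:  # not dominated by any point scanned earlier
            # Add the horizontal strip that this point newly dominates.
            area += (ref_x - x) * (best_y - y)
            best_y = y
    return area
```

With the reference point (4, 4), the three solutions (1, 3), (2, 2) and (3, 1) dominate an area of 6; adding a dominated solution such as (3, 3) leaves the value unchanged, which is why maximizing the indicator rewards solutions that are both close to the front and well spread along it.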
Online Class Incremental Learning (CIL) is a challenging setting in Continual Learning (CL), in which data of new tasks arrive in incoming streams and online learning models must handle these streams without revisiting previous ones. Existing works characterize a class with a single centroid that is adapted to the incoming data stream. This approach may expose limitations when the incoming data stream of a class is naturally multimodal. To address this issue, we first propose an online mixture model learning approach built on well-established properties of optimal transport theory (OT-MM). Specifically, the centroids and covariance matrices of the mixture model are adapted incrementally according to the incoming data streams. The advantages are two-fold: (i) we can characterize complex data streams more accurately, and (ii) by using the centroids produced by OT-MM for each class, we can estimate the similarity of an unseen example to each class more reasonably at inference time. Moreover, to combat catastrophic forgetting in the CIL scenario, we further propose Dynamic Preservation. After applying the dynamic preservation technique across data streams, the latent representations of the classes in the old and new tasks become more compact and better separated from each other. Together with a contraction feature extractor, this technique helps the model mitigate catastrophic forgetting. Experimental results on real-world datasets show that our proposed method significantly outperforms the current state-of-the-art baselines.
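Medication mistakes are one of the risks that can lead to unpredictable consequences for patients. To mitigate this risk, we develop an automatic system that correctly matches pills to prescriptions in mobile images. Specifically, we define the so-called pill-prescription matching task, which attempts to match images of the pills taken with the pill names in the prescription. We then propose PIMA, a novel approach that uses a Graph Neural Network (GNN) and contrastive learning to address the target problem. In particular, the GNN is used to learn the spatial correlation between the text boxes in the prescription, thereby highlighting the text boxes that carry pill names. In addition, contrastive learning is employed to facilitate the modeling of cross-modal similarity between the textual representations of pill names and the visual representations of pill images. We conducted extensive experiments and demonstrate that PIMA outperforms baseline models on a real-world dataset of pill and prescription images that we constructed. Specifically, compared with other baselines, PIMA improves the accuracy from 19.09% to 46.95%. We believe our work can open up new opportunities to build new clinical applications and to improve medication safety and patient care.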
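The cross-modal contrastive objective described above can be sketched with a common InfoNCE-style loss, in which the text embedding of pill name i should be closer to the image embedding of pill i than to any other image in the batch. The abstract does not state which contrastive loss PIMA uses, so the loss form, the function names `cosine` and `info_nce`, and the `temperature` value are assumptions made here for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(text_embs, image_embs, temperature=0.1):
    """Mean cross-entropy of matching text i to image i within the batch.

    Assumption: text_embs[i] and image_embs[i] describe the same pill, and
    every other image in the batch serves as a negative example.
    """
    losses = []
    for i, text in enumerate(text_embs):
        logits = [cosine(text, image) / temperature for image in image_embs]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

The loss is close to zero when each text embedding points at its own image embedding and grows when the pairs are mismatched, which is the behavior that pulls matching name and image representations together during training.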
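In video compression, coding efficiency is improved by reusing pixels from previously decoded frames via motion and residual compensation. We define two levels of hierarchical redundancy in video frames: 1) first-order: redundancy in pixel space, i.e., the similarity of pixel values across neighboring frames, which is captured effectively by motion and residual compensation; 2) second-order: redundancy in the motion and residual maps caused by smooth motion in natural videos. While most of the existing neural video coding literature addresses first-order redundancy, we tackle the problem of capturing second-order redundancy in neural video codecs via predictors. We introduce generic motion and residual predictors that learn to extrapolate from previously decoded data. These predictors are lightweight and can be used with most neural video codecs to improve their rate-distortion performance. Moreover, while RGB is the dominant colorspace in the neural video coding literature, we introduce general modifications that allow neural video codecs to embrace the YUV420 colorspace, and we report YUV420 results. Our experiments show that using our predictors with a well-known neural video codec yields bitrate savings of 38% and 34% in the RGB and YUV420 colorspaces, respectively, measured on the UVG dataset.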
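Realizing the potential of neural video codecs on mobile devices is a major technological challenge, owing to the computational complexity of deep networks and the power constraints of mobile hardware. We demonstrate practical feasibility by leveraging Qualcomm's technology and innovation, bridging the gap from neural-network-based codec simulations running on wall-powered workstations to real-time operation on a mobile device powered by Snapdragon technology. We show the first-ever inter-frame neural video decoder running on a commercial mobile phone, decoding high-definition video in real time while maintaining a low bitrate and high visual quality.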
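Gait recognition, which refers to recognizing or identifying a person based on body shape and walking style derived from video data captured at a distance, is widely used in crime prevention, forensic identification, and social security. However, to the best of our knowledge, most existing methods use appearance, pose, and temporal features without considering a learned temporal attention mechanism for fusing global and local information. In this paper, we propose a novel gait recognition framework, called Temporal Attention and Keypoint-guided Embedding (GaitTAKE), which effectively fuses temporal-attention-based global and local appearance features with temporally aggregated human pose features. Experimental results show that our proposed method achieves a new SOTA in gait recognition, with rank-1 accuracy of 98.0% (normal), 97.5% (bag), and 92.2% (coat) on the CASIA-B gait dataset, and 90.4% accuracy on the OU-MVLP gait dataset.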
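For readers unfamiliar with the rank-1 accuracy metric reported above, the sketch below shows how it is typically computed: each probe embedding is matched to its nearest gallery embedding, and the match counts as correct when the identities agree. The function name `rank1_accuracy` and the use of squared Euclidean distance are assumptions made here for illustration; the abstract does not specify the distance used.

```python
def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Fraction of probes whose nearest gallery sample has the same identity.

    Assumption: features are equal-length lists of floats compared with
    squared Euclidean distance.
    """
    correct = 0
    for feat, true_id in zip(probe_feats, probe_ids):
        distances = [
            sum((a - b) ** 2 for a, b in zip(feat, gallery_feat))
            for gallery_feat in gallery_feats
        ]
        nearest = distances.index(min(distances))  # top-ranked gallery sample
        if gallery_ids[nearest] == true_id:
            correct += 1
    return correct / len(probe_ids)
```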
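We study the problem of policy optimization (PO) with linear temporal logic (LTL) constraints. The language of LTL allows flexible description of tasks that may be unnatural to encode as a scalar cost function. We consider LTL-constrained PO as a systematic framework that decouples task specification from policy selection, and as an alternative to the standard practice of cost shaping. With access to a generative model, we develop a model-based approach that enjoys a sample complexity analysis guaranteeing both task satisfaction and cost optimality (through a reduction to a reachability problem). Empirically, our algorithm achieves strong performance even in low-sample regimes.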
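Sampling from an unnormalized target distribution is a fundamental problem with many applications in probabilistic inference. Stein Variational Gradient Descent (SVGD) has been shown to be a powerful method that iteratively updates a set of particles to approximate the distribution of interest. Moreover, when its asymptotic properties are analyzed, SVGD reduces exactly to a single-objective optimization problem and can be viewed as a probabilistic version of that single-objective optimization problem. A natural question then arises: "Can we derive a probabilistic version of multi-objective optimization?" To answer this question, we propose Stochastic Multiple Target Sampling Gradient Descent (MT-SGD), which enables us to sample from multiple unnormalized target distributions. Specifically, MT-SGD conducts a flow of intermediate distributions that gradually orient toward the multiple target distributions, which allows the sampled particles to move to the joint high-likelihood region of the target distributions. Interestingly, the asymptotic analysis shows that, as expected, our approach reduces exactly to the multiple-gradient descent algorithm for multi-objective optimization. Finally, we conduct comprehensive experiments to demonstrate the merit of our approach for multi-task learning.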
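As background for the particle update that SVGD performs, here is a minimal one-dimensional sketch of a single SVGD step with a fixed-bandwidth RBF kernel. It illustrates the standard single-target SVGD update only, not MT-SGD itself; the function name `svgd_step`, the scalar particles, and the default `step_size` and `bandwidth` values are assumptions made here for illustration.

```python
import math


def svgd_step(particles, grad_log_p, step_size=0.1, bandwidth=1.0):
    """Apply one SVGD update to a list of scalar particles.

    grad_log_p(x) is the derivative of the log target density at x; the
    normalizing constant of the target is never needed.
    Kernel (an assumption of this sketch): k(a, b) = exp(-(a - b)^2 / bandwidth).
    """
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / bandwidth)
            grad_k = -2.0 * (xj - xi) / bandwidth * k  # derivative of k w.r.t. xj
            # Attraction toward high density plus repulsion between particles.
            phi += k * grad_log_p(xj) + grad_k
        updated.append(xi + step_size * phi / n)
    return updated
```

Repeating the step with, for example, `grad_log_p = lambda x: -(x - 2.0)` (a unit-variance Gaussian centered at 2) moves the particles toward the target while the kernel-gradient term keeps them from collapsing onto a single point.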