The Annals of Joseon Dynasty (AJD) contain the daily records of the kings of Joseon, the 500-year kingdom preceding the modern nation of Korea. The Annals were originally written in an archaic Korean writing system, `Hanja', and were translated into Korean from 1968 to 1993. The resulting translation was, however, too literal and contained many archaic Korean words; thus, a new expert translation effort began in 2012. In the decade since, the records of only one king have been completed. In parallel, expert translators are working on an English translation, also at a slow pace, and have produced only one king's records in English so far. We therefore propose H2KE, a neural machine translation model that translates historical documents in Hanja into more easily understandable Korean and into English. Built on top of multilingual neural machine translation, H2KE learns to translate a historical document written in Hanja from both a full dataset of outdated Korean translations and a small dataset of more recently translated contemporary Korean and English. We compare our method against two baselines: a recent model that simultaneously learns to restore and translate Hanja historical documents, and a Transformer-based model trained only on newly translated corpora. The experiments reveal that our method significantly outperforms the baselines in terms of BLEU scores for both contemporary Korean and English translations. We further conduct an extensive human evaluation, which shows that our translation is preferred over the original expert translations by both experts and non-expert Korean speakers.
translated by Google Translate
This report summarizes the 3rd International Verification of Neural Networks Competition (VNN-COMP 2022), held as a part of the 5th Workshop on Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), which was collocated with the 34th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specifications (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2022 iteration, 11 teams participated on a diverse set of 12 scored benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of the competition.
Machine learning models are increasingly deployed for critical decision-making tasks, making it important to verify that they do not contain gender or racial biases picked up from training data. Typical approaches to achieving fairness revolve around efforts to clean or curate training data, with post-hoc statistical evaluation of the fairness of the model on evaluation data. In contrast, we propose techniques to \emph{prove} fairness using recently developed formal methods that verify properties of neural network models. Beyond the strength of guarantee implied by a formal proof, our methods have the advantage that we do not need explicit training or evaluation data (which is often proprietary) in order to analyze a given trained model. In experiments on two familiar datasets in the fairness literature (COMPAS and ADULTS), we show that through proper training, we can reduce unfairness by an average of 65.4\% at a cost of less than 1\% in AUC score.
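The knowledge-based visual question answering (KVQA) task aims to answer questions that require additional external knowledge as well as an understanding of the image and the question. Recent studies on KVQA inject external knowledge in a multi-modal form, and as more knowledge is used, irrelevant information may be added and can confuse question answering. To use knowledge properly, this study proposes the following: 1) we introduce a novel semantic inconsistency measure computed from caption uncertainty and semantic similarity; 2) we suggest a new external knowledge assimilation method based on the semantic inconsistency measure and apply it to integrate explicit and implicit knowledge for KVQA; 3) the proposed method is evaluated on the OK-VQA dataset and achieves state-of-the-art performance.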
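We propose a novel approach to efficiently compute tight non-convex enclosures of the image through neural networks with ReLU, sigmoid, or hyperbolic tangent activation functions. In particular, we abstract the input-output relation of each neuron by a polynomial approximation, which is evaluated in a set-based manner using polynomial zonotopes. Our proposed method is especially well suited for reachability analysis of neural-network-controlled systems, since polynomial zonotopes are able to capture the non-convexity in both the image through the neural network and the reachable set. We demonstrate the superior performance of our approach compared to other state-of-the-art methods on various benchmark systems.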
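To make the per-neuron abstraction concrete, the following is a minimal sketch that fits a low-degree polynomial to an activation function over an input interval and measures the approximation error on a sample grid. It is an illustration under stated assumptions, not the authors' method: the function name `fit_activation_polynomial`, the least-squares fit, and the sampled error are all hypothetical choices made here, the sampled error is not a sound bound, and the set-based evaluation on polynomial zonotopes is not shown.

```python
import numpy as np


def fit_activation_polynomial(activation, lower, upper, degree=3, samples=1001):
    """Least-squares polynomial fit of `activation` on [lower, upper].

    Hypothetical helper for illustration only. Returns the coefficients
    (highest degree first, as np.polyfit does) and the largest absolute
    deviation observed on the sample grid. That deviation is an empirical
    estimate, not a guaranteed enclosure of the approximation error.
    """
    xs = np.linspace(lower, upper, samples)
    ys = activation(xs)
    coeffs = np.polyfit(xs, ys, degree)
    sampled_error = float(np.max(np.abs(np.polyval(coeffs, xs) - ys)))
    return coeffs, sampled_error


# Example: approximate tanh for a neuron whose input lies in [-1, 1].
coeffs, error = fit_activation_polynomial(np.tanh, -1.0, 1.0)
print("coefficients:", coeffs)
print("max sampled error:", error)
```

A sound analysis would replace the sampled error with a rigorous bound and add it to the enclosure as an interval term; the sketch only shows the shape of the approximation step.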
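Neural vocoders based on the generative adversarial network (GAN) have been widely used because of their fast inference speed and lightweight networks while generating high-quality speech waveforms. Since the perceptually important speech components are concentrated primarily in the low-frequency bands, most GAN-based neural vocoders perform multi-scale analysis that evaluates downsampled speech waveforms. This multi-scale analysis helps the generator improve speech intelligibility. However, in preliminary experiments, we observed that multi-scale analysis focused on the low-frequency bands causes unintended artifacts, e.g., aliasing and imaging artifacts, which degrade the quality of the synthesized speech waveform. Therefore, in this paper, we investigate the relationship between these artifacts and GAN-based neural vocoders and propose a GAN-based neural vocoder, called Avocodo, that allows the synthesis of high-fidelity speech with reduced artifacts. We introduce two kinds of discriminators to evaluate waveforms from various perspectives: a collaborative multi-band discriminator and a sub-band discriminator. We also utilize a pseudo quadrature mirror filter bank to obtain downsampled multi-band waveforms while avoiding aliasing. The experimental results show that Avocodo outperforms conventional GAN-based neural vocoders in both speech and singing voice synthesis tasks and can synthesize artifact-free speech. In particular, Avocodo is even capable of reproducing high-quality waveforms of unseen speakers.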
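Understanding document images (e.g., invoices) is an important research topic with many applications in document processing automation. Following the latest advances in deep-learning-based optical character recognition (OCR), current visual document understanding (VDU) systems have come to be designed on top of OCR. Although such OCR-based approaches promise reasonable performance, they suffer from critical problems induced by the OCR, e.g., (1) expensive computational costs and (2) performance degradation due to OCR error propagation. In this paper, we propose a novel VDU model that is end-to-end trainable without an underlying OCR framework. To this end, we propose a new task and a synthetic document image generator to pre-train the model, mitigating the dependency on large-scale real document images. Our approach achieves state-of-the-art performance on various document understanding tasks in public benchmark datasets and private industrial service datasets. Through extensive experiments and analysis, we demonstrate the effectiveness of the proposed model, especially with real-world applications in mind.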
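Inspired by the "cognitive hour-glass" model presented in HTTPS://Doi.org/10.1515/Jagi-2016-0001, we propose a new framework for developing cognitive architectures aimed at cognitive robotics. The purpose of the proposed framework is to ease the development of cognitive architectures by encouraging and facilitating cooperation and the re-use of existing results. This is done by dividing the development of cognitive architectures into a series of layers that can be considered partly in isolation, some of which relate directly to other research fields. Finally, we introduce and review some topics relevant to the proposed framework.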
Figure 1: We introduce datasets for 3D tracking and motion forecasting with rich maps for autonomous driving. Our 3D tracking dataset contains sequences of LiDAR measurements, 360° RGB video, front-facing stereo (middle-right), and 6-DOF localization. All sequences are aligned with maps containing lane center lines (magenta), driveable region (orange), and ground height. Sequences are annotated with 3D cuboid tracks (green). A wider map view is shown in the bottom-right.