Surface cracks in buildings, natural rock walls and underground mine tunnels can indicate serious structural integrity problems that threaten the safety of both the structure and the people in the environment. Timely detection and monitoring of cracks are crucial to managing these risks, especially if the system can be made highly automated through robotics. Vision-based crack detection algorithms using deep neural networks have shown promise for structured surfaces such as walls or civil engineering tunnels, but little work has addressed highly unstructured environments such as rock cliffs and bare mining tunnels. To address this challenge, this paper presents PointCrack3D, a new crack detection algorithm for unstructured surfaces based on 3D point clouds. The method comprises three key components: an adaptive down-sampling method that maintains a sufficient density of crack points, a DNN that classifies each point as crack or non-crack, and a post-processing clustering method that groups crack points into individual crack instances. The method was validated experimentally on a new large natural-rock dataset comprising coloured LIDAR point clouds spanning more than 900 m^2 and 412 individual cracks. The results demonstrate a crack detection rate of 97% overall and of 100% for cracks with a maximum width of more than 3 cm, significantly outperforming the state of the art. Furthermore, for cross-validation, PointCrack3D was applied to a completely new dataset acquired at a different location and not used in training at all, and was shown to detect 100% of its crack instances. We also characterise the relationship between detection performance, crack width and the number of points per crack, providing a foundation for making decisions about both practical deployments and future research directions.
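The post-processing clustering stage can be illustrated with a minimal sketch: grouping predicted crack points into individual cracks by single-linkage over a distance threshold. The `radius` value and the greedy flood-fill are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def cluster_crack_points(points, radius=0.05):
    """Group predicted crack points (N x 3 array) into individual
    cracks: two points belong to the same crack if they are connected
    by a chain of points closer than `radius` (assumed threshold)."""
    n = len(points)
    labels = -np.ones(n, dtype=int)   # -1 means "not yet assigned"
    current = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        stack, labels[i] = [i], current
        while stack:                   # flood-fill one crack instance
            j = stack.pop()
            dists = np.linalg.norm(points - points[j], axis=1)
            for k in np.nonzero((dists < radius) & (labels == -1))[0]:
                labels[k] = current
                stack.append(k)
        current += 1
    return labels
```

Single-linkage keeps elongated cracks together even when their points form a thin chain rather than a compact blob.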
Convolutional Neural Networks (CNNs) have demonstrated superiority in learning patterns, but are sensitive to label noise and may overfit noisy labels during training. The early stopping strategy averts updating CNNs during the early training phase and is widely employed in the presence of noisy labels. Motivated by biological findings that the amplitude spectrum (AS) and phase spectrum (PS) in the frequency domain play different roles in animal vision systems, we observe that PS, which captures more semantic information, can increase the robustness of DNNs to label noise, more so than AS can. We thus propose early stops at different times for AS and PS by disentangling the features of some layer(s) into AS and PS using the Discrete Fourier Transform (DFT) during training. Our proposed Phase-AmplituDe DisentangLed Early Stopping (PADDLES) method is shown to be effective on both synthetic and real-world label-noise datasets. PADDLES outperforms other early stopping methods and obtains state-of-the-art performance.
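The disentangling step can be sketched as a plain DFT round trip: split a feature map into its amplitude and phase spectra and recombine them. This is only a minimal illustration of the AS/PS split using NumPy's FFT; the layer choice and the separate early-stopping schedules of PADDLES are not reproduced here.

```python
import numpy as np

def disentangle_amplitude_phase(features):
    """Split a 2-D feature map into its amplitude spectrum (AS) and
    phase spectrum (PS) via the discrete Fourier transform."""
    spectrum = np.fft.fft2(features)
    return np.abs(spectrum), np.angle(spectrum)

def recombine(amplitude, phase):
    """Rebuild the feature map from (possibly separately handled)
    AS and PS components; the round trip is lossless."""
    return np.fft.ifft2(amplitude * np.exp(1j * phase)).real
```

Because the two spectra can be recombined exactly, each can be frozen or updated on its own schedule, which is the property the early-stops-at-different-times idea relies on.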
We present a novel neural model for modern poetry generation in French. The model consists of two pretrained neural models that are fine-tuned for the poem generation task. The encoder of the model is RoBERTa-based while the decoder is based on GPT-2. This way the model can benefit from the superior natural language understanding performance of RoBERTa and the good natural language generation performance of GPT-2. Our evaluation shows that the model can create French poetry successfully. On a 5-point scale, the lowest score of 3.57 was given by human judges to the typicality and emotionality of the output poetry, while the best score of 3.79 was given to understandability.
Graph Neural Networks (GNNs) have been successfully applied in many applications in computer science. Despite the success of deep learning architectures in other domains, deep GNNs still underperform their shallow counterparts. There are many open questions about deep GNNs, but over-smoothing and over-squashing are perhaps the most intriguing issues. When stacking multiple graph convolutional layers, the over-smoothing and over-squashing problems arise; they have been defined as the inability of GNNs to learn deep representations and to propagate information from distant nodes, respectively. Even though the widespread definitions of both problems are similar, these phenomena have been studied independently. This work strives to understand the underlying relationship between over-smoothing and over-squashing from a topological perspective. We show that both problems are intrinsically related to the spectral gap of the Laplacian of the graph. Therefore, there is a trade-off between these two problems, i.e., we cannot simultaneously alleviate both over-smoothing and over-squashing. We also propose a Stochastic Jost and Liu curvature Rewiring (SJLR) algorithm based on a bound on the Ollivier-Ricci curvature. SJLR is less expensive than previous curvature-based rewiring methods while retaining fundamental properties. Finally, we perform a thorough comparison of SJLR with previous techniques to alleviate over-smoothing or over-squashing, seeking to gain a better understanding of both problems.
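The quantity the stated trade-off hinges on, the spectral gap of the graph Laplacian, is straightforward to compute. A minimal sketch follows; the SJLR rewiring algorithm itself is not reproduced here.

```python
import numpy as np

def spectral_gap(adjacency):
    """Second-smallest eigenvalue of the symmetric normalized Laplacian
    L = I - D^{-1/2} A D^{-1/2}. It is 0 exactly when the graph is
    disconnected and grows as the graph becomes better connected."""
    deg = adjacency.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    lap = np.eye(len(adjacency)) - d_inv_sqrt @ adjacency @ d_inv_sqrt
    return np.sort(np.linalg.eigvalsh(lap))[1]
```

Rewiring that adds edges to a sparse graph (e.g., a path) raises its gap toward that of the complete graph, easing over-squashing but, per the trade-off above, also promoting over-smoothing.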
We present a novel approach to generating news headlines in Finnish for a given news story. We model this as a summarization task where a model is given a news article and its task is to produce a concise headline describing the main topic of the article. Because there are no openly available GPT-2 models for Finnish, we first build such a model using several corpora. The model is then fine-tuned for the headline generation task using a massive news corpus. The system is evaluated by 3 expert journalists working in a Finnish media house. The results showcase the usability of the presented approach as a headline suggestion tool to facilitate the news production process.
We present a method for extracting a multilingual sentiment-annotated dialog data set from Fallout: New Vegas. The game developers have preannotated every line of dialog in the game with one of 8 different sentiments: \textit{anger, disgust, fear, happy, neutral, pained, sad} and \textit{surprised}. The game has been translated into English, Spanish, German, French and Italian. We conduct experiments on multilingual, multilabel sentiment analysis on the extracted data set using multilingual BERT, XLM-RoBERTa and language-specific BERT models. In our experiments, multilingual BERT outperformed XLM-RoBERTa for most of the languages; language-specific models, in turn, were slightly better than multilingual BERT for most of the languages. The best overall accuracy was 54\% and it was achieved by using multilingual BERT on Spanish data. The extracted data set presents a challenging task for sentiment analysis. We have released the data, including the testing and training splits, openly on Zenodo. The data set has been shuffled for copyright reasons.
Word order, an essential property of natural languages, is injected in Transformer-based neural language models using position encoding. However, recent experiments have shown that explicit position encoding is not always useful, since some models without such a feature managed to achieve state-of-the-art performance on some tasks. To better understand this phenomenon, we examine the effect of removing position encodings on the pre-training objective itself (i.e., masked language modelling), to test whether models can reconstruct position information from co-occurrences alone. We do so by controlling the amount of masked tokens in the input sentence, as a proxy to affect the importance of position information for the task. We find that the necessity of position information increases with the amount of masking, and that masked language models without position encodings are not able to reconstruct this information on the task. These findings point towards a direct relationship between the amount of masking and the ability of Transformers to capture order-sensitive aspects of language using position encoding.
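The masking control described above can be sketched as follows; the token granularity and mask symbol are illustrative assumptions, not details of the paper's setup.

```python
import random

def mask_tokens(tokens, mask_ratio, mask_token="[MASK]", seed=0):
    """Replace a controllable fraction of tokens with a mask symbol.
    Varying `mask_ratio` is the proxy for how much the masked-
    language-modelling task depends on position information: the more
    tokens are hidden, the fewer co-occurrence cues remain."""
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * mask_ratio))
    masked_idx = set(rng.sample(range(len(tokens)), n_mask))
    return [mask_token if i in masked_idx else t
            for i, t in enumerate(tokens)]
```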
Both humans and neural language models are able to perform subject-verb number agreement (SVA). In principle, semantics should not interfere with this task, which requires only syntactic knowledge. In this work, we test whether meaning interferes with English agreement in syntactic structures of varying complexity. To do so, we generate both semantically well-formed and nonsensical items. We compare the performance of BERT-base with that of humans, obtained through a psycholinguistic online crowdsourcing experiment. We find that both BERT and humans are sensitive to our semantic manipulation: they make errors more frequently when presented with nonsensical items, especially when the syntactic structure contains an attractor (a noun phrase between the subject and the verb whose number differs from that of the subject). We also find that the effect of meaningfulness on SVA errors is stronger for BERT than for humans, showing the former's higher lexical sensitivity on this task.
Sketch-based image retrieval (SBIR) is the task of retrieving natural images (photos) that match the semantics and spatial configuration of a hand-drawn sketch query. The universality of sketches broadens the range of possible applications and increases the demand for efficient SBIR solutions. In this paper, we study classic triplet-based SBIR solutions and show that a persistent invariance to horizontal flips (even after model fine-tuning) harms performance. To overcome this limitation, we propose several approaches and evaluate each of them in depth to assess its effectiveness. Our main contributions are twofold: we propose and evaluate several intuitive modifications to build SBIR solutions with better flip equivariance, and we show that Vision Transformers are better suited to the SBIR task, outperforming CNNs by a large margin. We conducted numerous experiments and introduce the first model to outperform human performance on a large-scale SBIR benchmark (Sketchy). Our best model achieves a recall of 62.25% (at k = 1) on the Sketchy benchmark, compared with 46.2% for the previous state-of-the-art method.
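The flip-invariance issue can be probed with a simple diagnostic: compare the embeddings of an image and its horizontal flip. `embed` here stands for any hypothetical image-to-vector model, not the paper's architecture.

```python
import numpy as np

def flip_sensitivity(embed, image):
    """Distance between the embeddings of an image and its horizontal
    flip. A flip-invariant model scores ~0, which the abstract argues
    is harmful for SBIR, since the spatial configuration of the sketch
    query should matter."""
    flipped = image[:, ::-1]           # horizontal flip
    return float(np.linalg.norm(embed(image) - embed(flipped)))
```

A global-pooling embedding is trivially flip-invariant, whereas a position-aware embedding is not, which is the distinction the proposed modifications aim to exploit.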
Categorical data are present in critical domains such as health care and supply chains, and such data require specific treatment. In order to apply state-of-the-art machine learning models to this kind of data, an encoding is needed. To build interpretable models, one-hot encoding remains a very good solution, but such an encoding produces sparse data. Common gradient estimators are not suited to sparse data: the gradient is mostly treated as zero, while it does not always exist, so a novel gradient estimator is introduced. We show what this estimator minimizes in theory and demonstrate its efficiency on different datasets with multiple model architectures. Under similar settings, this new estimator outperforms common estimators. A real-world retail dataset is also released after anonymization. Overall, the aim of this paper is to give categorical data thorough consideration and to adapt models to these critical features.
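One-hot encoding and the sparsity it induces can be illustrated directly. This sketch shows only the input structure; the paper's novel gradient estimator is not reproduced here.

```python
import numpy as np

def one_hot(values, categories):
    """One-hot encode a categorical column: each row contains exactly
    one 1, so most entries of the design matrix, and hence most
    input-gradient components, are exactly zero."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(values), len(categories)))
    for row, v in enumerate(values):
        out[row, index[v]] = 1.0
    return out
```

With k categories, each row is zero in k-1 of its k positions, which is the sparsity pattern that common gradient estimators handle poorly.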