Recurrent neural networks are capable of learning the dynamics of an unknown nonlinear system purely from input-output measurements. However, the resulting models do not provide any stability guarantees on the input-output mapping. In this work, we represent a recurrent neural network as a linear time-invariant system with nonlinear disturbances. By introducing constraints on the parameters, we can guarantee finite gain stability and incremental finite gain stability. We apply this identification method to learn the motion of a four-degrees-of-freedom ship that is moving in open water and compare it against other purely learning-based approaches with unconstrained parameters. Our analysis shows that the constrained recurrent neural network has a lower prediction accuracy on the test set, but it achieves comparable results on an out-of-distribution set and respects stability conditions.
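A minimal sketch of one way such a parameter constraint can be enforced (not necessarily the authors' exact formulation): for a vanilla RNN with a 1-Lipschitz activation such as tanh, keeping the spectral norm of the recurrence matrix below 1 is a sufficient condition for incremental stability, and it can be enforced by projection after each optimizer step.

```python
import numpy as np

def project_spectral_norm(W, bound=0.99):
    """Clip the singular values of W so that ||W||_2 <= bound < 1."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.minimum(s, bound)) @ Vt

def rnn_step(W, U_in, x, u):
    """One step of the recurrence x_{t+1} = tanh(W x_t + U u_t).
    With ||W||_2 < 1 and tanh 1-Lipschitz, the state map is a contraction."""
    return np.tanh(W @ x + U_in @ u)
```

The paper's LTI-plus-nonlinear-disturbance representation admits less conservative stability conditions than this simple norm bound, but the projection above illustrates the constrained-training idea.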
Knowledge graphs such as Wikidata comprise structural and textual knowledge to represent knowledge. Methods for each of the two modalities, graph embeddings and language models, can learn patterns that allow predicting novel structural knowledge. Few approaches combine learning and inference across both modalities, and the existing ones can only partially exploit the interaction of structural and textual knowledge. In our approach, we build on existing strong representations of the single modalities and use hypercomplex algebra to represent both (i) the single-modality embeddings and (ii) the interactions between the different modalities and their complementary means of knowledge representation. More specifically, we propose dihedron and quaternion representations over 4D hypercomplex numbers to integrate four modalities: structural knowledge graph embeddings, word-level representations (e.g., word2vec, fastText), sentence-level representations (sentence transformers), and document-level representations (doc2vec). Our unified vector representation scores the plausibility of labelled edges via Hamilton and dihedron products, thus modeling pairwise interactions between the different modalities. Extensive experimental evaluation on standard benchmark datasets shows the superiority of our two new models, which improve performance on link prediction tasks even when structural knowledge is sparse.
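To make the scoring concrete, here is a hedged sketch of quaternion-based plausibility scoring in the QuatE style, where the four quaternion components could carry the four modality embeddings; the variable names and layout are illustrative, not the paper's exact model.

```python
import numpy as np

def hamilton_product(q, p):
    """Hamilton product of quaternions q = (a, b, c, d); each component
    is an embedding vector, so all arithmetic is elementwise."""
    a1, b1, c1, d1 = q
    a2, b2, c2, d2 = p
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def plausibility(head, relation, tail):
    """Rotate the head by the relation, then take the inner product with
    the tail. With one modality per component (structure, word, sentence,
    document), the product mixes the modalities pairwise."""
    rotated = hamilton_product(head, relation)
    return sum(np.sum(r * t) for r, t in zip(rotated, tail))
```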
Recently, increasing efforts have been devoted to learning continuous representations of symbolic knowledge bases (KBs). However, these approaches either only embed data-level knowledge (ABox) or suffer from inherent limitations when dealing with concept-level knowledge (TBox), i.e., they cannot faithfully model the logical structure present in KBs. We present BoxEL, a geometric KB embedding approach that better captures the logical structure (i.e., ABox and TBox axioms) in the description logic EL++. BoxEL models concepts in a KB as axis-parallel boxes, which are well suited for modeling concept intersections, entities as points inside boxes, and relations between concepts/entities as affine transformations. We show theoretical guarantees (soundness) of BoxEL for preserving logical structure; namely, a box embedding model with loss 0 is a (logical) model of the KB. Experimental results on (plausible) subsumption reasoning and a real-world application to protein-protein prediction show that BoxEL outperforms traditional knowledge graph embedding methods as well as state-of-the-art EL++ embedding approaches.
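A minimal sketch of the geometric intuition (simplified penalties, not BoxEL's exact losses): ABox membership becomes a point-in-box check, and a TBox subsumption C ⊑ D becomes box containment.

```python
import numpy as np

def membership_loss(point, box_lo, box_hi):
    """Penalty for an entity point lying outside a concept box (ABox)."""
    return (np.maximum(box_lo - point, 0).sum()
            + np.maximum(point - box_hi, 0).sum())

def subsumption_loss(c_lo, c_hi, d_lo, d_hi):
    """Penalty for box C not being contained in box D, i.e. C subsumed
    by D (TBox)."""
    return (np.maximum(d_lo - c_lo, 0).sum()
            + np.maximum(c_hi - d_hi, 0).sum())
```

Both penalties vanish exactly when the corresponding axiom holds geometrically, which is the intuition behind the soundness claim that a zero-loss embedding is a logical model of the KB.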
Wikidata is the largest publicly available general-interest knowledge base. Since its inception in 2012, it has been collaboratively edited by thousands of volunteers. In this paper, we present Wikidated 1.0, a dataset of Wikidata's full revision history, which encodes changes between Wikidata revisions as sets of deletions and additions of RDF triples. To the best of our knowledge, it constitutes the first large-scale dataset of an evolving knowledge graph, a research topic that has recently emerged in the Semantic Web community. We describe our methodology for generating Wikidated 1.0 from dumps of Wikidata, discuss its implementation and limitations, and present statistical characteristics of the dataset.
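In code, a revision in such a dataset can be thought of as a pair of triple sets; this is an assumed layout for illustration, not Wikidated 1.0's actual serialization format.

```python
from dataclasses import dataclass, field

Triple = tuple  # (subject, predicate, object)

@dataclass
class Revision:
    """One revision expressed as triple-level deltas (illustrative schema)."""
    revision_id: int
    deleted: set = field(default_factory=set)
    added: set = field(default_factory=set)

def apply_revision(graph: set, rev: Revision) -> set:
    """Advance the evolving knowledge graph by one revision."""
    return (graph - rev.deleted) | rev.added
```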
Graph convolutional networks (GCNs) are powerful frameworks for learning embeddings of graph-structured data. GCNs are traditionally studied through the lens of Euclidean geometry. Recent works find that non-Euclidean Riemannian manifolds provide specific inductive biases for embedding hierarchical or spherical data. However, they cannot align well with data of mixed graph topologies. We consider a larger class of pseudo-Riemannian manifolds that generalize hyperboloid and sphere. We develop new geodesic tools that allow for extending neural network operations into geodesically disconnected pseudo-Riemannian manifolds. As a consequence, we derive a pseudo-Riemannian GCN that models data in pseudo-Riemannian manifolds of constant nonzero curvature in the context of graph neural networks. Our method provides a geometric inductive bias that is sufficiently flexible to model mixed heterogeneous topologies like hierarchical graphs with cycles. We demonstrate the representational capabilities of this method by applying it to the tasks of graph reconstruction, node classification and link prediction on a series of standard graphs with mixed topologies. Empirical results demonstrate that our method outperforms Riemannian counterparts when embedding graphs of complex topologies.
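For intuition, the spaces in question are pseudo-hyperboloids: level sets of an indefinite inner product whose signature interpolates between the hyperboloid and the sphere. A small sketch under one common sign convention (conventions vary across papers):

```python
import numpy as np

def pseudo_inner(x, y, t):
    """Indefinite inner product with t negative (time) dimensions:
    <x, y>_t = -sum_{i<t} x_i y_i + sum_{i>=t} x_i y_i."""
    return -np.dot(x[:t], y[:t]) + np.dot(x[t:], y[t:])

def on_pseudo_hyperboloid(x, t, beta, tol=1e-6):
    """Check membership in Q = {x : <x, x>_t = beta}; t=1 with beta < 0
    recovers the hyperboloid, t=0 with beta > 0 the sphere."""
    return abs(pseudo_inner(x, x, t) - beta) < tol
```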
Physical motion models offer interpretable predictions of vehicle motion. However, some model parameters, such as those related to aero- and hydrodynamics, are expensive to measure and are often only roughly approximated, reducing prediction accuracy. Recurrent neural networks achieve high prediction accuracy at low cost, since they can use cheap measurements collected during routine operation of the vehicle, but their results are hard to interpret. To precisely predict vehicle states without expensive measurements of physical parameters, we propose a hybrid approach combining deep learning and physical motion models, including a novel two-phase training procedure. We achieve interpretability by restricting the output range of the deep neural network that forms part of the hybrid model, which limits the uncertainty introduced by the neural network to a known quantity. We have evaluated our approach on use cases of ship and four-wheeled vehicle motion. The results show that, compared to existing deep learning approaches, our hybrid model improves interpretability with no decrease in accuracy.
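A minimal sketch of the output-range restriction (the function handles are hypothetical, not the paper's exact architecture): the neural network contributes only a bounded residual on top of the physical model, so the learned part can never push the prediction more than a known amount away from the physics.

```python
import numpy as np

def hybrid_step(x, u, physics_step, nn_residual, bound):
    """One prediction step of a hybrid model: the physics prediction plus
    a neural correction squashed into [-bound, bound] per state dimension."""
    return physics_step(x, u) + bound * np.tanh(nn_residual(x, u))
```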
Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed-integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
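The winning method's joint-optimization idea can be sketched as sample average approximation: instead of optimizing against a single point forecast, sample several forecast scenarios and minimize the average cost over all of them. The names below are illustrative.

```python
import numpy as np

def saa_schedule(candidate_schedules, scenarios, cost):
    """Sample average approximation: pick the schedule minimizing the cost
    averaged over sampled forecast scenarios (e.g., demand/price paths)."""
    return min(candidate_schedules,
               key=lambda s: np.mean([cost(s, w) for w in scenarios]))
```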
We consider the end-to-end abstract-to-title generation problem, exploring seven recent transformer-based models (including ChatGPT) fine-tuned on more than 30k abstract-title pairs from NLP and machine learning venues. As an extension, we also consider the harder problem of generating humorous paper titles. For the latter, we compile the first large-scale humor-annotated dataset for scientific papers in the NLP/ML domains, comprising almost 2.5k titles. We evaluate all models using human and automatic metrics. Our human evaluation suggests that our best end-to-end system performs similarly to human authors (but arguably slightly worse). Generating funny titles is more difficult, however, and our automatic systems clearly underperform relative to humans and often learn dataset artefacts of humor. Finally, ChatGPT, without any fine-tuning, performs on the level of our best fine-tuned system.
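For reference, generating a title from an abstract with a fine-tuned seq2seq transformer might look like the following; the checkpoint name and prompt prefix are placeholders, not the paper's released models.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder checkpoint; the paper fine-tunes several transformer models.
tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

abstract = "We consider the end-to-end abstract-to-title generation problem ..."
inputs = tok("summarize: " + abstract, return_tensors="pt", truncation=True)
out = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tok.decode(out[0], skip_special_tokens=True))
```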
State-of-the-art poetry generation systems are often complex. They either consist of task-specific model pipelines, incorporate prior knowledge in the form of manually created constraints, or both. In contrast, end-to-end models would not suffer from the overhead of having to model prior knowledge and could learn the nuances of poetry from data alone, reducing the degree of human supervision required. In this work, we investigate end-to-end poetry generation conditioned on styles such as rhyme, meter, and alliteration. We identify and address a lack of training data and mismatched tokenization algorithms as possible limitations of past attempts. In particular, we successfully pre-train and release ByGPT5, a new token-free decoder-only language model, and fine-tune it on a large custom corpus of English and German quatrains annotated with our styles. We show that ByGPT5 outperforms other models such as mT5, ByT5, GPT-2 and ChatGPT, while also being more parameter efficient and performing favorably compared to humans. In addition, we analyze its runtime performance and introspect the model's understanding of style conditions. We make our code, models, and datasets publicly available.
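"Token-free" here means the model reads raw bytes, so no subword vocabulary can split apart character-level patterns like rhyme. A minimal illustration (the conditioning format is an assumption, not ByGPT5's actual one):

```python
def byte_ids(text: str) -> list:
    """Token-free encoding: the input is just its UTF-8 byte sequence."""
    return list(text.encode("utf-8"))

# Style conditions could be prepended as plain text before the quatrain.
prompt = "rhyme: ABAB; meter: iambic\n"
ids = byte_ids(prompt + "Roses are red,\n")
assert bytes(ids).decode("utf-8").startswith("rhyme:")
```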
State-of-the-art machine translation evaluation metrics are based on black-box language models. Hence, recent works consider their explainability with the goals of better understandability for humans and better metric analysis, including failure cases. In contrast, we explicitly leverage explanations to boost the metrics' performance. In particular, we perceive explanations as word-level scores, which we convert, via power means, into sentence-level scores. We combine this sentence-level score with the original metric to obtain a better metric. Our extensive evaluation and analysis across 5 datasets, 5 metrics and 4 explainability techniques show that some configurations reliably improve the original metrics' correlation with human judgment. On two held-out datasets for testing, we obtain improvements in 15/18 and 4/4 cases, respectively. The gains in Pearson correlation are up to 0.032 and 0.055, respectively. We make our code available.
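The aggregation step is the generalized (power) mean M_p(x) = ((1/n) Σ x_i^p)^(1/p), which sweeps from the minimum (p → -∞) through the arithmetic mean (p = 1) to the maximum (p → ∞). A small sketch, assuming non-negative word-level scores:

```python
import numpy as np

def power_mean(scores, p):
    """Generalized mean of non-negative word-level scores; p=0 (the
    geometric-mean limit) is not handled here."""
    s = np.asarray(scores, dtype=float)
    if np.isposinf(p):
        return s.max()
    if np.isneginf(p):
        return s.min()
    return np.mean(s ** p) ** (1.0 / p)

# Word-level explanation scores -> one sentence-level score:
sentence_score = power_mean([0.9, 0.4, 0.7], p=2.0)
```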