智能论文笔记

Stochastic Functional Analysis and Multilevel Vector Field Anomaly Detection

Julio E Castrillon-Candas , Mark Kon

分类： (统计)机器学习 | 机器学习

2022-07-11

大量矢量场数据集在多光谱光学传感器和雷达传感器以及现代多模式MRI数据中很常见，以及许多其他应用领域。在本文中，我们开发了一种新型的随机功能分析方法，用于基于具有多波段矢量磁场数据的域名域名随机行为的协方差结构来检测异常。最佳矢量场karhunen-loeve（KL）扩展应用于此类随机字段数据。一系列多级正交功能子空间是根据域的几何形状构建的，该域的几何形状是根据KL扩展的。通过在多级基础上检查随机场的投影来实现检测。可以根据本地和全球信息在适当的规范空间中量化异常。此外，可靠的假设检验是由可控的分布形成的，这些分布不需要事先对数据的概率分布的假设。仅需要协方差函数，这可以显着简单地估计。此外，这种方法允许基于随机向量的异常融合而不会丢失任何信息。该方法应用于亚马逊森林中森林砍伐和退化的重要问题。这是一个复杂的非单调过程，因为森林可以降解和恢复。通过当前掩盖算法难以消除的云的存在进一步使这个特殊的问题更加复杂。使用Sentinel 2的多光谱卫星数据，构造了多级过滤器，并将异常视为与森林初始状态的偏差。森林异常通过可靠的假设检验量化，并与诸如云覆盖之类的错误变化区分开。我们的方法显示了在矢量化的复合物中使用多个数据频段的优点，从而超过了基于标量的方法的能力，从而使更好的异常检测。

translated by 谷歌翻译

Unifying Model-Based and Neural Network Feedforward: Physics-Guided Neural Networks with Linear Autoregressive Dynamics

Johan Kon , Dennis Bruijnen , Jeroen van de Wijdeven , Marcel Heertjes , Tom Oomen

分类：机器学习

2022-09-26

未知的非线性动力学通常会限制前馈控制的跟踪性能。本文的目的是开发一个可以使用通用函数近似器来补偿这些未知非线性动力学的前馈控制框架。前馈控制器被参数化为基于物理模型和神经网络的平行组合，在该组合中，两者都共享相同的线性自回旋（AR）动力学。该参数化允许通过Sanathanan-Koerner（SK）迭代进行有效的输出误差优化。在每个Sk-itteration中，神经网络的输出在基于物理模型的子空间中通过基于正交投影的正则化受到惩罚，从而使神经网络仅捕获未建模的动力学，从而产生可解释的模型。

translated by 谷歌翻译

Using Active Learning Methods to Strategically Select Essays for Automated Scoring

Tahereh Firoozi , Hamid Mohammadi , Mark J. Gierl

分类：自然语言处理

2023-01-02

Research on automated essay scoring has become increasing important because it serves as a method for evaluating students' written-responses at scale. Scalable methods for scoring written responses are needed as students migrate to online learning environments resulting in the need to evaluate large numbers of written-response assessments. The purpose of this study is to describe and evaluate three active learning methods than can be used to minimize the number of essays that must be scored by human raters while still providing the data needed to train a modern automated essay scoring system. The three active learning methods are the uncertainty-based, the topological-based, and the hybrid method. These three methods were used to select essays included as part of the Automated Student Assessment Prize competition that were then classified using a scoring model that was training with the bidirectional encoder representations from transformer language model. All three active learning methods produced strong results, with the topological-based method producing the most efficient classification. Growth rate accuracy was also evaluated. The active learning methods produced different levels of efficiency under different sample size allocations but, overall, all three methods were highly efficient and produced classifications that were similar to one another.

translated by 谷歌翻译

Planning Paths through Occlusions in Urban Environments

Yutao Han , Youya Xia , Guo-Jun Qi , Mark Campbell

分类：机器人

2022-12-29

This paper presents a novel framework for planning in unknown and occluded urban spaces. We specifically focus on turns and intersections where occlusions significantly impact navigability. Our approach uses an inpainting model to fill in a sparse, occluded, semantic lidar point cloud and plans dynamically feasible paths for a vehicle to traverse through the open and inpainted spaces. We demonstrate our approach using a car's lidar data with real-time occlusions, and show that by inpainting occluded areas, we can plan longer paths, with more turn options compared to without inpainting; in addition, our approach more closely follows paths derived from a planner with no occlusions (called the ground truth) compared to other state of the art approaches.

translated by 谷歌翻译

Feature Acquisition using Monte Carlo Tree Search

Sungsoo Lim , Diego Klabjan , Mark Shapiro

分类：机器学习

2022-12-21

Feature acquisition algorithms address the problem of acquiring informative features while balancing the costs of acquisition to improve the learning performances of ML models. Previous approaches have focused on calculating the expected utility values of features to determine the acquisition sequences. Other approaches formulated the problem as a Markov Decision Process (MDP) and applied reinforcement learning based algorithms. In comparison to previous approaches, we focus on 1) formulating the feature acquisition problem as a MDP and applying Monte Carlo Tree Search, 2) calculating the intermediary rewards for each acquisition step based on model improvements and acquisition costs and 3) simultaneously optimizing model improvement and acquisition costs with multi-objective Monte Carlo Tree Search. With Proximal Policy Optimization and Deep Q-Network algorithms as benchmark, we show the effectiveness of our proposed approach with experimental study.

translated by 谷歌翻译

Universal versus system-specific features of punctuation usage patterns in~major Western~languages

Tomasz Stanisz , Stanislaw Drozdz , Jaroslaw Kwapien

分类：自然语言处理

2022-12-21

The celebrated proverb that "speech is silver, silence is golden" has a long multinational history and multiple specific meanings. In written texts punctuation can in fact be considered one of its manifestations. Indeed, the virtue of effectively speaking and writing involves - often decisively - the capacity to apply the properly placed breaks. In the present study, based on a large corpus of world-famous and representative literary texts in seven major Western languages, it is shown that the distribution of intervals between consecutive punctuation marks in almost all texts can universally be characterised by only two parameters of the discrete Weibull distribution which can be given an intuitive interpretation in terms of the so-called hazard function. The values of these two parameters tend to be language-specific, however, and even appear to navigate translations. The properties of the computed hazard functions indicate that among the studied languages, English turns out to be the least constrained by the necessity to place a consecutive punctuation mark to partition a sequence of words. This may suggest that when compared to other studied languages, English is more flexible, in the sense of allowing longer uninterrupted sequences of words. Spanish reveals similar tendency to only a bit lesser extent.

translated by 谷歌翻译

The Third International Verification of Neural Networks Competition (VNN-COMP 2022): Summary and Results

Mark Niklas Müller , Christopher Brix , Stanley Bak , Changliu Liu , Taylor T. Johnson

分类：机器学习 | 人工智能

2022-12-20

This report summarizes the 3rd International Verification of Neural Networks Competition (VNN-COMP 2022), held as a part of the 5th Workshop on Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), which was collocated with the 34th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specification (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2022 iteration, 11 teams participated on a diverse set of 12 scored benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of this competition.

translated by 谷歌翻译

Extrinsic Evaluation of Machine Translation Metrics

Nikita Moghe , Tom Sherborne , Mark Steedman , Alexandra Birch

分类：自然语言处理 | 人工智能

2022-12-20

Automatic machine translation (MT) metrics are widely used to distinguish the translation qualities of machine translation systems across relatively large test sets (system-level evaluation). However, it is unclear if automatic metrics are reliable at distinguishing good translations from bad translations at the sentence level (segment-level evaluation). In this paper, we investigate how useful MT metrics are at detecting the success of a machine translation component when placed in a larger platform with a downstream task. We evaluate the segment-level performance of the most widely used MT metrics (chrF, COMET, BERTScore, etc.) on three downstream cross-lingual tasks (dialogue state tracking, question answering, and semantic parsing). For each task, we only have access to a monolingual task-specific model. We calculate the correlation between the metric's ability to predict a good/bad translation with the success/failure on the final task for the Translate-Test setup. Our experiments demonstrate that all metrics exhibit negligible correlation with the extrinsic evaluation of the downstream outcomes. We also find that the scores provided by neural metrics are not interpretable mostly because of undefined ranges. Our analysis suggests that future MT metrics be designed to produce error labels rather than scores to facilitate extrinsic evaluation.

translated by 谷歌翻译

Eff-3DPSeg: 3D organ-level plant shoot segmentation using annotation-efficient point clouds

Liyi Luo , Xintong Jiang , Yu Yang , Eugene Roy Antony Samy , Mark Lefsrud , Valerio Hoyos-Villegas , Shangpeng Sun

分类：计算机视觉 | 人工智能

2022-12-20

Reliable and automated 3D plant shoot segmentation is a core prerequisite for the extraction of plant phenotypic traits at the organ level. Combining deep learning and point clouds can provide effective ways to address the challenge. However, fully supervised deep learning methods require datasets to be point-wise annotated, which is extremely expensive and time-consuming. In our work, we proposed a novel weakly supervised framework, Eff-3DPSeg, for 3D plant shoot segmentation. First, high-resolution point clouds of soybean were reconstructed using a low-cost photogrammetry system, and the Meshlab-based Plant Annotator was developed for plant point cloud annotation. Second, a weakly-supervised deep learning method was proposed for plant organ segmentation. The method contained: (1) Pretraining a self-supervised network using Viewpoint Bottleneck loss to learn meaningful intrinsic structure representation from the raw point clouds; (2) Fine-tuning the pre-trained model with about only 0.5% points being annotated to implement plant organ segmentation. After, three phenotypic traits (stem diameter, leaf width, and leaf length) were extracted. To test the generality of the proposed method, the public dataset Pheno4D was included in this study. Experimental results showed that the weakly-supervised network obtained similar segmentation performance compared with the fully-supervised setting. Our method achieved 95.1%, 96.6%, 95.8% and 92.2% in the Precision, Recall, F1-score, and mIoU for stem leaf segmentation and 53%, 62.8% and 70.3% in the AP, AP@25, and AP@50 for leaf instance segmentation. This study provides an effective way for characterizing 3D plant architecture, which will become useful for plant breeders to enhance selection processes.

translated by 谷歌翻译

True Detective: A Challenging Benchmark for Deep Abductive Reasoning \\in Foundation Models

Maksym Del , Mark Fishel

分类：自然语言处理

2022-12-20

Large language models (LLMs) have demonstrated strong performance in zero-shot reasoning tasks, including abductive reasoning. This is reflected in their ability to perform well on current benchmarks in this area. However, to truly test the limits of LLMs in abductive reasoning, a more challenging benchmark is needed. In this paper, we present such a benchmark, consisting of 191 long-form mystery stories, each approximately 1200 words in length and presented in the form of detective puzzles. Each puzzle includes a multiple-choice question for evaluation sourced from the "5 Minute Mystery" platform. Our results show that state-of-the-art GPT models perform significantly worse than human solvers on this benchmark, with an accuracy of 28\% compared to 47\% for humans. This indicates that there is still a significant gap in the abductive reasoning abilities of LLMs and highlights the need for further research in this area. Our work provides a challenging benchmark for future studies on reasoning in language models and contributes to a better understanding of the limits of LLMs' abilities.

translated by 谷歌翻译