智能论文笔记

Machine Learning Product State Distributions from Initial Reactant States for a Reactive Atom-Diatom Collision System

Julian Arnold , Juan Carlos San Vicente Veliz , Debasish Koner , Narendra Singh , Raymond J. Bemish , Markus Meuwly

分类：机器学习

2021-11-05

为N（$ ^ 4 $ s）+ o $ _呈现和定量测试了一种用于预测来自特定初始状态（状态为分布或STD）的产品状态分布的机器学习（ML）模型。 {2} $（x $ ^ 3 \ sigma _ {\ rm g} ^ { - } $）$ \ lightarrow $ no（x $ ^ 2 \ pi $）+ o（$ ^ 3 $ p）反应。用于训练神经网络（NN）的参考数据集由用于$ \ SIM 2000 $初始条件的显式准古典轨迹（QCT）模拟确定的最终状态分布。总体而言，通过根均方平方差价量化的预测精度$（\ SIM 0.003）$和$ r ^ 2 $ $（\ SIM 0.99）$之间的参考QCT和STD模型的预测很高测试集和离网状态特定的初始条件和从反应性状态分布中汲取的初始条件，其特征在于通过平移，旋转和振动温度。与在相同的初始状态分布上评估的更粗糙的粒度分布 - 分布（DTD）模型相比，STD模型表明了在反应物制剂中的状态分辨率的额外益处具有相当的性能。从特定的初始状态开始，还导致更多样化的最终状态分布，需要更具表现力的神经网络与DTD相比。显式QCT模拟之间的直接比较，STD模型和广泛使用的Larsen-Borgnakke（LB）模型表明，STD模型是定量的，而LB模型最适合旋转分布$ P（J'）$和失败振动分布$ p（v'）$。因此，STD模型可以非常适合模拟非预测高速流，例如，使用直接仿真蒙特卡罗方法。

translated by 谷歌翻译

Periocular Biometrics: A Modality for Unconstrained Scenarios

Fernando Alonso-Fernandez , Josef Bigun , Julian Fierrez , Naser Damer , Hugo Proença , Arun Ross

分类：计算机视觉

2022-12-28

Periocular refers to the region of the face that surrounds the eye socket. This is a feature-rich area that can be used by itself to determine the identity of an individual. It is especially useful when the iris or the face cannot be reliably acquired. This can be the case of unconstrained or uncooperative scenarios, where the face may appear partially occluded, or the subject-to-camera distance may be high. However, it has received revived attention during the pandemic due to masked faces, leaving the ocular region as the only visible facial area, even in controlled scenarios. This paper discusses the state-of-the-art of periocular biometrics, giving an overall framework of its most significant research aspects.

translated by 谷歌翻译

MyI-Net: Fully Automatic Detection and Quantification of Myocardial Infarction from Cardiovascular MRI Images

Shuihua Wang , Ahmed M. S. E. K Abdelaty , Kelly Parke , J Ranjit Arnold , Gerry P McCann , Ivan Y Tyukin

分类：计算机视觉 | 机器学习

2022-12-28

A "heart attack" or myocardial infarction (MI), occurs when an artery supplying blood to the heart is abruptly occluded. The "gold standard" method for imaging MI is Cardiovascular Magnetic Resonance Imaging (MRI), with intravenously administered gadolinium-based contrast (late gadolinium enhancement). However, no "gold standard" fully automated method for the quantification of MI exists. In this work, we propose an end-to-end fully automatic system (MyI-Net) for the detection and quantification of MI in MRI images. This has the potential to reduce the uncertainty due to the technical variability across labs and inherent problems of the data and labels. Our system consists of four processing stages designed to maintain the flow of information across scales. First, features from raw MRI images are generated using feature extractors built on ResNet and MoblieNet architectures. This is followed by the Atrous Spatial Pyramid Pooling (ASPP) to produce spatial information at different scales to preserve more image context. High-level features from ASPP and initial low-level features are concatenated at the third stage and then passed to the fourth stage where spatial information is recovered via up-sampling to produce final image segmentation output into: i) background, ii) heart muscle, iii) blood and iv) scar areas. New models were compared with state-of-art models and manual quantification. Our models showed favorable performance in global segmentation and scar tissue detection relative to state-of-the-art work, including a four-fold better performance in matching scar pixels to contours produced by clinicians.

translated by 谷歌翻译

Push-the-Boundary: Boundary-aware Feature Propagation for Semantic Segmentation of 3D Point Clouds

Shenglan Du , Nail Ibrahimli , Jantien Stoter , Julian Kooij , Liangliang Nan

分类：计算机视觉

2022-12-23

Feedforward fully convolutional neural networks currently dominate in semantic segmentation of 3D point clouds. Despite their great success, they suffer from the loss of local information at low-level layers, posing significant challenges to accurate scene segmentation and precise object boundary delineation. Prior works either address this issue by post-processing or jointly learn object boundaries to implicitly improve feature encoding of the networks. These approaches often require additional modules which are difficult to integrate into the original architecture. To improve the segmentation near object boundaries, we propose a boundary-aware feature propagation mechanism. This mechanism is achieved by exploiting a multi-task learning framework that aims to explicitly guide the boundaries to their original locations. With one shared encoder, our network outputs (i) boundary localization, (ii) prediction of directions pointing to the object's interior, and (iii) semantic segmentation, in three parallel streams. The predicted boundaries and directions are fused to propagate the learned features to refine the segmentation. We conduct extensive experiments on the S3DIS and SensatUrban datasets against various baseline methods, demonstrating that our proposed approach yields consistent improvements by reducing boundary errors. Our code is available at https://github.com/shenglandu/PushBoundary.

translated by 谷歌翻译

hxtorch.snn: Machine-learning-inspired Spiking Neural Network Modeling on BrainScaleS-2

Philipp Spilger , Elias Arnold , Luca Blessing , Christian Mauch , Christian Pehle , Eric Müller , Johannes Schemmel

分类：神经与进化计算

2022-12-23

Neuromorphic systems require user-friendly software to support the design and optimization of experiments. In this work, we address this need by presenting our development of a machine learning-based modeling framework for the BrainScaleS-2 neuromorphic system. This work represents an improvement over previous efforts, which either focused on the matrix-multiplication mode of BrainScaleS-2 or lacked full automation. Our framework, called hxtorch.snn, enables the hardware-in-the-loop training of spiking neural networks within PyTorch, including support for auto differentiation in a fully-automated hardware experiment workflow. In addition, hxtorch.snn facilitates seamless transitions between emulating on hardware and simulating in software. We demonstrate the capabilities of hxtorch.snn on a classification task using the Yin-Yang dataset employing a gradient-based approach with surrogate gradients and densely sampled membrane observations from the BrainScaleS-2 hardware system.

translated by 谷歌翻译

A C++ Implementation of a Cartesian Impedance Controller for Robotic Manipulators

Matthias Mayr , Julian M. Salt-Ducaju

分类：机器人

2022-12-21

Cartesian impedance control is a type of motion control strategy for robots that improves safety in partially unknown environments by achieving a compliant behavior of the robot with respect to its external forces. This compliant robot behavior has the added benefit of allowing physical human guidance of the robot. In this paper, we propose a C++ implementation of compliance control valid for any torque-commanded robotic manipulator. The proposed controller implements Cartesian impedance control to track a desired end-effector pose. Additionally, joint impedance is projected in the nullspace of the Cartesian robot motion to track a desired robot joint configuration without perturbing the Cartesian motion of the robot. The proposed implementation also allows the robot to apply desired forces and torques to its environment. Several safety features such as filtering, rate limiting, and saturation are included in the proposed implementation. The core functionalities are in a re-usable base library and a Robot Operating System (ROS) ros_control integration is provided on top of that. The implementation was tested with the KUKA LBR iiwa robot and the Franka Emika Robot (Panda) both in simulation and with the physical robots.

translated by 谷歌翻译

Comparison and Evaluation of Methods for a Predict+Optimize Problem in Renewable Energy

Christoph Bergmeir , Frits de Nijs , Abishek Sriramulu , Mahdi Abolghasemi , Richard Bean , John Betts , Quang Bui , Nam Trong Dinh , Nils Einecke , Rasul Esmaeilbeigi

分类：人工智能

2022-12-21

Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively few research has been done in this area. This paper presents the findings of the ``IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim to foster and facilitate research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that lead to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.

translated by 谷歌翻译

DePlot: One-shot visual language reasoning by plot-to-table translation

Fangyu Liu , Julian Martin Eisenschlos , Francesco Piccinno , Syrine Krichene , Chenxi Pang , Kenton Lee , Mandar Joshi , Wenhu Chen , Nigel Collier , Yasemin Altun

分类：自然语言处理 | 人工智能 | 计算机视觉

2022-12-20

Visual language such as charts and plots is ubiquitous in the human world. Comprehending plots and charts requires strong reasoning skills. Prior state-of-the-art (SOTA) models require at least tens of thousands of training examples and their reasoning capabilities are still much limited, especially on complex human-written queries. This paper presents the first one-shot solution to visual language reasoning. We decompose the challenge of visual language reasoning into two steps: (1) plot-to-text translation, and (2) reasoning over the translated text. The key in this method is a modality conversion module, named as DePlot, which translates the image of a plot or chart to a linearized table. The output of DePlot can then be directly used to prompt a pretrained large language model (LLM), exploiting the few-shot reasoning capabilities of LLMs. To obtain DePlot, we standardize the plot-to-table task by establishing unified task formats and metrics, and train DePlot end-to-end on this task. DePlot can then be used off-the-shelf together with LLMs in a plug-and-play fashion. Compared with a SOTA model finetuned on more than >28k data points, DePlot+LLM with just one-shot prompting achieves a 24.0% improvement over finetuned SOTA on human-written queries from the task of chart QA.

translated by 谷歌翻译

Synthetic Pre-Training Tasks for Neural Machine Translation

Zexue He , Graeme Blackwood , Rameswar Panda , Julian McAuley , Rogerio Feris

分类：自然语言处理 | 人工智能

2022-12-19

Pre-training is an effective technique for ensuring robust performance on a variety of machine learning tasks. It typically depends on large-scale crawled corpora that can result in toxic or biased models. Such data can also be problematic with respect to copyright, attribution, and privacy. Pre-training with synthetic tasks and data is a promising way of alleviating such concerns since no real-world information is ingested by the model. Our goal in this paper is to understand what makes for a good pre-trained model when using synthetic resources. We answer this question in the context of neural machine translation by considering two novel approaches to translation model pre-training. Our first approach studies the effect of pre-training on obfuscated data derived from a parallel corpus by mapping words to a vocabulary of 'nonsense' tokens. Our second approach explores the effect of pre-training on procedurally generated synthetic parallel data that does not depend on any real human language corpus. Our empirical evaluation on multiple language pairs shows that, to a surprising degree, the benefits of pre-training can be realized even with obfuscated or purely synthetic parallel data. In our analysis, we consider the extent to which obfuscated and synthetic pre-training techniques can be used to mitigate the issue of hallucinated model toxicity.

translated by 谷歌翻译

MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering

Fangyu Liu , Francesco Piccinno , Syrine Krichene , Chenxi Pang , Kenton Lee , Mandar Joshi , Yasemin Altun , Nigel Collier , Julian Martin Eisenschlos

分类：自然语言处理 | 人工智能 | 计算机视觉

2022-12-19

Visual language data such as plots, charts, and infographics are ubiquitous in the human world. However, state-of-the-art vision-language models do not perform well on these data. We propose MatCha (Math reasoning and Chart derendering pretraining) to enhance visual language models' capabilities in jointly modeling charts/plots and language data. Specifically, we propose several pretraining tasks that cover plot deconstruction and numerical reasoning which are the key capabilities in visual language modeling. We perform the MatCha pretraining starting from Pix2Struct, a recently proposed image-to-text visual language model. On standard benchmarks such as PlotQA and ChartQA, the MatCha model outperforms state-of-the-art methods by as much as nearly 20%. We also examine how well MatCha pretraining transfers to domains such as screenshots, textbook diagrams, and document figures and observe overall improvement, verifying the usefulness of MatCha pretraining on broader visual language tasks.

translated by 谷歌翻译