Smart Paper Notes

Predicting Osteoarthritis Progression in Radiographs via Unsupervised Representation Learning

Tianyu Han , Jakob Nikolas Kather , Federico Pedersoli , Markus Zimmermann , Sebastian Keil , Maximilian Schulze-Hagen , Marc Terwoelbeck , Peter Isfort , Christoph Haarburger , Fabian Kiessling

Categories: Computer Vision | Machine Learning

2021-11-22

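Osteoarthritis (OA) is the most common joint disorder, affecting a substantial proportion of the global population, primarily the elderly. Despite its individual and socioeconomic burden, the onset and progression of OA still cannot be reliably predicted. Aiming to fill this diagnostic gap, we introduce an unsupervised learning scheme based on generative models to predict the future development of OA from knee joint radiographs. Using longitudinal data from osteoarthritis studies, we explore the latent temporal trajectory to predict a patient's future radiographs up to the eight-year follow-up visit. Our model predicts the risk of progression towards OA and surpasses its supervised counterpart, whose input was provided by seven experienced radiologists. With the support of the model, sensitivity, specificity, positive predictive value, and negative predictive value increased significantly from 42.1% to 51.6%, from 72.3% to 88.6%, from 28.4% to 57.6%, and from 83.9% to 88.4%, respectively, whereas without such support radiologists performed only slightly better than random guessing. Despite requiring no human annotation in the training phase, our predictive model improves predictions of OA onset and progression.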
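
The four diagnostic metrics reported in the abstract above all derive from a binary confusion matrix. The following is a minimal sketch of those definitions, not the authors' evaluation code; the function name `diagnostic_metrics` and the counts in the usage line are hypothetical and chosen only for illustration.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp/fn: progressors predicted as progressing / not progressing
    tn/fp: non-progressors predicted as not progressing / progressing
    """
    return {
        # Fraction of true progressors that were flagged.
        "sensitivity": tp / (tp + fn),
        # Fraction of true non-progressors that were cleared.
        "specificity": tn / (tn + fp),
        # Fraction of positive calls that were correct.
        "ppv": tp / (tp + fp),
        # Fraction of negative calls that were correct.
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, for illustration only (not data from the paper).
example = diagnostic_metrics(tp=8, fp=5, tn=85, fn=2)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on how common progression is in the cohort, they can move much more than sensitivity and specificity when prevalence is low.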

Discovering Efficient Periodic Behaviours in Mechanical Systems via Neural Approximators

Yannik Wotte , Sven Dummer , Nicolò Botteghi , Christoph Brune , Stefano Stramigioli , Federico Califano

Categories: Robotics

2022-12-29

It is well known that conservative mechanical systems exhibit local oscillatory behaviours due to their elastic and gravitational potentials, which completely characterise these periodic motions together with the inertial properties of the system. The classification of these periodic behaviours and their geometric characterisation are in an on-going secular debate, which recently led to the so-called eigenmanifold theory. The eigenmanifold characterises nonlinear oscillations as a generalisation of linear eigenspaces. With the motivation of performing periodic tasks efficiently, we use tools coming from this theory to construct an optimization problem aimed at inducing desired closed-loop oscillations through a state feedback law. We solve the constructed optimization problem via gradient-descent methods involving neural networks. Extensive simulations show the validity of the approach.

Behavioral Cloning via Search in Video PreTraining Latent Space

Federico Malato , Florian Leopold , Amogh Raut , Ville Hautamäki , Andrew Melnik

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-12-27

Our aim is to build autonomous agents that can solve tasks in environments like Minecraft. To do so, we use an imitation learning-based approach. We formulate our control problem as a search problem over a dataset of experts' demonstrations, where the agent copies actions from a similar demonstration trajectory of image-action pairs. We perform a proximity search over the BASALT MineRL-dataset in the latent representation of a Video PreTraining model. The agent copies the actions from the expert trajectory as long as the state representations of the agent and of the selected expert trajectory do not diverge. Once they do, the proximity search is repeated. Our approach can effectively recover meaningful demonstration trajectories and shows human-like behavior of an agent in the Minecraft environment.
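
The search-and-copy loop described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: latents are plain tuples rather than Video PreTraining embeddings, the expert data is one flat sequence rather than many trajectories, Euclidean distance stands in for whatever metric the paper uses, and the names `nearest_index`, `act_by_search`, and `threshold` are hypothetical.

```python
import math


def nearest_index(latent, expert_latents):
    """Index of the expert state closest to `latent` (Euclidean distance)."""
    return min(range(len(expert_latents)),
               key=lambda i: math.dist(latent, expert_latents[i]))


def act_by_search(agent_latents, expert_latents, expert_actions, threshold):
    """Copy expert actions step by step; re-run the search on divergence.

    `threshold` is an assumed divergence criterion: when the agent's latent
    is farther than this from the expert state being followed, a new
    proximity search picks a fresh reference point.
    """
    actions = []
    cursor = None  # position in the expert sequence currently being followed
    for z in agent_latents:
        diverged = (
            cursor is None
            or cursor >= len(expert_latents)  # ran off the end of the data
            or math.dist(z, expert_latents[cursor]) > threshold
        )
        if diverged:
            cursor = nearest_index(z, expert_latents)
        actions.append(expert_actions[cursor])
        cursor += 1  # follow the expert trajectory forward in time
    return actions


# Toy data, for illustration only.
expert_latents = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]
expert_actions = ["a", "b", "c", "d", "e"]
agent_latents = [(0.1, 0), (1.1, 0), (10.2, 10), (11, 10.1)]
print(act_by_search(agent_latents, expert_latents, expert_actions, 1.0))
```

In the toy run, the agent follows the first expert segment for two steps, then jumps to the second segment when its state moves far from the expected next expert state.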

Re-basin via implicit Sinkhorn differentiation

Fidel A. Guerrero Peña , Heitor Rapela Medeiros , Thomas Dubail , Masih Aminbeidokhti , Eric Granger , Marco Pedersoli

Categories: Computer Vision

2022-12-22

The recent emergence of new algorithms for permuting models into functionally equivalent regions of the solution space has shed some light on the complexity of error surfaces and on some promising properties such as mode connectivity. However, finding the right permutation is challenging, and current optimization techniques are not differentiable, which makes them difficult to integrate into gradient-based optimization and often leads to sub-optimal solutions. In this paper, we propose a Sinkhorn re-basin network with the ability to obtain the transportation plan that better suits a given objective. Unlike the current state of the art, our method is differentiable and, therefore, easy to adapt to any task within the deep learning domain. Furthermore, we show the advantage of our re-basin method by proposing a new cost function that allows performing incremental learning by exploiting the linear mode connectivity property. The benefit of our method is compared against similar approaches from the literature, under several conditions for both optimal transport finding and linear mode connectivity. The effectiveness of our continual learning method based on re-basin is also shown for several common benchmark datasets, providing experimental results that are competitive with state-of-the-art results from the literature.
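
For readers unfamiliar with the Sinkhorn operator that the abstract above builds on, here is a minimal sketch of plain Sinkhorn normalization, which turns a score matrix into an approximately doubly stochastic "soft permutation". It is a generic textbook version and an assumption about the building block only: it does not reproduce the paper's re-basin network or its implicit differentiation. The function name `sinkhorn` and the parameters `tau` and `n_iters` are hypothetical.

```python
import math


def sinkhorn(scores, tau=0.5, n_iters=200):
    """Alternately normalize rows and columns of exp(scores / tau).

    Smaller `tau` pushes the result closer to a hard permutation matrix.
    Assumes `scores` is a square list of lists of floats.
    """
    n = len(scores)
    top = max(max(row) for row in scores)  # subtract max for numerical stability
    p = [[math.exp((s - top) / tau) for s in row] for row in scores]
    for _ in range(n_iters):
        # Row normalization: each row sums to 1.
        p = [[v / sum(row) for v in row] for row in p]
        # Column normalization: each column sums to 1.
        col_sums = [sum(p[i][j] for i in range(n)) for j in range(n)]
        p = [[p[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return p


# Toy score matrix, for illustration only.
scores = [[1.0, 0.2, 0.1],
          [0.3, 0.9, 0.2],
          [0.1, 0.4, 1.2]]
soft_perm = sinkhorn(scores)
for row in soft_perm:
    print([round(v, 3) for v in row])
```

Every step is a composition of exponentials and divisions, so gradients can flow through the result, which is what makes this relaxation attractive compared with a hard assignment solver.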

SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor Environments

Evin Pınar Örnek , Aravindhan K Krishnan , Shreekant Gayaka , Cheng-Hao Kuo , Arnie Sen , Nassir Navab , Federico Tombari

Categories: Computer Vision

2022-12-22

Object instance segmentation is a key challenge for indoor robots navigating cluttered environments with many small objects. Limitations in 3D sensing capabilities often make it difficult to detect every possible object. While deep learning approaches may be effective for this problem, manually annotating 3D data for supervised learning is time-consuming. In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner. We introduce a zero-shot split for the Tabletop Objects Dataset (TOD-Z) to enable this study and present a method that uses annotated objects to learn the "objectness" of pixels and generalize to unseen object categories in cluttered indoor environments. Our method, SupeRGB-D, groups pixels into small patches based on geometric cues and learns to merge the patches in a deep agglomerative clustering fashion. SupeRGB-D outperforms existing baselines on unseen objects while achieving similar performance on seen objects. Additionally, it is extremely lightweight (0.4 MB memory requirement) and suitable for mobile and robotic applications. The dataset split and code will be made publicly available upon acceptance.

Beyond Digital "Echo Chambers": The Role of Viewpoint Diversity in Political Discussion

Rishav Hada , Amir Ebrahimi Fard , Sarah Shugars , Federico Bianchi , Patricia Rossini , Dirk Hovy , Rebekah Tromble , Nava Tintarev

Categories: Natural Language Processing

2022-12-18

Increasingly taking place in online spaces, modern political conversations are typically perceived to be unproductively affirming -- siloed in so-called "echo chambers" of exclusively like-minded discussants. Yet, to date we lack sufficient means to measure viewpoint diversity in conversations. To this end, in this paper, we operationalize two viewpoint metrics proposed for recommender systems and adapt them to the context of social media conversations. This is the first study to apply these two metrics (Representation and Fragmentation) to real-world data and to consider the implications for online conversations specifically. We apply these measures to two topics -- daylight savings time (DST), which serves as a control, and the more politically polarized topic of immigration. We find that the diversity scores for both Fragmentation and Representation are lower for immigration than for DST. Further, we find that while pro-immigrant views receive consistent pushback on the platform, anti-immigrant views largely operate within echo chambers. We observe less severe yet similar patterns for DST. Taken together, Representation and Fragmentation paint a meaningful and important new picture of viewpoint diversity.

Online Handbook of Argumentation for AI: Volume 3

Lars Bengel , Elfia Bezou-Vrakatseli , Lydia Blümel , Federico Castagna , Giulia D'Agostino , Daphne Odekerken , Minal Suresh Patil , Jordan Robinson , Hao Wu , Andreas Xydis

Categories: Artificial Intelligence

2022-12-15

This volume contains revised versions of the papers selected for the third volume of the Online Handbook of Argumentation for AI (OHAAI). Previously, formal theories of argument and argument interaction have been proposed and studied, and this has led to the more recent study of computational models of argument. Argumentation, as a field within artificial intelligence (AI), is highly relevant for researchers interested in symbolic representations of knowledge and defeasible reasoning. The purpose of this handbook is to provide an open access and curated anthology for the argumentation research community. OHAAI is designed to serve as a research hub to keep track of the latest and upcoming PhD-driven research on the theory and application of argumentation in all areas related to AI.

TeTIm-Eval: a novel curated evaluation data set for comparing text-to-image models

Federico A. Galatolo , Mario G. C. A. Cimino , Edoardo Cogotti

Categories: Computer Vision | Natural Language Processing | Machine Learning

2022-12-15

Evaluating and comparing text-to-image models is a challenging problem. Significant advances in the field have recently been made, piquing the interest of various industrial sectors. As a consequence, a gold standard in the field should cover a variety of tasks and application contexts. In this paper, a novel evaluation approach is tested, based on: (i) a curated data set of high-quality, royalty-free image-text pairs, divided into ten categories; (ii) a quantitative metric, the CLIP-score; and (iii) a human evaluation task to distinguish, for a given text, the real and the generated images. The proposed method has been applied to the most recent models, i.e., DALLE2, Latent Diffusion, Stable Diffusion, GLIDE and Craiyon. Early experimental results show that the accuracy of the human judgement is fully coherent with the CLIP-score. The dataset has been made available to the public.

SST: Real-time End-to-end Monocular 3D Reconstruction via Sparse Spatial-Temporal Guidance

Chenyangguang Zhang , Zhiqiang Lou , Yan Di , Federico Tombari , Xiangyang Ji

Categories: Computer Vision

2022-12-13

Real-time monocular 3D reconstruction is a challenging problem that remains unsolved. Although recent end-to-end methods have demonstrated promising results, tiny structures and geometric boundaries are hardly captured, because insufficient supervision neglects spatial details and oversimplified feature fusion ignores temporal cues. To address these problems, we propose an end-to-end 3D reconstruction network, SST, which utilizes Sparse estimated points from a visual SLAM system as additional Spatial guidance and fuses Temporal features via a novel cross-modal attention mechanism, achieving more detailed reconstruction results. We propose a Local Spatial-Temporal Fusion module to exploit more informative spatial-temporal cues from multi-view color information and sparse priors, as well as a Global Spatial-Temporal Fusion module to refine the local TSDF volumes with the world-frame model from coarse to fine. Extensive experiments on ScanNet and 7-Scenes demonstrate that SST outperforms all state-of-the-art competitors, whilst keeping a high inference speed of 59 FPS, enabling real-world applications with real-time requirements.

Industry Best Practices in Robotics Software Engineering

Robert Bocchino , Arne Nordmann , Allison Thackston , Andreas Angerer , Federico Ciccozzi , Ivano Malavolta , Andreas Wortmann

Categories: Robotics

2022-12-09

Robotics software is pushing the limits of software engineering practice. The 3rd International Workshop on Robotics Software Engineering held a panel on "the best practices for robotic software engineering". This article shares the key takeaways that emerged from the discussion among the panelists and the workshop, ranging from architecting practices at the NASA/Caltech Jet Propulsion Laboratory and model-driven development at Bosch to development and testing of autonomous driving systems at Waymo and testing of robotics software at XITASO. Researchers and practitioners can build on the contents of this paper to gain a fresh perspective on their activities and focus on the most pressing practices and challenges in developing robotics software today.
