In recent years distributional reinforcement learning has produced many state of the art results. Increasingly sample efficient Distributional algorithms for the discrete action domain have been developed over time that vary primarily in the way they parameterize their approximations of value distributions, and how they quantify the differences between those distributions. In this work we transfer three of the most well-known and successful of those algorithms (QR-DQN, IQN and FQF) to the continuous action domain by extending two powerful actor-critic algorithms (TD3 and SAC) with distributional critics. We investigate whether the relative performance of the methods for the discrete action space translates to the continuous case. To that end we compare them empirically on the pybullet implementations of a set of continuous control tasks. Our results indicate qualitative invariance regarding the number and placement of distributional atoms in the deterministic, continuous action setting.
translated by 谷歌翻译
目的:脑电图(EEG)和肌电图(EMG)是两个非侵入性的生物信号,它们在人类机器界面(HMI)技术(EEG-HMI和EMG-HMI范式)中广泛用于康复,用于康复的物理残疾人。将脑电图和EMG信号成功解码为各自的控制命令是康复过程中的关键步骤。最近,提出了几个基于卷积的神经网络(CNN)架构,它们直接将原始的时间序列信号映射到决策空间中,并同时执行有意义的特征提取和分类的过程。但是,这些网络是根据学习给定生物信号的预期特征量身定制的,并且仅限于单个范式。在这项工作中,我们解决了一个问题,即我们可以构建一个单个体系结构,该架构能够从不同的HMI范式中学习不同的功能并仍然成功地对其进行分类。方法:在这项工作中,我们引入了一个称为Controanet的单个混合模型,该模型基于CNN和Transformer架构,该模型对EEG-HMI和EMG-HMI范式同样有用。 Contranet使用CNN块在模型中引入电感偏置并学习局部依赖性,而变压器块则使用自我注意机制来学习信号中的长距离依赖性,这对于EEG和EMG信号的分类至关重要。主要结果:我们在三个属于EEG-HMI和EMG-HMI范式的公开数据集上评估并比较了Contronet与最先进的方法。 Contranet在所有不同类别任务(2级,3类,4级和10级解码任务)中的表现优于其对应。意义:结果表明,与当前的最新算法状态相比,从不同的HMI范式中学习不同的特征并概述了矛盾。
translated by 谷歌翻译
表面肌电学(SEMG)的精确解码是肌肉到机器接口(MMI)的关键和它们的应用。康复治疗。由于各种因素,包括皮肤厚度,体脂百分比和电极放置,SEMG信号具有高的互类互变异性。因此,获得训练有素的SEMG解码器的高泛化质量非常具有挑战性。通常,基于机器学习的SEMG解码器可以在特定于对象的数据上培训,或者单独地为每个用户验证或至少重新校准。即使,深度学习算法也产生了几种最新的SEMG解码结果,然而,由于SEMG数据的可用性有限,深度学习模型容易过度拟合。最近,转移学习域适应改善了各种机器学习任务的培训时间减少的泛化质量。在这项研究中,我们调查了使用权重初始化进行转移学习的有效性,以重新校正在新的科目数据上的两个不同预磨削的深度学习模型,并将它们的性能与特定于学科的模型进行比较。据我们所知,这是第一项研究,即彻底调查基于体重初始化的转移学习,并比较了对象特异性建模的转移学习。我们在各种设置下在三个公开的数据库上测试了我们的模型。平均过度通过所有设置,我们的转移学习方法改善了预训练模型的5〜\%,在没有微调的情况下,在特定于课程的型号上的12〜\%点,同时平均培训22〜\%较少的时期。我们的结果表明,转让学习可以更快地培训比用户特定的型号更少,并且只要有足够的数据,可以提高预磨料模型的性能。
translated by 谷歌翻译
Diabetic Retinopathy (DR) is considered one of the primary concerns due to its effect on vision loss among most people with diabetes globally. The severity of DR is mostly comprehended manually by ophthalmologists from fundus photography-based retina images. This paper deals with an automated understanding of the severity stages of DR. In the literature, researchers have focused on this automation using traditional machine learning-based algorithms and convolutional architectures. However, the past works hardly focused on essential parts of the retinal image to improve the model performance. In this paper, we adopt transformer-based learning models to capture the crucial features of retinal images to understand DR severity better. We work with ensembling image transformers, where we adopt four models, namely ViT (Vision Transformer), BEiT (Bidirectional Encoder representation for image Transformer), CaiT (Class-Attention in Image Transformers), and DeiT (Data efficient image Transformers), to infer the degree of DR severity from fundus photographs. For experiments, we used the publicly available APTOS-2019 blindness detection dataset, where the performances of the transformer-based models were quite encouraging.
translated by 谷歌翻译
This paper presents our solutions for the MediaEval 2022 task on DisasterMM. The task is composed of two subtasks, namely (i) Relevance Classification of Twitter Posts (RCTP), and (ii) Location Extraction from Twitter Texts (LETT). The RCTP subtask aims at differentiating flood-related and non-relevant social posts while LETT is a Named Entity Recognition (NER) task and aims at the extraction of location information from the text. For RCTP, we proposed four different solutions based on BERT, RoBERTa, Distil BERT, and ALBERT obtaining an F1-score of 0.7934, 0.7970, 0.7613, and 0.7924, respectively. For LETT, we used three models namely BERT, RoBERTa, and Distil BERTA obtaining an F1-score of 0.6256, 0.6744, and 0.6723, respectively.
translated by 谷歌翻译
Objective: Despite numerous studies proposed for audio restoration in the literature, most of them focus on an isolated restoration problem such as denoising or dereverberation, ignoring other artifacts. Moreover, assuming a noisy or reverberant environment with limited number of fixed signal-to-distortion ratio (SDR) levels is a common practice. However, real-world audio is often corrupted by a blend of artifacts such as reverberation, sensor noise, and background audio mixture with varying types, severities, and duration. In this study, we propose a novel approach for blind restoration of real-world audio signals by Operational Generative Adversarial Networks (Op-GANs) with temporal and spectral objective metrics to enhance the quality of restored audio signal regardless of the type and severity of each artifact corrupting it. Methods: 1D Operational-GANs are used with generative neuron model optimized for blind restoration of any corrupted audio signal. Results: The proposed approach has been evaluated extensively over the benchmark TIMIT-RAR (speech) and GTZAN-RAR (non-speech) datasets corrupted with a random blend of artifacts each with a random severity to mimic real-world audio signals. Average SDR improvements of over 7.2 dB and 4.9 dB are achieved, respectively, which are substantial when compared with the baseline methods. Significance: This is a pioneer study in blind audio restoration with the unique capability of direct (time-domain) restoration of real-world audio whilst achieving an unprecedented level of performance for a wide SDR range and artifact types. Conclusion: 1D Op-GANs can achieve robust and computationally effective real-world audio restoration with significantly improved performance. The source codes and the generated real-world audio datasets are shared publicly with the research community in a dedicated GitHub repository1.
translated by 谷歌翻译
Uncertainty quantification is crucial to inverse problems, as it could provide decision-makers with valuable information about the inversion results. For example, seismic inversion is a notoriously ill-posed inverse problem due to the band-limited and noisy nature of seismic data. It is therefore of paramount importance to quantify the uncertainties associated to the inversion process to ease the subsequent interpretation and decision making processes. Within this framework of reference, sampling from a target posterior provides a fundamental approach to quantifying the uncertainty in seismic inversion. However, selecting appropriate prior information in a probabilistic inversion is crucial, yet non-trivial, as it influences the ability of a sampling-based inference in providing geological realism in the posterior samples. To overcome such limitations, we present a regularized variational inference framework that performs posterior inference by implicitly regularizing the Kullback-Leibler divergence loss with a CNN-based denoiser by means of the Plug-and-Play methods. We call this new algorithm Plug-and-Play Stein Variational Gradient Descent (PnP-SVGD) and demonstrate its ability in producing high-resolution, trustworthy samples representative of the subsurface structures, which we argue could be used for post-inference tasks such as reservoir modelling and history matching. To validate the proposed method, numerical tests are performed on both synthetic and field post-stack seismic data.
translated by 谷歌翻译
Automated synthesis of histology images has several potential applications in computational pathology. However, no existing method can generate realistic tissue images with a bespoke cellular layout or user-defined histology parameters. In this work, we propose a novel framework called SynCLay (Synthesis from Cellular Layouts) that can construct realistic and high-quality histology images from user-defined cellular layouts along with annotated cellular boundaries. Tissue image generation based on bespoke cellular layouts through the proposed framework allows users to generate different histological patterns from arbitrary topological arrangement of different types of cells. SynCLay generated synthetic images can be helpful in studying the role of different types of cells present in the tumor microenvironmet. Additionally, they can assist in balancing the distribution of cellular counts in tissue images for designing accurate cellular composition predictors by minimizing the effects of data imbalance. We train SynCLay in an adversarial manner and integrate a nuclear segmentation and classification model in its training to refine nuclear structures and generate nuclear masks in conjunction with synthetic images. During inference, we combine the model with another parametric model for generating colon images and associated cellular counts as annotations given the grade of differentiation and cell densities of different cells. We assess the generated images quantitatively and report on feedback from trained pathologists who assigned realism scores to a set of images generated by the framework. The average realism score across all pathologists for synthetic images was as high as that for the real images. We also show that augmenting limited real data with the synthetic data generated by our framework can significantly boost prediction performance of the cellular composition prediction task.
translated by 谷歌翻译
Autonomous mobile agents such as unmanned aerial vehicles (UAVs) and mobile robots have shown huge potential for improving human productivity. These mobile agents require low power/energy consumption to have a long lifespan since they are usually powered by batteries. These agents also need to adapt to changing/dynamic environments, especially when deployed in far or dangerous locations, thus requiring efficient online learning capabilities. These requirements can be fulfilled by employing Spiking Neural Networks (SNNs) since SNNs offer low power/energy consumption due to sparse computations and efficient online learning due to bio-inspired learning mechanisms. However, a methodology is still required to employ appropriate SNN models on autonomous mobile agents. Towards this, we propose a Mantis methodology to systematically employ SNNs on autonomous mobile agents to enable energy-efficient processing and adaptive capabilities in dynamic environments. The key ideas of our Mantis include the optimization of SNN operations, the employment of a bio-plausible online learning mechanism, and the SNN model selection. The experimental results demonstrate that our methodology maintains high accuracy with a significantly smaller memory footprint and energy consumption (i.e., 3.32x memory reduction and 2.9x energy saving for an SNN model with 8-bit weights) compared to the baseline network with 32-bit weights. In this manner, our Mantis enables the employment of SNNs for resource- and energy-constrained mobile agents.
translated by 谷歌翻译
In this paper, we present an evolved version of the Situational Graphs, which jointly models in a single optimizable factor graph, a SLAM graph, as a set of robot keyframes, containing its associated measurements and robot poses, and a 3D scene graph, as a high-level representation of the environment that encodes its different geometric elements with semantic attributes and the relational information between those elements. Our proposed S-Graphs+ is a novel four-layered factor graph that includes: (1) a keyframes layer with robot pose estimates, (2) a walls layer representing wall surfaces, (3) a rooms layer encompassing sets of wall planes, and (4) a floors layer gathering the rooms within a given floor level. The above graph is optimized in real-time to obtain a robust and accurate estimate of the robot's pose and its map, simultaneously constructing and leveraging the high-level information of the environment. To extract such high-level information, we present novel room and floor segmentation algorithms utilizing the mapped wall planes and free-space clusters. We tested S-Graphs+ on multiple datasets including, simulations of distinct indoor environments, on real datasets captured over several construction sites and office environments, and on a real public dataset of indoor office environments. S-Graphs+ outperforms relevant baselines in the majority of the datasets while extending the robot situational awareness by a four-layered scene model. Moreover, we make the algorithm available as a docker file.
translated by 谷歌翻译