Smart Paper Notes

Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review

Enes Altuncu , Virginia N. L. Franqueira , Shujun Li

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-08-21

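Recent advances in AI, especially deep learning, have led to a significant increase in the creation of new realistic synthetic media (video, image, and audio) and in the manipulation of existing media, which has given rise to the new term "deepfake". Drawing on research literature and resources in both English and Chinese, this paper gives a comprehensive overview of deepfake, covering multiple important aspects of this emerging concept, including 1) different definitions, 2) commonly used performance metrics and standards, and 3) deepfake-related datasets, challenges, competitions, and benchmarks. In addition, the paper reports a meta-review of 12 deepfake-related survey papers published in 2020 and 2021, focusing not only on the aspects above but also on an analysis of key challenges and recommendations. We believe that this paper is the most comprehensive review of deepfake in terms of the aspects covered, and the first to cover both English and Chinese literature and resources.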

translated by Google Translate

A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception

Keenan Jones , Enes Altuncu , Virginia N. L. Franqueira , Yichao Wang , Shujun Li

Categories: Natural Language Processing

2022-08-11

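In recent years there has been substantial growth in the capabilities of systems designed to generate text that mimics the fluency and coherence of human language. As a result, considerable research has examined the potential uses of these natural language generators (NLG) across a wide range of tasks. The increasing ability of powerful text generators to mimic human writing convincingly raises the potential for deception and other forms of dangerous misuse. As these systems improve and it becomes ever harder to distinguish human-written from machine-generated text, malicious actors could leverage these powerful NLG systems to a wide variety of ends, including the creation of fake news and misinformation, the generation of fake online product reviews, or the use of chatbots to persuade users to divulge private information. In this paper, we provide an overview of the NLG field by identifying and examining 119 survey-like papers on NLG research. From these papers, we outline a proposed high-level taxonomy of the central concepts that constitute NLG, including the methods used to develop generalised NLG systems, the means by which these systems are evaluated, and the popular NLG tasks and subtasks that exist. In turn, we provide an overview and discussion of each of these items with respect to current research, and examine the potential roles of NLG in deception and in detection systems that counteract these threats. Moreover, we discuss the broader challenges of NLG, including the risks of bias often exhibited by existing text generation systems. This work offers a broad overview of the NLG field with respect to its potential for misuse, aiming to provide a high-level understanding of this rapidly developing area of research.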

Neural source/sink phase connectivity in developmental dyslexia by means of interchannel causality

I. Rodríguez-Rodríguez , A. Ortiz , N. J. Gallego-Molina , M. A. Formoso , W. L. Woo

Categories: Artificial Intelligence

2023-01-02

While the brain connectivity network can inform the understanding and diagnosis of developmental dyslexia, its cause-effect relationships have not yet been sufficiently examined. Employing electroencephalography signals and a band-limited white noise stimulus at 4.8 Hz (the prosodic-syllabic frequency), we measure the phase Granger causalities among channels to identify differences between dyslexic learners and controls, thereby proposing a method to calculate directional connectivity. As causal relationships run in both directions, we explore three scenarios, namely channels' activity as sources, as sinks, and in total. Our proposed method can be used for both classification and exploratory analysis. In all scenarios, we find confirmation of the established right-lateralized Theta sampling network anomaly, in line with the temporal sampling framework's assumption of oscillatory differences in the Theta and Gamma bands. Further, we show that this anomaly primarily occurs in the causal relationships of channels acting as sinks, where it is significantly more pronounced than when only total activity is observed. In the sink scenario, our classifier obtains 0.84 and 0.88 accuracy and 0.87 and 0.93 AUC for the Theta and Gamma bands, respectively.

General multi-fidelity surrogate models: Framework and active learning strategies for efficient rare event simulation

Promit Chakroborty , Somayajulu L. N. Dhulipala , Yifeng Che , Wen Jiang , Benjamin W. Spencer , Jason D. Hales , Michael D. Shields

Categories: Machine Learning | Machine Learning (Statistics)

2022-12-07

Estimating the probability of failure for complex real-world systems using high-fidelity computational models is often prohibitively expensive, especially when the probability is small. Exploiting low-fidelity models can make this process more feasible, but merging information from multiple low-fidelity and high-fidelity models poses several challenges. This paper presents a robust multi-fidelity surrogate modeling strategy in which the multi-fidelity surrogate is assembled using an active learning strategy using an on-the-fly model adequacy assessment set within a subset simulation framework for efficient reliability analysis. The multi-fidelity surrogate is assembled by first applying a Gaussian process correction to each low-fidelity model and assigning a model probability based on the model's local predictive accuracy and cost. Three strategies are proposed to fuse these individual surrogates into an overall surrogate model based on model averaging and deterministic/stochastic model selection. The strategies also dictate which model evaluations are necessary. No assumptions are made about the relationships between low-fidelity models, while the high-fidelity model is assumed to be the most accurate and most computationally expensive model. Through two analytical and two numerical case studies, including a case study evaluating the failure probability of Tristructural isotropic-coated (TRISO) nuclear fuels, the algorithm is shown to be highly accurate while drastically reducing the number of high-fidelity model calls (and hence computational cost).

Automated segmentation of microvessels in intravascular OCT images using deep learning

Juhwan Lee , Justin N. Kim , Lia Gomez-Perez , Yazan Gharaibeh , Issam Motairek , Gabriel T. R. Pereira , Vladislav N. Zimin , Luis A. P. Dallan , Ammar Hoori , Sadeer Al-Kindi

Categories: Computer Vision | Machine Learning

2022-10-01

To analyze this characteristic of vulnerability, we developed an automated deep learning method for detecting microvessels in intravascular optical coherence tomography (IVOCT) images. A total of 8,403 IVOCT image frames from 85 lesions and 37 normal segments were analyzed. Manual annotation was done using dedicated software (OCTOPUS) previously developed by our group. Data augmentation in the polar (r, θ) domain was applied to raw IVOCT images to ensure that microvessels appear at all possible angles. Pre-processing methods included guidewire/shadow detection, lumen segmentation, pixel shifting, and noise reduction. DeepLab v3+ was used to segment microvessel candidates. A bounding box on each candidate was classified as either microvessel or non-microvessel using a shallow convolutional neural network. For better classification, we used data augmentation (i.e., angle rotation) on bounding boxes with a microvessel during network training. Data augmentation and pre-processing steps improved microvessel segmentation performance significantly, yielding a method with a Dice coefficient of 0.71 ± 0.10 and pixel-wise sensitivity/specificity of 87.7 ± 6.6% / 99.8 ± 0.1%. The network for classifying microvessels from candidates performed exceptionally well, with sensitivity of 99.5 ± 0.3%, specificity of 98.8 ± 1.0%, and accuracy of 99.1 ± 0.5%. The classification step eliminated the majority of residual false positives, and the Dice coefficient increased from 0.71 to 0.73. In addition, our method produced 698 image frames with microvessels present, compared to 730 from manual analysis, representing a 4.4% difference. When compared to the manual method, the automated method improved microvessel continuity, implying improved segmentation performance. The method will be useful for research purposes as well as potential future treatment planning.
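
The pixel-wise metrics quoted in this abstract (Dice, sensitivity, specificity) follow from the confusion counts of a predicted mask against a reference mask. The sketch below is only an illustration of those standard definitions, not the authors' evaluation code; the function name `segmentation_metrics` and the flat 0/1 list representation of a mask are assumptions made for the example.

```python
def segmentation_metrics(pred, truth):
    """Return (dice, sensitivity, specificity) for two binary masks.

    pred and truth are flat sequences of 0/1 pixel labels of equal length
    (a hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1      # predicted positive, reference positive
        elif p and not t:
            fp += 1      # predicted positive, reference negative
        elif t:
            fn += 1      # predicted negative, reference positive
        else:
            tn += 1      # predicted negative, reference negative

    # Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 when both masks are empty.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return dice, sensitivity, specificity


if __name__ == "__main__":
    pred = [1, 1, 0, 0, 0, 0, 0, 0]
    truth = [1, 0, 0, 0, 1, 0, 0, 0]
    print(segmentation_metrics(pred, truth))
```

Because background pixels dominate such images, specificity can sit near 100% while Dice stays much lower, which is consistent with the pattern of values reported in the abstract.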

Design of experiments for the calibration of history-dependent models via deep reinforcement learning and an enhanced Kalman filter

Ruben Villarreal , Nikolaos N. Vlassis , Nhon N. Phan , Tommie A. Catanach , Reese E. Jones , Nathaniel A. Trask , Sharlotte L. B. Kramer , WaiChing Sun

Categories: Machine Learning

2022-09-27

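Experimental data is costly to obtain, which makes it difficult to calibrate complex models. For many models, it is not obvious which experimental design yields the best calibration under a limited experimental budget. This paper introduces a deep reinforcement learning (RL) algorithm for the design of experiments that maximizes the information gain, measured by the Kullback-Leibler (KL) divergence, obtained via a Kalman filter (KF). This combination enables experimental design for rapid online experiments where traditional methods are too costly. We formulate the possible configurations of experiments as a decision tree and a Markov decision process (MDP), in which a finite set of actions is available at each incremental step. Once an action is taken, a variety of measurements are used to update the state of the experiment. This new data leads to a Bayesian update of the parameters by the KF, which is used to enhance the state representation. In contrast to the Nash-Sutcliffe efficiency (NSE) index, which requires additional sampling to test hypotheses for forward predictions, the KF can lower the cost of experiments by directly estimating the value of new data acquired through additional actions. In this work, our applications focus on the mechanical testing of materials. Numerical experiments with complex, history-dependent models are used to verify the implementation and benchmark the performance of the RL-designed experiments.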
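
To make the reward in this abstract concrete, the sketch below computes the information gain of a single measurement as the KL divergence between the posterior and the prior of a Kalman update. It is a minimal illustration under strong assumptions (one scalar parameter, a linear measurement `z = h * theta + noise`, Gaussian prior and noise); the function names are hypothetical, and the paper's enhanced Kalman filter and RL agent are not reproduced here.

```python
import math


def kalman_update(mean, var, z, h, r):
    """Scalar Kalman measurement update for z = h * theta + noise, noise ~ N(0, r)."""
    gain = var * h / (h * var * h + r)          # Kalman gain
    new_mean = mean + gain * (z - h * mean)     # posterior mean
    new_var = (1.0 - gain * h) * var            # posterior variance
    return new_mean, new_var


def gaussian_kl(mean_q, var_q, mean_p, var_p):
    """KL(q || p) in nats for univariate Gaussians q and p."""
    return 0.5 * (
        math.log(var_p / var_q)
        + (var_q + (mean_q - mean_p) ** 2) / var_p
        - 1.0
    )


def information_gain(mean, var, z, h, r):
    """KL divergence from the prior to the posterior after observing z."""
    post_mean, post_var = kalman_update(mean, var, z, h, r)
    return gaussian_kl(post_mean, post_var, mean, var)


if __name__ == "__main__":
    # Same observation, two hypothetical experiments with different noise levels.
    print(information_gain(0.0, 1.0, z=1.0, h=1.0, r=1.0))
    print(information_gain(0.0, 1.0, z=1.0, h=1.0, r=0.1))
```

In this toy setting a less noisy measurement yields a larger gain, which is the kind of signal an agent could use to rank candidate experiments.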

Physics-Informed Machine Learning of Dynamical Systems for Efficient Bayesian Inference

Somayajulu L. N. Dhulipala , Yifeng Che , Michael D. Shields

Categories: Machine Learning (Statistics) | Machine Learning

2022-09-19

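Although the No-U-Turn Sampler (NUTS) is a widely adopted method for performing Bayesian inference, it requires numerous posterior gradients, which can be expensive to compute in practice. Recently, there has been significant interest in physics-based machine learning of dynamical (or Hamiltonian) systems, including Hamiltonian neural networks (HNNs). However, these types of architectures have not yet been applied to solve Bayesian inference problems efficiently. We propose using HNNs to perform Bayesian inference efficiently without requiring numerous posterior gradients. We introduce latent variable outputs to HNNs (L-HNNs) for improved expressivity and reduced integration errors. We integrate L-HNNs into NUTS and further propose an online error monitoring scheme to prevent sampling degeneracy in regions where L-HNNs may have little training data. We demonstrate L-HNNs in NUTS with online error monitoring on several complex, high-dimensional posterior densities and compare its performance to that of NUTS.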

Brain Imaging Generation with Latent Diffusion Models

Walter H. L. Pinaya , Petru-Daniel Tudosiu , Jessica Dafflon , Pedro F da Costa , Virginia Fernandez , Parashkev Nachev , Sebastien Ourselin , M. Jorge Cardoso

Categories: Computer Vision

2022-09-15

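Deep neural networks have brought remarkable breakthroughs in medical image analysis. However, because of their data-hungry nature, the modest dataset sizes in medical imaging projects may be hindering their full potential. Generating synthetic data provides a promising alternative, making it possible to complement training datasets and conduct medical image research at a larger scale. Diffusion models have recently caught the attention of the computer vision community by producing photorealistic synthetic images. In this study, we explore using latent diffusion models to generate synthetic images from high-resolution 3D brain images. We used T1w MRI images from the UK Biobank dataset (N = 31,740) to train our models to learn the probability distribution of brain images, conditioned on covariates such as age, sex, and brain structure volumes. We found that our models created realistic data and that the conditioning variables could be used to control the data generation effectively. In addition, we created a synthetic dataset of 100,000 brain images and made it openly available to the scientific community.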

Ontologizing Health Systems Data at Scale: Making Translational Discovery a Reality

Tiffany J. Callahan , Adrianne L. Stefanski , Jordan M. Wyrwa , Chenjie Zeng , Anna Ostropolets , Juan M. Banda , William A. Baumgartner Jr. , Richard D. Boyce , Elena Casiraghi , Ben D. Coleman

Categories: Artificial Intelligence

2022-09-10

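Common data models solve many challenges of standardizing electronic health record (EHR) data, but they cannot integrate the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide semantically computable representations of biological knowledge and enable the integration of a variety of biomedical data. However, mapping EHR data to OBO Foundry ontologies requires significant manual curation and domain expertise. We introduce a framework for mapping Observational Medical Outcomes Partnership (OMOP) standard vocabularies to OBO Foundry ontologies. Using this framework, we produced mappings for 92,367 conditions, 8,615 drug ingredients, and 10,673 measurement results. Domain experts verified the mapping accuracy, and when examined across 24 hospitals, the mappings covered 99% of conditions and drug ingredients and 68% of measurements. Finally, we demonstrate that the OMOP2OBO mappings can aid in the systematic identification of undiagnosed rare disease patients who might benefit from genetic testing.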

Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube

R. Abbasi , M. Ackermann , J. Adams , N. Aggarwal , J. A. Aguilar , M. Ahlers , M. Ahrens , J. M. Alameddine , A. A. Alves Jr. , N. M. Amin

Categories: Machine Learning

2022-09-07

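IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of IceCube data. Reconstructing and classifying events is a challenge because of the detector geometry, the inhomogeneous scattering and absorption of light in the ice, and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, IceCube events can be represented as point cloud graphs, with a graph neural network (GNN) used as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction, and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range with the current state-of-the-art maximum likelihood techniques used in IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR) compared to current IceCube methods. Alternatively, the GNN reduces the FPR by more than a factor of 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum likelihood techniques. When run on a GPU, the GNN can process IceCube events at nearly double the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low-energy neutrinos in online searches for transient events.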
