Smart Paper Notes

AI-coupled HPC Workflows

Shantenu Jha , Vincent R. Pascuzzi , Matteo Turilli

Categories: Artificial Intelligence | Machine Learning

2022-08-24

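Scientific discovery increasingly requires sophisticated and scalable workflows. Workflows have become the "new applications", in which multi-scale computing campaigns comprise multiple, heterogeneous executable tasks. In particular, introducing AI/ML models into traditional HPC workflows has enabled highly accurate modeling, typically with lower computational cost than traditional methods. This chapter discusses the various modes of integrating AI/ML models with HPC computations, which give rise to different types of AI-coupled HPC workflows. We motivate the growing need to couple AI/ML and HPC across scientific domains, then illustrate each mode with a number of production-grade use cases. We also discuss the primary challenges of extreme-scale AI-coupled HPC campaigns (task heterogeneity, adaptivity, and performance) and several frameworks and middleware solutions that aim to address them. Although HPC workflows and AI/ML computing paradigms are each effective on their own, we highlight how their integration, and eventual convergence, is leading to significant improvements in scientific performance across a range of domains, ultimately enabling scientific explorations that would otherwise be unattainable.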

Coupling streaming AI and HPC ensembles to achieve 100-1000x faster biomolecular simulations

Alexander Brace , Igor Yakushin , Heng Ma , Anda Trifan , Todd Munson , Ian Foster , Arvind Ramanathan , Hyungro Lee , Matteo Turilli , Shantenu Jha

Categories: Machine Learning

2021-04-10

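Machine learning (ML)-based steering can improve the performance of ensemble-based simulations by selecting, online, the computations that are more scientifically meaningful. We present DeepDriveMD, a framework for ML-driven steering of scientific simulations, which we have used to achieve orders-of-magnitude improvements in molecular dynamics (MD) performance through effective coupling of ML and HPC on large parallel computers. We discuss the design of DeepDriveMD and characterize its performance. We demonstrate that DeepDriveMD can achieve a 100-1000x speedup relative to other methods, as measured by the amount of simulated time performed, while covering the same conformational landscape, as quantified by the states sampled during a simulation. Experiments were performed on leadership-class platforms on up to 1020 nodes. The results establish DeepDriveMD as a high-performance framework for ML-driven HPC simulation scenarios that supports diverse MD simulation and ML back ends, and that enables new scientific insights by extending the length and time scales accessible with current computing capacity.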

Asynchronous Execution of Heterogeneous Tasks in AI-coupled HPC Workflows

Vincent R. Pascuzzi , Matteo Turilli , Shantenu Jha

Categories: Artificial Intelligence | Machine Learning

2022-08-23

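Heterogeneous scientific workflows consist of many types of tasks and dependencies. Middleware capable of scheduling and submitting different task types across heterogeneous platforms must permit asynchronous execution of tasks in order to improve resource utilization and task throughput and to reduce makespan. In this paper, we present an important class of heterogeneous workflows, namely AI-driven HPC workflows, and use it to investigate the requirements and properties of asynchronous task execution. We model the degree of asynchronicity permitted by arbitrary workflows and propose key metrics that can be used to determine the qualitative benefits of asynchronous execution. Our experiments represent important scientific drivers and were performed at scale on Summit; the performance improvements due to asynchronous execution are consistent with our model.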

Molecular Dynamics Simulations on Cloud Computing and Machine Learning Platforms

Prateek Sharma , Vikram Jadhao

Categories: Machine Learning

2021-11-11

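Scientific computing applications have benefited greatly from high-performance computing infrastructure such as supercomputers. However, we are seeing a paradigm shift in the computational structure, design, and requirements of these applications. Increasingly, data-driven and machine learning approaches are being used to support, accelerate, and enhance scientific computing applications, especially molecular dynamics simulations. At the same time, cloud computing platforms are increasingly appealing for scientific computing: they offer "infinite" computing power, easier programming and deployment models, and access to computing accelerators such as TPUs (Tensor Processing Units). This confluence of machine learning (ML) and cloud computing presents exciting opportunities for cloud and systems researchers. ML-assisted molecular dynamics simulations are a new class of workload with unique computational patterns, and they pose new challenges for low-cost, high-performance execution. We argue that transient cloud resources, such as low-cost preemptible cloud VMs, can be a viable platform for this new workload. Finally, we present some low-hanging fruit and long-term challenges in cloud resource management and in the integration of molecular dynamics simulations into ML platforms such as TensorFlow.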

Recent Developments in Structure-Based Virtual Screening Approaches

Christoph Gorgulla

Categories: Machine Learning

2022-11-06

Drug development is a wide scientific field that faces many challenges these days. Among them are extremely high development costs, long development times, and a low number of new drugs approved each year. To solve these problems, new and innovative technologies are needed that make the discovery process for small-molecule drugs more time- and cost-efficient, and that allow targeting previously undruggable target classes such as protein-protein interactions. Structure-based virtual screenings have become a leading contender in this context. In this review, we give an introduction to the foundations of structure-based virtual screenings and survey their progress in the past few years. We outline key principles, recent success stories, new methods, available software, and promising future research directions. Virtual screenings have enormous potential for the development of new small-molecule drugs and are already starting to transform early-stage drug discovery.

Snowmass 2021 Computational Frontier CompF03 Topical Group Report: Machine Learning

Phiala Shanahan , Kazuhiro Terao , Daniel Whiteson

Categories: Artificial Intelligence

2022-09-15

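The rapidly developing intersection of machine learning (ML) and high-energy physics (HEP) presents both opportunities and challenges to our community. Far beyond the application of standard ML tools to HEP problems, genuinely new and potentially revolutionary approaches are being developed by a generation of talent literate in both fields. There is an urgent need to support the interdisciplinary community driving these developments, including funding dedicated research at the intersection of the two fields, investing in high-performance computing at universities and adjusting allocation policies to support this work, developing community tools and standards, and providing education and career paths for young researchers drawn to high-energy physics by the intellectual vitality of machine learning.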

RLOps: Development Life-cycle of Reinforcement Learning Aided Open RAN

Peizheng Li , Jonathan Thomas , Xiaoyang Wang , Ahmed Khalil , Abdelrahim Ahmad , Rui Inacio , Shipra Kapoor , Arjun Parekh , Angela Doufexi , Arman Shojaeifard

Categories: Machine Learning

2021-11-12

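Radio access network (RAN) technologies continue to see massive growth, with Open RAN gaining the most recent momentum. In the O-RAN specifications, the RAN intelligent controller (RIC) serves as an automation host. This article introduces principles of machine learning (ML), and in particular reinforcement learning (RL), relevant to the O-RAN stack. Furthermore, we review state-of-the-art research on wireless networks and map it onto the RAN framework and the hierarchy of the O-RAN architecture. We provide a taxonomy of the challenges that ML/RL models face throughout the development life cycle, from system specification to production deployment (data acquisition, model design, testing, management, and so on). To address these challenges, we integrate a set of existing MLOps principles with the unique characteristics that arise when RL agents are considered. This paper discusses a systematic pipeline for life-cycle model development, testing, and validation, termed RLOps. We discuss all the fundamental parts of RLOps: model specification, development and distillation, production environment serving, operations monitoring, safety/security, and the data engineering platform. Based on these principles, we propose best practices for achieving an automated and reproducible model development process.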

Data-Centric Engineering: integrating simulation, machine learning and statistics. Challenges and Opportunities

Indranil Pan , Lachlan Mason , Omar Matar

Categories: Machine Learning

2021-11-07

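Recent advances in machine learning, coupled with low-cost computation and the availability of cheap streaming sensors, data storage, and cloud technologies, have led to widespread multidisciplinary research activity, with significant interest and investment from commercial stakeholders. Mechanistic models based on physical equations and purely data-driven statistical approaches represent the two ends of the modelling spectrum. New hybrid, data-centric engineering approaches, which leverage the best of both worlds and integrate simulations and data, are emerging as a powerful tool with a transformative impact on the physical disciplines. We review the key research trends and application scenarios in the emerging field of integrating simulation, machine learning, and statistics. We highlight the opportunities that such an integrated vision can unlock and outline the key challenges holding back its realisation. We also discuss the bottlenecks in the translational aspects of the field and the long-term upskilling requirements of the existing workforce and future university graduates.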

Beyond 5G Networks: Integration of Communication, Computing, Caching, and Control

Musbahu Mohammed Adam , Liqiang Zhao , Kezhi Wang , Zhu Han

Categories: Machine Learning

2022-12-26

In recent years, the exponential proliferation of smart devices and their intelligent applications has posed severe challenges to conventional cellular networks. Such challenges can potentially be overcome by integrating communication, computing, caching, and control (i4C) technologies. In this survey, we first give a snapshot of different aspects of the i4C, comprising background, motivation, leading technological enablers, potential applications, and use cases. Next, we describe different models of communication, computing, caching, and control (4C) to lay the foundation of the integration approach. We review current state-of-the-art research efforts related to the i4C, focusing on recent trends in both conventional and artificial intelligence (AI)-based integration approaches. We also highlight the need for intelligence in resource integration. Then, we discuss integration of sensing and communication (ISAC) and classify the integration approaches into various classes. Finally, we propose open challenges and present future research directions for beyond 5G networks, such as 6G.

A Survey of Machine Learning for Computer Architecture and Systems

Nan Wu , Yuan Xie

Categories: Machine Learning

2021-02-16

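Computer architecture and systems have long been optimized for the efficient execution of machine learning (ML) models. It is now time to reconsider the relationship between ML and systems, and to let ML transform the way computer architecture and systems are designed. This has a twofold meaning: improving designers' productivity, and completing the virtuous cycle. In this paper, we present a comprehensive review of work that applies ML to computer architecture and system design. First, we consider the typical roles of ML techniques in architecture/system design, that is, either fast predictive modeling or the design methodology itself, and we construct a high-level taxonomy. Then, we summarize the common problems in computer architecture/system design that can be solved with ML techniques, together with the typical ML techniques used to address each of them. In addition to emphasizing computer architecture in the narrow sense, we adopt the view that data centers can be regarded as warehouse-scale computers; we briefly discuss adjacent computer systems, such as code generation and compilers; and we also consider how ML techniques can aid and transform design automation. We further offer a future vision of opportunities and potential directions, and envision that applying ML to computer architecture and systems will thrive in the community.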

Deep Learning-Driven Edge Video Analytics: A Survey

Renjie Xu , Saiedeh Razavi , Rong Zheng

Categories: Computer Vision | Machine Learning

2022-11-28

Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), begins to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. A dedicated venue for collecting and summarizing the latest advances of EVA is highly desired by the community. Besides, the basic concepts of EVA (e.g., definition, architectures, etc.) are ambiguous and neglected by these surveys due to the rapid development of this domain. A thorough clarification is needed to facilitate a consensus on these concepts. To fill in these gaps, we conduct a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.

Predictive Scale-Bridging Simulations through Active Learning

Satish Karra , Mohamed Mehana , Nicholas Lubbers , Yu Chen , Abdourahmane Diaw , Javier E. Santos , Aleksandra Pachalieva , Robert S. Pavel , Jeffrey R. Haack , Michael McKerns

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2022-09-20

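Throughout computational science, there is a growing need to use the continual improvements in raw computational horsepower to achieve greater physical fidelity through scale-bridging, rather than through brute-force increases in the number of mesh elements. For instance, quantitative predictions of transport in nanoporous media, which are critical to hydrocarbon extraction from tight shale formations, are impossible without accounting for molecular-level interactions. Similarly, inertial confinement fusion simulations rely on numerical diffusion to mimic molecular effects such as non-local transport and mixing, without truly accounting for molecular interactions. With these two disparate applications in mind, we develop a novel capability that uses an active learning approach to optimize the use of local fine-scale simulations for informing coarse-scale hydrodynamics. Our approach addresses three challenges: forecasting the continuum coarse-scale trajectory in order to speculatively execute new fine-scale molecular dynamics calculations, dynamically updating the coarse scale from fine-scale calculations, and quantifying uncertainty in neural network models.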

Technology Readiness Levels for Machine Learning Systems

Alexander Lavin , Ciarán M. Gilligan-Lee , Alessya Visnjic , Siddha Ganju , Dava Newman , Atılım Güneş Baydin , Sujoy Ganguly , Danny Lange , Amit Sharma , Stephan Zheng

Categories: Machine Learning | Artificial Intelligence

2021-01-11

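The development and deployment of machine learning (ML) systems can be carried out easily with modern tools, but the process is typically rushed and treated as a means to an end. This lack of diligence can lead to technical debt, scope creep and misaligned objectives, model misuse and failures, and expensive consequences. Engineered systems, on the other hand, follow well-defined processes and testing standards to streamline development toward high-quality, reliable results. The extreme case is spacecraft systems, where mission-critical measures and robustness are ingrained in the development process. Drawing on experience in both spacecraft engineering and ML (from research through to product, across domain areas), we have developed a proven systems engineering approach for machine learning development and deployment. Our "Machine Learning Technology Readiness Levels" (MLTRL) framework defines a principled process to ensure robust, reliable, and responsible systems while being streamlined for ML workflows, including key distinctions from traditional software engineering. Beyond that, MLTRL defines a common language for people across teams and organizations to work together on artificial intelligence and machine learning technologies. Here we describe the framework and illustrate it with several real-world use cases of developing ML methods from basic research through productization and deployment, in areas such as medical diagnostics, consumer computer vision, satellite imagery, and particle physics.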

Innovations in Integrating Machine Learning and Agent-Based Modeling of Biomedical Systems

Nikita Sivakumar , Cameron Mura , Shayn M. Peirce

Categories: Machine Learning

2022-06-02

Agent-based modeling (ABM) is a well-established paradigm for simulating complex systems via interactions between constituent entities. Machine learning (ML) refers to approaches whereby statistical algorithms 'learn' from data on their own, without imposing a priori theories of system behavior. Biological systems -- from molecules, to cells, to entire organisms -- consist of vast numbers of entities, governed by complex webs of interactions that span many spatiotemporal scales and exhibit nonlinearity, stochasticity and intricate coupling between entities. The macroscopic properties and collective dynamics of such systems are difficult to capture via continuum modelling and mean-field formalisms. ABM takes a 'bottom-up' approach that obviates these difficulties by enabling one to easily propose and test a set of well-defined 'rules' to be applied to the individual entities (agents) in a system. Evaluating a system and propagating its state over discrete time-steps effectively simulates the system, allowing observables to be computed and system properties to be analyzed. Because the rules that govern an ABM can be difficult to abstract and formulate from experimental data, there is an opportunity to use ML to help infer optimal, system-specific ABM rules. Once such rule-sets are devised, ABM calculations can generate a wealth of data, and ML can be applied there too -- e.g., to probe statistical measures that meaningfully describe a system's stochastic properties. As an example of synergy in the other direction (from ABM to ML), ABM simulations can generate realistic datasets for training ML algorithms (e.g., for regularization, to mitigate overfitting). In these ways, one can envision various synergistic ABM$\rightleftharpoons$ML loops. This review summarizes how ABM and ML have been integrated in contexts that span spatiotemporal scales, from cellular to population-level epidemiology.

Holistic Network Virtualization and Pervasive Network Intelligence for 6G

Xuemin Shen , Jie Gao , Wen Wu , Mushu Li , Conghao Zhou , Weihua Zhuang

Categories: Artificial Intelligence

2023-01-02

In this tutorial paper, we look into the evolution and prospect of network architecture and propose a novel conceptual architecture for the 6th generation (6G) networks. The proposed architecture has two key elements, i.e., holistic network virtualization and pervasive artificial intelligence (AI). The holistic network virtualization consists of network slicing and digital twin, from the aspects of service provision and service demand, respectively, to incorporate service-centric and user-centric networking. The pervasive network intelligence integrates AI into future networks from the perspectives of networking for AI and AI for networking, respectively. Building on holistic network virtualization and pervasive network intelligence, the proposed architecture can facilitate three types of interplay, i.e., the interplay between digital twin and network slicing paradigms, between model-driven and data-driven methods for network management, and between virtualization and AI, to maximize the flexibility, scalability, adaptivity, and intelligence for 6G networks. We also identify challenges and open issues related to the proposed architecture. By providing our vision, we aim to inspire further discussions and developments on the potential architecture of 6G.

MCDS: AI Augmented Workflow Scheduling in Mobile Edge Cloud Computing Systems

Shreshth Tuli , Giuliano Casale , Nicholas R. Jennings

Categories: Artificial Intelligence

2021-12-14

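Workflow scheduling is a long-studied problem in parallel and distributed computing (PDC) that aims to use compute resources efficiently to meet users' service requirements. Recently proposed scheduling methods leverage the low response times of edge computing platforms to optimize Quality of Service (QoS). However, scheduling workflow applications in mobile edge-cloud systems is challenging because of computational heterogeneity, the changing latencies of mobile devices, and the volatile nature of workload resource requirements. To overcome these difficulties, it is essential, but at the same time challenging, to develop a long-sighted optimization scheme that efficiently models the QoS objectives. In this work, we propose MCDS: Monte Carlo Learning using Deep Surrogate Models, to efficiently schedule workflow applications in mobile edge-cloud computing systems. MCDS is an artificial intelligence (AI)-based scheduling method that uses a tree-based search strategy and a deep-neural-network-based surrogate model to estimate the long-term QoS impact of immediate actions, enabling robust optimization of scheduling decisions. Experiments on physical and simulated edge-cloud testbeds show that MCDS improves on state-of-the-art methods in energy consumption, response time, SLA violations, and cost by at least 6.13, 4.56, 45.09, and 30.71 percent, respectively.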

Towards Green Automated Machine Learning: Status Quo and Future Directions

Tanja Tornede , Alexander Tornede , Jonas Hanselle , Marcel Wever , Felix Mohr , Eyke Hüllermeier

Categories: Machine Learning

2021-11-10

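Automated machine learning (AutoML) strives to automatically configure machine learning algorithms and compose them into an overall (software) solution, a machine learning pipeline, tailored to the learning task (dataset) at hand. Over the last decade, AutoML has become a popular research topic with hundreds of contributions. While AutoML offers many prospects, it is also known to be quite resource-intensive, which is one of the main criticisms leveled at it. The primary cause of its high resource consumption is that many approaches rely on the (costly) evaluation of many ML pipelines while searching for good candidates. This problem is amplified in research on AutoML methods, because large-scale experiments are conducted with many datasets and approaches, each run with several repetitions to rule out random effects. In the spirit of recent work on Green AI, this paper aims to raise AutoML researchers' awareness of the problem and to elaborate on possible remedies. To this end, we identify four categories of actions the community may take toward more sustainable AutoML research: approach design, benchmarking, research incentives, and transparency.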

Globus Automation Services: Research process automation across the space-time continuum

Ryan Chard , Jim Pruyne , Kurt McKee , Josh Bryan , Brigitte Raumann , Rachana Ananthakrishnan , Kyle Chard , Ian Foster

Categories: Artificial Intelligence

2022-08-19

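Research process automation, meaning the reliable, efficient, and reproducible execution of linked sets of actions on scientific instruments, computers, data stores, and other resources, has become an essential element of modern science. We report here on new services within the Globus research data management platform that enable diverse research processes to be specified as reusable sets of actions, called flows, and that execute such flows in heterogeneous research environments. To support flows with broad spatial extent (e.g., from a scientific instrument to a remote data center) and temporal extent (from seconds to weeks), these Globus automation services feature: 1) cloud hosting, for reliable execution of even long-lived flows despite sporadic failures; 2) a declarative notation and an extensible asynchronous action provider API, for defining and executing a wide variety of actions and flow specifications involving arbitrary resources; and 3) authorization delegation mechanisms, for secure invocation of actions. These services allow researchers to outsource and automate the management of a broad range of research tasks to a reliable, scalable, and secure cloud platform. We present use cases for Globus automation services.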

Edge-centric Optimization of Multi-modal ML-driven eHealth Applications

Anil Kanduri , Sina Shahhosseini , Emad Kasaeyan Naeini , Hamidreza Alikhani , Pasi Liljeberg , Nikil Dutt , Amir M. Rahmani

Categories: Machine Learning

2022-08-04

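Smart eHealth applications deliver personalized and preventive digital healthcare services to clients through remote sensing, continuous monitoring, and data analytics. These applications sense input data from multiple modalities, transmit the data to edge and/or cloud nodes, and process the data with compute-intensive machine learning (ML) algorithms. A continuous stream of noisy input data, unreliable network connections, the computational requirements of ML algorithms, and the choice of compute placement among the sensor, edge, and cloud layers all affect the efficiency of ML-driven eHealth applications. In this chapter, we present techniques for optimized compute placement, exploration of accuracy-performance trade-offs, and cross-layer sense-compute co-optimization for ML-driven eHealth applications. We demonstrate practical use cases of smart eHealth applications in everyday settings through a case study on objective pain assessment using a sensor-edge-cloud framework.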

Systems Challenges for Trustworthy Embodied Systems

Harald Ruess

Categories: Artificial Intelligence

2022-01-10

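A new generation of increasingly autonomous and self-learning systems, which we call embodied systems, is about to be developed. In deploying these systems into real-life contexts, we face various engineering challenges: it is crucial to coordinate the behavior of embodied systems in a beneficial manner, to ensure their compatibility with our human-centered social values, and to design verifiably safe and reliable human-machine interaction. We argue that traditional systems engineering is approaching a turning point in the move from embedded to embodied systems, and in assuring the trustworthiness of dynamic federations of situationally aware, intent-driven, explorative, ever-evolving, largely unpredictable, and increasingly autonomous embodied systems in uncertain, complex, and unpredictable real-world environments. We also identify a number of urgent systems challenges for trustworthy embodied systems, including robust and human-centric AI, cognitive architectures, uncertainty quantification, trustworthy self-integration, and continual analysis and assurance.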
