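Cloud auto-scaling mechanisms are typically based on reactive automation rules that scale a cluster whenever some metric, e.g., the average CPU usage among instances, exceeds a predefined threshold. Tuning these rules becomes particularly cumbersome when scaling up a cluster involves non-negligible time to bootstrap new instances, as frequently happens in production cloud services. To deal with this problem, we propose an architecture for auto-scaling cloud services based on the state into which the system is expected to evolve in the near future. Our approach leverages time-series forecasting techniques, such as those based on machine learning and artificial neural networks, to predict the future dynamics of key metrics, e.g., resource consumption metrics, and applies a threshold-based scaling policy to them. The result is a predictive automation policy that is able, for instance, to automatically anticipate peaks in the load of a cloud application and trigger appropriate scaling actions ahead of time to accommodate the expected increase in traffic. We implemented our approach as an open-source OpenStack component, which relies on, and extends, the monitoring capabilities offered by Monasca, adding predictive metrics that can be leveraged by orchestration components such as Heat or Senlin. We show experimental results using a recurrent neural network and a multi-layer perceptron as predictors, compared with a simple linear regression and a traditional non-predictive auto-scaling policy. However, the proposed framework allows the prediction policy to be easily customized as needed.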
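As a rough illustration of the idea in the abstract above, the following minimal sketch applies a threshold rule to a forecast value rather than to the latest sample. It uses a plain least-squares line as the predictor (the abstract mentions linear regression only as a comparison baseline). The function names, the 80%/20% thresholds, and the sample values are assumptions made for this example and are not taken from the paper or from Monasca.

```python
from typing import Sequence


def forecast_linear(samples: Sequence[float], horizon: int) -> float:
    """Fit a least-squares line through samples (x = 0..n-1) and
    extrapolate it `horizon` steps past the last sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + horizon)


def scaling_decision(
    samples: Sequence[float],
    horizon: int,
    scale_out_above: float = 80.0,  # hypothetical CPU % threshold
    scale_in_below: float = 20.0,   # hypothetical CPU % threshold
) -> str:
    """Apply a threshold policy to the predicted metric, not the current one."""
    predicted = forecast_linear(samples, horizon)
    if predicted > scale_out_above:
        return "scale_out"
    if predicted < scale_in_below:
        return "scale_in"
    return "hold"


if __name__ == "__main__":
    cpu = [50.0, 55.0, 60.0, 65.0, 70.0]  # illustrative average CPU usage (%)
    print(scaling_decision(cpu, horizon=0))  # decision on the fitted current value
    print(scaling_decision(cpu, horizon=3))  # decision on the value 3 steps ahead
```

With a rising load, the rule evaluated on the forecast can fire before the current value crosses the threshold, which is what leaves time for new instances to boot. A neural predictor could be substituted for `forecast_linear` without changing the policy function.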
As the number of distributed services (or microservices) of cloud-native applications grows, resource management becomes a challenging task. These applications tend to be user-facing and latency-sensitive, and our goal is to continuously minimize the amount of CPU resources allocated while still satisfying the application latency SLO. Although previous efforts have proposed simple heuristics and sophisticated ML-based techniques, we believe that a practical resource manager should accurately scale CPU resources for diverse applications, with minimum human efforts and operation overheads. To this end, we ask: can we systematically break resource management down to subproblems solvable by practical policies? Based on the notion of CPU-throttle-based performance target, we decouple the mechanisms of SLO feedback and resource control, and implement a two-level framework -- Autothrottle. It combines a lightweight learned controller at the global level, and agile per-microservice controllers at the local level. We evaluate Autothrottle on three microservice applications, with both short-term and 21-day production workload traces. Empirical results show Autothrottle's superior CPU core savings up to 26.21% over the best-performing baselines across applications, while maintaining the latency SLO.
Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), begins to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. A dedicated venue for collecting and summarizing the latest advances of EVA is highly desired by the community. Besides, the basic concepts of EVA (e.g., definition, architectures, etc.) are ambiguous and neglected by these surveys due to the rapid development of this domain. A thorough clarification is needed to facilitate a consensus on these concepts. To fill in these gaps, we conduct a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.
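Computer architecture and systems have long been optimized for the efficient execution of machine learning (ML) models. Now it is time to reconsider the relationship between ML and systems, and let ML transform the way computer architecture and systems are designed. This has a twofold meaning: improving designers' productivity, and completing the virtuous cycle. In this paper, we present a comprehensive review of work that applies ML to computer architecture and system design. First, we consider the typical roles that ML techniques play in architecture/system design, i.e., either fast predictive modeling or design methodology, and we build a high-level taxonomy. Then, we summarize the common problems in computer architecture/system design that can be solved by ML techniques, and the typical ML techniques employed to solve each of them. In addition to emphasizing computer architecture in the narrow sense, we adopt the view that data centers can be regarded as warehouse-scale computers; brief discussions are provided of adjacent computer systems topics, such as code generation and compilers; we also pay attention to how ML techniques can aid and transform design automation. We further provide a future vision of opportunities and potential directions, and envision that applying ML to computer architecture and systems will thrive in the community.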
In this tutorial paper, we look into the evolution and prospect of network architecture and propose a novel conceptual architecture for the 6th generation (6G) networks. The proposed architecture has two key elements, i.e., holistic network virtualization and pervasive artificial intelligence (AI). The holistic network virtualization consists of network slicing and digital twin, from the aspects of service provision and service demand, respectively, to incorporate service-centric and user-centric networking. The pervasive network intelligence integrates AI into future networks from the perspectives of networking for AI and AI for networking, respectively. Building on holistic network virtualization and pervasive network intelligence, the proposed architecture can facilitate three types of interplay, i.e., the interplay between digital twin and network slicing paradigms, between model-driven and data-driven methods for network management, and between virtualization and AI, to maximize the flexibility, scalability, adaptivity, and intelligence for 6G networks. We also identify challenges and open issues related to the proposed architecture. By providing our vision, we aim to inspire further discussions and developments on the potential architecture of 6G.
In recent years, the exponential proliferation of smart devices with their intelligent applications poses severe challenges on conventional cellular networks. Such challenges can be potentially overcome by integrating communication, computing, caching, and control (i4C) technologies. In this survey, we first give a snapshot of different aspects of the i4C, comprising background, motivation, leading technological enablers, potential applications, and use cases. Next, we describe different models of communication, computing, caching, and control (4C) to lay the foundation of the integration approach. We review current state-of-the-art research efforts related to the i4C, focusing on recent trends of both conventional and artificial intelligence (AI)-based integration approaches. We also highlight the need for intelligence in resources integration. Then, we discuss integration of sensing and communication (ISAC) and classify the integration approaches into various classes. Finally, we propose open challenges and present future research directions for beyond 5G networks, such as 6G.
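Radio access network (RAN) technology continues to witness massive growth, with Open RAN (O-RAN) gaining the most recent momentum. In the O-RAN specifications, the RAN intelligent controller (RIC) serves as an automation host. This article introduces principles of machine learning (ML), in particular reinforcement learning (RL), that are relevant to the O-RAN stack. Furthermore, we review state-of-the-art research in wireless networks and map it onto the RAN framework and the hierarchy of the O-RAN architecture. We provide a taxonomy of the challenges faced by ML/RL models throughout the development life cycle: from system specification to production deployment (data acquisition, model design, testing and management, etc.). To address these challenges, we integrate a set of existing MLOps principles with the unique characteristics that arise when RL agents are considered. This paper discusses a systematic life-cycle pipeline for model development, testing, and validation, termed RLOps. We discuss all the fundamental parts of RLOps, including: model specification, development and distillation, production environment serving, operations monitoring, safety/security, and the data engineering platform. Based on these principles, we propose best practices for achieving an automated and reproducible model development process.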
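Research process automation (the reliable, efficient, and reproducible execution of actions on scientific instruments, computers, data stores, and other resources) has become an essential element of modern science. We report here on new services within the Globus research data management platform that enable diverse research processes to be specified as reusable sets of actions, called flows, and that execute such flows in heterogeneous research environments. To support flows with broad spatial extent (e.g., from a scientific instrument to a remote data center) and temporal extent (from seconds to weeks), these Globus automation services feature: 1) cloud hosting, for reliable execution of even long-lived flows despite sporadic failures; 2) a declarative notation and an extensible asynchronous action provider API, for defining and executing a wide variety of action and flow specifications involving arbitrary resources; and 3) authorization delegation mechanisms, for the secure invocation of actions. These services allow researchers to outsource and automate the management of a broad range of research tasks to a reliable, scalable, and secure cloud platform. We present use cases for the Globus automation services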
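Latency-critical (LC) services have been widely deployed in cloud environments. For cost efficiency, multiple services are usually co-located on a server. Run-time resource scheduling therefore becomes the pivot of QoS control in these complicated co-location cases. However, the scheduling exploration space grows rapidly as server resources increase, making it hard for schedulers to provide ideal solutions quickly. More importantly, we observe that there are "resource cliffs" in the scheduling exploration space. They hurt exploration efficiency and always lead to severe QoS fluctuations. Resource cliffs cannot be easily avoided by previous schedulers. To address these problems, we propose a novel ML-based intelligent scheduler, OSML. It learns the correlation between architectural hints (e.g., IPC, cache misses, memory footprint, etc.), scheduling solutions, and QoS demands, based on a dataset we collected from 11 widely deployed services running on off-the-shelf servers. OSML employs multiple ML models that work collaboratively to predict QoS variations, steer the scheduling, and recover from QoS violations in complicated co-location cases. OSML can intelligently avoid resource cliffs during scheduling and reaches an optimal solution much faster than previous approaches for co-located LC services. Experimental results show that, compared with previous studies, OSML supports higher loads and meets QoS targets with lower overhead and shorter convergence time.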
With the help of cloud computing, the core of the computer business now offers subscription-based, on-demand services. Virtualization, which creates a virtual instance of a computer system running on an abstracted hardware layer, makes it possible to share resources among multiple users. In contrast to early distributed computing models, cloud computing provides seemingly unlimited computing capabilities through its massive datacenters, and it has become extremely popular in recent years due to its continually growing infrastructure, user base, and hosted data volume. This article proposes a conceptual framework for a workload management paradigm in cloud settings that is both secure and performance-efficient. In this paradigm, a resource management unit performs energy- and performance-efficient virtual machine allocation, ensuring the safe execution of users' applications and protecting against data breaches caused by unauthorised virtual machine access in real time. A secure virtual machine management unit controls the resource management unit and is designed to produce data on unauthorised access or intercommunication. Additionally, a workload analyzer unit runs concurrently to estimate resource consumption data, helping the resource management unit allocate virtual machines more effectively. The proposed model uses several distinct functions to serve the same objective, including data encryption and decryption prior to transfer and a trust access mechanism to prevent unauthorised access to virtual machines, which incurs extra computational overhead.
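Event handling is the cornerstone of a dynamic and responsive Internet of Things (IoT). Recent approaches in this area are based on representational state transfer (REST) principles, which allow event handling tasks to be placed on any device that follows the same principles. However, the tasks should be properly distributed among edge devices to ensure fair resource utilization and guarantee seamless execution. This article investigates the use of deep learning to distribute the tasks fairly. An attention-based neural network model is proposed to generate efficient load balancing solutions under different scenarios. The proposed model is based on the Transformer and Pointer Network architectures, and is trained with an advantage actor-critic reinforcement learning algorithm. The model is designed to scale with the number of event handling tasks and the number of edge devices, with no need for hyperparameter re-tuning or even retraining. Extensive experimental results show that the proposed model outperforms conventional heuristics on many key performance indicators. The generic design and the obtained results suggest that the proposed model can potentially be applied to several other variations of the load balancing problem, which makes the proposal an attractive option for real-world scenarios due to its scalability and efficiency.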
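Microservice-based architecture has become prevalent for cloud-native applications. With an increasing number of applications that use this architecture being deployed on cloud platforms every day, more research effort is needed to understand how different strategies can be applied to manage various cloud resources effectively. A large body of research has deployed automatic resource allocation algorithms using reactive and proactive autoscaling policies. However, there is still a gap in the efficiency of current algorithms in capturing the important features of microservices from their architecture and deployment environment, for example, the lack of consideration of graph dependencies. To address this challenge, we propose Graph-PHPA, a graph-based proactive horizontal pod autoscaling strategy for allocating cloud resources to microservices, which leverages long short-term memory (LSTM) and graph neural network (GNN) based prediction methods. We evaluate the performance of Graph-PHPA using the Bookinfo microservices deployed in a dedicated testing environment, with real-time workloads generated from realistic datasets. We demonstrate the efficacy of Graph-PHPA by comparing it with the rule-based resource allocation scheme in Kubernetes. Extensive experiments have been carried out, and our results illustrate the superiority of our proposed approach in resource savings over the reactive rule-based baseline algorithm in different test scenarios.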
Energy consumption in buildings, both residential and commercial, accounts for approximately 40% of all energy usage in the U.S., and similar numbers are being reported from countries around the world. This significant amount of energy is used to maintain a comfortable, secure, and productive environment for the occupants. It is therefore crucial that energy consumption in buildings be optimized while maintaining satisfactory levels of occupant comfort, health, and safety. Recently, machine learning has proven to be an invaluable tool for deriving important insights from data and optimizing various systems. In this work, we review the ways in which machine learning has been leveraged to make buildings smart and energy-efficient. For the convenience of readers, we provide a brief introduction to several machine learning paradigms and to the components and functioning of each smart building system we cover. Finally, we discuss challenges faced while implementing machine learning algorithms in smart buildings and provide future avenues for research at the intersection of smart buildings and machine learning.
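The aviation industry, as well as the industries that benefit from and are linked to it, is ripe for innovation in the form of Big Data analytics. The number of available big data technologies is constantly growing, while at the same time the existing ones are rapidly evolving and being empowered with new features. However, the Big Data era imposes the crucial challenge of how to handle information security effectively while managing massive and rapidly evolving data from heterogeneous data sources. While multiple technologies have emerged, there is a need to find a balance between multiple security requirements, privacy obligations, system performance, and rapid dynamic analysis of large datasets. This paper introduces the ICARUS Secure Experimentation Sandbox of the ICARUS platform. The ICARUS platform aims to provide a big-data-enabled platform that aspires to become a "one-stop shop" for the aviation data and intelligence marketplace, offering a trusted and secure "sandboxed" analytics workspace that allows the exploration, integration, and deep analysis of original and derivative data in a trusted and fair manner. To this end, a Secure Experimentation Sandbox has been designed and integrated into the ICARUS platform offering. It provides a sophisticated environment that can fully guarantee the security and confidentiality of data, allowing any interested party to use the platform to conduct analytical experiments under closed-lab conditions.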
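With the growing adoption of self-adaptive systems in various domains, there is an increasing need for strategies to assess their correct behavior. In particular, self-healing systems, which aim to provide resilience and fault tolerance, often deal with unanticipated failures in critical and highly dynamic environments. Their reactive and complex behavior makes it challenging to assess whether these systems execute according to the desired goals. Recently, several studies have expressed concern about the lack of systematic evaluation methods for self-healing behavior. In this paper, we propose CHESS, an approach for the systematic evaluation of self-adaptive and self-healing systems that builds on chaos engineering. Chaos engineering is a methodology for subjecting a system to unexpected conditions and scenarios. It has shown great promise in helping developers build resilient microservice architectures and cyber-physical systems. CHESS turns this idea around by using chaos engineering to evaluate how well a self-healing system can withstand such perturbations. We investigate the viability of this approach through an exploratory study on a self-healing smart office environment. The study helps us explore the promises and limitations of the approach and identify directions where additional work is needed. We conclude with a summary of lessons learned.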
Explainable Artificial Intelligence (XAI) is transforming the field of Artificial Intelligence (AI) by enhancing the trust of end-users in machines. As the number of connected devices keeps growing, the Internet of Things (IoT) market needs to be trustworthy for end-users. However, the existing literature still lacks a systematic and comprehensive survey of the use of XAI for IoT. To bridge this gap, in this paper we examine XAI frameworks with a focus on their characteristics and support for IoT. We illustrate the widely used XAI services for IoT applications, such as security enhancement, Internet of Medical Things (IoMT), Industrial IoT (IIoT), and Internet of City Things (IoCT). We also suggest implementation choices for XAI models over IoT systems in these applications, with appropriate examples, and summarize the key inferences for future work. Moreover, we present cutting-edge developments in edge XAI structures and the support of sixth-generation (6G) communication services for IoT applications, along with key inferences. In a nutshell, this paper constitutes the first holistic compilation on the development of XAI-based frameworks tailored to the demands of future IoT use cases.
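This paper presents the design and implementation of CAIR: a cloud system for knowledge-based autonomous interaction devised for social robots and other conversational agents. The system is particularly convenient for low-cost robots and devices. It gives developers a sustainable solution for managing verbal and non-verbal interaction through a network connection, with about 3,000 conversation topics ready for "chit-chatting" and a library of pre-cooked plans that only need to be grounded in the robot's physical capabilities. The system is structured as a set of REST API endpoints, so it can be easily extended by adding new APIs to improve the capabilities of the clients connected to the cloud. Another key feature of the system is that it has been designed to make client development straightforward: in this way, multiple devices can easily be endowed with the capability of interacting autonomously with the user, understanding when to perform specific actions, and exploiting all the information provided by cloud services. The article outlines and discusses the results of experiments performed to assess the system's performance in terms of response time, paving the way for research and market solutions. Links to repositories with clients for ROS and for popular robots such as Pepper and NAO are provided.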
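Modern software systems and products increasingly rely on machine learning models to make data-driven decisions based on interactions with users and with systems, e.g., compute infrastructure. For broader adoption, this practice must (i) accommodate software engineers without an ML background, and (ii) provide mechanisms for optimizing product goals. In this work, we describe general principles and a specific end-to-end ML platform, Looper, which offers easy-to-use APIs for decision-making and feedback collection. Looper supports the full end-to-end ML life cycle, from online data collection to model training, deployment, and inference, and extends support to evaluation and tuning against product goals. We outline the platform architecture and the overall impact of production deployment: Looper currently hosts 700 ML models and makes 6 million decisions per second. We also describe the learning curve and summarize the experiences of platform adopters.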
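Cyber threat intelligence (CTI) sharing is an important activity for reducing information asymmetries between attackers and defenders. However, this activity presents challenges due to the tension between data sharing and confidentiality, which results in information retention and often leads to a free-rider problem. Therefore, the information that is shared represents only the tip of the iceberg. The current literature assumes access to centralized databases containing all the information, but this is not always feasible because of the aforementioned tension. This results in unbalanced or incomplete datasets, which require techniques to expand them. We show how these techniques lead to biased results and misleading performance expectations. We propose a novel framework for extracting CTI from distributed data on incidents, vulnerabilities, and indicators of compromise, and demonstrate its use in several practical scenarios in conjunction with the Malware Information Sharing Platform (MISP). Policy implications for CTI sharing are presented and discussed. The proposed system relies on an efficient combination of privacy-enhancing technologies and federated processing. This lets organizations stay in control of their CTI and minimize the risk of exposure or leakage, while enabling the benefits of sharing, more accurate and representative results, and more effective predictive and preventive defenses.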
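The unprecedented surge of data volume in wireless networks empowered with artificial intelligence (AI) opens up new horizons for providing ubiquitous data-driven intelligent services. Traditional cloud-centric machine learning (ML)-based services are implemented by collecting datasets and training models centrally. However, this conventional training technique faces two challenges: (i) high communication and energy costs due to increased data communication, and (ii) threats to data privacy from allowing untrusted parties to use this information. Recently, in light of these limitations, a new technique called federated learning (FL) has emerged to bring ML to the edge of wireless networks. FL can extract the benefits of data silos by training a global model in a distributed manner, orchestrated by an FL server. FL exploits both the decentralized datasets and the computing resources of participating clients to develop a generalized ML model without compromising data privacy. In this article, we present a comprehensive survey of the fundamentals and enabling technologies of FL. Moreover, an extensive study is presented detailing various applications of FL in wireless networks and highlighting their challenges and limitations. The efficacy of FL is further explored in emerging prospective beyond-fifth-generation (B5G) and sixth-generation (6G) communication systems. The purpose of this survey is to provide an overview of the state of the art of FL applications in key wireless technologies, which will serve as a foundation for establishing a firm understanding of the topic. Lastly, we offer a road forward for future research directions.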