360-degree panoramic videos have gained considerable attention in recent years due to the rapid development of head-mounted displays (HMDs) and panoramic cameras. One major problem in streaming panoramic videos is that they are much larger than traditional videos. Moreover, user devices often operate in a wireless environment, with limited battery, computation power, and bandwidth. To reduce resource consumption, researchers have proposed ways to predict users' viewports so that only part of the video needs to be transmitted from the server. However, the robustness of such prediction approaches has been overlooked in the literature: it is usually assumed that only a few models, pre-trained on past users' experiences, are applied for prediction to all users. We observe that those pre-trained models can perform poorly for some users, because their behavior may differ drastically from that of the majority and because the pre-trained models cannot capture the features of unseen videos. In this work, we propose a novel meta-learning-based viewport prediction paradigm to improve the worst-case prediction performance and ensure the robustness of viewport prediction. This paradigm uses two machine learning models: the first predicts the viewing direction, and the second predicts the minimum video prefetch size that can include the actual viewport. We first train two meta models so that they are sensitive to new training data, and then quickly adapt them to users while they are watching the videos. Evaluation results reveal that the meta models adapt quickly to each user and significantly increase the prediction accuracy, especially for the worst-performing predictions.
translated by Google Translate
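As data generation increasingly takes place on devices without a wired connection, machine learning (ML) related traffic will be ubiquitous in wireless networks. Many studies have shown that traditional wireless protocols are highly inefficient or unsustainable for supporting ML, which creates the need for new wireless communication methods. In this survey, we give an exhaustive review of the state-of-the-art wireless methods that are specifically designed to support ML services over distributed datasets. Currently, there are two clear themes in the literature: analog over-the-air computation and digital radio resource management optimized for ML. This survey gives a comprehensive introduction to these methods, reviews the most important works, highlights open problems, and discusses application scenarios.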
Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), begins to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. A dedicated venue for collecting and summarizing the latest advances of EVA is highly desired by the community. Besides, the basic concepts of EVA (e.g., definition, architectures, etc.) are ambiguous and neglected by these surveys due to the rapid development of this domain. A thorough clarification is needed to facilitate a consensus on these concepts. To fill in these gaps, we conduct a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.
In recent years, mobile devices have been equipped with increasingly advanced sensing and computing capabilities. Coupled with advancements in Deep Learning (DL), this opens up countless possibilities for meaningful applications, e.g., for medical purposes and in vehicular networks. Traditional cloud-based Machine Learning (ML) approaches require the data to be centralized in a cloud server or data center. However, this results in critical issues related to unacceptable latency and communication inefficiency. To this end, Mobile Edge Computing (MEC) has been proposed to bring intelligence closer to the edge, where data is produced. However, conventional enabling technologies for ML at mobile edge networks still require personal data to be shared with external parties, e.g., edge servers. Recently, in light of increasingly stringent data privacy legislation and growing privacy concerns, the concept of Federated Learning (FL) has been introduced. In FL, end devices use their local data to train an ML model required by the server. The end devices then send the model updates, rather than raw data, to the server for aggregation. FL can serve as an enabling technology in mobile edge networks since it enables the collaborative training of an ML model and also enables DL for mobile edge network optimization. However, a large-scale and complex mobile edge network involves heterogeneous devices with varying constraints. This raises challenges of communication costs, resource allocation, and privacy and security in the implementation of FL at scale. In this survey, we begin with an introduction to the background and fundamentals of FL. Then, we highlight the aforementioned challenges of FL implementation and review existing solutions. Furthermore, we present the applications of FL for mobile edge network optimization. Finally, we discuss the important challenges and future research directions in FL.
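In recent years, with the wide spread of sensors and smart devices, the data generation speed of Internet of Things (IoT) systems has increased dramatically. In IoT systems, massive volumes of data must be processed, transformed, and analyzed frequently to enable various IoT services and functionalities. Machine learning (ML) approaches have shown their capacity for IoT data analytics. However, applying ML models to IoT data analytics tasks still faces many difficulties and challenges, specifically effective model selection, design/tuning, and updating, which place heavy demands on experienced data scientists. Additionally, the dynamic nature of IoT data may introduce concept drift issues, causing model performance degradation. To reduce human effort, Automated Machine Learning (AutoML) has become a popular field that aims to automatically select, construct, tune, and update machine learning models to achieve the best performance on specified tasks. In this paper, we review existing methods for the model selection, tuning, and updating procedures in the area of AutoML in order to identify and summarize the optimal solutions for every step of applying ML algorithms to IoT data analytics. To justify our findings and help industrial users and researchers better implement AutoML approaches, a case study of applying AutoML to IoT anomaly detection problems is conducted in this work. Lastly, we discuss and classify the challenges and research directions for this domain.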
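Non-intrusive load monitoring (NILM) is the task of disaggregating the total power consumption into its individual sub-components. Over the years, signal processing and machine learning algorithms have been combined to achieve this. Many publications and extensive research works have addressed state-of-the-art methods. The scientific community's initial interest in formulating and describing the NILM problem using machine learning tools has shifted toward a more practical NILM. Nowadays, we are in the mature NILM period, in which NILM is being applied in real-life application scenarios. Thus, algorithmic complexity, transferability, reliability, practicality, and trustworthiness in general are the main issues of interest. This review narrows the gap between the early, immature NILM era and the mature one. In particular, the paper provides a comprehensive literature review of NILM methods for residential appliances only. It analyzes, summarizes, and presents the outcomes of a large number of recently published scholarly articles. It also discusses the highlights of these methods and introduces the research dilemmas that researchers should take into consideration when applying NILM methods. Finally, we show the need for transferring the traditional classification models into a practical and trustworthy framework.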
The International Workshop on Reading Music Systems (WoRMS) is a workshop that tries to connect researchers who develop systems for reading music, such as in the field of Optical Music Recognition, with other researchers and practitioners that could benefit from such systems, like librarians or musicologists. The relevant topics of interest for the workshop include, but are not limited to: Music reading systems; Optical music recognition; Datasets and performance evaluation; Image processing on music scores; Writer identification; Authoring, editing, storing and presentation systems for music scores; Multi-modal systems; Novel input-methods for music to produce written music; Web-based Music Information Retrieval services; Applications and projects; Use-cases related to written music. These are the proceedings of the 3rd International Workshop on Reading Music Systems, held in Alicante on the 23rd of July 2021.
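Recently, efforts have been made to standardize signal phase and timing (SPaT) messages. These messages contain the signal phase timings of all signalized intersection approaches. This information can therefore be used for efficient motion planning, resulting in more homogeneous traffic flows and uniform speed profiles. Despite efforts to provide robust predictions for semi-actuated signal control systems, predicting signal phase timings for fully-actuated controls remains challenging. This paper proposes a time series prediction framework using aggregated traffic signal and loop detector data. We utilize state-of-the-art machine learning models to predict the duration of future signal phases. The performance of a Linear Regression (LR), a Random Forest (RF), and a Long Short-Term Memory (LSTM) neural network is assessed against a naive baseline model. Results based on an empirical data set from a fully-actuated signal control system in Zurich, Switzerland, show that the machine learning models outperform conventional prediction methods. Furthermore, tree-based decision models such as the RF perform best, with an accuracy that meets the requirements for practical applications.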
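The evaluation idea in this abstract, comparing a learned predictor against a naive baseline on phase durations, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the duration values are invented, the "naive" baseline is taken to be a persistence forecast (repeat the last observed duration), and the regression uses only a single lag feature rather than the aggregated signal and loop detector data the paper describes.

```python
from typing import List, Sequence, Tuple


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(train: Sequence[float], test: Sequence[float]) -> List[float]:
    """Persistence baseline: each forecast is the previously observed duration."""
    return [train[-1], *test[:-1]]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("xs must contain at least two distinct values")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def lag1_regression_forecast(train: Sequence[float], test: Sequence[float]) -> List[float]:
    """Fit next = slope * previous + intercept on train, then forecast each test step."""
    slope, intercept = fit_line(train[:-1], train[1:])
    return [slope * prev + intercept for prev in naive_forecast(train, test)]


# Hypothetical green-phase durations in seconds (not data from the paper).
durations = [20.0, 24.0, 22.0, 26.0, 24.0, 28.0, 26.0, 30.0]
train, test = durations[:6], durations[6:]
print("naive MAE:", mae(test, naive_forecast(train, test)))
print("lag-1 regression MAE:", mae(test, lag1_regression_forecast(train, test)))
```

Both forecasters see only past observations at each step, so their errors on the held-out tail are directly comparable.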
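Nowadays, wireless communication is rapidly reshaping entire industry sectors. In particular, mobile edge computing (MEC), an enabling technology for the Industrial Internet of Things (IIoT), brings powerful computing/storage infrastructure closer to the mobile terminals and thereby significantly lowers the response latency. To reap the benefit of proactive caching at the network edge, precise knowledge of content popularity among the end devices is essential. However, in many IIoT scenarios, the complex nature of content popularity, together with data-privacy concerns, poses formidable challenges to its acquisition. In this article, we propose an unsupervised and privacy-preserving popularity prediction framework for MEC-enabled IIoT. The concepts of local and global popularity are introduced, and the time-varying popularity of each user is modelled as a model-free Markov chain. On this basis, a novel unsupervised recurrent federated learning (URFL) algorithm is proposed to predict the distributed popularity while achieving privacy preservation and unsupervised training. Simulations indicate that the proposed framework can enhance the prediction accuracy, in terms of reduced root-mean-squared error, by up to $60.5\%-68.7\%$. Additionally, manual labeling and violations of users' data privacy are both avoided.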
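The unprecedented surge of data volume in wireless networks empowered with artificial intelligence (AI) opens up new horizons for providing ubiquitous data-driven intelligent services. Traditional cloud-centric machine learning (ML)-based services are implemented by collecting datasets and training models centrally. However, this conventional training technique faces two challenges: (i) high communication and energy costs due to increased data communication, and (ii) threats to data privacy, since untrusted parties may be able to exploit this information. Recently, in light of these limitations, an emerging technique called federated learning (FL) has arisen to bring ML to the edge of wireless networks. FL can extract the benefits of data silos by training a global model in a distributed manner, orchestrated by the FL server. FL exploits both the decentralized datasets and the computing resources of participating clients to develop a generalized ML model without compromising data privacy. In this article, we present a comprehensive survey of the fundamentals and enabling technologies of FL. Moreover, an extensive study is presented detailing various applications of FL in wireless networks and highlighting their challenges and limitations. The efficacy of FL is further explored in emerging and prospective beyond-fifth-generation (B5G) and sixth-generation (6G) communication systems. The purpose of this survey is to provide an overview of FL techniques in key wireless technologies that will serve as a foundation for establishing a firm understanding of the topic. Lastly, we offer a road forward for future research directions.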
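Adaptive video streaming relies on the construction of efficient bitrate ladders to deliver the best possible visual quality to viewers under bandwidth constraints. The traditional method of content-dependent bitrate ladder selection requires a video shot to be pre-encoded with multiple encoding parameters to find the optimal operating points given by the convex hull of the resulting rate-quality curves. However, this pre-encoding step is equivalent to an exhaustive search over the space of possible encoding parameters, which causes significant overhead in terms of both computation and time. To reduce this overhead, we propose a deep learning-based method of content-aware convex hull prediction. We employ a recurrent convolutional network (RCN) to implicitly analyze the spatiotemporal complexity of video shots in order to predict their convex hulls. A two-step transfer learning scheme is adopted to train our proposed RCN-based model, which ensures sufficient content diversity to analyze scene complexity while also capturing the scene statistics of pristine source videos. Our experimental results reveal that our proposed model yields better approximations of the optimal convex hulls and offers competitive time savings compared to existing approaches. On average, our method reduces the pre-encoding time by 58.0%, while the average Bjontegaard delta bitrate (BD-rate) of the predicted convex hulls against the ground truth is 0.08%, and the mean absolute deviation of the BD-rate distribution is 0.44%.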
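To make the "convex hull of the rate-quality curves" concrete, here is a minimal sketch of how the optimal operating points could be selected from exhaustively pre-encoded candidates. It is not the paper's prediction model: it illustrates only the brute-force reference step that the prediction is meant to replace. The `(bitrate, quality)` values and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (bitrate in kbps, quality score), both hypothetical units


def cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of (a - o) and (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def rate_quality_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of rate-quality points, trimmed at the quality peak."""
    # Keep only the best quality observed at each bitrate.
    best: Dict[float, float] = {}
    for rate, quality in points:
        best[rate] = max(quality, best.get(rate, float("-inf")))
    hull: List[Point] = []
    for p in sorted(best.items()):
        # Drop the last hull point while it lies on or below the chord to p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    if not hull:
        return []
    # Spending more bitrate for less quality is never useful, so cut after the peak.
    peak = max(range(len(hull)), key=lambda i: hull[i][1])
    return hull[: peak + 1]


# Hypothetical encodings of one shot across several resolutions and quantizers.
encodings = [(500, 30.0), (1000, 36.0), (1500, 37.0),
             (2000, 41.0), (3000, 42.0), (4000, 41.5)]
ladder = rate_quality_hull(encodings)
print(ladder)
```

In this toy input the 1500 kbps encoding is discarded because it falls below the line joining its neighbors, and the 4000 kbps encoding is discarded because it costs more bitrate than the 3000 kbps one while scoring lower.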
Deep learning-based physical-layer secret key generation (PKG) has been used to overcome the imperfect uplink/downlink channel reciprocity in frequency division duplexing (FDD) orthogonal frequency division multiplexing (OFDM) systems. However, existing efforts have focused on key generation for users in a specific environment where the training samples and test samples follow the same distribution, which is unrealistic for real-world applications. This paper formulates the PKG problem in multiple environments as a learning-based problem, in which knowledge such as data and models learned from known environments is used to generate keys quickly and efficiently in multiple new environments. Specifically, we propose deep transfer learning (DTL) and meta-learning-based channel feature mapping algorithms for key generation. The two algorithms use different training methods to pre-train the model in the known environments, and then quickly adapt and deploy the model to new environments. Simulation results show that, compared with methods without adaptation, the DTL and meta-learning algorithms can both improve the performance of the generated keys. In addition, the complexity analysis shows that the meta-learning algorithm can achieve better performance than the DTL algorithm in less time and with fewer CPU and GPU resources.
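Virtual reality (VR) videos (typically in the form of $360^\circ$ videos) have gained increasing attention due to the fast development of VR technologies and the remarkable popularization of consumer-grade $360^\circ$ cameras and displays. It is therefore pivotal to understand how people perceive user-generated VR videos, which may suffer from commingled authentic distortions that are often localized in space and time. In this paper, we establish one of the largest $360^\circ$ video databases, containing 502 user-generated videos with rich content and distortion diversity. We capture the viewing behaviors (i.e., scanpaths) of 139 users and collect their opinion scores under four different viewing conditions (two starting points $\times$ two exploration times). We provide a thorough statistical analysis of the recorded data, resulting in several interesting observations, such as the significant impact of viewing conditions on viewing behaviors and perceived quality. In addition, we explore other uses of our data and analysis, including the evaluation of computational models for quality assessment and saliency detection of $360^\circ$ videos. We have made the dataset and code available at https://github.com/yao-yiru/vr-video-database.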
Emerging technologies and applications including Internet of Things (IoT), social networking, and crowd-sourcing generate large amounts of data at the network edge. Machine learning models are often built from the collected data, to enable the detection, classification, and prediction of future events. Due to bandwidth, storage, and privacy concerns, it is often impractical to send all the data to a centralized location. In this paper, we consider the problem of learning model parameters from data distributed across multiple edge nodes, without sending raw data to a centralized place. Our focus is on a generic class of machine learning models that are trained using gradient-descent-based approaches. We analyze the convergence bound of distributed gradient descent from a theoretical point of view, based on which we propose a control algorithm that determines the best trade-off between local update and global parameter aggregation to minimize the loss function under a given resource budget. The performance of the proposed algorithm is evaluated via extensive experiments with real datasets, both on a networked prototype system and in a larger-scale simulated environment. The experimental results show that our proposed approach performs close to the optimum with various machine learning models and different data distributions.
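The trade-off between local updates and global aggregation can be sketched with a toy distributed gradient descent loop. This is only an illustration under stated assumptions: the model is a one-parameter regression `y = w * x`, the parameter name `tau` (local steps between aggregations) and the fixed step budget are hypothetical, and `tau` is set by hand here, whereas the paper's control algorithm chooses the trade-off from its convergence analysis.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) samples held by one edge node


def local_gradient(w: float, data: Dataset) -> float:
    """Gradient of the mean squared error of the model y = w * x on one node."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def distributed_gd(nodes: List[Dataset], tau: int, rounds: int, lr: float = 0.05) -> float:
    """Run `rounds` aggregations with `tau` local gradient steps before each one."""
    w_global = 0.0
    total = sum(len(data) for data in nodes)
    for _ in range(rounds):
        local_ws = []
        for data in nodes:
            w = w_global  # every node starts the round from the global model
            for _ in range(tau):
                w -= lr * local_gradient(w, data)
            local_ws.append(w)
        # Global aggregation: average of local models weighted by dataset size.
        w_global = sum(len(d) * w for d, w in zip(nodes, local_ws)) / total
    return w_global


# Two hypothetical edge nodes whose data follow y = 3x; raw data never leaves a node.
nodes = [[(1.0, 3.0), (2.0, 6.0)], [(0.5, 1.5), (1.5, 4.5), (1.0, 3.0)]]

# Same budget of 40 local steps per node, split differently between
# local computation (tau) and communication (number of aggregations).
for tau in (1, 4, 10):
    print(tau, 40 // tau, distributed_gd(nodes, tau=tau, rounds=40 // tau))
```

A larger `tau` means fewer aggregations for the same budget of local steps, which saves communication but lets the local models drift apart between aggregations when the nodes' data differ.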
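Eye gaze analysis is an important research problem in the fields of computer vision and human-computer interaction. Even with notable progress in the last ten years, automatic gaze analysis remains challenging due to the uniqueness of eye appearance, eye-head interplay, occlusion, image quality, and illumination conditions. There are several open questions, including what the important cues are for interpreting gaze direction in an unconstrained environment without prior knowledge, and how to encode them in real time. We review the progress across a range of gaze analysis tasks and applications to elucidate these fundamental questions, identify effective methods in gaze analysis, and provide possible future directions. We analyze recent gaze estimation and segmentation methods, especially in the unsupervised and weakly supervised domain, based on their advantages and reported evaluation metrics. Our analysis shows that the development of robust and generic gaze analysis methods still needs to address real-world challenges such as unconstrained setups and learning with less supervision. We conclude by discussing future research directions for designing a real-world gaze analysis system that can propagate to other domains, including computer vision, Augmented Reality (AR), Virtual Reality (VR), and Human Computer Interaction (HCI). Project page: https://github.com/i-am-shreya/eyegazesurvey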
The ubiquity of camera-embedded devices and the advances in deep learning have stimulated various intelligent mobile video applications. These applications often demand on-device processing of video streams to deliver real-time, high-quality services for privacy and robustness concerns. However, the performance of these applications is constrained by the raw video streams, which tend to be taken with small-aperture cameras of ubiquitous mobile platforms in dim light. Although many low-light video enhancement solutions exist, they are unfit for deployment on mobile devices due to their complex models and their disregard for system dynamics such as energy budgets. In this paper, we propose AdaEnlight, an energy-aware low-light video stream enhancement system on mobile devices. It achieves real-time video enhancement with competitive visual quality while allowing runtime behavior adaptation to the platform-imposed dynamic energy budgets. We report extensive experiments on diverse datasets, scenarios, and platforms and demonstrate the superiority of AdaEnlight compared with state-of-the-art low-light image and video enhancement solutions.
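Traditional learning-based approaches to student modeling generalize poorly to underrepresented student groups due to biases in data availability. In this paper, we propose a methodology for predicting student performance from their online learning activities that optimizes inference accuracy over different demographic groups such as race and gender. Building upon recent foundations in federated learning, in our approach, personalized models for individual student subgroups are derived from a global model aggregated across all student models via meta-gradient updates that account for subgroup heterogeneity. To learn better representations of student activity, we augment our approach with a self-supervised behavioral pretraining methodology that leverages multiple modalities of student behavior (e.g., visits to lecture videos and participation on forums), and include a neural network attention mechanism in the model aggregation stage. Through experiments on three real-world datasets from online courses, we demonstrate that our approach obtains substantial improvements over existing student modeling baselines in predicting student learning outcomes for all subgroups. Visual analysis of the resulting student embeddings confirms that our personalization methodology indeed identifies different activity patterns within different subgroups, consistent with its stronger inference ability compared with the baselines.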
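Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time-consuming and irreproducible manual trial-and-error process for finding well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband, and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization. This work is accompanied by an appendix that contains information on specific software packages in R and Python, as well as information and recommended hyperparameter search spaces for specific learning algorithms. We also provide notebooks that demonstrate concepts from this work as supplementary files.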
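Of the HPO methods listed, random search is the simplest to show in a few lines. The sketch below is an illustration under stated assumptions, not code from the paper or its notebooks: the search space, the hyperparameter names, and the `toy_validation_error` objective are invented, and in practice the objective would be a resampling-based error estimate such as cross-validation.

```python
import math
import random
from typing import Callable, Dict, Optional, Tuple

Config = Dict[str, float]
Space = Dict[str, Tuple[float, float]]  # name -> (lower bound, upper bound), both > 0


def random_search(objective: Callable[[Config], float], space: Space,
                  n_trials: int, seed: int = 0) -> Tuple[Optional[Config], float]:
    """Sample configurations log-uniformly and keep the one with the lowest score."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    best_cfg: Optional[Config] = None
    best_score = math.inf
    for _ in range(n_trials):
        cfg = {name: 10 ** rng.uniform(math.log10(lo), math.log10(hi))
               for name, (lo, hi) in space.items()}
        score = objective(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_validation_error(cfg: Config) -> float:
    """Hypothetical stand-in for a cross-validated error; minimal at 1e-2 and 1e-3."""
    return ((math.log10(cfg["learning_rate"]) + 2) ** 2
            + (math.log10(cfg["weight_decay"]) + 3) ** 2)


space: Space = {"learning_rate": (1e-4, 1.0), "weight_decay": (1e-6, 1e-1)}
best_cfg, best_score = random_search(toy_validation_error, space, n_trials=200, seed=1)
print(best_cfg, best_score)
```

Sampling on a log scale is a common choice for hyperparameters that span several orders of magnitude, and fixing the seed addresses the reproducibility concern the abstract raises about manual tuning.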
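Supported by computing and communication technologies, the Metaverse is expected to bring users unprecedented service experiences. However, the growing number of Metaverse users places a heavy demand on network resources, especially for Metaverse services that are based on graphical extended reality and require rendering a large number of virtual objects. To make efficient use of network resources and improve the Quality of Experience (QoE), we design an attention-aware network resource allocation scheme to achieve customized Metaverse services. The aim is to allocate more network resources to the virtual objects in which users are more interested. We first discuss several key techniques related to Metaverse services, including QoE analysis, eye tracking, and remote rendering. We then review existing datasets and propose the user-object-attention level (UOAL) dataset, which contains the ground-truth attention of 30 users to 96 objects in 1,000 images. A tutorial on how to use UOAL is provided. With the help of UOAL, we propose an attention-aware network resource allocation algorithm that has two steps: attention prediction and QoE maximization. In particular, we give an overview of the design of two types of attention prediction methods: interest-aware and time-aware prediction. By using the predicted user-object-attention values, network resources such as the rendering capacity of edge devices can be allocated optimally to maximize the QoE. Finally, we propose promising research directions related to Metaverse services.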
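Social media, professional sports, and video games are driving rapid growth in live video streaming on platforms such as Twitch and YouTube Live. The live streaming experience is very susceptible to short-time-scale network congestion, since client playback buffers are often no more than a few seconds long. Unfortunately, identifying such streams and measuring their QoE for network management is challenging, since content providers largely use the same delivery infrastructure for live and video-on-demand (VoD) streaming, and packet inspection techniques (including SNI/DNS query monitoring) cannot always distinguish between the two. In this paper, we design, build, and deploy ReCLive: a machine learning method for live video detection and QoE measurement based on network-level behavioral characteristics. Our contributions are four-fold: (1) we analyze about 23,000 video streams from Twitch and YouTube and identify key features in their traffic profiles that differentiate live from on-demand streaming, and we release our traffic traces as open data to the public; (2) we develop an LSTM-based binary classifier model that distinguishes live from on-demand streams in real time with over 95% accuracy across providers; (3) we develop a method that estimates QoE metrics of live streaming flows in terms of resolution and buffer stalls with overall accuracies of 93% and 90%, respectively; (4) finally, we prototype our solution, train it in the lab, and deploy it in a live ISP network serving more than 7,000 subscribers. Our method provides ISPs with fine-grained visibility into live video streams, enabling them to measure and improve user experience.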