Recent advances in computer vision, in the form of deep neural networks, have made it possible to query increasing volumes of video data with high accuracy. However, neural network inference is computationally expensive at scale: applying a state-of-the-art object detector in real time (i.e., 30+ frames per second) to a single video requires a $4000 GPU. In response, we present NOSCOPE, a system for querying videos that can reduce the cost of neural network video analysis by up to three orders of magnitude via inference-optimized model search. Given a target video, an object to detect, and a reference neural network, NOSCOPE automatically searches for and trains a sequence, or cascade, of models that preserves the accuracy of the reference network but is specialized to the target video and is therefore far less computationally expensive. NOSCOPE cascades two types of models: specialized models that forego the full generality of the reference model but faithfully mimic its behavior for the target video and object; and difference detectors that highlight temporal differences across frames. We show that the optimal cascade architecture differs across videos and objects, so NOSCOPE uses an efficient cost-based optimizer to search across models and cascades. With this approach, NOSCOPE achieves speed-ups of two to three orders of magnitude (265-15,500× real-time) on binary classification tasks over fixed-angle webcam and surveillance video while maintaining accuracy within 1-5% of state-of-the-art neural networks.
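To make the cascade concrete, here is a minimal sketch of the control flow the abstract describes: a cheap difference detector short-circuits unchanged frames, a specialized model handles confident cases, and only uncertain frames reach the reference network. The function names, thresholds, and model interfaces are illustrative, not NOSCOPE's actual implementation.

```python
import numpy as np

def cascade_classify(frames, specialized_model, reference_model,
                     diff_thresh=0.05, low=0.2, high=0.8):
    """Illustrative NoScope-style cascade: cheap checks first, falling
    back to the expensive reference model only when needed.
    All names and thresholds here are hypothetical."""
    labels, prev_frame, prev_label = [], None, None
    for frame in frames:
        # Difference detector: if the frame barely changed, reuse the label.
        if prev_frame is not None and \
                np.mean(np.abs(frame - prev_frame)) < diff_thresh:
            labels.append(prev_label)
            continue
        # Specialized model: trusted only when it is confident.
        p = specialized_model(frame)  # P(object present), in [0, 1]
        if p <= low:
            label = False
        elif p >= high:
            label = True
        else:
            # Uncertain region: invoke the expensive reference network.
            label = reference_model(frame)
        labels.append(label)
        prev_frame, prev_label = frame, label
    return labels
```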
Large volumes of video are continuously recorded from cameras deployed for traffic control and surveillance with the goal of answering "after the fact" queries: identify video frames with objects of certain classes (cars, bags) from many days of recorded video. While advancements in convolutional neural networks (CNNs) have enabled answering such queries with high accuracy, they are too expensive and slow. We build Focus, a system for low-latency and low-cost querying over large video datasets. Focus uses cheap ingestion techniques to index the videos by the objects occurring in them. At ingest time, it uses compression and video-specific specialization of CNNs. Focus handles the lower accuracy of the cheap CNNs by judiciously leveraging expensive CNNs at query time. To reduce query-time latency, it clusters similar objects and thereby avoids redundant processing. In experiments on video streams from traffic, surveillance, and news channels, Focus uses 58× fewer GPU cycles than running expensive ingest processors and is 37× faster than processing all the video at query time.
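The ingest/query split can be illustrated with a short sketch: a cheap CNN indexes each object under its top-k candidate classes at ingest time, and at query time an expensive CNN verifies one representative per cluster of similar objects. The `topk_classes`/`classify` interfaces and the clustering callback are hypothetical stand-ins, not Focus's actual API.

```python
from collections import defaultdict

def ingest(objects, cheap_cnn, k=4):
    """Ingest-time sketch in the spirit of Focus: a cheap CNN indexes
    each detected object under its top-k candidate classes, trading
    precision at ingest for cheap, high-recall lookups later."""
    index = defaultdict(list)
    for frame_id, obj in objects:
        for cls in cheap_cnn.topk_classes(obj, k):   # hypothetical API
            index[cls].append((frame_id, obj))
    return index

def query(index, cls, cluster_fn, expensive_cnn):
    """Query-time: run the expensive CNN once per cluster of visually
    similar objects, then propagate its verdict to cluster members."""
    hits = []
    for centroid, members in cluster_fn(index[cls]):
        if expensive_cnn.classify(centroid) == cls:  # hypothetical API
            hits.extend(members)
    return hits
```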
Camera deployments are ubiquitous, but existing methods to analyze video feeds do not scale and are error-prone. We describe Optasia, a dataflow system that employs relational query optimization to efficiently process queries on video feeds from many cameras. Optasia's key gains result from modularizing vision pipelines in such a manner that relational query optimization can be applied. Specifically, Optasia can (i) de-duplicate the work of common modules, (ii) auto-parallelize query plans based on the video input size, the number of cameras, and operation complexity, and (iii) offer chunk-level parallelism that allows multiple tasks to process the feed of a single camera. Evaluation on traffic videos from a large city on complex vision queries shows high accuracy with many-fold improvements in query completion time and resource usage relative to existing systems.
The deployment of large camera networks for video analytics is an established and accelerating trend. Many real-world video inference applications share a common problem template: searching a large camera network's live video feeds for objects or activities of interest (e.g., a person, a speeding vehicle). This capability, called cross-camera analytics, is both compute- and data-intensive, requiring automated search across cameras and across frames at the throughput of live video streams. To address the cost challenge of processing every raw video frame from a large deployment, we present ReXCam, a new system for efficient cross-camera video analytics. ReXCam exploits the spatial and temporal locality in the dynamics of real camera networks to guide its inference-time search for a query identity. In an offline profiling phase, ReXCam builds a cross-camera correlation model that encodes the locality observed in historical traffic patterns. At inference time, ReXCam applies this model to filter out frames that are not spatially and temporally correlated with the query identity's current position. On the occasional missed detection, ReXCam performs a fast replay search over recently filtered video frames, enabling graceful recovery. Together, these techniques allow ReXCam to reduce compute workload by 4.6× and improve inference precision by 27% on a well-known eight-camera video dataset, while remaining within 1-2% of the baseline recall.
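A minimal sketch of the inference-time filtering step described above, assuming the offline phase has produced a correlation model mapping camera pairs to plausible travel-time windows; the model format and field names are illustrative, not ReXCam's actual data structures.

```python
def filter_frames(correlation, last_sighting, candidate_frames):
    """Sketch of ReXCam-style filtering: only frames from cameras that
    are historically correlated with the query identity's last known
    position, within the learned time window, reach the detector."""
    cam, t = last_sighting                       # (camera id, timestamp)
    keep = []
    for frame in candidate_frames:
        # correlation: {(src_cam, dst_cam): (min_dt, max_dt)}, learned offline
        window = correlation.get((cam, frame.camera))
        if window and window[0] <= frame.timestamp - t <= window[1]:
            keep.append(frame)
    # Misses are handled separately by a replay search over filtered frames.
    return keep
```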
Internet-enabled cameras pervade daily life, generating a huge amount of data, but most of the video they generate is transmitted over wires and analyzed offline with a human in the loop. The ubiquity of cameras limits the amount of video that can be sent to the cloud, especially on wireless networks where capacity is at a premium. In this paper, we present Vigil, a real-time distributed wireless surveillance system that leverages edge computing to support real-time tracking and surveillance in enterprise campuses, retail stores, and across smart cities. Vigil intelligently partitions video processing between edge computing nodes co-located with cameras and the cloud to save wireless capacity, which can then be dedicated to Wi-Fi hotspots, offsetting their cost. Novel video frame prioritization and traffic scheduling algorithms further optimize Vigil's bandwidth utilization. We have deployed Vigil across three sites in both whitespace and Wi-Fi networks. Depending on the level of activity in the scene, experimental results show that Vigil allows a video surveillance system to support a geographical area of coverage between five and 200 times greater than an approach that simply streams video over the wireless network. For a fixed region of coverage and bandwidth, Vigil outperforms the default equal-throughput allocation strategy of Wi-Fi by delivering up to 25% more objects relevant to a user's query.
Video cameras are pervasively deployed for security and smart-city scenarios, with millions of them in large cities worldwide. Achieving the potential of these cameras requires efficiently analyzing the live videos in real time. We describe VideoStorm, a video analytics system that processes thousands of video analytics queries on live video streams over large clusters. Given the high costs of vision processing, resource management is crucial. We consider two key characteristics of video analytics: the resource-quality tradeoff with multi-dimensional configurations, and the variety in quality and lag goals. VideoStorm's offline profiler generates each query's resource-quality profile, while its online scheduler allocates resources to queries to maximize performance on quality and lag, in contrast to the commonly used fair sharing of resources in clusters. Deployment on an Azure cluster of 101 machines shows improvement of as much as 80% in the quality of real-world queries and 7× better lag, processing video from operational traffic cameras.
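As a rough illustration of profile-driven scheduling, the greedy sketch below upgrades whichever query gains the most quality per unit of resource until the cluster budget is exhausted; VideoStorm's actual scheduler also optimizes for lag goals, which this sketch omits, and the profile format is hypothetical.

```python
def allocate(profiles, budget):
    """Greedy sketch of profile-driven allocation: `profiles` maps each
    query to its offline-profiled configurations as (cost, quality)
    pairs sorted by strictly increasing cost. Returns the chosen
    configuration index per query."""
    chosen = {q: 0 for q in profiles}                    # cheapest config first
    spent = sum(configs[0][0] for configs in profiles.values())
    while True:
        best, best_gain = None, 0.0
        for q, configs in profiles.items():
            i = chosen[q]
            if i + 1 < len(configs):
                dcost = configs[i + 1][0] - configs[i][0]
                dqual = configs[i + 1][1] - configs[i][1]
                # Pick the upgrade with the best marginal quality per cost.
                if dcost > 0 and spent + dcost <= budget \
                        and dqual / dcost > best_gain:
                    best, best_gain = q, dqual / dcost
        if best is None:
            return chosen                                # budget exhausted
        i = chosen[best]
        spent += profiles[best][i + 1][0] - profiles[best][i][0]
        chosen[best] = i + 1
```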
The availability of low-cost hardware such as CMOS cameras and microphones has fostered the development of Wireless Multimedia Sensor Networks (WMSNs), i.e., networks of wirelessly interconnected devices that are able to ubiquitously retrieve multimedia content such as video and audio streams, still images, and scalar sensor data from the environment. In this paper, the state of the art in algorithms, protocols, and hardware for wireless multimedia sensor networks is surveyed, and open research issues are discussed in detail. Architectures for WMSNs are explored, along with their advantages and drawbacks. Currently available off-the-shelf hardware, as well as research prototypes for WMSNs, are listed and classified. Existing solutions and open research issues at the application, transport, network, link, and physical layers of the communication protocol stack are investigated, along with possible cross-layer synergies and optimizations.
High-quality computer vision models typically address the problem of understanding the general distribution of real-world images. However, most cameras observe only a very small fraction of this distribution. This offers the possibility of achieving more efficient inference by specializing compact, low-cost models to the specific distribution of frames observed by a single camera. In this paper, we employ the technique of model distillation (supervising a low-cost student model using the output of a high-cost teacher) to specialize accurate, low-cost semantic segmentation models to a target video stream. Rather than learning a specialized student model offline on data from the video stream, we train the student online on the live video, intermittently running the teacher to provide learning targets. Online model distillation yields semantic segmentation models that closely approximate their Mask R-CNN teacher at 7 to 17× lower inference runtime cost (11 to 26× fewer FLOPs), even when the target video's distribution is non-stationary. Our method requires no offline pretraining on the target video stream, and achieves higher accuracy and lower cost than solutions based on optical flow or video object segmentation. We also provide a new video dataset for evaluating the efficiency of inference over long-running video streams.
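The online training loop can be sketched in a few lines, assuming PyTorch-style models, optimizer, and loss; the fixed teacher-query period here is a simplification, whereas the paper adapts the schedule to the student's recent error.

```python
import torch

def online_distill(stream, student, teacher, optimizer, loss_fn, period=64):
    """Minimal sketch of online model distillation: the compact student
    handles every frame; every `period` frames the expensive teacher is
    run to produce a fresh target, and the student takes one gradient
    step on it. All interfaces here are assumed, not the paper's API."""
    for i, frame in enumerate(stream):
        with torch.no_grad():
            yield student(frame)              # cheap per-frame inference
        if i % period == 0:
            with torch.no_grad():
                target = teacher(frame)       # intermittent supervision
            loss = loss_fn(student(frame), target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                  # student adapts online
```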
We present DeepCache, a principled cache design for deep learning inference in continuous mobile vision. DeepCache improves model execution efficiency by exploiting temporal locality in input video streams. It addresses a key challenge raised by mobile vision: the cache must operate under video scene variation while trading off among cacheability, overhead, and loss in model accuracy. At a model's input, DeepCache discovers video temporal locality by exploiting the video's internal structure, borrowing proven heuristics from video compression; within the model, DeepCache propagates regions of reusable results by exploiting the model's internal structure. Notably, DeepCache avoids applying video heuristics to model internals, which are not pixels but high-dimensional, difficult-to-interpret data. Our DeepCache implementation works with unmodified deep learning models, requires zero manual developer effort, and is therefore immediately deployable on off-the-shelf mobile devices. Our experiments show that DeepCache saves inference execution time by 18% on average and up to 47%, and reduces system energy consumption by 20% on average.
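A simplified sketch of the input-side step: blocks of the current frame that closely match the previous frame are marked reusable so their downstream activations can be served from cache. Real DeepCache borrows block-matching search from video compression rather than comparing only co-located blocks; the block size and threshold below are illustrative.

```python
import numpy as np

def reusable_regions(prev_frame, cur_frame, block=16, thresh=8.0):
    """Mark blocks of cur_frame whose pixels closely match the
    co-located block of prev_frame; True blocks are candidates for
    serving cached activations instead of recomputing them."""
    h, w = cur_frame.shape[:2]
    mask = np.zeros((h // block, w // block), dtype=bool)
    for by in range(h // block):
        for bx in range(w // block):
            ys, xs = by * block, bx * block
            diff = np.abs(cur_frame[ys:ys + block, xs:xs + block].astype(float)
                          - prev_frame[ys:ys + block, xs:xs + block].astype(float))
            mask[by, bx] = diff.mean() < thresh
    return mask
```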
In the era of the Internet of Things (IoT), an enormous amount of sensing devices collect and/or generate various sensory data over time for a wide range of fields and applications. Based on the nature of the application, these devices will result in big or fast/real-time data streams. Applying analytics over such data streams to discover new information, predict future insights, and make control decisions is a crucial process that makes IoT a worthy paradigm for businesses and a quality-of-life improving technology. In this paper, we provide a thorough overview of using a class of advanced machine learning techniques, namely Deep Learning (DL), to facilitate the analytics and learning in the IoT domain. We start by articulating IoT data characteristics and identifying two major treatments for IoT data from a machine learning perspective, namely IoT big data analytics and IoT streaming data analytics. We also discuss why DL is a promising approach to achieve the desired analytics in these types of data and applications. The potential of using emerging DL techniques for IoT data analytics is then discussed, and its promises and challenges are introduced. We present a comprehensive background on different DL architectures and algorithms. We also analyze and summarize major reported research attempts that leveraged DL in the IoT domain. The smart IoT devices that have incorporated DL in their intelligence background are also discussed. DL implementation approaches on the fog and cloud centers in support of IoT applications are also surveyed. Finally, we shed light on some challenges and potential directions for future research. At the end of each section, we highlight the lessons learned based on our experiments and review of the recent literature.
The World Wide Web has grown to be a primary source of information for millions of people. Due to the size of the Web, search engines have become the major access point for this information. However, "commercial" search engines use hidden algorithms that put the integrity of their results in doubt, collect user data that raises privacy concerns, and, by targeting the general public, fail to serve the needs of specific search users. Open source search, like open source operating systems, offers alternatives. The goal of the Open Source Information Retrieval Workshop (OSIR) is to bring together practitioners developing open source search technologies in the context of a premier IR research conference to share their recent advances, and to coordinate their strategy and research plans. The intent is to foster community-based development, to promote distribution of transparent Web search tools, and to strengthen the interaction with the research community in IR. A workshop about Open Source Web Information Retrieval was held last year in Compiègne, France as part of WI 2005. The focus of this workshop is broadened to the whole open source information retrieval community. We want to thank all the authors of the submitted papers, the members of the program committee, and the several reviewers whose contributions have resulted in these high-quality proceedings.

ABSTRACT: There has been a resurgence of interest in index maintenance (or incremental indexing) in the academic community in the last three years. Most of this work focuses on how to build indexes as quickly as possible, given the need to run queries during the build process. This work is based on a different set of assumptions than previous work. First, we focus on latency instead of throughput. We focus on reducing index latency (the amount of time between when a new document is available to be indexed and when it is available to be queried) and query latency (the amount of time that an incoming query must wait because of index processing). Additionally, we assume that users are unwilling to tune parameters to make the system more efficient. We show how this set of assumptions has driven the development of the Indri index maintenance strategy, and describe the details of our implementation.
Edge computing effectively extends the realm of information technology beyond the boundary defined by the cloud computing paradigm. By performing computation near the data's source and destination, edge computing promises to address the challenges of many delay-sensitive applications, such as real-time human surveillance. Leveraging ubiquitously connected cameras and smart mobile devices, it enables video analytics at the edge. In recent years, many smart video surveillance approaches have been proposed for object detection and tracking using Artificial Intelligence (AI) and Machine Learning (ML) algorithms. This paper explores the feasibility at the edge of two popular human-object detection schemes, Haar-Cascade and HOG feature extraction with an SVM classifier, and introduces a lightweight Convolutional Neural Network (L-CNN) for human detection that leverages depthwise separable convolutions to reduce computation. Single-board computers (SBCs) are used as the edge devices for testing, and real-world campus surveillance video streams and open datasets are used to validate the algorithms. The experimental results are promising: the final algorithm is able to track humans in real time on resource-constrained edge devices with affordable resource consumption and reasonable accuracy.
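The depthwise separable convolution the abstract refers to is the standard MobileNet-style building block: a per-channel 3×3 depthwise convolution followed by a 1×1 pointwise convolution. Below is a sketch of that block in PyTorch; the exact L-CNN architecture is not specified here, so layer sizes are placeholders.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Standard depthwise-separable convolution block: a sketch of the
    building block the L-CNN abstract describes, not the paper's exact
    architecture."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise: 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn1, self.bn2 = nn.BatchNorm2d(in_ch), nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.bn1(self.depthwise(x)))
        return self.relu(self.bn2(self.pointwise(x)))
```

Compared with a full 3×3 convolution, this factorization cuts multiply-adds roughly by a factor of the output channel count, which is what makes the model affordable on single-board computers.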
Driven by advances in computer vision and the falling cost of camera hardware, organizations are deploying video cameras en masse for spatial monitoring of their physical premises. Scaling video analytics to large camera deployments, however, presents new challenges, as compute cost grows proportionally with the number of camera feeds. This paper is driven by a simple question: can we scale video analytics so that cost grows sublinearly, or even remains constant, as we deploy more cameras, while inference accuracy remains stable or even improves? We believe the answer is yes. Our key observation is that video feeds from wide-area camera deployments demonstrate significant content correlations (e.g., with other geographically proximate feeds), both in space and over time. These spatio-temporal correlations can be harnessed to significantly reduce the size of the inference search space, decreasing both the workload and the false positive rate in multi-camera video analytics. By discussing use cases and technical challenges, we propose a roadmap for scaling video analytics to large camera networks and outline a plan for its realization.
This report describes 18 projects that explored how commercial cloud computing services could be used for scientific computing at national laboratories. The demonstrations ranged from deploying proprietary software in a cloud environment to leveraging established cloud-based analytics workflows for processing scientific datasets. On the whole, the projects were highly successful, and together they suggest that cloud computing can be a valuable computational resource for scientific computing at national laboratories.
Driven by the visions of Internet of Things and 5G communications, recent years have seen a paradigm shift in mobile computing, from the centralized Mobile Cloud Computing towards Mobile Edge Computing (MEC). The main feature of MEC is to push mobile computing, network control and storage to the network edges (e.g., base stations and access points) so as to enable computation-intensive and latency-critical applications at the resource-limited mobile devices. MEC promises dramatic reduction in latency and mobile energy consumption, tackling the key challenges for materializing 5G vision. The promised gains of MEC have motivated extensive efforts in both academia and industry on developing the technology. A main thrust of MEC research is to seamlessly merge the two disciplines of wireless communications and mobile computing, resulting in a wide range of new designs ranging from techniques for computation offloading to network architectures. This paper provides a comprehensive survey of the state-of-the-art MEC research with a focus on joint radio-and-computational resource management. We also present a research outlook consisting of a set of promising directions for MEC research, including MEC system deployment, cache-enabled MEC, mobility management for MEC, green MEC, as well as privacy-aware MEC. Advancements in these directions will facilitate the transformation of MEC from theory to practice. Finally, we introduce recent standardization efforts on MEC as well as some typical MEC application scenarios.
Urbanization's rapid progress has modernized many people's lives but also engendered big issues, such as traffic congestion, energy consumption, and pollution. Urban computing aims to tackle these issues by using the data that has been generated in cities (e.g., traffic flow, human mobility, and geographical data). Urban computing connects urban sensing, data management, data analytics, and service providing into a recurrent process for an unobtrusive and continuous improvement of people's lives, city operation systems, and the environment. Urban computing is an interdisciplinary field where computer sciences meet conventional city-related fields, like transportation, civil engineering, environment, economy, ecology, and sociology in the context of urban spaces. This article first introduces the concept of urban computing, discussing its general framework and key challenges from the perspective of computer sciences. Second, we classify the applications of urban computing into seven categories, consisting of urban planning, transportation, the environment, energy, social, economy, and public safety and security, presenting representative scenarios in each category. Third, we summarize the typical technologies that are needed in urban computing into four folds, which are about urban sensing, urban data management, knowledge fusion across heterogeneous data, and urban data visualization. Finally, we give an outlook on the future of urban computing, suggesting a few research topics that are somehow missing in the community.
Mainstream is a new video analysis system that jointly adapts concurrent applications sharing fixed edge resources to maximize aggregate result quality. Mainstream exploits partial-DNN (deep neural network) compute sharing among applications trained through transfer learning from a common base DNN model, decreasing aggregate per-frame compute time. Based on the available resources and mix of applications running on an edge node, Mainstream automatically determines at deployment time the right trade-off between using more specialized DNNs to improve per-frame accuracy, and keeping more of the unspecialized base model to increase sharing and process more frames per second. Experiments with several datasets and event detection tasks on an edge node confirm that Mainstream improves mean event detection F1-scores by up to 47% relative to a static approach of retraining only the last DNN layer and sharing all others ("Max-Sharing") and by 87× relative to the common approach of using fully independent per-application DNNs ("No-Sharing").
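The partial-DNN sharing idea can be sketched as a frozen shared trunk with per-application heads. The split point, feature dimension, and head shapes below are hypothetical; Mainstream chooses the split point automatically at deployment time.

```python
import torch.nn as nn

class SharedTrunkMultiApp(nn.Module):
    """Illustrative partial-DNN sharing in the spirit of Mainstream:
    applications share the frozen lower layers of a common base model
    and attach their own transfer-learned heads."""
    def __init__(self, shared_trunk: nn.Module, num_apps: int,
                 feat_dim: int = 512, num_classes: int = 2):
        super().__init__()
        self.trunk = shared_trunk
        for p in self.trunk.parameters():
            p.requires_grad = False          # shared layers stay fixed
        # One lightweight head per application.
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, num_classes) for _ in range(num_apps))

    def forward(self, frame_batch):
        feats = self.trunk(frame_batch)      # computed once per frame...
        return [head(feats) for head in self.heads]  # ...reused by every app
```

Moving the split point deeper specializes more layers per application (better per-frame accuracy, less sharing); moving it shallower shares more compute (more frames per second), which is exactly the trade-off the abstract describes.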
Visual understanding of a 3D environment in real time and at low power is a huge computational challenge. Often referred to as SLAM (Simultaneous Localization and Mapping), it is central to applications spanning domestic and industrial robotics, autonomous vehicles, and virtual and augmented reality. This paper describes the results of a major research effort to assemble the algorithms, architectures, tools, and systems software needed to enable the delivery of SLAM, by supporting application specialists in selecting and configuring the appropriate algorithms, and the appropriate hardware and compilation pathways, to meet their performance, accuracy, and energy-consumption goals. The major contributions we present are (1) tools and methodology for the systematic quantitative evaluation of SLAM algorithms, (2) automated, machine-learning-guided exploration of the algorithmic and implementation design space with respect to multiple objectives, (3) end-to-end simulation tools that enable optimizing heterogeneous, accelerated architectures for the specific algorithmic requirements of the various SLAM approaches, and (4) tools for delivering, where appropriate, accelerated, adaptive SLAM solutions in a managed, JIT-compiled, adaptive runtime environment.
In Intelligent Transportation Systems, real-time systems that monitor and analyze road users become increasingly important as we move toward the era of smart cities. Vision-based frameworks for object detection, multi-object tracking, and traffic near-accident detection are important applications of Intelligent Transportation Systems, particularly in video surveillance and related areas. Although deep neural networks have recently achieved great success in many computer vision tasks, a unified framework for all three of these tasks remains challenging, given the demands of real-time performance, complex urban environments, highly dynamic traffic events, and the many kinds of traffic movements. In this paper, we propose a two-stream convolutional network architecture that performs real-time detection, tracking, and near-accident detection of road users in traffic video data. The two-stream model consists of a spatial stream network for object detection and a temporal stream network that leverages motion features for multi-object tracking. We detect near-accidents by combining the appearance features and motion features from the two streams. Using aerial videos, we propose a Traffic Near-Accident Dataset (TNAD) covering various types of traffic interactions that is suitable for vision-based traffic analysis tasks. Our experiments demonstrate the advantages of our framework, which achieves overall competitive qualitative and quantitative performance at high frame rates on the TNAD dataset.
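A structural sketch of such a two-stream model, assuming PyTorch: the backbones, feature dimensions, and fusion head are placeholders rather than the paper's exact networks.

```python
import torch
import torch.nn as nn

class TwoStreamNearAccident(nn.Module):
    """Illustrative two-stream layout: a spatial stream consumes RGB
    frames (appearance features) and a temporal stream consumes stacked
    optical flow (motion features); their features are fused to score
    near-accidents."""
    def __init__(self, spatial_backbone, temporal_backbone, feat_dim=512):
        super().__init__()
        self.spatial = spatial_backbone      # e.g., a detection backbone
        self.temporal = temporal_backbone    # e.g., a flow-stack encoder
        self.fusion = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, 1))          # near-accident logit

    def forward(self, rgb, flow_stack):
        a = self.spatial(rgb)                # appearance features
        m = self.temporal(flow_stack)        # motion features
        return self.fusion(torch.cat([a, m], dim=-1))
```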