To achieve accurate and low-cost 3D object detection, existing methods propose to benefit camera-based multi-view detectors with spatial cues provided by the LiDAR modality, e.g., dense depth supervision and bird's-eye-view (BEV) feature distillation. However, they directly conduct point-to-point mimicking from LiDAR to camera, which neglects the inner geometry of foreground targets and suffers from the modality gap between 2D and 3D features. In this paper, we propose a scheme for learning Target Inner-Geometry from the LiDAR modality into camera-based BEV detectors for both dense depth and BEV features, termed TiG-BEV. First, we introduce an inner-depth supervision module to learn the low-level relative depth relations between different foreground pixels. This enables the camera-based detector to better understand object-wise spatial structures. Second, we design an inner-feature BEV distillation module to imitate the high-level semantics of different keypoints within foreground targets. To further alleviate the BEV feature gap between the two modalities, we adopt both inter-channel and inter-keypoint distillation for feature-similarity modeling. With our target inner-geometry distillation, TiG-BEV can effectively boost BEVDepth by +2.3% NDS and +2.4% mAP, along with BEVDet by +9.1% NDS and +10.3% mAP on the nuScenes val set. Code will be available at https://github.com/ADLab3Ds/TiG-BEV.
Translated by Google Translate
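The idea of supervising relative rather than absolute depth inside a foreground target can be illustrated with a minimal sketch. This is not the TiG-BEV implementation: the function names, the fixed `ref_index` reference pixel, and the mean-squared-error form are all assumptions made for illustration. The sketch only shows that a relative-depth loss ignores a constant depth offset and penalizes errors in the object's internal structure.

```python
def relative_depths(depths, ref_index=0):
    """Depth of every foreground pixel relative to a reference pixel.

    `ref_index` is a hypothetical choice; the abstract does not say how
    the reference pixel is selected.
    """
    ref = depths[ref_index]
    return [d - ref for d in depths]


def inner_depth_loss(pred_depths, lidar_depths, ref_index=0):
    """Mean squared error between predicted and LiDAR relative depths.

    An assumed loss form, used here only to illustrate the idea.
    """
    if len(pred_depths) != len(lidar_depths):
        raise ValueError("depth lists must have the same length")
    rel_pred = relative_depths(pred_depths, ref_index)
    rel_gt = relative_depths(lidar_depths, ref_index)
    sq_errors = [(p - g) ** 2 for p, g in zip(rel_pred, rel_gt)]
    return sum(sq_errors) / len(sq_errors)


# Depths (in metres) of three foreground pixels of one object.
lidar = [10.0, 10.5, 11.0]
shifted = [13.0, 13.5, 14.0]   # right shape, wrong absolute distance
flat = [10.0, 10.0, 10.0]      # right distance, no internal structure

loss_shifted = inner_depth_loss(shifted, lidar)
loss_flat = inner_depth_loss(flat, lidar)
```

In this sketch the shifted prediction incurs no relative-depth loss while the flat one does, which is why such a term would complement, rather than replace, absolute depth supervision.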
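3D object detection with multiple sensors is essential for accurate and reliable perception systems in autonomous driving and robotics. Existing 3D detectors significantly improve accuracy by adopting a two-stage paradigm that relies solely on LiDAR point clouds for 3D proposal refinement. Although impressive, the sparsity of point clouds, especially for distant points, makes it difficult for a LiDAR-only refinement module to accurately recognize and localize objects. To address this problem, we propose a novel multi-modality two-stage approach named FusionRCNN, which effectively and efficiently fuses point clouds and camera images within regions of interest (RoIs). FusionRCNN adaptively integrates sparse geometric information from LiDAR and dense texture information from the camera in a unified attention mechanism. Specifically, it first uses RoIPooling to obtain an image set of uniform size and obtains the point set by sampling raw points within the proposals in the RoI extraction step; it then leverages intra-modality self-attention to enhance domain-specific features, followed by a well-designed cross-attention that fuses the information from the two modalities. FusionRCNN is fundamentally plug-and-play and supports different one-stage methods with almost no architectural changes. Extensive experiments on the KITTI and Waymo benchmarks demonstrate that our method significantly boosts the performance of popular detectors. Notably, FusionRCNN improves the strong SECOND baseline by 6.14% mAP on Waymo and outperforms competing two-stage approaches. Code will be released soon at https://github.com/xxlbigbrother/fusion-rcnn.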
The fifth generation of the Radio Access Network (RAN) has brought new services, technologies, and paradigms with corresponding societal benefits. However, the energy consumption of 5G networks is a concern today. In recent years, the design of new methods for decreasing RAN power consumption has attracted interest from both the research community and standardization bodies, and many energy-saving solutions have been proposed. However, there is still a need to understand the power consumption behavior of state-of-the-art base station architectures, such as multi-carrier active antenna units (AAUs), as well as the impact of different network parameters. In this paper, we present a power consumption model for 5G AAUs based on artificial neural networks. We demonstrate that this model achieves good estimation performance and that it is able to capture the benefits of energy saving when dealing with the complexity of multi-carrier base station architectures. Importantly, multiple experiments are carried out to show the advantage of designing a general model able to capture the power consumption behaviors of different types of AAUs. Finally, we provide an analysis of the model scalability and the training data requirements.
In this paper, we propose a novel primal-dual proximal splitting algorithm (PD-PSA), named BALPA, for the composite optimization problem with equality constraints, where the loss function consists of a smooth term and a nonsmooth term composed with a linear mapping. In BALPA, the dual update is designed as a proximal point for a time-varying quadratic function, which balances the implementation of the primal and dual updates and retains the proximity-induced feature of classic PD-PSAs. In addition, through this balance, BALPA eliminates the inefficiency of classic PD-PSAs for composite optimization problems in which the Euclidean norm of the linear mapping or the equality constraint mapping is large. Therefore, BALPA not only inherits the simple structure and easy implementation of classic PD-PSAs but also ensures fast convergence when these norms are large. Moreover, we propose a stochastic version of BALPA (S-BALPA) and apply BALPA to distributed optimization to devise a new distributed optimization algorithm. Furthermore, a comprehensive convergence analysis is conducted for both BALPA and S-BALPA. Finally, numerical experiments demonstrate the efficiency of the proposed algorithms.
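The energy consumption of the fifth generation (5G) of mobile networks is one of the major concerns of the telecommunications industry. However, there is currently no accurate and tractable approach for evaluating the power consumption of 5G base stations (BSs). In this paper, we propose a novel model for a realistic characterization of the power consumption of 5G multi-carrier BSs, which builds on a large data collection campaign. First, we define a machine learning architecture that allows multiple 5G BS products to be modeled. Then, we exploit the knowledge gathered by this framework to derive a realistic and analytically tractable power consumption model, which can help drive theoretical analyses as well as feature standardization, development, and optimization frameworks. Notably, we demonstrate that this model has high accuracy and is able to capture the benefits of energy-saving mechanisms. We believe that this analytical model is a fundamental tool for understanding 5G BS power consumption and for accurately optimizing network energy efficiency.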
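This paper proposes a novel dual inexact splitting algorithm (DISA) for distributed convex composite optimization problems, where the local loss function consists of an $L$-smooth term and a possibly nonsmooth term composed with a linear operator. We prove that DISA is convergent when the primal and dual step sizes $\tau$ and $\beta$ satisfy $0 < \tau < 2/L$ and $0 < \tau\beta < 1$. Compared with existing primal-dual proximal splitting algorithms (PD-PSAs), DISA overcomes the dependence of the convergent step-size range on the Euclidean norm of the linear operator. This means that DISA allows larger step sizes when the Euclidean norm is large, thus ensuring fast convergence. Moreover, we establish the sublinear and linear convergence rates of DISA under general convexity and metric subregularity, respectively. Furthermore, an approximate iterative version of DISA is provided, and the global convergence and sublinear convergence rate of this approximate version are proved. Finally, numerical experiments not only corroborate the theoretical analysis but also show that DISA achieves a significant acceleration compared with existing PD-PSAs.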
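The step-size condition stated in the DISA abstract can be checked mechanically. The helper below is a minimal sketch: the function name and argument order are hypothetical, and it encodes only the two inequalities given in the abstract, not the algorithm itself. Note that neither inequality involves the norm of the linear operator.

```python
def disa_step_sizes_valid(tau, beta, smoothness_l):
    """Check 0 < tau < 2/L and 0 < tau*beta < 1 for given step sizes.

    `smoothness_l` is the constant L of the L-smooth term; the function
    name and signature are illustrative assumptions.
    """
    if smoothness_l <= 0:
        raise ValueError("the smoothness constant L must be positive")
    primal_ok = 0 < tau < 2.0 / smoothness_l
    coupling_ok = 0 < tau * beta < 1
    return primal_ok and coupling_ok


ok = disa_step_sizes_valid(tau=0.5, beta=1.0, smoothness_l=2.0)
too_large_tau = disa_step_sizes_valid(tau=1.5, beta=1.0, smoothness_l=2.0)
too_large_product = disa_step_sizes_valid(tau=0.5, beta=2.5, smoothness_l=2.0)
```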
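Tree pruning is key to promoting fruit growth and improving production, owing to its effect on the photosynthetic efficiency of fruit on the branches and on nutrient transport. Currently, pruning still relies heavily on human labor, and workers' experience strongly affects the robustness of pruning performance. It is therefore a challenge for workers and farmers to evaluate pruning performance. To better address this problem, this paper proposes a novel pruning classification strategy model, called "OTSU-SVM", to evaluate pruning performance based on the shadows of branches and leaves. The model considers not only the available illuminated area of the tree but also the uniformity of that illuminated area. More importantly, our group incorporates the Otsu algorithm into the model, which greatly enhances the robustness of its evaluation. In addition, data from pear trees in the Yuhang District are used in the experiment. In this experiment, we show that OTSU-SVM achieves good accuracy, with 80% accuracy and high performance in evaluating the pruning of pear trees. If applied in orchards, it could enable more successful pruning. Successful pruning can enlarge the illuminated area of individual fruits and increase nutrient transport from the target branch, thereby significantly improving fruit weight and production.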
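For readers unfamiliar with the Otsu algorithm mentioned in the pruning abstract, the following is a minimal sketch of standard Otsu thresholding on 8-bit grayscale values. How the paper combines the threshold with the SVM is not stated in the abstract; the `shadow_fraction` feature is a hypothetical example of a shadow-based quantity, not the authors' feature.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for t in range(levels):
        weight_dark += hist[t]
        sum_dark += t * hist[t]
        if weight_dark == 0:
            continue  # no dark pixels yet
        weight_bright = total - weight_dark
        if weight_bright == 0:
            break  # all pixels are already in the dark class
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def shadow_fraction(pixels, threshold):
    """Hypothetical feature: share of pixels at or below the threshold."""
    return sum(1 for value in pixels if value <= threshold) / len(pixels)


# Toy image: half dark shadow pixels, half bright ground pixels.
image = [10] * 50 + [200] * 50
t = otsu_threshold(image)
fraction = shadow_fraction(image, t)
```

A feature such as this fraction could then be passed to a classifier; that pairing is an assumption about how an Otsu step and an SVM might fit together.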
Background. Functional assessment of the right ventricle (RV) using gated myocardial perfusion single-photon emission computed tomography (MPS) relies heavily on the precise extraction of right ventricular contours. In this paper, we present a new deep-learning-based model that integrates both the spatial and temporal features in gated MPS images to segment the RV epicardium and endocardium. Methods. By integrating the spatial features from each cardiac frame of the gated MPS and the temporal features from the sequential cardiac frames of the gated MPS, we developed a Spatial-Temporal V-Net (ST-VNet) for automatic extraction of RV endocardial and epicardial contours. In the ST-VNet, a V-Net is employed to hierarchically extract spatial features, and convolutional long short-term memory (ConvLSTM) units are added to the skip-connection pathway to extract the temporal features. The input of the ST-VNet is the ECG-gated sequential frames of the MPS images, and the output is the probability map of the epicardial or endocardial mask. A Dice similarity coefficient (DSC) loss, which penalizes the discrepancy between the model prediction and the ground truth, was adopted to optimize the segmentation model. Results. Our segmentation model was trained and validated on a retrospective dataset of 45 subjects, and the cardiac cycle of each subject was divided into 8 gates. The proposed ST-VNet achieved a DSC of 0.8914 and 0.8157 for the RV epicardium and endocardium segmentation, respectively. The mean absolute error, the mean squared error, and the Pearson correlation coefficient of the RV ejection fraction (RVEF) between the ground truth and the model prediction were 0.0609, 0.0830, and 0.6985, respectively. Conclusion. Our proposed ST-VNet is an effective model for RV segmentation. It has great promise for clinical use in RV functional assessment.
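The Dice similarity coefficient used as the training loss in the ST-VNet abstract has a standard definition, DSC = 2|P ∩ G| / (|P| + |G|). The sketch below is a generic soft-Dice formulation on flattened probability maps, not the authors' code; the smoothing term `eps` and the function names are assumptions.

```python
def dice_coefficient(pred, target, eps=1e-6):
    """Soft Dice similarity between two flat sequences of values in [0, 1].

    `eps` is an assumed smoothing term that avoids division by zero when
    both masks are empty.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (total + eps)


def dice_loss(pred, target, eps=1e-6):
    """Loss that decreases as the overlap between pred and target grows."""
    return 1.0 - dice_coefficient(pred, target, eps)


# Flattened toy masks: prediction covers two voxels, ground truth one.
prediction = [1.0, 1.0, 0.0, 0.0]
ground_truth = [1.0, 0.0, 0.0, 0.0]
dsc = dice_coefficient(prediction, ground_truth)   # about 2/3
loss = dice_loss(prediction, ground_truth)
```

Because the coefficient is normalized by the total mask size, the loss is less sensitive to the foreground/background imbalance typical of small structures than a per-voxel loss would be.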
Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold: first, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and model will be available.
In this chapter, we review and discuss the transformation of AI technology in HCI/UX work and assess how AI technology will change how we do the work. We first discuss how AI can be used to enhance the result of user research and design evaluation. We then discuss how AI technology can be used to enhance HCI/UX design. Finally, we discuss how AI-enabled capabilities can improve UX when users interact with computing systems, applications, and services.