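Pathologists need to combine information from differently stained pathological slides to reach an accurate diagnosis. Deformable image registration is a necessary technique for fusing multi-modal pathological slides. This paper proposes a hybrid feature-based deformable image registration framework for stained pathological samples. We first extract dense feature points and match them using two deep learning feature networks. Then, to further reduce false matches, we propose an outlier detection method that combines an isolation forest statistical model with a local affine correction model. Finally, an interpolation method generates the deformation vector field (DVF) for pathology image registration from the resulting matched points. We evaluate our method on the dataset of the Non-rigid Histology Image Registration (ANHIR) challenge, which was co-organized with the IEEE ISBI 2019 conference. Our technique outperforms traditional methods, with the average registration target error (rTRE) reaching 0.0034. The proposed method achieves state-of-the-art performance and ranks first on the evaluation test dataset. The proposed hybrid feature-based registration method could become a reliable method for pathology image registration.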
translated by 谷歌翻译
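The rTRE figure quoted in the registration abstract can be illustrated with a small sketch. It assumes the usual ANHIR-style definition, in which the Euclidean distance between each warped landmark and its target counterpart is divided by the image diagonal. The function name and the toy landmarks are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean

def rtre(warped, target, width, height):
    """Per-landmark relative target registration error.

    Assumed definition: Euclidean landmark distance divided by the
    image diagonal (an assumption, not quoted from the paper).
    """
    diagonal = math.hypot(width, height)
    return [math.dist(w, t) / diagonal for w, t in zip(warped, target)]

# Hypothetical landmarks (x, y) in pixels on a 3000 x 4000 image.
warped = [(100.0, 200.0), (1500.0, 1800.0)]
target = [(103.0, 204.0), (1500.0, 1810.0)]

errors = rtre(warped, target, width=3000, height=4000)
average_rtre = mean(errors)
```

Normalizing by the diagonal makes errors comparable across slides of different resolution, which is why a single average value can summarize a whole dataset.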
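With the popularity of deep learning, hardware implementation platforms for deep learning have attracted growing interest. Unlike general-purpose devices such as CPUs or GPUs, which execute deep learning algorithms at the software level, neural network hardware accelerators execute the algorithms directly to achieve higher energy efficiency and better performance. However, because deep learning algorithms evolve frequently, the engineering effort and cost of designing hardware accelerators have increased greatly. To improve design quality while containing that effort and cost, design automation for neural network accelerators has been proposed, in which design space exploration algorithms automatically search a design space for an optimized accelerator design. Nevertheless, the growing complexity of neural network accelerators keeps adding dimensions to the design space. As a result, previous design space exploration algorithms are no longer effective enough to find an optimized design. In this work, we propose a neural network accelerator design automation framework named GANDSE, in which we rethink the problem of design space exploration and propose a novel approach based on the generative adversarial network (GAN) to support optimized exploration of large, high-dimensional design spaces. Experiments show that GANDSE finds more optimized designs in negligible time compared with approaches including multilayer perceptrons and deep reinforcement learning.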
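In the field of evolutionary multiobjective optimization, the decision maker (DM) is concerned with conflicting objectives. In the real world, there are usually multiple DMs, each concerned with a subset of these objectives. Multiparty multiobjective optimization problems (MPMOPs) were proposed to describe multiobjective optimization problems (MOPs) that involve multiple decision makers, where each party is concerned with some of the objectives. However, MPMOPs have received little attention in the evolutionary computation field. This paper constructs a series of MPMOPs based on distance minimization problems (DMPs), whose Pareto optimal solutions can be vividly visualized. To solve MPMOPs, the newly proposed algorithm OptMPNDS3 uses a multiparty initialization method to initialize the population and the JADE2 operator to generate offspring. OptMPNDS3 is compared with OptAll, OptMPNDS, and OptMPNDS2 on the problem suite. The results show that OptMPNDS3 is strongly comparable to the other algorithms.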
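One way to read the multiparty setting is that a solution is attractive only if no other solution dominates it with respect to any single party's own objectives. The sketch below filters a candidate set under that assumed reading (minimization, each party owning a subset of objective indices). It is an illustration rather than OptMPNDS3 itself, and every name and number in it is hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def party_view(objectives, indices):
    """Restrict an objective vector to the objectives one party cares about."""
    return tuple(objectives[i] for i in indices)

def common_pareto_set(population, parties):
    """Keep solutions that are non-dominated from every party's viewpoint."""
    kept = []
    for cand in population:
        ok = all(
            not any(
                dominates(party_view(other, idx), party_view(cand, idx))
                for other in population
            )
            for idx in parties
        )
        if ok:
            kept.append(cand)
    return kept

# Four objectives: party 0 cares about f0 and f1, party 1 about f2 and f3.
parties = [(0, 1), (2, 3)]
population = [
    (1.0, 2.0, 1.0, 2.0),
    (2.0, 1.0, 2.0, 1.0),
    (1.0, 2.0, 3.0, 3.0),  # dominated from party 1's viewpoint
    (3.0, 3.0, 1.0, 2.0),  # dominated from party 0's viewpoint
]
result = common_pareto_set(population, parties)
```

Only the first two candidates survive, because each of the other two is acceptable to one party but dominated in the eyes of the other.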
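Deep neural networks (DNNs) for image classification are known to be vulnerable to adversarial examples. Moreover, adversarial examples are transferable: an adversarial example crafted for one DNN model can fool other black-box models with non-trivial probability. This gives rise to transfer-based adversarial attacks, in which adversarial examples generated by a pretrained or known model (called the surrogate model) are used to conduct black-box attacks. There is some work on how to generate adversarial examples from a given surrogate model to achieve better transferability. However, training a special surrogate model so that it generates adversarial examples with better transferability remains relatively under-explored. In this paper, we propose a method for training a surrogate model with abundant dark knowledge to boost the transferability of the adversarial examples it generates. The trained surrogate model is named the dark surrogate model (DSM), and the proposed method for training a DSM consists of two key components: a teacher model that extracts dark knowledge and provides soft labels, and a mixing augmentation technique that enhances the dark knowledge of the training data. Extensive experiments show that the proposed method substantially improves the adversarial transferability of surrogate models across different surrogate architectures and different optimizers for generating adversarial examples. We also show that the proposed method can be applied to other transfer-based attack scenarios that contain dark knowledge, such as face verification.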
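The two training ingredients named above, teacher soft labels and mixing augmentation, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the temperature-scaled softmax, the mixup-style convex combination of soft labels, and all logits are hypothetical choices, not the paper's exact recipe.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax; a higher temperature gives softer labels."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def mix(a, b, lam):
    """Convex combination of two equally sized vectors (mixup-style, assumed)."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]

def cross_entropy(soft_target, probs):
    """Cross-entropy of predicted probabilities against a soft target."""
    return -sum(t * math.log(p) for t, p in zip(soft_target, probs))

# Hypothetical teacher logits for two training images over three classes.
teacher_logits_a = [4.0, 1.0, 0.5]
teacher_logits_b = [0.2, 3.5, 1.0]

soft_a = softmax(teacher_logits_a, temperature=2.0)
soft_b = softmax(teacher_logits_b, temperature=2.0)
mixed_target = mix(soft_a, soft_b, lam=0.7)

# Hypothetical surrogate prediction on the correspondingly mixed image.
surrogate_probs = softmax([2.0, 1.5, 0.3])
loss = cross_entropy(mixed_target, surrogate_probs)
```

Unlike a one-hot label, the mixed target keeps non-zero mass on every class, which is the sense in which soft labels carry "dark knowledge" about class similarity.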
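Dynamic and multimodal features are two important properties that are widespread in real-world optimization problems. The former means that the objectives and/or constraints of a problem change over time, while the latter means that each environment has more than one optimal solution (sometimes including acceptable local optima). Dynamic multimodal optimization problems (DMMOPs) have both characteristics; they have been studied in the fields of evolutionary computation and swarm intelligence for years and are attracting more and more attention. Solving such problems requires optimization algorithms to track multiple optima simultaneously in changing environments. Decision makers can then pick one optimal solution in each environment according to their experience and preferences, or quickly switch to another solution when the current one does not work properly. This is very helpful for decision makers, especially when facing changing environments. In this competition, a test suite of DMMOPs is provided, which models real-world applications. Specifically, the test suite adopts 8 multimodal functions and 8 change modes to construct 24 typical dynamic multimodal optimization problems. A metric is also provided to measure algorithm performance, which considers the average number of optimal solutions found across all environments. This competition should be very helpful in promoting the development of dynamic multimodal optimization algorithms.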
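The performance metric described above, the average number of optima found across environments, can be sketched as follows. The accuracy radius used to decide whether an optimum counts as "found", the data layout, and all coordinates are assumptions made for illustration; the competition's exact definition is not given in the abstract.

```python
import math

def optima_found(found, optima, radius):
    """Count known optima that have at least one returned solution within radius."""
    return sum(
        any(math.dist(opt, sol) <= radius for sol in found)
        for opt in optima
    )

def average_found(environments, radius):
    """Average number of optima located per environment."""
    counts = [optima_found(found, optima, radius) for found, optima in environments]
    return sum(counts) / len(counts)

# Two hypothetical environments: (solutions returned, true optima).
environments = [
    ([(0.01, 0.0), (2.0, 2.02)], [(0.0, 0.0), (2.0, 2.0), (5.0, 5.0)]),
    ([(1.0, 1.0)], [(1.0, 1.0), (4.0, 4.0)]),
]
score = average_found(environments, radius=0.05)
```

Here the algorithm locates two of three optima in the first environment and one of two in the second, so the average is 1.5 optima per environment.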
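Reducing patients' radiation exposure in total-body CT scans has attracted extensive attention in the medical imaging community. A low radiation dose, however, can increase noise and artifacts, which greatly affects clinical diagnosis. To obtain high-quality total-body low-dose CT (LDCT) images, previous deep-learning-based work has introduced various network architectures. However, most of these methods use only normal-dose CT (NDCT) images as ground truth to guide the training of the denoising network. Such a simple constraint makes the model less effective and leaves the reconstructed images over-smoothed. In this paper, we propose a novel intra-task knowledge transfer method that leverages knowledge distilled from NDCT images to assist training on LDCT images. The derived architecture, referred to as the Teacher-Student Consistency Network (TSC-Net), consists of a teacher network and a student network with identical architectures. Through supervision between intermediate features, the student network is encouraged to imitate the teacher network and recover rich texture details. Moreover, to further exploit the information contained in CT scans, a contrastive regularization mechanism (CRM) built upon contrastive learning is introduced. The CRM pulls the restored CT images closer to the NDCT samples and pushes them away from the LDCT samples in the latent space. In addition, based on attention and deformable convolution mechanisms, we design a dynamic enhancement module (DEM) to improve the network's transformation capability.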
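The pull/push behaviour of the CRM can be illustrated with a toy regularizer. The ratio form below (distance to the NDCT positive divided by distance to the LDCT negative) is an assumption borrowed from common contrastive regularization schemes, not the paper's stated loss, and the latent vectors are hypothetical.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def contrastive_regularizer(restored, positive, negative, eps=1e-7):
    """Small when `restored` is near the positive and far from the negative."""
    return l1_distance(restored, positive) / (l1_distance(restored, negative) + eps)

# Hypothetical latent features of an NDCT (positive) and an LDCT (negative) sample.
ndct_feat = [1.0, 0.0, 0.5]
ldct_feat = [0.2, 0.8, 0.9]

good_restoration = [0.9, 0.1, 0.5]   # close to the NDCT features
poor_restoration = [0.3, 0.7, 0.8]   # still close to the noisy LDCT features

good_penalty = contrastive_regularizer(good_restoration, ndct_feat, ldct_feat)
poor_penalty = contrastive_regularizer(poor_restoration, ndct_feat, ldct_feat)
```

Minimizing such a term alongside the reconstruction loss rewards outputs that resemble the clean target and penalizes outputs that stay near the noisy input.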
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even if the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
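The basic mechanics behind a trigger-based backdoor and its ASR measurement can be sketched as follows. The square bottom-right patch, the list-of-lists image representation, and the example predictions are hypothetical choices for illustration; they are not the trigger design or results of NAIVEATTACK or DOORPING.

```python
def add_trigger(image, patch_size=2, value=1.0):
    """Return a copy of a 2D grayscale image with a square patch
    stamped into the bottom-right corner (assumed trigger shape)."""
    stamped = [row[:] for row in image]
    height, width = len(stamped), len(stamped[0])
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = value
    return stamped

def attack_success_rate(predictions, target_label):
    """Fraction of triggered inputs classified as the attacker's target label."""
    return sum(p == target_label for p in predictions) / len(predictions)

# A blank 4 x 4 image and its poisoned copy.
image = [[0.0] * 4 for _ in range(4)]
poisoned = add_trigger(image)

# Hypothetical model predictions on four triggered test images.
asr = attack_success_rate([7, 7, 3, 7], target_label=7)
```

In this toy case three of the four triggered inputs are classified as the target label, giving an ASR of 0.75; an ASR close to 1.0 means the trigger almost always hijacks the prediction.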
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes from only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots; e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
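The first insight, using support masks to build class centers that re-weight query features, can be sketched with plain lists. Masked average pooling and channel-wise multiplication are assumed here as the simplest instance of the idea; the abstract does not specify RefT's actual weighting module, and all feature values are hypothetical.

```python
def masked_average(features, mask):
    """Average C-dimensional feature vectors over positions where mask is 1."""
    selected = [f for f, m in zip(features, mask) if m]
    channels = len(features[0])
    return [sum(f[c] for f in selected) / len(selected) for c in range(channels)]

def reweight(query_features, center):
    """Scale each query feature channel-wise by the class center."""
    return [[q * w for q, w in zip(feat, center)] for feat in query_features]

# Four support positions with two channels; the mask marks the object region.
support = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

center = masked_average(support, mask)

# Two query positions re-weighted by the class center.
query = [[1.0, 1.0], [2.0, 0.5]]
weighted = reweight(query, center)
```

Because the background position is masked out, the class center reflects only object features, so channels that are strong inside the support object are amplified in the query.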
This paper focuses on designing efficient models with few parameters and low FLOPs for dense prediction. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of the Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the same framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependencies and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods; e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1, surpassing SoTA CNN-/Transformer-based models while trading off model accuracy and efficiency well.