Video super-resolution is one of the most popular tasks on mobile devices, where it is widely used to automatically improve low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and poor power efficiency on mobile devices. In this Mobile AI challenge, we address this problem and ask the participants to design an end-to-end real-time video super-resolution solution for mobile NPUs optimized for low energy consumption. The participants were provided with the REDS training dataset containing video sequences for a 4X video upscaling task. The runtime and power efficiency of all models were evaluated on the MediaTek Dimensity 9000 platform with a dedicated AI processing unit capable of accelerating floating-point and quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating rates of up to 500 FPS and a power consumption of 0.2 [Watt / 30 FPS]. A detailed description of all models developed in the challenge is provided in this paper.
translated by 谷歌翻译
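DNN-based frame interpolation, which generates intermediate frames from two consecutive frames, typically relies on heavy model architectures with a huge number of features, preventing deployment on systems with limited resources such as mobile devices. We propose a compression-driven network design for frame interpolation that leverages model pruning through sparsity-inducing optimization to significantly reduce the model size while achieving superior performance. Specifically, we first compress the recently proposed AdaCoF model and show that a 10X compressed AdaCoF performs similarly to its original counterpart, with a comprehensive study, under various hyperparameter settings, of different strategies that use layerwise sparsity information as guidance. We then further improve this compressed model by introducing a multi-resolution warping module, which boosts visual consistency with multi-level details. As a result, we achieve a significant performance gain with only a quarter of the size of the original AdaCoF. Moreover, our model performs favorably against other state-of-the-art methods on a broad range of datasets. We note that the proposed compression-driven framework is generic and can be easily transferred to other DNN-based frame interpolation algorithms. Source code is available at https://github.com/tding1/cdfi.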
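This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. The challenge includes two tracks. Track 1 aims at the super-resolution of compressed images, and Track 2 targets the super-resolution of compressed video. In Track 1, we use the popular DIV2K dataset as the training, validation, and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 365 videos, including the LDV 2.0 dataset (335 videos) and 30 additional videos. In this challenge, 12 teams and 2 teams submitted final results to Track 1 and Track 2, respectively. The proposed methods and solutions gauge the state of the art in super-resolution of compressed images and video. The proposed LDV 3.0 dataset is available at https://github.com/renyang-home/ldv_dataset. The homepage of this challenge is at https://github.com/renyang-home/aim22_compresssr.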
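The trade-off between robustness and accuracy has been widely studied in the adversarial literature. Although still controversial, the prevailing view is that this trade-off is inherent, either empirically or theoretically. We therefore dig into the origin of this trade-off in adversarial training and find that it may stem from an improperly defined robust error, which imposes an inductive bias of local invariance: an overcorrection towards smoothness. Given this, we advocate employing local equivariance to describe the ideal behavior of a robust model, leading to a self-consistent robust error named SCORE. By definition, SCORE facilitates the reconciliation between robustness and accuracy, while still handling worst-case uncertainty via robust optimization. By simply substituting KL divergence with variants of distance metrics, SCORE can be efficiently minimized. Empirically, our models achieve top-rank performance on RobustBench under AutoAttack. In addition, SCORE provides instructive insights for explaining the overfitting phenomenon and the semantic input gradients observed on robust models. Code is available at https://github.com/p2333/score.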
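Tracking requires building a discriminative model of the target at the inference stage. An effective way to achieve this is online learning, which can comfortably outperform models that are trained only offline. Recent research shows that visual tracking benefits significantly from the unification of visual tracking and segmentation, owing to its pixel-level discrimination. However, performing online learning for such a unified model poses a great challenge: a segmentation model cannot easily learn from the prior information given in the visual tracking scenario. In this paper, we propose TrackM1P, a novel meta-learning method optimized to learn from only partial information in order to resolve this challenge. Our model is capable of extensively exploiting limited prior information and therefore possesses much stronger target-background discriminability than other online learning methods. Empirically, we show that our model achieves state-of-the-art performance and a tangible improvement over competing models. Our model achieves average overlaps of 66.0%, 67.1%, and 68.5% on the VOT2019, VOT2018, and VOT2016 datasets, which are 6.4%, 7.3%, and 6.4% higher than our baseline. Code will be made publicly available.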
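Structured pruning is a commonly used technique for deploying deep neural networks (DNNs) onto resource-constrained devices. However, existing pruning methods are usually heuristic, task-specific, and require an extra fine-tuning procedure. To overcome these limitations, we propose a framework that compresses DNNs into slimmer architectures with competitive performance and significant FLOPs reductions by Only-Train-Once (OTO). OTO contains two keys: (i) we partition the parameters of DNNs into zero-invariant groups, enabling us to prune zero groups without affecting the output; and (ii) to promote zero groups, we formulate a structured-sparsity optimization problem and propose a novel optimization algorithm, Half-Space Stochastic Projected Gradient (HSPG), to solve it, which outperforms the standard proximal methods on group-sparsity exploration and maintains comparable convergence. To demonstrate the effectiveness of OTO, we train and compress full models simultaneously from scratch, without fine-tuning, for inference speedup and parameter reduction, and achieve state-of-the-art results on VGG16 for CIFAR10, ResNet50 for CIFAR10, and BERT for SQuAD, and competitive results on ResNet50 for ImageNet. Source code is available at https://github.com/tianyic/only_train_once.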
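The zero-invariant group idea in (i) can be illustrated on a toy two-layer network. The following is a minimal sketch, not the OTO implementation: it assumes a fully connected layer followed by ReLU, treats each hidden unit's weight row plus bias as one group, and uses hypothetical helper names (`linear`, `two_layer`, `prune_zero_groups`). When a group is entirely zero, the hidden unit always outputs zero, so the unit and the matching column of the next layer can be removed without changing the output.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def linear(W, b, x):
    # W is a list of rows; returns W @ x + b as a plain list.
    return [sum(w * xj for w, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]


def two_layer(W1, b1, W2, b2, x):
    # Toy network: linear -> ReLU -> linear.
    return linear(W2, b2, relu(linear(W1, b1, x)))


def prune_zero_groups(W1, b1, W2):
    # Assumed grouping: one group per hidden unit (row of W1 plus its bias).
    # Keep only the units whose group contains at least one nonzero entry.
    keep = [i for i, (row, bi) in enumerate(zip(W1, b1))
            if bi != 0.0 or any(w != 0.0 for w in row)]
    W1_p = [W1[i] for i in keep]
    b1_p = [b1[i] for i in keep]
    # Drop the matching input columns of the next layer.
    W2_p = [[row[i] for i in keep] for row in W2]
    return W1_p, b1_p, W2_p


# Hidden unit 1 is an all-zero group, so it can be removed safely.
W1 = [[1.0, -2.0], [0.0, 0.0], [0.5, 0.5]]
b1 = [0.1, 0.0, -0.2]
W2 = [[1.0, 3.0, -1.0]]
b2 = [0.25]
x = [2.0, 1.0]

W1_p, b1_p, W2_p = prune_zero_groups(W1, b1, W2)
print(two_layer(W1, b1, W2, b2, x))
print(two_layer(W1_p, b1_p, W2_p, b2, x))
```

The two printed outputs agree because the removed unit contributed exactly zero to the second layer. This is why the grouping has to include the bias: a unit with zero weights but a nonzero bias would still emit a constant.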
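Transfer-based adversarial attacks can evaluate model robustness in the black-box setting. Several methods have demonstrated impressive untargeted transferability; however, efficiently producing targeted transferability remains challenging. To this end, we develop a simple yet effective framework that applies a hierarchical generative network to craft targeted transfer-based adversarial examples. In particular, we contribute amortized designs that adapt well to multi-class targeted attacks. Extensive experiments on ImageNet show that our method improves the success rate of targeted black-box attacks by a significant margin over existing methods: it reaches an average success rate of 29.1% against six diverse models based on only one substitute white-box model, which significantly outperforms state-of-the-art gradient-based attack methods. Moreover, the proposed method is more than an order of magnitude more efficient than gradient-based methods.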
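Correctly classifying adversarial examples is an essential but challenging requirement for safely deploying machine learning models. As reported in RobustBench, even state-of-the-art adversarially trained models struggle to exceed 67% robust test accuracy on CIFAR-10, which is far from practical. A complementary way towards robustness is to introduce a rejection option, allowing the model to not return predictions on uncertain inputs, where confidence is a commonly used certainty proxy. Along this routine, we find that confidence and a rectified confidence (R-Con) can form two coupled rejection metrics, which can provably distinguish wrongly classified inputs from correctly classified ones. This intriguing property sheds light on using coupling strategies to better detect and reject adversarial examples. We evaluate our rectified rejection (RR) module on CIFAR-10, CIFAR-10-C, and CIFAR-100 under several attacks, including adaptive ones, and demonstrate that the RR module is compatible with different adversarial training frameworks for improving robustness, with little extra computation. Code is available at https://github.com/p2333/Rectified-re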
Deep neural networks are vulnerable to adversarial examples, which can mislead classifiers by adding imperceptible perturbations. An intriguing property of adversarial examples is their good transferability, making black-box attacks feasible in real-world applications. Due to the threat of adversarial attacks, many methods have been proposed to improve the robustness. Several state-of-the-art defenses are shown to be robust against transferable adversarial examples. In this paper, we propose a translation-invariant attack method to generate more transferable adversarial examples against the defense models. By optimizing a perturbation over an ensemble of translated images, the generated adversarial example is less sensitive to the white-box model being attacked and has better transferability. To improve the efficiency of attacks, we further show that our method can be implemented by convolving the gradient at the untranslated image with a pre-defined kernel. Our method is generally applicable to any gradient-based attack method. Extensive experiments on the ImageNet dataset validate the effectiveness of the proposed method. Our best attack fools eight state-of-the-art defenses at an 82% success rate on average based only on the transferability, demonstrating the insecurity of the current defense techniques.
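The step of convolving the gradient with a pre-defined kernel can be sketched in a few lines. This is a minimal illustration only, under several assumptions not stated in the abstract: a Gaussian kernel, zero padding at the borders, a single-channel image stored as nested lists, and a sign-based (FGSM-style) update. The function names are hypothetical and the gradient is taken as given rather than computed from a model.

```python
import math


def gaussian_kernel(size, sigma):
    # Normalized 2-D Gaussian kernel (assumed kernel choice for this sketch).
    half = size // 2
    k = [[math.exp(-((i - half) ** 2 + (j - half) ** 2) / (2 * sigma ** 2))
          for j in range(size)] for i in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


def smooth_gradient(grad, kernel):
    # Slide the kernel over the gradient with zero padding. The kernel is
    # symmetric, so correlation and convolution coincide here.
    h, w = len(grad), len(grad[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i, krow in enumerate(kernel):
                for j, kv in enumerate(krow):
                    yy, xx = y + i - half, x + j - half
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kv * grad[yy][xx]
            out[y][x] = acc
    return out


def sign(g):
    return (g > 0) - (g < 0)


def sign_step(image, grad, eps):
    # One sign-gradient update, clipped to the valid pixel range [0, 1].
    return [[min(1.0, max(0.0, p + eps * sign(g)))
             for p, g in zip(prow, grow)]
            for prow, grow in zip(image, grad)]


# A gradient concentrated on one pixel is spread over its neighbourhood.
grad = [[0.0] * 5 for _ in range(5)]
grad[2][2] = 1.0
smoothed = smooth_gradient(grad, gaussian_kernel(3, 1.0))
adv = sign_step([[0.5] * 5 for _ in range(5)], smoothed, 0.1)
```

The smoothing needs only one gradient evaluation at the untranslated image, which is the efficiency argument made in the abstract; the smoothed gradient can then be passed to whichever gradient-based update rule is in use.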
Neural networks are vulnerable to adversarial examples, which poses a threat to their application in security-sensitive systems. We propose the high-level representation guided denoiser (HGD) as a defense for image classification. A standard denoiser suffers from the error amplification effect, in which small residual adversarial noise is progressively amplified and leads to wrong classifications. HGD overcomes this problem by using a loss function defined as the difference between the target model's outputs activated by the clean image and the denoised image. Compared with ensemble adversarial training, which is the state-of-the-art defense method on large images, HGD has three advantages. First, with HGD as a defense, the target model is more robust to both white-box and black-box adversarial attacks. Second, HGD can be trained on a small subset of the images and generalizes well to other images and unseen classes. Third, HGD can be transferred to defend models other than the one guiding it. In the NIPS competition on defense against adversarial attacks, our HGD solution won first place and outperformed other models by a large margin.
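The HGD loss described above compares the target model's outputs rather than pixels. A minimal sketch follows, assuming an L1 distance between the two output vectors (the abstract says only "difference") and using a hypothetical stand-in `toy_target` in place of a real classifier layer.

```python
def hgd_loss(target_model, clean, denoised):
    # Distance between the target model's outputs on the clean image and on
    # the denoised image. The L1 norm is an assumption of this sketch.
    f_clean = target_model(clean)
    f_denoised = target_model(denoised)
    return sum(abs(a - b) for a, b in zip(f_clean, f_denoised))


def toy_target(x):
    # Hypothetical stand-in for a high-level representation of the input.
    return [sum(x), max(x)]


print(hgd_loss(toy_target, [0.0, 1.0], [0.0, 1.0]))  # identical inputs
print(hgd_loss(toy_target, [0.0, 1.0], [0.5, 1.0]))  # residual noise left
```

Because the loss is measured after the target model, residual noise is penalized according to how much it changes the model's output, not how large it is in pixel space.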