Smart Paper Notes

Curvature regularization for Non-line-of-sight Imaging from Under-sampled Data

Rui Ding , Juntian Ye , Qifeng Gao , Feihu Xu , Yuping Duan

Category: Computer Vision

2023-01-01

Non-line-of-sight (NLOS) imaging aims to reconstruct three-dimensional hidden scenes from data measured in the line of sight, using photon time-of-flight information encoded in light after multiple diffuse reflections. Under-sampled scanning data can facilitate fast imaging. However, the resulting reconstruction problem becomes a severely ill-posed inverse problem, whose solution is highly likely to be degraded by noise and distortion. In this paper, we propose two novel NLOS reconstruction models based on curvature regularization, i.e., the object-domain curvature regularization model and the dual (i.e., signal and object)-domain curvature regularization model. Fast numerical optimization algorithms are developed relying on the alternating direction method of multipliers (ADMM) with the backtracking stepsize rule, which are further accelerated by GPU implementation. We evaluate the proposed algorithms on both synthetic and real datasets, where they achieve state-of-the-art performance, especially in the compressed sensing setting. All our codes and data are available at https://github.com/Duanlab123/CurvNLOS.


Flareon: Stealthy any2any Backdoor Injection via Poisoned Augmentation

Tianrui Qin , Xianghuan He , Xitong Gao , Yiren Zhao , Kejiang Ye , Cheng-Zhong Xu

Category: Computer Vision

2022-12-20

Open software supply chain attacks, once successful, can exact heavy costs in mission-critical applications. As open-source ecosystems for deep learning flourish and become increasingly universal, they present attackers previously unexplored avenues to code-inject malicious backdoors in deep neural network models. This paper proposes Flareon, a small, stealthy, seemingly harmless code modification that specifically targets the data augmentation pipeline with motion-based triggers. Flareon neither alters ground-truth labels, nor modifies the training loss objective, nor does it assume prior knowledge of the victim model architecture, training data, and training hyperparameters. Yet, it has a surprisingly large ramification on training -- models trained under Flareon learn powerful target-conditional (or "any2any") backdoors. The resulting models can exhibit high attack success rates for any target choices and better clean accuracies than backdoor attacks that not only seize greater control, but also assume more restrictive attack capabilities. We also demonstrate the effectiveness of Flareon against recent defenses. Flareon is fully open-source and available online to the deep learning community: https://github.com/lafeat/flareon.


Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Category: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
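
For readers unfamiliar with the k-fold cross-validation mentioned above, the following is a minimal sketch of how a training set can be split into k train/validation partitions. It is an illustration only and is not taken from the survey or from any participant's code; the function name `k_fold_indices` and the choice of contiguous, unshuffled folds are assumptions made for this example.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k (train, validation) index pairs.

    Hypothetical helper for illustration: folds are contiguous, unshuffled,
    and differ in size by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        # Everything outside the validation block is used for training.
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, val))
        start += size
    return folds


if __name__ == "__main__":
    for train_idx, val_idx in k_fold_indices(10, 3):
        print(len(train_idx), val_idx)
```

Each sample appears in exactly one validation fold, so averaging the k validation scores uses every training sample once for evaluation. In practice the indices would usually be shuffled first.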


Artificial Intelligence Security Competition (AISC)

Yinpeng Dong , Peng Chen , Senyou Deng , Lianji L , Yi Sun , Hanyu Zhao , Jiaxing Li , Yunteng Tan , Xinyu Liu , Yangyi Dong

Category: Artificial Intelligence | Computer Vision | Machine Learning

2022-12-07

The security of artificial intelligence (AI) is an important research area towards safe, reliable, and trustworthy AI systems. To accelerate the research on AI security, the Artificial Intelligence Security Competition (AISC) was organized by the Zhongguancun Laboratory, China Industrial Control Systems Cyber Emergency Response Team, Institute for Artificial Intelligence, Tsinghua University, and RealAI as part of the Zhongguancun International Frontier Technology Innovation Competition (https://www.zgc-aisc.com/en). The competition consists of three tracks, including Deepfake Security Competition, Autonomous Driving Security Competition, and Face Recognition Security Competition. This report will introduce the competition rules of these three tracks and the solutions of top-ranking teams in each track.


Lifelong Person Re-Identification via Knowledge Refreshing and Consolidation

Chunlin Yu , Ye Shi , Zimo Liu , Shenghua Gao , Jingya Wang

Category: Computer Vision

2022-11-29

Lifelong person re-identification (LReID) is in significant demand for real-world development, as a large amount of ReID data is captured from diverse locations over time and inherently cannot be accessed all at once. However, a key challenge for LReID is how to incrementally preserve old knowledge and gradually add new capabilities to the system. Unlike most existing LReID methods, which mainly focus on dealing with catastrophic forgetting, our focus is on a more challenging problem: not only reducing forgetting on old tasks but also improving model performance on both new and old tasks during the lifelong learning process. Inspired by the biological process of human cognition where the somatosensory neocortex and the hippocampus work together in memory consolidation, we formulated a model called Knowledge Refreshing and Consolidation (KRC) that achieves both positive forward and backward transfer. More specifically, a knowledge refreshing scheme is incorporated with the knowledge rehearsal mechanism to enable bi-directional knowledge transfer by introducing a dynamic memory model and an adaptive working model. Moreover, a knowledge consolidation scheme operating on the dual space further improves model stability over the long term. Extensive evaluations show KRC's superiority over the state-of-the-art LReID methods on challenging pedestrian benchmarks.


BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.


MiddleGAN: Generate Domain Agnostic Samples for Unsupervised Domain Adaptation

Ye Gao , Zhendong Chu , Hongning Wang , John Stankovic

Category: Computer Vision | Artificial Intelligence

2022-11-06

In recent years, machine learning has achieved impressive results across different application areas. However, machine learning algorithms do not necessarily perform well on a new domain whose distribution differs from that of the training set. Domain Adaptation (DA) is used to mitigate this problem. One approach taken by existing DA algorithms is to find domain-invariant features whose distributions in the source domain are the same as their distributions in the target domain. In this paper, we propose to let the classifier that performs the final classification task on the target domain implicitly learn the invariant features needed for classification. This is achieved by feeding the classifier, during training, generated fake samples that are similar to samples from both the source and target domains. We call these generated samples domain-agnostic samples. To accomplish this, we propose a novel variation of generative adversarial networks (GAN), called MiddleGAN, that generates such samples using two discriminators and one generator. We extend the theory of GAN to show that there exist optimal solutions for the parameters of the two discriminators and one generator in MiddleGAN, and empirically show that the samples generated by MiddleGAN are similar to both samples from the source domain and samples from the target domain. We conducted extensive evaluations on 24 benchmarks, comparing MiddleGAN against various state-of-the-art algorithms and outperforming the state of the art by up to 20.1% on certain benchmarks.


Efficient View Path Planning for Autonomous Implicit Reconstruction

Jing Zeng , Yanxu Li , Yunlong Ran , Shuo Li , Fei Gao , Lincheng Li , Shibo He , Jiming Chen , Qi Ye

Category: Robotics

2022-09-27

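Implicit neural representations have shown promising potential for 3D scene reconstruction. Recent work applies them to autonomous 3D reconstruction by learning the information gain for view path planning. Although effective, computing the information gain is expensive, and collision checking for a 3D point using an implicit representation is much slower than with a volumetric representation. In this paper, we propose to 1) leverage a neural network as an implicit function approximator for the information gain field and 2) combine the fine-grained implicit representation with coarse volumetric representations to improve efficiency. Building on the improved efficiency, we further propose a novel informative path planning method based on a graph-based planner. Our method demonstrates significant improvements in reconstruction quality and planning efficiency compared with autonomous reconstruction using implicit and explicit representations. We deploy the method on a real UAV, and the results show that it can plan informative views and reconstruct a scene with high quality.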


Bit Allocation using Optimization

Tongda Xu , Han Gao , Chenjian Gao , Jinyong Pi , Yanghao Li , Yuanyuan Wang , Ziyu Zhu , Dailan He , Mao Ye , Hongwei Qin

Category: Computer Vision

2022-09-20

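In this paper, we consider the problem of bit allocation in neural video compression (NVC). Because of the frame reference structure, current NVC methods that use the same R-D (rate-distortion) trade-off parameter $\lambda$ for all frames are suboptimal, which creates the need for bit allocation. Unlike previous methods based on heuristic and empirical R-D models, we propose to solve this problem by gradient-based optimization. Specifically, we first propose a continuous bit implementation method based on semi-amortized variational inference (SAVI). Then, we propose a pixel-level implicit bit allocation method using iterative optimization by changing the SAVI target. Moreover, we derive a precise R-D model based on the differentiable nature of NVC. We show the optimality of our method by proving its equivalence to bit allocation with the precise R-D model. Experimental results show that our approach significantly improves NVC methods and outperforms existing bit allocation methods. Our approach is plug-and-play for all differentiable NVC methods and can be directly adopted on existing pre-trained models.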
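
To make the role of the trade-off parameter concrete, the following sketches the standard textbook form of rate-constrained bit allocation. It is not the paper's exact formulation; the symbols $T$ (number of frames), $\theta_i$ (coding parameters of frame $i$), $R_i$, $D_i$ (rate and distortion of frame $i$), and $R_c$ (rate budget) are assumptions introduced here for illustration.

```latex
% Constrained form: minimize total distortion under a rate budget R_c.
\min_{\theta_1,\dots,\theta_T} \sum_{i=1}^{T} D_i
\quad \text{s.t.} \quad \sum_{i=1}^{T} R_i \le R_c

% Lagrangian relaxation with a single multiplier \lambda:
\min_{\theta_1,\dots,\theta_T} \sum_{i=1}^{T} \bigl( D_i + \lambda R_i \bigr)

% With frame references, D_i and R_i depend on all earlier frames,
% D_i = D_i(\theta_1,\dots,\theta_i), so stationarity for frame j reads
\sum_{i=j}^{T} \frac{\partial \bigl( D_i + \lambda R_i \bigr)}{\partial \theta_j} = 0
```

Optimizing each frame in isolation keeps only the $i = j$ term of the last sum and ignores how frame $j$ affects the frames that reference it, which is one way to see why a uniform per-frame trade-off can be suboptimal.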


Learn to Adapt to New Environment from Past Experience and Few Pilot

Ouya Wang , Jiabao Gao , Geoffrey Ye Li

Category: Machine Learning

2022-09-02

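In recent years, deep learning has been widely applied in communications and has achieved remarkable performance improvements. Most existing works are based on data-driven deep learning, which requires a large amount of training data for the communication model to adapt to a new environment and consumes substantial computing resources for collecting data and retraining the model. In this paper, we significantly reduce the amount of training data required for new environments by leveraging the learning experience from known environments. To this end, we introduce few-shot learning to enable the communication model to generalize to new environments, realized through an attention-based method. With the attention network embedded into the deep learning-based communication model, environments with different power delay profiles can be learned together during training, which we call the learning experience. By exploiting the learning experience, the communication model requires only a few pilot blocks to perform well in a new environment. Through an example of deep learning-based channel estimation, we demonstrate that this novel design method achieves better performance than the existing data-driven approach designed for few-shot learning.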
