Smart Paper Notes

Detection of Strongly Lensed Arcs in Galaxy Clusters with Transformers

Peng Jia , Ruiqi Sun , Nan Li , Yu Song , Runyu Ning , Hongyan Wei , Rui Luo

Category: Computer Vision

2022-11-11

Strong lensing in galaxy clusters probes the mass properties of the dense cores of dark matter halos, studies the distant universe at flux levels and spatial resolutions otherwise unavailable, and constrains cosmological models independently. Next-generation large-scale sky imaging surveys are expected to discover thousands of cluster-scale strong lenses, creating unprecedented opportunities to apply cluster-scale strong lenses to astrophysical and cosmological problems. However, the large datasets make it challenging for astronomers to identify and extract strong lensing signals, particularly strongly lensed arcs, because of their complexity and variety. Hence, we propose a framework to detect cluster-scale strongly lensed arcs, which contains a transformer-based detection algorithm and an image simulation algorithm. We embed prior information about cluster-scale strongly lensed arcs into the training data through simulation and then train the detection algorithm with the simulated images. We use the trained transformer to detect strongly lensed arcs in simulated and real data. Results show that our approach achieves 99.63% accuracy, 90.32% recall, 85.37% precision, and a 0.23% false positive rate in detecting strongly lensed arcs in simulated images, and that it detects almost all strongly lensed arcs in real observation images. In addition, using an interpretation method, we show that our method identifies important information embedded in the simulated data. As a next step, to test the reliability and usability of our approach, we will apply it to available observations (e.g., the DESI Legacy Imaging Surveys) and to simulated data of upcoming large-scale sky surveys such as Euclid and the CSST.
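The four rates quoted in this abstract all derive from the same confusion counts. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not taken from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return standard detection rates from confusion counts.

    tp/fp/tn/fn are true positives, false positives, true negatives
    and false negatives.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical counts for illustration only (not the paper's data).
metrics = detection_metrics(tp=90, fp=10, tn=890, fn=10)
```

When negatives far outnumber positives, accuracy can be very high and the false positive rate very low while precision stays noticeably lower, which matches the pattern of figures reported above.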


Few-shot Open-set Recognition Using Background as Unknowns

Nan Song , Chi Zhang , Guosheng Lin

Category: Computer Vision

2022-07-19

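Few-shot open-set recognition aims to classify both seen and novel images given only limited training data for the seen classes. The challenge of this task is that the model must not only learn a discriminative classifier for the predefined classes from very little training data, but also reject inputs from unseen classes that never appear at training time. In this paper, we propose to address the problem from two new angles. First, instead of learning decision boundaries between seen classes, as is done in standard closed-set classification, we reserve space for unseen classes, so that images falling in these regions are recognized as belonging to unseen classes. Second, to learn such decision boundaries effectively, we propose to use the background features of seen classes. Because these background regions do not contribute significantly to closed-set classification decisions, it is natural to use them as pseudo unseen classes for classifier learning. Our extensive experiments show that the proposed method not only outperforms multiple baselines but also sets new state-of-the-art results on three popular benchmarks, namely tieredImageNet, miniImageNet, and Caltech-UCSD Birds-200-2011 (CUB).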


RobustAnalog: Fast Variation-Aware Analog Circuit Design Via Multi-task RL

Wei Shi , Hanrui Wang , Jiaqi Gu , Mingjie Liu , David Pan , Song Han , Nan Sun

Category: Artificial Intelligence | Machine Learning

2022-07-13

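Analog/mixed-signal circuit design is one of the most complex and time-consuming stages of the whole chip design process. Because of the process, voltage, and temperature (PVT) variations that arise in chip manufacturing, analog circuits inevitably suffer performance degradation. Although there has been plenty of work on automating analog circuit design under typical conditions, little research has explored robust designs under real and unpredictable silicon variations. Automatic analog design against variations requires prohibitive computation and time. To address this challenge, we present RobustAnalog, a robust circuit design framework that incorporates variation information into the optimization process. Specifically, circuit optimization under different variations is treated as a set of tasks. Similarities among tasks are leveraged and competition among them is alleviated to achieve sample-efficient multi-task training. Moreover, RobustAnalog prunes the task space according to the current performance in each iteration, which further reduces simulation cost. In this way, RobustAnalog can rapidly produce a set of circuit parameters that satisfies diverse constraints (e.g., gain, bandwidth, noise...) across variations. We compare RobustAnalog with Bayesian optimization, an evolutionary algorithm, and Deep Deterministic Policy Gradient (DDPG), and demonstrate that RobustAnalog reduces the required optimization time by a factor of 14 to 30. Our study therefore provides a feasible method for handling various real silicon conditions.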
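The abstract does not say how the task space is pruned. One simple reading, shown here purely as an assumption, is to keep only the variation corners on which the current design performs worst, so that simulation effort goes where constraints are most at risk. The function name, corner labels, and scores are all hypothetical.

```python
def prune_tasks(scores, keep):
    """Return the `keep` task names with the lowest scores.

    `scores` maps a task (one PVT corner) to a performance score where
    higher is better. Keeping the worst corners is an assumed pruning
    rule, not the rule confirmed by the paper.
    """
    ranked = sorted(scores, key=scores.get)  # ascending: worst first
    return ranked[:keep]


# Hypothetical corner labels and scores for illustration only.
corner_scores = {
    "tt_25C": 0.95,
    "ss_125C": 0.40,
    "ff_-40C": 0.70,
    "sf_25C": 0.85,
}
active = prune_tasks(corner_scores, keep=2)
```

With these made-up scores, only the two weakest corners would be simulated in the next iteration; the remaining corners would be re-checked once the design improves.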


Joint Generator-Ranker Learning for Natural Language Generation

Weizhou Shen , Yeyun Gong , Yelong Shen , Song Wang , Xiaojun Quan , Nan Duan , Weizhu Chen

Category: Natural Language Processing

2022-06-28

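Because of exposure bias, most existing natural language generation (NLG) models trained by maximizing the likelihood objective produce poor text at inference time. In this paper, to tackle this problem, we revisit the generate-then-rank framework and propose a joint generator-ranker (JGR) training algorithm for text generation tasks. In JGR, the generator model is trained by maximizing two objectives: the likelihood of the training corpus and the expected reward given by the ranker model. Meanwhile, the ranker model takes input samples from the generator model and learns to distinguish good samples from the generation pool. The generator and ranker models are optimized alternately until convergence. In the empirical study, the proposed JGR model achieves new state-of-the-art performance on five public benchmarks covering three popular generation tasks: summarization, question generation, and response generation. We will make the code, data, and models available at https://github.com/microsoft/advnlg.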
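The two generator objectives named in this abstract can be combined into a single scalar. The sketch below is only one plausible form, under stated assumptions: the expected reward is estimated over a small pool of sampled candidates whose probabilities are renormalized within the pool, and the weighting factor `alpha` is hypothetical. The paper's exact formulation may differ.

```python
import math


def generator_objective(gold_logprob, sample_logprobs, rewards, alpha=1.0):
    """Combine a likelihood term with an expected ranker reward.

    gold_logprob:    log-probability of the reference text.
    sample_logprobs: log-probabilities of sampled candidates.
    rewards:         ranker scores for the same candidates.
    alpha:           hypothetical weight on the reward term.
    """
    # Renormalize candidate probabilities within the sampled pool.
    weights = [math.exp(lp) for lp in sample_logprobs]
    total = sum(weights)
    expected_reward = sum(w / total * r for w, r in zip(weights, rewards))
    # The generator maximizes this value.
    return gold_logprob + alpha * expected_reward


# Hypothetical numbers for illustration only.
objective = generator_objective(
    gold_logprob=math.log(0.5),
    sample_logprobs=[math.log(0.5), math.log(0.25), math.log(0.25)],
    rewards=[1.0, 0.0, 0.5],
)
```

In alternating training, the ranker would be updated on the sampled pool between generator updates, so the rewards passed in here change from round to round.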


4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions

Patrick Wenzel , Nan Yang , Rui Wang , Niclas Zeller , Daniel Cremers

Category: Computer Vision

2022-12-31

In this paper, we present a novel visual SLAM and long-term localization benchmark for autonomous driving in challenging conditions based on the large-scale 4Seasons dataset. The proposed benchmark provides drastic appearance variations caused by seasonal changes and diverse weather and illumination conditions. While significant progress has been made in advancing visual SLAM on small-scale datasets with similar conditions, there is still a lack of unified benchmarks representative of real-world scenarios for autonomous driving. We introduce a new unified benchmark for jointly evaluating visual odometry, global place recognition, and map-based visual localization performance, which is crucial to successfully enable autonomous driving in any condition. The data were collected over more than one year, resulting in more than 300 km of recordings in nine different environments ranging from a multi-level parking garage to urban (including tunnels) to countryside and highway. We provide globally consistent reference poses with up to centimeter-level accuracy obtained from the fusion of direct stereo-inertial odometry with RTK GNSS. We evaluate the performance of several state-of-the-art visual odometry and visual localization baseline approaches on the benchmark and analyze their properties. The experimental results provide new insights into current approaches and show promising potential for future research. Our benchmark and evaluation protocols will be available at https://www.4seasons-dataset.com/.


Push-the-Boundary: Boundary-aware Feature Propagation for Semantic Segmentation of 3D Point Clouds

Shenglan Du , Nail Ibrahimli , Jantien Stoter , Julian Kooij , Liangliang Nan

Category: Computer Vision

2022-12-23

Feedforward fully convolutional neural networks currently dominate in semantic segmentation of 3D point clouds. Despite their great success, they suffer from the loss of local information at low-level layers, posing significant challenges to accurate scene segmentation and precise object boundary delineation. Prior works either address this issue by post-processing or jointly learn object boundaries to implicitly improve feature encoding of the networks. These approaches often require additional modules which are difficult to integrate into the original architecture. To improve the segmentation near object boundaries, we propose a boundary-aware feature propagation mechanism. This mechanism is achieved by exploiting a multi-task learning framework that aims to explicitly guide the boundaries to their original locations. With one shared encoder, our network outputs (i) boundary localization, (ii) prediction of directions pointing to the object's interior, and (iii) semantic segmentation, in three parallel streams. The predicted boundaries and directions are fused to propagate the learned features to refine the segmentation. We conduct extensive experiments on the S3DIS and SensatUrban datasets against various baseline methods, demonstrating that our proposed approach yields consistent improvements by reducing boundary errors. Our code is available at https://github.com/shenglandu/PushBoundary.


Learning to Detect and Segment for Open Vocabulary Object Detection

Tao Wang , Nan Li

Category: Computer Vision

2022-12-23

Open vocabulary object detection has been greatly advanced by the recent development of vision-language pretrained models, which help recognize novel objects given only semantic categories. Prior works mainly focus on transferring knowledge to object proposal classification and employ class-agnostic box and mask prediction. In this work, we propose CondHead, a principled dynamic network design to better generalize box regression and mask segmentation in the open vocabulary setting. The core idea is to conditionally parameterize the network heads on the semantic embedding, so that the model is guided by class-specific knowledge to better detect novel categories. Specifically, CondHead is composed of two streams of network heads, the dynamically aggregated head and the dynamically generated head. The former is instantiated with a set of static heads that are conditionally aggregated; these heads are optimized as experts and are expected to learn sophisticated prediction. The latter is instantiated with dynamically generated parameters and encodes general class-specific information. With such a conditional design, the detection model is bridged by the semantic embedding to offer strongly generalizable class-wise box and mask prediction. Our method brings significant improvement to state-of-the-art open vocabulary object detection methods with very minor overhead, e.g., it surpasses a RegionClip model by 3.0 detection AP on novel categories, with only 1.1% more computation.


Predicting Survival of Tongue Cancer Patients by Machine Learning Models

Angelos Vasilopoulos , Nan Miles Xi

Category: Machine Learning

2022-12-23

Tongue cancer is a common oral cavity malignancy that originates in the mouth and throat. Much effort has been invested in improving its diagnosis, treatment, and management. Surgical removal, chemotherapy, and radiation therapy remain the major treatments for tongue cancer. The survival of patients determines the treatment effect. Previous studies have identified certain survival and risk factors based on descriptive statistics, ignoring the complex, nonlinear relationships among clinical and demographic variables. In this study, we use five cutting-edge machine learning models and clinical data to predict the survival of tongue cancer patients after treatment. Five-fold cross-validation, bootstrap analysis, and permutation feature importance are applied to estimate and interpret model performance. The prognostic factors identified by our method are consistent with previous clinical studies. Our method is accurate, interpretable, and thus usable as additional evidence in tongue cancer treatment and management.
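Five-fold cross-validation, mentioned in this abstract, partitions the samples into five folds and uses each fold once as the held-out test set. The sketch below builds the index splits only; the function name is hypothetical, the samples are not shuffled, and nothing here reflects the authors' actual pipeline or data.

```python
def k_fold_indices(n_samples, k=5):
    """Return a list of (train_indices, test_indices) pairs for k folds.

    Fold sizes differ by at most one sample. Indices are taken in order,
    so shuffle the data beforehand if it is sorted in any way.
    """
    remainder = n_samples % k
    fold_sizes = [n_samples // k + (1 if i < remainder else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


# Hypothetical sample count for illustration only.
folds = k_fold_indices(12, k=5)
```

A model would be fitted on each training split and scored on the matching test split, and the five scores averaged to estimate performance.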


GENIE: Large Scale Pre-training for Text Generation with Diffusion Model

Zhenghao Lin , Yeyun Gong , Yelong Shen , Tong Wu , Zhihao Fan , Chen Lin , Weizhu Chen , Nan Duan

Category: Natural Language Processing | Machine Learning

2022-12-22

In this paper, we propose a large-scale language pre-training approach for text GENeration using dIffusion modEl, which is named GENIE. GENIE is a pre-trained sequence-to-sequence text generation model that combines Transformer and diffusion. The diffusion model accepts the latent information from the encoder, which is used to guide the denoising of the current time step. After multiple such denoising iterations, the diffusion model can restore the Gaussian noise to diverse output text that is controlled by the input text. Moreover, this architecture design also allows us to adopt large-scale pre-training on GENIE. We propose a novel pre-training method named continuous paragraph denoise based on the characteristics of the diffusion model. Extensive experiments on the XSum, CNN/DailyMail, and Gigaword benchmarks show that GENIE can achieve comparable performance with various strong baselines; in particular, the generation quality of GENIE is greatly improved after pre-training. We have also conducted extensive experiments on the generation diversity and parameter impact of GENIE. The code for GENIE will be made publicly available.


Trajectory Generation and Tracking Control for Aggressive Tail-Sitter Flights

Guozheng Lu , Yixi Cai , Nan Chen , Fanze Kong , Yunfan Ren , Fu Zhang

Category: Robotics

2022-12-22

We address the theoretical and practical problems related to the trajectory generation and tracking control of tail-sitter UAVs. Theoretically, we focus on the differential flatness property with full exploitation of actual UAV aerodynamic models, which lays a foundation for generating dynamically feasible trajectories and achieving high-performance tracking control. We have found that a tail-sitter is differentially flat with accurate aerodynamic models within the entire flight envelope, by specifying the coordinated flight condition and choosing the vehicle position as the flat output. This fundamental property allows us to fully exploit the high-fidelity aerodynamic models in trajectory planning and tracking control to achieve accurate tail-sitter flights. In particular, an optimization-based trajectory planner for tail-sitters is proposed to design high-quality, smooth trajectories with consideration of kinodynamic constraints, singularity-free constraints, and actuator saturation. The planned flat-output trajectory is transformed into a state trajectory in real time with consideration of wind in the environment. To track the state trajectory, a global, singularity-free, and minimally parameterized on-manifold MPC is developed, which fully leverages the accurate aerodynamic model to achieve high-accuracy trajectory tracking within the whole flight envelope. The effectiveness of the proposed framework is demonstrated through extensive real-world experiments in both indoor and outdoor field tests, including agile SE(3) flight through consecutive narrow windows requiring specific attitudes at speeds up to 10 m/s, typical tail-sitter maneuvers (transition, level flight, and loiter) at speeds up to 20 m/s, and extremely aggressive aerobatic maneuvers (Wingover, Loop, Vertical Eight, and Cuban Eight) with acceleration up to 2.5 g.
