Smart Paper Notes

Random cohort effects and age groups dependency structure for mortality modelling and forecasting: Mixed-effects time-series model approach

Ka Kin Lam , Bo Wang

Category: Machine Learning (Statistics)

2021-12-31

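Because population ageing has continued to grow in many developed countries over the past few decades, significant effort has gone into addressing longevity risk. The Cairns-Blake-Dowd (CBD) model, which includes cohort effect parameters in its parsimonious design, is one of the best-known approaches to modelling mortality at higher ages and longevity risk. This paper proposes a novel mixed-effects time-series approach to mortality modelling and forecasting that accounts for age-group dependence and random cohort effect parameters. The proposed model can reveal more information from mortality data and provides a natural quantification of model parameter uncertainty, without the pre-specified constraints otherwise required to estimate the cohort effect parameters. The capabilities of the proposed approach are demonstrated through two applications with empirical male and female mortality data. In numerical examples using mortality data from several developed countries, the approach shows marked improvements in forecast accuracy over the CBD model for short-, mid-, and long-term forecasts.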
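
For orientation, the CBD model with a cohort term is commonly written in the mortality literature in the form below. This is the standard textbook form, not the mixed-effects model proposed in the paper, and the symbols are assumptions taken from common usage: $q_{x,t}$ is the one-year death probability at age $x$ in year $t$, $\bar{x}$ is the mean age in the fitted range, $\kappa^{(1)}_t$ and $\kappa^{(2)}_t$ are period effects, and $\gamma_{t-x}$ is the cohort effect for birth year $t-x$.

```latex
\operatorname{logit}(q_{x,t})
  = \log\frac{q_{x,t}}{1 - q_{x,t}}
  = \kappa^{(1)}_t + \kappa^{(2)}_t \,(x - \bar{x}) + \gamma_{t-x}
```

In this fixed-effects form, $\gamma_{t-x}$ is usually estimated under identifiability constraints, which is the kind of pre-specified constraint the abstract says the proposed random-effects treatment avoids.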

Translated by Google Translate

Self-distillation with Batch Knowledge Ensembling Improves ImageNet Classification

Yixiao Ge , Xiao Zhang , Ching Lam Choi , Ka Chun Cheung , Peipei Zhao , Feng Zhu , Xiaogang Wang , Rui Zhao , Hongsheng Li

Category: Computer Vision

2021-04-27

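Recent studies of knowledge distillation have found that ensembling the "dark knowledge" from multiple teachers or students helps create better soft targets for training, but at the cost of significantly more computation and/or parameters. In this work, we present BAtch Knowledge Ensembling (BAKE), which produces refined soft targets for anchor images by propagating and ensembling the knowledge of the other samples in the same mini-batch. Specifically, for each sample of interest, the propagation of knowledge is weighted according to the inter-sample affinities, which are estimated on the fly with the current network. The propagated knowledge can then be ensembled to form a better soft target for distillation. In this way, our BAKE framework achieves online knowledge ensembling across multiple samples with only a single network. It requires minimal computational and memory overhead compared with existing knowledge ensembling methods. Extensive experiments demonstrate that the lightweight yet effective BAKE consistently boosts the classification performance of various architectures on multiple datasets, e.g., a significant +0.7% gain for Swin-T on ImageNet with only +1.5% computational overhead and zero additional parameters. BAKE not only improves the vanilla baselines but also surpasses the single-network state of the art on all benchmarks.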
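
The affinity-weighted propagation described above can be illustrated with a minimal NumPy sketch. It is a simplified one-step version written for illustration, not the authors' implementation: the function name `batch_ensembled_targets`, the mixing weight `omega`, the `temperature`, and the use of cosine similarity for the affinities are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def batch_ensembled_targets(features, logits, omega=0.5, temperature=4.0):
    """One-step knowledge propagation within a mini-batch (illustrative).

    features: (batch, dim) embeddings from the current network
    logits:   (batch, classes) predictions from the same network
    omega:    hypothetical weight on the knowledge of the other samples
    """
    # Cosine similarity between every pair of samples in the batch.
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarity = normed @ normed.T
    # Exclude each sample's similarity to itself before normalising.
    np.fill_diagonal(similarity, -np.inf)
    affinity = softmax(similarity, axis=1)           # rows sum to 1
    probs = softmax(logits / temperature, axis=1)    # softened predictions
    # Mix each sample's own prediction with the affinity-weighted
    # predictions of the other samples in the batch.
    return omega * (affinity @ probs) + (1.0 - omega) * probs

rng = np.random.default_rng(0)
targets = batch_ensembled_targets(rng.normal(size=(4, 8)), rng.normal(size=(4, 3)))
```

Each row of `targets` is still a probability distribution, so it can be used as a soft target in a standard distillation loss; in training, the targets would typically be treated as constants (no gradient flows through them).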

Peer Learning for Unbiased Scene Graph Generation

Liguang Zhou , Junjie Hu , Yuhongze Zhou , Tin Lun Lam , Yangsheng Xu

Category: Computer Vision

2022-12-31

In this paper, we propose a novel framework, dubbed peer learning, to deal with the problem of biased scene graph generation (SGG). This framework uses predicate sampling and consensus voting (PSCV) to encourage different peers to learn from each other, improving model diversity and mitigating bias in SGG. To address the heavily long-tailed distribution of predicate classes, we propose to use predicate sampling to divide and conquer this issue. As a result, the model is less biased and makes more balanced predicate predictions. Specifically, a single peer may not be sufficiently diverse to discriminate between different levels of predicate distributions. Therefore, we sample the data distribution into sub-distributions based on predicate frequency, selecting head, body, and tail classes to combine and feed to different peers as complementary predicate knowledge during training. The complementary predicate knowledge of these peers is then ensembled using a consensus voting strategy, which simulates a civilized voting process in our society that emphasizes the majority opinion and diminishes the minority opinion. This approach ensures that the learned representations of each peer are optimally adapted to the various data distributions. Extensive experiments on the Visual Genome dataset demonstrate that PSCV outperforms previous methods. We have established a new state of the art (SOTA) on the SGCls task by achieving a mean of 31.6.

Attentional Graph Convolutional Network for Structure-aware Audio-Visual Scene Classification

Liguang Zhou , Yuhongze Zhou , Xiaonan Qi , Junjie Hu , Tin Lun Lam , Yangsheng Xu

Category: Computer Vision

2022-12-31

Audio-visual scene understanding is a challenging problem due to the unstructured spatial-temporal relations that exist in audio signals, and the spatial layouts of different objects and various texture patterns in visual images. Recently, many studies have focused on abstracting features from convolutional neural networks, while the learning of explicit, semantically relevant frames of sound signals and visual images has been overlooked. To this end, we present an end-to-end framework, namely the attentional graph convolutional network (AGCN), for structure-aware audio-visual scene representation. First, the spectrogram of the sound and the input image are processed by a backbone network for feature extraction. Then, to build multi-scale hierarchical information from the input features, we use an attention fusion mechanism to aggregate features from multiple layers of the backbone network. Notably, to represent the salient regions and contextual information of audio-visual inputs well, the salient acoustic graph (SAG), contextual acoustic graph (CAG), salient visual graph (SVG), and contextual visual graph (CVG) are constructed for the audio-visual scene representation. Finally, the constructed graphs pass through a graph convolutional network for structure-aware audio-visual scene recognition. Extensive experiments on audio, visual, and audio-visual scene recognition datasets show that AGCN achieves promising results. Visualizations of the graphs on the spectrograms and images are presented to show that the proposed CAG/SAG and CVG/SVG focus on the salient and semantically relevant regions.

GraphCast: Learning skillful medium-range global weather forecasting

Remi Lam , Alvaro Sanchez-Gonzalez , Matthew Willson , Peter Wirnsberger , Meire Fortunato , Alexander Pritzel , Suman Ravuri , Timo Ewalds , Ferran Alet , Zach Eaton-Rosen

Category: Machine Learning

2022-12-24

We introduce a machine-learning (ML)-based weather simulator--called "GraphCast"--which outperforms the most accurate deterministic operational medium-range weather forecasting system in the world, as well as all previous ML baselines. GraphCast is an autoregressive model, based on graph neural networks and a novel high-resolution multi-scale mesh representation, which we trained on historical weather data from the European Centre for Medium-Range Weather Forecasts (ECMWF)'s ERA5 reanalysis archive. It can make 10-day forecasts, at 6-hour time intervals, of five surface variables and six atmospheric variables, each at 37 vertical pressure levels, on a 0.25-degree latitude-longitude grid, which corresponds to roughly 25 x 25 kilometer resolution at the equator. Our results show GraphCast is more accurate than ECMWF's deterministic operational forecasting system, HRES, on 90.0% of the 2760 variable and lead time combinations we evaluated. GraphCast also outperforms the most accurate previous ML-based weather forecasting model on 99.2% of the 252 targets it reported. GraphCast can generate a 10-day forecast (35 gigabytes of data) in under 60 seconds on Cloud TPU v4 hardware. Unlike traditional forecasting methods, ML-based forecasting scales well with data: by training on bigger, higher quality, and more recent data, the skill of the forecasts can improve. Together these results represent a key step forward in complementing and improving weather modeling with ML, open new opportunities for fast, accurate forecasting, and help realize the promise of ML-based simulation in the physical sciences.
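
The forecast setup described above (an autoregressive model stepped at 6-hour intervals out to 10 days) can be sketched as a simple rollout loop. This is an illustrative sketch only: `rollout` and the toy step function are hypothetical stand-ins for the learned model, the single-state input is a simplification, and the tiny grid is used purely to keep the example light.

```python
import numpy as np

STEP_HOURS = 6
FORECAST_DAYS = 10
NUM_STEPS = FORECAST_DAYS * 24 // STEP_HOURS   # 40 autoregressive steps
NUM_CHANNELS = 5 + 6 * 37                      # 5 surface + 6 x 37 atmospheric = 227

def rollout(step_fn, initial_state, num_steps=NUM_STEPS):
    """Apply step_fn repeatedly, feeding each prediction back in as input."""
    states = []
    state = initial_state
    for _ in range(num_steps):
        state = step_fn(state)      # predict the state 6 hours ahead
        states.append(state)
    return np.stack(states)         # (num_steps, lat, lon, channels)

def toy_step(state):
    """Hypothetical stand-in for the learned one-step model."""
    return 0.9 * state

# Toy 4 x 8 grid; a 0.25-degree global grid would be far larger.
forecast = rollout(toy_step, np.ones((4, 8, NUM_CHANNELS)))
```

Because each step consumes the previous prediction, errors can compound over the 40 steps, which is why lead time is part of every evaluation target in the abstract.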

Deep Learning Methods for Calibrated Photometric Stereo and Beyond: A Survey

Yakun Ju , Kin-Man Lam , Wuyuan Xie , Huiyu Zhou , Junyu Dong , Boxin Shi

Category: Computer Vision | Artificial Intelligence

2022-12-16

Photometric stereo recovers the surface normals of an object from multiple images with varying shading cues, i.e., by modeling the relationship between surface orientation and intensity at each pixel. Photometric stereo excels in per-pixel resolution and fine reconstruction detail. However, it is a complicated problem because of the non-linear relationship caused by non-Lambertian surface reflectance. Recently, various deep learning methods have shown a powerful ability to handle non-Lambertian surfaces in photometric stereo. This paper provides a comprehensive review of existing deep learning-based calibrated photometric stereo methods. We first analyze these methods from different perspectives, including input processing, supervision, and network architecture. We then summarize the performance of deep learning photometric stereo models on the most widely used benchmark data set, which demonstrates the advanced performance of deep learning-based photometric stereo methods. Finally, we give suggestions and propose future research trends based on the limitations of existing models.
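
As background for the relationship between surface orientation and intensity, the sketch below shows the classical calibrated Lambertian baseline, not any of the deep learning methods the survey covers. It assumes known unit light directions, no shadows, and the image model intensity = albedo * (light direction · normal); the function name is hypothetical.

```python
import numpy as np

def lambertian_photometric_stereo(intensities, light_dirs):
    """Least-squares normals and albedo under the Lambertian assumption.

    intensities: (num_lights, num_pixels) observed pixel intensities
    light_dirs:  (num_lights, 3) known unit light directions
    """
    # Solve light_dirs @ g = intensities for g = albedo * normal, per pixel.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, num_pixels)
    albedo = np.linalg.norm(g, axis=0)
    normals = g / np.maximum(albedo, 1e-12)   # guard against zero albedo
    return normals.T, albedo

# Synthetic check: one pixel facing the camera with albedo 0.8.
lights = np.array([[0.0, 0.0, 1.0],
                   [0.6, 0.0, 0.8],
                   [0.0, 0.6, 0.8],
                   [-0.6, 0.0, 0.8]])
true_normal = np.array([0.0, 0.0, 1.0])
observed = 0.8 * (lights @ true_normal)[:, None]
normals, albedo = lambertian_photometric_stereo(observed, lights)
```

The linear solve works only because the Lambertian model is linear in albedo times normal; non-Lambertian reflectance breaks that linearity, which is the difficulty the abstract points to.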

TRIP: Triangular Document-level Pre-training for Multilingual Language Models

Hongyuan Lu , Haoyang Huang , Shuming Ma , Dongdong Zhang , Wai Lam , Furu Wei

Category: Natural Language Processing

2022-12-15

Despite the current success of multilingual pre-training, most prior work focuses on leveraging monolingual data or bilingual parallel data and overlooks the value of trilingual parallel data. This paper presents Triangular Document-level Pre-training (TRIP), which is the first in the field to extend conventional monolingual and bilingual pre-training to a trilingual setting by (i) grafting the same documents in two languages into one mixed document, and (ii) predicting the remaining language as the reference translation. Our experiments on document-level MT and cross-lingual abstractive summarization show that TRIP brings gains of up to 3.65 d-BLEU points and 6.2 ROUGE-L points on three multilingual document-level machine translation benchmarks and one cross-lingual abstractive summarization benchmark, including multiple strong state-of-the-art (SOTA) scores. In-depth analysis indicates that TRIP improves document-level machine translation and captures better document contexts in at least three characteristics: (i) tense consistency, (ii) noun consistency, and (iii) conjunction presence.

Generalizing DP-SGD with Shuffling and Batching Clipping

Marten van Dijk , Phuong Ha Nguyen , Toan N. Nguyen , Lam M. Nguyen

Category: Machine Learning

2022-12-12

Classical differentially private DP-SGD implements individual clipping with random subsampling, which forces a mini-batch SGD approach. We provide a general differentially private algorithmic framework that goes beyond DP-SGD and allows any first-order optimizer (e.g., classical SGD and momentum-based SGD approaches) in combination with batch clipping, which clips an aggregate of computed gradients rather than summing clipped gradients (as is done in individual clipping). The framework also admits sampling techniques beyond random subsampling, such as shuffling. Our DP analysis follows the $f$-DP approach and introduces a new proof technique that also allows us to analyse group privacy. In particular, for $E$ epochs of work and groups of size $g$, we show a $\sqrt{g E}$ DP dependency for batch clipping with shuffling. This is much better than the previously anticipated linear dependency on $g$, and much better than the previously expected square-root dependency on the total number of rounds within $E$ epochs, which is generally much larger than $\sqrt{E}$.
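
The difference between individual clipping and batch clipping can be made concrete with a small NumPy sketch. It is illustrative rather than the paper's algorithm: the function names, the choice of a plain sum as the aggregate, and the Gaussian noise scale `sigma * c` are assumptions.

```python
import numpy as np

def clip(vector, c):
    """Scale vector so its L2 norm is at most c."""
    norm = np.linalg.norm(vector)
    return vector * min(1.0, c / norm) if norm > 0 else vector

def individual_clipping(per_example_grads, c, sigma, rng):
    """DP-SGD style: clip each gradient, sum the clipped gradients, add noise."""
    clipped = np.stack([clip(g, c) for g in per_example_grads])
    total = clipped.sum(axis=0)
    return total + rng.normal(0.0, sigma * c, size=total.shape)

def batch_clipping(per_example_grads, c, sigma, rng):
    """Batch clipping: aggregate first, clip the aggregate once, add noise."""
    aggregate = per_example_grads.sum(axis=0)   # assumed aggregate: plain sum
    clipped = clip(aggregate, c)
    return clipped + rng.normal(0.0, sigma * c, size=clipped.shape)

grads = np.array([[3.0, 4.0], [6.0, 8.0]])
rng = np.random.default_rng(0)
# With sigma = 0 the noise vanishes, which isolates the effect of clipping.
summed_clipped = individual_clipping(grads, c=1.0, sigma=0.0, rng=rng)  # [1.2, 1.6]
clipped_sum = batch_clipping(grads, c=1.0, sigma=0.0, rng=rng)          # [0.6, 0.8]
```

Individual clipping needs per-example gradients, whereas batch clipping only needs the aggregate, which is what lets the framework wrap an arbitrary first-order optimizer step.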

From Clozing to Comprehending: Retrofitting Pre-trained Language Model to Pre-trained Machine Reader

Weiwen Xu , Xin Li , Wenxuan Zhang , Meng Zhou , Lidong Bing , Wai Lam , Luo Si

Category: Natural Language Processing

2022-12-09

We present the Pre-trained Machine Reader (PMR), a novel method to retrofit Pre-trained Language Models (PLMs) into Machine Reading Comprehension (MRC) models without acquiring labeled data. PMR is capable of resolving the discrepancy between model pre-training and downstream fine-tuning of existing PLMs, and provides a unified solver for tackling various extraction tasks. To achieve this, we construct a large volume of general-purpose, high-quality MRC-style training data with the help of Wikipedia hyperlinks, and design a Wiki Anchor Extraction task to guide the MRC-style pre-training process. Although conceptually simple, PMR is particularly effective in solving extraction tasks, including Extractive Question Answering and Named Entity Recognition, where it shows tremendous improvements over previous approaches, especially under low-resource settings. Moreover, by viewing sequence classification as a special case of extraction in our MRC formulation, PMR is even capable of extracting high-quality rationales to explain the classification process, making its predictions more explainable.

TMSTC*: A Turn-minimizing Algorithm For Multi-robot Coverage Path Planning

Junjie Lu , Bi Zeng , Jingtao Tang , Tin Lun Lam

Category: Robotics

2022-12-05

Coverage path planning is a major application for mobile robots, which requires robots to move along a planned path to cover the entire map. For large-scale tasks, coverage path planning benefits greatly from multiple robots. In this paper, we describe Turn-minimizing Multirobot Spanning Tree Coverage Star (TMSTC*), an improved multirobot coverage path planning (mCPP) algorithm based on MSTC*. Our algorithm partitions the map into a minimum number of bricks, which serve as the tree's branches, and thereby transforms the problem into finding the maximum independent set of a bipartite graph. We then connect the bricks with a greedy strategy to form a tree, aiming to reduce the number of turns in the corresponding circumnavigating coverage path. Our experimental results show that our approach enables multiple robots to make fewer turns and thus complete terrain coverage tasks faster than other popular algorithms.
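
To make the turn-count objective concrete, the sketch below counts direction changes along a grid path and compares two paths that cover the same 2 x 3 map. It illustrates only the metric being reduced, not the TMSTC* algorithm itself; `count_turns` and the example paths are hypothetical.

```python
def count_turns(path):
    """Count direction changes along a path of (row, col) grid cells."""
    turns = 0
    for a, b, c in zip(path, path[1:], path[2:]):
        step_in = (b[0] - a[0], b[1] - a[1])
        step_out = (c[0] - b[0], c[1] - b[1])
        if step_in != step_out:
            turns += 1
    return turns

# Two paths covering the same 2 x 3 grid of cells.
sweep_along_rows = [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0)]
sweep_along_cols = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 2), (1, 2)]

row_turns = count_turns(sweep_along_rows)   # sweeps along the long axis
col_turns = count_turns(sweep_along_cols)   # sweeps along the short axis
```

Both paths visit every cell exactly once, yet sweeping along the longer axis needs fewer turns, which is the intuition behind partitioning the map into long bricks.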
