Synthetic datasets are often used to pretrain end-to-end optical flow networks, due to the lack of a large amount of labeled, real-scene data. But major drops in accuracy occur when moving from synthetic to real scenes. How do we better transfer the knowledge learned from synthetic to real domains? To this end, we propose CLIP-FLow, a semi-supervised iterative pseudo-labeling framework to transfer the pretraining knowledge to the target real domain. We leverage large-scale, unlabeled real data to facilitate transfer learning with the supervision of iteratively updated pseudo ground truth labels, bridging the domain gap between the synthetic and the real. In addition, we propose a contrastive flow loss between reference features and features warped by the pseudo ground truth flow, to further reinforce accurate matches and suppress mismatches caused by motion, occlusion, or noisy pseudo labels. We adopt RAFT as the backbone and obtain an F1-all error of 4.11% on the KITTI 2015 benchmark, i.e. a 19% error reduction from RAFT (5.10%), ranking 2$^{nd}$ at the time of submission. Our framework can also be extended to other models, e.g. CRAFT, reducing the F1-all error from 4.79% to 4.66% on the KITTI 2015 benchmark.
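As a rough sketch of this contrastive flow loss (our reading of the abstract, not the paper's exact formulation): the reference-frame feature at each pixel is treated as a query whose positive is the target-frame feature warped back by the pseudo ground truth flow at the same pixel, with all other pixels serving as negatives. Occlusion handling and the pseudo-label update loop are omitted.

```python
import torch
import torch.nn.functional as F

def warp(feat, flow):
    """Backward-warp feat (B, C, H, W) by flow (B, 2, H, W) via grid_sample."""
    b, _, h, w = feat.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(feat)          # (2, H, W), (x, y)
    coords = base.unsqueeze(0) + flow                             # sampling positions
    grid = torch.stack((2 * coords[:, 0] / (w - 1) - 1,           # normalize to [-1, 1]
                        2 * coords[:, 1] / (h - 1) - 1), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(feat, grid, align_corners=True)

def contrastive_flow_loss(f_ref, f_tgt, pseudo_flow, temperature=0.07):
    """Per-pixel InfoNCE sketch: the reference feature should match the
    warped target feature at the same location better than at any other."""
    f_warp = warp(f_tgt, pseudo_flow)
    b, c, h, w = f_ref.shape
    q = F.normalize(f_ref.flatten(2), dim=1)                      # (B, C, HW)
    k = F.normalize(f_warp.flatten(2), dim=1)
    logits = torch.einsum("bcn,bcm->bnm", q, k) / temperature     # (B, HW, HW)
    targets = torch.arange(h * w, device=f_ref.device).expand(b, -1)
    return F.cross_entropy(logits.flatten(0, 1), targets.flatten())
```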
Traditional federated optimization methods perform poorly (i.e., with reduced accuracy), especially on highly skewed data. In this paper, we investigate label distribution skew in federated learning (FL), where the distribution of labels varies across clients. First, we study label distribution skew from a statistical view. We demonstrate, both theoretically and empirically, that previous methods based on softmax cross-entropy are not suitable, as they can cause local models to heavily overfit to minority classes and missing classes. In addition, we theoretically introduce a deviation bound to measure the deviation of the gradient after local updates. Finally, we propose \textbf{Fed}erated learning via \textbf{L}ogits \textbf{C}alibration (FedLC), which calibrates the logits before softmax cross-entropy according to the occurrence probability of each class. FedLC applies a fine-grained, calibrated cross-entropy loss to the local update by adding a pairwise label margin. Extensive experiments on federated datasets and real-world datasets demonstrate that FedLC leads to a more accurate global model and greatly improved performance. Furthermore, integrating other FL methods into our approach can further enhance the performance of the global model.
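A minimal sketch of logit calibration before the softmax cross-entropy, in the spirit of the abstract. The margin form below (a negative power of the per-class sample count, scaled by a hypothetical coefficient `tau`) is our illustrative assumption; the paper defines the exact pairwise label margin.

```python
import torch
import torch.nn.functional as F

def calibrated_cross_entropy(logits, targets, class_counts, tau=1.0):
    """Cross-entropy with per-class logit calibration (sketch).

    A margin that grows as a class becomes rarer on this client is
    subtracted from the logits, discouraging the local model from
    drifting toward majority classes. The count**-0.25 form is an
    assumption for illustration, not FedLC's exact definition.
    """
    margin = tau * class_counts.float().clamp(min=1).pow(-0.25)  # (num_classes,)
    return F.cross_entropy(logits - margin.unsqueeze(0), targets)

# Hypothetical local update on one client, where class_counts[c] is the
# number of that client's samples with label c (note the missing class 2):
logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
class_counts = torch.tensor([50, 3, 0, 12, 7, 90, 1, 4, 20, 2])
loss = calibrated_cross_entropy(logits, targets, class_counts)
```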
Land cover classification is a multi-class segmentation task that classifies each pixel into one of several natural or human-made categories of the earth's surface, such as water, soil, natural vegetation, crops, and human infrastructure. Limited by hardware computational resources and memory capacity, most existing studies preprocess the original remote sensing images by downsampling them or cropping them into small patches of less than 512×512 pixels before feeding them to a deep neural network. However, downsampling incurs a loss of spatial detail, makes small segments hard to discriminate, and reverses the spatial resolution progress obtained through decades of effort. Cropping images into small patches causes a loss of long-range context information, and restoring the predicted results to the original size brings extra latency. In response to the above weaknesses, we present an efficient lightweight semantic segmentation network called MKANet. Targeting the characteristics of top-view high-resolution remote sensing imagery, MKANet utilizes shared kernels to simultaneously and equally handle ground segments of inconsistent scales, and also adopts a parallel, shallow architecture to boost inference speed and to support image patches more than 10× larger. To enhance the discrimination of boundaries and small segments, we also propose a method that captures category-impurity areas, exploits boundary information, and exerts an extra penalty on misjudged boundaries and small segments. Both the visual interpretations and the quantitative metrics of extensive experiments demonstrate that MKANet achieves state-of-the-art accuracy on two land cover classification datasets and infers 2× faster than other competitive lightweight networks. All these merits highlight the potential of MKANet in practical applications.
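To make the shared-kernel idea concrete, here is a toy block under one interpretation of the abstract: the same convolution weights are applied to several pooled copies of the feature map, so objects of inconsistent scales are processed by identical kernels. The names and structure are illustrative, not MKANet's actual layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedKernelBlock(nn.Module):
    """Illustrative shared-kernel multi-scale block (not MKANet's design).

    One 3x3 convolution is reused across downsampled copies of the input;
    branch outputs are upsampled back to full resolution and fused."""
    def __init__(self, channels, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)   # shared weights
        self.fuse = nn.Conv2d(channels * len(scales), channels, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        branches = []
        for s in self.scales:
            y = F.avg_pool2d(x, s) if s > 1 else x
            y = F.relu(self.conv(y))                 # same kernel at every scale
            if s > 1:
                y = F.interpolate(y, size=(h, w), mode="bilinear",
                                  align_corners=False)
            branches.append(y)
        return self.fuse(torch.cat(branches, dim=1))

out = SharedKernelBlock(32)(torch.randn(1, 32, 128, 128))  # (1, 32, 128, 128)
```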
In linear regression, SLOPE is a new convex analysis method that generalizes the Lasso via the sorted L1 penalty: larger fitted coefficients are penalized more heavily. This magnitude-dependent regularization requires a penalty sequence $\lambda$ as input, rather than the scalar penalty of the Lasso case, making its design extremely expensive in computation. In this paper, we propose two efficient algorithms to design the possibly high-dimensional SLOPE penalty so as to minimize the mean squared error. For Gaussian data matrices, we propose a first-order projected gradient descent (PGD) under the approximate message passing (AMP) regime. For general data matrices, we present a zeroth-order coordinate descent (CD) to design a sub-class of SLOPE, referred to as the k-level SLOPE. Our CD allows a useful trade-off between accuracy and computation speed. We demonstrate the performance of SLOPE with our designed penalties via extensive experiments on synthetic data and real-world datasets.
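The sorted-L1 penalty at the heart of SLOPE, and the k-level restriction of the $\lambda$ sequence, can be written down directly. The design algorithms (PGD, CD) are the paper's contribution and are not reproduced here; `k_level_lambda` is a hypothetical helper for illustration.

```python
import numpy as np

def slope_penalty(beta, lam):
    """Sorted-L1 (SLOPE) penalty: sum_i lam[i] * |beta|_(i).

    lam must be non-increasing and non-negative; the i-th largest
    |coefficient| is matched with the i-th largest penalty, so bigger
    fitted coefficients are penalized more."""
    abs_sorted = np.sort(np.abs(beta))[::-1]       # |beta| in decreasing order
    return float(np.dot(lam, abs_sorted))

def k_level_lambda(p, levels):
    """Hypothetical helper: expand k (value, count) pairs into a length-p
    sequence, the 'k-level SLOPE' sub-class where lambda takes only k
    distinct values, e.g. levels=[(2.0, 10), (1.0, 90)] for p = 100."""
    lam = np.concatenate([np.full(c, v) for v, c in levels])
    assert len(lam) == p and np.all(np.diff(lam) <= 0)
    return lam

beta = np.array([0.5, -3.0, 1.2, 0.0])
lam = k_level_lambda(4, [(2.0, 2), (1.0, 2)])      # lam = [2, 2, 1, 1]
print(slope_penalty(beta, lam))                    # 2*3.0 + 2*1.2 + 1*0.5 = 8.9
```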
Crowdsourcing, in which human intelligence and productivity are dynamically mobilized to tackle tasks too complex for automation alone to handle, has grown to be an important research topic and inspired new businesses (e.g., Uber, Airbnb). Over the years, crowdsourcing has morphed from providing a platform where workers and tasks can be matched up manually into one which leverages data-driven algorithmic management approaches powered by artificial intelligence (AI) to achieve increasingly sophisticated optimization objectives. In this paper, we provide a survey presenting a systematic overview of how AI can empower crowdsourcing, which we refer to as AI-Empowered Crowdsourcing (AIEC). We propose a taxonomy which divides algorithmic crowdsourcing into three major areas: 1) task delegation, 2) motivating workers, and 3) quality control, focusing on the major objectives which need to be accomplished. We discuss the limitations and insights of work in each of these areas, and curate the open research challenges to highlight promising future directions.
Despite significant advances, the performance of state-of-the-art continual learning approaches hinges on the unrealistic scenario of fully labeled data. In this paper, we tackle this challenge and propose an approach for continual semi-supervised learning, a setting where not all the data samples are labeled. An underlying issue in this scenario is that the model forgets the representations of unlabeled data and overfits the labeled ones. We leverage the power of nearest-neighbor classifiers to non-linearly partition the feature space and learn a strong representation for the current task, as well as to distill relevant information from previous tasks. We perform a thorough experimental evaluation and show that our method outperforms all the existing approaches by large margins, setting a strong state of the art for the continual semi-supervised learning paradigm. For example, on CIFAR100 we surpass several existing methods even when using at least 30 times less supervision (0.8% vs. 25% of annotations).
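The nearest-neighbor classification over learned features can be pictured with a minimal sketch (a toy stand-in; the actual method combines this with representation learning and distillation from previous tasks):

```python
import torch
import torch.nn.functional as F

def knn_predict(query_feats, support_feats, support_labels, k=5):
    """Classify queries by the majority label of their k nearest support
    features under cosine similarity. Illustrative only: the support set
    would hold (labeled) features from current and previous tasks."""
    q = F.normalize(query_feats, dim=1)
    s = F.normalize(support_feats, dim=1)
    sims = q @ s.t()                        # (Q, S) cosine similarities
    knn = sims.topk(k, dim=1).indices       # indices of k nearest supports
    votes = support_labels[knn]             # (Q, k) neighbor labels
    return votes.mode(dim=1).values         # majority vote per query

support = torch.randn(100, 64)              # hypothetical feature bank
labels = torch.randint(0, 10, (100,))
queries = torch.randn(8, 64)
pred = knn_predict(queries, support, labels)
```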
Learning semantic-rich representations from raw unlabeled time series data is critical for downstream tasks such as classification and forecasting. Contrastive learning has recently shown promising representation learning capability in the absence of expert annotations. However, existing contrastive approaches generally treat each instance independently, which leads to false negative pairs that share the same semantics. To tackle this problem, we propose MHCCL, a Masked Hierarchical Cluster-wise Contrastive Learning model, which exploits the semantic information contained in a hierarchy of multiple latent partitions of multivariate time series. Motivated by the observation that fine-grained clustering preserves higher purity while coarse-grained clustering reflects higher-level semantics, we propose a novel downward masking strategy that filters out false negatives and supplements positives by incorporating the multi-granularity information from the clustering hierarchy. In addition, a novel upward masking strategy is designed in MHCCL to remove the outliers of clusters at each partition in order to refine the prototypes, which helps speed up the hierarchical clustering process and improves clustering quality. We conduct experimental evaluations on seven widely used multivariate time series datasets. The results demonstrate the superiority of MHCCL over state-of-the-art approaches for unsupervised time series representation learning.
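One way to picture the downward masking is as a contrastive loss in which candidate negatives that fall in the same cluster as the anchor are removed from the denominator. A toy sketch under that assumption, far from the full MHCCL procedure (which operates over a whole clustering hierarchy with prototype refinement):

```python
import torch
import torch.nn.functional as F

def masked_info_nce(z1, z2, cluster_ids, temperature=0.1):
    """Contrastive loss for two views z1, z2 of the same N instances,
    where candidate negatives sharing the anchor's cluster are masked
    out (a rough 'downward masking' sketch). Shapes: (N, D)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / temperature              # (N, N); diagonal = positives
    same = cluster_ids.unsqueeze(0) == cluster_ids.unsqueeze(1)
    mask = same & ~torch.eye(len(z1), dtype=torch.bool)  # same-cluster non-positives
    sim = sim.masked_fill(mask, float("-inf"))   # drop likely false negatives
    targets = torch.arange(len(z1))
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(16, 32), torch.randn(16, 32)
clusters = torch.randint(0, 4, (16,))            # assignments from one partition
loss = masked_info_nce(z1, z2, clusters)
```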
Missing data are ubiquitous in real-world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. In particular, high-dimensional incomplete data with a moderate sample size, such as multi-omics data, present daunting challenges. Imputation is arguably the most popular method for handling missing data, though existing imputation methods have a number of limitations. Single imputation methods, such as matrix completion, do not adequately account for imputation uncertainty and hence yield improper statistical inference. In contrast, multiple imputation (MI) methods allow for proper inference, but existing methods do not perform well in high-dimensional settings. Our work aims to address these significant methodological gaps by leveraging recent advances in the neural network Gaussian process (NNGP) from a Bayesian viewpoint. We propose two NNGP-based MI methods, namely MI-NNGP, that draw multiple imputations of the missing values from a joint (posterior predictive) distribution. The MI-NNGP methods are shown to significantly outperform existing state-of-the-art methods on synthetic and real datasets, in terms of imputation error, statistical inference, robustness to missing rates, and computation costs, under the three missing data mechanisms MCAR, MAR, and MNAR.
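Whatever model generates the M completed datasets, MI-based inference then pools the per-dataset estimates. The standard pooling step (Rubin's rules, independent of the NNGP specifics) looks like this:

```python
import numpy as np

def rubins_rules(estimates, variances):
    """Pool M point estimates and their within-imputation variances from
    M completed datasets (standard Rubin's rules).

    Returns the pooled estimate and total variance T = W + (1 + 1/M) * B,
    with W the mean within-imputation variance and B the
    between-imputation variance."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    q_bar = estimates.mean()                 # pooled point estimate
    w = variances.mean()                     # within-imputation variance
    b = estimates.var(ddof=1)                # between-imputation variance
    t = w + (1.0 + 1.0 / m) * b              # total variance
    return q_bar, t

# e.g. a regression coefficient estimated on M = 5 imputed datasets
est = [0.82, 0.79, 0.85, 0.80, 0.84]
var = [0.010, 0.012, 0.011, 0.009, 0.013]
print(rubins_rules(est, var))
```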
Compared to the great progress of large-scale vision transformers (ViTs) in recent years, large-scale models based on convolutional neural networks (CNNs) are still in an early state. This work presents a new large-scale CNN-based foundation model, termed InternImage, which can obtain gains from increasing parameters and training data, as ViTs do. Different from recent CNNs that focus on large dense kernels, InternImage takes deformable convolution as its core operator, so that our model not only has the large effective receptive field required for downstream tasks such as detection and segmentation, but also performs adaptive spatial aggregation conditioned on the input and task information. As a result, the proposed InternImage reduces the strict inductive bias of traditional CNNs and makes it possible to learn stronger and more robust patterns from massive data with large-scale parameters, like ViTs. The effectiveness of our model is proven on challenging benchmarks including ImageNet, COCO, and ADE20K. Notably, InternImage-H achieves a new record of 65.4 mAP on COCO test-dev. The code will be released at https://github.com/OpenGVLab/InternImage.
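For readers unfamiliar with the core operator, here is a toy deformable-convolution block built on torchvision's DeformConv2d, where a plain convolution predicts the per-position sampling offsets. InternImage actually uses an extended DCNv3-style operator, so this only illustrates the basic idea of input-conditioned sampling locations:

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformBlock(nn.Module):
    """Toy deformable-convolution block (not InternImage's actual block).

    A regular conv predicts 2 offsets (dx, dy) per kernel tap per output
    location; the deformable conv then samples the input at those
    data-dependent positions instead of a fixed grid."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.offset = nn.Conv2d(in_ch, 2 * k * k, k, padding=k // 2)
        self.dconv = DeformConv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):
        return self.dconv(x, self.offset(x))

x = torch.randn(1, 16, 32, 32)
y = DeformBlock(16, 32)(x)    # (1, 32, 32, 32)
```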
Adversarial perturbation plays a significant role in the field of adversarial robustness, where it is obtained by solving a maximization problem over the input data. We show that the backward propagation of such optimization can be accelerated by $2\times$ (and thus the overall optimization, including the forward propagation, by $1.5\times$), without any utility drop, if we compute only the output gradient but not the parameter gradient during the backward propagation.
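In PyTorch terms, a minimal sketch of the idea, assuming a generic model and loss (the measured speedup in the abstract comes from skipping the weight-gradient computation inside each layer's backward pass, which this pattern never requests):

```python
import torch

def input_grad_only(model, x, y, loss_fn):
    """Compute the gradient of the loss w.r.t. the input only.

    Adversarial perturbation needs dL/dx, never dL/dtheta, so parameter
    gradients are never requested: params are frozen, and
    torch.autograd.grad is asked only for the gradient at x."""
    for p in model.parameters():
        p.requires_grad_(False)            # no parameter gradients needed
    x = x.clone().requires_grad_(True)
    loss = loss_fn(model(x), y)
    (gx,) = torch.autograd.grad(loss, x)
    return gx

# One FGSM-style step using only the input gradient:
model = torch.nn.Linear(10, 3)
x, y = torch.randn(4, 10), torch.randint(0, 3, (4,))
gx = input_grad_only(model, x, y, torch.nn.functional.cross_entropy)
x_adv = x + 0.03 * gx.sign()
```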