Photorealistic style transfer aims to transfer the artistic style of an image onto an input image or video while keeping photorealism. In this paper, we argue that it is the summary statistics matching scheme in existing algorithms that leads to unrealistic stylization. To avoid the popular Gram loss, we propose a self-supervised style transfer framework, named ColoristaNet, which contains a style removal part and a style restoration part. The style removal network removes the original image styles, and the style restoration network recovers image styles in a supervised manner. Meanwhile, to address the problems in current feature transformation methods, we propose decoupled instance normalization, which decomposes feature transformation into style whitening and restylization. It works well in ColoristaNet and can transfer image styles efficiently while keeping photorealism. To ensure temporal coherency, we also incorporate optical flow methods and ConvLSTM to embed contextual information. Experiments demonstrate that ColoristaNet achieves better stylization effects than state-of-the-art algorithms.
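A minimal sketch of the style-whitening-plus-restylization idea, written as a per-channel instance-normalization transform in the spirit of AdaIN; the function name, shapes, and exact statistics used here are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def decoupled_instance_norm(content_feat, style_feat, eps=1e-5):
    """Hypothetical sketch of decoupled instance normalization:
    (1) style whitening: strip per-channel statistics from the content
        features; (2) restylization: re-apply the style features' statistics.
    Both inputs have shape (C, H, W)."""
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True)
    whitened = (content_feat - c_mean) / (c_std + eps)   # style whitening
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)
    return whitened * s_std + s_mean                     # restylization
```

After the transform, each channel of the output carries the style features' first- and second-order statistics while retaining the content's spatial structure.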
We present EasyRec, an easy-to-use, extensible, and efficient recommendation framework for building industrial recommender systems. Our EasyRec framework is superior in the following aspects: first, EasyRec adopts a modular and pluggable design pattern to reduce the effort of building customized models; second, EasyRec implements hyperparameter optimization and feature selection algorithms to improve model performance automatically; third, EasyRec applies online learning to adapt quickly to continuously changing data distributions. The code is released at: https://github.com/alibaba/easyrec.
In federated learning (FL), model performance typically suffers from client drift induced by data heterogeneity, and mainstream work focuses on correcting that drift. We propose a different approach, named Virtual Homogeneity Learning (VHL), to directly "rectify" the data heterogeneity. In particular, VHL conducts FL with a virtual homogeneous dataset carefully crafted to satisfy two conditions: containing no private information and being separable. The virtual dataset can be generated from pure noise shared across clients, and is designed to calibrate the features learned from heterogeneous clients. Theoretically, we prove that VHL achieves provable generalization performance on the natural distribution. Empirically, we demonstrate that VHL endows FL with drastically improved convergence speed and generalization performance. VHL is the first attempt to address data heterogeneity with a virtual dataset, offering a new and effective means for FL.
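As a hedged illustration of the "pure noise shared across clients" idea: a label-separable virtual dataset that every client can regenerate identically from a common seed, so nothing private is exchanged. The function name, Gaussian-cluster construction, and parameters are hypothetical, not VHL's actual recipe:

```python
import numpy as np

def make_virtual_dataset(num_classes, per_class, dim, seed=0):
    """Shared pure-noise virtual dataset (illustrative sketch): every client
    regenerates the same samples from a common seed. Each class gets its own
    Gaussian cluster so the virtual classes are separable."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 5.0, size=(num_classes, dim))  # class anchors
    xs, ys = [], []
    for c in range(num_classes):
        xs.append(centers[c] + rng.normal(0.0, 1.0, size=(per_class, dim)))
        ys.append(np.full(per_class, c))
    return np.concatenate(xs), np.concatenate(ys)
```

Because the dataset is a pure function of the seed, clients only need to agree on `(num_classes, per_class, dim, seed)` to hold identical copies.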
In practice, paired training data are difficult to collect, whereas unpaired samples are widely available. Current methods aim to generate synthetic training data from unpaired samples by exploring the relationship between corrupted and clean data. This work proposes LUD-VAE, a deep generative method that learns the joint probability density function from data sampled from the marginal distributions. Our approach is based on a carefully designed probabilistic graphical model in which the clean and corrupted data domains are conditionally independent. Using variational inference, we maximize the evidence lower bound (ELBO) to estimate the joint probability density function. Furthermore, we show that the ELBO is computable without paired samples under the inference-invariant assumption. This property provides the mathematical rationale for our method in the unpaired setting. Finally, we apply our method to real-world image denoising, super-resolution, and low-light image enhancement tasks, and train models with the synthetic data generated by LUD-VAE. Experimental results validate the advantages of our method over other approaches.
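Given the stated graphical model, in which the clean data $x$ and corrupted data $y$ are conditionally independent given a latent variable $z$ (so $p(x, y, z) = p(z)\,p(x \mid z)\,p(y \mid z)$), one plausible form of the maximized ELBO is:

```latex
\log p(x, y) \;\ge\;
\mathbb{E}_{q(z \mid x, y)}\!\big[\log p(x \mid z) + \log p(y \mid z)\big]
\;-\; \mathrm{KL}\!\big(q(z \mid x, y)\,\|\,p(z)\big)
```

The conditional independence is what lets the likelihood term split into separate clean and corrupted reconstruction terms; the exact factorization of the variational posterior $q$ is an assumption here, not taken from the paper.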
Ads allocation, which involves allocating ads and organic items to limited feed slots to maximize platform revenue, has become a research hotspot. Note that e-commerce platforms usually have multiple entrances for different categories, and some entrances have few visits. Data from these entrances has low coverage, which makes it difficult for an agent to learn. To address this challenge, we propose Similarity-based Hybrid Transfer for Ads Allocation (SHTAA), which effectively transfers samples and knowledge from data-rich entrances to data-poor entrances. Specifically, we define an uncertainty-aware similarity for MDPs to estimate the similarity between the MDPs of different entrances. Based on this similarity, we design a hybrid transfer method, including instance transfer and policy transfer, to efficiently transfer samples and knowledge from one entrance to another. Both offline and online experiments on the Meituan food delivery platform demonstrate that the proposed method achieves better performance on data-poor entrances and increases platform revenue.
With the recent prevalence of reinforcement learning (RL), there has been extensive interest in utilizing RL for ads allocation on recommendation platforms (e.g., e-commerce and news feed sites). To achieve better allocation, the input of recent RL-based ads allocation methods has been upgraded from point-wise single items to list-wise arrangements of items. However, this also results in a high-dimensional space of state-action pairs, making it difficult to learn list representations with good generalization ability. This further hinders the exploration of RL agents and causes poor sample efficiency. To address this problem, we propose a novel RL-based approach for ads allocation that learns better list representations by leveraging task-specific signals on the Meituan food delivery platform. Specifically, we propose three different auxiliary tasks, based on reconstruction, prediction, and contrastive learning, according to prior domain knowledge of ads allocation. We conduct extensive experiments on the Meituan food delivery platform to evaluate the effectiveness of the proposed auxiliary tasks. Both offline and online experimental results show that the proposed method learns better list representations and achieves higher platform revenue than state-of-the-art baselines.
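The contrastive auxiliary task can be illustrated with a standard InfoNCE-style loss over list embeddings; this is a generic sketch (cosine similarity, temperature 0.1), not the paper's exact formulation:

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE contrastive loss (illustrative stand-in for the
    contrastive auxiliary task). anchor/positive: shape (d,);
    negatives: shape (k, d). Lower loss means the anchor embedding is
    closer to the positive than to the negatives."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    pos = np.exp(cos(anchor, positive) / temperature)
    neg = sum(np.exp(cos(anchor, n) / temperature) for n in negatives)
    return -np.log(pos / (pos + neg))
```

In this setting the anchor and positive would be two views of the same item list (e.g., under augmentation), with other lists in the batch serving as negatives.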
The COVID-19 pandemic threatens global health. Many studies have applied deep convolutional neural networks (CNNs) to recognize COVID-19 from chest 3D computed tomography (CT) scans. Recent works show that no model generalizes well across CT datasets from different countries, and that designing a model for a specific dataset requires expertise. Therefore, neural architecture search (NAS), which aims to search for models automatically, has become an attractive solution. To reduce the search cost on large 3D CT datasets, most NAS-based works use the weight-sharing (WS) strategy, which makes all models share weights in a supernet. However, WS inevitably causes search instability, leading to inaccurate model estimation. In this work, we propose an Efficient evolutionary MultI-objective ARchitecture Search (EMARS) framework. We propose a new objective, namely potential, which helps exploit promising models and indirectly reduces the number of models involved in weight training, thereby alleviating search instability. We demonstrate that under the objectives of accuracy and potential, EMARS can balance exploitation and exploration, i.e., reducing search time while finding better models. Our searched models are small and perform better than prior works on three public COVID-19 3D CT datasets.
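The accuracy/potential trade-off is a multi-objective selection problem; below is a minimal sketch of the Pareto-dominance bookkeeping such an evolutionary search would need. This is illustrative only — the abstract does not specify EMARS's actual selection scheme, and the scoring of "potential" is left abstract:

```python
def dominates(a, b):
    """Pareto dominance for maximization: a dominates b if a is at least as
    good in every objective and strictly better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated(population):
    """Return indices of non-dominated candidates, each scored as an
    (accuracy, potential) pair — the two objectives EMARS balances."""
    return [i for i, a in enumerate(population)
            if not any(dominates(b, a) for b in population)]
```

An evolutionary loop would keep the non-dominated front as parents for the next generation, so candidates strong in either accuracy or potential survive.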
The space-air-ground integrated network (SAGIN), one of the key technologies for next-generation mobile communication systems, can facilitate data transmission for users all over the world, especially in some remote areas where vast amounts of informative data are collected by Internet of remote things (IoRT) devices to support various data-driven artificial intelligence (AI) services. However, training AI models centrally with the assistance of SAGIN faces the challenges of highly constrained network topology, inefficient data transmission, and privacy issues. To tackle these challenges, we first propose a novel topology-aware federated learning framework for the SAGIN, namely Olive Branch Learning (OBL). Specifically, the IoRT devices in the ground layer leverage their private data to perform model training locally, while the air nodes in the air layer and the ring-structured low earth orbit (LEO) satellite constellation in the space layer are in charge of model aggregation (synchronization) at different scales. To further enhance communication efficiency and inference performance of OBL, an efficient Communication and Non-IID-aware Air node-Satellite Assignment (CNASA) algorithm is designed by taking the data class distribution of the air nodes as well as their geographic locations into account. Furthermore, we extend our OBL framework and CNASA algorithm to adapt to more complex multi-orbit satellite networks. We analyze the convergence of our OBL framework and conclude that the CNASA algorithm contributes to the fast convergence of the global model. Extensive experiments based on realistic datasets corroborate the superior performance of our algorithm over the benchmark policies.
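The multi-scale aggregation can be sketched as a two-level FedAvg: ground devices are grouped under air nodes, each air node averages its devices' models, and the space layer averages the air-node models. The grouping format and function names below are assumptions, not OBL's actual interfaces:

```python
import numpy as np

def fedavg(weights, sizes):
    """Sample-size-weighted average of model parameter vectors."""
    sizes = np.asarray(sizes, dtype=float)
    return np.average(np.stack(weights), axis=0, weights=sizes)

def hierarchical_aggregate(ground_models, assignment, sizes):
    """Hypothetical sketch of OBL's multi-scale aggregation:
    assignment[i] gives the air node responsible for ground device i.
    Each air node averages its devices' models; the space layer then
    averages the air-node models, weighted by their total sample counts."""
    air_models, air_sizes = [], []
    for a in sorted(set(assignment)):
        idx = [i for i, g in enumerate(assignment) if g == a]
        air_models.append(fedavg([ground_models[i] for i in idx],
                                 [sizes[i] for i in idx]))
        air_sizes.append(sum(sizes[i] for i in idx))
    return fedavg(air_models, air_sizes)  # global (space-layer) model
```

With sample-size weighting at both levels, the two-level average coincides with a flat weighted average over all devices, so the hierarchy changes communication structure rather than the aggregated result.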
Existing learning-based multi-view stereo (MVS) methods rely on the depth range to build the 3D cost volume and may fail when the range is too large or unreliable. To address this problem, we propose a disparity-based MVS method based on the epipolar disparity flow (E-flow), called DispMVS, which infers depth information from the pixel movement between two views. The core of DispMVS is to construct a 2D cost volume on the image plane along the epipolar line for each pair of reference and source images for pixel matching, and to fuse the depths triangulated from each pair via multi-view geometry to ensure multi-view consistency. To be robust, DispMVS starts from a randomly initialized depth map and iteratively refines it with a coarse-to-fine strategy. Experiments on the DTU MVS and Tanks & Temples datasets show that DispMVS is not sensitive to the depth range and achieves state-of-the-art results with lower GPU memory.
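The disparity-to-depth step rests on the classical rectified-stereo triangulation relation, depth = f·B/d; a toy sketch follows, with a median standing in (as an assumption) for the paper's multi-view fusion:

```python
import numpy as np

def depth_from_disparity(disparity, focal_length, baseline):
    """Classical rectified-stereo relation: depth = f * B / d. This is the
    kind of triangulation a disparity-based method uses to turn epipolar
    pixel motion into depth (purely illustrative)."""
    return focal_length * baseline / disparity

def fuse_depths(pairwise_depths):
    """Fuse depths triangulated from several reference-source pairs; a robust
    median stands in here for the paper's multi-view-geometry fusion."""
    return float(np.median(pairwise_depths))
```

In a real pipeline the disparity would come from matching along the epipolar line, and the fusion would weigh pairs by geometric consistency rather than taking a plain median.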
Kernels on graphs have had limited options for node-level problems. To address this, we present a novel, generalized kernel for graphs with node feature data for semi-supervised learning. The kernel is derived from a regularization framework by treating the graph and feature data as two Hilbert spaces. We also show how numerous kernel-based models on graphs are instances of our design. A kernel defined this way has transductive properties, which leads to an improved ability to learn from fewer training points, as well as better handling of highly non-Euclidean data. We demonstrate these advantages using synthetic data where the distribution of the whole graph can inform the pattern of the labels. Finally, by utilizing a flexible polynomial of the graph Laplacian within the kernel, the model also performs effectively in semi-supervised classification on graphs of various levels of homophily.
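A small sketch of a kernel built as a polynomial of the symmetric normalized graph Laplacian, K = Σᵢ cᵢ Lⁱ; the coefficient handling and the choice of normalization are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

def laplacian_polynomial_kernel(adj, coeffs):
    """Node-level kernel as a polynomial of the symmetric normalized graph
    Laplacian: K = sum_i coeffs[i] * L^i. Low-order terms favor smooth
    (homophilous) label patterns; higher-order terms can capture
    less-smooth structure (illustrative sketch)."""
    n = len(adj)
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    lap = np.eye(n) - d_inv_sqrt @ adj @ d_inv_sqrt  # normalized Laplacian
    k = np.zeros_like(lap)
    term = np.eye(n)
    for c in coeffs:
        k += c * term
        term = term @ lap
    return k
```

The resulting matrix is symmetric because the normalized Laplacian is; positive semi-definiteness (needed for a valid kernel) additionally depends on the chosen coefficients.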