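Recent studies on click-through rate (CTR) prediction have reached new levels by modeling longer user behavior sequences. Among them, two-stage methods are the state-of-the-art (SOTA) solution for industrial applications. A two-stage method first trains a retrieval model to truncate the long behavior sequence in advance, then trains a CTR model on the truncated sequences. However, the retrieval model and the CTR model are trained separately, so the subsequences retrieved for the CTR model are inaccurate, which degrades final performance. In this paper, we propose an end-to-end paradigm for modeling long behavior sequences that achieves superior performance together with remarkable cost-efficiency compared to existing models. Our contribution is three-fold. First, we propose a hashing-based efficient target attention (TA) network named ETA-Net, which enables end-to-end user behavior retrieval based on low-cost bit-wise operations. The proposed ETA-Net reduces the complexity of standard TA by orders of magnitude for sequential data modeling. Second, we propose a general system architecture as one viable solution for deploying ETA-Net on industrial systems. In particular, ETA-Net has been deployed on the recommender system of Taobao, where it brought a 1.8% lift in CTR and a 3.1% lift in gross merchandise value (GMV) compared to the SOTA two-stage methods. Third, we conduct extensive experiments on both offline datasets and online A/B tests. The results verify that the proposed model considerably outperforms existing CTR models in terms of both CTR prediction performance and online cost-efficiency. ETA-Net now serves the main traffic of Taobao, delivering services to hundreds of millions of users every day.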
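The abstract above does not spell out the hashing scheme, so the following is only a minimal sketch of hashing-based retrieval followed by target attention. It assumes SimHash-style random projections for the binary codes and Hamming distance (XOR plus bit count) for retrieval; the function names, dimensions, and toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def simhash(vectors, projections):
    """Map each row of `vectors` to an m-bit binary code (assumed SimHash scheme)."""
    return (vectors @ projections > 0).astype(np.uint8)

def hamming_topk(target_code, behavior_codes, k):
    """Indices of the k behaviors closest to the target in Hamming distance."""
    # XOR marks differing bits; summing them gives the Hamming distance.
    distances = np.bitwise_xor(behavior_codes, target_code).sum(axis=1)
    return np.argsort(distances, kind="stable")[:k]

def retrieve_then_attend(target, behaviors, projections, k):
    """Retrieve k behaviors by hash similarity, then apply softmax target attention."""
    codes = simhash(behaviors, projections)
    target_code = simhash(target[None, :], projections)[0]
    top = hamming_topk(target_code, codes, k)
    selected = behaviors[top]
    scores = selected @ target / np.sqrt(target.shape[0])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return top, weights @ selected

# Toy data (hypothetical): 1000 behavior embeddings of dimension 16, 64-bit codes.
rng = np.random.default_rng(0)
d, m, n, k = 16, 64, 1000, 8
behaviors = rng.normal(size=(n, d))
target = behaviors[42] + 0.01 * rng.normal(size=d)  # target close to behavior 42
projections = rng.normal(size=(d, m))
top, pooled = retrieve_then_attend(target, behaviors, projections, k)
```

Only the `k` retrieved behaviors enter the attention step, so the expensive dot-product attention runs over a short sequence while the full sequence is scanned with cheap bit comparisons. A production system would presumably pack the bits into machine words and use hardware popcount rather than `uint8` arrays.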
translated by 谷歌翻译
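Commercial depth sensors usually produce noisy and missing depth, especially on specular and transparent objects, which poses critical problems for downstream depth-based or point-cloud-based tasks. To mitigate this problem, we propose a powerful RGBD fusion network, SwinDRNet, for depth restoration. We further propose the Domain Randomization-Enhanced Depth Simulation (DREDS) approach, which simulates an active stereo depth system using physically based rendering and generates a large-scale synthetic dataset containing 130K photorealistic RGB images along with their simulated depths carrying realistic sensor noise. To evaluate depth restoration methods, we also curate a real-world dataset, namely STD, which captures 30 cluttered scenes composed of 50 objects with different materials, ranging from specular and transparent to diffuse. Experiments demonstrate that the proposed DREDS dataset bridges the sim-to-real domain gap, so that, once trained on DREDS, our SwinDRNet generalizes seamlessly to other real depth datasets, e.g. ClearGrasp, and outperforms competing depth restoration methods at real-time speed. We further show that our depth restoration effectively boosts the performance of downstream tasks, including category-level pose estimation and grasping. Our data and code are available at https://github.com/pku-epic/dreds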
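We humans are entering a virtual era and indeed want to bring animals into the virtual world as well. Yet computer-generated (CGI) furry animals are limited by tedious offline rendering, let alone interactive motion control. In this paper, we present Artemis, a novel neural modeling and rendering pipeline for generating articulated neural pets with appearance and motion synthesis. Artemis enables interactive motion control, real-time animation, and photorealistic rendering of furry animals. The core of Artemis is a neural-generated (NGI) animal engine, which adopts an efficient octree-based representation for animal animation and fur rendering. Animation then becomes equivalent to voxel-level deformation based on explicit skeletal warping. We further use fast octree indexing and an efficient volumetric rendering scheme to generate appearance and density feature maps. Finally, we propose a novel shading network that generates high-fidelity details of appearance and opacity from the appearance and density feature maps. For the motion control module in Artemis, we combine a state-of-the-art animal motion capture approach with a recent neural character control scheme. We introduce an effective optimization scheme to reconstruct the skeletal motion of real animals captured by a multi-view RGB and Vicon camera array. We feed all the captured motion into a neural character control scheme to generate abstract control signals with motion styles. We further integrate Artemis into existing engines that support VR headsets, providing an unprecedented immersive experience in which users can interact closely with a variety of virtual animals with vivid movements and photorealistic appearance. We make our Artemis model and dynamic furry animal dataset available at https://haiminluo.github.io/publication/artemis/.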
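We propose to model matrix time series based on a tensor CP decomposition. Instead of using an iterative algorithm, which is the standard practice for estimating CP decompositions, we propose a new one-pass estimation procedure based on a generalized eigenanalysis constructed from the serial dependence structure of the underlying process. A key idea of the new procedure is to project a generalized eigenequation defined in terms of rank-reduced matrices onto a lower-dimensional one with full-rank matrices, which avoids the intricacy of the former, whose number of eigenvalues can be zero, finite, or infinite. The asymptotic theory is established under a general setting without stationarity. It shows, for example, that all the component coefficient vectors in the CP decomposition are estimated consistently, with error rates that depend on the relative sizes of the time series dimensions and the sample size. The proposed model and estimation method are further illustrated with both simulated and real data, showing effective dimension reduction in modeling and forecasting matrix time series.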
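For orientation, a CP-decomposition model for a $p \times q$ matrix time series can be written as below. The notation ($Y_t$, $x_{t,k}$, $a_k$, $b_k$, $d$, $\varepsilon_t$) is assumed here for illustration and is not given in the abstract.

```latex
% Assumed notation: Y_t is the observed p x q matrix at time t,
% a_k (p x 1) and b_k (q x 1) are fixed component coefficient vectors,
% x_{t,k} are scalar latent time series, and eps_t is a noise matrix.
\[
  Y_t \;=\; \sum_{k=1}^{d} x_{t,k}\, a_k b_k^{\top} \;+\; \varepsilon_t ,
  \qquad t = 1, \dots, n .
\]
```

Under this form, the dynamics of the $pq$ observed entries are driven by only $d$ scalar series, which is where the dimension reduction comes from.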
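Online advertising revenue accounts for an increasing share of publishers' revenue streams, especially for small and medium-sized publishers that depend on the advertising networks of technology companies such as Google and Facebook. Publishers may therefore benefit from accurate forecasts of online advertising revenue to better manage their website monetization strategies. However, publishers that only have access to their own revenue data lack a holistic view of the total publisher advertising market, which in turn limits their ability to generate insights into their own future online advertising revenue. To address this business problem, we leverage a proprietary database of Google Adsense revenue from a large number of publishers in diverse areas. We adopt the Temporal Fusion Transformer (TFT) model, a novel attention-based architecture, to forecast publishers' advertising revenue. We leverage multiple covariates, including not only a publisher's own characteristics but also other publishers' advertising revenue. Our forecasts outperform several benchmark deep-learning time-series forecasting models over multiple time horizons. Moreover, we interpret the results by analyzing variable importance weights to identify significant features and self-attention weights to reveal persistent temporal patterns.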
Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.
The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which hampers graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB is built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extract the 20 user property features with the greatest information gain and combine them with user tweet features to form the user features. In addition, we perform a thorough evaluation of MGTAB and other public datasets. Our experiments show that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
As one of the prevalent methods for building automation systems, Imitation Learning (IL) shows promising performance in a wide range of domains. However, despite considerable improvements in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain overall policy performance with respect to the frames in the demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the conventional evaluation outcome, the environment returns, as the coefficient to build an importance map. We also conduct experiments to investigate three major questions concerning the equality of frames' importance, the effectiveness of the importance map, and the connections between importance maps from different IL models. The results show that R2RISE successfully distinguishes important frames from the demonstrations.
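The masking-and-weighting idea described above can be sketched as follows. This is a RISE-style illustration under stated assumptions, not the authors' implementation: `train_and_evaluate` is a hypothetical callback that would retrain the IL model on the kept frames and return the environment return, and here it is replaced by a toy function in which only frames 3 and 7 matter.

```python
import numpy as np

def importance_map(num_frames, train_and_evaluate, num_masks=500, keep_prob=0.5, seed=0):
    """Average the return over all random masks that kept each frame."""
    rng = np.random.default_rng(seed)
    weighted = np.zeros(num_frames)
    counts = np.zeros(num_frames)
    for _ in range(num_masks):
        mask = rng.random(num_frames) < keep_prob  # True = frame kept
        ret = train_and_evaluate(mask)             # return used as the coefficient
        weighted += ret * mask
        counts += mask
    return weighted / np.maximum(counts, 1)        # guard against never-kept frames

def toy_train_and_evaluate(mask):
    """Hypothetical stand-in for 'retrain the IL model, then roll it out'."""
    return 1.0 * mask[3] + 2.0 * mask[7]

scores = importance_map(10, toy_train_and_evaluate)
most_important = int(np.argmax(scores))
```

Frames whose presence tends to coincide with high returns receive high scores. In a real setting each call to `train_and_evaluate` involves a full retraining run, so the number of masks is the main cost driver.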
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical for improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e., blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e., flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with low computational cost and higher consistency with human visual perception. For temporal artifacts, we improve the self-attention-based TimeSFormer to detect them. Based on the six types of PEAs, we propose a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM). Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.
We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes the weighted average of mean and percentile performances, and it covers the distributionally robust MDPs and the distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. By considering that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive a tractable reformulation of our model. In particular, we show that the return-risk model can also account for risk from an uncertain transition kernel when one only seeks deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. A scalable first-order algorithm is designed to solve large-scale problems, and we demonstrate the advantages of our proposed model and algorithm through numerical experiments.
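To make the "weighted average of mean and percentile performances" concrete, the sketch below evaluates such an objective on a plain sample of returns. It is only an empirical illustration: it assumes the percentile is the lower `alpha`-quantile of the return samples, and it does not model the Wasserstein ambiguity set or the reformulation. The function and parameter names are hypothetical.

```python
import numpy as np

def return_risk_objective(returns, weight, alpha):
    """weight * mean + (1 - weight) * alpha-quantile of sampled returns."""
    mean_part = returns.mean()
    percentile_part = np.quantile(returns, alpha)  # assumed lower-tail percentile
    return weight * mean_part + (1 - weight) * percentile_part

# Toy returns of one policy (hypothetical numbers).
returns = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
balanced = return_risk_objective(returns, weight=0.5, alpha=0.25)
mean_only = return_risk_objective(returns, weight=1.0, alpha=0.25)
```

Setting `weight=1.0` recovers a pure mean criterion and `weight=0.0` a pure percentile criterion, which mirrors how the abstract describes the two special cases being covered by one model.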