Rankings are widely collected in real-life scenarios, which can leak personal information such as users' preferences for videos or news. To protect rankings, existing works mainly protect either a single ranking within a set of rankings or the pairwise comparisons of a ranking under $\epsilon$-differential privacy. This paper proposes a novel notion called $\epsilon$-ranking differential privacy for protecting ranks. We establish the connection between the Mallows model (Mallows, 1957) and the proposed $\epsilon$-ranking differential privacy. This allows us to develop a multistage ranking algorithm that generates synthetic rankings while satisfying $\epsilon$-ranking differential privacy. We establish theoretical results on the utility of synthetic rankings in two downstream tasks: the inference attack and personalized ranking. For the inference attack, we quantify how $\epsilon$ affects the estimation of the true ranking based on synthetic rankings. For the personalized ranking task, we consider varying privacy preferences among users and quantify how these preferences affect the consistency in estimating the optimal ranking function. Extensive numerical experiments verify the theoretical results and demonstrate the effectiveness of the proposed synthetic ranking algorithm.
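For readers unfamiliar with the Mallows model mentioned above, the following is a minimal sketch of how a ranking can be sampled from it with the standard repeated-insertion construction. It is not the paper's multistage algorithm. The names `sample_mallows`, `kendall_tau`, and `mean_distance` are hypothetical, and the abstract does not state how $\epsilon$ maps to the dispersion parameter, so `theta` is treated here as a free parameter.

```python
import math
import random


def kendall_tau(a, b):
    """Count the item pairs that rankings a and b order differently."""
    pos = {item: i for i, item in enumerate(b)}
    seq = [pos[item] for item in a]
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])


def sample_mallows(center, theta, rng):
    """Draw a ranking r with P(r) proportional to exp(-theta * kendall_tau(r, center)).

    Uses the repeated-insertion construction. `theta` is an assumed dispersion
    parameter; its relation to epsilon is not given in the abstract.
    """
    phi = math.exp(-theta)
    ranking = []
    for i, item in enumerate(center, start=1):
        # Inserting at 0-based index j places the new item ahead of (i - 1 - j)
        # items that precede it in `center`, adding that many discordant pairs.
        weights = [phi ** (i - 1 - j) for j in range(i)]
        j = rng.choices(range(i), weights=weights)[0]
        ranking.insert(j, item)
    return ranking


def mean_distance(center, theta, n_samples, rng):
    """Average Kendall tau distance between sampled rankings and the center."""
    total = sum(kendall_tau(sample_mallows(center, theta, rng), center)
                for _ in range(n_samples))
    return total / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    true_ranking = ["a", "b", "c", "d", "e", "f"]
    for theta in (0.1, 1.0, 3.0):
        print(theta, sample_mallows(true_ranking, theta, rng))
```

A smaller `theta` spreads the samples further from the central ranking, and a larger `theta` concentrates them near it. This is the kind of privacy versus utility trade-off the abstract quantifies through $\epsilon$.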
Contextual bandits have been widely used for sequential decision-making based on current contextual information and historical feedback data. In modern applications, the context can be rich and can often be formulated as a matrix. Moreover, while existing bandit algorithms have mainly focused on reward maximization, less attention has been paid to statistical inference. To fill these gaps, we consider a matrix contextual bandit framework in which the true model parameter is a low-rank matrix, and propose a fully online procedure that simultaneously makes sequential decisions and conducts statistical inference. The low-rank structure of the model parameter and the adaptive nature of the data collection process make this difficult: standard low-rank estimators are not fully online and are biased, while existing inference approaches for bandit algorithms fail to account for the low-rankness and are also biased. To address these issues, we introduce a new online doubly-debiasing inference procedure that handles both sources of bias simultaneously. In theory, we establish the asymptotic normality of the proposed online doubly-debiased estimator and prove the validity of the constructed confidence interval. Our inference results are built upon a newly developed low-rank stochastic gradient descent estimator and its non-asymptotic convergence result, which is also of independent interest.
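The recent emergence of reinforcement learning has created a demand for robust statistical inference methods for the parameter estimates computed by these algorithms. Existing methods for statistical inference in online learning are restricted to settings involving independently sampled observations, while existing methods for statistical inference in reinforcement learning (RL) are limited to the batch setting. The online bootstrap is a flexible and efficient approach to statistical inference in linear stochastic approximation algorithms, but its efficacy in settings involving Markov noise, such as RL, has yet to be explored. In this paper, we study the use of the online bootstrap method for statistical inference in RL. In particular, we focus on the temporal difference (TD) learning and gradient TD (GTD) learning algorithms, which are themselves special instances of linear stochastic approximation under Markov noise. The method is shown to be distributionally consistent for statistical inference in policy evaluation, and numerical experiments are included to demonstrate the effectiveness of the algorithm on statistical inference tasks across a range of real RL environments.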
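To make the idea of an online bootstrap for TD learning concrete, here is a minimal sketch under stated assumptions, not the paper's procedure. It runs tabular TD(0), which is linear TD with one-hot features, on a hypothetical two-state Markov reward process. Alongside the main estimate it keeps several perturbed copies whose step sizes are multiplied by random weights with mean 1 and variance 1. The function name `td0_online_bootstrap`, the toy chain, the step-size schedule, the exponential weights, and the use of running averages are all illustrative choices.

```python
import random


def td0_online_bootstrap(n_steps, n_boot, gamma, rng):
    """Tabular TD(0) with randomly perturbed replicates updated in the same pass.

    Returns the running average of the main value estimate and the running
    averages of the n_boot perturbed replicates.
    """
    # Hypothetical two-state Markov reward process.
    P = [[0.9, 0.1], [0.2, 0.8]]   # transition probabilities
    R = [1.0, 0.0]                 # reward collected in each state

    v = [0.0, 0.0]
    v_avg = [0.0, 0.0]
    boots = [[0.0, 0.0] for _ in range(n_boot)]
    boots_avg = [[0.0, 0.0] for _ in range(n_boot)]

    s = 0
    for t in range(1, n_steps + 1):
        s_next = 0 if rng.random() < P[s][0] else 1
        alpha = 0.5 * t ** -0.6    # assumed decaying step size

        # Main TD(0) update on the observed transition.
        v[s] += alpha * (R[s] + gamma * v[s_next] - v[s])

        # Each replicate sees the same transition with a random step multiplier.
        for vb in boots:
            w = rng.expovariate(1.0)   # mean 1, variance 1
            vb[s] += alpha * w * (R[s] + gamma * vb[s_next] - vb[s])

        # Running averages of all iterates.
        for k in range(2):
            v_avg[k] += (v[k] - v_avg[k]) / t
            for vb, vb_avg in zip(boots, boots_avg):
                vb_avg[k] += (vb[k] - vb_avg[k]) / t
        s = s_next
    return v_avg, boots_avg


if __name__ == "__main__":
    est, reps = td0_online_bootstrap(20000, 20, 0.5, random.Random(1))
    spread = [sum((r[k] - est[k]) ** 2 for r in reps) / len(reps) for k in range(2)]
    print("value estimate:", est)
    print("rough standard errors:", [x ** 0.5 for x in spread])
```

The spread of the replicates around the main estimate gives a rough measure of its uncertainty without storing past data. Whether that spread is a valid approximation under Markov noise is the question the abstract addresses.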
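We aim to provably complete a sparse and highly missing tensor in the presence of covariate information along tensor modes. Our motivation comes from online advertising, where users' click-through rates (CTR) on ads over various devices form a CTR tensor that has about 96% missing entries and many zeros among the non-missing entries, which makes standalone tensor completion methods unsatisfactory. Besides the CTR tensor, additional ad features or user characteristics are often available. In this paper, we propose Covariate-assisted Sparse Tensor Completion (COSTCO) to incorporate covariate information into the recovery of the sparse tensor. The key idea is to jointly extract latent components from both the tensor and the covariate matrix to learn a synthetic representation. Theoretically, we derive the error bound for the recovered tensor components and explicitly quantify the improvements in both the reveal probability condition and the tensor recovery accuracy due to the covariates. Finally, we apply COSTCO to an advertisement dataset consisting of a CTR tensor and an ad covariate matrix, which yields a 23% accuracy improvement over the baseline. An important by-product is that the ad latent components from COSTCO reveal interesting ad clusters, which are useful for better ad targeting.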
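Devising dynamic pricing policies with always-valid online statistical learning procedures is an important and as yet unresolved problem. Most existing dynamic pricing policies, which focus on the faithfulness of the adopted customer choice models, show a limited ability to adapt to the online uncertainty of the learned statistical model during the pricing process. In this paper, we propose a novel approach for designing dynamic pricing policies based on regularized online statistical learning with theoretical guarantees. The new approach overcomes the challenge of continuously monitoring the online Lasso procedure and has several appealing properties. In particular, we make the decisive observation that the always-validity of pricing decisions builds and thrives on the online regularization scheme. Our online regularization scheme equips the proposed optimistic online regularized maximum likelihood pricing (OORMLP) policy with three major advantages: it encodes market noise knowledge into pricing process optimism; it empowers online statistical learning with always-validity over all decision points; and it envelops the prediction error process with time-uniform non-asymptotic oracle inequalities. This type of non-asymptotic inference result allows us to design more sample-efficient and robust dynamic pricing algorithms in practice. In theory, the proposed OORMLP algorithm exploits the sparsity structure of high-dimensional models and secures logarithmic regret over the decision horizon. These theoretical advances are made possible by proposing an optimistic online Lasso procedure that resolves dynamic pricing problems at the process level, based on a novel use of non-asymptotic martingale concentration. In experiments, we evaluate OORMLP in different synthetic and real pricing problem settings and demonstrate that OORMLP advances the state-of-the-art methods.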
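The error-backpropagation (backprop) algorithm remains the most common solution to the credit assignment problem in artificial neural networks. In neuroscience, it is unclear whether the brain could adopt a similar strategy to correct its synapses. Recent models have attempted to bridge this gap while remaining consistent with a range of experimental observations. However, these models either cannot effectively backpropagate error signals across multiple layers or require a multi-phase learning process, neither of which is reminiscent of learning in the brain. Here, we introduce a new model, Bursting Cortico-Cortical Networks (BurstCCN), which solves these issues by integrating known properties of cortical networks, namely bursting activity, short-term plasticity (STP), and dendrite-targeting interneurons. BurstCCN relies on burst multiplexing via connection-type-specific STP to propagate backprop-like error signals within deep cortical networks. These error signals are encoded at distal dendrites and induce burst-dependent plasticity as a result of excitatory-inhibitory top-down inputs. First, we demonstrate that our model can effectively backpropagate errors through multiple layers using a single-phase learning process. Next, we show both empirically and analytically that learning in our model approximates backprop-derived gradients. Finally, we demonstrate that our model is capable of learning complex image classification tasks (MNIST and CIFAR-10). Overall, our results suggest that cortical features across the sub-cellular, cellular, microcircuit, and systems levels jointly underlie single-phase, efficient deep learning in the brain.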
Given the increasingly intricate forms of partial differential equations (PDEs) in physics and related fields, computationally solving PDEs without analytic solutions inevitably suffers from a trade-off between accuracy and efficiency. Recent advances in neural operators, a class of mesh-independent, neural-network-based PDE solvers, offer a promising way to overcome this challenge. In this emerging direction, the Koopman neural operator (KNO) is a representative example that outperforms other state-of-the-art alternatives in terms of accuracy and efficiency. Here we present KoopmanLab, a self-contained and user-friendly PyTorch module of the Koopman neural operator family for solving partial differential equations. Beyond the original version of KNO, we develop multiple new variants of KNO based on different neural network architectures to improve the general applicability of our module. These variants are validated by mesh-independent and long-term prediction experiments on representative PDEs (e.g., the Navier-Stokes equation and the Bateman-Burgers equation) and on ERA5 (i.e., one of the largest high-resolution data sets of global-scale climate fields). These demonstrations suggest that KoopmanLab has the potential to be used in diverse applications of partial differential equations.
Federated learning has recently been applied to recommendation systems to protect user privacy. In federated learning settings, recommendation systems can train recommendation models by collecting only intermediate parameters instead of the real user data, which greatly enhances user privacy. Besides, federated recommendation systems can collaborate with other data platforms to improve recommendation model performance while meeting regulatory and privacy constraints. However, federated recommendation systems face many new challenges, such as privacy, security, heterogeneity, and communication costs. While significant research has been conducted in these areas, gaps in the survey literature still exist. In this survey, we (1) summarize some common privacy mechanisms used in federated recommendation systems and discuss the advantages and limitations of each mechanism; (2) review some robust aggregation strategies and several novel attacks against security; (3) summarize some approaches to addressing the heterogeneity and communication cost problems; (4) introduce some open-source platforms that can be used to build federated recommendation systems; and (5) present some prospective research directions. This survey can help researchers and practitioners understand the research progress in these areas.
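The idea of collecting only intermediate parameters can be illustrated with a minimal sketch of sample-weighted parameter averaging in the style of federated averaging. This is one common aggregation rule, not a method attributed to the survey, and the function name `federated_average` and the flat-list parameter format are hypothetical.

```python
def federated_average(client_updates):
    """Combine client parameters, weighting each client by its sample count.

    client_updates: list of (params, n_samples) pairs, where params is a list
    of floats with the same length for every client. Raw user data never
    reaches the server; only these parameter vectors are shared.
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [
        sum(params[k] * n for params, n in client_updates) / total
        for k in range(dim)
    ]


if __name__ == "__main__":
    # Two hypothetical clients holding 1 and 3 local interactions.
    updates = [([1.0, 2.0], 1), ([3.0, 4.0], 3)]
    print(federated_average(updates))
```

Plain averaging like this has no robustness to malicious clients, which is why the survey reviews robust aggregation strategies separately.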
In recent years, considerable effort has gone into advancing the real-world application of dynamic digital humans (DDHs). However, most current quality assessment research focuses on evaluating static 3D models and usually ignores motion distortions. Therefore, in this paper, we construct a large-scale dynamic digital human quality assessment (DDH-QA) database with diverse motion content and multiple distortions to comprehensively study the perceptual quality of DDHs. Both model-based distortions (noise, compression) and motion-based distortions (binding error, motion unnaturalness) are taken into consideration. Ten types of common motion are employed to drive the DDHs, and a total of 800 DDHs are generated. Afterward, we render video sequences of the distorted DDHs as the evaluation media and carry out a well-controlled subjective experiment. A benchmark experiment is then conducted with state-of-the-art video quality assessment (VQA) methods, and the results show that existing VQA methods are limited in assessing the perceptual loss of DDHs. The database will be made publicly available to facilitate future research.
Morality in dialogue systems has recently attracted great attention in research. A moral dialogue system could better connect with users and enhance conversation engagement by gaining users' trust. In this paper, we propose a framework, MoralDial, to train and evaluate moral dialogue systems. In our framework, we first explore the communication mechanisms of morality and resolve expressed morality into four sub-modules. The sub-modules indicate the roadmap for building a moral dialogue system. Based on that, we design a simple yet effective method: constructing moral discussions from Rules of Thumb (RoTs) between simulated specific users and the dialogue system. The constructed discussions consist of expressing, explaining, and revising moral views in dialogue exchanges, which lets conversational models learn morality well in a natural manner. Furthermore, we propose a novel evaluation method in the framework. We evaluate multiple aspects of morality by judging the relation between dialogue responses and RoTs in discussions, where the multifaceted nature of morality is particularly considered. Automatic and manual experiments demonstrate that our framework is a promising way to train and evaluate moral dialogue systems.