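Shape constraints yield flexible middle grounds between fully nonparametric and fully parametric approaches to modeling distributions of data. The specific assumption of log-concavity is motivated by applications across economics, survival modeling, and reliability theory. However, there do not currently exist valid tests for whether the underlying density of given data is log-concave. The recent universal likelihood ratio test provides a valid test. The universal test relies on maximum likelihood estimation (MLE), and efficient methods already exist for finding the log-concave MLE. This yields the first test of log-concavity that is provably valid in finite samples in any dimension, for which we also establish asymptotic consistency results. Empirically, we find that the highest power is obtained by using random projections to convert the d-dimensional testing problem into many one-dimensional problems, leading to a simple procedure that is statistically and computationally efficient.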
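For orientation, the split likelihood ratio statistic that underlies universal tests of this kind can be sketched as follows. The notation ($D_0$, $D_1$, $\hat p_0$, $\hat p_1$, $\alpha$) is introduced here for illustration and does not appear in the abstract: the data are split into two parts $D_0$ and $D_1$, $\hat p_1$ is any density estimate fitted on $D_1$, and $\hat p_0$ is the log-concave MLE fitted on $D_0$.

```latex
T_n \;=\; \prod_{i \in D_0} \frac{\hat p_1(X_i)}{\hat p_0(X_i)},
\qquad
\text{reject } H_0 \text{ if } T_n \ge 1/\alpha .
```

Because $\hat p_0$ maximizes the likelihood over the null class on $D_0$, the statistic has expectation at most 1 under the null conditionally on $D_1$, so Markov's inequality gives level $\alpha$ in finite samples.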
translated by 谷歌翻译
Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension $d$ while letting the sample size $n$ increase to infinity. Recently, much effort has been dedicated towards understanding how these methods behave in high-dimensional settings, where $d$ and $n$ both increase to infinity together. This often leads to different inference procedures, depending on the assumptions about the dimensionality, leaving the practitioner in a bind: given a dataset with 100 samples in 20 dimensions, should they calibrate by assuming $n \gg d$, or $d/n \approx 0.2$? This paper considers the goal of dimension-agnostic inference: developing methods whose validity does not depend on any assumption on $d$ versus $n$. We introduce an approach that uses variational representations of existing test statistics along with sample splitting and self-normalization to produce a new test statistic with a Gaussian limiting distribution, regardless of how $d$ scales with $n$. The resulting statistic can be viewed as a careful modification of degenerate U-statistics, dropping diagonal blocks and retaining off-diagonal blocks. We exemplify our technique for some classical problems including one-sample mean and covariance testing, and show that our tests have minimax rate-optimal power against appropriate local alternatives. In most settings, our cross U-statistic matches the high-dimensional power of the corresponding (degenerate) U-statistic up to a $\sqrt{2}$ factor.
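A minimal sketch of the sample-splitting and self-normalization idea for one-sample mean testing ($H_0$: the mean is zero) is given below. The function names, the even split, and the simulated data are assumptions made for illustration; the abstract does not specify these details.

```python
import math
import numpy as np

def cross_mean_statistic(X):
    """Sample-split, studentized statistic for H0: E[X] = 0 (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    half = X.shape[0] // 2
    first, second = X[:half], X[half:]
    # Direction estimated on the second half only (this drops the diagonal block).
    direction = second.mean(axis=0)
    # Project the first half onto that direction: one scalar per observation.
    f = first @ direction
    # Self-normalize so the statistic can be compared with N(0, 1) quantiles.
    return math.sqrt(half) * f.mean() / f.std(ddof=1)

def one_sided_p_value(stat):
    """Upper-tail standard normal p-value, computed without SciPy."""
    return 0.5 * math.erfc(stat / math.sqrt(2.0))

# Hypothetical data: 100 samples in 20 dimensions with a shifted mean.
rng = np.random.default_rng(0)
X = rng.normal(loc=1.0, size=(100, 20))
stat = cross_mean_statistic(X)
print(stat, one_sided_p_value(stat))
```

The same normal calibration is used whatever the ratio of $d$ to $n$, which is the point of the construction.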
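We study the statistical properties of the estimator obtained by applying a gradient ascent method with multiple initializations. We derive the population quantity that is the target of this estimator and study the properties of confidence intervals (CIs) constructed from asymptotic normality and the bootstrap approach. In particular, we analyze the coverage under a finite number of random initializations. We also investigate the CIs obtained by inverting the likelihood ratio test, the score test, and the Wald test, and we show that the resulting CIs may be very different. We propose a two-sample test procedure even when the MLE is intractable. In addition, we analyze the performance of the EM algorithm under random initializations and derive the coverage of a CI with a finite number of initializations.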
Testing the significance of a variable or group of variables $X$ for predicting a response $Y$, given additional covariates $Z$, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test whether the regression coefficient for $X$ is non-zero. However, when the model is misspecified, the test may have poor power, for example when $X$ is involved in complex interactions, or lead to many false rejections. In this work we study the problem of testing the model-free null of conditional mean independence, i.e. that the conditional mean of $Y$ given $X$ and $Z$ does not depend on $X$. We propose a simple and general framework that can leverage flexible nonparametric or machine learning methods, such as additive models or random forests, to yield both robust error control and high power. The procedure involves using these methods to perform regressions, first to estimate a form of projection of $Y$ on $X$ and $Z$ using one half of the data, and then to estimate the expected conditional covariance between this projection and $Y$ on the remaining half of the data. While the approach is general, we show that a version of our procedure using spline regression achieves what we show is the minimax optimal rate in this nonparametric testing problem. Numerical experiments demonstrate the effectiveness of our approach both in terms of maintaining Type I error control, and power, compared to several existing approaches.
Sequential testing, always-valid $p$-values, and confidence sequences promise flexible statistical inference and on-the-fly decision making. However, unlike fixed-$n$ inference based on asymptotic normality, existing sequential tests either make parametric assumptions and end up under-covering/over-rejecting when these fail or use non-parametric but conservative concentration inequalities and end up over-covering/under-rejecting. To circumvent these issues, we sidestep exact at-least-$\alpha$ coverage and focus on asymptotically exact coverage and asymptotic optimality. That is, we seek sequential tests whose probability of ever rejecting a true hypothesis asymptotically approaches $\alpha$ and whose expected time to reject a false hypothesis approaches a lower bound on all tests with asymptotic coverage at least $\alpha$, both under an appropriate asymptotic regime. We permit observations to be both non-parametric and dependent and focus on testing whether the observations form a martingale difference sequence. We propose the universal sequential probability ratio test (uSPRT), a slight modification to the normal-mixture sequential probability ratio test, where we add a burn-in period and adjust thresholds accordingly. We show that even in this very general setting, the uSPRT is asymptotically optimal under mild generic conditions. We apply the results to stabilized estimating equations to test means, treatment effects, etc. Our results also provide corresponding guarantees for the implied confidence sequences. Numerical simulations verify our guarantees and the benefits of the uSPRT over alternatives.
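Triangular flows, also known as Kn\"{o}the-Rosenblatt measure couplings, comprise an important building block of normalizing flow models for generative modeling and density estimation, including popular autoregressive flow models such as real-valued non-volume preserving transformation models (Real NVP). We present statistical guarantees and sample complexity bounds for triangular flow statistical models. In particular, we establish the statistical consistency and the finite-sample convergence rates of the Kullback-Leibler estimator of the Kn\"{o}the-Rosenblatt measure coupling using tools from empirical process theory. Our results highlight the anisotropic geometry of the function classes at play in triangular flows, shed light on optimal coordinate ordering, and lead to statistical guarantees for Jacobian flows. We conduct numerical experiments on synthetic data to illustrate the practical implications of our theoretical findings.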
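We develop the theory of hypothesis testing based on the e-value, a notion of evidence that, unlike the p-value, allows for effortlessly combining results from several studies in the common scenario where the decision to perform a new study may depend on previous outcomes. Tests based on e-values are safe, i.e. they preserve Type-I error guarantees under such optional continuation. We define growth-rate optimality (GRO) as an analogue of power in an optional continuation context, and we show how to construct GRO e-variables for general testing problems with composite null and alternative, emphasizing models with nuisance parameters. GRO e-values take the form of Bayes factors with special priors. We illustrate the theory using several classic examples, including a one-sample safe t-test (in which the right Haar prior turns out to be GRO) and the 2x2 contingency table (in which the GRO prior is different from standard priors). Sharing Fisherian, Neymanian and Jeffreys-Bayesian interpretations, e-values and the corresponding tests may provide a methodology acceptable to adherents of all three schools.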
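The optional-continuation property can be illustrated with a small sketch. It assumes a simple null $N(0,1)$ against a fixed alternative $N(\mu, 1)$, for which the likelihood ratio is an e-value; this is a textbook special case chosen for illustration and not the GRO construction for composite hypotheses described in the abstract. The function names and study values are hypothetical.

```python
import numpy as np

def lr_e_value(x, mu_alt):
    """Likelihood-ratio e-value for H0: N(0, 1) against N(mu_alt, 1)."""
    x = np.asarray(x, dtype=float)
    # log of the density ratio N(mu_alt, 1) / N(0, 1), summed over observations
    log_lr = mu_alt * x - 0.5 * mu_alt**2
    return float(np.exp(log_lr.sum()))

def combine_by_product(e_values, alpha=0.05):
    """Multiply e-values from successive studies; report the first crossing of 1/alpha."""
    running = np.cumprod(np.asarray(e_values, dtype=float))
    crossed = running >= 1.0 / alpha
    first = int(np.argmax(crossed)) if crossed.any() else None
    return running, first

# Hypothetical e-values from three successive studies.
running, first = combine_by_product([2.0, 5.0, 4.0], alpha=0.05)
print(running, first)
```

Rejecting as soon as the running product reaches $1/\alpha$ keeps the Type-I error at most $\alpha$, even when the decision to run each further study depends on the earlier results.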
In high dimensional variable selection problems, statisticians often seek to design multiple testing procedures controlling the false discovery rate (FDR) and simultaneously discovering more relevant variables. Model-X methods, such as Knockoffs and conditional randomization tests, achieve the first goal of finite-sample FDR control under the assumption of known covariates distribution. However, it is not clear whether these methods can concurrently achieve the second goal of maximizing the number of discoveries. In fact, designing procedures to discover more relevant variables with finite-sample FDR control is a largely open question, even in the arguably simplest linear models. In this paper, we derive near-optimal testing procedures in high dimensional Bayesian linear models with isotropic covariates. We propose a Model-X multiple testing procedure, PoEdCe, which provably controls the frequentist FDR from finite samples even under model misspecification, and conjecturally achieves near-optimal power when the data follow the Bayesian linear model with a known prior. PoEdCe has three important ingredients: Posterior Expectation, distilled Conditional randomization test (dCRT), and the Benjamini-Hochberg procedure with e-values (eBH). The optimality conjecture of PoEdCe is based on a heuristic calculation of its asymptotic true positive proportion (TPP) and false discovery proportion (FDP), which is supported by methods from statistical physics as well as extensive numerical simulations. Furthermore, when the prior is unknown, we show that an empirical Bayes variant of PoEdCe still has finite-sample FDR control and achieves near-optimal power.
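The final ingredient named above, the Benjamini-Hochberg procedure with e-values (eBH), is short enough to sketch directly. The function name and the example e-values are illustrative assumptions; the posterior-expectation and dCRT steps that would produce the e-values in PoEdCe are not shown.

```python
import numpy as np

def ebh(e_values, alpha=0.1):
    """e-BH: reject the k* hypotheses with the largest e-values, where
    k* is the largest k such that the k-th largest e-value is >= m / (alpha * k)."""
    e = np.asarray(e_values, dtype=float)
    m = e.size
    order = np.argsort(-e)                  # indices sorted by decreasing e-value
    ranks = np.arange(1, m + 1)
    ok = e[order] >= m / (alpha * ranks)
    if not ok.any():
        return np.array([], dtype=int)
    k_star = ranks[ok].max()
    return np.sort(order[:k_star])          # indices of rejected hypotheses

# Hypothetical e-values for four hypotheses.
print(ebh([100.0, 1.0, 50.0, 0.5], alpha=0.1))
```

With $m = 4$ and $\alpha = 0.1$ the thresholds for $k = 1, 2$ are 40 and 20, so the two hypotheses with e-values 100 and 50 are rejected.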
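This paper derives confidence intervals (CIs) and time-uniform confidence sequences (CSs) for the classical problem of estimating an unknown mean from bounded observations. We present a general approach for deriving concentration bounds that can be seen as a generalization (and improvement) of the celebrated Chernoff method. At its heart, it is based on deriving a new class of composite nonnegative martingales, with strong connections to testing by betting and the method of mixtures. We show how to extend these ideas to sampling without replacement, another heavily studied problem. In all cases, our bounds are adaptive to the unknown variance, and empirically vastly outperform existing approaches based on Hoeffding or empirical Bernstein inequalities and their recent supermartingale generalizations. In short, we establish a new state of the art for four fundamental problems: CSs and CIs for bounded means, when sampling with and without replacement.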
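The betting idea can be sketched for observations in $[0, 1]$ as follows. For each candidate mean $m$, a gambler's wealth is multiplied by $1 + \lambda (x_i - m)$ at each step; values of $m$ whose wealth stays below $1/\alpha$ form the confidence set. The fixed bet `lam`, the grid over $m$, and the function names are simplifying assumptions made here for illustration, not the adaptive betting strategies of the paper.

```python
import numpy as np

def hedged_wealth(x, m, lam=0.5):
    """Average of two wealth processes, one betting that the mean is above m
    and one betting that it is below; with |lam| <= 1 every factor is nonnegative."""
    x = np.asarray(x, dtype=float)
    up = np.prod(1.0 + lam * (x - m))
    down = np.prod(1.0 - lam * (x - m))
    return 0.5 * (up + down)

def betting_interval(x, alpha=0.05, lam=0.5, grid_size=101):
    """Grid approximation of {m in [0, 1] : wealth against m is below 1/alpha}."""
    grid = np.linspace(0.0, 1.0, grid_size)
    kept = [m for m in grid if hedged_wealth(x, m, lam) < 1.0 / alpha]
    return (min(kept), max(kept)) if kept else None

# Hypothetical bounded data with mean 0.5.
x = np.tile([0.2, 0.8], 250)
print(betting_interval(x))
```

Under the true mean the wealth is a nonnegative martingale, so by Ville's inequality it exceeds $1/\alpha$ at any time with probability at most $\alpha$; this is what makes the resulting sets valid uniformly over time.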
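Classical false discovery rate (FDR) controlling procedures offer strong and interpretable guarantees, but they often lack flexibility. On the other hand, recent machine learning classification algorithms, such as those based on random forests (RF) or neural networks (NN), have excellent practical performance but lack interpretability and theoretical guarantees. In this paper, we make these two meet by introducing a new adaptive novelty detection procedure, called AdaDetect. It extends the scope of recent works in the multiple testing literature to the high-dimensional setting, notably that of Yang et al. (2021). AdaDetect is shown both to control the FDR strongly and to have power that mimics that of the oracle in a specific sense. The interest and validity of our approach are demonstrated with theoretical results, numerical experiments on several benchmark datasets, and an application to astrophysical data. In particular, while AdaDetect can be used in combination with any classifier, it is particularly efficient on real-world datasets with RF, and on images with NN.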
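We study the problem of maximum likelihood estimation of log-concave densities in the graphical model corresponding to a given undirected graph $G$. We show that the maximum likelihood estimate (MLE) is the product of the exponentials of several tent functions, one for each maximal clique of $G$. While the set of log-concave densities in a graphical model is infinite-dimensional, our results imply that the MLE can be found by solving a finite-dimensional convex optimization problem. We provide an implementation and a few examples. Furthermore, we show that the MLE exists and is unique with probability 1 as long as the number of sample points is larger than the size of the largest clique of $G$ when $G$ is chordal. We show that the MLE is consistent when the graph $G$ is a disjoint union of cliques. Finally, we discuss the conditions under which a log-concave density in the graphical model of $G$ has a log-concave factorization according to $G$.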
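We propose a novel nonparametric two-sample test based on the Maximum Mean Discrepancy (MMD), which is constructed by aggregating tests with different kernel bandwidths. This aggregation procedure, called MMDAgg, ensures that test power is maximized over the collection of kernels used, without requiring held-out data for kernel selection (which results in a loss of test power) or arbitrary kernel choices such as the median heuristic. We work in the non-asymptotic framework and prove that our aggregated test is minimax adaptive over Sobolev balls. Our guarantees are not restricted to a specific kernel, but hold for any product of absolutely integrable one-dimensional translation-invariant characteristic kernels. Moreover, our results apply to popular numerical procedures for determining the test threshold, namely permutations and the wild bootstrap. Through numerical experiments on both synthetic and real-world datasets, we demonstrate that MMDAgg outperforms alternative approaches to MMD kernel adaptation for two-sample testing.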
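Transportation of measure provides a versatile approach for modeling complex probability distributions, with applications in density estimation, Bayesian inference, generative modeling, and beyond. Monotone triangular transport maps, which are approximations of the Knothe-Rosenblatt (KR) rearrangement, are a canonical choice for these tasks. Yet the representation and parameterization of such maps have a significant impact on their generality and expressiveness, and on the properties of the optimization problem that arises in learning a map from data (e.g., via maximum likelihood estimation). We present a general framework for representing monotone triangular maps via invertible transformations of smooth functions. We establish conditions on the transformation such that the associated infinite-dimensional minimization problem has no spurious local minima, i.e., all local minima are global minima. We show that, for target distributions satisfying certain tail conditions, the unique global minimizer corresponds to the KR map. Given a sample from the target, we propose an adaptive algorithm that estimates a sparse semi-parametric approximation of the underlying KR map. We demonstrate how this framework can be applied to joint and conditional density estimation, likelihood-free inference, and structure learning of directed graphical models, with stable generalization performance across a range of sample sizes.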
Deep generative models parametrized up to a normalizing constant (e.g. energy-based models) are difficult to train by maximizing the likelihood of the data because the likelihood and/or gradients thereof cannot be explicitly or efficiently written down. Score matching is a training method, whereby instead of fitting the likelihood $\log p(x)$ for the training data, we instead fit the score function $\nabla_x \log p(x)$ -- obviating the need to evaluate the partition function. Though this estimator is known to be consistent, it is unclear whether (and when) its statistical efficiency is comparable to that of maximum likelihood -- which is known to be (asymptotically) optimal. We initiate this line of inquiry in this paper, and show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated -- i.e. the Poincar\'e, log-Sobolev and isoperimetric constant -- quantities which govern the mixing time of Markov processes like Langevin dynamics. Roughly, we show that the score matching estimator is statistically comparable to the maximum likelihood when the distribution has a small isoperimetric constant. Conversely, if the distribution has a large isoperimetric constant -- even for simple families of distributions like exponential families with rich enough sufficient statistics -- score matching will be substantially less efficient than maximum likelihood. We suitably formalize these results both in the finite sample regime, and in the asymptotic regime. Finally, we identify a direct parallel in the discrete setting, where we connect the statistical properties of pseudolikelihood estimation with approximate tensorization of entropy and the Glauber dynamics.
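For reference, the standard score matching objective (due to Hyvärinen) can be written as below. The symbols $J$, $\theta$ and $p_\theta$ are notation introduced here for illustration; the abstract itself does not state the objective.

```latex
J(\theta) \;=\; \mathbb{E}_{x \sim p}\!\left[
  \tfrac{1}{2}\,\bigl\lVert \nabla_x \log p_\theta(x) \bigr\rVert^2
  + \operatorname{tr}\!\bigl( \nabla_x^2 \log p_\theta(x) \bigr)
\right]
```

Under mild regularity conditions this equals $\tfrac{1}{2}\,\mathbb{E}_{x \sim p}\lVert \nabla_x \log p_\theta(x) - \nabla_x \log p(x) \rVert^2$ up to a constant that does not depend on $\theta$. The normalizing constant of $p_\theta$ never appears because its gradient with respect to $x$ is zero.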
The kernel Maximum Mean Discrepancy (MMD) is a popular multivariate distance metric between distributions that has found utility in two-sample testing. The usual kernel-MMD test statistic is a degenerate U-statistic under the null, and thus it has an intractable limiting distribution. Hence, to design a level-$\alpha$ test, one usually selects the rejection threshold as the $(1-\alpha)$-quantile of the permutation distribution. The resulting nonparametric test has finite-sample validity but suffers from large computational cost, since every permutation takes quadratic time. We propose the cross-MMD, a new quadratic-time MMD test statistic based on sample-splitting and studentization. We prove that under mild assumptions, the cross-MMD has a limiting standard Gaussian distribution under the null. Importantly, we also show that the resulting test is consistent against any fixed alternative, and when using the Gaussian kernel, it has minimax rate-optimal power against local alternatives. For large sample sizes, our new cross-MMD provides a significant speedup over the MMD, for only a slight loss in power.
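A simplified sketch of a sample-split, studentized MMD statistic is shown below. It assumes equal sample sizes, a Gaussian kernel with a fixed bandwidth, and a paired form of the statistic; these are simplifications chosen for illustration and may differ from the exact cross-MMD definition in the paper. All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * bandwidth**2))

def cross_mmd_statistic(X, Y, bandwidth=1.0):
    """Studentized cross statistic; compare with N(0, 1) upper quantiles."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    n = min(len(X), len(Y))
    h = n // 2
    X1, X2, Y1, Y2 = X[:h], X[h:n], Y[:h], Y[h:n]
    # u_i = <k(X_i, .) - k(Y_i, .), mean embedding of X2 minus that of Y2>
    u = (gaussian_kernel(X1, X2, bandwidth).mean(axis=1)
         - gaussian_kernel(X1, Y2, bandwidth).mean(axis=1)
         - gaussian_kernel(Y1, X2, bandwidth).mean(axis=1)
         + gaussian_kernel(Y1, Y2, bandwidth).mean(axis=1))
    return np.sqrt(h) * u.mean() / u.std(ddof=1)

# Hypothetical data: two Gaussian samples with different means.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y = rng.normal(loc=2.0, size=(200, 2))
print(cross_mmd_statistic(X, Y))
```

The statistic needs one pass over four kernel blocks and no permutations, which is where the speedup over a permutation-calibrated MMD test comes from.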
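We propose a new adaptive hypothesis test for polyhedral cone (e.g., monotonicity, convexity) and equality (e.g., parametric, semiparametric) restrictions on a structural function in a nonparametric instrumental variables (NPIV) model. Our test statistic is based on a modified leave-one-out sample analog of a quadratic distance between the restricted and unrestricted sieve estimators. We provide computationally simple, data-driven choices of sieve tuning parameters and adjusted chi-squared critical values. Our test adapts to the unknown smoothness of alternative functions in the presence of unknown degree of endogeneity and unknown strength of the instruments. It attains the adaptive minimax rate of testing in $L^2$. That is, the sum of its type I error uniformly over the composite null and its type II error uniformly over nonparametric alternative models cannot be improved by any other hypothesis test for NPIV models of unknown regularities. Data-driven confidence sets in $L^2$ are obtained by inverting the adaptive test. Simulations confirm that our adaptive test controls size and that its finite-sample power greatly exceeds that of existing non-adaptive tests for monotonicity and parametric restrictions in NPIV models. Empirical applications to testing shape restrictions on differentiated products demand and on Engel curves are presented.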
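We introduce a general nonparametric independence test between right-censored survival times and covariates, which may be multivariate. Our test statistic has a dual interpretation: first, as the supremum of a potentially infinite collection of weight-indexed log-rank tests, with weight functions belonging to a reproducing kernel Hilbert space (RKHS) of functions; and second, as the norm of the difference of embeddings of certain finite measures into the RKHS, similar to the Hilbert-Schmidt Independence Criterion (HSIC) test statistic. We study the asymptotic properties of the test, finding sufficient conditions to ensure that our test correctly rejects the null hypothesis under any alternative. The test statistic can be computed straightforwardly, and the rejection threshold is obtained via an asymptotically consistent wild bootstrap procedure. Extensive investigations on both simulated and real data suggest that our testing procedure generally performs better than competing approaches in detecting complex nonlinear dependence.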
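Off-policy evaluation and learning (OPE/L) use offline observational data to make better decisions, which is crucial in applications where online experimentation is limited. However, because it depends entirely on logged data, OPE/L is sensitive to environment distribution shifts, that is, discrepancies between the data-generating environment and the environment where policies are deployed. \citet{si2020distributional} proposed distributionally robust OPE/L (DROPE/L) to address this, but the proposal relies on inverse-propensity weighting, whose estimation error and regret deteriorate if propensities are nonparametrically estimated and whose variance is suboptimal even if they are not. For standard, non-robust OPE/L, this is solved by doubly robust (DR) methods, but they do not naturally extend to the more complex DROPE/L, which involves a worst-case expectation. In this paper, we propose the first DR algorithms for DROPE/L with KL-divergence uncertainty sets. For evaluation, we propose Localized Doubly Robust DROPE (LDR$^2$OPE) and show that it achieves semiparametric efficiency under weak product rate conditions. Thanks to a localization technique, LDR$^2$OPE only requires fitting a small number of regressions, just like DR methods for standard OPE. For learning, we propose Continuum Doubly Robust DROPL (CDR$^2$OPL) and show that, under a product rate condition involving a continuum of regressions, it enjoys a fast regret rate of $\mathcal{O}\left(n^{-1/2}\right)$ even when unknown propensities are nonparametrically estimated. We empirically validate our algorithms in simulations and further extend our results to general $f$-divergence uncertainty sets.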
Mixtures of regression are a powerful class of models for regression learning with respect to a highly uncertain and heterogeneous response variable of interest. In addition to being a rich predictive model for the response given some covariates, the parameters in this model class provide useful information about the heterogeneity in the data population, which is represented by the conditional distributions for the response given the covariates associated with a number of distinct but latent subpopulations. In this paper, we investigate conditions of strong identifiability, rates of convergence for conditional density and parameter estimation, and the Bayesian posterior contraction behavior arising in finite mixture of regression models, under exact-fitted and over-fitted settings and when the number of components is unknown. This theory is applicable to common choices of link functions and families of conditional distributions employed by practitioners. We provide simulation studies and data illustrations, which shed some light on the parameter learning behavior found in several popular regression mixture models reported in the literature.
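Mixture of experts (MoE) is a popular class of statistical and machine learning models that has gained attention over the years due to its flexibility and efficiency. In this work, we consider Gaussian-gated localized MoE (GLoME) and block-diagonal covariance localized MoE (BLoME) regression models to represent nonlinear relationships in heterogeneous data with potential hidden graph-structured interactions between high-dimensional predictors. These models pose difficult statistical estimation and model selection questions, both from a computational and a theoretical perspective. This paper is devoted to the study of the problem of model selection among a collection of GLoME or BLoME models characterized by the number of mixture components, the complexity of the Gaussian mean experts, and the hidden block-diagonal structures of the covariance matrices, in a penalized maximum likelihood estimation framework. In particular, we establish non-asymptotic risk bounds that take the form of weak oracle inequalities, provided that lower bounds on the penalties hold. The good empirical behavior of our models is then demonstrated on both synthetic and real datasets.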