Smart Paper Notes

Downsampling for Testing and Learning in Product Distributions

Nathaniel Harms , Yuichi Yoshida

Category: Machine Learning

2020-07-15

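We study distribution-free property testing and learning problems where the unknown probability distribution is a product distribution over $\mathbb{R}^d$. For many important classes of functions, such as intersections of halfspaces, polynomial threshold functions, convex sets, and $k$-alternating functions, the known algorithms either have complexity that depends on the support size of the distribution, or are proven to work only for specific examples of product distributions. We introduce a general method, which we call downsampling, that resolves these issues. Downsampling uses a notion of "rectilinear isoperimetry" for product distributions, which further strengthens the connection between isoperimetry, testing, and learning. Using this technique, we attain new efficient distribution-free algorithms under product distributions on $\mathbb{R}^d$: 1. A simpler proof for non-adaptive, one-sided monotonicity testing of functions $[n]^d \to \{0,1\}$, and improved sample complexity for testing monotonicity over unknown product distributions, from $O(d^7)$ [Black, Chakrabarty, & Seshadhri, SODA 2020] to $\widetilde O(d^3)$. 2. Polynomial-time agnostic learning algorithms for functions of a constant number of halfspaces, and for constant-degree polynomial threshold functions. 3. An $\exp(O(d \log(dk)))$-time agnostic learning algorithm, and an $\exp(O(d \log(dk)))$-sample tolerant tester, for functions of $k$ convex sets; and a $2^{\widetilde O(d)}$ sample-based one-sided tester for convex sets. 4. An $\exp(\widetilde O(k \sqrt d))$-time agnostic learning algorithm for $k$-alternating functions, and a sample-based tolerant tester with the same complexity.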
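
To make the monotonicity property in item 1 concrete, here is a minimal sketch of a brute-force check for functions $[n]^d \to \{0,1\}$. It is not the tester from the paper: it reads all $n^d$ points, whereas a tester inspects only a few. The name `is_monotone` and the example functions are assumptions made for illustration.

```python
from itertools import product

def is_monotone(f, n, d):
    """Exhaustively check that f: [n]^d -> {0,1} is monotone.

    f is monotone if x <= y coordinatewise implies f(x) <= f(y).
    It suffices to compare each point with its neighbours that are
    one step larger in a single coordinate, because any comparable
    pair is joined by a chain of such unit steps.
    """
    for x in product(range(1, n + 1), repeat=d):
        for i in range(d):
            if x[i] < n:
                y = x[:i] + (x[i] + 1,) + x[i + 1:]
                if f(x) > f(y):
                    return False  # found a violated pair (x, y)
    return True

# Hypothetical examples on the grid [3]^2.
def threshold(x):
    return 1 if sum(x) >= 4 else 0   # monotone

def bump(x):
    return 1 if x[0] == 2 else 0     # not monotone: f drops from x[0]=2 to x[0]=3
```

A tester only has to distinguish monotone functions from those that are far from monotone, which is why it can get away with far fewer evaluations than this exhaustive check.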

Translated by Google Translate

Identity Testing for High-Dimensional Distributions via Entropy Tensorization

Antonio Blanca , Zongchen Chen , Daniel Štefankovič , Eric Vigoda

Category: Machine Learning

2022-07-19

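We present improved algorithms, together with statistical and computational lower bounds, for the problem of identity testing $n$-dimensional distributions. In the identity testing problem, we are given as input an explicit distribution $\mu$, an $\varepsilon>0$, and access to a sampling oracle for a hidden distribution $\pi$. The goal is to distinguish whether the two distributions $\mu$ and $\pi$ are identical or at least $\varepsilon$-far apart. When there is only access to full samples from the hidden distribution $\pi$, it is known that many samples may be needed, and hence previous works have studied identity testing with additional access to various conditional sampling oracles. We consider here a significantly weaker conditional sampling oracle, called the Coordinate Oracle, and provide a fairly complete computational and statistical characterization of the identity testing problem in this new model. We prove that if an analytic property known as approximate tensorization of entropy holds for the visible distribution $\mu$, then there is an efficient identity testing algorithm for any hidden $\pi$ that uses $\tilde{O}(n/\varepsilon)$ queries to the Coordinate Oracle. Approximate tensorization of entropy is a classical tool for proving optimal mixing time bounds of Markov chains for high-dimensional distributions, and it has recently been established for many families of distributions via spectral independence. We complement our algorithmic result for identity testing with a matching $\Omega(n/\varepsilon)$ statistical lower bound on the number of queries under the Coordinate Oracle. We also prove a computational phase transition: for sparse antiferromagnetic Ising models over $\{+1,-1\}^n$, in the regime where approximate tensorization of entropy fails, there is no efficient identity testing algorithm unless RP=NP.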


Near-Optimal Statistical Query Hardness of Learning Halfspaces with Massart Noise

Ilias Diakonikolas , Daniel M. Kane

Category: Machine Learning | (Statistics) Machine Learning

2020-12-17

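We study the problem of PAC learning halfspaces with Massart noise. Given labeled samples $(x, y)$ from a distribution $D$ on $\mathbb{R}^{d} \times \{\pm 1\}$ such that the marginal $D_x$ on the examples is arbitrary and the label $y$ of example $x$ is generated from the target halfspace corrupted by a Massart adversary with flipping probability $\eta(x) \leq \eta \leq 1/2$, the goal is to compute a hypothesis with small misclassification error. The best known $\mathrm{poly}(d, 1/\epsilon)$-time algorithms for this problem achieve error of $\eta+\epsilon$, which can be far from the optimal bound of $\mathrm{OPT}+\epsilon$, where $\mathrm{OPT} = \mathbf{E}_{x \sim D_x}[\eta(x)]$. While it is known that achieving $\mathrm{OPT}+o(1)$ error requires super-polynomial time in the Statistical Query model, a large gap remains between the known upper and lower bounds. In this work, we essentially characterize the efficient learnability of Massart halfspaces in the Statistical Query (SQ) model. Specifically, we show that no efficient SQ algorithm for learning Massart halfspaces on $\mathbb{R}^d$ can achieve error better than $\Omega(\eta)$, even if $\mathrm{OPT} = 2^{-\log^{c}(d)}$, for any universal constant $c \in (0,1)$. Furthermore, when the noise upper bound $\eta$ is close to $1/2$, our error lower bound becomes $\eta - o_{\eta}(1)$, where the $o_{\eta}(1)$ term goes to $0$ as $\eta$ approaches $1/2$. Our results provide strong evidence that known learning algorithms for Massart halfspaces are nearly best possible, thereby resolving a longstanding open problem in learning theory.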
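
The noise model above can be simulated in a few lines. The sketch below is only an illustration of the definitions, not an algorithm from the paper. It assumes a standard Gaussian marginal (the abstract allows an arbitrary one), and the function names and the particular choice of $\eta(x)$ are hypothetical. It estimates the error of the target halfspace itself, which equals $\mathrm{OPT} = \mathbf{E}[\eta(x)]$.

```python
import random

def example_eta(x, eta=0.4):
    """Hypothetical flipping probability: eta on one side, 0 on the other.

    Any function with 0 <= eta(x) <= eta <= 1/2 is allowed by the model.
    """
    return eta if x[0] > 0 else 0.0

def massart_sample(w, eta_fn, rng):
    """Draw one example: returns (x, noisy_label, clean_label)."""
    # Assumption: Gaussian marginal, chosen only to keep the sketch simple.
    x = [rng.gauss(0.0, 1.0) for _ in w]
    clean = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
    flipped = rng.random() < eta_fn(x)
    return x, (-clean if flipped else clean), clean

def target_error(w, eta_fn, num_samples, seed=0):
    """Empirical misclassification error of the target halfspace w."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(num_samples):
        _, noisy, clean = massart_sample(w, eta_fn, rng)
        wrong += (noisy != clean)
    return wrong / num_samples
```

With this choice of $\eta(x)$, half of the Gaussian mass has flipping probability $0.4$ and the other half $0$, so $\mathrm{OPT} = 0.2$ while the uniform bound is $\eta = 0.4$. This is the kind of gap between $\eta + \epsilon$ and $\mathrm{OPT} + \epsilon$ that the abstract refers to.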


List-Decodable Covariance Estimation

Misha Ivkov , Pravesh K. Kothari

Category: Machine Learning | (Statistics) Machine Learning

2022-06-22

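We give the first polynomial-time algorithm for \emph{list-decodable covariance estimation}. For any $\alpha > 0$, our algorithm takes as input a sample $Y \subseteq \mathbb{R}^d$ of size $n \geq d^{\mathsf{poly}(1/\alpha)}$ obtained by adversarially corrupting $(1-\alpha)n$ points in an i.i.d. sample $X$ of size $n$ from the Gaussian distribution with unknown mean $\mu_*$ and covariance $\Sigma_*$. In $n^{\mathsf{poly}(1/\alpha)}$ time, it outputs a constant-size list of $k = k(\alpha) = (1/\alpha)^{\mathsf{poly}(1/\alpha)}$ candidate parameters that, with high probability, contains a $(\hat{\mu},\hat{\Sigma})$ such that the total variation distance $TV(\mathcal{N}(\mu_*,\Sigma_*),\mathcal{N}(\hat{\mu},\hat{\Sigma})) < 1-O_{\alpha}(1)$. This is the statistically strongest notion of distance and implies multiplicative spectral and relative Frobenius distance approximation for the parameters with dimension-independent error. Our algorithm works more generally for $(1-\alpha)$-corruptions of any distribution $D$ that possesses low-degree sum-of-squares certificates of two natural analytic properties: 1) anti-concentration of one-dimensional marginals and 2) hypercontractivity of degree-2 polynomials. Prior to our work, the only known results for estimating covariance in the list-decodable setting were for special cases due to Karmarkar, Klivans, and Kothari (2019), Raghavendra and Yau (2019 and 2020), and Bakshi and Kothari (2020). These results need superpolynomial time to obtain any subconstant error in the underlying dimension. Our result implies the first polynomial-time \emph{exact} algorithm for list-decodable linear regression and subspace recovery, which in particular allows obtaining $2^{-\mathsf{poly}(d)}$ error in polynomial time. Our result also implies an improved algorithm for clustering non-spherical mixtures.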


Robustness Implies Privacy in Statistical Estimation

Samuel B. Hopkins , Gautam Kamath , Mahbod Majid , Shyam Narayanan

Category: (Statistics) Machine Learning

2022-12-09

We study the relationship between adversarial robustness and differential privacy in high-dimensional algorithmic statistics. We give the first black-box reduction from privacy to robustness which can produce private estimators with optimal tradeoffs among sample complexity, accuracy, and privacy for a wide range of fundamental high-dimensional parameter estimation problems, including mean and covariance estimation. We show that this reduction can be implemented in polynomial time in some important special cases. In particular, using nearly-optimal polynomial-time robust estimators for the mean and covariance of high-dimensional Gaussians which are based on the Sum-of-Squares method, we design the first polynomial-time private estimators for these problems with nearly-optimal samples-accuracy-privacy tradeoffs. Our algorithms are also robust to a constant fraction of adversarially-corrupted samples.


Active Sampling for Linear Regression Beyond the $\ell_2$ Norm

Cameron Musco , Christopher Musco , David P. Woodruff , Taisuke Yasuda

Category: Machine Learning | (Statistics) Machine Learning

2021-11-09

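We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector $b \in \mathbb{R}^n$ and output a near minimizer to $\min_{x \in \mathbb{R}^d} \|Ax-b\|$, where $A \in \mathbb{R}^{n \times d}$ is a design matrix and $\|\cdot\|$ is some loss function. For $\ell_p$ norm regression for any $0 < p < \infty$, we give an algorithm based on Lewis weight sampling that outputs a $(1+\epsilon)$-approximate solution using just $\tilde{O}(d^{\max(1,p/2)}/\mathrm{poly}(\epsilon))$ queries to $b$. We show that this dependence on $d$ is optimal, up to logarithmic factors. Our result resolves a recent open question of Chen and Dereziński, who gave near-optimal bounds for the $\ell_1$ norm and suboptimal bounds for $\ell_p$ regression with $p \in (1,2)$. We also provide the first total sensitivity upper bound of $O(d^{\max\{1,p/2\}} \log^2 n)$ for loss functions with at most degree-$p$ polynomial growth. This improves a recent result of Tukan, Maalouf, and Feldman. By combining this with our techniques for the $\ell_p$ regression result, we obtain an active regression algorithm making $\tilde O(d^{1+\max\{1,p/2\}}/\mathrm{poly}(\epsilon))$ queries, answering another open question of Chen and Dereziński. For the important special case of the Huber loss, we further improve our bound to an active sample complexity of $\tilde O(d^{(1+\sqrt2)/2}/\epsilon^c)$ and a non-active sample complexity of $\tilde O(d^{4-2\sqrt 2}/\epsilon^c)$, improving a previous $d^4$ bound for Huber regression due to Clarkson and Woodruff. Our sensitivity bounds have further implications, improving a variety of previous results that use sensitivity sampling, including Orlicz norm subspace embeddings and robust subspace approximation. Finally, our active sampling results give the first sublinear-time algorithms for every $\ell_p$ norm.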


Robust Sparse Mean Estimation via Sum of Squares

Ilias Diakonikolas , Daniel M. Kane , Sushrut Karmalkar , Ankit Pensia , Thanasis Pittas

Category: Machine Learning | (Statistics) Machine Learning

2022-06-07

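We study the problem of high-dimensional sparse mean estimation in the presence of an $\epsilon$-fraction of adversarial outliers. Prior work obtained sample-efficient and computationally efficient algorithms for this task for identity-covariance subgaussian distributions. In this work, we develop the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance. For distributions on $\mathbb{R}^d$ with "certifiably bounded" $t$-th moments and sufficiently light tails, our algorithm achieves error of $O(\epsilon^{1-1/t})$ with sample complexity $m = (k\log(d))^{O(t)}/\epsilon^{2-2/t}$. For the special case of the Gaussian distribution, our algorithm achieves near-optimal error of $\tilde O(\epsilon)$ with sample complexity $m = O(k^4 \mathrm{polylog}(d))/\epsilon^2$. Our algorithms follow the Sum-of-Squares based proofs-to-algorithms approach. We complement our upper bounds with Statistical Query and low-degree polynomial testing lower bounds, providing evidence that the sample-time-error tradeoffs achieved by our algorithms are qualitatively the best possible.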


Two new results about quantum exact learning

Srinivasan Arunachalam , Sourav Chakraborty , Troy Lee , Manaswi Paraashar , Ronald de Wolf

Category: Machine Learning

2018-09-30

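We present two new results about exact learning by quantum computers. First, we show how to exactly learn a $k$-Fourier-sparse $n$-bit Boolean function from $O(k^{1.5}(\log k)^2)$ uniform quantum examples for that function. This improves over the bound of $\widetilde{\Theta}(kn)$ uniformly random \emph{classical} examples (Haviv and Regev, CCC'15). Additionally, we provide a possible direction for improving our $\widetilde{O}(k^{1.5})$ upper bound by proving an improvement of Chang's lemma for $k$-Fourier-sparse Boolean functions. Second, we show that if a concept class $\mathcal{C}$ can be exactly learned using $Q$ quantum membership queries, then it can also be learned using $O\left(\frac{Q^2}{\log Q}\log|\mathcal{C}|\right)$ \emph{classical} membership queries. This improves the previous best simulation result (Servedio and Gortler, SICOMP'04) by a $\log Q$ factor.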
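
As a small illustration of what "$k$-Fourier-sparse" means, the sketch below computes all Fourier coefficients of a Boolean function by brute force and counts the nonzero ones. It uses the standard convention (not spelled out in the abstract) that $f: \{0,1\}^n \to \{-1,+1\}$ has coefficients $\hat f(s) = 2^{-n}\sum_x f(x)(-1)^{s \cdot x}$ and is $k$-Fourier-sparse when at most $k$ of them are nonzero. This is a classical exponential-time computation for tiny $n$, unrelated to the quantum learning algorithm, and the function names are assumptions.

```python
from itertools import product

def fourier_coefficients(f, n):
    """Return {s: f_hat(s)} for f: {0,1}^n -> {-1,+1}, by brute force."""
    points = list(product([0, 1], repeat=n))
    coeffs = {}
    for s in points:
        total = 0
        for x in points:
            parity = sum(si * xi for si, xi in zip(s, x)) % 2
            total += f(x) * (-1) ** parity   # f(x) times the character chi_s(x)
        coeffs[s] = total / len(points)
    return coeffs

def fourier_sparsity(f, n, tol=1e-9):
    """Number of nonzero Fourier coefficients of f."""
    return sum(1 for c in fourier_coefficients(f, n).values() if abs(c) > tol)

# Hypothetical examples.
def parity_02(x):
    return (-1) ** ((x[0] + x[2]) % 2)       # a single character: sparsity 1

def and_2(x):
    return -1 if (x[0] == 1 and x[1] == 1) else 1   # AND of two bits: sparsity 4
```

A parity function is 1-Fourier-sparse, while the AND of two bits has four nonzero coefficients, each of magnitude $1/2$.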


Near-Optimal Bounds for Testing Histogram Distributions

Clément L. Canonne , Ilias Diakonikolas , Daniel M. Kane , Sihan Liu

Category: Machine Learning

2022-07-14

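We investigate the problem of testing whether a discrete probability distribution over an ordered domain is a histogram on a specified number of bins. One of the most common tools for the succinct approximation of data, $k$-histograms over $[n]$ are probability distributions that are piecewise constant over a set of $k$ intervals. The histogram testing problem is the following: given samples from an unknown distribution $\mathbf{p}$ on $[n]$, we want to distinguish between the case that $\mathbf{p}$ is a $k$-histogram and the case that it is $\varepsilon$-far from any $k$-histogram in total variation distance. Our main result is a sample near-optimal and computationally efficient algorithm for this testing problem, and a nearly matching (within logarithmic factors) sample complexity lower bound. Specifically, we show that the histogram testing problem has sample complexity $\widetilde \Theta(\sqrt{nk}/\varepsilon + k/\varepsilon^2 + \sqrt{n}/\varepsilon^2)$.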
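
The definitions in this abstract are easy to state in code. The following sketch assumes the distribution is given explicitly as a list of probabilities, which is not the testing setting (a tester only sees samples). It checks the $k$-histogram property, computes total variation distance, and evaluates the three terms of the stated sample complexity with constants and logarithmic factors dropped. All function names are hypothetical.

```python
import math

def num_pieces(p, tol=1e-12):
    """Number of maximal intervals on which the probability vector p is constant."""
    if not p:
        return 0
    pieces = 1
    for a, b in zip(p, p[1:]):
        if abs(a - b) > tol:
            pieces += 1
    return pieces

def is_k_histogram(p, k, tol=1e-12):
    """True if p is piecewise constant over at most k intervals."""
    return num_pieces(p, tol) <= k

def total_variation(p, q):
    """Total variation distance between two distributions on the same domain."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def sample_bound(n, k, eps):
    """sqrt(nk)/eps + k/eps^2 + sqrt(n)/eps^2, ignoring constants and log factors."""
    return math.sqrt(n * k) / eps + k / eps**2 + math.sqrt(n) / eps**2
```

For example, `[0.1, 0.1, 0.3, 0.3, 0.2]` has three constant pieces, so it is a 3-histogram but not a 2-histogram.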


Learning General Halfspaces with General Massart Noise under the Gaussian Distribution

Ilias Diakonikolas , Daniel M. Kane , Vasilis Kontonis , Christos Tzamos , Nikos Zarifis

Category: Machine Learning | (Statistics) Machine Learning

2021-08-19

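We study the problem of PAC learning halfspaces with Massart noise under the Gaussian distribution. In the Massart model, an adversary is allowed to flip the label of each point $\mathbf{x}$ with unknown probability $\eta(\mathbf{x}) \leq \eta$, for some parameter $\eta \in [0,1/2]$. The goal is to find a hypothesis with misclassification error of $\mathrm{OPT} + \epsilon$, where $\mathrm{OPT}$ is the error of the target halfspace. This problem had previously been studied under two assumptions: (i) the target halfspace is homogeneous (i.e., the separating hyperplane goes through the origin), and (ii) the parameter $\eta$ is strictly smaller than $1/2$. Prior to this work, no nontrivial bounds were known when either of these assumptions is removed. We study the general problem and establish the following. For $\eta < 1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $d^{O_{\eta}(\log(1/\gamma))}\mathrm{poly}(1/\epsilon)$, where $\gamma = \max\{\epsilon, \min\{\mathbf{Pr}[f(\mathbf{x}) = 1], \mathbf{Pr}[f(\mathbf{x}) = -1]\}\}$ is the bias of the target halfspace $f$. Prior efficient algorithms could only handle the special case of $\gamma = 1/2$. Interestingly, we establish a qualitatively matching lower bound of $d^{\Omega(\log(1/\gamma))}$ on the complexity of any Statistical Query (SQ) algorithm. For $\eta = 1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $O_\epsilon(1)\, d^{O(\log(1/\epsilon))}$. This result is new even for the subclass of homogeneous halfspaces; prior algorithms for homogeneous Massart halfspaces provide vacuous guarantees for $\eta = 1/2$. We complement our upper bound with a nearly matching SQ lower bound of $d^{\Omega(\log(1/\epsilon))}$, which holds even for the special case of homogeneous halfspaces.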


Privacy Induces Robustness: Information-Computation Gaps and Sparse Mean Estimation

Kristian Georgiev , Samuel B. Hopkins

Category: (Statistics) Machine Learning | Machine Learning

2022-11-01

We establish a simple connection between robust and differentially-private algorithms: private mechanisms which perform well with very high probability are automatically robust in the sense that they retain accuracy even if a constant fraction of the samples they receive are adversarially corrupted. Since optimal mechanisms typically achieve these high success probabilities, our results imply that optimal private mechanisms for many basic statistics problems are robust. We investigate the consequences of this observation for both algorithms and computational complexity across different statistical problems. Assuming the Brennan-Bresler secret-leakage planted clique conjecture, we demonstrate a fundamental tradeoff between computational efficiency, privacy leakage, and success probability for sparse mean estimation. Private algorithms which match this tradeoff are not yet known -- we achieve that (up to polylogarithmic factors) in a polynomially-large range of parameters via the Sum-of-Squares method. To establish an information-computation gap for private sparse mean estimation, we also design new (exponential-time) mechanisms using fewer samples than efficient algorithms must use. Finally, we give evidence for privacy-induced information-computation gaps for several other statistics and learning problems, including PAC learning parity functions and estimation of the mean of a multivariate Gaussian.


Private and polynomial time algorithms for learning Gaussians and beyond

Hassan Ashtiani , Christopher Liaw

Category: (Statistics) Machine Learning | Machine Learning

2021-11-22

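We present a fairly general framework for reducing $(\varepsilon, \delta)$ differentially private (DP) statistical estimation to its non-private counterpart. As the main application of this framework, we give a polynomial-time and $(\varepsilon,\delta)$-DP algorithm for learning (unrestricted) Gaussian distributions in $\mathbb{R}^d$. The sample complexity of our approach for learning the Gaussian up to total variation distance $\alpha$ is $\widetilde{O}\left(\frac{d^2}{\alpha^2} + \frac{d^2 \sqrt{\ln(1/\delta)}}{\alpha\varepsilon}\right)$, matching (up to logarithmic factors) the best known information-theoretic (non-efficient) sample complexity upper bound of Aden-Ali, Ashtiani, Kamath (ALT'21). In an independent work, Kamath, Mouzakis, Singhal, Steinke, and Ullman (arXiv:2111.04609) proved a similar result using a different approach and with $O(d^{5/2})$ sample complexity dependence on $d$. As another application of our framework, we provide the first polynomial-time $(\varepsilon, \delta)$-DP algorithm for robust learning of (unrestricted) Gaussians.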


A Strongly Polynomial Algorithm for Approximate Forster Transforms and its Application to Halfspace Learning

Ilias Diakonikolas , Christos Tzamos , Daniel M. Kane

Category: Machine Learning | (Statistics) Machine Learning

2022-12-06

The Forster transform is a method of regularizing a dataset by placing it in {\em radial isotropic position} while maintaining some of its essential properties. Forster transforms have played a key role in a diverse range of settings spanning computer science and functional analysis. Prior work had given {\em weakly} polynomial time algorithms for computing Forster transforms, when they exist. Our main result is the first {\em strongly polynomial time} algorithm to compute an approximate Forster transform of a given dataset or certify that no such transformation exists. By leveraging our strongly polynomial Forster algorithm, we obtain the first strongly polynomial time algorithm for {\em distribution-free} PAC learning of halfspaces. This learning result is surprising because {\em proper} PAC learning of halfspaces is {\em equivalent} to linear programming. Our learning approach extends to give a strongly polynomial halfspace learner in the presence of random classification noise and, more generally, Massart noise.


Optimal SQ Lower Bounds for Robustly Learning Discrete Product Distributions and Ising Models

Ilias Diakonikolas , Daniel M. Kane , Yuxin Sun

Category: Machine Learning | (Statistics) Machine Learning

2022-06-09

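We establish optimal Statistical Query (SQ) lower bounds for robustly learning certain families of discrete high-dimensional distributions. In particular, we show that no efficient SQ algorithm with access to an $\epsilon$-corrupted binary product distribution can learn its mean within $\ell_2$-error $o(\epsilon \sqrt{\log(1/\epsilon)})$. Similarly, we show that no efficient SQ algorithm with access to an $\epsilon$-corrupted ferromagnetic high-temperature Ising model can learn the model to total variation distance $o(\epsilon \log(1/\epsilon))$. Our SQ lower bounds match the error guarantees of known algorithms for these problems, providing evidence that current upper bounds for these tasks are best possible. At the technical level, we develop a generic SQ lower bound for discrete high-dimensional distributions starting from low-dimensional moment-matching constructions, which we believe will find other applications. Additionally, we introduce new ideas to analyze these moment-matching constructions for discrete univariate distributions.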


What Can We Learn Privately?

Shiva Prasad Kasiviswanathan , Homin K. Lee , Kobbi Nissim , Sofya Raskhodnikova , Adam Smith

Category:

2008-03-06

Learning problems form an important category of computational tasks that generalizes many of the computations researchers apply to large real-life data sets. We ask: what concept classes can be learned privately, namely, by an algorithm whose output does not depend too heavily on any one input or specific training example? More precisely, we investigate learning algorithms that satisfy differential privacy, a notion that provides strong confidentiality guarantees in contexts where aggregate information is released about a database containing sensitive information about individuals. Our goal is a broad understanding of the resources required for private learning in terms of samples, computation time, and interaction. We demonstrate that, ignoring computational constraints, it is possible to privately agnostically learn any concept class using a sample size approximately logarithmic in the cardinality of the concept class. Therefore, almost anything learnable is learnable privately: specifically, if a concept class is learnable by a (non-private) algorithm with polynomial sample complexity and output size, then it can be learned privately using a polynomial number of samples. We also present a computationally efficient private PAC learner for the class of parity functions. This result dispels the similarity between learning with noise and private learning (both must be robust to small changes in inputs), since parity is thought to be very hard to learn given random classification noise. Local (or randomized response) algorithms are a practical class of private algorithms that have received extensive investigation. We provide a precise characterization of local private learning algorithms. We show that a concept class is learnable by a local algorithm if and only if it is learnable in the statistical query (SQ) model.
Therefore, for local private learning algorithms, the similarity to learning with noise is stronger: local learning is equivalent to SQ learning, and SQ algorithms include most known noise-tolerant learning algorithms. Finally, we present a separation between the power of interactive and noninteractive local learning algorithms. Because of the equivalence to SQ learning, this result also separates adaptive and nonadaptive SQ learning.
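
For readers unfamiliar with the local model, classic randomized response is the simplest example of a local private algorithm. The sketch below is a textbook illustration, not a construction from the paper: each user reports their true bit with probability $e^{\varepsilon}/(1+e^{\varepsilon})$ and the flipped bit otherwise, which satisfies $\varepsilon$-differential privacy for that user, and the analyst debiases the average of the reports. The function names are hypothetical.

```python
import math
import random

def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit

def estimate_mean(reports, epsilon):
    """Unbiased estimate of the fraction of 1s among the true bits.

    If mu is the true fraction and p the truth probability, then
    E[report] = (1 - p) + mu * (2p - 1); solve this for mu.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

def run_demo(true_bits, epsilon, seed=0):
    """Apply randomized response to every bit and return the debiased mean."""
    rng = random.Random(seed)
    reports = [randomized_response(b, epsilon, rng) for b in true_bits]
    return estimate_mean(reports, epsilon)
```

The analyst only ever averages noisy reports, which is the same kind of access a statistical query provides. This gives some intuition for the equivalence between local learning and SQ learning stated above.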


On the power of adaptivity in statistical adversaries

Guy Blanc , Jane Lange , Ali Malik , Li-Yang Tan

Category: Machine Learning

2021-11-19

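We study a fundamental question concerning adversarial noise models in statistical problems where the algorithm receives i.i.d. draws from a distribution $\mathcal{D}$. The definitions of these adversaries specify the type of allowable corruptions (noise model) as well as when these corruptions can be made (adaptivity); the latter differentiates between oblivious adversaries, which can only corrupt the distribution $\mathcal{D}$, and adaptive adversaries, whose corruptions can depend on the specific sample $S$ that is drawn from $\mathcal{D}$. In this work, we investigate whether oblivious adversaries are effectively equivalent to adaptive adversaries, across all noise models studied in the literature. Specifically, can the behavior of an algorithm $\mathcal{A}$ in the presence of oblivious adversaries always be well approximated by that of an algorithm $\mathcal{A}'$ in the presence of adaptive adversaries? Our first result shows that this is indeed the case for the broad class of statistical query algorithms, under all reasonable noise models. We then show that in the specific case of additive noise, this equivalence holds for all algorithms. Finally, we map out an approach towards proving this statement in its fullest generality, for all algorithms and under all reasonable noise models.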


Boosting Simple Learners

Noga Alon , Alon Gonen , Elad Hazan , Shay Moran

Category: Machine Learning | (Statistics) Machine Learning

2020-01-31

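Boosting is a celebrated machine learning approach based on the idea of combining weak and moderately inaccurate hypotheses into a strong and accurate one. We study boosting under the assumption that the weak hypotheses belong to a class of bounded capacity. This assumption is inspired by the common convention that weak hypotheses are "rules of thumb" from an "easy-to-learn class" (Schapire and Freund '12, Shalev-Shwartz and Ben-David '14). Formally, we assume the class of weak hypotheses has bounded VC dimension. We focus on two main questions: (i) Oracle complexity: how many weak hypotheses are needed to produce an accurate hypothesis? We design a novel boosting algorithm and demonstrate that it circumvents a classical lower bound by Freund and Schapire ('95, '12). Whereas the lower bound shows that $\Omega(1/\gamma^2)$ weak hypotheses with $\gamma$-margin are sometimes necessary, our new method requires only $\tilde{O}(1/\gamma)$ weak hypotheses, provided that they belong to a class of bounded VC dimension. Unlike previous boosting algorithms, which aggregate the weak hypotheses by majority votes, the new boosting algorithm uses more complex ("deeper") aggregation rules. We complement this result by showing that complex aggregation rules are in fact necessary to circumvent the aforementioned lower bound. (ii) Expressivity: which tasks can be learned by boosting weak hypotheses from a bounded VC class? Can complex concepts that are "far away" from the class be learned? Towards answering the first question, we introduce combinatorial-geometric parameters which capture expressivity in boosting. As a corollary, we provide an affirmative answer to the second question for well-studied classes, including halfspaces and decision stumps. Along the way, we establish and exploit connections with discrepancy theory.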


On the Statistical Complexity of Sample Amplification

Brian Axelrod , Shivam Garg , Yanjun Han , Vatsal Sharan , Gregory Valiant

Category: Machine Learning

2022-01-12

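Given $n$ i.i.d. samples drawn from an unknown distribution $P$, when is it possible to produce a larger set of $n+m$ samples which cannot be distinguished from $n+m$ i.i.d. samples drawn from $P$? (Axelrod et al. 2019) formalized this question as the sample amplification problem, and gave optimal amplification procedures for discrete distributions and Gaussian location models. However, these procedures and the associated lower bounds are tailored to the specific distribution classes, and a general statistical understanding of sample amplification is still largely missing. In this work, we place the sample amplification problem on a firm statistical foundation by deriving generally applicable amplification procedures, lower bound techniques, and connections to existing statistical notions. Our techniques apply to a large class of distributions, including the exponential family, and establish a rigorous connection between sample amplification and distribution learning.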


Clustering Mixtures with Almost Optimal Separation in Polynomial Time

Jerry Li , Allen Liu

Category: Machine Learning | (Statistics) Machine Learning

2021-12-01

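We consider the problem of clustering mixtures of mean-separated Gaussians in high dimensions. We are given samples from a mixture of $k$ identity-covariance Gaussians, such that the minimum pairwise distance between any two means is at least $\Delta$, for some parameter $\Delta > 0$, and the goal is to recover the ground-truth clustering of these samples. It is folklore that separation $\Delta = \Theta(\sqrt{\log k})$ is both necessary and sufficient to recover a good clustering, at least information-theoretically. However, the estimators which achieve this guarantee are inefficient. We give the first algorithm which runs in polynomial time and almost matches this guarantee. More precisely, we give an algorithm which takes polynomially many samples and time, and which can successfully recover a good clustering so long as the separation is $\Delta = \Omega(\log^{1/2 + c} k)$, for any $c > 0$. Previously, polynomial-time algorithms were only known for this problem when the separation was polynomial in $k$, and all algorithms which could tolerate $\textsf{poly}(\log k)$ separation required quasipolynomial time. We also extend our result to mixtures of translations of a distribution which satisfies the Poincaré inequality, under additional mild assumptions. Our main technical tool, which we believe is of independent interest, is a novel way to implicitly represent and estimate high-degree moments of a distribution, which allows us to extract important information about high-degree moments without ever writing down the full moment tensors explicitly.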


Cryptographic Hardness of Learning Halfspaces with Massart Noise

Ilias Diakonikolas , Daniel M. Kane , Pasin Manurangsi , Lisheng Ren

Category: Machine Learning

2022-07-28

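We study the complexity of PAC learning halfspaces in the presence of Massart noise. In this problem, we are given i.i.d. labeled examples $(\mathbf{x}, y) \in \mathbb{R}^n \times \{\pm 1\}$, where the distribution of $\mathbf{x}$ is arbitrary and the label $y$ is a Massart corruption of $f(\mathbf{x})$, for an unknown halfspace $f: \mathbb{R}^n \to \{\pm 1\}$, with flipping probability $\eta(\mathbf{x}) \leq \eta < 1/2$. The goal of the learner is to compute a hypothesis with small 0-1 error. Our main result is the first computational hardness result for this learning problem. Specifically, assuming the (widely believed) subexponential-time hardness of the Learning with Errors (LWE) problem, we show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small, namely $\mathrm{OPT} = 2^{-\log^{c}(n)}$ for any universal constant $c \in (0,1)$. Prior work had provided qualitatively similar evidence of hardness in the Statistical Query model. Our computational hardness result essentially resolves the polynomial PAC learnability of Massart halfspaces, by showing that known efficient learning algorithms for the problem are nearly best possible.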
