We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean $\mu$ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates $\mu$ with high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability $\tau$, having an additive $\log(1/\tau)$ dependence. Our algorithm leverages the stability-based approach from the algorithmic robust statistics literature, with crucial (and necessary) adaptations required in our setting. Our analysis may be of independent interest, involving the delicate design of a (non-spectral) decomposition for positive semi-definite matrices satisfying certain sparsity properties.
translated by 谷歌翻译
预测+优化是一个最近提出的框架,将机器学习和约束优化结合在一起,解决包含在求解时间未知参数的优化问题。目标是预测未知参数,并使用估计值来解决优化问题的估计最佳解决方案。但是,所有先前的作品都集中在未知参数仅出现在优化目标而不是约束中的情况下,其简单原因是,如果不确定的约束,则估计的最佳解决方案在真实参数下甚至可能是可行的。 。本文的贡献是两个方面。首先,我们为预测+优化设置提出了一个新颖且实际相关的框架,但是在目标和约束中都有未知参数。我们介绍了校正函数的概念,并在损失函数中的额外惩罚项进行了建模实际情况,在该方案中可以将估计的最佳解决方案修改为可行的解决方案,并在揭示了真实参数后,但以额外的成本进行了修改。其次,我们为我们的框架提出了相应的算法方法,该方法处理所有包装和涵盖线性程序。我们的方法灵感来自先前的曼迪和枪支工作,尽管对我们的不同环境进行了关键的修改和重新启示。实验证明了我们方法比经典方法的卓越经验表现。
translated by 谷歌翻译
我们考虑一维位置估计,其中我们从$ n $ samples $ \ lambda + \ eta_i $估算一个参数$ \ lambda $,每个$ \ eta_i $ drawn i.i.d.从已知的分销$ f $。对于固定的$ f $,最大易变估计(MLE)众所周知,在$ n \ to \ infty $中是最佳的,它是渐近正常的,差异与cram \'er-rao的差异相匹配。\ frac {1} {n \ Mathcal {i}} $,其中$ \ Mathcal {i} $是$ f $的Fisher信息。但是,这种界限不适合有限$ n $,或者当$ f $随$ n $而变化时。我们以任意$ f $和$ n $的方式显示,人们可以根据$ f $的平滑版本的渔民信息来恢复类似的理论,其中平滑半径损失了$ n $。
translated by 谷歌翻译
ICECUBE是一种用于检测1 GEV和1 PEV之间大气和天体中微子的光学传感器的立方公斤阵列,该阵列已部署1.45 km至2.45 km的南极的冰盖表面以下1.45 km至2.45 km。来自ICE探测器的事件的分类和重建在ICeCube数据分析中起着核心作用。重建和分类事件是一个挑战,这是由于探测器的几何形状,不均匀的散射和冰中光的吸收,并且低于100 GEV的光,每个事件产生的信号光子数量相对较少。为了应对这一挑战,可以将ICECUBE事件表示为点云图形,并将图形神经网络(GNN)作为分类和重建方法。 GNN能够将中微子事件与宇宙射线背景区分开,对不同的中微子事件类型进行分类,并重建沉积的能量,方向和相互作用顶点。基于仿真,我们提供了1-100 GEV能量范围的比较与当前ICECUBE分析中使用的当前最新最大似然技术,包括已知系统不确定性的影响。对于中微子事件分类,与当前的IceCube方法相比,GNN以固定的假阳性速率(FPR)提高了信号效率的18%。另外,GNN在固定信号效率下将FPR的降低超过8(低于半百分比)。对于能源,方向和相互作用顶点的重建,与当前最大似然技术相比,分辨率平均提高了13%-20%。当在GPU上运行时,GNN能够以几乎是2.7 kHz的中位数ICECUBE触发速率的速率处理ICECUBE事件,这打开了在在线搜索瞬态事件中使用低能量中微子的可能性。
translated by 谷歌翻译
Migraine is a high-prevalence and disabling neurological disorder. However, information migraine management in real-world settings could be limited to traditional health information sources. In this paper, we (i) verify that there is substantial migraine-related chatter available on social media (Twitter and Reddit), self-reported by migraine sufferers; (ii) develop a platform-independent text classification system for automatically detecting self-reported migraine-related posts, and (iii) conduct analyses of the self-reported posts to assess the utility of social media for studying this problem. We manually annotated 5750 Twitter posts and 302 Reddit posts. Our system achieved an F1 score of 0.90 on Twitter and 0.93 on Reddit. Analysis of information posted by our 'migraine cohort' revealed the presence of a plethora of relevant information about migraine therapies and patient sentiments associated with them. Our study forms the foundation for conducting an in-depth analysis of migraine-related information using social media data.
translated by 谷歌翻译
In the recent years, various gradient descent algorithms including the methods of gradient descent, gradient descent with momentum, adaptive gradient (AdaGrad), root-mean-square propagation (RMSProp) and adaptive moment estimation (Adam) have been applied to the parameter optimization of several deep learning models with higher accuracies or lower errors. These optimization algorithms may need to set the values of several hyperparameters which include a learning rate, momentum coefficients, etc. Furthermore, the convergence speed and solution accuracy may be influenced by the values of hyperparameters. Therefore, this study proposes an analytical framework to use mathematical models for analyzing the mean error of each objective function based on various gradient descent algorithms. Moreover, the suitable value of each hyperparameter could be determined by minimizing the mean error. The principles of hyperparameter value setting have been generalized based on analysis results for model optimization. The experimental results show that higher efficiency convergences and lower errors can be obtained by the proposed method.
translated by 谷歌翻译
Large language models (LLMs) have demonstrated excellent zero-shot generalization to new language tasks. However, effective utilization of LLMs for zero-shot visual question-answering (VQA) remains challenging, primarily due to the modality disconnection and task disconnection between LLM and VQA task. End-to-end training on vision and language data may bridge the disconnections, but is inflexible and computationally expensive. To address this issue, we propose \emph{Img2Prompt}, a plug-and-play module that provides the prompts that can bridge the aforementioned modality and task disconnections, so that LLMs can perform zero-shot VQA tasks without end-to-end training. In order to provide such prompts, we further employ LLM-agnostic models to provide prompts that can describe image content and self-constructed question-answer pairs, which can effectively guide LLM to perform zero-shot VQA tasks. Img2Prompt offers the following benefits: 1) It can flexibly work with various LLMs to perform VQA. 2)~Without the needing of end-to-end training, it significantly reduces the cost of deploying LLM for zero-shot VQA tasks. 3) It achieves comparable or better performance than methods relying on end-to-end training. For example, we outperform Flamingo~\cite{Deepmind:Flamingo2022} by 5.6\% on VQAv2. On the challenging A-OKVQA dataset, our method even outperforms few-shot methods by as much as 20\%.
translated by 谷歌翻译
Model counting is a fundamental problem which has been influential in many applications, from artificial intelligence to formal verification. Due to the intrinsic hardness of model counting, approximate techniques have been developed to solve real-world instances of model counting. This paper designs a new anytime approach called PartialKC for approximate model counting. The idea is a form of partial knowledge compilation to provide an unbiased estimate of the model count which can converge to the exact count. Our empirical analysis demonstrates that PartialKC achieves significant scalability and accuracy over prior state-of-the-art approximate counters, including satss and STS. Interestingly, the empirical results show that PartialKC reaches convergence for many instances and therefore provides exact model counting performance comparable to state-of-the-art exact counters.
translated by 谷歌翻译
Generating realistic motions for digital humans is a core but challenging part of computer animations and games, as human motions are both diverse in content and rich in styles. While the latest deep learning approaches have made significant advancements in this domain, they mostly consider motion synthesis and style manipulation as two separate problems. This is mainly due to the challenge of learning both motion contents that account for the inter-class behaviour and styles that account for the intra-class behaviour effectively in a common representation. To tackle this challenge, we propose a denoising diffusion probabilistic model solution for styled motion synthesis. As diffusion models have a high capacity brought by the injection of stochasticity, we can represent both inter-class motion content and intra-class style behaviour in the same latent. This results in an integrated, end-to-end trained pipeline that facilitates the generation of optimal motion and exploration of content-style coupled latent space. To achieve high-quality results, we design a multi-task architecture of diffusion model that strategically generates aspects of human motions for local guidance. We also design adversarial and physical regulations for global guidance. We demonstrate superior performance with quantitative and qualitative results and validate the effectiveness of our multi-task architecture.
translated by 谷歌翻译
In this paper the implementation of piezoelectrics to a state-of-the-art wafer gripper is investigated. The objective is to propose and validate a solution method, which includes a mechanical design and control system, to achieve at least 5% damping for two eigenmodes of a wafer gripper. This objective serves as a 'proof of concept' to show the possibilities of implementing a state-of-the-art damping method to an industrial application, which in turn can be used to dampen different thin structures. The coupling relation between the piezoelectrics and their host structure were used to design the placement of the piezoelectric patches, together with modal analysis data of the a state-of-the-art wafer gripper. This data had been measured through an experimental setup. Active damping has been succesfully implemented onto the wafer gripper where positive position feedback (PPF) is used as a control algorithm to dampen two eigenmodes.
translated by 谷歌翻译