超参数优化构成了典型的现代机器学习工作流程的很大一部分。这是由于这样一个事实,即机器学习方法和相应的预处理步骤通常只有在正确调整超参数时就会产生最佳性能。但是在许多应用中,我们不仅有兴趣仅仅为了预测精度而优化ML管道;确定最佳配置时,必须考虑其他指标或约束,从而导致多目标优化问题。由于缺乏知识和用于多目标超参数优化的知识和容易获得的软件实现,因此通常在实践中被忽略。在这项工作中,我们向读者介绍了多个客观超参数优化的基础知识,并激励其在应用ML中的实用性。此外,我们从进化算法和贝叶斯优化的领域提供了现有优化策略的广泛调查。我们说明了MOO在几个特定ML应用中的实用性,考虑了诸如操作条件,预测时间,稀疏,公平,可解释性和鲁棒性之类的目标。
translated by 谷歌翻译
尽管自动超参数优化(HPO)的所有好处,但大多数现代的HPO算法本身都是黑盒子。这使得很难理解导致所选配置,减少对HPO的信任,从而阻碍其广泛采用的决策过程。在这里,我们研究了HPO与可解释的机器学习(IML)方法(例如部分依赖图)的组合。但是,如果将这种方法天真地应用于HPO过程的实验数据,则优化器的潜在采样偏差会扭曲解释。我们提出了一种修改的HPO方法,该方法有效地平衡了对全局最佳W.R.T.的搜索。预测性能以及通过耦合贝叶斯优化和贝叶斯算法执行的基础黑框函数的IML解释的可靠估计。在神经网络的合成目标和HPO的基准情况下,我们证明我们的方法返回对基础黑盒的更可靠的解释,而不会损失优化性能。
translated by 谷歌翻译
自动化封路计优化(HPO)已经获得了很大的普及,并且是大多数自动化机器学习框架的重要成分。然而,设计HPO算法的过程仍然是一个不系统和手动的过程:确定了现有工作的限制,提出的改进是 - 即使是专家知识的指导 - 仍然是一定任意的。这很少允许对哪些算法分量的驾驶性能进行全面了解,并且承载忽略良好算法设计选择的风险。我们提出了一个原理的方法来实现应用于多倍性HPO(MF-HPO)的自动基准驱动算法设计的原则方法:首先,我们正式化包括的MF-HPO候选的丰富空间,但不限于普通的HPO算法,然后呈现可配置的框架覆盖此空间。要自动和系统地查找最佳候选者,我们遵循通过优化方法,并通过贝叶斯优化搜索算法候选的空间。我们挑战是否必须通过执行消融分析来挑战所发现的设计选择或可以通过更加天真和更简单的设计。我们观察到使用相对简单的配置,在某些方式中比建立的方法更简单,只要某些关键配置参数具有正确的值,就可以很好地执行得很好。
translated by 谷歌翻译
自动化的HyperParameter优化(HPO)可以支持从业者在机器学习模型中获得峰值性能。然而,通常缺乏有价值的见解,以对不同的超参数对最终模型性能的影响。这种缺乏可解释性使得难以信任并理解自动化的HPO过程及其结果。我们建议使用可解释的机器学习(IML)从HPO中获得的实验数据与贝叶斯优化(BO)一起获得见解。 BO倾向于专注于具有潜在高性能配置的有前途的区域,从而诱导采样偏差。因此,许多IML技术,例如部分依赖曲线(PDP),承载产生偏置解释的风险。通过利用BO代理模型的后部不确定性,我们引入了具有估计置信带的PDP的变种。我们建议分区Quand参数空间以获得相关子区域的更自信和可靠的PDP。在一个实验研究中,我们为子区域内PDP的质量提高提供了定量证据。
translated by 谷歌翻译
在开发和分析新的高参数优化方法时,在经过良好策划的基准套件上进行经验评估和比较至关重要。在这项工作中,我们提出了一套新的具有挑战性和相关的基准问题,这些问题是由此类基准测试的理想属性和要求所激发的。我们新的基于替代物的基准集合包含14个方案,这些方案总共构成了700多个多保体超参数优化问题,所有这些方案都可以实现多目标超参数优化。此外,我们从经验上将基于替代物的基准测试与更广泛的表格基准进行了比较,并证明后者可能会在HPO方法的性能排名中产生不忠实的结果。我们检查并比较了根据定义要求的基准收集,并提出了一个单目标和多目标基准套件,我们在基准实验中比较了7个单目标和7个多目标优化器。我们的软件可从[https://github.com/slds-lmu/yahpo_gym]获得。
translated by 谷歌翻译
Many modern research fields increasingly rely on collecting and analysing massive, often unstructured, and unwieldy datasets. Consequently, there is growing interest in machine learning and artificial intelligence applications that can harness this `data deluge'. This broad nontechnical overview provides a gentle introduction to machine learning with a specific focus on medical and biological applications. We explain the common types of machine learning algorithms and typical tasks that can be solved, illustrating the basics with concrete examples from healthcare. Lastly, we provide an outlook on open challenges, limitations, and potential impacts of machine-learning-powered medicine.
translated by 谷歌翻译
The detection of anomalies in time series data is crucial in a wide range of applications, such as system monitoring, health care or cyber security. While the vast number of available methods makes selecting the right method for a certain application hard enough, different methods have different strengths, e.g. regarding the type of anomalies they are able to find. In this work, we compare six unsupervised anomaly detection methods with different complexities to answer the questions: Are the more complex methods usually performing better? And are there specific anomaly types that those method are tailored to? The comparison is done on the UCR anomaly archive, a recent benchmark dataset for anomaly detection. We compare the six methods by analyzing the experimental results on a dataset- and anomaly type level after tuning the necessary hyperparameter for each method. Additionally we examine the ability of individual methods to incorporate prior knowledge about the anomalies and analyse the differences of point-wise and sequence wise features. We show with broad experiments, that the classical machine learning methods show a superior performance compared to the deep learning methods across a wide range of anomaly types.
translated by 谷歌翻译
Dialogue models are able to generate coherent and fluent responses, but they can still be challenging to control and may produce non-engaging, unsafe results. This unpredictability diminishes user trust and can hinder the use of the models in the real world. To address this, we introduce DialGuide, a novel framework for controlling dialogue model behavior using natural language rules, or guidelines. These guidelines provide information about the context they are applicable to and what should be included in the response, allowing the models to generate responses that are more closely aligned with the developer's expectations and intent. We evaluate DialGuide on three tasks in open-domain dialogue response generation: guideline selection, response generation, and response entailment verification. Our dataset contains 10,737 positive and 15,467 negative dialogue context-response-guideline triplets across two domains - chit-chat and safety. We provide baseline models for the tasks and benchmark their performance. We also demonstrate that DialGuide is effective in the dialogue safety domain, producing safe and engaging responses that follow developer guidelines.
translated by 谷歌翻译
To train deep learning models, which often outperform traditional approaches, large datasets of a specified medium, e.g., images, are used in numerous areas. However, for light field-specific machine learning tasks, there is a lack of such available datasets. Therefore, we create our own light field datasets, which have great potential for a variety of applications due to the abundance of information in light fields compared to singular images. Using the Unity and C# frameworks, we develop a novel approach for generating large, scalable, and reproducible light field datasets based on customizable hardware configurations to accelerate light field deep learning research.
translated by 谷歌翻译
We consider a sequential decision making task where we are not allowed to evaluate parameters that violate an a priori unknown (safety) constraint. A common approach is to place a Gaussian process prior on the unknown constraint and allow evaluations only in regions that are safe with high probability. Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case. Moreover, the way in which they exploit regularity assumptions about the constraint introduces an additional critical hyperparameter. In this paper, we propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate. Our approach is naturally applicable to continuous domains and does not require additional hyperparameters. We theoretically analyze the method and show that we do not violate the safety constraint with high probability and that we explore by learning about the constraint up to arbitrary precision. Empirical evaluations demonstrate improved data-efficiency and scalability.
translated by 谷歌翻译