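Background: Embedded feature selection in high-dimensional data with very small sample sizes requires optimized hyperparameters for the model building process. For this hyperparameter optimization, nested cross-validation must be applied to avoid a biased performance estimate. The resulting repeated training on high-dimensional data leads to very long computation times. Moreover, the individual performance evaluation metrics are likely to show high variance caused by outliers in tiny validation sets. Therefore, early stopping with standard pruning algorithms to save time risks discarding promising hyperparameter sets. Results: To speed up feature selection for high-dimensional data with tiny sample sizes, we adapt a state-of-the-art asynchronous successive halving pruner. In addition, we combine it with two complementary pruning strategies based on domain or prior knowledge. One pruning strategy immediately stops computing trials whose results are semantically meaningless for the selected hyperparameter combination. The other is a new extrapolating threshold pruning strategy suited to nested cross-validation with a high variance of the performance evaluation metrics. In repeated experiments, our combined pruning strategy kept all promising trials. At the same time, the computation time was substantially reduced compared to using a state-of-the-art asynchronous successive halving pruner alone. Up to 81.3% fewer models were trained while achieving the same optimization result. Conclusion: The proposed combined pruning strategy accelerates data analysis or enables a deeper hyperparameter search within the same computation time. This leads to significant savings in time, money, and energy consumption, opening the door to advanced, time-consuming analyses.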
translated by 谷歌翻译
We study the problem of combining neural networks with symbolic reasoning. Recently introduced frameworks for Probabilistic Neurosymbolic Learning (PNL), such as DeepProbLog, perform exponential-time exact inference, limiting the scalability of PNL solutions. We introduce Approximate Neurosymbolic Inference (A-NeSI): a new framework for PNL that uses neural networks for scalable approximate inference. A-NeSI 1) performs approximate inference in polynomial time without changing the semantics of probabilistic logics; 2) is trained using data generated by the background knowledge; 3) can generate symbolic explanations of predictions; and 4) can guarantee the satisfaction of logical constraints at test time, which is vital in safety-critical applications. Our experiments show that A-NeSI is the first end-to-end method to scale the Multi-digit MNISTAdd benchmark to sums of 15 MNIST digits, up from 4 in competing systems. Finally, our experiments show that A-NeSI achieves explainability and safety without a penalty in performance.
Speech-to-text models tend to be trained and evaluated against a single target accent. This is especially true for English, for which native speakers from the United States became the main benchmark. In this work, we show how two simple methods, pre-trained embeddings and auxiliary classification losses, can improve the performance of ASR systems. We are looking for upgrades that are as universal as possible, and therefore we explore their impact on several model architectures and several languages.
The Makespan Scheduling problem is an extensively studied NP-hard problem. Its simplest version seeks an allocation of a set of jobs with deterministic processing times to two identical machines such that the makespan is minimized. However, in real-life scenarios, the actual processing time of each job may be stochastic around its expected value with some variance, under the influence of external factors, and the actual processing times of these jobs may be correlated through covariances. In this paper, we therefore propose a chance-constrained version of the Makespan Scheduling problem and investigate the theoretical performance of the classical Randomized Local Search and (1+1) EA on it. More specifically, we first study two variants of the Chance-constrained Makespan Scheduling problem and their computational complexities, then separately analyze the expected runtime of the two algorithms to obtain an optimal or almost optimal solution to instances of the two variants. In addition, we investigate the experimental performance of the two algorithms on the two variants.
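As a rough illustration of the setting, the following is a minimal sketch of a (1+1) EA on the simplest, deterministic two-machine version of the problem only. It does not model the chance constraints, variances, or covariances studied in the paper. The function names, the job times, the iteration budget, and the seed are all assumptions made for this example.

```python
import random


def makespan(assignment, times):
    """Makespan of a two-machine schedule; bit i selects the machine of job i."""
    load_one = sum(t for bit, t in zip(assignment, times) if bit)
    load_zero = sum(times) - load_one
    return max(load_zero, load_one)


def one_plus_one_ea(times, iterations=2000, seed=0):
    """Illustrative (1+1) EA: standard bit mutation with rate 1/n, elitist acceptance."""
    rng = random.Random(seed)
    n = len(times)
    # Start from a uniformly random assignment of jobs to the two machines.
    current = [rng.randint(0, 1) for _ in range(n)]
    best = makespan(current, times)
    for _ in range(iterations):
        # Flip each bit independently with probability 1/n.
        child = [bit ^ 1 if rng.random() < 1.0 / n else bit for bit in current]
        value = makespan(child, times)
        # Accept the offspring if it is not worse than the parent.
        if value <= best:
            current, best = child, value
    return current, best


# Hypothetical instance with deterministic processing times.
times = [7, 5, 4, 4, 3, 3, 2]
assignment, value = one_plus_one_ea(times)
print(assignment, value)
```

Randomized Local Search would differ only in the mutation step, flipping exactly one uniformly chosen bit per iteration instead of flipping each bit independently.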
To mitigate climate change, the share of renewable energies needs to be increased. Renewable energies introduce new challenges to power grids due to decentralization, reduced inertia, and volatility in production. The operation of sustainable power grids with a high penetration of renewable energies requires new methods to analyze dynamic stability. We provide new datasets of the dynamic stability of synthetic power grids and find that graph neural networks (GNNs) are surprisingly effective at predicting the highly non-linear target from topological information only. To illustrate the potential to scale to real-sized power grids, we demonstrate successful prediction on a Texan power grid model.
The evolution of wireless communications into 6G and beyond is expected to rely on new machine learning (ML)-based capabilities. These can enable proactive decisions and actions from wireless-network components to sustain quality-of-service (QoS) and user experience. Moreover, new use cases in the area of vehicular and industrial communications will emerge. Specifically in the area of vehicle communication, vehicle-to-everything (V2X) schemes will benefit strongly from such advances. With this in mind, we have conducted a detailed measurement campaign with the purpose of enabling a plethora of diverse ML-based studies. The resulting datasets offer GPS-located wireless measurements across diverse urban environments for both cellular (with two different operators) and sidelink radio access technologies, thus enabling a variety of different studies towards V2X. The datasets are labeled and sampled with a high time resolution. Furthermore, we make the data publicly available with all the necessary information to support the on-boarding of new researchers. We provide an initial analysis of the data showing some of the challenges that ML needs to overcome and the features that ML can leverage, as well as some hints at potential research studies.
Force modulation of robotic manipulators has been extensively studied for several decades. However, it is not yet commonly used in safety-critical applications due to a lack of accurate interaction contact modeling and weak performance guarantees, many of which concern the modulation of interaction forces. This study presents a high-level framework for simultaneous trajectory optimization and force control of the interaction between a manipulator and soft environments, which is prone to external disturbances. Sliding friction and normal contact force are taken into account. The dynamics of the soft contact model and the manipulator are simultaneously incorporated in a trajectory optimizer to generate desired motion and force profiles. A constrained optimization framework based on the Alternating Direction Method of Multipliers (ADMM) is employed to efficiently generate real-time optimal control inputs and high-dimensional state trajectories in a Model Predictive Control fashion. Experimental validation of the model performance is conducted on a soft substrate with known material properties using a Cartesian space force control mode. Results show a comparison of ground truth and real-time model-based contact force and motion tracking for multiple Cartesian motions in the valid range of the friction model. It is shown that a contact model-based motion planner can compensate for frictional forces and motion disturbances and improve the overall motion and force tracking accuracy. The proposed high-level planner has the potential to facilitate the automation of medical tasks involving the manipulation of compliant, delicate, and deformable tissues.
Vision and language (VL) models are known to exploit unrobust indicators in individual modalities (e.g., introduced by distributional biases), instead of focusing on relevant information in each modality. A small drop in accuracy obtained on a VL task with a unimodal model suggests that so-called unimodal collapse occurred. But how can the amount of unimodal collapse be quantified reliably, at dataset and instance level, to diagnose and combat unimodal collapse in a targeted way? We present MM-SHAP, a performance-agnostic multimodality score that quantifies the proportion by which a model uses individual modalities in multimodal tasks. MM-SHAP is based on Shapley values and can be applied in two ways: (1) to compare models for their degree of multimodality, and (2) to measure the contribution of individual modalities for a given task and dataset. Experiments with 6 VL models (LXMERT, CLIP, and four ALBEF variants) on four VL tasks highlight that unimodal collapse can occur to different degrees and in different directions, contradicting the widespread assumption that unimodal collapse is one-sided. We recommend MM-SHAP for analysing multimodal tasks, to diagnose and guide progress towards multimodal integration. Code available at: https://github.com/Heidelberg-NLP/MM-SHAP
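To make the idea of a modality proportion concrete, here is a minimal sketch of how such a score could be computed from per-token Shapley values. The abstract does not give the formula, so the normalization used here (each modality's share of the summed absolute Shapley values) is an assumption, as are the function name `modality_shares` and the example values. The linked repository is the authoritative reference.

```python
def modality_shares(text_values, image_values):
    """Return (text_share, image_share) from per-token Shapley values.

    Assumption for this sketch: each modality's contribution is the sum of the
    absolute Shapley values of its tokens, normalized by the total over both.
    """
    text_mass = sum(abs(v) for v in text_values)
    image_mass = sum(abs(v) for v in image_values)
    total = text_mass + image_mass
    if total == 0:
        raise ValueError("all Shapley values are zero; shares are undefined")
    return text_mass / total, image_mass / total


# Hypothetical Shapley values for three text tokens and three image patches.
text_share, image_share = modality_shares([0.2, -0.1, 0.3], [0.1, -0.1, 0.2])
print(f"text: {text_share:.2f}, image: {image_share:.2f}")
```

Taking absolute values means that tokens pushing the prediction down count as much as tokens pushing it up, which fits a score meant to be independent of task performance.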
A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential can integrate all twelve proteins and can capture experimental structural features of mutated proteins. These results indicate that machine learning coarse-grained potentials could provide a feasible approach to simulate and understand protein dynamics.
Hyperparameter optimization (HPO) is essential for better performance in deep learning, and practitioners often need to consider the trade-off between multiple metrics, such as error rate, latency, memory requirements, robustness, and algorithmic fairness. Due to this demand and the heavy computation of deep learning, the acceleration of multi-objective (MO) optimization becomes ever more important. Although meta-learning has been extensively studied to speed up HPO, existing methods are not applicable to the MO tree-structured Parzen estimator (MO-TPE), a simple yet powerful MO-HPO algorithm. In this paper, we extend TPE's acquisition function to the meta-learning setting, using a task similarity defined by the overlap in promising domains of each task. In a comprehensive set of experiments, we demonstrate that our method accelerates MO-TPE on tabular HPO benchmarks and yields state-of-the-art performance. Our method was also validated externally by winning the AutoML 2022 competition on "Multiobjective Hyperparameter Optimization for Transformers".