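This document describes the Generalized Moving Peaks Benchmark (GMPB), which generates continuous dynamic optimization problem instances. The landscapes generated by GMPB are constructed by assembling several components with a variety of controllable characteristics, ranging from unimodal to highly multimodal, symmetric to highly asymmetric, and smooth to highly irregular, with various degrees of variable interaction and ill-conditioning. In this document, we explain how these characteristics can be generated through different parameter settings of GMPB. The MATLAB source code of GMPB is also explained.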
translated by Google Translate
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
We introduce a family of interpretable machine learning models, with two broad additions: Linearised Additive Models (LAMs), which replace the ubiquitous logistic link function in Generalised Additive Models (GAMs); and SubscaleHedge, an expert advice algorithm for combining base models trained on subsets of features called subscales. LAMs can augment any additive binary classification model equipped with a sigmoid link function. Moreover, they afford direct global and local attributions of additive components to the model output in probability space. We argue that LAMs and SubscaleHedge improve the interpretability of their base algorithms. Using rigorous null-hypothesis significance testing on a broad suite of financial modelling data, we show that our algorithms do not suffer from large performance penalties in terms of ROC-AUC and calibration.
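To make the "expert advice" idea concrete, the following is a minimal sketch of the classic Hedge (exponential weights) update for combining per-expert probability forecasts. It is not the SubscaleHedge algorithm itself, whose details the abstract does not give; the function name `hedge_combine`, the learning rate `eta`, and the use of absolute loss are all assumptions made for illustration.

```python
import math


def hedge_combine(expert_probs, labels, eta=0.5):
    """Combine expert forecasts with the generic Hedge update (illustrative only).

    expert_probs: list of rounds, each a list of per-expert probabilities for class 1.
    labels: list of true binary labels (0 or 1), one per round.
    Returns (combined forecast per round, final normalised weights).
    """
    n_experts = len(expert_probs[0])
    weights = [1.0 / n_experts] * n_experts
    combined = []
    for probs, y in zip(expert_probs, labels):
        # Forecast before the label is revealed: weighted average of the experts.
        combined.append(sum(w * p for w, p in zip(weights, probs)))
        # Absolute loss in [0, 1] for each expert on this round (an assumed choice).
        losses = [abs(p - y) for p in probs]
        # Multiplicative update, then renormalise so the weights sum to 1.
        weights = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return combined, weights
```

Because the combined forecast is a convex combination of the experts' outputs, the final weights can be read directly as each expert's share of the prediction, which is one reason this family of algorithms is attractive for interpretability.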
Pronoun resolution is a challenging subset of an essential field in natural language processing called coreference resolution. Coreference resolution is about finding all entities in the text that refer to the same real-world entity. This paper presents a hybrid model combining multiple rule-based sieves with a machine-learning sieve for pronouns. For this purpose, seven high-precision rule-based sieves are designed for the Persian language. Then, a random forest classifier links pronouns to the previous partial clusters. The presented method demonstrates exemplary performance using pipeline design and combining the advantages of machine learning and rule-based methods. This method has solved some challenges in end-to-end models. In this paper, the authors develop a Persian coreference corpus called Mehr in the form of 400 documents. This corpus fixes some weaknesses of the previous corpora in the Persian language. Finally, the efficiency of the presented system compared to the earlier model in Persian is reported by evaluating the proposed method on the Mehr and Uppsala test sets.
Coreference resolution (CR) is one of the most challenging areas of natural language processing. This task seeks to identify all textual references to the same real-world entity. Research in this field is divided into coreference resolution and anaphora resolution. Due to its application in textual comprehension and its utility in other tasks such as information extraction systems, document summarization, and machine translation, this field has attracted considerable interest. Consequently, it has a significant effect on the quality of these systems. This article reviews the existing corpora and evaluation metrics in this field. Then, an overview of the coreference algorithms, from rule-based methods to the latest deep learning techniques, is provided. Finally, coreference resolution and pronoun resolution systems in Persian are investigated.
Existing metrics for evaluating the quality of automatically generated questions such as BLEU, ROUGE, BERTScore, and BLEURT compare the reference and predicted questions, providing a high score when there is a considerable lexical overlap or semantic similarity between the candidate and the reference questions. This approach has two major shortcomings. First, we need expensive human-provided reference questions. Second, it penalises valid questions that may not have high lexical or semantic similarity to the reference questions. In this paper, we propose a new metric, RQUGE, based on the answerability of the candidate question given the context. The metric consists of a question-answering and a span scorer module, in which we use pre-trained models from the existing literature, and therefore, our metric can be used without further training. We show that RQUGE has a higher correlation with human judgment without relying on the reference question. RQUGE is shown to be significantly more robust to several adversarial corruptions. Additionally, we illustrate that we can significantly improve the performance of QA models on out-of-domain datasets by fine-tuning on the synthetic data generated by a question generation model and re-ranked by RQUGE.
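We present a solution strategy for parameter identification in multiphase thermo-hydro-mechanical (THM) processes in porous media using physics-informed neural networks (PINNs). We employ a dimensionless form of the governing equations that is particularly well suited to inverse problems, and we leverage the sequential multiphysics PINN solver developed in our previous work. We validate the proposed inverse-modeling approach on multiple benchmark problems, including Terzaghi's isothermal consolidation problem, Barry-Mercer's isothermal injection-production problem, and nonisothermal consolidation of an unsaturated soil layer. We report excellent performance of the proposed sequential PINN-THM inverse solver, paving the way for the application of PINNs to inverse modeling of complex nonlinear multiphysics problems.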
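The increasing adoption of digital assets (DAs), such as Bitcoin (BTC), raises the need for accurate option pricing models. Yet existing methodologies fail to cope with the volatile nature of emerging DAs. Many models have been proposed to address the unorthodox market dynamics and frequent disruptions in the microstructure caused by the non-stationarity and peculiar statistics of DA markets. However, they are either prone to the curse of dimensionality, because additional complexity is required to employ traditional theories, or they overfit historical patterns that may never repeat. Instead, we leverage recent advances in market regime (MR) clustering together with the Implied Stochastic Volatility Model (ISVM). Time-regime clustering is a temporal clustering method that clusters the historical evolution of a market into different volatility periods. ISVM can incorporate investor expectations in each of the sentiment-driven periods by using implied volatility (IV) data. In this paper, we apply this integrated time-regime clustering and ISVM method (termed MR-ISVM) to high-frequency data on BTC options from the popular trading platform Deribit. We demonstrate that MR-ISVM helps overcome the burden of complex adaptation to jumps in the higher-order characteristics of option pricing models. This allows us to price the market based on the expectations of its participants in an adaptive fashion.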
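3D pose estimation is important for analyzing and improving ergonomics in physical human-robot interaction and for reducing the risk of musculoskeletal disorders. Vision-based pose estimation methods are prone to sensor and model errors as well as occlusion, while pose estimation solely from the interacting robot's trajectory suffers from ambiguous solutions. To benefit from the strengths of both approaches and mitigate their drawbacks, we introduce a low-cost, non-intrusive, and occlusion-robust multi-sensory 3D pose estimation algorithm for physical human-robot interaction. We use 2D postures from OpenPose over a single camera, together with the trajectory of the interacting robot while the human performs a task. We model the problem as a partially observable dynamical system and infer the 3D pose via a particle filter. We present our work in the context of teleoperation, but it can be generalized to other applications of physical human-robot interaction. We show that our multi-sensory system resolves human kinematic redundancy better than posture estimation using only OpenPose or posture estimation using only the robot's trajectory. This increases the accuracy of the estimated postures relative to gold-standard motion capture postures. Moreover, our approach also performs better than the single-sensory methods when postures are assessed with the RULA assessment tool.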
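As a rough illustration of inference in a partially observable dynamical system, the sketch below runs a bootstrap particle filter on a scalar hidden state. It is a toy under stated assumptions, not the authors' pose estimator: the random-walk motion model, the Gaussian observation model, the uniform prior on [-5, 5], and every parameter name are hypothetical choices made only to show the predict, weight, and resample cycle.

```python
import math
import random


def particle_filter(observations, n_particles=500, process_std=0.1,
                    obs_std=0.5, seed=0):
    """Bootstrap particle filter for a 1D state (toy example, assumed models)."""
    rng = random.Random(seed)
    # Assumed prior: particles spread uniformly over [-5, 5].
    particles = [rng.uniform(-5.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: propagate each particle through a random-walk motion model.
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # Weight: Gaussian likelihood of the observation given each particle.
        weights = [math.exp(-0.5 * ((z - x) / obs_std) ** 2) for x in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Estimate: posterior mean under the current weighted particle set.
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample: draw a new particle set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

In a multi-sensory setting, the weighting step is where several observation sources would be fused, for example by multiplying their likelihoods when the sensors are assumed independent.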
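We present a method for instance proposal generation for 3D point clouds. Existing techniques typically regress proposals directly in a single feed-forward step, which leads to inaccurate estimates. We show that this is a critical bottleneck and propose a method based on iterative bilateral filtering. In the spirit of bilateral filtering, we consider both the deep feature embedding of each point and its location in 3D space. We show via synthetic experiments that our method brings drastic improvements when generating instance proposals for a given point of interest. We further validate our method on the challenging ScanNet benchmark, achieving the best instance segmentation performance within the sub-category of top-down methods.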
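The idea of weighting neighbours by both spatial and feature proximity can be sketched with a plain, fixed-kernel bilateral filter over point features. This is only an illustration of the general technique, not the method proposed in the paper: the Gaussian kernels, the bandwidths `sigma_pos` and `sigma_feat`, and the function names are assumptions.

```python
import math


def bilateral_step(positions, features, sigma_pos=1.0, sigma_feat=1.0):
    """One bilateral-filtering pass: each feature becomes a weighted mean of all
    features, with weights decaying in both 3D distance and feature distance."""
    new_features = []
    for p_i, f_i in zip(positions, features):
        weighted_sum = [0.0] * len(f_i)
        weight_total = 0.0
        for p_j, f_j in zip(positions, features):
            d_pos = sum((a - b) ** 2 for a, b in zip(p_i, p_j))
            d_feat = sum((a - b) ** 2 for a, b in zip(f_i, f_j))
            w = math.exp(-d_pos / (2 * sigma_pos ** 2)
                         - d_feat / (2 * sigma_feat ** 2))
            weight_total += w
            for k, value in enumerate(f_j):
                weighted_sum[k] += w * value
        # weight_total >= 1 because every point weights itself with exp(0) = 1.
        new_features.append([s / weight_total for s in weighted_sum])
    return new_features


def iterate_bilateral(positions, features, steps=5, **kwargs):
    """Apply the bilateral pass repeatedly so nearby, similar points converge."""
    for _ in range(steps):
        features = bilateral_step(positions, features, **kwargs)
    return features
```

Repeated passes pull the features of points that are close in both spaces toward a common value while leaving distant groups apart, which is the behaviour that makes grouping points into instance proposals easier.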