We present BotSIM, a data-efficient end-to-end Bot SIMulation toolkit for commercial text-based task-oriented dialog (TOD) systems. BotSIM consists of three major components: 1) a Generator that infers semantic-level dialog acts and entities from bot definitions and generates user queries via model-based paraphrasing; 2) an agenda-based dialog user Simulator (ABUS) that simulates conversations with the dialog agents; 3) a Remediator that analyzes the simulated conversations, visualizes the bot health reports and provides actionable remediation suggestions for bot troubleshooting and improvement. We demonstrate BotSIM's effectiveness in end-to-end evaluation, remediation and multi-intent dialog generation via case studies on two commercial bot platforms. BotSIM's "generation-simulation-remediation" paradigm accelerates the end-to-end bot evaluation and iteration process by: 1) reducing the effort of manually creating test cases; 2) enabling a holistic gauge of the bot in terms of NLU and end-to-end performance via extensive dialog simulation; 3) improving the bot troubleshooting process with actionable suggestions. A demo of our system can be found at https://tinyurl.com/mryu74cd and a demo video at https://youtu.be/qLi5iSoly30. We have open-sourced the toolkit at https://github.com/salesforce/botsim
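Due to COVID-19, many schools have adopted online exams conducted through video conferencing software. While convenient, it is challenging for teachers to supervise an online exam by watching many simultaneously displayed student Zoom windows. In this paper, we propose IEXAM, an intelligent online exam monitoring and analysis system that not only uses face detection to help invigilators identify students in real time, but also detects common abnormal behaviors (including a face disappearing, a rotated face, and replacement by another person during the exam) through face-recognition-based post-exam video analysis. To build such a novel system, we overcome three challenges. First, we find a lightweight way to capture exam video streams and analyze them in real time. Second, we exploit the name shown in the left corner of each student's Zoom window and propose an improved OCR (optical character recognition) technique to automatically collect ground truth for student faces whose positions change dynamically. Third, we perform several experimental comparisons and optimizations to effectively shorten the training and testing time required on the teacher's PC. Our evaluation shows that IEXAM achieves high accuracy, 90.4% for real-time face detection and 98.4% for post-exam face recognition, while maintaining acceptable runtime performance. We have made the source code of IEXAM available at https://github.com/vprlab/iexam.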
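We study the problem of \emph{classifier derandomization} in machine learning: given a stochastic binary classifier $f: X \to [0,1]$, sample a deterministic classifier $\hat{f}: X \to \{0,1\}$ that approximates the output of $f$ over any data distribution. Recent work revealed how to efficiently derandomize a stochastic classifier with strong output approximation guarantees, but at the cost of individual fairness: if $f$ treated similar inputs similarly, $\hat{f}$ did not. In this paper, we initiate a systematic study of classifier derandomization with metric fairness guarantees. We show that the prior derandomization approach is almost maximally metric-unfair, and that a simple ``random threshold'' derandomization achieves optimal fairness but weaker output approximation. We then devise a derandomization procedure that provides an appealing tradeoff between the two: if $f$ is $\alpha$-metric fair according to a metric $d$ that admits a locality-sensitive hash (LSH) family, then our derandomized $\hat{f}$ is, with high probability, $O(\alpha)$-metric fair and a close approximation of $f$. We also prove generic results that apply to all (fair and unfair) classifier derandomization procedures, including a bias-variance decomposition and reductions between various notions of metric fairness.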
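To make the ``random threshold'' idea concrete, the following is a minimal Python sketch under one natural reading of the abstract: draw a single threshold uniformly from $[0,1)$ and output 1 exactly when $f(x)$ exceeds it. The function name `random_threshold_derandomize` and the toy classifier are illustrative assumptions, not the authors' code.

```python
import random

def random_threshold_derandomize(f, rng):
    """Return a deterministic classifier f_hat built from a stochastic one.

    Assumption (illustrative): one threshold t ~ Uniform[0, 1) is drawn once
    and shared by all inputs, so f_hat(x) = 1 iff f(x) > t.
    """
    t = rng.random()  # drawn once, then frozen inside f_hat

    def f_hat(x):
        return 1 if f(x) > t else 0

    return f_hat

# Toy stochastic classifier on inputs in [0, 1]: f(x) is the probability of label 1.
f = lambda x: x

f_hat = random_threshold_derandomize(f, random.Random(0))
# f_hat is deterministic: repeated calls on the same input agree.
same_answer = f_hat(0.3) == f_hat(0.3)

# Over the random draw of the threshold, f_hat(x) = 1 with probability f(x).
rng = random.Random(1)
hits = sum(random_threshold_derandomize(f, rng)(0.3) for _ in range(10000))
empirical_rate = hits / 10000
```

Because every input shares the same threshold, two inputs receive different labels only when the threshold falls between their scores, which under this reading happens with probability $|f(x) - f(x')|$. That is why the scheme preserves fairness well, while the aggregate output of a single sampled $\hat{f}$ can drift from that of $f$.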
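We investigate pneumatic non-prehensile manipulation (i.e., blowing) as a means of efficiently moving scattered objects into a target receptacle. Due to the chaotic nature of aerodynamic forces, a blowing controller must (i) continually adapt to unexpected changes caused by its own actions, (ii) maintain fine-grained control, since the slightest misstep can have large unintended consequences (e.g., scattering objects that are already in a pile), and (iii) infer long-range plans (e.g., moving the robot to strategic blowing locations). We tackle these challenges in the context of deep reinforcement learning, introducing a multi-frequency version of the spatial action maps framework. This allows efficient learning of vision-based policies that effectively combine high-level planning and low-level closed-loop control for dynamic mobile manipulation. Experiments show that our system learns efficient behaviors for the task, demonstrating in particular that blowing achieves better downstream performance than pushing, and that our policies improve performance over baselines. Moreover, we show that our system naturally encourages emergent specialization between the different subpolicies spanning low-level fine-grained control and high-level planning. On a real mobile robot equipped with a miniature air blower, we show that our simulation-trained policies transfer well to a real environment and generalize to novel objects.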
Non-linear state-space models, also known as general hidden Markov models, are ubiquitous in statistical machine learning, being the most classical generative models for serial data and sequences in general. The particle-based, rapid incremental smoother (PaRIS) is a sequential Monte Carlo (SMC) technique allowing for efficient online approximation of expectations of additive functionals under the smoothing distribution in these models. Such expectations appear naturally in several learning contexts, such as maximum likelihood estimation (MLE) and Markov score climbing (MSC). PaRIS has linear computational complexity, limited memory requirements and comes with non-asymptotic bounds, convergence results and stability guarantees. Still, being based on self-normalised importance sampling, the PaRIS estimator is biased. Our first contribution is to design a novel additive smoothing algorithm, the Parisian particle Gibbs (PPG) sampler, which can be viewed as a PaRIS algorithm driven by conditional SMC moves, resulting in bias-reduced estimates of the targeted quantities. We substantiate the PPG algorithm with theoretical results, including new bounds on bias and variance as well as deviation inequalities. Our second contribution is to apply PPG in a learning framework, covering MLE and MSC as special examples. In this context, we establish, under standard assumptions, non-asymptotic bounds highlighting the value of bias reduction and the implicit Rao--Blackwellization of PPG. These are the first non-asymptotic results of this kind in this setting. We illustrate our theoretical results with numerical experiments supporting our claims.
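The source of the bias mentioned above can be illustrated with a plain self-normalised importance sampling estimator. The sketch below is not PaRIS or PPG; it only shows the ratio-of-sums form that makes such estimators biased for finite sample sizes. The function name `snis_estimate` and the Gaussian target and proposal are assumptions chosen for illustration.

```python
import math
import random

def snis_estimate(h, log_target, sample_proposal, log_proposal, n, rng):
    """Self-normalised importance sampling estimate of E_target[h(X)].

    The target and proposal densities may be unnormalised: the constants
    cancel in the ratio. The estimator is a ratio of two sums, which is
    what makes it biased (but consistent) for finite n.
    """
    xs = [sample_proposal(rng) for _ in range(n)]
    log_w = [log_target(x) - log_proposal(x) for x in xs]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    return sum(wi * h(x) for wi, x in zip(w, xs)) / sum(w)

# Illustrative setup: unnormalised N(1, 1) target, N(0, 2^2) proposal.
log_target = lambda x: -0.5 * (x - 1.0) ** 2
log_proposal = lambda x: -0.5 * (x / 2.0) ** 2
sample_proposal = lambda rng: rng.gauss(0.0, 2.0)

estimate = snis_estimate(lambda x: x, log_target, sample_proposal,
                         log_proposal, 20000, random.Random(0))
```

With a large sample the estimate sits close to the true mean of 1, and the bias shrinks as the sample size grows. Bias-reduction schemes such as the one described in the abstract target the regime where the number of particles is small.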
We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings where model retraining may be induced. An adversary first adds a few carefully crafted points to the training dataset such that the impact on the model's predictions is minimal. The adversary subsequently triggers a request to remove a subset of the introduced points, at which point the attack is unleashed and the model's predictions are negatively affected. In particular, we consider clean-label targeted attacks (in which the goal is to cause the model to misclassify a specific test point) on datasets including CIFAR-10, Imagenette, and Imagewoof. This attack is realized by constructing camouflage datapoints that mask the effect of a poisoned dataset.
While dense retrieval has been shown to be effective and efficient across tasks and languages, it remains difficult to create effective fully zero-shot dense retrieval systems when no relevance label is available. In this paper, we recognize the difficulty of zero-shot learning and encoding relevance. Instead, we propose to pivot through Hypothetical Document Embeddings (HyDE). Given a query, HyDE first zero-shot instructs an instruction-following language model (e.g. InstructGPT) to generate a hypothetical document. The document captures relevance patterns but is unreal and may contain false details. Then, an unsupervised contrastively learned encoder (e.g. Contriever) encodes the document into an embedding vector. This vector identifies a neighborhood in the corpus embedding space, where similar real documents are retrieved based on vector similarity. This second step grounds the generated document in the actual corpus, with the encoder's dense bottleneck filtering out the incorrect details. Our experiments show that HyDE significantly outperforms the state-of-the-art unsupervised dense retriever Contriever and shows strong performance comparable to fine-tuned retrievers, across various tasks (e.g. web search, QA, fact verification) and languages (e.g. sw, ko, ja).
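The two-step pipeline described above (generate a hypothetical document, then retrieve real documents near its embedding) can be sketched in a few lines of Python. Everything model-related here is a toy stand-in and an assumption: `fake_llm` replaces the instruction-following language model with a canned answer, and `embed` replaces a dense encoder such as Contriever with a normalised bag-of-words vector. Only the control flow is meant to mirror the abstract.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a dense encoder: L2-normalised bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {word: v / norm for word, v in counts.items()}

def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(v * b.get(word, 0.0) for word, v in a.items())

def hyde_search(query, corpus, generate, top_k=1):
    """Step 1: generate a hypothetical document for the query.
    Step 2: embed it and return the real documents closest to it."""
    hypothetical = generate(query)
    q_vec = embed(hypothetical)
    ranked = sorted(corpus, key=lambda doc: cosine(q_vec, embed(doc)),
                    reverse=True)
    return ranked[:top_k]

corpus = [
    "the eiffel tower is located in paris france",
    "python is a programming language",
    "the great wall is in china",
]

# Hypothetical generator: a canned answer that contains a false detail
# (the height), mimicking an "unreal" generated document.
fake_llm = lambda q: "the eiffel tower stands in paris and is 500 metres tall"

result = hyde_search("where is the eiffel tower", corpus, fake_llm)
```

Note that the query itself is never embedded: retrieval is driven entirely by the generated document, and the false height detail has no counterpart in the corpus, so it does not contribute to the match.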
Deep neural networks (DNNs) are often used for text classification tasks as they usually achieve high levels of accuracy. However, DNNs can be computationally intensive, with billions of parameters and a need for large amounts of labeled data, which can make them expensive to use, to optimize and to transfer to out-of-distribution (OOD) cases in practice. In this paper, we propose a non-parametric alternative to DNNs that is easy, lightweight and universal in text classification: a combination of a simple compressor like gzip with a $k$-nearest-neighbor classifier. Without any training, pre-training or fine-tuning, our method achieves results that are competitive with non-pretrained deep learning methods on six in-distribution datasets. It even outperforms BERT on all five OOD datasets, including four low-resource languages. Our method also performs particularly well in few-shot settings where labeled data are too scarce for DNNs to achieve a satisfying accuracy.
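A compressor-plus-$k$NN classifier of this kind can be sketched with the Python standard library alone. The version below assumes the normalised compression distance (NCD) as the distance measure and a simple majority vote; the function names and the four-sentence training set are illustrative assumptions, not the authors' implementation.

```python
import gzip
from collections import Counter

def clen(text):
    """Length in bytes of the gzip-compressed text."""
    return len(gzip.compress(text.encode("utf-8")))

def ncd(a, b):
    """Normalised compression distance: small when a and b share structure."""
    ca, cb = clen(a), clen(b)
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def knn_predict(query, train, k=3):
    """Majority label among the k training texts closest to the query."""
    neighbors = sorted(train, key=lambda item: ncd(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Tiny illustrative training set of (text, label) pairs.
train = [
    ("the team won the football match after scoring two late goals", "sports"),
    ("the striker scored a goal and the team won the league match", "sports"),
    ("the central bank raised interest rates to curb rising inflation", "finance"),
    ("stock markets fell as the bank raised interest rates again", "finance"),
]

label = knn_predict("the team scored a late goal to win the match", train, k=3)
```

There are no parameters to fit: the only work is compressing each query together with each training text. That is what makes the approach training-free, at the price of a cost per prediction that grows with the size of the training set.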
An enhanced geothermal system is essential to provide sustainable and long-term geothermal energy supplies and reduce carbon emissions. An optimal well-control scheme for effective heat extraction and improved heat sweep efficiency plays a significant role in geothermal development. However, the performance of most existing optimization algorithms deteriorates as the dimension increases. To solve this issue, a novel surrogate-assisted level-based learning evolutionary search algorithm (SLLES) is proposed for heat extraction optimization of enhanced geothermal systems. SLLES consists of a classifier-assisted level-based learning pre-screen part and a local evolutionary search part. The cooperation of the two parts balances exploration and exploitation during the optimization process. Iterative sampling from the design space is shown to significantly improve the robustness and effectiveness of the algorithm. To the best of our knowledge, the proposed algorithm is a state-of-the-art simulation-involved optimization framework. Comparative experiments have been conducted on benchmark functions, a two-dimensional fractured reservoir and a three-dimensional enhanced geothermal system. The proposed algorithm outperforms five other state-of-the-art surrogate-assisted algorithms on all selected benchmark functions. The results on the two heat extraction cases also demonstrate that SLLES achieves superior optimization performance compared with a traditional evolutionary algorithm and other surrogate-assisted algorithms. This work lays a solid basis for efficient geothermal extraction from enhanced geothermal systems and sheds light on model management strategies for data-driven optimization in the area of energy exploitation.
The application of natural language processing (NLP) to cancer pathology reports has focused on detecting cancer cases, largely ignoring precancerous cases. Improving the characterization of precancerous adenomas assists in developing diagnostic tests for early cancer detection and prevention, especially for colorectal cancer (CRC). Here we developed transformer-based deep neural network NLP models to perform CRC phenotyping, with the goal of extracting precancerous lesion attributes and distinguishing cancer and precancerous cases. We achieved a macro-F1 score of 0.914 for classifying patients into negative, non-advanced adenoma, advanced adenoma and CRC. We further improved the performance to 0.923 using an ensemble of classifiers for cancer status classification and lesion size named entity recognition (NER). Our results demonstrate the potential of using NLP to leverage real-world health record data to facilitate the development of diagnostic tests for early cancer prevention.
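For readers unfamiliar with the metric, macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes such as CRC count as much as common ones. The following minimal Python sketch computes it from scratch; the label names and the four-patient example are made up for illustration and are not data from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Made-up example: one negative patient is misclassified as advanced adenoma.
y_true = ["negative", "negative", "advanced", "crc"]
y_pred = ["negative", "advanced", "advanced", "crc"]
score = macro_f1(y_true, y_pred)
```

In this example the per-class F1 scores are 2/3 (negative), 2/3 (advanced) and 1 (crc), so the macro average is 7/9, roughly 0.778.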