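Logic obfuscation is a pivotal defense against multiple hardware threats to integrated circuits (ICs), including reverse engineering (RE) and intellectual property (IP) theft. Its effectiveness is challenged by the recently introduced Boolean satisfiability (SAT) attack and its variants. A plethora of countermeasures has been proposed to thwart the SAT attack. Irrespective of the defense implemented against SAT attacks, large power, performance, and area overheads are unavoidable. In contrast, we propose a cognitive solution: SATConda, a neural-network-based unSAT clause translator that incurs minimal area and power overhead while preserving the original functionality with impenetrable security. SATConda is incubated with an unSAT clause generator that either translates the existing conjunctive normal form (CNF) through minimal perturbations, such as the inclusion of a pair of inverters or buffers, or adds a new lightweight unSAT block, depending on the provided CNF. For efficient unSAT clause generation, SATConda is equipped with a multi-layer neural network that first learns the dependencies of the features (literals and clauses), followed by a long short-term memory (LSTM) network that validates and backpropagates the SAT-hardness for better learning and translation. Our proposed SATConda is evaluated on the ISCAS85 and ISCAS89 benchmarks and is seen to defend successfully against multiple state-of-the-art SAT attacks devised for hardware RE. In addition, we evaluate the empirical performance of SATConda against the MiniSAT, Lingeling, and Glucose SAT solvers, which form the basis of numerous existing deobfuscation SAT attacks.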
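To make the "pair of inverters" perturbation concrete, the sketch below brute-forces a tiny CNF. It is not SATConda and involves no neural network; the variable numbering, the clause list `INVERTER_PAIR_CNF`, and the helper names are assumptions chosen for illustration. It only shows that the standard Tseitin clauses for two chained inverters add clauses and variables to a CNF while leaving the output equal to the input.

```python
from itertools import product

# Hypothetical variable numbering: x = 1, y = 2, z = 3.
# Tseitin clauses for y = NOT x and z = NOT y (two chained inverters).
INVERTER_PAIR_CNF = [[1, 2], [-1, -2], [2, 3], [-2, -3]]


def satisfies(assignment, cnf):
    """Return True if every clause has at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def models(cnf, n_vars):
    """Enumerate all satisfying assignments by brute force."""
    for bits in product([False, True], repeat=n_vars):
        assignment = dict(enumerate(bits, start=1))
        if satisfies(assignment, cnf):
            yield assignment


if __name__ == "__main__":
    for m in models(INVERTER_PAIR_CNF, 3):
        # In every model the output z equals the input x.
        print(m, "z == x:", m[3] == m[1])
```

Brute-force enumeration is only practical for a handful of variables; it is used here because it makes the functional equivalence easy to check by eye.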
Household environments are visually diverse. Embodied agents performing Vision-and-Language Navigation (VLN) in the wild must be able to handle this diversity while also following arbitrary language instructions. Recently, vision-language models like CLIP have shown great performance on the task of zero-shot object recognition. In this work, we ask whether these models are also capable of zero-shot language grounding. In particular, we utilize CLIP to tackle the novel problem of zero-shot VLN using natural language referring expressions that describe target objects, in contrast to past work that used simple language templates describing object classes. We examine CLIP's capability in making sequential navigational decisions without any dataset-specific finetuning, and study how it influences the path that an agent takes. Our results on the coarse-grained instruction-following task of REVERIE demonstrate the navigational capability of CLIP, surpassing the supervised baseline in terms of both success rate (SR) and success weighted by path length (SPL). More importantly, we quantitatively show that our CLIP-based zero-shot approach generalizes better than state-of-the-art fully supervised approaches, showing more consistent performance across environments when evaluated via Relative Change in Success (RCS).
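For readers unfamiliar with the two metrics named above, the following is a minimal sketch of SR and SPL under the commonly used definition, in which each successful episode contributes the ratio of the shortest-path length to the larger of the agent's path length and the shortest-path length. The function names and the episode data are illustrative assumptions, not taken from the paper, and RCS is not covered because the abstract does not define it.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the target."""
    return sum(successes) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by path length, averaged over all episodes."""
    total = 0.0
    for ok, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if not ok:
            continue  # failed episodes contribute zero
        denom = max(taken, shortest)
        # Guard against degenerate zero-length episodes.
        total += shortest / denom if denom > 0 else 1.0
    return total / len(successes)


if __name__ == "__main__":
    # Made-up episodes: (success, shortest path, path actually taken).
    successes = [True, False, True]
    shortest = [10.0, 5.0, 8.0]
    taken = [20.0, 5.0, 8.0]
    print("SR :", success_rate(successes))
    print("SPL:", spl(successes, shortest, taken))
```

Because SPL penalizes detours, an agent can have a high SR and still a low SPL if it wanders before reaching the target.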
Object-goal navigation (Object-nav) entails searching for, recognizing, and navigating to a target object. Object-nav has been extensively studied by the Embodied-AI community, but most solutions are restricted to static objects (e.g., television, fridge). We propose a modular framework for Object-nav that efficiently searches indoor environments not just for static objects but also for movable objects (e.g., fruits, glasses, phones) that frequently change position due to human intervention. Our contextual-bandit agent efficiently explores the environment by showing optimism in the face of uncertainty, and learns a model of the likelihood of spotting different objects from each navigable location. The likelihoods are used as rewards in a weighted minimum latency solver to deduce a trajectory for the robot. We evaluate our algorithms in two simulated environments and a real-world setting to demonstrate high sample efficiency and reliability.
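The "optimism in the face of uncertainty" principle can be illustrated with a plain UCB1 rule. This is a deliberately simplified, non-contextual stand-in for the contextual-bandit agent described above: the class name `UCBLocationModel`, the exploration constant `c`, and the single-object setting are all assumptions made for the sketch. Each navigable location is an arm, and the reward is whether the object was spotted there.

```python
import math


class UCBLocationModel:
    """UCB1 over locations; the reward is 1 if the object was spotted."""

    def __init__(self, n_locations: int, c: float = 1.0):
        self.counts = [0] * n_locations  # visits per location
        self.hits = [0] * n_locations    # sightings per location
        self.c = c                       # exploration strength (assumed)

    def estimate(self, loc: int) -> float:
        """Empirical likelihood of spotting the object at loc."""
        if self.counts[loc] == 0:
            return 0.0
        return self.hits[loc] / self.counts[loc]

    def score(self, loc: int, t: int) -> float:
        """Optimistic score: estimate plus an uncertainty bonus."""
        if self.counts[loc] == 0:
            return math.inf  # unvisited locations are tried first
        bonus = self.c * math.sqrt(math.log(t) / self.counts[loc])
        return self.estimate(loc) + bonus

    def select(self) -> int:
        """Pick the location with the highest optimistic score."""
        t = sum(self.counts) + 1
        return max(range(len(self.counts)), key=lambda i: self.score(i, t))

    def update(self, loc: int, spotted: bool) -> None:
        self.counts[loc] += 1
        self.hits[loc] += int(spotted)


if __name__ == "__main__":
    model = UCBLocationModel(n_locations=2)
    model.update(model.select(), spotted=True)   # visits location 0
    model.update(model.select(), spotted=False)  # visits location 1
    print("next location:", model.select())
    print("estimates:", [model.estimate(i) for i in range(2)])
```

In the framework described above, the learned likelihoods would then serve as rewards for the trajectory solver; that step is outside the scope of this sketch.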
Workloads in modern cloud data centers are becoming increasingly complex. The number of workloads running in cloud data centers has been growing exponentially for the last few years, and cloud service providers (CSPs) have been supporting on-demand services in real time. Recognizing the growing complexity of cloud environments and cloud workloads, hardware vendors such as Intel and AMD are increasingly introducing cloud-specific workload acceleration features in their CPU platforms. These features typically target popular, commonly used cloud workloads. Nonetheless, uncommon, customer-specific workloads (unknown workloads) whose characteristics differ from those of common workloads (known workloads) may not realize the potential of the underlying platform. To address this problem, we develop a machine-learning-based technique to characterize, profile, and predict workloads running in the cloud environment. Experimental evaluation of our technique demonstrates good prediction performance. We also develop techniques to analyze the performance of the model in a standalone manner.
As multimodal learning finds applications in a wide variety of high-stakes societal tasks, investigating its robustness becomes important. Existing work has focused on understanding the robustness of vision-and-language models to imperceptible variations on benchmark tasks. In this work, we investigate the robustness of multimodal classifiers to cross-modal dilutions, a plausible variation. We develop a model that, given a multimodal (image + text) input, generates additional dilution text that (a) maintains relevance and topical coherence with the image and existing text, and (b) when added to the original text, leads to misclassification of the multimodal input. Via experiments on Crisis Humanitarianism and Sentiment Detection tasks, we find that the performance of task-specific fusion-based multimodal classifiers drops by 23.3% and 22.5%, respectively, in the presence of dilutions generated by our model. Metric-based comparisons with several baselines and human evaluations indicate that our dilutions show higher relevance and topical coherence, while simultaneously being more effective at demonstrating the brittleness of the multimodal classifiers. Our work aims to highlight and encourage further research on the robustness of deep multimodal models to realistic variations, especially in human-facing societal applications. The code and other resources are available at https://claws-lab.github.io/multimodal-robustness/.
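The availability of well-curated datasets has driven the success of machine learning (ML) models. Despite increased access to earth observation data for agriculture, curated, labelled datasets remain scarce, which limits their potential for training ML models for remote sensing in agriculture. To this end, we introduce a first-of-its-kind dataset, SICKLE, comprising time-series images at different spatial resolutions from 3 different satellites, annotated with multiple key cropping parameters for paddy cultivation in the Cauvery Delta region of Tamil Nadu, India. The dataset consists of 2,398 season-wise samples from 388 unique plots distributed across 4 districts of the Delta. It covers multi-spectral, thermal, and microwave data for the period from January 2018 to March 2021. The paddy samples are annotated with 4 key cropping parameters: sowing date, transplanting date, harvesting date, and crop yield. This is one of the first studies to consider the growing season (using sowing and harvesting dates) as part of a dataset. We also propose a yield prediction strategy that uses time-series data generated from the observed growing season together with the standard seasonal information obtained from Tamil Nadu Agricultural University for the region. The consequent performance improvement highlights the impact of ML techniques that leverage domain knowledge consistent with the standard practices followed by farmers in a specific region. We benchmark the dataset on 3 separate tasks, namely crop type, phenology date (sowing, transplanting, harvesting), and yield prediction, and develop an end-to-end framework for predicting key crop parameters in a real-world setting.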
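Effective scaling and a flexible task interface enable large language models to excel at many tasks. PaLI generates text based on visual and textual inputs, and with this interface performs many vision, language, and multimodal tasks in many languages. To train PaLI, we make use of large encoder-decoder language models and Vision Transformers (ViTs). This allows us to capitalize on their existing capabilities and leverage the substantial cost of training them. We find that joint scaling of the vision and language components is important. Since existing Transformers for language are much larger than their vision counterparts, we train the largest ViT to date (ViT-e) to quantify the benefits of even larger-capacity vision models. To train PaLI, we create a large multilingual mix of training tasks based on a new image-text training set containing 10B images and texts in over 100 languages. PaLI achieves state-of-the-art results in multiple vision and language tasks (such as captioning, visual question answering, and scene-text understanding) while retaining a simple, modular, and scalable design.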
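This paper looks at semi-supervised learning (SSL) for image-based text recognition. One of the most popular SSL approaches is pseudo-labeling (PL). PL approaches assign labels to unlabeled data before re-training the model on a combination of labeled and pseudo-labeled data. However, PL methods are severely degraded by noise and are prone to overfitting to noisy labels, because poorly calibrated models produce erroneous high-confidence pseudo-labels, which renders threshold-based selection ineffective. Moreover, the combinatorial complexity of the hypothesis space and the error accumulation caused by multiple incorrect autoregressive steps make pseudo-labeling challenging for sequence models. To this end, we propose a pseudo-label generation and uncertainty-based data selection framework for semi-supervised text recognition. We first use beam-search inference to yield highly probable hypotheses and assign pseudo-labels to the unlabeled examples. We then adopt an ensemble of models, sampled by applying dropout, to obtain a robust estimate of the uncertainty associated with each prediction, considering both the character-level and word-level predictive distributions to select good-quality pseudo-labels. Extensive experiments on several benchmark handwriting and scene-text datasets show that our method outperforms the baseline approaches and the previous state-of-the-art semi-supervised text recognition methods.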
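The selection step can be sketched as follows, assuming each unlabeled image already has a beam-search hypothesis and a list of transcriptions produced by dropout-sampled forward passes. The entropy-of-votes uncertainty measures, the thresholds, and all function names here are illustrative assumptions; the paper's exact uncertainty estimate may differ.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (nats) of the empirical distribution of items."""
    n = len(items)
    return -sum((c / n) * math.log(c / n) for c in Counter(items).values())


def word_uncertainty(samples):
    """Disagreement among whole-word transcriptions."""
    return entropy(samples)


def char_uncertainty(samples, pad="\0"):
    """Mean per-position disagreement among characters."""
    width = max(len(s) for s in samples)
    if width == 0:
        return 0.0
    padded = [s.ljust(width, pad) for s in samples]
    per_pos = [entropy([s[i] for s in padded]) for i in range(width)]
    return sum(per_pos) / width


def select_pseudo_labels(pool, max_word_u, max_char_u):
    """Keep hypotheses whose sampled predictions agree closely.

    pool holds (image_id, beam_hypothesis, dropout_samples) tuples.
    """
    selected = []
    for image_id, hypothesis, samples in pool:
        if (word_uncertainty(samples) <= max_word_u
                and char_uncertainty(samples) <= max_char_u):
            selected.append((image_id, hypothesis))
    return selected


if __name__ == "__main__":
    # Made-up pool: the second image gets inconsistent predictions.
    pool = [
        ("img_a", "hello", ["hello"] * 5),
        ("img_b", "world", ["world", "w0rld", "vorld", "world", "worid"]),
    ]
    print(select_pseudo_labels(pool, max_word_u=0.1, max_char_u=0.1))
```

Unlike a confidence threshold on a single forward pass, this criterion rejects a hypothesis when the sampled models disagree, even if each individual prediction is confident.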
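We present a multimodal (vision-and-language) benchmark for cooperative and heterogeneous multi-agent learning. We introduce a benchmark multimodal dataset with tasks involving collaboration between multiple simulated heterogeneous robots in a rich multi-room environment. We provide an integrated learning framework, multimodal implementations of state-of-the-art multi-agent reinforcement learning techniques, and a consistent evaluation protocol. Our experiments investigate the impact of different modalities on multi-agent learning performance. We also introduce a simple message-passing method between agents. The results suggest that multimodality introduces unique challenges for cooperative multi-agent learning, and that there is significant room for advancing multi-agent reinforcement learning methods in such settings.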
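People have recently begun communicating their thoughts and viewpoints through user-generated multimedia material on social networking websites. This material can be images, text, video, or audio, and the practice has become more frequent in recent years. Twitter is one of the most widely used social media sites, and it is also one of the best places to get a sense of how people feel about events linked to the monkeypox disease, because tweets are short and frequently updated. The fundamental objective of this study is to gain a deeper understanding of the diverse reactions people have to the presence of this disease. The study focuses on finding out what individuals think about monkeypox, and presents a hybrid technique based on CNN and LSTM. We consider all three possible polarities of a user's tweet: positive, negative, and neutral. An architecture built with CNN and LSTM is used to determine the accuracy of the prediction model. The accuracy of the recommended model is 94% on the monkeypox tweet dataset. Other performance metrics, such as precision, recall, and F1-score, are also used to test our model in the most time- and resource-effective manner. The findings are then compared with more traditional machine learning approaches. The findings of this study contribute to an increased awareness of monkeypox infection in the general population.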