Open World Object Detection (OWOD) is a new and challenging computer vision task that bridges the gap between classic object detection (OD) benchmarks and object detection in the real world. In addition to detecting and classifying seen/labeled objects, OWOD algorithms are expected to detect novel/unknown objects, which can then be classified and incrementally learned. In standard OD, object proposals that do not overlap with a labeled object are automatically classified as background. Simply applying OD methods to OWOD therefore fails, because unknown objects would be predicted as background. The challenge of detecting unknown objects stems from the lack of supervision for distinguishing unknown objects from background object proposals. Previous OWOD methods have attempted to overcome this issue by generating supervision using pseudo-labeling; however, unknown object detection performance has remained low. Probabilistic/generative models may provide a solution to this challenge. Herein, we introduce a novel probabilistic framework for objectness estimation, where we alternate between probability distribution estimation and objectness likelihood maximization of known objects in the embedded feature space, ultimately allowing us to estimate the objectness probability of different proposals. The resulting Probabilistic Objectness transformer-based open-world detector, PROB, integrates our framework into traditional object detection models, adapting them for the open-world setting. Comprehensive experiments on OWOD benchmarks show that PROB outperforms all existing OWOD methods in both unknown object detection ($\sim 2\times$ unknown recall) and known object detection ($\sim 10\%$ mAP). Our code will be made available upon publication at https://github.com/orrzohar/PROB.
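The distribution-estimation half of the objectness framework described above can be illustrated with a small sketch. It assumes, purely for illustration, that known-object embeddings are modeled by a single diagonal Gaussian and that objectness decays with the variance-scaled squared distance from its mean; the abstract does not specify the distribution family, and the function names here are hypothetical. The alternating likelihood-maximization step, which would update the embeddings themselves, is not shown.

```python
import math
import random

def fit_diagonal_gaussian(embeddings, eps=1e-6):
    """Estimate a per-dimension mean and variance from known-object embeddings."""
    n = len(embeddings)
    dim = len(embeddings[0])
    mean = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    # eps keeps every variance strictly positive (assumed regularizer).
    var = [sum((e[d] - mean[d]) ** 2 for e in embeddings) / n + eps
           for d in range(dim)]
    return mean, var

def objectness(query, mean, var):
    """Map the variance-scaled squared distance to a score in (0, 1]."""
    d2 = sum((q - m) ** 2 / v for q, m, v in zip(query, mean, var))
    return math.exp(-0.5 * d2)

# Toy embeddings standing in for features of known objects (assumed data).
random.seed(0)
known = [[random.gauss(2.0, 0.5) for _ in range(4)] for _ in range(500)]
mean, var = fit_diagonal_gaussian(known)

score_near = objectness(mean, mean, var)                       # at the mean
score_far = objectness([m + 5.0 for m in mean], mean, var)     # far away
```

A proposal whose embedding sits at the estimated mean receives the maximum score, while one far from the known-object distribution receives a score close to zero and would be treated as background-like under this sketch.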
Foundation Models (FMs) are models trained on large corpora of data that, at very large scale, can generalize to new tasks without any task-specific finetuning. As these models continue to grow in size, innovations continue to push the boundaries of what these models can do on language and image tasks. This paper aims to understand an underexplored area of FMs: classical data tasks like cleaning and integration. As a proof-of-concept, we cast five data cleaning and integration tasks as prompting tasks and evaluate the performance of FMs on these tasks. We find that large FMs generalize and achieve SoTA performance on data cleaning and integration tasks, even though they are not trained for these data tasks. We identify specific research challenges and opportunities that these models present, including challenges with private and domain-specific data, and opportunities to make data management systems more accessible to non-experts. We make our code and experiments publicly available at: https://github.com/HazyResearch/fm_data_tasks.
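Casting a data task as a prompting task, as described above, amounts to serializing rows into text and asking a question about them. The sketch below assumes a record-matching task as the example, which the abstract does not name; the template wording and the function names are hypothetical, and no model is called.

```python
def serialize_record(record):
    """Flatten a row into 'attribute: value' text, skipping missing values."""
    return ". ".join(
        f"{key}: {value}" for key, value in record.items() if value is not None
    )

def _question(left, right):
    """Hypothetical question template for one pair of records."""
    return (
        f"Record A is {serialize_record(left)}. "
        f"Record B is {serialize_record(right)}. "
        "Are Record A and Record B the same?"
    )

def build_matching_prompt(left, right, demonstrations=()):
    """Build a few-shot prompt: labeled demonstrations, then the open query."""
    blocks = [
        _question(demo_left, demo_right) + (" Yes" if is_match else " No")
        for demo_left, demo_right, is_match in demonstrations
    ]
    blocks.append(_question(left, right))
    return "\n\n".join(blocks)

# Toy rows (assumed data) with one labeled demonstration.
demos = [(
    {"name": "iPhone 12", "brand": "Apple"},
    {"name": "Apple iPhone 12", "brand": None},
    True,
)]
prompt = build_matching_prompt(
    {"name": "Galaxy S21", "brand": "Samsung"},
    {"name": "Pixel 5", "brand": "Google"},
    demos,
)
```

The resulting string would be sent to a foundation model, whose completion ("Yes" or "No") is read back as the prediction for the final pair.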
Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision making, where many well-studied tasks like behavior cloning, offline RL, inverse dynamics, and waypoint conditioning correspond to different sequence maskings over a sequence of states, actions, and returns. We introduce the FlexiBiT framework, which provides a unified way to specify models which can be trained on many different sequential decision making tasks. We show that a single FlexiBiT model is simultaneously capable of carrying out many tasks with performance similar to or better than specialized models. Additionally, we show that performance can be further improved by fine-tuning our general model on specific tasks of interest.
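The correspondence between tasks and sequence maskings described above can be sketched with plain lists. The mask layouts below are one reading of the task names, not the definitions used by FlexiBiT, and all function names are hypothetical. A 1 marks a token given to the model as input, a 0 marks a masked token, and each function also returns the token to be predicted.

```python
def empty_mask(horizon):
    """All tokens hidden: 1 marks a visible input token, 0 a masked one."""
    return {name: [0] * horizon for name in ("states", "actions", "returns")}

def behavior_cloning(horizon, t):
    """See states up to t and earlier actions; predict the action at t."""
    mask = empty_mask(horizon)
    mask["states"][: t + 1] = [1] * (t + 1)
    mask["actions"][:t] = [1] * t
    return mask, ("actions", t)

def offline_rl(horizon, t):
    """Behavior cloning with returns also visible (return conditioning)."""
    mask, target = behavior_cloning(horizon, t)
    mask["returns"][: t + 1] = [1] * (t + 1)
    return mask, target

def inverse_dynamics(horizon, t):
    """See two consecutive states; predict the action between them."""
    mask = empty_mask(horizon)
    mask["states"][t] = mask["states"][t + 1] = 1
    return mask, ("actions", t)

def waypoint_conditioning(horizon, t, waypoint):
    """See the current state and a future waypoint; predict the action at t."""
    mask = empty_mask(horizon)
    mask["states"][t] = mask["states"][waypoint] = 1
    return mask, ("actions", t)

bc_mask, bc_target = behavior_cloning(5, 2)
rl_mask, rl_target = offline_rl(5, 2)
id_mask, id_target = inverse_dynamics(5, 1)
wp_mask, wp_target = waypoint_conditioning(5, 0, 4)
```

Under this view a single model trained on randomly sampled masks sees all of these patterns, and a specific task is selected at inference time by choosing the matching mask.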
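The concept of DNA storage was first suggested in 1959 by Richard Feynman, who shared his vision of nanotechnology in the talk "There's Plenty of Room at the Bottom". Later, towards the end of the 20th century, interest in storage solutions based on DNA molecules increased as a result of the Human Genome Project, which in turn led to significant progress in sequencing and assembly methods. DNA storage enjoys major advantages over the well-established magnetic and optical storage solutions. As opposed to magnetic solutions, DNA storage does not require an electrical supply to maintain data integrity, and it is superior to other storage solutions in both density and durability. Given the trend of decreasing costs for DNA synthesis and sequencing, it is now acknowledged that within the next 10-15 years DNA storage may become a highly competitive archiving technology, and probably later the main such technology. That said, current implementations of DNA-based storage systems are very limited and are not fully optimized to address the unique pattern of errors that characterizes the synthesis and sequencing processes. In this work, we propose a robust, efficient, and scalable solution for implementing DNA-based storage systems. Our method deploys deep neural networks (DNNs) that reconstruct a sequence of letters based on an imperfect cluster of copies generated by the synthesis and sequencing processes. A tailor-made error-correcting code (ECC) is used to combat the patterns of errors that occur during this process. Since our reconstruction method is adapted to imperfect clusters, it overcomes the time bottleneck of clustering noisy DNA copies by allowing the use of a rapid and scalable pseudo-clustering instead. Our architecture combines convolutions and transformer blocks and is trained using synthetic data modeled after real data statistics.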
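Background: The human mind is multimodal. Yet most behavioral studies rely on century-old measures such as task accuracy and latency. To better understand human behavior and brain function, we should introduce other measures and analyze behavior from various aspects. However, designing and implementing experiments that record multiple measures is technically complex and costly. To address this issue, a platform that allows multiple measures of human behavior to be synchronized is needed. Method: This paper introduces an open-source platform named OpenSync, which can be used to synchronize multiple measures in neuroscience experiments. The platform helps to automatically integrate, synchronize, and record physiological measures (e.g., electroencephalogram (EEG), galvanic skin response (GSR), eye tracking, body motion, etc.), user input responses (e.g., from a mouse, keyboard, joystick, etc.), and task-related information (stimulus markers). In this paper, we explain the structure and details of OpenSync and provide two case studies, in PsychoPy and Unity. Comparison with existing tools: Unlike proprietary systems (e.g., iMotions), OpenSync is free and can be used inside any open-source experiment design software (e.g., PsychoPy, OpenSesame, Unity, etc.; https://pypi.org/project/OpenSync/ and https://github.com/moeinrazavi/opensync_unity). Results: Our experimental results show that the OpenSync platform is able to synchronize multiple measures with microsecond resolution.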
Research on automated essay scoring has become increasingly important because it serves as a method for evaluating students' written responses at scale. Scalable methods for scoring written responses are needed as students migrate to online learning environments, resulting in the need to evaluate large numbers of written-response assessments. The purpose of this study is to describe and evaluate three active learning methods that can be used to minimize the number of essays that must be scored by human raters while still providing the data needed to train a modern automated essay scoring system. The three active learning methods are the uncertainty-based, the topological-based, and the hybrid methods. These three methods were used to select essays included as part of the Automated Student Assessment Prize competition, which were then classified using a scoring model that was trained with the bidirectional encoder representations from transformers (BERT) language model. All three active learning methods produced strong results, with the topological-based method producing the most efficient classification. Growth rate accuracy was also evaluated. The active learning methods produced different levels of efficiency under different sample size allocations but, overall, all three methods were highly efficient and produced classifications that were similar to one another.
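The uncertainty-based selection mentioned above can be sketched as follows. The abstract does not say which uncertainty measure the study uses; this example assumes the entropy of the scoring model's predicted score distribution, and the function names and essay identifiers are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a predicted score distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(predictions, budget):
    """Return the ids of the `budget` essays the model is least sure about.

    `predictions` maps an essay id to the model's probabilities over
    score categories; the selected essays would go to human raters.
    """
    ranked = sorted(predictions, key=lambda eid: entropy(predictions[eid]),
                    reverse=True)
    return ranked[:budget]

# Toy model outputs over three score categories (assumed data).
predictions = {
    "essay_1": [0.90, 0.05, 0.05],   # confident prediction
    "essay_2": [0.34, 0.33, 0.33],   # nearly uniform, most uncertain
    "essay_3": [0.60, 0.30, 0.10],
}
to_label = select_most_uncertain(predictions, budget=2)
```

In an active learning loop, the selected essays are scored by human raters, added to the training set, and the model is retrained before the next round of selection.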
This paper presents a novel framework for planning in unknown and occluded urban spaces. We specifically focus on turns and intersections where occlusions significantly impact navigability. Our approach uses an inpainting model to fill in a sparse, occluded, semantic lidar point cloud and plans dynamically feasible paths for a vehicle to traverse through the open and inpainted spaces. We demonstrate our approach using a car's lidar data with real-time occlusions, and show that by inpainting occluded areas, we can plan longer paths with more turn options than without inpainting; in addition, our approach follows paths derived from a planner with no occlusions (called the ground truth) more closely than other state-of-the-art approaches.
Feature acquisition algorithms address the problem of acquiring informative features while balancing the costs of acquisition to improve the learning performance of ML models. Previous approaches have focused on calculating the expected utility values of features to determine the acquisition sequences. Other approaches formulated the problem as a Markov Decision Process (MDP) and applied reinforcement-learning-based algorithms. In comparison to previous approaches, we focus on 1) formulating the feature acquisition problem as an MDP and applying Monte Carlo Tree Search, 2) calculating the intermediary rewards for each acquisition step based on model improvements and acquisition costs, and 3) simultaneously optimizing model improvement and acquisition costs with multi-objective Monte Carlo Tree Search. With Proximal Policy Optimization and Deep Q-Network algorithms as benchmarks, we show the effectiveness of our proposed approach in an experimental study.
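The intermediary reward described in points 2) and 3) above can be illustrated with a small sketch. The abstract does not give the reward formula; the weighted difference, the two-component reward vector, and the Pareto comparison below are assumptions, and all names are hypothetical. The tree search itself is not shown.

```python
def step_reward(score_before, score_after, cost, cost_weight=0.1):
    """Scalar reward for one acquisition: model improvement minus weighted cost."""
    return (score_after - score_before) - cost_weight * cost

def step_reward_vector(score_before, score_after, cost):
    """Two-objective reward: (improvement, negative cost), both to be maximized."""
    return (score_after - score_before, -cost)

def dominates(a, b):
    """True if reward vector a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

# Two candidate features (assumed numbers): accuracy before and after, and cost.
cheap = step_reward_vector(0.70, 0.75, cost=0.2)
pricey = step_reward_vector(0.70, 0.73, cost=0.5)
scalar = step_reward(0.70, 0.75, cost=0.2, cost_weight=0.1)
cheap_is_better = dominates(cheap, pricey)
```

A multi-objective search would keep candidates that no other candidate dominates instead of collapsing both objectives into one number, which avoids having to fix `cost_weight` in advance.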
