To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers $13$ common text generation tasks and their corresponding $83$ datasets and further incorporates $45$ PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement $4$ efficient training strategies and provide $4$ generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be fulfilled in a unified way. Despite the rich functionality, it is easy to use our library, either through the friendly Python API or command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at the link: https://github.com/RUCAIBox/TextBox.
translated by 谷歌翻译
The security of artificial intelligence (AI) is an important research area for building safe, reliable, and trustworthy AI systems. To accelerate research on AI security, the Artificial Intelligence Security Competition (AISC) was organized by the Zhongguancun Laboratory, China Industrial Control Systems Cyber Emergency Response Team, Institute for Artificial Intelligence, Tsinghua University, and RealAI as part of the Zhongguancun International Frontier Technology Innovation Competition (https://www.zgc-aisc.com/en). The competition consists of three tracks: the Deepfake Security Competition, the Autonomous Driving Security Competition, and the Face Recognition Security Competition. This report introduces the competition rules of these three tracks and the solutions of the top-ranking teams in each track.
We propose PiggyBack, a Visual Question Answering platform that allows users to easily apply state-of-the-art visual-language pretrained models. PiggyBack supports the full stack of visual question answering tasks, specifically data processing, model fine-tuning, and result visualisation. We integrate visual-language pretrained models from HuggingFace, an open-source API platform for deep learning technologies; however, these models cannot be run without programming skills or an understanding of deep learning. Hence, PiggyBack provides an easy-to-use browser-based user interface with several deep learning visual-language pretrained models for general users and domain experts. PiggyBack offers the following benefits: free availability under the MIT License, portability (it is web-based and thus runs on almost any platform), a comprehensive data creation and processing technique, and ease of use with deep learning-based visual-language pretrained models. The demo video is available on YouTube at https://youtu.be/iz44RZ1lF4s.
Myocardial pathology segmentation (MyoPS) can be a prerequisite for the accurate diagnosis and treatment planning of myocardial infarction. However, achieving this segmentation is challenging, mainly due to the inadequate and indistinct information from an image. In this work, we develop an end-to-end deep neural network, referred to as MyoPS-Net, to flexibly combine five-sequence cardiac magnetic resonance (CMR) images for MyoPS. To extract precise and adequate information, we design an effective yet flexible architecture to extract and fuse cross-modal features. This architecture can handle different numbers of CMR images and complex combinations of modalities, with output branches targeting specific pathologies. To impose anatomical knowledge on the segmentation results, we first propose a module to regularize myocardium consistency and localize the pathologies, and then introduce an inclusiveness loss to utilize relations between myocardial scars and edema. We evaluated the proposed MyoPS-Net on two datasets, i.e., a private one consisting of 50 paired multi-sequence CMR images and a public one from the MICCAI2020 MyoPS Challenge. Experimental results showed that MyoPS-Net could achieve state-of-the-art performance in various scenarios. Note that in clinical practice, subjects may not have the full set of sequences; for example, LGE CMR or mapping CMR scans may be missing. We therefore conducted extensive experiments to investigate the performance of the proposed method in dealing with such complex combinations of different CMR sequences. Results demonstrated the superiority and generalizability of MyoPS-Net and, more importantly, indicated a practical clinical application.
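Place recognition is an important component for autonomous vehicles to achieve loop closing or global localization. In this paper, we tackle the problem of place recognition based on sequential 3D LiDAR scans obtained by an onboard LiDAR sensor. We propose a transformer-based network named SeqOT to exploit the temporal and spatial information provided by sequential range images generated from the LiDAR data. It uses multi-scale transformers to generate a global descriptor for each sequence of LiDAR range images in an end-to-end fashion. During online operation, SeqOT finds similar places by matching such descriptors between the current query sequence and those stored in the map. We evaluate our approach on four datasets collected with different types of LiDAR sensors in different environments. The experimental results show that our method outperforms state-of-the-art LiDAR-based place recognition methods and generalizes well across different environments. Furthermore, our method operates online faster than the frame rate of the sensor. The implementation of our method is released as open source at: https://github.com/bit-mjy/seqot.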
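The descriptor matching step described above can be illustrated with a minimal sketch. This is not the SeqOT implementation: it assumes each place is already summarized by a fixed-length descriptor vector, and the cosine similarity measure, the `find_similar_place` helper, the `threshold` value, and the toy descriptors are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector matches nothing
    return dot / (norm_a * norm_b)


def find_similar_place(query, map_descriptors, threshold=0.8):
    """Return (index, similarity) of the best-matching map descriptor.

    The index is None when no stored descriptor reaches the threshold.
    The threshold is an assumed value, not one taken from the paper.
    """
    best_index, best_score = None, -1.0
    for i, descriptor in enumerate(map_descriptors):
        score = cosine_similarity(query, descriptor)
        if score > best_score:
            best_index, best_score = i, score
    if best_score < threshold:
        return None, best_score
    return best_index, best_score


# Toy 3-dimensional descriptors standing in for learned global descriptors.
map_descriptors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
query = [0.0, 0.9, 0.1]
index, score = find_similar_place(query, map_descriptors)
print(index, round(score, 3))
```

A linear scan is enough for a toy map; a large map would more plausibly use an approximate nearest-neighbor index, which is outside the scope of this sketch.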
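Developing methods to adversarially challenge NLP systems is a promising avenue for improving both model performance and interpretability. Here, we describe the approach of the team "longhorns" on Task 1 of the First Workshop on Dynamic Adversarial Data Collection (DADC), which asked teams to manually fool a model on an extractive question answering task. Our team finished first, with a model error rate of 62%. We advocate for a systematic, linguistically informed approach to formulating adversarial questions, and we describe the results of our pilot experiments as well as our official submission.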
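Robots need multiple interaction modes to robustly collaborate with humans in complex industrial tasks. We develop a Coexistence-and-Cooperation (CoCo) human-robot collaboration system. Coexistence mode enables the robot to work with the human on different sub-tasks independently in a shared space. Cooperation mode enables the robot to follow human guidance and recover from failures. A human intent tracking algorithm takes both human and robot motion measurements as input and provides a switch between the interaction modes. We demonstrate the effectiveness of the CoCo system in a use case analogous to a real-world multi-step assembly task.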
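Collaborative robots require effective human intent estimation to work safely and smoothly with humans in structured tasks such as industrial assembly, where human intent changes continuously. We propose the concept of intent tracking and introduce a collaborative robot system that concurrently tracks intents at hierarchical levels. The high-level intent is tracked to estimate the human's interaction pattern and enable the robot to (1) avoid collisions with the human to minimize interruption or (2) assist the human in correcting failures. The low-level intent estimate provides the robot with task-specific information for concurrent execution. We implement the system on a UR5e robot and demonstrate robust, seamless, and ergonomic human-robot collaboration in an assembly use case through an ablative pilot study.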
This paper presents the ARCAD simulator for the rapid development of Unmanned Aerial Systems (UAS), including underactuated and fully-actuated multirotors, fixed-wing aircraft, and Vertical Take-Off and Landing (VTOL) hybrid vehicles. The simulator is designed to accelerate these aircraft's modeling and control design. It provides various analyses of the design and operation, such as wrench-set computation, controller response, and flight optimization. In addition to simulating free flight, it can simulate the physical interaction of the aircraft with its environment. The simulator is written in MATLAB to allow rapid prototyping and is capable of generating graphical visualization of the aircraft and the environment in addition to generating the desired plots. It has been used to develop several real-world multirotor and VTOL applications. The source code is available at https://github.com/keipour/aircraft-simulator-matlab.
This work presents the Time-reversal Equivariant Neural Network (TENN) framework. TENN incorporates time-reversal symmetry into the equivariant neural network (ENN), generalizing the ENN to handle physical quantities related to time-reversal symmetry, such as the spin and velocity of atoms. TENN-e3, the time-reversal extension of the E(3)-equivariant neural network, is developed to preserve time-reversal E(3) equivariance, with the option of including the spin-orbit effect, for both collinear and non-collinear magnetic moments in magnetic materials. TENN-e3 can construct the spin neural network potential and the Hamiltonian of magnetic materials from ab-initio calculations. TENN-e3 employs time-reversal-E(3)-equivariant convolutions for the interactions of spinors and geometric tensors. Compared to the popular ENN, TENN-e3 can describe complex spin-lattice coupling with high accuracy and preserves time-reversal symmetry, which existing E(3)-equivariant models do not. In addition, the Hamiltonian of magnetic materials with time-reversal symmetry can be built with TENN-e3. TENN paves a new way toward spin-lattice dynamics simulations over long time scales and electronic structure calculations of large-scale magnetic materials.