Smart Paper Notes

DeepGen: Diverse Search Ad Generation and Real-Time Customization

Konstantin Golobokov , Junyi Chai , Victor Ye Dong , Mandy Gu , Bingyu Chi , Jie Cao , Yulan Yan , Yi Liu

Categories: Natural Language Processing

2022-08-06

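We present DeepGen, a system deployed at web scale for automatically creating sponsored search advertisements (ads) for Bing Ads customers. We leverage state-of-the-art natural language generation (NLG) models to generate fluent ads from advertisers' web pages in an abstractive fashion, and we address practical issues such as factuality and inference speed. In addition, our system creates a customized ad in real time in response to the user's search query, thereby highlighting different aspects of the same product depending on what the user is looking for. To achieve this, our system generates a diverse set of smaller ad pieces ahead of time and, at query time, selects the most relevant ones to be stitched into a complete ad. We improve generation diversity by training a controllable NLG model to generate multiple ads for the same web page, each highlighting different selling points. Our system design further improves diversity by first running an ensemble of generation models trained with different objectives and then using a diversity sampling algorithm to pick a diverse subset of generation results for online selection. Experimental results show the effectiveness of our proposed system design. Our system is currently deployed in production, serving ${\sim}4\%$ of the global ads served in Bing.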
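The diversity sampling step can be illustrated with a minimal sketch. The abstract does not say which algorithm DeepGen uses, so the code below assumes a generic greedy max-min (farthest-point) selection over Jaccard distance between token sets. The names `jaccard_distance` and `select_diverse` are hypothetical, and the candidate strings are made up for illustration.

```python
from typing import List


def jaccard_distance(a: str, b: str) -> float:
    """Return 1 - |A & B| / |A | B| over lowercase whitespace tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def select_diverse(candidates: List[str], k: int) -> List[str]:
    """Greedily pick up to k candidates, each maximizing its minimum
    distance to the candidates already picked."""
    if k <= 0 or not candidates:
        return []
    # Assumption: the first candidate is the top-ranked one and is always kept.
    selected = [candidates[0]]
    remaining = list(candidates[1:])
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda c: min(jaccard_distance(c, s) for s in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up ad fragments for illustration only.
ads = [
    "free shipping on all shoes",
    "free shipping on all boots",
    "30 day returns guaranteed",
    "free shipping on shoes today",
]
picked = select_diverse(ads, 2)
```

With these inputs the second pick is the returns message, because it shares no tokens with the first fragment. A production system would likely measure distance in an embedding space rather than on raw tokens.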

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
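For readers unfamiliar with the K-fold cross-validation mentioned in the survey, the following is a minimal sketch of how a training set can be split into folds. It is not taken from any surveyed solution: the helper name `k_fold_indices` is hypothetical, and the folds are contiguous with no shuffling to keep the example short.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, val_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


folds = list(k_fold_indices(10, 3))
```

Each sample appears in exactly one validation fold, so every training case contributes once to the validation estimate. In practice the indices would usually be shuffled first, and possibly stratified by class or patient.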

Artificial Intelligence Security Competition (AISC)

Yinpeng Dong , Peng Chen , Senyou Deng , Lianji L , Yi Sun , Hanyu Zhao , Jiaxing Li , Yunteng Tan , Xinyu Liu , Yangyi Dong

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2022-12-07

The security of artificial intelligence (AI) is an important research area towards safe, reliable, and trustworthy AI systems. To accelerate the research on AI security, the Artificial Intelligence Security Competition (AISC) was organized by the Zhongguancun Laboratory, China Industrial Control Systems Cyber Emergency Response Team, Institute for Artificial Intelligence, Tsinghua University, and RealAI as part of the Zhongguancun International Frontier Technology Innovation Competition (https://www.zgc-aisc.com/en). The competition consists of three tracks, including Deepfake Security Competition, Autonomous Driving Security Competition, and Face Recognition Security Competition. This report will introduce the competition rules of these three tracks and the solutions of top-ranking teams in each track.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Fast DistilBERT on CPUs

Haihao Shen , Ofir Zafrir , Bo Dong , Hengyu Meng , Xinyu Ye , Zhe Wang , Yi Ding , Hanwen Chang , Guy Boudoukh , Moshe Wasserblat

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-10-27

Transformer-based language models have become the standard approach to solving natural language processing tasks. However, industry adoption usually requires the maximum throughput to comply with certain latency constraints, which prevent Transformer models from being used in production. To address this gap, model compression techniques such as quantization and pruning may be used to improve inference efficiency. However, these compression techniques require specialized software to apply and deploy at scale. In this work, we propose a new pipeline for creating and running Fast Transformer models on CPUs, utilizing hardware-aware pruning, knowledge distillation, quantization, and our own Transformer inference runtime engine with optimized kernels for sparse and quantized operators. We demonstrate the efficiency of our pipeline by creating a Fast DistilBERT model showing minimal accuracy loss on the question-answering SQuADv1.1 benchmark, and throughput results under typical production constraints and environments. Our results outperform the existing state-of-the-art runtime, Neural Magic's DeepSparse, by up to 50% and deliver up to a 4.1x speedup over ONNX Runtime. Source code is publicly available at https://github.com/intel/intel-extension-for-transformers.
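To make the quantization step concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It illustrates the general idea only and is not the pipeline or runtime described in the paper. The function names `quantize_int8` and `dequantize` and the sample weights are assumptions made for this example.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # A single scale is shared by the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integers and the shared scale."""
    return [x * scale for x in q]


# Made-up weights for illustration only.
weights = [0.4, -1.0, 0.25, 0.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
```

The round-trip error of each value is bounded by half the scale, which is the accuracy cost traded for smaller storage and faster integer kernels.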

A CT-Based Airway Segmentation Using U$^2$-net Trained by the Dice Loss Function

Kunpeng Wang , Yuexi Dong , Yunpu Zeng , Zhichun Ye , Yangzhe Wang

Categories: Computer Vision

2022-09-22

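Airway segmentation from chest computed tomography (CT) scans plays an essential role in the diagnosis of pulmonary disease. Compared with manual segmentation, computer-assisted airway segmentation based on the U-Net architecture is more efficient and more accurate. In this paper, we employ the U$^2$-net trained with the Dice loss function to model the airway tree from multi-site CT scans, based on the 299 training CT scans provided by ATM'22. The saliency probability map derived from training is applied to the validation data to extract the corresponding airway trees. Our observations show that the majority of the segmented airway trees perform well in terms of accuracy and connectivity. Refinements such as labeling and removing non-airway regions are applied to some of the obtained airway tree models so as to display the largest component of the binary results.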
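The Dice loss named in the title measures overlap between a predicted mask and the ground truth. The sketch below shows the common soft Dice formulation over flattened voxel probabilities. The abstract does not give the exact variant used, so the smoothing term `eps` and the function name `dice_loss` are assumptions.

```python
from typing import Sequence


def dice_loss(pred: Sequence[float], target: Sequence[float], eps: float = 1e-6) -> float:
    """Soft Dice loss: 1 - (2 * sum(p * g) + eps) / (sum(p) + sum(g) + eps)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * g for p, g in zip(pred, target))
    denominator = sum(pred) + sum(target)
    # eps (an assumption) avoids division by zero when both masks are empty.
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


perfect = dice_loss([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
disjoint = dice_loss([1.0, 1.0, 0.0, 0.0], [0, 0, 1, 1])
partial = dice_loss([0.5, 0.5, 0.0, 0.0], [1, 1, 0, 0])
```

The loss is near 0 for a perfect match, near 1 for disjoint masks, and about 1/3 for the half-confident prediction. Because it is normalized by mask size, it is less sensitive than plain cross-entropy to the heavy background imbalance typical of thin structures such as airways.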

Enabling Massage Actions: An Interactive Parallel Robot with Compliant Joints

Huixu Dong , Yue Feng , Chen Qiu , Ye Pan , Miaoying He , I-Ming Chen

Categories: Robotics

2022-08-26

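We propose a parallel massage robot with compliant joints based on the series elastic actuator (SEA), offering a unified force-position control approach. First, kinematic and static force models are established to obtain the corresponding control variables. Then, a novel force-position control strategy is proposed to separately control the force and position along the surface normal direction and the displacement along the other two directions, without requiring a dynamics model of the robot. To evaluate its performance, we carry out a series of robotic massage experiments. The results show that the proposed massage manipulator can successfully achieve the desired forces and motion patterns of massage tasks, yielding a highly rated user experience.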

AIM 2022 Challenge on Super-Resolution of Compressed Image and Video: Dataset, Methods and Results

Ren Yang , Radu Timofte , Xin Li , Qi Zhang , Lin Zhang , Fanglong Liu , Dongliang He , Fu Li , He Zheng , Weihang Yuan

Categories: Computer Vision

2022-08-23

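This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. The challenge includes two tracks. Track 1 targets the super-resolution of compressed images, and Track 2 targets the super-resolution of compressed video. In Track 1, we use the popular DIV2K dataset as the training, validation, and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 365 videos, comprising the LDV 2.0 dataset (335 videos) and 30 additional videos. In this challenge, 12 teams and 2 teams submitted final results to Track 1 and Track 2, respectively. The proposed methods and solutions gauge the state of the art of super-resolution on compressed images and video. The proposed LDV 3.0 dataset is available at https://github.com/renyang-home/ldv_dataset. The homepage of this challenge is at https://github.com/renyang-home/aim22_compresssr.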

HighlightNet: Highlighting Low-Light Potential Features for Real-Time UAV Tracking

Changhong Fu , Haolin Dong , Junjie Ye , Guangze Zheng , Sihang Li , Jilin Zhao

Categories: Robotics | Computer Vision

2022-08-14

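Low-light environments pose a formidable challenge for robust unmanned aerial vehicle (UAV) tracking, even with state-of-the-art (SOTA) trackers, because potential image features are hard to extract under adverse lighting conditions. In addition, owing to the low visibility, accurate online selection of the object becomes extremely difficult for human monitors who need to initialize UAV tracking at ground control stations. To solve these problems, this work proposes a novel enhancer, HighlightNet, to light up potential objects for both human operators and UAV trackers. By employing a Transformer, HighlightNet can adjust its enhancement parameters according to global features and is thus adaptive to illumination variation. A pixel-level range mask is introduced to make HighlightNet focus more on enhancing the tracked object and regions without light sources. Furthermore, a soft truncation mechanism is built to prevent background noise from being mistaken for crucial features. Evaluations on image enhancement benchmarks show that HighlightNet has advantages in facilitating human perception. Experiments on the public UAVDark135 benchmark show that HighlightNet is more suitable for UAV tracking tasks than other SOTA low-light enhancers. In addition, real-world tests on a typical UAV platform verify the practicality and efficiency of HighlightNet in applications related to nighttime aerial tracking. The code and demo videos are available at https://github.com/vision4robotics/highlightnet.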

BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers

Zhiliang Peng , Li Dong , Hangbo Bao , Qixiang Ye , Furu Wei

Categories: Computer Vision

2022-08-12

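Masked image modeling (MIM) has demonstrated impressive results in self-supervised representation learning by recovering corrupted image patches. However, most methods still operate on low-level image pixels, which hinders the exploitation of high-level semantics for representation models. In this study, we propose to use a semantic-rich visual tokenizer as the reconstruction target for masked prediction, providing a systematic way to promote MIM from the pixel level to the semantic level. Specifically, we introduce vector-quantized knowledge distillation to train the tokenizer, which discretizes a continuous semantic space into compact codes. We then pretrain vision Transformers by predicting the original visual tokens of the masked image patches. Moreover, we encourage the model to explicitly aggregate patch information into a global image representation, which facilitates linear probing. Experiments on image classification and semantic segmentation show that our approach outperforms all compared MIM methods. On ImageNet-1K (224 size), the base-size BEiT v2 achieves 85.5% top-1 accuracy for fine-tuning and 80.1% top-1 accuracy for linear probing. The large-size BEiT v2 obtains 87.3% top-1 accuracy for ImageNet-1K (224 size) fine-tuning and 56.7% mIoU on ADE20K for semantic segmentation. The code and pretrained models are available at https://aka.ms/beit.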
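The discretization performed by a vector-quantized tokenizer comes down to a nearest-neighbor lookup in a codebook. The sketch below shows only that lookup step, under simplifying assumptions: a toy two-dimensional codebook, squared Euclidean distance, and the hypothetical helper names `squared_distance` and `quantize`. It does not reproduce the knowledge distillation training described in the abstract.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> List[int]:
    """Map each feature vector to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(f, codebook[i]))
        for f in features
    ]


# Toy codebook and patch features, made up for illustration only.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
patch_features = [[0.9, 0.1], [0.1, 0.2], [0.2, 0.8]]
tokens = quantize(patch_features, codebook)
```

Each patch is replaced by a single integer token, which is what the masked prediction objective then tries to recover for the masked patches.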
