Smart Paper Notes

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
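
The k-fold cross-validation that the survey asks about can be sketched as follows. This is only an illustration: the function name `k_fold_indices` and the contiguous, unshuffled split are assumptions made here, not a procedure described by the survey.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train, validation) index pairs.

    Hypothetical helper: folds are contiguous and unshuffled, and the
    first n_samples % k folds receive one extra sample.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        # Every sample outside the validation window is used for training.
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, validation))
        start = stop
    return folds


folds = k_fold_indices(10, 3)
for train, validation in folds:
    print(len(train), "training samples,", len(validation), "validation samples")
```

Each sample lands in exactly one validation fold, so averaging the k validation scores uses every training case once for evaluation.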

translated by Google Translate

FuRPE: Learning Full-body Reconstruction from Part Experts

Zhaoxin Fan, Yuqing Pan, Hao Xu, Zhenbo Song, Zhicheng Wang, Kejian Wu, Hongyan Liu, Jun He

Categories: Computer Vision

2022-11-30

Full-body reconstruction is a fundamental but challenging task. Owing to the lack of annotated data, the performance of existing methods is largely limited. In this paper, we propose a novel method named Full-body Reconstruction from Part Experts (FuRPE) to tackle this issue. In FuRPE, the network is trained using pseudo labels and features generated from part experts. A simple yet effective pseudo ground-truth selection scheme is proposed to extract high-quality pseudo labels. In this way, a large number of existing human body reconstruction datasets can be leveraged and contribute to the model training. In addition, an exponential moving average training strategy is introduced to train the network in a self-supervised manner, further boosting the performance of the model. Extensive experiments on several widely used datasets demonstrate the effectiveness of our method over the baseline. Our method achieves state-of-the-art performance. Code will be publicly available for further research.
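
The abstract does not say how the exponential moving average (EMA) is applied. A common form, assumed here, keeps a slowly updated "teacher" copy of the weights that tracks a "student" being trained. The names `ema_update`, `teacher`, `student`, and `decay` are hypothetical, and plain dictionaries of floats stand in for network parameters.

```python
def ema_update(teacher, student, decay=0.999):
    """Move each teacher weight toward the matching student weight, in place.

    Assumed update rule: teacher = decay * teacher + (1 - decay) * student.
    """
    for name, student_value in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * student_value
    return teacher


# Toy parameters; a small decay is used so the effect is visible in three steps.
teacher = {"w": 0.0, "b": 1.0}
student = {"w": 1.0, "b": 1.0}
for _ in range(3):
    ema_update(teacher, student, decay=0.5)
print(teacher)
```

With a decay close to 1 the teacher changes slowly, which smooths out noise in the student's individual updates.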


RDA: An Accelerated Collision-free Motion Planner for Autonomous Navigation in Cluttered Environments

Ruihua Han, Shuai Wang, Shuaijun Wang, Zeqing Zhang, Qianru Zhang, Yonina C. Eldar, Qi Hao, Jia Pan

Categories: Robotics

2022-10-01

Motion planning is challenging for autonomous systems in multi-obstacle environments due to nonconvex collision avoidance constraints. Directly applying numerical solvers to these nonconvex formulations fails to exploit the constraint structures, resulting in excessive computation time. In this paper, we present an accelerated collision-free motion planner, namely the regularized dual alternating direction method of multipliers (RDADMM, or RDA for short), for the model predictive control (MPC) based motion planning problem. The proposed RDA addresses nonconvex motion planning by solving a smooth biconvex reformulation obtained via duality, and allows the collision avoidance constraints to be computed in parallel for each obstacle, which reduces computation time significantly. We validate the performance of the RDA planner through path-tracking experiments with car-like robots in simulation and real-world settings. Experimental results show that the proposed method can generate smooth collision-free trajectories with less computation time than other benchmarks and performs robustly in cluttered environments.


Implicit Conversion of Manifold B-Rep Solids by Neural Halfspace Representation

Hao-Xiang Guo, Yang Liu, Hao Pan, Baining Guo

Categories: Computer Vision

2022-09-21

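We present a novel implicit representation, the neural halfspace representation (NH-Rep), for converting manifold B-Rep solids to implicit representations. NH-Rep is a Boolean tree built on a set of implicit functions represented by neural networks, and the composite Boolean function can represent solid geometry while preserving sharp features. We propose an efficient algorithm to extract the Boolean tree from a manifold B-Rep solid and devise a neural network-based optimization approach to compute the implicit functions. We demonstrate the high quality of our conversion algorithm on one thousand manifold B-Rep CAD models that contain various curved patches, including NURBS, and the superiority of our learning approach over other representative implicit conversion algorithms in terms of surface reconstruction, sharp feature preservation, signed distance field approximation, and robustness to various surface geometries, as well as a set of applications supported by NH-Rep.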
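
A Boolean tree over implicit functions can be illustrated with a toy 2D example. Several assumptions are made here that the abstract does not state: a function is negative inside the solid, intersection is taken as the pointwise maximum and union as the pointwise minimum, and analytic halfplanes stand in for the neural implicit functions. The names `halfspace`, `box`, and `evaluate` are hypothetical.

```python
def halfspace(normal, offset):
    """Return f(p) = normal . p - offset, which is negative on the inside."""
    def f(point):
        return sum(n * x for n, x in zip(normal, point)) - offset
    return f


def box(xmin, xmax, ymin, ymax):
    """Axis-aligned rectangle as the intersection ("and") of four halfplanes."""
    return ("and", [
        halfspace((1.0, 0.0), xmax),    # x <= xmax
        halfspace((-1.0, 0.0), -xmin),  # x >= xmin
        halfspace((0.0, 1.0), ymax),    # y <= ymax
        halfspace((0.0, -1.0), -ymin),  # y >= ymin
    ])


def evaluate(node, point):
    """Evaluate a Boolean tree: leaves are callables, inner nodes are (op, children)."""
    if callable(node):
        return node(point)
    op, children = node
    values = [evaluate(child, point) for child in children]
    # Intersection keeps the largest value, union keeps the smallest.
    return max(values) if op == "and" else min(values)


# L-shaped solid: the union ("or") of two rectangles.
shape = ("or", [box(0.0, 1.0, 0.0, 1.0), box(1.0, 2.0, 0.0, 0.5)])
for point in [(0.5, 0.5), (1.5, 0.25), (1.5, 0.75)]:
    print(point, "inside" if evaluate(shape, point) < 0 else "outside")
```

Because min and max switch between functions rather than blending them, the composite keeps the creases where two surfaces meet, which is the intuition behind preserving sharp features.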


Hierarchical Temporal Transformer for 3D Hand Pose Estimation and Action Recognition from Egocentric RGB Videos

Yilin Wen, Hao Pan, Lei Yang, Jia Pan, Taku Komura, Wenping Wang

Categories: Computer Vision | Robotics

2022-09-20

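Understanding dynamic hand motions and actions is a fundamental yet challenging task due to self-occlusion and ambiguity. To address occlusion and ambiguity, we develop a transformer-based framework that exploits temporal information for robust estimation. Noting the different temporal granularity of, and the semantic correlation between, hand pose estimation and action recognition, we build a network hierarchy with two cascaded transformer encoders: the first exploits short-term temporal cues for hand pose estimation, and the second aggregates per-frame pose and object information over a longer time span to recognize the action. Our approach achieves competitive results on two first-person hand action benchmarks, namely FPHA and H2O. Extensive ablation studies verify our design choices. We will release code and data to facilitate future research.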


TAG: Learning Circuit Spatial Embedding From Layouts

Keren Zhu, Hao Chen, Walker J. Turner, George F. Kokai, Po-Hsuan Wei, David Z. Pan, Haoxing Ren

Categories: Machine Learning

2022-09-07

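Analog and mixed-signal (AMS) circuit design still relies on human design expertise. Machine learning has been assisting circuit design automation by replacing human experience with artificial intelligence. This paper presents TAG, a new paradigm for learning circuit representations from layouts that leverages text, self-attention, and graphs. The embedding network model learns spatial information without manual labeling. We introduce text embedding and a self-attention mechanism to AMS circuit learning. Experimental results demonstrate the ability to predict layout distances between instances on industrial FinFET technology benchmarks. The effectiveness of the circuit representation is verified by showing its transferability to three other learning tasks with limited data in case studies: layout matching prediction, wirelength estimation, and net parasitic capacitance prediction.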


Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence

Junjie Wang, Yuxiang Zhang, Lin Zhang, Ping Yang, Xinyu Gao, Ziwei Wu, Xiaoqun Dong, Junqing He, Jianheng Zhuo, Qi Yang

Categories: Natural Language Processing

2022-09-07

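Foundation models have become one of the fundamental infrastructures of artificial intelligence, paving the way to general intelligence. However, reality presents two urgent challenges: existing foundation models are dominated by the English-language community, and users often have limited resources and therefore cannot always use foundation models. To support the development of the Chinese-language community, we introduce an open-source project called Fengshenbang, led by the research center for Cognitive Computing and Natural Language (CCNL). Our project has comprehensive capabilities, including large pre-trained models, user-friendly APIs, benchmarks, datasets, and more. We wrap all of these in three sub-projects: the Fengshenbang Model, the Fengshen Framework, and the Fengshen Benchmark. The open-source roadmap of Fengshenbang aims to re-evaluate the open-source community of Chinese pre-trained large-scale models and to promote the development of the entire Chinese large-scale model community. We also hope to build a user-centered open-source ecosystem that lets individuals access models matched to their computing resources. Furthermore, we invite companies, universities, and research institutions to collaborate with us in building an ecosystem based on large-scale open-source models. We hope that this project will become the foundation of Chinese cognitive intelligence.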


Vector Quantized Diffusion Model with CodeUnet for Text-to-Sign Pose Sequences Generation

Pan Xie, Qipeng Zhang, Zexian Li, Hao Tang, Yao Du, Xiaohui Hu

Categories: Computer Vision

2022-08-19

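Sign Language Production (SLP) aims to automatically translate spoken languages into sign sequences. The core process of SLP is to transform sign gloss sequences into their corresponding sign pose sequences (G2P). Most existing G2P models perform this conditional long-range generation in an autoregressive manner, which inevitably leads to an accumulation of errors. To address this issue, we propose a vector quantized diffusion method for conditional pose sequence generation, called PoseVQ-Diffusion, which is an iterative non-autoregressive method. Specifically, we first introduce a vector quantized variational autoencoder (Pose-VQVAE) model to represent a pose sequence as a sequence of latent codes. We then model the latent discrete space with an extension of the recently developed diffusion architecture. To better leverage spatial-temporal information, we introduce a novel architecture, CodeUnet, to generate higher-quality pose sequences in the discrete space. Moreover, taking advantage of the learned codes, we develop a novel sequential k-nearest-neighbours method to predict the variable lengths of pose sequences for the corresponding gloss sequences. Consequently, compared with autoregressive G2P models, our model has a faster sampling speed and produces significantly better results. Compared with previous non-autoregressive G2P methods, PoseVQ-Diffusion improves the predicted results through iterative refinement, achieving state-of-the-art results on the SLP evaluation benchmark.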
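
The vector quantization step, which turns continuous features into a sequence of discrete codes, can be sketched as a nearest-neighbour lookup in a codebook. This shows only the generic technique, not the paper's model: the codebook values, the 2D toy "pose features", and the names `quantize` and `dequantize` are assumptions.

```python
def quantize(vectors, codebook):
    """Return, for each vector, the index of its nearest codebook entry (squared L2)."""
    indices = []
    for vector in vectors:
        distances = [
            sum((a - b) ** 2 for a, b in zip(vector, code))
            for code in codebook
        ]
        indices.append(distances.index(min(distances)))
    return indices


def dequantize(indices, codebook):
    """Map discrete codes back to their codebook vectors."""
    return [codebook[i] for i in indices]


# Hypothetical codebook with three entries and three toy feature vectors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
features = [(0.1, -0.1), (0.9, 0.2), (0.2, 0.8)]

codes = quantize(features, codebook)
reconstruction = dequantize(codes, codebook)
print(codes)
print(reconstruction)
```

Once a sequence is expressed as integer codes, a generative model can operate on the discrete indices rather than on the continuous pose values.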


Energy and Spectrum Efficient Federated Learning via High-Precision Over-the-Air Computation

Liang Li, Chenpei Huang, Dian Shi, Hao Wang, Xiangwei Zhou, Minglei Shu, Miao Pan

Categories: Machine Learning | Artificial Intelligence

2022-08-15

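Federated learning (FL) enables mobile devices to collaboratively learn a shared prediction model while keeping data local. However, there are two major research challenges in practically deploying FL on mobile devices: (i) frequent wireless gradient updates vs. limited spectrum resources, and (ii) energy-hungry FL communication and local computing during training vs. battery-constrained mobile devices. To address these challenges, we propose a novel multi-bit over-the-air computation (M-AirComp) approach for spectrum-efficient aggregation of local model updates in FL, and further present an energy-efficient FL design for mobile devices. Specifically, a high-precision digital modulation scheme is designed and incorporated into M-AirComp, allowing mobile devices to upload model updates at the selected positions simultaneously in the multi-access channel. Moreover, we theoretically analyze the convergence of our FL algorithm. Guided by the FL convergence analysis, we formulate a joint transmission probability and local computing control optimization that aims to minimize the overall energy consumption of mobile devices in FL (i.e., iterative local computing plus multi-round communication). Extensive simulation results show that our proposed scheme outperforms existing ones in terms of spectrum utilization, energy efficiency, and learning accuracy.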


Design What You Desire: Icon Generation from Orthogonal Application and Theme Labels

Yinpeng Chen, Zhiyu Pan, Min Shi, Hao Lu, Zhiguo Cao, Weicai Zhong

Categories: Computer Vision

2022-07-31

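Generative adversarial networks (GANs) have been trained to be professional artists able to create stunning artworks, for example through face generation and image style transfer. In this paper, we focus on a realistic business scenario: automated generation of customizable icons given desired mobile applications and theme styles. We first introduce a theme-application icon dataset, termed AppIcon, in which each icon has two orthogonal labels, theme and app. By investigating the strong baseline StyleGAN2, we observe mode collapse caused by the entanglement of the orthogonal labels. To address this challenge, we propose IconGAN, composed of a conditional generator and dual discriminators with orthogonal augmentations, and we further design a contrastive feature disentanglement strategy to regularize the feature space of the two discriminators. Compared with other approaches, IconGAN shows a clear advantage on the AppIcon benchmark. Further analysis also confirms the effectiveness of disentangling app and theme representations. Our project will be released at: https://github.com/architect-road/icongan.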
