We present MeshLeTemp, a method for 3D human pose and mesh reconstruction from a single image. To encode human body priors, we propose a learnable template human mesh instead of the constant template used by previous state-of-the-art methods. The proposed learnable template reflects not only vertex-vertex interactions but also the human pose and body shape, which lets it adapt to diverse images. We conduct extensive experiments to show the generalizability of our method to unseen scenarios.
translated by 谷歌翻译
Calculating an Air Quality Index (AQI) typically uses data streams from air quality sensors deployed at fixed locations, and the calculation is a real-time process. If one or more sensors are broken or offline, the real-time AQI value cannot be computed. Estimating AQI values for some point in the future is a predictive process that uses historical AQI values to train and build models. In this work we focus on gap filling in air quality data, where the task is to predict the AQI at 1, 5 and 7 days into the future. We consider the scenario where one or more air, weather and traffic sensors are offline and explore prediction accuracy under such conditions. The work is part of the MediaEval'2022 Urban Air: Urban Life and Air Pollution task submitted by the DCU-Insight-AQ team and uses multimodal and crossmodal data consisting of AQI, weather and CCTV traffic images for air pollution prediction.
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based either on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
The introduction of high-quality image generation models, particularly the StyleGAN family, provides a powerful tool to synthesize and manipulate images. However, existing models are built upon high-quality (HQ) data as desired outputs, making them unfit for in-the-wild low-quality (LQ) images, which are common inputs for manipulation. In this work, we bridge this gap by proposing a novel GAN structure that allows for generating images with controllable quality. The network can synthesize various image degradations and restore the sharp image via a quality control code. Our proposed QC-StyleGAN can directly edit LQ images without altering their quality by applying GAN inversion and manipulation techniques. As a by-product, it also provides an image restoration solution that can handle various degradations, including noise, blur, compression artifacts, and their mixtures. Finally, we demonstrate numerous other applications such as image degradation synthesis, transfer, and interpolation.
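Tracking the players and the ball in team sports is key to analyzing performance or enhancing the game experience. When the only sources for this data are broadcast videos, sports-field registration systems are required to estimate the homography and re-project the ball or the players from the image space to the field space. This paper describes a new basketball court registration framework in the context of the MMSports 2022 Camera Calibration Challenge. The method is based on the estimation, by an encoder-decoder network, of the positions of keypoints sampled with perspective-aware constraints. The regression of the basket positions and heavy data augmentation techniques make the model robust to different arenas. Ablation studies show the positive effects of our contributions on the challenge test set. Compared to the challenge baseline, our method divides the squared error by 4.7.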
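The re-projection step mentioned above can be illustrated with a minimal sketch. It assumes a 3x3 homography has already been estimated by some registration method; the function name `project_point` and the matrix `H_example` are hypothetical and are not taken from the paper.

```python
def project_point(H, x, y):
    """Map an image point (x, y) to field coordinates using a 3x3 homography H.

    H is a nested list (row-major). This is a generic planar projection,
    not the registration network described in the abstract.
    """
    # Lift (x, y) to homogeneous coordinates (x, y, 1) and multiply by H.
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous scale to return to 2D field coordinates.
    return u / w, v / w


# Hypothetical homography (assumption for illustration only):
# a pure scale-and-shift from pixels to field units.
H_example = [
    [0.02, 0.0, -5.0],
    [0.0, 0.02, -3.0],
    [0.0, 0.0, 1.0],
]

court_xy = project_point(H_example, 640.0, 360.0)
```

In practice the last row of the homography is not `[0, 0, 1]` for a broadcast camera, which is why the division by `w` matters.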
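Wheat is one of the major staple crops across the globe. It is therefore essential to measure, maintain and improve the quality of wheat for human consumption. Traditional wheat quality measurement methods are mostly invasive, destructive and limited to small samples of wheat. In a typical wheat supply chain, there are many receival points where bulk wheat arrives, is stored and is forwarded as required. At these receival points, applying traditional quality measurement methods is quite difficult and often very expensive. There is therefore a need for non-invasive, non-destructive, real-time methods for wheat quality assessment. One method that fulfils these criteria is hyperspectral imaging (HSI) for food quality measurement, which can also be applied to bulk samples. In this paper, we investigate how HSI has been used in the literature to assess the quality of stored wheat, so that the information required to implement real-time digital quality assessment methods at different stages of the Australian supply chain is available in a single, compact document.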
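We analyze new generalization bounds for deep learning models trained by transfer learning from a source to a target task. Our bounds utilize a quantity called the majority predictor accuracy, which can be computed efficiently from data. We show that our theory is useful in practice, since it implies that the majority predictor accuracy can be used as a transferability measure, a fact that is also validated by our experiments.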
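Deep learning approaches achieve state-of-the-art performance for classifying radiology images, but rely on large labelled datasets that require resource-intensive annotation by specialists. Both semi-supervised learning and active learning can be used to mitigate this annotation burden. However, there is limited work on combining the advantages of semi-supervised and active learning approaches for multi-label medical image classification. Here, we introduce a novel Consistency-based Semi-supervised Evidential Active Learning framework (CSEAL). Specifically, we leverage predictive uncertainty based on theories of evidence and subjective logic to develop an end-to-end integrated approach that combines consistency-based semi-supervised learning with uncertainty-based active learning. We apply our approach to enhance four consistency-based semi-supervised learning methods: Pseudo-labelling, Virtual Adversarial Training, Mean Teacher and NoTeacher. Extensive evaluations on multi-label chest X-ray classification tasks demonstrate that CSEAL achieves substantial improvements over two leading semi-supervised active learning baselines. Furthermore, a class-wise breakdown of the results shows that our approach can substantially improve accuracy on rarer abnormalities with fewer labelled samples.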
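In the ecosystem of deep learning, noisy labels are inevitable but troublesome, since models can easily overfit them. There are many types of label noise, such as symmetric, asymmetric and instance-dependent noise (IDN), with IDN being the only type that depends on image information. Given that labelling mistakes are caused in large part by insufficient or ambiguous information about the visual classes present in images, this dependence on image information makes IDN a critical type of label noise to study. To provide an effective technique for addressing IDN, we present a new graphical modelling approach called InstanceGM, which combines discriminative and generative models. The main contributions of InstanceGM are: i) the use of the continuous Bernoulli distribution to train the generative model, which offers significant training advantages, and ii) the exploration of a state-of-the-art noisy-label discriminative classifier to generate clean labels from instance-dependent noisy-label samples. InstanceGM is competitive with current noisy-label learning approaches, particularly on IDN benchmarks using synthetic and real-world datasets, where our method shows better accuracy than the competitors in most experiments.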
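For readers unfamiliar with the continuous Bernoulli distribution named in contribution i), the following is a minimal sketch of its log-density for a value x in [0, 1] with parameter λ in (0, 1). It uses the standard textbook form of the density; the function names are hypothetical, and this is not the InstanceGM training code.

```python
import math


def cb_log_norm(lam):
    """Log of the continuous Bernoulli normalizing constant C(lam).

    C(lam) = 2 * atanh(1 - 2*lam) / (1 - 2*lam) for lam != 0.5, and 2 at lam = 0.5.
    """
    if abs(lam - 0.5) < 1e-6:
        # Limit value at lam = 0.5, where the general formula is 0/0.
        return math.log(2.0)
    t = 1.0 - 2.0 * lam
    return math.log(2.0 * math.atanh(t) / t)


def cb_log_prob(x, lam):
    """Log-density of the continuous Bernoulli at x in [0, 1], with lam in (0, 1)."""
    return cb_log_norm(lam) + x * math.log(lam) + (1.0 - x) * math.log(1.0 - lam)


def cb_total_mass(lam, steps=20000):
    """Midpoint-rule integral of the density over [0, 1]; should be close to 1."""
    h = 1.0 / steps
    return sum(math.exp(cb_log_prob((i + 0.5) * h, lam)) for i in range(steps)) * h
```

The extra normalizing term is what distinguishes this from the ordinary Bernoulli cross-entropy when the targets are continuous pixel intensities rather than binary values.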
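Meta-learning is an effective method for handling imbalanced and noisy-label learning, but it depends on a validation set containing randomly selected, manually labelled samples with a balanced class distribution. The random selection, manual labelling and balancing of this validation set is not only sub-optimal for meta-learning, but also scales poorly with the number of classes. Hence, recent meta-learning papers have proposed ad-hoc heuristics to automatically build and label this validation set, but these heuristics are still sub-optimal for meta-learning. In this paper, we analyze the meta-learning algorithm and propose new criteria to characterize the utility of the validation set, based on: 1) the informativeness of the validation set; 2) the class distribution balance of the set; and 3) the correctness of the labels of the set. Furthermore, we propose a new imbalanced noisy-label meta-learning (INOLML) algorithm that automatically builds a validation set by maximizing its utility under the criteria above. Our method shows significant improvements over previous meta-learning approaches and sets a new state of the art on several benchmarks.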