Out-of-distribution (OOD) detection has attracted considerable attention from the machine learning research community in recent years because of its importance in deployed systems. Most previous studies have focused on detecting OOD samples in the multi-class classification task, whereas OOD detection in the multi-label classification task remains underexplored. In this research, we propose YolOOD, a method that uses concepts from the object detection domain to perform OOD detection in the multi-label classification task. Object detection models have an inherent ability to distinguish between objects of interest (in-distribution) and irrelevant objects (e.g., OOD objects) in images that contain multiple objects from different categories. This ability allows us to convert a regular object detection model into an image classifier with inherent OOD detection capabilities with only minor changes. We compare our approach to state-of-the-art OOD detection methods and demonstrate that YolOOD outperforms these methods on a comprehensive suite of in-distribution and OOD benchmark datasets.
We construct a corpus of Japanese a cappella vocal ensembles (jaCappella corpus) for vocal ensemble separation and synthesis. It consists of 35 copyright-cleared vocal ensemble songs and audio recordings of their individual voice parts. These songs were arranged from out-of-copyright Japanese children's songs and have six voice parts (lead vocal, soprano, alto, tenor, bass, and vocal percussion). They are divided into seven subsets, each of which features typical characteristics of a music genre such as jazz or enka. This variety in genre and voice part matches the vocal ensembles that have recently become widespread on social media services such as YouTube, whereas conventional vocal ensemble datasets mainly target choral singing made up of soprano, alto, tenor, and bass. Experimental evaluation demonstrates that our corpus is a challenging resource for vocal ensemble separation. Our corpus is available on our project page (https://tomohikonakamura.github.io/jaCappella_corpus/).
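Maximum a posteriori (MAP) inference for determinantal point processes (DPPs) is crucial for selecting diverse items in many machine learning applications. Although DPP MAP inference is NP-hard, the greedy algorithm often finds high-quality solutions, and many researchers have studied its efficient implementation. One classical and practical method is the lazy greedy algorithm, which applies to general submodular function maximization, while a recent fast greedy algorithm based on the Cholesky factorization is more efficient for DPP MAP inference. This paper shows how to combine the ideas of "lazy" and "fast", which have been considered incompatible in the literature. Our lazy and fast greedy algorithm has almost the same time complexity as the current best algorithm and runs faster in practice. The "lazy + fast" idea extends to other greedy-type algorithms. We also give a fast version of the double greedy algorithm for unconstrained DPP MAP inference. Experiments validate the effectiveness of our acceleration ideas.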
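The "lazy" half of the idea in the DPP abstract above can be illustrated with a minimal sketch. It is not the authors' lazy and fast algorithm: it is the classical lazy greedy scheme for a cardinality-constrained problem, and it recomputes each log-determinant from scratch rather than updating a Cholesky factor incrementally. The function names `log_det` and `lazy_greedy_map`, the size limit `k`, and the toy kernel are assumptions made for illustration; the kernel is assumed to be symmetric positive definite.

```python
import heapq
import math


def log_det(L, idx):
    """Log-determinant of the principal submatrix L[idx][idx] via Cholesky.

    Assumes the submatrix is symmetric positive definite.
    """
    n = len(idx)
    C = [[0.0] * n for _ in range(n)]
    total = 0.0
    for r in range(n):
        for c in range(r + 1):
            s = L[idx[r]][idx[c]] - sum(C[r][k] * C[c][k] for k in range(c))
            if r == c:
                C[r][r] = math.sqrt(s)
                total += math.log(s)  # log of the squared diagonal entry
            else:
                C[r][c] = s / C[c][c]
    return total


def lazy_greedy_map(L, k):
    """Pick up to k items, greedily maximizing log det of the selected submatrix.

    Stale marginal gains are kept in a max-heap as upper bounds. Because
    log det is submodular, a gain can only shrink as the set grows, so an
    item whose refreshed gain still beats the next bound is the true argmax.
    """
    selected = []
    current = 0.0  # log det of the empty selection
    heap = [(-log_det(L, [i]), i) for i in range(len(L))]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        _, i = heapq.heappop(heap)
        gain = log_det(L, selected + [i]) - current  # refresh the stale bound
        if not heap or gain >= -heap[0][0]:
            selected.append(i)
            current += gain
        else:
            heapq.heappush(heap, (-gain, i))
    return selected


# Toy kernel (hypothetical): items 0 and 1 are near-duplicates, item 2 differs.
kernel = [
    [2.0, 1.8, 0.1],
    [1.8, 2.0, 0.1],
    [0.1, 0.1, 1.0],
]
chosen = lazy_greedy_map(kernel, 2)
```

On this toy kernel the second pick skips the near-duplicate of the first item, which is the diversity-seeking behavior the abstract refers to. Replacing the from-scratch `log_det` call with an incremental Cholesky update is where a "fast" variant would differ.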
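Endoscopic images typically contain several artifacts. These artifacts significantly affect the image analysis that underlies computer-aided diagnosis. Convolutional neural networks (CNNs), a type of deep learning model, can remove such artifacts. Various CNN architectures have been proposed, and the accuracy of artifact removal varies with the choice of architecture. It is therefore necessary to determine the artifact removal accuracy for each selected architecture. In this study, we focus on endoscopic surgical instruments as artifacts, and we determine and discuss the artifact removal accuracy of seven different CNN architectures.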
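Distributed inference (DI) frameworks have gained traction as a technique for real-time applications that run cutting-edge deep machine learning (ML) on resource-constrained Internet of Things (IoT) devices. In DI, computational tasks are offloaded from the IoT device to an edge server over lossy IoT networks. In general, however, there is a communication-system-level trade-off between communication latency and reliability; thus, to provide accurate DI results, a reliable but high-latency communication system must be adopted, which results in non-negligible end-to-end latency for DI. This motivates us to improve the trade-off between communication latency and accuracy through ML techniques. Specifically, we propose communication-oriented model tuning (ComTune), which aims to achieve highly accurate DI over low-latency but unreliable communication links. The key idea of ComTune is to fine-tune the model while emulating the effect of unreliable communication links through the dropout technique. This makes the DI system robust to unreliable communication links. Our ML experiments show that ComTune enables accurate predictions with low latency over lossy networks.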
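The dropout idea in the ComTune abstract above can be sketched as follows. This is only an illustration of emulating a lossy link with standard inverted dropout, not the authors' implementation: the function name `emulate_lossy_link`, the per-element loss model, and the rescaling of surviving values by `1 / (1 - loss_rate)` are all assumptions.

```python
import random


def emulate_lossy_link(activations, loss_rate, rng):
    """Zero each transmitted value with probability loss_rate.

    Surviving values are rescaled by 1 / (1 - loss_rate), as in standard
    inverted dropout, so the expected value of each element is unchanged.
    Whether ComTune rescales in this way is an assumption.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    keep = 1.0 - loss_rate
    return [a / keep if rng.random() >= loss_rate else 0.0 for a in activations]


# Hypothetical intermediate features sent from the IoT device to the edge server.
features = [1.0, 2.0, 3.0, 4.0]
received = emulate_lossy_link(features, 0.5, random.Random(0))
lossless = emulate_lossy_link(features, 0.0, random.Random(0))
```

During fine-tuning, a layer like this would sit at the point where the model is split between device and server, so that the server-side part learns to tolerate missing values.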