Smart Paper Notes

Evacuation Shelter Scheduling Problem

Hitoshi Shimizu, Hirohiko Suwa, Tomoharu Iwata, Akinori Fujino, Hiroshi Sawada, Keiichi Yasumoto

Category: Artificial Intelligence

2021-11-26

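Evacuation shelters, which are urgently needed during natural disasters, are intended to minimize the burden of evacuation on survivors. However, the larger the scale of the disaster, the higher the cost of operating shelters. As the number of evacuees decreases, operating costs can be reduced by moving the remaining evacuees to other shelters and closing shelters as quickly as possible. On the other hand, relocation between shelters places a heavy emotional burden on evacuees. In this study, we formulate the "Evacuation Shelter Scheduling Problem," which assigns evacuees to shelters so as to minimize both the movement costs of the evacuees and the operating costs of the shelters. Because this quadratic programming problem is difficult to solve directly, we transform it into a 0-1 integer programming problem. In addition, such a formulation makes it hard to calculate the burden of relocation from historical data, because no payments are actually made. To address this issue, we propose a method that estimates movement costs from the numbers of evacuees and shelters during an actual disaster. Simulation experiments with records from the Kobe earthquake (Great Hanshin-Awaji Earthquake) show that our proposed method reduces operating costs by 33.7 million dollars: 32%.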
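
The trade-off between shelter operating cost and relocation cost can be illustrated with a toy assignment model. The following is a minimal sketch under stated assumptions, not the authors' formulation: the function names (`schedule_cost`, `best_schedule`), the cost constants, the capacity rule, and the two-period horizon are all hypothetical, and the search is brute-force enumeration rather than an integer programming solver. Each tuple entry `assign[t][g]` stands in for a one-hot group of 0-1 variables.

```python
from itertools import product

def schedule_cost(assign, present, operate_cost, move_cost):
    """Total cost of a schedule.

    assign[t][g] is the shelter index of evacuee group g in period t.
    present[t] is the set of groups still evacuated in period t.
    """
    total = 0
    for t, groups in enumerate(present):
        # Pay for every shelter that hosts at least one present group.
        total += operate_cost * len({assign[t][g] for g in groups})
        if t > 0:
            # Pay for every group that stays evacuated but changes shelter.
            stayed = groups & present[t - 1]
            total += move_cost * sum(
                assign[t][g] != assign[t - 1][g] for g in stayed
            )
    return total

def best_schedule(initial, present, n_shelters, capacity, operate_cost, move_cost):
    """Enumerate all schedules after the fixed initial period; return (cost, assign)."""
    n_groups = len(initial)
    one_period = list(product(range(n_shelters), repeat=n_groups))
    best = None
    for tail in product(one_period, repeat=len(present) - 1):
        assign = [tuple(initial), *tail]
        # Hypothetical capacity rule: at most `capacity` present groups per shelter.
        feasible = all(
            sum(assign[t][g] == s for g in present[t]) <= capacity
            for t in range(len(present))
            for s in range(n_shelters)
        )
        if not feasible:
            continue
        cost = schedule_cost(assign, present, operate_cost, move_cost)
        if best is None or cost < best[0]:
            best = (cost, assign)
    return best

# Three groups, two shelters; group 1 goes home after the first period.
present = [{0, 1, 2}, {0, 2}]
cheap_move = best_schedule((0, 0, 1), present, 2, 2, operate_cost=10, move_cost=3)
costly_move = best_schedule((0, 0, 1), present, 2, 2, operate_cost=10, move_cost=15)
print(cheap_move[0], costly_move[0])
```

When moving is cheap relative to keeping a shelter open, the best schedule consolidates the remaining groups into one shelter; when moving is expensive, both shelters stay open. Enumeration grows exponentially with the number of groups and periods, which is why a real instance would need a solver.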

translated by Google Translate

Edema Estimation From Facial Images Taken Before and After Dialysis via Contrastive Multi-Patient Pre-Training

Yusuke Akamatsu, Yoshifumi Onishi, Hitoshi Imaoka, Junko Kameyama, Hideo Tsurushima

Category: Computer Vision

2022-12-15

Edema is a common symptom of kidney disease, and quantitative measurement of edema is desired. This paper presents a method to estimate the degree of edema from facial images taken before and after dialysis of renal failure patients. As tasks to estimate the degree of edema, we perform pre- and post-dialysis classification and body weight prediction. We develop a multi-patient pre-training framework for acquiring knowledge of edema and transfer the pre-trained model to a model for each patient. For effective pre-training, we propose a novel contrastive representation learning method, called weight-aware supervised momentum contrast (WeightSupMoCo). WeightSupMoCo aims to bring the feature representations of facial images closer together the more similar the patient weights are, when the pre- and post-dialysis labels are the same. Experimental results show that our pre-training approach improves the accuracy of pre- and post-dialysis classification by 15.1% and reduces the mean absolute error of weight prediction by 0.243 kg compared with training from scratch. The proposed method accurately estimates the degree of edema from facial images; our edema estimation system could thus be beneficial to dialysis patients.

Blood Oxygen Saturation Estimation from Facial Video via DC and AC components of Spatio-temporal Map

Yusuke Akamatsu, Yoshifumi Onishi, Hitoshi Imaoka

Category: Computer Vision

2022-12-14

Peripheral blood oxygen saturation (SpO2), an indicator of oxygen levels in the blood, is one of the most important physiological parameters. Although SpO2 is usually measured using a pulse oximeter, non-contact SpO2 estimation methods from facial or hand videos have been attracting attention in recent years. In this paper, we propose an SpO2 estimation method from facial videos based on convolutional neural networks (CNN). Our method constructs CNN models that consider the direct current (DC) and alternating current (AC) components extracted from the RGB signals of facial videos, which are important in the principle of SpO2 estimation. Specifically, we extract the DC and AC components from the spatio-temporal map using filtering processes and train CNN models to predict SpO2 from these components. We also propose an end-to-end model that predicts SpO2 directly from the spatio-temporal map by extracting the DC and AC components via convolutional layers. Experiments using facial videos and SpO2 data from 50 subjects demonstrate that the proposed method achieves a better estimation performance than current state-of-the-art SpO2 estimation methods.
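
The DC and AC components mentioned above are, in conventional pulse-oximetry terms, the slowly varying baseline and the pulsatile part of each colour signal. The following is a minimal sketch of that classical signal-processing idea, not the CNN models of the paper: the moving-average filter, the window length, the choice of red and blue channels, the synthetic signals, and the function names are all assumptions made for illustration.

```python
import math
from statistics import fmean, pstdev

def moving_average(signal, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(fmean(signal[lo:hi]))
    return out

def split_dc_ac(signal, window):
    """DC = low-pass (moving average); AC = what remains after removing DC."""
    dc = moving_average(signal, window)
    ac = [x - d for x, d in zip(signal, dc)]
    return dc, ac

def ratio_of_ratios(red, blue, window):
    """(AC/DC of one channel) divided by (AC/DC of the other).

    The AC amplitude is summarised by its standard deviation and the DC
    level by its mean; both summaries are assumptions of this sketch.
    """
    dc_r, ac_r = split_dc_ac(red, window)
    dc_b, ac_b = split_dc_ac(blue, window)
    return (pstdev(ac_r) / fmean(dc_r)) / (pstdev(ac_b) / fmean(dc_b))

# Synthetic traces: a constant baseline plus a 1.5 Hz pulse, sampled at 30 fps.
fps, seconds, pulse_hz = 30, 10, 1.5
t = [i / fps for i in range(fps * seconds)]
red = [100 + 1.0 * math.sin(2 * math.pi * pulse_hz * s) for s in t]
blue = [80 + 0.4 * math.sin(2 * math.pi * pulse_hz * s) for s in t]
print(ratio_of_ratios(red, blue, window=21))
```

In calibration-based approaches such a ratio is mapped to SpO2 with coefficients fitted against a reference oximeter; no coefficients are given here because the abstract does not state any.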

Counterfactual Learning with General Data-generating Policies

Yusuke Narita, Kyohei Okumura, Akihiro Shimizu, Kohei Yata

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2022-12-04

Off-policy evaluation (OPE) attempts to predict the performance of counterfactual policies using log data from a different policy. We extend its applicability by developing an OPE method for a class of both full support and deficient support logging policies in contextual-bandit settings. This class includes deterministic bandit algorithms (such as Upper Confidence Bound) as well as deterministic decision-making based on supervised and unsupervised learning. We prove that our method's prediction converges in probability to the true performance of a counterfactual policy as the sample size increases. We validate our method with experiments on partly and entirely deterministic logging policies. Finally, we apply it to evaluate coupon targeting policies by a major online platform and show how to improve the existing policy.
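
For context, the standard inverse propensity scoring (IPS) estimator shows why support matters in OPE. The following is a minimal sketch of that textbook baseline, not the method proposed in the paper: the log format, the function names, and the toy data are assumptions for illustration.

```python
def ips_estimate(logs, target_prob):
    """Inverse propensity scoring estimate of a target policy's value.

    logs: list of (context, action, reward, logging_prob) tuples, where
    logging_prob is the probability the logging policy gave to the logged action.
    target_prob(context, action): probability under the policy being evaluated.
    """
    total = 0.0
    for context, action, reward, logging_prob in logs:
        if logging_prob <= 0.0:
            raise ValueError("a logged action must have positive logging probability")
        total += reward * target_prob(context, action) / logging_prob
    return total / len(logs)

def has_deficient_support(contexts, actions, logging_prob, target_prob):
    """True if the target policy can take an action the logging policy never takes."""
    return any(
        target_prob(c, a) > 0 and logging_prob(c, a) == 0
        for c in contexts
        for a in actions
    )

# One context, two actions. Logging policy: action 0 w.p. 0.75, action 1 w.p. 0.25.
# Rewards are deterministic in this toy log: action 0 pays 1.0, action 1 pays 3.0.
logs = [("c", 0, 1.0, 0.75)] * 3 + [("c", 1, 3.0, 0.25)]

def always_one(context, action):
    return 1.0 if action == 1 else 0.0

def uniform(context, action):
    return 0.5

def deterministic_logger(context, action):
    return 1.0 if action == 0 else 0.0

print(ips_estimate(logs, always_one), ips_estimate(logs, uniform))
print(has_deficient_support(["c"], [0, 1], deterministic_logger, always_one))
```

With a deterministic logging policy, some actions the target policy would take never appear in the log, so no reweighting of logged rewards can recover their value. That deficient-support case is the one the abstract addresses, and this sketch does not handle it.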

GANStrument: Adversarial Instrument Sound Synthesis with Pitch-invariant Instance Conditioning

Gaku Narita, Junichi Shimizu, Taketo Akama

Category: Machine Learning

2022-11-10

We propose GANStrument, a generative adversarial model for instrument sound synthesis. Given a one-shot sound as input, it is able to generate pitched instrument sounds that reflect the timbre of the input within an interactive time. By exploiting instance conditioning, GANStrument achieves better fidelity and diversity of synthesized sounds and generalization ability to various inputs. In addition, we introduce an adversarial training scheme for a pitch-invariant feature extractor that significantly improves the pitch accuracy and timbre consistency. Experimental results show that GANStrument outperforms strong baselines that do not use instance conditioning in terms of generation quality and input editability. Qualitative examples are available online.

Fashion-Specific Attributes Interpretation via Dual Gaussian Visual-Semantic Embedding

Ryotaro Shimizu, Masanari Kimura, Masayuki Goto

Categories: Computer Vision | Machine Learning

2022-10-28

Several techniques to map various types of components, such as words, attributes, and images, into an embedded space have been studied. Most of them estimate the embedded representation of a target entity as a point in the projective space. Some models, such as Word2Gauss, assume a probability distribution behind the embedded representation, which enables the spread or variance of the meaning of embedded target components to be captured and considered in more detail. We examine the method of estimating embedded representations as probability distributions for the interpretation of fashion-specific abstract and difficult-to-understand terms. Terms such as "casual," "adult-casual," "beauty-casual," and "formal" are extremely subjective and abstract and are difficult for both experts and non-experts to understand, which discourages users from trying new fashion. We propose an end-to-end model called dual Gaussian visual-semantic embedding, which maps images and attributes into the same projective space and enables the meaning of these terms to be interpreted through its broad applications. We demonstrate the effectiveness of the proposed method through multifaceted experiments involving image and attribute mapping, image retrieval and re-ordering techniques, and a detailed theoretical/analytical discussion of the distance measure included in the loss function.

On the Adversarial Transferability of ConvMixer Models

Ryota Iijima, Miki Tanaka, Isao Echizen, Hitoshi Kiya

Category: Machine Learning

2022-09-19

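Deep neural networks (DNNs) are well known to be vulnerable to adversarial examples (AEs). In addition, AEs have adversarial transferability, which means that AEs generated for a source model can fool another black-box model (the target model) with a non-trivial probability. In this paper, we investigate for the first time the property of adversarial transferability between models including ConvMixer. To verify the property of transferability objectively, the robustness of the models is evaluated using a benchmark attack method called AutoAttack. In an image classification experiment, ConvMixer is confirmed to be weak to adversarial transferability.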

StyleGAN Encoder-Based Attack for Block Scrambled Face Images

AprilPyone MaungMaung, Hitoshi Kiya

Category: Computer Vision

2022-09-16

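In this paper, we propose an attack method against block scrambled face images, particularly Encryption-then-Compression (EtC) applied images, that for the first time utilizes the existing powerful StyleGAN encoder and decoder. Instead of reconstructing images identical to the plain ones from encrypted images, we focus on recovering styles that can reveal identifiable information from the encrypted images. The proposed method trains an encoder on pairs of plain and encrypted images with a particular training strategy. While state-of-the-art attack methods cannot recover any perceptual information from EtC images, the proposed method discloses personally identifiable information such as hair color, skin color, eyeglasses, and gender. The results show that the reconstructed images have some perceptual similarity to the plain images.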

Model interpretation using improved local regression with variable importance

Gilson Y. Shimizu, Rafael Izbicki, Andre C. P. L. F. de Carvalho

Categories: Machine Learning (Statistics) | Machine Learning

2022-09-12

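A fundamental question about the use of ML models concerns the explanation of their predictions in order to increase transparency in decision-making. Although several interpretability methods have emerged, some gaps regarding the reliability of their explanations have been identified. For instance, most methods are unstable (meaning that they give very different explanations under small changes in the data) and do not cope well with irrelevant features (that is, features unrelated to the label). This article introduces two new interpretability methods, VarImp and SupClus, which overcome these issues by using local regression fits with a weighted distance that takes variable importance into account. VarImp generates an explanation for each instance and can be applied to datasets with more complex relationships, whereas SupClus interprets clusters of instances with similar explanations and can be applied to simpler datasets in which clusters can be found. We compare our methods with state-of-the-art approaches and show that they yield better explanations according to several metrics, particularly in high-dimensional problems with irrelevant features and when the relationship between features and target is non-linear.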
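
The idea of a local regression whose neighbourhood is defined by an importance-weighted distance can be sketched as follows. This is a minimal illustration under stated assumptions, not the VarImp or SupClus algorithms: the Gaussian kernel, the way importance enters the distance, the single-feature slope, the toy data, and all function names are hypothetical.

```python
import math

def weighted_distance(x, x0, importance):
    """Euclidean distance in which each feature is scaled by its importance."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, x0, importance)))

def kernel_weights(X, x0, importance, bandwidth):
    """Gaussian kernel weights of all rows of X around the query point x0."""
    return [
        math.exp(-((weighted_distance(x, x0, importance) / bandwidth) ** 2))
        for x in X
    ]

def local_slope(X, y, weights, j):
    """Weighted least-squares slope of y on feature j alone (a simplification)."""
    sw = sum(weights)
    mx = sum(w * x[j] for w, x in zip(weights, X)) / sw
    my = sum(w * t for w, t in zip(weights, y)) / sw
    cov = sum(w * (x[j] - mx) * (t - my) for w, x, t in zip(weights, X, y))
    var = sum(w * (x[j] - mx) ** 2 for w, x in zip(weights, X))
    return cov / var

# Feature 0 drives the target non-linearly; feature 1 is irrelevant noise.
X = [(i / 10, float((7 * i) % 11)) for i in range(21)]
y = [x[0] ** 2 for x in X]
query = (1.0, 0.0)

aware = kernel_weights(X, query, importance=(1.0, 0.0), bandwidth=0.2)
naive = kernel_weights(X, query, importance=(1.0, 1.0), bandwidth=0.2)

print(sum(w > 0.1 for w in aware), sum(w > 0.1 for w in naive))
print(local_slope(X, y, aware, j=0))
```

With zero importance on the noise feature, the neighbourhood contains all points close to the query in the relevant feature, and the local slope approximates the derivative of the target there. With equal importance, the noise feature pushes almost every point out of the neighbourhood, which is one way irrelevant features can destabilise local explanations.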

DM$^2$S$^2$: Deep Multi-Modal Sequence Sets with Hierarchical Modality Attention

Shunsuke Kitada, Yuki Iwazaki, Riku Togashi, Hitoshi Iyatomi

Categories: Artificial Intelligence | Natural Language Processing | Computer Vision | Machine Learning

2022-09-07

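There is increasing interest in the use of multimodal data in various web applications, such as digital advertising and e-commerce. Typical methods for extracting important information from multimodal data rely on a mid-fusion architecture that combines the feature representations from multiple encoders. However, as the number of modalities increases, several potential problems with the mid-fusion model structure arise, such as an increase in the dimensionality of the concatenated multimodal features and missing modalities. To address these problems, we propose a new concept that treats multimodal inputs as a set of sequences, namely deep multimodal sequence sets (DM$^2$S$^2$). Our set-aware concept consists of three components that capture the relationships among multiple modalities: (a) a BERT-based encoder to handle the inter- and intra-order of elements in the sequences, (b) intra-modality residual attention (IntraMRA) to capture the importance of the elements within a modality, and (c) inter-modality residual attention (InterMRA) to further enhance the importance of elements with modality-level granularity. Our concept exhibits performance that is comparable to or better than that of previous set-aware models. Furthermore, we demonstrate that visualizing the learned InterMRA and IntraMRA weights can provide an interpretation of the prediction results.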
