Remote photoplethysmography (rPPG) enables non-contact heart rate (HR) estimation from facial videos, which is considerably more convenient than traditional contact-based measurement. In real-world long-term health monitoring, the distance between participants and the camera and the participants' head movements usually vary over time, and the resulting changes in face resolution and complex motion artifacts make rPPG measurement inaccurate. Unlike previous rPPG models, which are designed for a constant distance between camera and participant, we propose two plug-and-play blocks, a physiological signal feature extraction block (PFE) and a temporal face alignment block (TFA), to alleviate the degradation caused by changing distance and head motion. On the one hand, guided by representative-area information, PFE adaptively encodes facial frames of arbitrary resolution into fixed-resolution facial structure features. On the other hand, leveraging the estimated optical flow, TFA counteracts the rPPG signal confusion caused by head movement, thus benefiting motion-robust rPPG signal recovery. In addition, we train the model with a cross-resolution constraint using a two-stream dual-resolution framework, which further helps PFE learn resolution-robust facial rPPG features. Extensive experiments on three benchmark datasets (UBFC-rPPG, COHFACE and PURE) demonstrate the superior performance of the proposed method. One highlight is that with PFE and TFA, off-the-shelf spatio-temporal rPPG models can predict more robust rPPG signals under both varying face resolution and severe head movement. The code is available at https://github.com/LJW-GIT/Arbitrary_Resolution_rPPG.
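As a rough illustration of the final step in HR estimation, the sketch below reads a heart rate off an already-recovered rPPG signal by taking the strongest frequency inside a plausible pulse band. It does not implement PFE or TFA. The function name, the 0.7-4.0 Hz band, and the synthetic 30 fps signal are assumptions chosen for the example, not details from the abstract.

```python
import math

def estimate_heart_rate_bpm(signal, fps, min_hz=0.7, max_hz=4.0):
    """Return the dominant frequency of `signal` inside [min_hz, max_hz], in beats per minute.

    Hypothetical helper: a plain discrete Fourier transform over the pulse band.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC component
    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * fps / n  # frequency of DFT bin k in Hz
        if freq < min_hz or freq > max_hz:
            continue
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(centered))
        im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq * 60.0

# Synthetic 10-second "pulse" at 1.2 Hz sampled at an assumed 30 frames per second.
fps = 30
synthetic = [math.sin(2 * math.pi * 1.2 * t / fps) for t in range(fps * 10)]
hr = estimate_heart_rate_bpm(synthetic, fps)
print(round(hr, 1))
```

With a 10-second window the frequency resolution is 0.1 Hz (6 bpm), which is why practical pipelines often use longer windows or interpolate around the peak.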
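Face anti-spoofing (FAS) and face forgery detection play vital roles in protecting face biometric systems from presentation attacks (PAs) and malicious digital manipulation (e.g., deepfakes). Despite promising performance with large-scale data and powerful deep models, the generalization of existing methods remains an open problem. Most recent approaches focus on 1) unimodal visual appearance or physiological (i.e., remote photoplethysmography (rPPG)) cues; and 2) separate feature representations for FAS or face forgery detection. On the one hand, unimodal appearance and rPPG features are vulnerable to high-fidelity 3D face mask attacks and video replay attacks, respectively, which motivates us to design a reliable multi-modal fusion mechanism for generalized face attack detection. On the other hand, FAS and face forgery detection share rich common features (e.g., periodic rPPG rhythms and vanilla appearance for bona fide faces), which provides solid evidence for designing a joint FAS and face forgery detection system in a multi-task learning fashion. In this paper, we establish the first joint face spoofing and forgery detection benchmark using both visual appearance and physiological rPPG cues. To enhance rPPG periodicity discrimination, we design a two-branch physiological network that takes both the facial spatio-temporal rPPG signal map and its continuous wavelet transform as inputs. To mitigate modality bias and improve fusion efficacy, we apply weighted batch and layer normalization to both appearance and rPPG features before multi-modal fusion. We find that the generalization capacity of both unimodal (appearance or rPPG) and multi-modal (appearance+rPPG) models can be improved by joint training on the two tasks. We hope this new benchmark will facilitate future research in both the FAS and deepfake detection communities.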
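We present UWash, an intelligent solution on smartwatches for assessing handwashing, with the aim of raising users' awareness of high-quality handwashing and helping them build the habit. UWash can identify the onset and offset of handwashing, measure the duration of each gesture, and score each gesture as well as the entire procedure according to the WHO guidelines. Technically, we formulate handwashing assessment as a semantic segmentation problem, as in computer vision, and propose a lightweight UNet-like network of only 496 KBits to solve it effectively. Experiments with 51 subjects show that UWash achieves an accuracy of 92.27% on sample-wise handwashing gesture recognition, an error of less than 0.5 seconds in onset/offset detection, and a scoring error of less than 5 points out of 100 in the user-dependent setting, while remaining promising in the cross-user evaluation and the cross-user-cross-location evaluation.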
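Once a segmentation-style network has assigned a gesture label to every sensor sample, onset, offset, and per-gesture durations follow from simple run-length bookkeeping. The sketch below shows that post-processing step only. The label encoding (0 for "not washing"), the 50 Hz sample rate, and both function names are assumptions made for illustration, not details taken from the abstract.

```python
from itertools import groupby

def segment_labels(labels, sample_rate_hz, background=0):
    """Collapse per-sample labels into (label, start_s, end_s) runs, skipping background."""
    segments = []
    index = 0
    for label, run in groupby(labels):
        length = len(list(run))
        if label != background:
            start_s = index / sample_rate_hz
            end_s = (index + length) / sample_rate_hz
            segments.append((label, start_s, end_s))
        index += length
    return segments

def summarize(segments):
    """Return (onset_s, offset_s, total duration per gesture label)."""
    if not segments:
        return None
    durations = {}
    for label, start_s, end_s in segments:
        durations[label] = durations.get(label, 0.0) + (end_s - start_s)
    return segments[0][1], segments[-1][2], durations

# Hypothetical prediction: 1 s idle, 2 s of gesture 1, 3 s of gesture 2, 0.5 s idle.
labels = [0] * 50 + [1] * 100 + [2] * 150 + [0] * 25
segments = segment_labels(labels, sample_rate_hz=50)
onset, offset, durations = summarize(segments)
print(onset, offset, durations)
```

A scoring rule could then compare each entry of `durations` against a recommended duration per gesture; the abstract does not spell out that rule, so it is left out here.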
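Remote photoplethysmography (rPPG), which aims to measure cardiac activity and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing). Recent deep learning approaches focus on mining subtle rPPG cues with convolutional neural networks that have limited spatio-temporal receptive fields, which neglects long-range spatio-temporal perception and interaction in rPPG modeling. In this paper, we propose PhysFormer, an end-to-end video transformer based architecture that adaptively aggregates both local and global spatio-temporal features to enhance the rPPG representation. As the key modules in PhysFormer, the temporal difference transformers first enhance the quasi-periodic rPPG features with temporal-difference-guided global attention, and then refine the local spatio-temporal representation against interference. Furthermore, we propose label distribution learning and a curriculum-learning-inspired dynamic constraint in the frequency domain, which provide elaborate supervision for PhysFormer and alleviate overfitting. Comprehensive experiments on four benchmark datasets show superior performance in both intra-dataset and cross-dataset testing. One highlight is that, unlike most transformer networks, which need pretraining on large-scale datasets, the proposed PhysFormer can be trained from scratch on rPPG datasets, which makes it promising as a novel transformer baseline for the rPPG community. The code will be released at https://github.com/zitongyu/physformer.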
The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which hinders graph-based account detection research. To address these issues, we propose the Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB is built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we use the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we perform a thorough evaluation of MGTAB and other public datasets. Our experiments show that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and suggest potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
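Selecting the property features with the greatest information gain can be sketched for discrete features as follows. This is a minimal illustration of the standard entropy-based definition, not MGTAB's actual pipeline. The feature names and the four toy accounts are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def top_k_features(features, labels, k):
    """Rank feature names by information gain and keep the k best."""
    scores = {name: information_gain(values, labels) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data: four accounts with two hypothetical binary profile features.
labels = ["bot", "bot", "human", "human"]
features = {
    "default_profile_image": [1, 1, 0, 0],  # separates the classes perfectly
    "verified": [0, 1, 0, 1],               # carries no information about the class
}
ranked = top_k_features(features, labels, k=1)
print(ranked)
```

Continuous properties such as follower counts would need to be binned before this definition applies; how the benchmark handles them is not stated in the abstract.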
Image virtual try-on aims to replace the clothing in a person image with a garment image (in-shop clothes), and has attracted increasing attention from the multimedia and computer vision communities. Prior methods successfully preserve the characteristics of clothing images; however, occlusion remains a pernicious obstacle to realistic virtual try-on. In this work, we first present a comprehensive analysis of occlusions and categorize them into two types: i) Inherent-Occlusion: the ghost of the former cloth still exists in the try-on image; ii) Acquired-Occlusion: the target cloth is warped onto an unreasonable body part. Based on this analysis, we find that occlusions can be simulated by a novel semantically-guided mixup module, which generates semantic-specific occluded images that work together with the try-on images to facilitate training a de-occlusion try-on (DOC-VTON) framework. Specifically, DOC-VTON first performs sharpened semantic parsing on the try-on person. Aided by semantic guidance and a pose prior, textures of various complexity are selectively blended with human parts in a copy-and-paste manner. Then, the Generative Module (GM) synthesizes the final try-on image and jointly learns de-occlusion. Compared with state-of-the-art methods, DOC-VTON achieves better perceptual quality by reducing occlusion effects.
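The copy-and-paste blending guided by a semantic parsing map can be pictured with a toy example. The sketch below pastes texture pixels into an image wherever the parsing map carries a chosen part label. It is a simplified stand-in for the idea, not the DOC-VTON mixup module: the label value, the function name, and the 2x2 single-channel "images" are all assumptions.

```python
ARM_LABEL = 2  # hypothetical label id for the body part to occlude

def paste_texture(image, texture, parsing, target_label):
    """Copy `texture` pixels into `image` wherever `parsing` equals `target_label`."""
    return [
        [tex if lab == target_label else img
         for img, tex, lab in zip(image_row, texture_row, parsing_row)]
        for image_row, texture_row, parsing_row in zip(image, texture, parsing)
    ]

# Toy 2x2 grayscale inputs.
image = [[10, 20],
         [30, 40]]
texture = [[99, 98],
           [97, 96]]
parsing = [[0, ARM_LABEL],
           [ARM_LABEL, 1]]

occluded = paste_texture(image, texture, parsing, ARM_LABEL)
print(occluded)
```

Pairs of `occluded` and the clean image could then serve as training examples for a de-occlusion objective.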
Dynamic treatment regimes assign personalized treatments to patients sequentially over time based on their baseline information and time-varying covariates. In mobile health applications, these covariates are typically collected at different frequencies over a long time horizon. In this paper, we propose a deep spectral Q-learning algorithm, which integrates principal component analysis (PCA) with deep Q-learning to handle the mixed frequency data. In theory, we prove that the mean return under the estimated optimal policy converges to that under the optimal one and establish its rate of convergence. The usefulness of our proposal is further illustrated via simulations and an application to a diabetes dataset.
As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, prevalent data-driven techniques such as large-scale language models suffer from inadequate data and biased corpora, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation (CORGI-PM), which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges for automatic textual gender bias mitigation, which require models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To the best of our knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.
Off-policy evaluation (OPE) is a method for estimating the return of a target policy using some pre-collected observational data generated by a potentially different behavior policy. In some cases, there may be unmeasured variables that can confound the action-reward or action-next-state relationships, rendering many existing OPE approaches ineffective. This paper develops an instrumental variable (IV)-based method for consistent OPE in confounded Markov decision processes (MDPs). Similar to single-stage decision making, we show that IV enables us to correctly identify the target policy's value in infinite horizon settings as well. Furthermore, we propose an efficient and robust value estimator and illustrate its effectiveness through extensive simulations and analysis of real data from a world-leading short-video platform.
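For intuition about the single-stage case mentioned above, the sketch below contrasts a naive regression slope with the classical ratio-of-covariances IV estimate on a small deterministic dataset. It is not the paper's estimator for confounded MDPs. The variable names and the data-generating equations are assumptions constructed so that the true effect of the action on the reward is 2.

```python
def covariance(a, b):
    """Sample covariance (population form) of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

def iv_estimate(instrument, action, reward):
    """Classical IV (Wald) estimate: cov(Z, Y) / cov(Z, X)."""
    return covariance(instrument, reward) / covariance(instrument, action)

def naive_estimate(action, reward):
    """Ordinary least-squares slope of reward on action, ignoring confounding."""
    return covariance(action, reward) / covariance(action, action)

# Deterministic toy data. The confounder U affects both action and reward,
# while the instrument Z affects the reward only through the action.
instrument = [0, 0, 1, 1]
confounder = [-1, 1, -1, 1]                               # unobserved in practice
action = [z + u for z, u in zip(instrument, confounder)]  # X = Z + U
reward = [2 * x + 3 * u for x, u in zip(action, confounder)]  # Y = 2X + 3U

iv_effect = iv_estimate(instrument, action, reward)
naive_effect = naive_estimate(action, reward)
print(iv_effect, naive_effect)
```

The IV estimate recovers the effect of 2 because the instrument is uncorrelated with the confounder in this sample, whereas the naive slope absorbs the confounder's contribution.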
Off-policy evaluation (OPE) is concerned with evaluating a new target policy using offline data generated by a potentially different behavior policy. It is critical in a number of sequential decision making problems, ranging from healthcare to the technology industry. Most existing work focuses on evaluating the mean outcome of a given policy and ignores the variability of the outcome. However, in a variety of applications, criteria other than the mean may be more sensible. For example, when the reward distribution is skewed and asymmetric, quantile-based metrics are often preferred for their robustness. In this paper, we propose a doubly-robust inference procedure for quantile OPE in sequential decision making and study its asymptotic properties. In particular, we propose using state-of-the-art deep conditional generative learning methods to handle parameter-dependent nuisance function estimation. We demonstrate the advantages of the proposed estimator through both simulations and a real-world dataset from a short-video platform. In particular, we find that the proposed estimator outperforms classical OPE estimators for the mean in settings with heavy-tailed reward distributions.
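To see why a quantile can be a more stable target than the mean under heavy tails, the sketch below computes an importance-weighted median and an importance-weighted mean for a one-step (bandit-style) toy log. It uses plain importance weighting only, not the doubly-robust procedure proposed in the paper. The function names, the probabilities, and the rewards are assumptions.

```python
def weighted_quantile(rewards, weights, tau):
    """Smallest reward whose cumulative normalized weight reaches `tau`."""
    pairs = sorted(zip(rewards, weights))
    threshold = tau * sum(w for _, w in pairs)
    cumulative = 0.0
    for reward, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return reward
    return pairs[-1][0]

def weighted_mean(rewards, weights):
    """Self-normalized importance-weighted mean."""
    return sum(r * w for r, w in zip(rewards, weights)) / sum(weights)

# Logged rewards with one extreme outlier, plus assumed action probabilities.
rewards = [1.0, 2.0, 3.0, 100.0]
behavior_probs = [0.5, 0.5, 0.5, 0.5]   # probability of the logged action under the behavior policy
target_probs = [0.5, 0.5, 0.75, 0.25]   # probability of the same action under the target policy
weights = [t / b for t, b in zip(target_probs, behavior_probs)]

median_estimate = weighted_quantile(rewards, weights, tau=0.5)
mean_estimate = weighted_mean(rewards, weights)
print(median_estimate, mean_estimate)
```

The single outlier pulls the weighted mean far above the typical reward, while the weighted median stays at a typical value.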