We study estimation and testing in the Poisson regression model with noisy high-dimensional covariates, which has wide applications in analyzing noisy big data. Correcting for the estimation bias due to the covariate noise leads to a non-convex target function to minimize. Handling the high dimensionality further leads us to augment the target function with an amenable penalty term. We propose to estimate the regression parameter by minimizing the penalized target function. We derive the L1 and L2 convergence rates of the estimator and prove variable selection consistency. We further establish the asymptotic normality of any subset of the parameters, where the subset can have infinitely many components as long as its cardinality grows sufficiently slowly. We develop Wald and score tests based on the asymptotic normality of the estimator, which permit testing of linear functions of the members of the subset. We examine the finite-sample performance of the proposed tests through extensive simulation. Finally, the proposed method is successfully applied to the Alzheimer's Disease Neuroimaging Initiative study, which initially motivated this work.
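To illustrate why the bias correction makes the target function non-convex, here is a minimal sketch of a corrected Poisson loss. It rests on assumptions that the abstract does not state: additive covariate noise W = X + U with U ~ N(0, Sigma) and Sigma known, in which case exp(w'beta - beta'Sigma beta / 2) is unbiased for exp(x'beta). An L1 penalty stands in for the "amenable penalty", and the function name is hypothetical.

```python
import math

def corrected_poisson_loss(beta, W, y, Sigma, lam=0.0):
    """Average corrected Poisson loss plus an L1 penalty (illustrative only).

    Assumes W = X + U with U ~ N(0, Sigma) and Sigma known; these are
    assumptions of this sketch, not details taken from the abstract.
    """
    p = len(beta)
    # Quadratic correction 0.5 * beta' Sigma beta; subtracting it inside the
    # exponential removes the noise-induced bias but destroys convexity.
    quad = 0.5 * sum(beta[j] * Sigma[j][k] * beta[k]
                     for j in range(p) for k in range(p))
    total = 0.0
    for w_i, y_i in zip(W, y):
        eta = sum(w * b for w, b in zip(w_i, beta))  # linear predictor w_i' beta
        total += math.exp(eta - quad) - y_i * eta
    # L1 penalty used here as a stand-in for the paper's amenable penalty.
    return total / len(y) + lam * sum(abs(b) for b in beta)
```

With Sigma set to the zero matrix, the function reduces to the usual (naive) Poisson negative log-likelihood, up to terms that do not depend on beta.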
translated by Google Translate
The detection of the human body and its related parts (e.g., face, head, or hands) has been intensively studied and greatly improved since the breakthrough of deep CNNs. However, most of these detectors are trained independently, making it challenging to associate detected body parts with people. This paper focuses on the joint detection of the human body and its corresponding parts. Specifically, we propose a novel extended object representation that integrates the center location offsets of the body or its parts, and construct a dense single-stage anchor-based Body-Part Joint Detector (BPJDet). Body-part associations in BPJDet are embedded into the unified representation, which contains both semantic and geometric information. Therefore, BPJDet does not suffer from error-prone association post-matching, and has a better accuracy-speed trade-off. Furthermore, BPJDet can be seamlessly generalized to jointly detect any body part. To verify the effectiveness and superiority of our method, we conduct extensive experiments on the CityPersons, CrowdHuman, and BodyHands datasets. The proposed BPJDet detector achieves state-of-the-art association performance on these three benchmarks while maintaining high detection accuracy. Code will be released to facilitate further studies.
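As a rough illustration of how a center-offset representation can carry body-part associations, the sketch below pairs each body detection with the detected part whose box center lies closest to the part center predicted alongside that body. The data layout, the function names, and the `max_dist` threshold are all hypothetical; the abstract does not describe BPJDet's actual decoding step.

```python
import math

def box_center(box):
    """Center (cx, cy) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def associate_parts(bodies, parts, max_dist=50.0):
    """Return, for each body, the index of its associated part or None.

    bodies: list of dicts {"box": (x1, y1, x2, y2), "part_center": (cx, cy)},
            where "part_center" is the part location predicted with the body.
    parts:  list of part boxes (x1, y1, x2, y2) detected in the same image.
    max_dist is a hypothetical pixel threshold for accepting a match.
    """
    matches = []
    for body in bodies:
        px, py = body["part_center"]
        best, best_d = None, max_dist
        for idx, part in enumerate(parts):
            cx, cy = box_center(part)
            d = math.hypot(cx - px, cy - py)
            if d < best_d:  # keep the closest part within the threshold
                best, best_d = idx, d
        matches.append(best)
    return matches
```

A body whose predicted part center is far from every detected part is left unmatched (`None`), which would correspond to an occluded or undetected part.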
Most recent head pose estimation (HPE) methods are dominated by the Euler angle representation. To avoid its inherent ambiguity in rotation labels, alternative quaternion-based and vector-based representations have been introduced. However, neither is visually intuitive, and both are often derived from equivocal Euler angle labels. In this paper, we present a novel single-stage keypoint-based method via an intuitive and unconstrained 2D cube representation for joint head detection and pose estimation. The 2D cube is an orthogonal projection of the 3D regular hexahedron label roughly surrounding one head, and itself contains the head location. It can reflect the head orientation straightforwardly and unambiguously at any rotation angle. Unlike general 6-DoF object pose estimation, our 2D cube ignores the 3-DoF of head size but retains the 3-DoF of head pose. Based on the prior of equal side length, we can obtain the closed-form solution of Euler angles from the predicted 2D head cube instead of applying the error-prone PnP algorithm. In experiments, our proposed method achieves results comparable to other representative methods on the public AFLW2000 and BIWI datasets. In addition, a novel test on the CMU Panoptic dataset shows that our method can be seamlessly adapted to the unconstrained full-view HPE task without modification.
The success of deep learning is partly attributed to the availability of massive data downloaded freely from the Internet. However, it also means that users' private data may be collected by commercial organizations without consent and used to train their models. Therefore, it is important to develop a method or tool to prevent unauthorized data exploitation. In this paper, we propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners. Specifically, the noise produced by the generator for each image has the confounder property. It can build spurious correlations between images and labels, so that the model cannot learn the correct mapping from images to labels in this noise-added dataset. Meanwhile, the discriminator is used to ensure that the generated noise is small and imperceptible, thereby preserving the normal utility of the encrypted image for humans. The experiments are conducted on six image classification datasets, consisting of three natural object datasets and three medical datasets. The results demonstrate that our method not only outperforms state-of-the-art methods in standard settings, but can also be applied to fast encryption scenarios. Moreover, we present a series of transferability and stability experiments to further illustrate the effectiveness and superiority of our method.
Incorporating external knowledge into the response generation process is essential to building more helpful and reliable dialog agents. However, collecting knowledge-grounded conversations is often costly, calling for a better pre-trained model for grounded dialog generation that generalizes well w.r.t. different types of knowledge. In this work, we propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation that does not rely on extra knowledge annotation. Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords. With these keywords, we construct two kinds of knowledge and pre-train a knowledge-grounded response generation model, aiming at handling two different scenarios: (1) the knowledge should be faithfully grounded; (2) it can be selectively used. For the former, the grounding knowledge consists of keywords extracted from the response. For the latter, the grounding knowledge is additionally augmented with keywords extracted from other utterances in the same dialog. Since the knowledge is extracted from the dialog itself, KPT can be easily performed on a large volume and variety of dialog data. We consider three data sources (open-domain, task-oriented, conversational QA) with a total of 2.5M dialogs. We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages. Our comprehensive experiments and analyses demonstrate that KPT consistently outperforms state-of-the-art methods on these tasks with diverse grounding knowledge.
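The keyword extraction step can be sketched as follows, under the assumption that "most uncertain" is measured by token surprisal (negative log-probability under the language model). The abstract does not say which uncertainty measure KPT uses, so this choice, the function name, and the input format are illustrative only; the per-token probabilities are taken as given rather than computed by a real model.

```python
import math

def extract_keywords(tokens, token_probs, k=2):
    """Pick the k tokens with the highest surprisal, kept in original order.

    tokens:      list of token strings from one utterance.
    token_probs: probability a language model assigned to each token
                 (assumed to be supplied by the caller).
    Surprisal as the uncertainty measure is an assumption of this sketch.
    """
    scored = [(-math.log(p), i) for i, p in enumerate(token_probs)]
    # Sort by descending surprisal; break ties by earlier position.
    top = sorted(scored, key=lambda s: (-s[0], s[1]))[:k]
    keep = sorted(i for _, i in top)
    return [tokens[i] for i in keep]
```

For the faithful-grounding scenario, the keywords would be extracted from the response alone; for the selective scenario, keywords from other utterances in the same dialog would be appended.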
Each student matters, but it is hard for instructors to observe all the students during a course and provide immediate help to those who need it. In this paper, we present StuArt, a novel automatic system designed for individualized classroom observation, which enables instructors to attend to the learning status of each student. StuArt can recognize five representative student behaviors (hand-raising, standing, sleeping, yawning, and smiling) that are highly related to engagement, and track their variation trends during the course. To protect the privacy of students, all the variation trends are indexed by seat number without any personally identifying information. Furthermore, StuArt adopts various user-friendly visualization designs to help instructors quickly understand both individual and overall learning status. Experimental results on real classroom videos demonstrate the superiority and robustness of the embedded algorithms. We expect our system to promote the development of large-scale individualized guidance of students.
Domain adaptive object detection (DAOD) aims to alleviate the transfer performance degradation caused by cross-domain discrepancy. However, most existing DAOD methods are dominated by computationally intensive two-stage detectors, which are not the first choice for industrial applications. In this paper, we propose a novel semi-supervised domain adaptive YOLO (SSDA-YOLO) method to improve cross-domain detection performance by integrating the compact one-stage detector YOLOv5 with domain adaptation. Specifically, we adapt the knowledge distillation framework with the Mean Teacher model to assist the student model in obtaining instance-level features of the unlabeled target domain. We also use scene style transfer to cross-generate pseudo images in different domains to remedy image-level differences. In addition, an intuitive consistency loss is proposed to further align cross-domain predictions. We evaluate our proposed SSDA-YOLO on public benchmarks including PascalVOC, Clipart1k, Cityscapes, and Foggy Cityscapes. Moreover, to verify its generalization, we conduct experiments on yawning detection datasets collected from various classrooms. The results show that our method yields considerable improvements in these DAOD tasks. Our code is available on \url{https://github.com/hnuzhy/SSDA-YOLO}.
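In the standard Mean Teacher setup, the teacher's weights are an exponential moving average (EMA) of the student's weights rather than being trained by gradient descent. The sketch below shows that update on a toy parameter dictionary; the function name, the dictionary layout, and the decay value are illustrative assumptions and are not taken from the SSDA-YOLO code.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return teacher weights after one EMA step toward the student.

    teacher, student: dicts mapping parameter name -> float (toy stand-ins
    for real network tensors). alpha is the EMA decay; 0.99 is only an
    illustrative default, not a value reported in the abstract.
    """
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }
```

A decay close to 1 makes the teacher change slowly, which is what lets it provide stable pseudo-labels on the unlabeled target domain while the student is still being updated.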
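Automated social behavior analysis of mice has become an increasingly popular research area in behavioral neuroscience. Recently, pose information (i.e., the locations of keypoints or skeletons) has been used to interpret the social behaviors of mice. However, effective encoding and decoding of the social interaction information underlying mouse keypoints has rarely been investigated in existing methods. In particular, modeling complex social interactions between mice is challenging because of highly deformable body shapes and ambiguous movement patterns. To address the interaction modeling problem, we propose a Cross-Skeleton Interaction Graph Aggregation Network (CS-IGANet) to learn the rich dynamics of freely interacting mice, in which a Cross-Skeleton Node-Level Interaction module (CS-NLI) is used to model multi-level interactions (i.e., intra-, inter-, and cross-skeleton interactions). Furthermore, we design a novel Interaction-Aware Transformer (IAT) to dynamically learn graph-level representations of social behaviors and update the node-level representations, guided by our proposed interaction-aware self-attention mechanism. Finally, to enhance the representation ability of our model, an auxiliary self-supervised learning task is proposed to measure the similarity between cross-skeleton nodes. Experimental results on the standard CRMI13-Skeleton dataset and our PDMB-Skeleton dataset show that our proposed model outperforms several other state-of-the-art methods.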
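Conventional knowledge distillation (KD) methods for object detection mainly focus on homogeneous teacher-student detectors. However, the design of a lightweight detector for deployment often differs significantly from that of a high-capacity detector. We therefore investigate KD between heterogeneous teacher-student pairs for broader applicability. We observe that the core difficulty of heterogeneous KD (hetero-KD) is the significant semantic gap between the backbone features of heterogeneous detectors, caused by their different optimization manners. Conventional homogeneous KD (homo-KD) methods suffer from this gap and struggle to achieve satisfactory performance directly in hetero-KD. In this paper, we propose the Hetero-Assists Distillation (HEAD) framework, which uses heterogeneous detection heads as assistants to guide the optimization of the student detector and reduce this gap. In HEAD, the assistant is an additional detection head, attached to the student backbone, whose architecture is homogeneous to the teacher head. Hetero-KD is thus transformed into homo-KD, allowing efficient knowledge transfer from the teacher to the student. Moreover, we extend HEAD to a Teacher-Free HEAD (TF-HEAD) framework for cases where a well-trained teacher detector is unavailable. Our method achieves significant improvements over current detection KD methods. For example, on the MS-COCO dataset, TF-HEAD helps R18 RetinaNet achieve 33.9 mAP (+2.2), while HEAD pushes the limit further to 36.2 mAP (+4.5).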
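The waterfall recommender system (RS), a popular form of RS in mobile applications, is a stream of recommended items consisting of successive pages that can be browsed by scrolling. In a waterfall RS, when a user finishes browsing a page, the edge (e.g., a mobile phone) sends a request to the cloud server to obtain a new page of recommendations, known as the paging request mechanism. RSs typically put a large number of items into one page to reduce the excessive resource consumption caused by numerous paging requests. However, this diminishes the RSs' ability to renew recommendations in a timely manner according to users' real-time interests, and leads to a poor user experience. Intuitively, inserting additional requests within a page to update the recommendations more frequently can alleviate the problem. However, previous attempts, which include only non-adaptive strategies (e.g., inserting requests uniformly), eventually lead to resource overconsumption. To this end, we envision a new learning task for edge intelligence named Intelligent Request Strategy Design (IRSD). It aims to improve the effectiveness of waterfall RSs by determining the appropriate occasions for request insertion based on users' real-time intentions. Moreover, we propose a new paradigm of adaptive request insertion strategy named the Uplift-based On-edge Smart Request Framework (AdaRequest). AdaRequest 1) captures the dynamic change of users' intentions by matching their real-time behaviors with their historical interests using attention-based neural networks; 2) estimates the counterfactual uplift in user purchases brought by an inserted request, based on causal inference; and 3) determines the final request insertion strategy by maximizing the utility function under online resource constraints. We conduct extensive experiments on both offline datasets and online A/B tests to verify the effectiveness of AdaRequest.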
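The third step, choosing which requests to insert under a resource constraint, resembles a knapsack problem. The sketch below uses a simple greedy heuristic on uplift per unit cost. The candidate format, the cost model, and the greedy rule are all assumptions made for illustration; the abstract does not specify AdaRequest's utility function or its optimization procedure.

```python
def select_requests(candidates, budget):
    """Greedily choose request insertions with the best uplift per unit cost.

    candidates: list of (request_id, estimated_uplift, cost) with cost > 0,
                where the uplift estimates are assumed to come from an
                upstream causal model.
    budget:     total resource budget available for inserted requests.
    This greedy rule is a stand-in heuristic, not the paper's method.
    """
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    chosen, spent = [], 0.0
    for req_id, uplift, cost in ranked:
        if uplift <= 0:
            continue  # a request with no expected benefit is never worth its cost
        if spent + cost <= budget:
            chosen.append(req_id)
            spent += cost
    return chosen
```

In contrast to uniform insertion, a rule of this kind spends the request budget only at moments where the estimated benefit is positive and large relative to its cost.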