Social media platforms struggle to protect users from harmful content through content moderation. These platforms have recently leveraged machine learning models to cope with the vast amount of user-generated content created every day. Since moderation policies vary by country and product type, it is common to train and deploy a separate model for each policy. However, this approach is highly inefficient, especially when a policy changes, requiring datasets to be re-labeled and models to be re-trained on the shifted data distribution. To alleviate this cost inefficiency, social media platforms often employ third-party content moderation services that provide prediction scores for multiple subtasks, such as predicting the presence of minors, rude gestures, or weapons, instead of directly providing final moderation decisions. However, making a reliable automated moderation decision from the prediction scores of multiple subtasks has not been widely explored. In this study, we formulate real-world scenarios of content moderation and introduce a simple yet effective threshold optimization method that searches for the optimal thresholds over multiple subtasks to make reliable moderation decisions in a cost-effective way. Extensive experiments demonstrate that our approach shows better performance in content moderation compared with existing threshold optimization methods and heuristics.
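As a rough illustration of the setting, the sketch below (all names and the exhaustive grid search are our own assumptions, not the paper's formulation) tunes per-subtask thresholds on a labeled validation set and flags an item when any subtask score meets its threshold:

```python
# Hypothetical sketch: grid search over per-subtask thresholds for a
# moderation decision rule. Not the paper's algorithm.
import itertools
import numpy as np

def moderation_decision(scores: np.ndarray, thresholds: np.ndarray) -> np.ndarray:
    """Flag an item as harmful if ANY subtask score meets its threshold.
    scores: (N, S) subtask prediction scores; thresholds: (S,)."""
    return (scores >= thresholds).any(axis=1)

def search_thresholds(scores, labels, grid=np.linspace(0.0, 1.0, 11)):
    """Exhaustive search maximizing F1; labels is a boolean array of
    ground-truth moderation decisions. Only viable for a handful of
    subtasks; the point is the decision rule being tuned."""
    best_f1, best = -1.0, None
    for combo in itertools.product(grid, repeat=scores.shape[1]):
        preds = moderation_decision(scores, np.array(combo))
        tp = np.sum(preds & labels)
        prec = tp / max(preds.sum(), 1)
        rec = tp / max(labels.sum(), 1)
        f1 = 2 * prec * rec / max(prec + rec, 1e-8)
        if f1 > best_f1:
            best_f1, best = f1, np.array(combo)
    return best, best_f1
```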
Exemplar-based generative models for open-domain conversation leverage generative models and retrieval models to generate responses based on exemplars provided by a retriever. However, they often ignore the retrieved exemplars while generating responses, or generate responses that are over-fitted to the retrieved exemplars. In this paper, we argue that these drawbacks are derived from the one-to-many problem of open-domain conversation. When a retrieved exemplar is relevant to the given context yet significantly different from the gold response, the exemplar-based generative model is trained to ignore the exemplar, since the exemplar is not helpful for generating the gold response. On the other hand, when a retrieved exemplar is lexically similar to the gold response, the generative model is trained to rely heavily on the exemplar. Therefore, we propose a training method that selects exemplars that are semantically relevant to the gold response but lexically distanced from the gold response, to alleviate the above drawbacks. In the training phase, our proposed method first uses the gold response instead of the dialogue context as a query to select exemplars that are semantically relevant to the gold response. It then eliminates the exemplars that are lexically similar to the gold response, to alleviate the generative model's dependency on those exemplars. The remaining exemplars could be irrelevant to the given context, since they are retrieved based on the gold response. Thus, our proposed method further utilizes the relevance scores between the given context and the exemplars to penalize the irrelevant exemplars. Extensive experiments demonstrate that our proposed training method alleviates the drawbacks of existing exemplar-based generative models and significantly improves performance in terms of appropriateness and informativeness.
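A minimal sketch of the selection step as we read it (the embedding function, Jaccard overlap, and cutoffs are placeholder assumptions):

```python
# Illustrative exemplar selection: semantically close to the gold response,
# lexically distanced from it. All thresholds are invented for illustration.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def select_exemplars(gold_response, candidates, embed,
                     semantic_k=10, lexical_cutoff=0.6):
    """embed: assumed sentence-embedding function returning a vector."""
    # 1) Query with the gold response (not the context) for semantic relevance.
    q = embed(gold_response)
    ranked = sorted(candidates, key=lambda c: -cosine(q, embed(c)))[:semantic_k]
    # 2) Drop exemplars lexically too close to the gold response, so the
    #    generator cannot learn to simply copy them.
    return [c for c in ranked if jaccard(c, gold_response) < lexical_cutoff]
```

Because the surviving exemplars are retrieved via the gold response, some may be irrelevant to the context; per the abstract, a context-exemplar relevance score additionally penalizes such exemplars during training (not shown here).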
Recent studies on learning with noisy labels have shown remarkable performance by exploiting a small clean dataset. In particular, label correction methods based on model-agnostic meta-learning further improve performance by correcting noisy labels on the fly. However, there is no safeguard against label miscorrection, resulting in unavoidable performance degradation. Moreover, every training step requires at least three back-propagations, significantly slowing down the training speed. To mitigate these issues, we propose a robust and efficient method that learns a label transition matrix on the fly. Employing the transition matrix makes the classifier skeptical about all the corrected samples, which alleviates the miscorrection issue. We also introduce a two-head architecture to efficiently estimate the label transition matrix in a single back-propagation, such that the estimated matrix closely follows the shifting noise distribution induced by label correction. Extensive experiments demonstrate that our approach shows comparable or better accuracy than existing methods, with superior training efficiency.
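For intuition, here is a common forward-correction pattern that a learned transition matrix enables (a hedged PyTorch sketch; variable names and the row-stochastic convention are our assumptions, not the paper's exact formulation):

```python
# Sketch of a transition-matrix-corrected loss: the classifier predicts clean
# labels, and T maps them to the (possibly miscorrected) observed labels, so
# the classifier stays "skeptical" of corrected samples.
import torch
import torch.nn.functional as F

def transition_corrected_loss(logits: torch.Tensor,
                              corrected_labels: torch.Tensor,
                              T: torch.Tensor) -> torch.Tensor:
    """logits: (B, C) clean-label logits; T: (C, C) row-stochastic matrix with
    T[i, j] = p(observed label j | clean label i)."""
    clean_probs = F.softmax(logits, dim=1)   # p(clean | x)
    noisy_probs = clean_probs @ T            # p(observed | x) = p(clean | x) @ T
    return F.nll_loss(torch.log(noisy_probs + 1e-8), corrected_labels)
```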
The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
We present Muse, a text-to-image Transformer model that achieves state-of-the-art image generation performance while being significantly more efficient than diffusion or autoregressive models. Muse is trained on a masked modeling task in discrete token space: given the text embedding extracted from a pre-trained large language model (LLM), Muse is trained to predict randomly masked image tokens. Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding. The use of a pre-trained LLM enables fine-grained language understanding, translating to high-fidelity image generation and the understanding of visual concepts such as objects, their spatial relationships, pose, cardinality etc. Our 900M parameter model achieves a new SOTA on CC3M, with an FID score of 6.06. The Muse 3B parameter model achieves an FID of 7.88 on zero-shot COCO evaluation, along with a CLIP score of 0.32. Muse also directly enables a number of image editing applications without the need to fine-tune or invert the model: inpainting, outpainting, and mask-free editing. More results are available at https://muse-model.github.io
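A conceptual sketch of the masked-token objective described above (not the released implementation; the tokenizer, transformer interface, mask token id, and mask rate are placeholders):

```python
# Hypothetical sketch of Muse-style masked image-token modeling: mask random
# discrete tokens and predict them conditioned on frozen LLM text features.
import torch
import torch.nn.functional as F

MASK_ID = 8192  # assumed id of a special [MASK] token outside the VQ vocab

def masked_modeling_step(image_tokens, text_embedding, transformer, mask_rate=0.5):
    """image_tokens: (B, N) discrete token ids from a pre-trained tokenizer;
    text_embedding: (B, L, D) text features from a frozen pre-trained LLM."""
    B, N = image_tokens.shape
    mask = torch.rand(B, N, device=image_tokens.device) < mask_rate
    inputs = image_tokens.masked_fill(mask, MASK_ID)
    logits = transformer(inputs, text_embedding)   # (B, N, vocab_size)
    # Cross-entropy only on the masked positions, as in masked modeling.
    return F.cross_entropy(logits[mask], image_tokens[mask])
```

At inference, parallel decoding fills in many masked tokens per step rather than one token at a time, which is the source of the efficiency gain over autoregressive models.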
An unbiased scene graph generation (SGG) algorithm, referred to as Skew Class-balanced Re-weighting (SCR), is proposed to address the biased predicate predictions caused by the long-tailed distribution. Prior works focus mainly on alleviating the deteriorating performance of minority predicate predictions, at the cost of drastically dropping recall scores, i.e., losing majority predicate performance. The trade-off between majority and minority predicate performance on the limited SGG datasets has not yet been correctly analyzed. In this paper, to alleviate this issue, the Skew Class-balanced Re-weighting (SCR) loss function is proposed for unbiased SGG models. Leveraging the skewness of biased predicate predictions, SCR estimates the target predicate weight coefficients and re-weights the biased predicates more strongly, achieving a better trade-off between the majority and minority predicates. Extensive experiments conducted on the standard Visual Genome dataset and Open Images V4 & V6 show the performance and generality of SCR with traditional SGG models.
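An interpretive sketch of skew-aware re-weighting (the actual SCR estimator differs; the skewness proxy below is our invention purely to illustrate up-weighting over-predicted classes):

```python
# Illustrative only: re-weight a cross-entropy loss by how much each predicate
# class is over-predicted relative to its empirical prior.
import torch
import torch.nn.functional as F

def scr_like_loss(logits, targets, class_counts, skew_gamma=1.0):
    """logits: (B, C) predicate logits; class_counts: (C,) training frequency."""
    probs = F.softmax(logits, dim=1)
    prior = class_counts / class_counts.sum()
    # Skewness proxy: excess of mean predicted mass over the class prior.
    skew = (probs.mean(dim=0) - prior).clamp(min=0)
    weights = (1.0 + skew) ** skew_gamma   # re-weight biased predicates more
    return F.cross_entropy(logits, targets, weight=weights.detach())
```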
With the increasing capability of large language models (LLMs), in-context learning (ICL) has become a new paradigm for natural language processing (NLP), where LLMs make predictions based only on contexts augmented with a few training examples. It has become a new trend to explore ICL for evaluating and extrapolating the abilities of LLMs. In this paper, we aim to survey and summarize the progress, challenges, and future work in ICL. We first present a formal definition of ICL and clarify its relation to related studies. Then, we organize and discuss advanced techniques of ICL, including training strategies, prompting strategies, and so on. Finally, we present the challenges of ICL and provide potential directions for further research. We hope our work encourages more research on uncovering how ICL works and on improving ICL in future work.
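For readers new to the paradigm, a minimal illustration of ICL: the model sees a few demonstrations in its prompt and labels a new input with no parameter updates (the task and formatting below are arbitrary examples):

```python
# Build a few-shot ICL prompt; the LLM's continuation is the prediction.
demonstrations = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I want my two hours back.", "negative"),
]
query = "A quietly moving film with superb performances."

prompt = "".join(f"Review: {x}\nSentiment: {y}\n\n" for x, y in demonstrations)
prompt += f"Review: {query}\nSentiment:"
# `prompt` is then sent to an LLM; the generated continuation
# (e.g., " positive") is taken as the prediction.
print(prompt)
```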
Masked image modeling (MIM) has shown great promise for self-supervised learning (SSL) yet has been criticized for learning inefficiency. We believe that insufficient utilization of training signals is responsible. To alleviate this issue, we introduce a conceptually simple yet learning-efficient MIM training scheme, termed Disjoint Masking with Joint Distillation (DMJD). For disjoint masking (DM), we sequentially sample multiple masked views per image in a mini-batch under a disjoint regulation, raising the number of tokens used for reconstruction in each image while keeping the masking rate of each view. For joint distillation (JD), we adopt a dual-branch architecture to respectively predict invisible (masked) and visible (unmasked) tokens with superior learning targets. Rooted in orthogonal perspectives on training-efficiency improvement, DM and JD cooperatively accelerate training convergence without sacrificing the model's generalization ability. Concretely, DM can train ViT to competitive performance with half the effective training epochs (3.7 times less time-consuming). With JD, our DMJD clearly improves the linear-probing classification accuracy over ConvMAE by 5.8%. On fine-grained downstream tasks such as semantic segmentation and object detection, DMJD also presents superior generalization compared with state-of-the-art SSL methods. The code and model will be made public at https://github.com/mx-mark/DMJD.
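A sketch of one plausible reading of disjoint masking (view count, mask rate, and the wrap-around strategy are our assumptions): sample several masked views per image whose masked sets overlap as little as possible, so more tokens per image receive a reconstruction signal.

```python
# Illustrative disjoint mask sampling: walk one random permutation of token
# positions in chunks, wrapping around, so the views' masked sets are fully
# disjoint when num_views * mask_rate <= 1 and minimally overlapping otherwise.
import torch

def disjoint_masks(num_tokens: int, num_views: int, mask_rate: float):
    """Returns (num_views, num_tokens) boolean masks, True = masked."""
    per_view = int(num_tokens * mask_rate)
    perm = torch.randperm(num_tokens).tolist()
    masks = torch.zeros(num_views, num_tokens, dtype=torch.bool)
    pos = 0
    for v in range(num_views):
        idx = [perm[(pos + i) % num_tokens] for i in range(per_view)]
        masks[v, idx] = True
        pos += per_view
    return masks
```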
Deep neural networks are vulnerable to adversarial attacks. In this paper, we take the role of investigators who want to trace the attack and identify its source, that is, the particular model from which the adversarial examples were generated. The derived techniques would aid forensic investigation of attack incidents and serve as a deterrent to potential attacks. We consider the buyer-seller setting, where a machine learning model is distributed to various buyers and each buyer receives a slightly different copy with the same functionality. A malicious buyer generates adversarial examples from a particular copy $\mathcal{M}_i$ and uses them to attack other copies. From these adversarial examples, the investigator wants to identify the source $\mathcal{M}_i$. To address this problem, we propose a two-stage separate-and-trace framework. The model separation stage generates multiple copies of a model for the same classification task. This process injects unique characteristics into each copy, so that the adversarial examples generated from it have distinct and traceable features. We give a parallel structure which embeds a ``tracer'' in each copy, together with a noise-sensitive training loss, to achieve this goal. The tracing stage takes in adversarial examples and a few candidate models, and identifies the likely source. Based on the unique features induced by the noise-sensitive loss function, we can effectively trace the potential adversarial copy by considering the output logits from each tracer. Empirical results show that it is possible to trace the origin of adversarial examples, and that the mechanism can be applied to a wide range of architectures and datasets.
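A schematic sketch of the tracing stage (the scoring rule below is an invented placeholder; the paper derives its decision from the features induced by the noise-sensitive loss):

```python
# Hypothetical tracing stage: score each candidate copy's embedded tracer on
# the adversarial examples and return the most likely source copy.
import torch

def trace_source(adv_examples: torch.Tensor, candidate_tracers) -> int:
    """candidate_tracers: list of tracer modules, one per distributed copy.
    Returns the index of the copy most likely used to craft the examples."""
    scores = []
    for tracer in candidate_tracers:
        with torch.no_grad():
            logits = tracer(adv_examples)
        # Assumption for illustration: the source copy's tracer responds
        # most strongly to adversarial perturbations crafted against it.
        scores.append(logits.max(dim=-1).values.mean().item())
    return int(torch.tensor(scores).argmax())
```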
In this paper, we introduce a novel variation of model-agnostic meta-learning, where an extra multiplicative parameter is introduced in the inner-loop adaptation. Our variation creates a shortcut in the parameter space for the inner-loop adaptation and increases model expressivity in a highly controllable manner. We show both theoretically and numerically that our variation alleviates the problem of conflicting gradients and improves training dynamics. We conduct experiments on three distinct problems, including a toy classification problem for threshold comparison, a regression problem for wavelet transform, and a classification problem on MNIST. We also discuss ways to generalize our method to a broader class of problems.
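A hedged sketch of how such a multiplicative inner-loop parameter might look (the exact placement of the parameter is our assumption; the paper's formulation may differ):

```python
# Sketch: MAML inner loop with an extra learnable multiplicative parameter m
# that scales the adapted weights, creating a shortcut in parameter space.
import torch

def inner_loop_adapt(params, m, loss_fn, support_batch, alpha=0.01):
    """params: list of meta-parameter tensors (requires_grad=True);
    m: matching list of multiplicative parameters, meta-learned in the
    outer loop alongside the initialization."""
    loss = loss_fn(params, support_batch)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Standard MAML:  theta' = theta - alpha * grad
    # This variant:   theta' = m * theta - alpha * grad
    return [m_i * p - alpha * g for p, g, m_i in zip(params, grads, m)]
```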