Smart Paper Notes

Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization

Kaixuan Huang, Yu Wu, Xuezhou Zhang, Shenyinying Tu, Qingyun Wu, Mengdi Wang, Huazheng Wang

Categories: Machine Learning | Machine Learning (Statistics)

2022-06-29

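Online influence maximization aims to maximize the spread of a piece of content's influence in a social network whose model is unknown, by selecting a small number of seed nodes. Recent work follows a non-adaptive setting, in which the seed nodes are chosen before the diffusion process starts and the network parameters are updated once diffusion stops. We consider an adaptive version of the content-dependent online influence maximization problem, in which seed nodes are activated sequentially based on real-time feedback. In this paper, we formulate the problem as an infinite-horizon discounted MDP under a linear diffusion process and propose a model-based reinforcement learning solution. Our algorithm maintains an estimate of the network model and selects seed users adaptively, exploring the social network while optimistically improving toward the optimal policy. We establish a $\widetilde{O}(\sqrt{T})$ regret bound for the algorithm. Empirical evaluation on synthetic networks demonstrates its efficiency.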

Translated by Google Translate

Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks

Shuoguang Yang, Xuezhou Zhang, Mengdi Wang

Categories: Machine Learning (Statistics) | Machine Learning

2022-06-22

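Bilevel optimization has gained growing interest, with numerous applications in meta-learning, minimax games, reinforcement learning, and nested composition optimization. This paper studies distributed bilevel optimization over a network in which agents can communicate only with their neighbors, with examples drawn from multi-task learning, multi-agent learning, and federated learning. We propose a gossip-based distributed bilevel learning algorithm that allows networked agents to solve both the inner and outer optimization problems on a single timescale and to share information through network propagation. We show that our algorithm achieves a per-agent sample complexity of $\mathcal{O}(\frac{1}{K \epsilon^2})$ for general nonconvex bilevel optimization and $\mathcal{O}(\frac{1}{K \epsilon})$ for strongly convex objectives, a speedup that scales linearly with the network size. The sample complexities are optimal in both $\epsilon$ and $K$. We test the algorithm on hyperparameter tuning and decentralized reinforcement learning. Simulated experiments confirm that it achieves state-of-the-art training efficiency and test accuracy.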
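
To illustrate the communication pattern only, here is a minimal sketch of plain gossip averaging, in which each agent repeatedly averages its value with those of its neighbors. This is not the paper's bilevel algorithm: the ring topology, the equal 1/3 weights, and the function names `gossip_round` and `gossip_average` are all assumptions made for the example.

```python
from typing import List


def gossip_round(values: List[float]) -> List[float]:
    """One gossip step on a ring (assumed topology): every agent replaces
    its value with the equal-weight average of itself and its two neighbors."""
    n = len(values)
    return [
        (values[(i - 1) % n] + values[i] + values[(i + 1) % n]) / 3.0
        for i in range(n)
    ]


def gossip_average(values: List[float], rounds: int) -> List[float]:
    """Run `rounds` gossip steps; no agent ever sees the whole network."""
    x = list(values)
    for _ in range(rounds):
        x = gossip_round(x)
    return x


if __name__ == "__main__":
    local_values = [1.0, 2.0, 3.0, 4.0, 10.0]  # one private value per agent
    print(gossip_average(local_values, rounds=60))
```

Because the averaging weights are symmetric and sum to one, the network-wide sum is preserved at every round, so the agents drift toward the global mean using only neighbor-to-neighbor messages.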

Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration

Chengzhuo Ni, Ruiqi Zhang, Xiang Ji, Xuezhou Zhang, Mengdi Wang

Categories: Machine Learning (Statistics) | Machine Learning

2022-01-31

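Policy gradient (PG) estimation becomes a challenge when we are not allowed to sample with the target policy and only have access to a dataset generated by some unknown behavior policy. Conventional methods for off-policy PG estimation often suffer from either significant bias or exponentially large variance. In this paper, we propose the double Fitted PG estimation (FPG) algorithm. Assuming access to a Bellman-complete value function class, FPG can work with an arbitrary policy parameterization. In the case of linear value function approximation, we provide a tight finite-sample upper bound on the policy gradient estimation error, which is governed by the amount of distribution mismatch measured in feature space. We also establish the asymptotic normality of the FPG estimation error with a precise covariance characterization, which is further shown to be statistically optimal with a matching Cramer-Rao lower bound. Empirically, we evaluate FPG on both policy gradient estimation and policy optimization, using either softmax tabular or ReLU policy networks. Under various metrics, our results show that FPG significantly outperforms existing off-policy PG estimation methods based on importance sampling and variance reduction techniques.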
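
The "exponentially large variance" of importance-sampling estimators can be seen in a toy calculation. The sketch below is not from the paper: it assumes state-independent policies, so the per-step ratios are i.i.d., and the function name and example policies are hypothetical. Under that assumption each ratio has mean 1, and the variance of a product of `horizon` ratios is exactly (second moment)^horizon - 1.

```python
from typing import Sequence


def importance_weight_variance(
    target: Sequence[float], behavior: Sequence[float], horizon: int
) -> float:
    """Exact variance of the product of `horizon` i.i.d. importance ratios
    target[a] / behavior[a], with actions a drawn from `behavior`.

    Assumes state-independent policies (a simplification for illustration).
    """
    # E[rho^2] = sum_a behavior[a] * (target[a] / behavior[a])^2
    second_moment = sum(p * p / q for p, q in zip(target, behavior))
    # E[rho] = 1, so Var(prod of H ratios) = E[rho^2]^H - 1
    return second_moment ** horizon - 1.0


if __name__ == "__main__":
    target_policy = [0.8, 0.2]    # hypothetical target policy
    behavior_policy = [0.5, 0.5]  # hypothetical behavior policy
    for h in (1, 10, 50):
        print(h, importance_weight_variance(target_policy, behavior_policy, h))
```

Even a mild mismatch between the two policies gives a per-step second moment above 1, so the variance grows geometrically with the horizon.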

Representation Learning for Online and Offline RL in Low-rank MDPs

Masatoshi Uehara, Xuezhou Zhang, Wen Sun

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-10-09

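This work studies the question of representation learning in RL: how can we learn a compact low-dimensional representation such that, on top of it, we can perform RL procedures such as exploration and exploitation in a sample-efficient manner? We focus on low-rank Markov decision processes (MDPs), in which the transition dynamics correspond to a low-rank transition matrix. Unlike prior work that assumes the representation is known (e.g., linear MDPs), here we need to learn the representation for the low-rank MDP. We study both the online and offline RL settings. For the online setting, operating with the same computational oracles used in FLAMBE (Agarwal et al.), the state-of-the-art algorithm for learning representations in low-rank MDPs, we propose an algorithm, Rep-UCB (Upper Confidence Bound driven Representation learning for RL), which significantly improves the sample complexity from $\widetilde{O}(A^9 d^7 / (\epsilon^{10} (1-\gamma)^{22}))$ for FLAMBE to $\widetilde{O}(A^4 d^4 / (\epsilon^2 (1-\gamma)^{3}))$, where $d$ is the rank of the transition matrix (or the dimension of the ground-truth representation), $A$ is the number of actions, and $\gamma$ is the discount factor. Notably, Rep-UCB is simpler than FLAMBE, because it directly balances the interplay between representation learning, exploration, and exploitation, whereas FLAMBE is an explore-then-commit style approach that has to perform reward-free exploration step by step forward in time. For the offline RL setting, we develop an algorithm that leverages pessimism to learn under a partial coverage condition: our algorithm is able to compete against any policy that is covered by the offline distribution.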
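
As a rough check on the size of the claimed improvement, one can divide the FLAMBE bound by the Rep-UCB bound exactly as the two are quoted in this abstract, ignoring the constants and logarithmic factors hidden by $\widetilde{O}$:

$$
\frac{A^9 d^7 / \left(\epsilon^{10} (1-\gamma)^{22}\right)}{A^4 d^4 / \left(\epsilon^{2} (1-\gamma)^{3}\right)} = \frac{A^5 d^3}{\epsilon^{8} (1-\gamma)^{19}}
$$

So, on these figures, the gap widens polynomially as the accuracy $\epsilon$ shrinks or as the discount factor $\gamma$ approaches 1.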

TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models

Sucheng Ren, Fangyun Wei, Zheng Zhang, Han Hu

Categories: Computer Vision

2023-01-03

Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models, which are critical for real-world applications, benefit only marginally, if at all, from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distillation targets, losses, input, network regularization, and sequential distillation, revealing that: 1) distilling token relations is more effective than CLS token-based and feature-based distillation; 2) using an intermediate layer of the teacher network as the target performs better than using the last layer when the student's depth does not match the teacher's; 3) weak regularization is preferred; etc. With these findings, we achieve significant fine-tuning accuracy improvements over MIM pre-training from scratch on ImageNet-1K classification, using the ViT-Tiny, ViT-Small, and ViT-Base models, with gains of +4.2%/+2.4%/+1.4%, respectively. Our TinyMIM model of base size achieves 52.2 mIoU on ADE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way to develop small vision Transformer models: exploring better training methods rather than introducing inductive biases into architectures, as most previous work does. Code is available at https://github.com/OliverRensu/TinyMIM.

Cross Modal Transformer via Coordinates Encoding for 3D Object Detection

Junjie Yan, Yingfei Liu, Jianjian Sun, Fan Jia, Shuailin Li, Tiancai Wang, Xiangyu Zhang

Categories: Computer Vision

2023-01-03

In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even when the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.

Backdoor Attacks Against Dataset Distillation

Yugeng Liu, Zheng Li, Michael Backes, Yun Shen, Yang Zhang

Categories: Machine Learning

2023-01-03

Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.

PMT-IQA: Progressive Multi-task Learning for Blind Image Quality Assessment

Qingyi Pan, Ning Guo, Letu Qingge, Jingyi Zhang, Pei Yang

Categories: Computer Vision

2023-01-03

Blind image quality assessment (BIQA) remains challenging because of the diversity of distortions and the variation in image content, which complicate distortion patterns across scales and make the BIQA regression problem harder. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies that help the regression model perform better. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and to better optimize the regression problem in line with the easy-to-hard progression of human learning. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets. The results indicate that PMT-IQA outperforms the compared approaches and that both the MS and PMT modules improve the model's performance.

Language Models are Drummers: Drum Composition with Natural Language Pre-Training

Li Zhang, Chris Callison-Burch

Categories: Natural Language Processing

2023-01-03

Automatic music generation with artificial intelligence typically requires a large amount of data, which is hard to obtain for many less common genres and musical instruments. To tackle this issue, we present ongoing work and preliminary findings on the possibility for deep models to transfer knowledge from language to music, by fine-tuning large language models pre-trained on a massive text corpus on only hundreds of MIDI files of drum performances. We show that by doing so, one of the largest state-of-the-art models (GPT3) is capable of generating reasonable drum grooves, while models that are not pre-trained (Transformer) show no such ability beyond naive repetition. Evaluating generated music is a challenging task, and evaluating drum grooves, for which there is little precedent in the literature, is even more so. Hence, we propose a tailored structural evaluation method and analyze drum grooves produced by GPT3 compared to those played by human professionals, exposing the strengths and weaknesses of such generation by language-to-music transfer. Our findings suggest that language-to-music transfer learning with large language models is viable and promising.

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

Yue Han, Jiangning Zhang, Zhucun Xue, Chao Xu, Xintian Shen, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li

Categories: Computer Vision

2023-01-03

Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes from only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset in the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots; e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
