Smart Paper Notes

Hybrid Learning- and Model-Based Planning and Control of In-Hand Manipulation

Rana Soltani Zarrin , Katsu Yamane , Rianna Jitosho

Category: Robotics

2022-09-20

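This paper presents a hierarchical framework for planning and controlling in-hand manipulation of a rigid object involving grasp changes, using fully actuated multi-fingered robotic hands. Although the framework can be applied to general dexterous manipulation, we focus on a more complex definition of in-hand manipulation, in which, at the goal pose, the hand must reach a grasp suitable for using the object as a tool. The high-level planner determines the object trajectory as well as the grasp changes, i.e., adding, removing, or sliding fingers, which are executed by the low-level controller. While the grasp sequence is planned online by a learning-based policy to adapt to variations, the trajectory planner and the low-level controller for object tracking and contact force control are exclusively model-based, so that the plan is realized robustly. By injecting knowledge about the physics of the problem and the low-level controller into the grasp planner, it learns to generate grasps similar to those produced by model-based optimization approaches, avoiding the high computational cost such methods incur to account for variations. Through experiments in physics simulation on realistic tool-use scenarios, we demonstrate the success of our method on different tool-use tasks and dexterous hand models. In addition, we show that this hybrid method offers greater robustness to trajectory and task variations than a model-based method.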

Translated by Google Translate

3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions

Dale Decatur , Itai Lang , Rana Hanocka

Category: Computer Vision

2022-12-21

We present 3D Highlighter, a technique for localizing semantic regions on a mesh using text as input. A key feature of our system is the ability to interpret "out-of-domain" localizations. Our system demonstrates the ability to reason about where to place non-obviously related concepts on an input 3D shape, such as adding clothing to a bare 3D animal model. Our method contextualizes the text description using a neural field and colors the corresponding region of the shape using a probability-weighted blend. Our neural optimization is guided by a pre-trained CLIP encoder, which bypasses the need for any 3D datasets or 3D annotations. Thus, 3D Highlighter is highly flexible, general, and capable of producing localizations on a myriad of input shapes. Our code is publicly available at https://github.com/threedle/3DHighlighter.
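The probability-weighted blend mentioned in the abstract can be illustrated with a minimal sketch. It assumes each vertex already has a highlight probability in [0, 1] (in the paper this comes from the neural field; here it is simply an input), and the function name `blend_vertex_colors` and both default colors are hypothetical choices for illustration, not taken from the authors' code.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def blend_vertex_colors(
    probs: Sequence[float],
    highlight: Color = (0.8, 1.0, 0.0),  # hypothetical highlight color
    base: Color = (0.7, 0.7, 0.7),       # hypothetical base (gray) color
) -> List[Color]:
    """Color each vertex as p * highlight + (1 - p) * base."""
    colors: List[Color] = []
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        # Linear interpolation between the two colors, channel by channel.
        r, g, b = (p * h + (1.0 - p) * c for h, c in zip(highlight, base))
        colors.append((r, g, b))
    return colors
```

A vertex with probability 0 keeps the base color, a vertex with probability 1 takes the full highlight color, and intermediate probabilities give a smooth transition at the region boundary.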

DA Wand: Distortion-Aware Selection using Neural Mesh Parameterization

Richard Liu , Noam Aigerman , Vladimir G. Kim , Rana Hanocka

Category: Computer Vision

2022-12-13

We present a neural technique for learning to select a local sub-region around a point which can be used for mesh parameterization. The motivation for our framework is driven by interactive workflows used for decaling, texturing, or painting on surfaces. Our key idea is to incorporate segmentation probabilities as weights of a classical parameterization method, implemented as a novel differentiable parameterization layer within a neural network framework. We train a segmentation network to select 3D regions that are parameterized into 2D and penalized by the resulting distortion, giving rise to segmentations which are distortion-aware. Following training, a user can use our system to interactively select a point on the mesh and obtain a large, meaningful region around the selection which induces a low-distortion parameterization. Our code and project page are currently available.

"I think this is the most disruptive technology": Exploring Sentiments of ChatGPT Early Adopters using Twitter Data

Mubin Ul Haque , Isuru Dharmadasa , Zarrin Tasnim Sworna , Roshan Namal Rajapakse , Hussain Ahmad

Category: Natural Language Processing

2022-12-12

Large language models have recently attracted significant attention due to their impressive performance on a variety of tasks. ChatGPT, developed by OpenAI, is one such implementation of a large, pre-trained language model that has gained immense popularity among early adopters, with some users going so far as to characterize it as a disruptive technology in many domains. Understanding such early adopters' sentiments is important because it can provide insights into the potential success or failure of the technology, as well as its strengths and weaknesses. In this paper, we conduct a mixed-method study using 10,732 tweets from early ChatGPT users. We first use topic modelling to identify the main topics and then perform an in-depth qualitative sentiment analysis of each topic. Our results show that the majority of the early adopters have expressed overwhelmingly positive sentiments related to topics such as Disruptions to software development, Entertainment and exercising creativity. Only a limited percentage of users expressed concerns about issues such as the potential for misuse of ChatGPT, especially regarding topics such as Impact on educational aspects. We discuss these findings by providing specific examples for each topic and then detail implications related to addressing these concerns for both researchers and users.

LoopDraw: a Loop-Based Autoregressive Model for Shape Synthesis and Editing

Nam Anh Dinh , Haochen Wang , Greg Shakhnarovich , Rana Hanocka

Category: Computer Vision

2022-12-09

There is no settled universal 3D representation for geometry; alternatives include point clouds, meshes, implicit functions, and voxels, to name a few. In this work, we present a new, compelling alternative for representing shapes using a sequence of cross-sectional closed loops. The loops across all planes form an organizational hierarchy which we leverage for autoregressive shape synthesis and editing. Loops are a non-local description of the underlying shape, as simple loop manipulations (such as shifts) result in significant structural changes to the geometry. This is in contrast to manipulating local primitives such as points in a point cloud or a triangle in a triangle mesh. We further demonstrate that loops are an intuitive and natural primitive for analyzing and editing shapes, both computationally and for users.

RANA: Relightable Articulated Neural Avatars

Umar Iqbal , Akin Caliskan , Koki Nagano , Sameh Khamis , Pavlo Molchanov , Jan Kautz

Category: Computer Vision

2022-12-06

We propose RANA, a relightable and articulated neural avatar for the photorealistic synthesis of humans under arbitrary viewpoints, body poses, and lighting. We only require a short video clip of the person to create the avatar and assume no knowledge about the lighting environment. We present a novel framework to model humans while disentangling their geometry, texture, and also lighting environment from monocular RGB videos. To simplify this otherwise ill-posed task we first estimate the coarse geometry and texture of the person via SMPL+D model fitting and then learn an articulated neural representation for photorealistic image generation. RANA first generates the normal and albedo maps of the person in any given target body pose and then uses spherical harmonics lighting to generate the shaded image in the target lighting environment. We also propose to pretrain RANA using synthetic images and demonstrate that it leads to better disentanglement between geometry and texture while also improving robustness to novel body poses. Finally, we also present a new photorealistic synthetic dataset, Relighting Humans, to quantitatively evaluate the performance of the proposed approach.
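The shading step described above (normal and albedo maps combined through spherical harmonics lighting) can be sketched for a single pixel. This is an illustration under stated assumptions, not the authors' implementation: it assumes second-order real spherical harmonics (9 coefficients), one set of lighting coefficients shared by all color channels, any cosine-convolution factors already folded into the coefficients, and negative shading clamped to zero. The names `sh_basis` and `shade_pixel` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sh_basis(n: Sequence[float]) -> List[float]:
    """Real spherical harmonics basis (bands 0 to 2) at unit normal n."""
    x, y, z = n
    return [
        0.282095,
        0.488603 * y,
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]


def shade_pixel(
    albedo: Sequence[float],
    normal: Sequence[float],
    sh_coeffs: Sequence[float],
) -> Tuple[float, ...]:
    """Shaded color = albedo * max(0, sum_k coeff_k * basis_k(normal))."""
    if len(sh_coeffs) != 9:
        raise ValueError("expected 9 spherical harmonics coefficients")
    length = math.sqrt(sum(c * c for c in normal))
    if length == 0.0:
        raise ValueError("normal must be non-zero")
    unit = [c / length for c in normal]  # normalize before evaluating the basis
    shading = sum(c * b for c, b in zip(sh_coeffs, sh_basis(unit)))
    shading = max(shading, 0.0)  # assumption: clamp negative light to zero
    return tuple(a * shading for a in albedo)
```

With only the constant (band 0) coefficient set, every normal receives the same shading, which corresponds to uniform ambient light. Changing the nine coefficients relights the pixel without touching the albedo or the normal.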

RadFormer: Transformers with Global-Local Attention for Interpretable and Accurate Gallbladder Cancer Detection

Soumen Basu , Mayank Gupta , Pratyaksha Rana , Pankaj Gupta , Chetan Arora

Category: Computer Vision

2022-11-09

We propose a novel deep neural network architecture to learn an interpretable representation for medical image analysis. Our architecture generates a global attention for the region of interest, and then learns bag-of-words-style deep feature embeddings with local attention. The global and local feature maps are combined using a contemporary transformer architecture for highly accurate Gallbladder Cancer (GBC) detection from Ultrasound (USG) images. Our experiments indicate that the detection accuracy of our model beats even human radiologists, and advocate its use as the second reader for GBC diagnosis. Bag-of-words embeddings allow our model to be probed for generating interpretable explanations for GBC detection consistent with the ones reported in medical literature. We show that the proposed model not only helps understand decisions of neural network models but also aids in the discovery of new visual features relevant to the diagnosis of GBC. Source code and model will be available at https://github.com/sbasu276/RadFormer

Impact Learning: A Learning Method from Features Impact and Competition

Nusrat Jahan Prottasha , Saydul Akbar Murad , Abu Jafar Md Muzahid , Masud Rana , Md Kowsher , Apurba Adhikary , Sujit Biswas , Anupam Kumar Bairagi

Category: Machine Learning | Artificial Intelligence

2022-11-04

Machine learning is the study of computer algorithms that can automatically improve based on data and experience. Machine learning algorithms build a model from sample data, called training data, to make predictions or judgments without being explicitly programmed to do so. A variety of well-known machine learning algorithms have been developed for use in the field of computer science to analyze data. This paper introduces a new machine learning algorithm called impact learning. Impact learning is a supervised learning algorithm that can be applied to both classification and regression problems. It can furthermore show its superiority in analyzing competitive data. This algorithm is notable for learning from competitive situations, where the competition arises from the effects of autonomous features. It is built on the impacts of the features, derived from the intrinsic rate of natural increase (RNI). We also demonstrate the advantage of impact learning over conventional machine learning algorithms.

Unintended Memorization and Timing Attacks in Named Entity Recognition Models

Rana Salal Ali , Benjamin Zi Hao Zhao , Hassan Jameel Asghar , Tham Nguyen , Ian David Wood , Dali Kaafar

Category: Artificial Intelligence | Machine Learning

2022-11-04

Named entity recognition (NER) models are widely used for identifying named entities (e.g., individuals, locations, and other information) in text documents. Machine learning based NER models are increasingly being applied in privacy-sensitive applications that need automatic and scalable identification of sensitive information to redact text for data sharing. In this paper, we study the setting when NER models are available as a black-box service for identifying sensitive information in user documents and show that these models are vulnerable to membership inference on their training datasets. With updated pre-trained NER models from spaCy, we demonstrate two distinct membership attacks on these models. Our first attack capitalizes on unintended memorization in the NER's underlying neural network, a phenomenon NNs are known to be vulnerable to. Our second attack leverages a timing side-channel to target NER models that maintain vocabularies constructed from the training data. We show that words within the training dataset follow different functional paths from words not previously seen, producing measurable differences in execution time. Revealing the membership status of training samples has clear privacy implications: in text redaction, for example, the sensitive words or phrases that are to be found and removed are at risk of being detected in the training dataset. Our experimental evaluation includes the redaction of both password and health data, presenting both security risks and privacy/regulatory issues. This is exacerbated by results that show memorization with only a single phrase. We achieved 70% AUC in our first attack on a text redaction use-case. We also show overwhelming success in the timing attack with 99.23% AUC. Finally, we discuss potential mitigation approaches to realize the safe use of NER models in light of the privacy and security implications of membership inference attacks.
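The AUC figures reported above summarize how well an attack score separates training members from non-members. The following sketch shows one standard way to compute AUC, as the probability that a randomly chosen member receives a higher score than a randomly chosen non-member, with ties counted as half. The function name `membership_auc` is hypothetical, and the score can be any attack statistic (for instance, a model confidence or a quantity derived from execution time); this is not the authors' evaluation code.

```python
from typing import Sequence


def membership_auc(
    member_scores: Sequence[float],
    nonmember_scores: Sequence[float],
) -> float:
    """AUC = P(member score > non-member score) + 0.5 * P(tie)."""
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    # Compare every member against every non-member (quadratic, fine for a sketch).
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))
```

An AUC of 0.5 means the score carries no membership signal, while a value close to 1.0 means members and non-members are almost perfectly separable by the score.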

Residual Skill Policies: Learning an Adaptable Skill-based Action Space for Reinforcement Learning for Robotics

Krishan Rana , Ming Xu , Brendan Tidd , Michael Milford , Niko Sünderhauf

Category: Robotics | Artificial Intelligence | Machine Learning

2022-11-04

Skill-based reinforcement learning (RL) has emerged as a promising strategy to leverage prior knowledge for accelerated robot learning. Skills are typically extracted from expert demonstrations and are embedded into a latent space from which they can be sampled as actions by a high-level RL agent. However, this skill space is expansive, and not all skills are relevant for a given robot state, making exploration difficult. Furthermore, the downstream RL agent is limited to learning tasks that are structurally similar to those used to construct the skill space. We first propose accelerating exploration in the skill space using state-conditioned generative models to directly bias the high-level agent towards sampling only skills relevant to a given state based on prior experience. Next, we propose a low-level residual policy for fine-grained skill adaptation, enabling downstream RL agents to adapt to unseen task variations. Finally, we validate our approach across four challenging manipulation tasks that differ from those used to build the skill space, demonstrating our ability to learn across task variations while significantly accelerating exploration, outperforming prior works. Code and videos are available on our project website: https://krishanrana.github.io/reskill.
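The idea of a low-level residual policy can be sketched as a small correction added to the action proposed by the decoded skill. This is a generic illustration of residual action composition, not the authors' implementation: the function name `residual_action`, the `residual_scale` factor, and the symmetric action `limit` are all hypothetical.

```python
from typing import List, Sequence


def residual_action(
    skill_action: Sequence[float],
    residual: Sequence[float],
    residual_scale: float = 0.1,  # hypothetical: keeps corrections small
    limit: float = 1.0,           # hypothetical symmetric action bound
) -> List[float]:
    """Return clip(skill_action + residual_scale * residual, -limit, limit)."""
    if len(skill_action) != len(residual):
        raise ValueError("skill action and residual must have the same length")
    adapted = []
    for a, r in zip(skill_action, residual):
        value = a + residual_scale * r
        adapted.append(max(-limit, min(limit, value)))  # stay within bounds
    return adapted
```

With a zero residual the skill's action passes through unchanged, so the residual only has to learn the fine-grained adjustment needed for a task variation rather than the whole behavior.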
